Appendix 1. Reweighting the Family Resources Survey: Sources and Methods
Our modelling uses data from the Family Resources Survey (FRS) (DWP 2017a) and (for child poverty measure 4) Understanding Society (USoc) (Social and Economic Research 2018). An important task for this project is to take these datasets and adjust them so that they better resemble forecasts of what the population will look like in future years.
There are three techniques we can use in producing the forecast datasets:
1. re-weighting. In most microsimulation work, some units (households, family units or individuals) are given a higher weight in final outputs than others. The typical starting point for a single-period model is to use weights that are inversely related to the probability of selecting the individual in a random sample, with some adjustment for non-response. In our case, we go further, using weights to simulate the populations in future years. If we believe there will be more pensioners 10 years from now, for example, we can capture this by giving each pensioner in the dataset a higher weighting for that year;
2. uprating. If we expect incomes or housing costs to rise, we increase the values recorded for these things in the dataset;
3. artificial ageing. An alternative technique to reweighting is to progressively rewrite the data: for each simulated year, you increment the ages of the individuals in the dataset, and then impute a new state by applying statistical models giving the probability of events such as death, illness, unemployment, etc. Artificial ageing is most useful when we want to track individuals or households over many years, for example to model pension contributions or cumulative payments for social care. However, the technique is hard to implement and extremely data-intensive, requiring modelling of such varied items as spells of employment, childbirth, education choices, improvement and deterioration of health and much else; each of these could be a major research project in itself. Given the inherent difficulties in this technique we decided not to use it in this project, and we are not aware of any other UK-based poverty forecasting research which has used it.
Our forecasting therefore uses a combination of weighting and uprating. However, both weighting and uprating still pose considerable challenges, which we now turn to.
There are well-established techniques for reweighting micro datasets. If we wish to reweight to match just one characteristic, for example age, we can simply weight by the target population totals divided by the sample totals; for example, if we expect there to be 100,000 five-year-olds in 2030, and there are 100 five-year-olds in the data, we can weight each sample unit by 1,000. For this project, however, we want to capture a population evolving in multiple dimensions; for example, the current HBAI grossing regime for Great Britain specifies private household population by region, age and sex, number of benefit units with children, number of lone parents, households by tenure type, households by council tax band and number of households containing "very rich" people (DWP 2014).
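The single-characteristic case above can be sketched in a few lines. This is a minimal illustration only: the function name and the group label are hypothetical, and the figures are the ones used in the example in the text.

```python
def single_characteristic_weights(target_totals, sample_counts):
    """Weight for each group = target population total / sample count."""
    return {
        group: target_totals[group] / sample_counts[group]
        for group in target_totals
    }

# Figures from the text: 100,000 five-year-olds expected in 2030,
# 100 five-year-olds in the sample. "age_5" is a hypothetical label.
weights = single_characteristic_weights({"age_5": 100_000}, {"age_5": 100})
print(weights["age_5"])  # 1000.0
```

This only works when the groups do not overlap; once targets cut across one another (age and household type, say), a single ratio per unit is no longer enough.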
Reweighting for multiple characteristics involves finding a set of weights that allow the weighted sample to sum to all the target totals, with the weights being as close to uniform as possible, on some measure of closeness (Creedy 2003).
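To make the idea concrete, the sketch below calibrates weights to several targets at once using a chi-squared distance, which has a closed-form solution. This is an assumption for illustration: the text does not say which measure of closeness the project uses, and the household data, targets and function name are all hypothetical. With this distance, weights can turn negative when targets are far from the sample, so practical implementations often use a constrained distance function instead.

```python
import numpy as np

def calibrate_weights(initial_weights, characteristics, targets):
    """Chi-squared-distance calibration (illustrative choice of distance).

    Minimises sum((w - d)**2 / d) subject to characteristics.T @ w == targets,
    where d are the initial weights. The solution is w = d * (1 + X @ lam).
    """
    d = np.asarray(initial_weights, dtype=float)
    X = np.asarray(characteristics, dtype=float)
    t = np.asarray(targets, dtype=float)
    shortfall = t - X.T @ d                    # gap between targets and current totals
    cross_product = X.T @ (X * d[:, None])     # X' diag(d) X
    lam = np.linalg.solve(cross_product, shortfall)
    return d * (1.0 + X @ lam)

# Hypothetical sample: four households, columns = (adults, children).
X = np.array([[1, 0], [2, 0], [1, 2], [2, 1]])
d = np.full(4, 100.0)               # uniform starting weights
t = np.array([660.0, 360.0])        # hypothetical target totals
w = calibrate_weights(d, X, t)
print(np.round(X.T @ w, 6))         # weighted totals now match the targets
```

Starting from uniform weights, the distance being minimised is exactly the departure from uniformity, which matches the description above.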
Forecast Data Sources
The Scottish and UK Governments produce forecasts of population, employment, numbers of households, and the wider economy. These are used in planning housing, education, transport and much else. It is important that our forecasts are as far as possible consistent with the official ones. This is not absolutely straightforward, however: creating a forecast dataset that is sufficiently rich requires merging together data from different sources, produced at different dates, and additional data processing is needed to make everything internally consistent. Further, since we can only use things that are officially forecast, our set of population targets is necessarily somewhat less rich than would be possible when weighting to hit current targets, as in the HBAI exercise. We also need UK-wide forecasts since our poverty lines are calculated for the whole UK.
Our final set of forecasts runs annually from 2017 to 2038, and has data on:
- population (by age and sex);
- household composition; and
- employment and unemployment.
See the final section of this appendix for the full list.
Projections of the Scottish population are available from National Records of Scotland (NRS) (NRS 2017), and for the UK from the Office for National Statistics (ONS) (ONS 2017). We use the latest 2016-based projections, since these are used in the Scottish Fiscal Commission's macroeconomic forecast (Commission 2018).
As we discussed in section 1, NRS and ONS produce population forecasts on a variety of assumptions, such as high fertility or low migration, as well as some variants that attempt to capture various post-Brexit scenarios. We have constructed sets of weights consistent with all of them.
Household composition is also important. In particular, there is a strong association between single parenthood and poverty. NRS produces projections for household composition (Scotland 2017). However, at the time of writing these are based on earlier, 2014-based population projections, and are available for some but not all of the population projection variants. We make a correction for the differing base year by weighting the household projections by changes in the population projections between our 2016 edition and the 2014 edition.
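One plausible reading of the base-year correction is a simple ratio adjustment: each 2014-based household projection is scaled by the ratio of the 2016-based to the 2014-based population projection for the same year. The sketch below assumes that reading; the level of detail at which the ratio is applied (total population or finer groups) is not stated in the text, and all figures and names are hypothetical.

```python
def rebase_household_projection(households_2014, population_2014, population_2016):
    """Scale 2014-based household counts by the change in projected population."""
    return {
        year: households_2014[year] * population_2016[year] / population_2014[year]
        for year in households_2014
    }

# Hypothetical figures for a single household type and year.
adjusted = rebase_household_projection(
    households_2014={2030: 2_500_000},
    population_2014={2030: 5_500_000},
    population_2016={2030: 5_555_000},
)
print(round(adjusted[2030]))  # 2525000
```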
For our UK-wide forecasts there is an additional issue to address. Household forecasts are produced by the four devolved governments (ONS 2016b; Government 2017; Agency 2016). Each uses a different set of household compositions, and the Northern Irish series is based on earlier (2012) population projections. The different breakdowns unfortunately have little overlap: the only consistent breakdown common to all four is simply:
- one adult households;
- two adult households;
- all other households, including all households with children.
In constructing our weights we have focused on hitting the correct household composition totals for Scotland (e.g. the correct numbers of single parents in Scotland). This does imply that we are not guaranteed to hit the equivalent totals for the UK, but given that our focus is on producing accurate child poverty forecasts for Scotland, we decided that this was an acceptable trade-off.
Employment and Unemployment
The SFC produces projections of labour force participation, employment and unemployment, as a percentage of the over 16s (SFC 2017, Table S2.3). We apply the percentages in that table to each of the NRS's population forecasts discussed above. The forecasts run till 2022/23; we hold the end period percentages constant for all later periods. OBR produce comparable data for the UK as a whole (Responsibility 2017 Supplementary Economy Tables 1.6) which we apply similarly to the ONS UK population forecasts.
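The construction of the employment targets can be sketched as follows: apply the forecast rate to the projected over-16 population, and reuse the final forecast rate for every later year. The rates and population figures below are hypothetical placeholders, not values from the SFC or NRS tables, and years are keyed by the calendar year in which the financial year starts.

```python
def employment_targets(rates_by_year, over16_population_by_year):
    """Employment target = rate * over-16 population.

    Years beyond the last forecast year reuse the final forecast rate.
    Years before the first forecast year are not handled and raise KeyError.
    """
    last_forecast_year = max(rates_by_year)
    return {
        year: rates_by_year[min(year, last_forecast_year)] * population
        for year, population in over16_population_by_year.items()
    }

# Hypothetical employment rates (share of over 16s) and populations.
rates = {2021: 0.60, 2022: 0.61}
over16 = {2021: 4_500_000, 2022: 4_520_000, 2030: 4_600_000}
targets = employment_targets(rates, over16)
print(round(targets[2030]))  # 2806000, using the 2022 rate
```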
Table A.1 below gives an example of how our reweighting algorithm performs on the FRS data.
The second and fourth columns show our forecast numbers for each of our targets for the years 2017 and 2031 respectively. The third and fifth columns show how far our pooled FRS dataset is from these targets when we weight each FRS household equally. Cases that are underrepresented by more than 10% are coloured red, and those overrepresented are coloured green.
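The comparison described above can be sketched as a proportional gap between the equally weighted sample total and the target, with a flag for large gaps. The sketch assumes the same 10% threshold applies to overrepresentation, which the text does not state explicitly; the target names and figures are hypothetical.

```python
def representation_flags(sample_totals, targets, threshold=0.10):
    """Return {target: (proportional gap, flag)} for equally weighted totals."""
    flags = {}
    for name, target in targets.items():
        gap = (sample_totals[name] - target) / target
        if gap < -threshold:
            flag = "under"   # would be coloured red in the table
        elif gap > threshold:
            flag = "over"    # would be coloured green in the table
        else:
            flag = "ok"
        flags[name] = (gap, flag)
    return flags

# Hypothetical grossed-up sample totals and forecast targets.
sample = {"lone_parents": 80_000, "pensioners": 1_150_000, "employed": 2_600_000}
forecast = {"lone_parents": 100_000, "pensioners": 1_000_000, "employed": 2_550_000}
flags = representation_flags(sample, forecast)
for name, (gap, flag) in flags.items():
    print(f"{name}: {gap:+.1%} {flag}")
```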
Table A.1. Performance of weighting algorithm on the pooled Scottish FRS dataset
We see three main potential weaknesses of our approach. None of these invalidates the results but they are worth mentioning in the interests of full disclosure.
First, although our reweighting algorithm ensures that we can weight our data to hit our target set, there is no guarantee that we thereby get the distribution right of other characteristics that might be important but for which we have no forecast. We don't weight by disability status, for instance, and there's no a priori reason that the weights we use will move our numbers of disabled in the right direction, especially in the longer run. However, without additional forecast data on the other population characteristics (e.g. disability) this approach is the best that can be achieved with the data we have.
Second, we don't explicitly try to capture changes in the distribution of income. This could be important as the poverty measures are highly sensitive to small changes in the income distribution around the poverty lines, even if mean and median incomes are unchanged. The combination of our weighting and uprating does change the income distribution in the model somewhat (if we have a higher proportion of employed people in future, for instance), but this is not something we are explicitly modelling.
Third, as we push out further into the future, the weights get increasingly dispersed, so some households have a very high weight, and others a very low one. Figures A1.1 and A1.2 illustrate this: they show the distribution of the weights needed to gross up our pooled 2012-15 FRS Scottish subsample so as to meet our target set in 2016/17 and 2031/32.
Figure A1.1. Distribution of weights required to meet population targets for pooled FRS Scotland sample in 2016/17
Figure A1.2. Distribution of weights required to meet population targets for pooled FRS Scotland sample in 2031/32
It is clear that the dispersion is increasing with time. This is an additional source of uncertainty in our forecasts, but one that is difficult to quantify.
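Although the resulting forecast uncertainty is hard to quantify, the dispersion of the weights themselves can be summarised. The sketch below computes two standard summaries, the coefficient of variation and Kish's effective sample size; neither is a measure reported in this appendix, and the weights shown are hypothetical.

```python
import statistics

def coefficient_of_variation(weights):
    """Population standard deviation of the weights divided by their mean."""
    return statistics.pstdev(weights) / statistics.fmean(weights)

def kish_effective_sample_size(weights):
    """(sum of weights)^2 / sum of squared weights; equals n for uniform weights."""
    total = sum(weights)
    return total * total / sum(w * w for w in weights)

uniform = [1.0, 1.0, 1.0, 1.0]
dispersed = [1.0, 1.0, 1.0, 5.0]   # one household carries most of the weight

print(kish_effective_sample_size(uniform))              # 4.0
print(round(kish_effective_sample_size(dispersed), 2))  # 2.29
print(round(coefficient_of_variation(dispersed), 3))    # 0.866
```

The more dispersed the weights, the smaller the effective sample size, so estimates for later forecast years rest on fewer effective observations than the raw sample count suggests.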
2016-Based Scottish Projections
- principal projection;
- high population;
- high fertility;
- low population;
- low fertility;
- high life expectancy;
- moderately high life expectancy;
- moderately low life expectancy;
- low life expectancy;
- high migration;
- low migration;
- 0% future EU migration (not National Statistics);
- 50% future EU migration (not National Statistics);
- 150% future EU migration (not National Statistics);
- zero net migration (natural change only).
Final Weighting Target List
- One Adult Male Household
- One Adult Female Household
- Two Adults Household
- One Adult, One Child Household
- One Adult Two Plus Children Household
- 2+ Adults with Children Household
- Three+ Adults, no children Household
- Employed, Inc. Self-Employed
- ILO Unemployed
- 0-4 Male
- 5-10 Male
- 11-15 Male
- 16-19 Male
- 20-24 Male
- 25-29 Male
- 30-34 Male
- 35-39 Male
- 40-44 Male
- 45-49 Male
- 50-54 Male
- 55-59 Male
- 60-64 Male
- 65-69 Male
- 70-74 Male
- 75-79 Male
- 80+ Male
- 0-4 Female
- 5-10 Female
- 11-15 Female
- 16-19 Female
- 20-24 Female
- 25-29 Female
- 30-34 Female
- 35-39 Female
- 40-44 Female
- 45-49 Female
- 50-54 Female
- 55-59 Female
- 60-64 Female
- 65-69 Female
- 70-74 Female
- 75-79 Female
- 80+ Female
Note: for UK-wide weighting, the household targets are simplified to:
- One Adult Household
- 2 Adult Household
- All other households, including all households with children.