Scottish Household Survey 2021: methodology and fieldwork outcomes

Methodology of the 2021 Scottish Household Survey and information on fieldwork targets and outcomes.

Data processing

Social data processing

The raw data was initially split into three files. Responses to the 'other (write in)' variables and to open-ended questions were extracted for separate coding. The variables used to produce NS-SEC variables were also extracted into a separate file for coding[3].

The main data file was subject to checks and editing involving:

  • Range checks, confirming that all variables were within the acceptable limits established for the question concerned.
  • Simple logic checks, ensuring that the relationships between questions were logical. For example, the number of people answering a filtered question should equal the number of people giving the appropriate response at the filtering question.
  • Complex logic checks. These examined the relationships between variables and assessed whether combinations of responses were plausible. For example, combinations of age and working status, and of age and relationship to other household members, were checked to identify implausible cases such as someone aged over 60 years being coded as the child of another household member.
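
As an illustration only, the three kinds of check described above could be sketched in Python as follows. The field names, range limits and example records are hypothetical and are not taken from the survey's actual processing routines.

```python
# Hypothetical range limits; the survey's real limits are not given in this report.
RANGE_LIMITS = {"age": (0, 120), "num_rooms": (1, 30)}


def range_errors(record):
    """Range check: return the fields whose values fall outside the assumed limits."""
    errors = []
    for field, (low, high) in RANGE_LIMITS.items():
        value = record.get(field)
        if value is not None and not (low <= value <= high):
            errors.append(field)
    return errors


def filter_counts_match(records, filter_field, filter_value, filtered_field):
    """Simple logic check: the number routed to a filtered question
    should equal the number who actually answered it."""
    routed = sum(1 for r in records if r.get(filter_field) == filter_value)
    answered = sum(1 for r in records if r.get(filtered_field) is not None)
    return routed == answered


def implausible_child(record, max_child_age=60):
    """Complex logic check: flag someone aged over 60 coded as a child
    of another household member."""
    return record.get("relationship") == "child" and record.get("age", 0) > max_child_age


# Made-up example records, each with one deliberate problem after the first.
records = [
    {"age": 34, "num_rooms": 4, "in_work": "yes", "hours_worked": 37, "relationship": "head"},
    {"age": 67, "num_rooms": 4, "in_work": "no", "hours_worked": None, "relationship": "child"},
    {"age": 130, "num_rooms": 3, "in_work": "yes", "hours_worked": None, "relationship": "spouse"},
]

range_failures = [range_errors(r) for r in records]
counts_ok = filter_counts_match(records, "in_work", "yes", "hours_worked")
child_flags = [implausible_child(r) for r in records]
```

In this sketch the third record fails the range check on age, the filter check fails because two people are in work but only one reported hours, and the second record is flagged by the complex logic check.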

The data then underwent two additional processes: first, the calculation of derived variables such as household type; and second, the imputation of household income, housing costs and childcare costs. Details of the derived and imputed variables are provided in the supporting documents to this report. The edited data was delivered to the Scottish Government, which ran further checks on the data. Any data issues identified by the Scottish Government were discussed and, where necessary, corrected, and the data processing routines were amended.

Physical survey data validation

The data from the physical survey forms were uploaded into the physical survey validation system together with the photographs of each dwelling.

The validation system applied a set of rules provided by the Scottish Government (the same rules as used in previous years) to the raw data, to ensure the accuracy and validity of each item of data entered. This included range checks on all fields, detailed consistency checks making use of the redundancy built into the survey schedule, and plausibility checks on all appropriate items. Rules cross-reference different parts of the survey form (e.g. if the dwelling is a house, then aspects of the common dwelling section should not be completed; if the dwelling is a flat, then details for common parts should be present).
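
A rule-based validation of this kind could be sketched in Python as below. This is a minimal illustration, not the Scottish Government's rule set: the field names, the range used for rooms and the rule functions are all assumptions made for the example.

```python
def check_room_count(form):
    """Range check on a hypothetical 'habitable_rooms' field (limits assumed)."""
    rooms = form.get("habitable_rooms")
    if rooms is None or not (1 <= rooms <= 20):
        return "habitable_rooms out of range"
    return None


def check_common_parts(form):
    """Cross-reference rule: houses should have no common parts data,
    flats should have some."""
    dwelling_type = form.get("dwelling_type")
    common_parts = form.get("common_parts") or {}
    if dwelling_type == "house" and common_parts:
        return "common parts completed for a house"
    if dwelling_type == "flat" and not common_parts:
        return "common parts missing for a flat"
    return None


# Each rule returns an error message, or None when the form passes.
RULES = [check_room_count, check_common_parts]


def failed_edits(form):
    """Apply every rule to one form and collect the error messages."""
    errors = []
    for rule in RULES:
        message = rule(form)
        if message is not None:
            errors.append(message)
    return errors


# Made-up forms for illustration.
house_form = {"dwelling_type": "house", "habitable_rooms": 5,
              "common_parts": {"stair_condition": 2}}
flat_form = {"dwelling_type": "flat", "habitable_rooms": 3,
             "common_parts": {"stair_condition": 1}}
```

Here `failed_edits(house_form)` reports that common parts were completed for a house, while `failed_edits(flat_form)` returns an empty list. Keeping each rule as a separate function means the same list of rules can be reapplied unchanged from year to year.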

Surveyors were shown a list of all the errors picked up by the validation program. They were also shown a list of all the entered data, with a description of the variable next to each item of data, and with the data laid out to represent each page of the form. The validation system showed the data and the failed edits alongside the photographs of the property.

Corrections were then made and each form rechecked until it passed all edits. Changes to the data were made simply by overtyping the incorrect data where it was displayed. Once a surveyor had completed validation, the data was forwarded to their Regional Manager for sign-off. Validation of each form was complete when all errors had been eliminated or a supervisor had determined that the dwelling genuinely fell outside the validation criteria. An audit trail of changes made to the data was kept.
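
The correct-and-recheck cycle with an audit trail could be sketched as follows. This is an illustrative Python outline under stated assumptions: the single validation rule, the field name, the user identifier and the structure of the audit record are hypothetical, and the real validation system is not described at this level of detail.

```python
from datetime import datetime, timezone


def failed_edits(form):
    """Stand-in for the full rule set: one assumed range check."""
    rooms = form.get("habitable_rooms")
    if rooms is None or not (1 <= rooms <= 20):
        return ["habitable_rooms out of range"]
    return []


def apply_correction(form, field, new_value, audit_trail, user):
    """Overwrite a value and record the old value, new value, user and time of the change."""
    audit_trail.append({
        "field": field,
        "old": form.get(field),
        "new": new_value,
        "user": user,
        "at": datetime.now(timezone.utc).isoformat(),
    })
    form[field] = new_value


def is_complete(errors, accepted_by_supervisor):
    """A form is complete when every remaining error has been accepted by a
    supervisor as a genuine exception (an empty error list is always complete)."""
    return all(error in accepted_by_supervisor for error in errors)


# Made-up form with one error, corrected by a hypothetical surveyor.
form = {"habitable_rooms": 0}
audit_trail = []
errors_before = failed_edits(form)
apply_correction(form, "habitable_rooms", 3, audit_trail, user="surveyor_01")
errors_after = failed_edits(form)
complete = is_complete(errors_after, accepted_by_supervisor=set())
```

Recording the old value alongside the new one means any correction can be reviewed or reversed later.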


