Scottish Household Survey 2020: methodology and impact of change in mode

The methodology report for the Scottish Household Survey 2020 telephone survey, which discusses the impact of the change in mode.


Chapter 3: Summary of previous literature on mode effects

This chapter gives a short summary of previous literature on mode effects and survey error by way of an introduction to how we explore the impact of the change of approach to the SHS on the quality of the survey estimates.

When we refer to mode, it is important to distinguish between the mode by which people are approached to take part in the survey and the mode of interview.

Table 3.1 Summary of mode difference by approach

Mode of approach
  Traditional SHS approach: face-to-face (repeated calls, preceded by an advance letter and leaflet).
  Revised push-to-telephone/video approach: opt-in only sample: postal invite (letter with leaflet plus two reminders); telephone matched sample: telephone recruitment (letter and leaflet followed by telephone recruitment where possible).

Mode of interview
  Traditional SHS approach: face-to-face (in-home, CAPI).
  Revised push-to-telephone/video approach: telephone or video (Microsoft Teams).

The revised design for the SHS relied on approaching respondents in a different way from previously. Instead of interviewers visiting addresses face-to-face and persuading people to take part in conversation on the doorstep, either a) people opted-in via an online portal in response to advance letters or b) interviewers attempted to get agreement by telephone for the portion of the sample for which telephone numbers had been successfully matched to the sampled address.

The mode by which interviews were undertaken also changed. All interviews pre-lockdown were conducted face-to-face in-home. With no interviewer travel allowed, interviews in the revised design were conducted either by telephone or by video (one-way Microsoft Teams - so that the respondent could see the interviewer, but the interviewer could not see the respondent). The change in mode of interview may have shaped how people responded to questions. This is likely to have had the greatest impact on questions that relied heavily on showcards.

Mode effect and the Total Survey Error Framework

Mode effects can impact the quality of survey estimates in a number of ways. In assessing this, it is useful to refer to the Total Survey Error (TSE) Framework, the generally accepted approach for assessing survey quality. The TSE approach identifies all possible errors that can arise at each stage of the survey process and provides a systematic basis for structuring consideration of mode effects. The survey process is divided into two main strands: a representation strand and a measurement strand. The relationship between survey process and error type is shown in Figure 3.1.

Figure 3.1 Total Survey Error framework

This figure displays the Total Survey Error approach, identifying all possible errors in two strands. The first, measurement, includes validity, measurement error and processing error. The second, representation, includes coverage error, sampling error, non-response error and adjustment error.

Mode effects tend to impact survey estimates because of the difference they make to who responds and to what they report. That is, different modes of data collection often differ both in terms of coverage and nonresponse, on the one hand, and in terms of measurement error, on the other. We discuss each in turn.

Non-response error

Social survey samples are normally designed so that if everyone responded, the sample would be an accurate representation of the whole population of interest. Non-response bias occurs when those who take part in a survey differ from those who do not. This can mean that the survey participants are not representative of the whole population of interest. An example of this would be if interviewers only approached households during working hours. In this case, the likelihood of obtaining interviews with retired people would be considerably higher than the likelihood of interviewing the employed population, leading to skewed data.

Research that is dependent upon voluntary participation is always vulnerable to this type of bias, and surveys such as the Scottish Household Survey are designed to reduce the potential for non-response bias. This is done by maximising the response rate and trying to ensure that it is not more difficult for some groups than others to take part. The traditional face-to-face methodology required interviewers to make at least six visits to each address, on different days and at different times, to establish contact. Moreover, most cases that were unproductive at first issue were then reissued to a second and potentially a third interviewer to try to convert to a successful interview.

The SHS response rate has been consistently higher than the average achieved by other comparable surveys (See Figure 3.2) over the last decade.

Figure 3.2 Scottish Household Survey response rate over time compared to trend in all random probability surveys in Scotland/ UK

A graph showing the trend of response rate of the SHS (yellow) and all random probability surveys in Scotland/UK (black) over time between 2001 and 2019. The black line has declined more sharply (down 12%) than the yellow line (down 4%) in that time. The yellow line varies more from a straight line than does the black line but is consistently above the black line.

The wider literature on non-response bias and mode effects has emphasised that a high response rate does not necessarily create a quality, unbiased survey sample. Instead, it depends on the patterns of who participates. For example, Groves and Peytcheva (2008) distinguish between three types of missing data: 'missing completely at random', 'missing at random', or 'non-ignorable'.

'Missing completely at random' means there is no consistent reason for nonresponse, and the reliability of the data is upheld, as the sample still maintains its random nature. An example would be if someone does not respond to a survey because it got lost in the mail. Provided every case had an equal chance of getting lost in the mail, then this is missing completely at random.

Data is 'missing at random' when there is a common cause for both nonresponse and key output variables. For example, being young may cause nonresponse, and it may also mean a person is likely to participate in sport. Therefore, if young people are less likely to respond, people who participate in sport will be under-represented.

'Non-ignorable' missing data happens when there is a consistent reason for non-response, and therefore a danger of excluding this subgroup from the sample, creating non-response bias. For example, if the reason for non-response is because some of the respondents cannot read, then this is non-ignorable, as illiterate people are now excluded from the sample. Similarly, if people who participate in sport are less likely to be contacted by interviewers (because they are at home less often) then this would also be 'non-ignorable'.

Overall, research concerning non-response bias generally agrees on the demographics of those who respond less frequently to surveys. They tend to be young, single, and in employment (Luiten, 2013; Foster, 1998; Lynn and Clark, 2002; Hall et al, 2011). This is mainly because these types of people are harder to contact. Good weighting strategies help to correct for patterns of differential response. However, weighting can only correct data 'missing at random', not 'non-ignorable' missing data.

These different types of missing data exemplify why higher response rates do not necessarily mean there will be less bias. A survey can have a low response rate without impacting on the accuracy of its estimates, as long as the unit non-response is missing completely at random or missing at random (provided weighting strategies are used to correct for the latter).
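The distinction between the missing-data types, and why weighting corrects one but not the other, can be illustrated with a small simulation. This is an illustrative sketch only, not SHS code: the population, the 'young'/'sport' covariate and outcome, and all rates are hypothetical numbers chosen to mimic data that is missing at random (nonresponse driven by age, which also drives the outcome).

```python
import random

random.seed(42)

# Hypothetical population: age group drives both sport participation
# and response propensity, so nonresponse is "missing at random".
N = 100_000
population = []
for _ in range(N):
    young = random.random() < 0.3                      # 30% of adults are "young"
    sport = random.random() < (0.6 if young else 0.3)  # young people do more sport
    population.append((young, sport))

true_rate = sum(s for _, s in population) / N

# Differential nonresponse: young people respond at 20%, others at 60%.
sample = [(y, s) for (y, s) in population
          if random.random() < (0.2 if y else 0.6)]

unweighted = sum(s for _, s in sample) / len(sample)

# Post-stratification weights: population share / sample share per age group.
pop_young = sum(y for y, _ in population) / N
samp_young = sum(y for y, _ in sample) / len(sample)
w = {True: pop_young / samp_young,
     False: (1 - pop_young) / (1 - samp_young)}

weighted = (sum(w[y] * s for y, s in sample) /
            sum(w[y] for y, _ in sample))

print(f"true {true_rate:.3f}  unweighted {unweighted:.3f}  weighted {weighted:.3f}")
```

The unweighted estimate understates sport participation because the sporty young respond less; weighting on age recovers the true rate. If nonresponse were instead driven directly by sport participation itself (non-ignorable), no weighting on observed demographics could remove the bias.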

However, the higher the response rate, the less potential there is for non-response bias. While the traditional SHS approach is subject to non-response bias, weighting has ensured that estimates appear to have been fairly robust. Moreover, because of the consistency of the SHS approach over time, and the relative consistency of the achieved response rate, the effect of non-response bias is likely to be reasonably consistent between waves. This means that changes in estimates are unlikely to be the result of changing non-response bias.

Face-to-face fieldwork almost always has a considerably higher response rate than other modes, such as telephone and postal. This is clearly seen in the SHS push-to-telephone/video approach, for which the overall response rate was 20%.

The response rate for the opt-in only sample was 14.5%. The addresses without telephone numbers were entirely reliant on householders opting-in in response to the advance letters. With no possibility of interviewers visiting properties to persuade people to take part, it was inevitable that there would be a considerable drop in the response rate. The design of the advance materials, and the introduction of incentives, became more central to encouraging response.

Where a telephone number had been matched to an address, interviewers were required to make at least six telephone calls to establish contact. While this is similar to the face-to-face approach, the response rate for the telephone matched sample, at 37%, was considerably lower than the face-to-face response rates achieved by the SHS.
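As a quick consistency check, the overall 20% response rate can be recovered as a weighted average of the two sub-sample rates, using the 23% telephone-matching rate reported elsewhere in this report. The figures are taken from the text; the calculation itself is just arithmetic, shown here as a sketch.

```python
# Reported figures: 23% of addresses had a matched telephone number,
# the matched sample responded at 37%, the opt-in only sample at 14.5%.
match_rate = 0.23
rr_matched = 0.37
rr_opt_in = 0.145

# Overall rate = mix of the two sub-samples, weighted by their share of addresses.
overall = match_rate * rr_matched + (1 - match_rate) * rr_opt_in
print(f"{overall:.1%}")  # close to the 20% overall rate reported
```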

Previous research on both the Scottish Household Survey and the Scottish Crime and Justice Survey emphasises these points. Two recent methodological papers have examined the impact that lower response rates would have on SHS and SCJS estimates (Hutcheson, Martin and Millar, 2020; Martin, 2020). Both papers found that a response rate change of 5-10 percentage points would have made very little impact on the estimates themselves – both in terms of the absolute level and also as a share of normal survey error[10]. These findings echo previous findings[11], that the link between response rate and non-response bias is weak.

However, these papers explored the impact of varying only the response rate by a relatively small amount and keeping all other aspects the same[12]. Contrasting findings emerge from an earlier study on the Scottish Crime Survey. In 2003, following a "Fundamental Review" of the survey, McCaig and Leven (2003) suggested "that the revised SCS should contain a significant telephone survey element if the necessary scale of survey is to be acquired in a practicable way at an acceptable cost". The survey moved from face-to-face to a telephone approach, and this model was tested by running parallel face-to-face and telephone fieldwork. The calibration exercise found considerable evidence of substantial differences between the approaches that could not be accounted for, and concluded that "we have not found sufficient evidence to conclude that the telephone survey is likely to be accurately measuring victimisation. We have been unable to devise a weighting approach that satisfactorily corrects the many demographic biases that are observable in the data" (Hope 2005). The telephone element of the Scottish Crime Survey was subsequently dropped, and it returned to a traditional face-to-face approach.

The potential impact of non-response and other sources of error on SHS results has been examined in two other ways in the past. Firstly, by linking the census directly to the survey. The SHS was included in the Census-linked study of survey non-response carried out by ONS following the 2001 Census. This compared the census characteristics of different categories of responding and non-responding households to identify variables that are independently associated with non-response (Freeth and Sparks, 2004). It found that non-response overall was associated with particular local authorities, living in a flat, not containing a married or cohabiting couple, and having no educational qualifications, and suggested that the weighting approach be updated to adjust for these effects. Since refusals accounted for a major part of non-response, the characteristics associated with total non-response were more similar to those associated with refusal than those associated with non-contact. For example, tenure was a significant predictor of non-contact but was not a significant predictor of non-response overall.

Secondly, by comparing estimates from the survey to estimates from other robust sources. Alternative high-quality sources are scarce, and the Census has been the main source used. The 2012 SHS Methodology and Fieldwork Outcomes report (Scottish Government, 2014) compared SHS estimates for tenure and property characteristics with the 2011 Census. It concluded that "the sample appears to be fairly robust in terms of variables associated with accommodation/property characteristics".

Figure 3.3 below shows the housing tenure trend in the SHS against census estimates and two administrative sources[13]. Note that the estimates for social rented data are based on dwellings rather than households and will include vacant stock. Additionally, some of the households who respond to the SHS or Census as "living rent-free" may actually be in social housing dwellings but may have interpreted having their housing costs fully covered by housing benefit as being "rent free" as opposed to renting from a social landlord. Overall, the gradual decrease of the size of the social rented sector and the growth of the private rented sector is seen in both the SHS estimates and the administrative data.

Figure 3.3 Comparison of tenure trends since 1999 from various different data sources.

This shows that the SHS estimates of tenure over time are close to those from administrative sources and to the census estimates in both 2001 and 2011.

Overall, while response rate should not be taken as a simple proxy for survey quality, the estimates from the standard face-to-face approach are likely to be more robust and less affected by non-response bias than estimates from the push-to-telephone approach.

Coverage error

Coverage error, like non-response error, has the potential to affect the representativeness of the survey data. It is bias that occurs when the sampling frame does not coincide with the target population.

For the normal face-to-face approach, the likelihood that bias is introduced from this type of error is very low. The target population of the SHS is all adults living in private households in Scotland. The survey uses the small user Postcode Address File (PAF) as the sampling frame. Overall, the PAF is a good record of all private households in Scotland. It has previously been estimated that the number of addresses that should be on the PAF but are missing is small. In 1991, this was estimated at 2.2% in Scotland, and there is evidence that its coverage has improved over time (Loud, 2014)[14].

For the revised push-to-telephone/video approach, no new sample was drawn. The sample used consisted of addresses that had been drawn for the 2020 wave but had not been fully worked face-to-face before lockdown. In most local authorities, the sample was randomly assigned to months. As such, the face-to-face sample and the sample used for the revised approach should both have been broadly representative, and the change in approach should not have had an impact on the coverage error of the push-to-telephone/video approach. However, in a small number of the local authorities, the allocation of batches to months was undertaken with some manual intervention to aid fieldwork practicalities. This was to help ensure that the more remote addresses were allocated to Quarter 2 and 3. This means that the sample worked prior to lockdown under-represented remote rural areas. This is discussed further in Chapter 4.

However, the revised approach involves two linked samples – the opt-in only sample and the telephone matched sample – depending on whether a telephone number could be linked to an address. Given that it was possible to find telephone numbers for only 23% of addresses, and that some types of areas had considerably higher matching rates than others (as detailed in the next chapter), there is considerable potential for coverage error among the telephone-matched sub-sample. In other words, there is considerable likelihood that the telephone-matched sample does not accurately coincide with the population the SHS aims to sample (all private households in Scotland). Additionally, as the opt-in only sample is composed of only the addresses where we did not get a matching telephone number, it is also likely to be subject to coverage error, with bias in the exact reverse direction to that in the telephone matched sample.

There is not an extensive literature on the interplay between mode and coverage error. Telephone surveys tend to be more prone to coverage error than face-to-face surveys because they tend to rely on Random Digit Dialing. This was highlighted in the Scottish Crime Survey experiment with telephone surveying[15] (Hope 2003). Indeed, one of the barriers to the greater use of telephone as the mode of approach for random pre-selected surveys is the lack of a sampling frame that has similar coverage to the PAF.

Measurement error

Measurement error is the difference between a respondent's answer and a true value. In survey research, responses are shaped by a number of factors: the skills of interviewers, the profile of respondents, the wording of survey questions, and the mode of data collection (Biemer and others, 1991). In the context of the change in approach to data collection on the SHS, the question of interest is whether the change in mode led to any changes in the way that respondents answered the interview questions.

Prior to lockdown, all interviews were conducted face-to-face in-home. Interviews in the revised design were conducted either by telephone or by one-way video interviewing, where the respondent could see the interviewer, but the interviewer could not see the respondent.

A number of potential mode effects are detailed in the literature. First, there is a social-desirability effect, where respondents adjust their answers towards what they believe the interviewer wants to hear. These effects are strongest in face-to-face interviews, and weaker in online interviews. They also differ by type of question, and are stronger where a question covers topics perceived to be sensitive (Kreuter, Presser, & Tourangeau 2008).

Second, interviewer-administered and self-completion surveys differ in their handling of "don't know" response categories. These tend not to be read out to respondents or included on showcards in face-to-face or telephone surveys, but have to be either explicitly included or excluded in self-completion questionnaires (Dillman & Christian 2005). Given that both approaches were interviewer-administered, this is of less relevance to the SHS's change of approach.

Third, and perhaps most importantly, are differences relating to whether information is transmitted visually or not. For example, interviewing by telephone normally involves the question and all possible answer categories being read out before respondents give their answer. This means that later answer categories are more likely to be remembered and chosen. This is known as a recency effect. In internet surveys and pen-and-paper self-completion, the opposite is the case: respondents are more likely to choose the first answer category presented (Dillman & Christian 2005). This is known as a primacy effect. The SHS has traditionally used a sizeable number of showcards, which help mitigate recency effects. Questions that previously used showcards are potentially liable to be affected by the change in approach, particularly when interviews were undertaken by telephone and no visual cues were available.

As well as primacy and recency effects, other factors related to the interviewer-respondent interaction could shape responses. Although both the traditional SHS approach and the revised approach were interviewer-administered, the interaction between interviewer and respondent will have been quite different – for example, in relation to: the level of trust built; how much respondents retain full attention throughout the hour-long interview; how easy it is for interviewers to pick up visual cues that questions have been misinterpreted or have not been fully understood; and whether other people in the household are influencing what answers are given.

A common concept used to understand survey response effects is 'satisficing' (Krosnick 1991). This is based on the idea that answering survey questions requires a significant amount of cognitive work. Depending on the respondent's ability, their motivation and the complexity of the question, respondents may take shortcuts in responding (de Leeuw, 2005).

Separating the impact of measurement error from differences in sample composition is not straightforward. This has been done in a variety of ways in the past, all of which have advantages and disadvantages:

  • Using an experimental design, where some respondents change mode during an interview (Heerwegh 2009). This approach is not suitable for studies of the general population like the SHS.
  • Comparison of estimates with external 'gold-standard' estimates (de Leeuw 2005; Kreuter, Presser & Tourangeau 2008). This approach relies on the availability of such estimates, from sources such as the census or unbiased administrative records.
  • Statistical modelling, with the aim of taking out any differences in sample composition and then comparing the results. This can be done by using regression modelling (Dillman et al 2009) or Propensity Score Matching (Lugtig et al, 2011).
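The third bullet, statistical adjustment for differences in sample composition, can be sketched in miniature. This is an illustrative example only, not SHS code: the two 'samples', the age covariate and the outcome are hypothetical, and the adjustment shown is a simple cell-based propensity reweighting rather than full Propensity Score Matching or regression modelling. The outcome is constructed to depend on age only, so the raw gap between modes here is pure composition.

```python
import random

random.seed(1)

def draw(young_share, n):
    # Outcome depends only on age, not on mode, so in this sketch any
    # raw gap between the two "modes" is purely compositional.
    rows = []
    for _ in range(n):
        young = random.random() < young_share
        outcome = random.random() < (0.6 if young else 0.3)
        rows.append((young, outcome))
    return rows

f2f = draw(0.30, 20_000)   # face-to-face sample: 30% young
tel = draw(0.10, 20_000)   # telephone sample under-represents the young

mean = lambda rows: sum(o for _, o in rows) / len(rows)
raw_gap = mean(f2f) - mean(tel)

# Cell-based propensity weighting: reweight telephone respondents so the
# age mix matches the face-to-face sample, then recompare the estimates.
share = lambda rows: sum(y for y, _ in rows) / len(rows)
w = {True: share(f2f) / share(tel),
     False: (1 - share(f2f)) / (1 - share(tel))}
adj_tel = sum(w[y] * o for y, o in tel) / sum(w[y] for y, _ in tel)
adj_gap = mean(f2f) - adj_tel

print(f"raw gap {raw_gap:.3f}  adjusted gap {adj_gap:.3f}")
```

Once composition is balanced the gap largely disappears; in real data, any gap that survives such adjustment is the candidate measurement (mode) effect, which is the logic applied in Chapter 6.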

In Chapter 6, we explore the impact of the change from face-to-face interviewing to using telephone and video on a range of different estimates in the SHS.


Mode of approach shapes patterns of response, which in turn influences the representativeness of the achieved sample. Lower response rates mean there is more potential for bias. However, the literature emphasises that this relationship is neither guaranteed nor straightforward: it differs between different types of survey, and non-response bias can differ considerably between different types of estimate within the same survey.

The mode of interview, on the other hand, will shape how people respond to survey questions and how accurate their answers are. It is hard to quantify measurement error without using an experimental design. In the previous literature, no mode is favoured as a low measurement error mode, and different modes are better suited for some types of question than others.


