Scottish Environmental Attitudes and Behaviours Survey 2008 - Technical Report

2 SAMPLING APPROACH

2.1 This chapter details the sampling approach used and the weighting methodology, and summarises how the achieved sample matches estimates from the Scottish Household Survey (SHS). However, before turning to the detail of the sampling approach, it is worth outlining the long-running debate on the relative merits of quota and random pre-selected sampling approaches.

Random versus quota sampling

2.2 The main difference between random and quota sampling is the way in which respondents are selected to be included in a survey. In a random sample, the responding unit - the household and/or person within the household - is chosen at random. Therefore, each member of the study population has a known chance of being selected. It is generally assumed that a representative sample will be achieved if this method of sampling is used. As the selection of respondents is determined in advance and independent of respondent characteristics, the sample will be representative and provide reliable estimates on all variables. However, this is dependent on a high response rate, which in practice is not always achieved.

2.3 Quota samples use a different approach to selecting respondents and achieving representativeness. In quota sampling, representativeness is defined as achieving a sample that matches the population on a relatively small number of known population characteristics such as age, sex and working status. It is then assumed that a sample that is representative on these characteristics is also representative on other unknown characteristics - which for this survey are attitudes and behaviours in relation to the environment.

2.4 Whereas random samples should provide representativeness across all variables, quota samples can really only ensure representativeness on the variables used as quota controls and on variables that are closely correlated with them. On other variables, quota samples rely on an assumption of more general representativeness. Since respondents are selected into a quota sample on the basis of a relatively small number of characteristics, it is possible for a quota sample to match the population on these characteristics but to vary from population characteristics on other, non-quota characteristics. The extent to which this happens can be checked by comparing non-quota variables with external sources such as census data.

2.5 The advantage probability sampling has over quota sampling is that it is underpinned by classical survey sampling theory, which requires no further assumptions to be made about the sampling process. This theory can be used to demonstrate why probability samples are free of sample selection bias, and to calculate standard errors, confidence intervals, and the significance of differences between estimates.
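As an illustration of the calculations that sampling theory supports, the textbook formulas for a proportion estimated from a simple random sample are sketched below. They are not taken from this report: the symbols \hat{p} (estimated proportion) and n (sample size) and the 1.96 multiplier for a 95% interval are assumptions introduced here, and the formulas ignore clustering, stratification and weighting.

```latex
\[
\mathrm{SE}(\hat{p}) = \sqrt{\frac{\hat{p}\,(1-\hat{p})}{n}},
\qquad
\text{95\% CI: } \hat{p} \pm 1.96\,\mathrm{SE}(\hat{p})
\]
```

For example, with a hypothetical estimate of 50% from a simple random sample of 1,000, the standard error is about 0.016 and the 95% interval is roughly plus or minus 3 percentage points.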

2.6 If we were to appeal to sampling theory, we would suppose that a random sample would provide more reliable estimates than a quota sample. However, sampling theory is based on a perfect survey - one unaffected by problems of sample coverage, non-response, interviewer errors, sampling errors and so on. In practice, random samples are affected by these issues and these need to be considered when comparing sampling methods.

2.7 In addition, while sampling theory does not acknowledge the potential of non-random sampling designs to provide reliable estimates, the research industry knows from decades of practice - such as repeated comparison of surveys of voting intention with actual election outcomes - that quota samples are capable of providing estimates that match actual population figures with a high degree of accuracy.

2.8 A number of studies have compared results obtained from quota surveys with those from random probability surveys and other trusted data sources 3. The overwhelming message from these studies is that data from quota and random probability samples are, in the main, comparable: most comparisons reported in the referenced studies showed no or small differences between sample types. These studies present evidence suggesting that the number of significant differences arising from comparisons between probability sample results and quota sample results is in line with chance expectation. While some real differences were found, most observed differences were not large enough to be of major practical concern given the purposes of the surveys.

2.9 Quota sampling methods may not have the full theoretical underpinnings of probability sampling methods, but they do have considerable empirical backing: in practice, they do generally work.

2.10 Quota samples also have one particular advantage over random pre-selected samples that is especially relevant to SEABS'08. As there is no requirement to make repeat visits to the same address, the number of miles that the interviewers need to travel - normally by car - to obtain the interviews is much less for quota surveys than for random pre-selected surveys. This means that a quota survey will have a much smaller carbon footprint than a random pre-selected survey of the same size.

Sampling approach

2.11 The survey was undertaken among a quota sample of the Scottish adult population (aged 16+), using a rigorous approach to ensure representativeness. The first stage of the sampling approach, the stratification and selection of Primary Sampling Units (PSUs), mirrors the processes used in probability designs. It is only at the second stage, where addresses would be selected in a probability sample, that the SEABS'08 sampling approach differs from a classic probability design.

2.12 The sample was drawn from the small user file of the Postcode Address File (PAF). This file contains addresses to which the Post Office delivers fewer than 25 items of mail a day and is the best available source for Scotland's household population. It was expanded using the Multiple Occupancy Indicator to equalise the probability of selection for dwellings within properties that appear only once on the PAF.

2.13 Datazones were employed as the primary sampling units because of their links with the urban/rural classification. A proportionate sample was drawn from all Datazones in mainland Scotland and the larger islands of Skye, Mull, Uist, Lewis and Harris, Islay, mainland Orkney and mainland Shetland. Sampling units were selected with probability proportionate to household population, stratified by region and within region by urban/rural classification.
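Selection with probability proportionate to size can be sketched as follows. The report does not say which selection algorithm was used, so this is an assumption: the sketch uses systematic selection down a list that has already been sorted by stratum (region, then urban/rural class), which is one common way to combine stratification with proportionate-to-size selection. The function name, the `(unit_id, size)` layout and the example Datazones are hypothetical.

```python
import random


def pps_systematic(units, n_select, seed=0):
    """Pick n_select units with probability proportional to size.

    units: list of (unit_id, size) pairs, assumed to be pre-sorted by
           stratum so that the fixed interval spreads the sample
           across strata (implicit stratification).
    """
    total = sum(size for _, size in units)
    interval = total / n_select                # sampling interval
    rng = random.Random(seed)
    start = rng.uniform(0, interval)           # random start in first interval
    targets = [start + k * interval for k in range(n_select)]

    selected = []
    cumulative = 0.0
    t = 0                                      # index of next target to hit
    for unit_id, size in units:
        cumulative += size
        # A unit is picked once for every target falling inside its
        # size range; units larger than the interval can repeat.
        while t < n_select and targets[t] < cumulative:
            selected.append(unit_id)
            t += 1
    return selected


# Hypothetical Datazones with household counts between 300 and 360.
datazones = [("dz%03d" % i, 300 + 10 * (i % 7)) for i in range(50)]
sample = pps_systematic(datazones, n_select=5)
print(sample)
```

Because each unit's chance of selection is its size divided by the interval, larger Datazones are more likely to be picked, which is the property described above.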

2.14 In order to minimise the clustering effect on the achieved sample, a relatively large number of primary sampling units were selected, with a relatively small target number of interviews set in each. Namely, 388 Datazones were selected with a target of 8 interviews in each 4.

2.15 Each Datazone was allocated a unique sample point number. Within each Datazone, interviewer quotas were set on three demographic variables and one key behavioural variable:

  • Sex (two bands: male and female)
  • Age (four bands: 16 to 24; 25 to 34; 35 to 54; and 55 and over)
  • Working status (two bands: working full time and not working full time)
  • Car ownership (two bands: car owned by household or no car in household) 5

Weighting

2.16 The data were weighted using rim weighting to ensure that the achieved sample was in line with the population in the sample frame on the quota variables. A small number of cases were missing information on the quota variables (N < 10 for each variable). In these instances, a weight of 1 was given to the case.
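Rim weighting (also called raking or iterative proportional fitting) adjusts the weights one variable at a time until every weighted margin matches its target. The sketch below is a minimal illustration and not the procedure actually run for SEABS'08: it uses only two of the four quota variables, a made-up sample of 100 cases, and target shares chosen to resemble the census profile. The function names and data layout are hypothetical, and the handling of cases with missing quota information is not covered.

```python
def rake(rows, targets, n_iter=50, tol=1e-6):
    """Return one weight per row so weighted margins match targets.

    rows:    list of dicts, e.g. {"sex": "male", "car": "car"}
    targets: {variable: {category: target share}}, shares sum to 1
    """
    weights = [1.0] * len(rows)
    for _ in range(n_iter):
        max_shift = 0.0
        for var, shares in targets.items():
            # Current weighted total in each category of this variable.
            totals = {}
            for row, w in zip(rows, weights):
                totals[row[var]] = totals.get(row[var], 0.0) + w
            grand = sum(totals.values())
            # Scale each category so its share equals the target share.
            factors = {c: shares[c] * grand / totals[c] for c in totals}
            weights = [w * factors[row[var]] for row, w in zip(rows, weights)]
            max_shift = max(max_shift,
                            max(abs(f - 1.0) for f in factors.values()))
        if max_shift < tol:      # margins have stopped moving
            break
    return weights


def weighted_share(rows, weights, var, category):
    """Weighted proportion of rows falling in the given category."""
    hit = sum(w for row, w in zip(rows, weights) if row[var] == category)
    return hit / sum(weights)


# Made-up sample: 49% male and 71% with a car before weighting.
cells = {("male", "car"): 36, ("male", "no car"): 13,
         ("female", "car"): 35, ("female", "no car"): 16}
rows = [{"sex": s, "car": c} for (s, c), k in cells.items() for _ in range(k)]

# Illustrative target shares for the two margins.
targets = {"sex": {"male": 0.47, "female": 0.53},
           "car": {"car": 0.66, "no car": 0.34}}

weights = rake(rows, targets)
print(round(weighted_share(rows, weights, "sex", "male"), 3))
print(round(weighted_share(rows, weights, "car", "car"), 3))
```

Only the margins are matched; the joint distribution of the variables is left as close to the observed sample as the margins allow, which is why rim weighting needs only one-way population totals.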

2.17 Table 2.1 shows the weighted and the non-weighted profile of the achieved sample on the four quota variables and compares these with the characteristics of the Scottish population given by the 2001 census. Women, those without a car available to the household, and those aged 25 to 34 years were slightly under-represented in the unweighted profile compared to the weighted profile.

2.18 Overall, as the quotas were almost always met, the effect of the weights was small, with the weights ranging from 0.81 to 1.41. As noted in the annex to this technical report, the effect of the weighting on the precision of the survey estimates is very small, and would increase the standard errors of survey estimates by 1% (a design factor of 1.01).
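One common way to gauge how much unequal weights inflate standard errors is Kish's approximation, sketched below. Whether the annex uses this formula is not stated here, so treat it as an assumption; the function name and the example weights are hypothetical and are not the survey's weights.

```python
import math


def kish_design_factor(weights):
    """Approximate design factor (deft) due to unequal weighting.

    Kish's approximation: deff = n * sum(w^2) / (sum(w))^2, and the
    design factor is its square root, i.e. the multiplier applied to
    standard errors.
    """
    n = len(weights)
    sum_w = sum(weights)
    sum_w2 = sum(w * w for w in weights)
    deff = n * sum_w2 / (sum_w * sum_w)
    return math.sqrt(deff)


# Equal weights leave standard errors unchanged.
print(kish_design_factor([1.0] * 10))

# Hypothetical weights spread modestly around 1.
print(kish_design_factor([0.9, 1.1] * 50))
```

A design factor of 1.01 means standard errors are multiplied by 1.01, which is the 1% increase quoted above.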

Table 2.1: Weighted versus unweighted profile of the sample

                            SEABS 2008      SEABS 2008      2001 census
                            Unweighted      Weighted        data
                            profile (%)     profile (%)     (%)

Sex
  Male                      49              47              47
  Female                    51              53              53

Car in household
  Car in household          71              66              66
  No car in household       29              34              34

Age
  16-24                     13              14              14
  25-34                     14              17              17
  35-54                     36              36              36
  55 and over               36              33              33

Economic status
  Working full time 6       42              42              42
  Not working full time     58              58              58

2.19 The survey was weighted to estimates from the 2001 census rather than any other potential sources of population information such as the Scottish Household Survey (SHS). The SHS suggests that car availability in Scotland has increased since 2001, with the latest published SHS figures 7 estimating car availability at 70%, compared to a figure of 66% in the 2001 census.

2.20 An alternative weighting strategy would have been to use the SHS estimate of car availability across Scotland to weight the data. The effect of using the SHS estimate of car availability in the weighting approach was tested on over 50 findings in SEABS'08.

2.21 Almost all estimates would be completely unaffected by such a change to the weighting approach. Where change would occur, it would be by one percentage point at most, with the exception of patterns of travel by car. The alternative strategy would change the estimate for the proportion of workers who drive to work from 56% to 58%, the proportion of people who drive to do their main grocery shopping from 54% to 57%, and the proportion of people who drive most days from 45% to 47%.

2.22 The weights used for the main report did not take account of non-response to the CASI section of the questionnaire, or of any differences in that non-response across the quota variables. Overall, 13% of respondents did not complete the CASI section of the SEABS'08 questionnaire. However, analysis suggested that the effect of differential non-response was minimal, with almost all estimates unaffected and the maximum effect being less than one percentage point 8.

Achieved sample

2.23 The target number of interviews for the survey was 3,000 and the total number achieved was 3,054.

2.24 As noted previously, since respondents are selected into a quota sample on the basis of a relatively small number of characteristics, it is important to confirm not only that the distribution of the quota variables matches the population, but also that non-quota characteristics are in line with other estimates. Table 2.2 shows the weighted profile of the sample on six other variables - tenure, method of travel to work, highest qualification, economic status, dwelling type and household type - that can be compared against the latest published results in the Scottish Household Survey 9. Overall, there is little difference in the estimates from each source. As Table 2.2 shows, the sample appears particularly robust with regard to highest qualification achieved, economic status and dwelling type.

2.25 Compared to the SHS, the largest differences are seen in tenure, travel to work, and household type. With regard to tenure, while the SHS gave an estimate of 66% for owner occupation, the estimate for SEABS'08 was slightly lower at 62%. Additionally, SEABS'08 under-estimates the proportion of people who drive to work (56% versus 63%) and the proportion of single pensioner households (11% versus 16%).
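The comparison described above amounts to taking the difference between the two columns of Table 2.2 and ranking the gaps. A minimal sketch, using a subset of rows copied from Table 2.2 (the variable names and the choice of rows are illustrative only):

```python
# (characteristic, category): (SEABS 2008 weighted %, SHS 2007 %)
# Subset of rows taken from Table 2.2.
profile = {
    ("Tenure", "Owner occupied"): (62, 66),
    ("Tenure", "Social rented"): (26, 24),
    ("Travel to work", "Walk"): (16, 12),
    ("Travel to work", "Drive"): (56, 63),
    ("Dwelling type", "House"): (68, 67),
    ("Household type", "Single pensioner"): (11, 16),
    ("Household type", "Small family"): (17, 13),
    ("Household type", "Large adult household"): (13, 9),
}

# Difference in percentage points: positive means SEABS is higher.
gaps = [(key, seabs - shs) for key, (seabs, shs) in profile.items()]

# Rank by absolute size of the gap, largest first.
gaps.sort(key=lambda item: abs(item[1]), reverse=True)

for (characteristic, category), diff in gaps:
    print(f"{characteristic:15s} {category:22s} {diff:+d}")
```

On these rows the two largest gaps are driving to work and single pensioner households, matching the differences highlighted in the text.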

2.26 The annex to the technical report presents further analysis of the representativeness of the achieved sample. It suggests that, "the quotas have gone a long way to making the sample representative" and, that while some differences remain, "these differences are small and unlikely to impact greatly on the results".

Table 2.2: Percentages on selected characteristics in achieved sample compared with SHS 2007 estimates

                                                SEABS 2008      SHS 2007
                                                Weighted        (%)
                                                profile (%)

Tenure
  Owner occupied                                62              66
  Social rented                                 26              24
  Private rented                                9               8
  Other                                         3               2

Travel to work
  Walk                                          16              12
  Drive                                         56              63
  Lift                                          6               6
  Bike                                          2               2
  Bus                                           12              12
  Rail                                          5               4
  Other                                         2               2

Highest qualification level
  Degree, Professional Qualification            25              26
  HNC/HND or equivalent                         9               11
  Higher, A level or equivalent                 14              17
  O Grade, Standard Grade or equivalent         26              23
  None/other                                    24              22
  Unknown                                       2               1

Economic status
  Self employed                                 5               6
  Full time employment                          37              35
  Part time employment                          9               11
  Looking after home/family                     8               7
  Permanently retired from work                 24              27
  Unemployed and seeking work                   6               3
  At school                                     1               1
  Higher/further education                      5               4
  Government work/training scheme               *               0
  Permanently sick or disabled                  4               5
  Unable to work due to short term ill-health   1               1
  Other                                         0               1

Dwelling type
  House                                         68              67
  Flat                                          31              32
  Other                                         1               1

Household type
  Single adult                                  14              16
  Small adult                                   17              17
  Single parent                                 5               6
  Single pensioner                              11              16
  Small family                                  17              13
  Older smaller household                       14              16
  Large adult household                         13              9
  Large family                                  9               7