Publication - Statistics

Scottish Surveys Core Questions 2017

Published: 2 Apr 2019

Official statistics publication on equality groups across a range of measures from harmonised questions in the major Scottish Government population surveys.

57 page PDF

3.1 MB

6 Technical Notes

6.1 Source Surveys and Core Questions

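Results from the three large-scale Scottish Government population surveys are published separately as National Statistics:

  • Scottish Crime and Justice Survey (SCJS)
  • Scottish Health Survey (SHeS)
  • Scottish Household Survey (SHS)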

Further information on Population Surveys in Scotland can be found on the SG website.

Since the beginning of 2012 each of the surveys has included a set of 20 core questions that provide information on the composition, characteristics and attitudes of Scottish households and adults across a number of topic areas including equality characteristics, housing, employment and perceptions of health and crime. Responses to these questions from all three surveys have been pooled to provide the Scottish Surveys Core Questions (SSCQ) dataset with a sample size of around 20,000 responses.

Full details of the harmonised questions are available online and questionnaires are provided on the websites of each of the individual surveys.

Due to the different sampling design of each survey, which is necessary to meet their primary aims, the number of respondents varies between different SSCQ questions. The questions were therefore batched into three groups (household questions, individual questions and crime questions), and three different sets of weights were calculated to ensure representative results. Sampling, weighting and pooled sample numbers are described separately for each survey below.

Scottish Crime and Justice Survey (SCJS) technical notes

Sampling, survey response and weighting are described in full in the SCJS technical report. Briefly, the survey consists of a simple random sample, designed to achieve a robust sample at national and subgroup level. The target sample size at national level is 6,000 interviews per year. One random adult per household is interviewed and asked all SSCQ and SCJS questions.

Scottish Health Survey (SHeS) technical notes

Sampling, survey response and weighting are described in full in the SHeS 2017 technical report. The SHeS sample is clustered in each calendar year and unclustered over four years. All adults and up to two children in each household are eligible for interview. Only one adult in each household was asked the crime and household questions, to remain in line with the SCJS sampling procedure. The SHeS sample is boosted by participating health boards. It is further boosted to interview children in further households. These households are excluded from the SSCQ dataset.

Scottish Household Survey (SHS) technical notes

Sampling, survey response and weighting are described in full in the SHS technical report. The SHS consists of a simple random sample with a target minimum effective sample size of 250 per local authority. The SSCQ household questions are answered by the highest income householder or their spouse/partner, and one adult is randomly selected to answer the individual and crime questions, in line with the other two surveys.

6.2 Weighting

Datasets from the three source surveys were combined into three new SSCQ datasets: SSCQ household variables (19,220 responses), SSCQ individual variables (18,984 responses) and SSCQ crime variables (17,756 responses), see Table 19.

Each variable response category in each of the surveys carries a different design effect. If we were solely seeking the most efficient estimate for each variable separately, then separate scale factors could be derived for each one. However, this would restrict the use of the dataset. Rather, for each constituent survey dataset the design effects were estimated for each category and then the median design effect over all categories was used as the representative design effect of that survey. These design effects were then used along with the sample sizes to calculate the effective sample sizes (neff) and scaling factors for combining the three datasets.
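As a rough illustration of the step described above, the sketch below takes a set of per-category design effects for one survey, uses their median as the representative design effect, and divides the sample size by it. The relation neff = n / deff is the standard definition of effective sample size and is assumed here rather than quoted from this report. The design effect values and the function name are hypothetical.

```python
from statistics import median


def effective_sample_size(n_responses, category_deffs):
    """Return (representative design effect, neff) for one survey dataset.

    Assumption: neff = n / deff, with the median of the category design
    effects used as the representative design effect of the survey.
    """
    representative_deff = median(category_deffs)
    return representative_deff, n_responses / representative_deff


# Hypothetical per-category design effects, for illustration only.
example_deffs = [1.05, 1.10, 1.20, 1.35, 1.60]
deff, neff = effective_sample_size(5475, example_deffs)
print(f"median deff = {deff:.2f}, neff = {neff:.1f}")
```

Using a single representative design effect per survey keeps one set of scaling factors for the whole dataset, at the cost of some efficiency for individual variables.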

Table 19: Numbers of sample and effective sample pooled from the source surveys

Responses | SCJS sample | SCJS neff | SHeS sample | SHeS neff | SHS sample | SHS neff | SSCQ sample | SSCQ neff
Household responses[1] | 5,475 | 4,970 | 3,062 | 1,879 | 10,683 | 8,793 | 19,220 | 15,642
Individual responses[2] | 5,475 | 4,110 | 3,697 | 2,107 | 9,812 | 6,430 | 18,984 | 12,647
Crime responses[3] | 5,475 | 3,959 | 2,469 | 1,135 | 9,812 | 6,264 | 17,756 | 11,358

To combine the data the scale factors were applied to the grossing weights for the individual surveys (described in section 6.1). The neff of each survey contribution formed the basis for the scaling factors: 

survey A weight scaling factor = neff (surveyA) / (sum of three survey neffs).

The weights were then re-scaled to be proportionate to the effective sample size contribution of each survey and used as pre-weights. The three pooled SSCQ datasets were then weighted again to be representative of population estimates. See SSCQ Weighting tables.
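The scaling factor formula above can be sketched as follows, using the household neff values from Table 19. The assignment of those values to SCJS, SHeS and SHS is an assumption based on the column order of the table, and the function names and the example grossing weights are hypothetical. Multiplying each grossing weight by its survey's factor is one simple reading of how the pre-weights are formed, not a description of the production code.

```python
def scaling_factors(neff_by_survey):
    """Scale factor per survey = its neff / sum of all survey neffs."""
    total_neff = sum(neff_by_survey.values())
    return {survey: neff / total_neff for survey, neff in neff_by_survey.items()}


def pre_weights(grossing_weights, factor):
    """Apply one survey's scale factor to its grossing weights (assumed step)."""
    return [weight * factor for weight in grossing_weights]


# Household neff values from Table 19; survey labels are assumed.
household_neff = {"SCJS": 4970, "SHeS": 1879, "SHS": 8793}
factors = scaling_factors(household_neff)

for survey, factor in factors.items():
    print(f"{survey}: {factor:.3f}")

# Hypothetical grossing weights for three SHeS households.
print(pre_weights([410.0, 655.5, 380.2], factors["SHeS"]))
```

By construction the three factors sum to one, so each survey contributes to the pooled estimate in proportion to its effective sample size.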

6.3 Confidence Interval Calculations

All three source surveys are stratified to ensure sufficient sample sizes in smaller local authorities. SHeS is clustered in each annual fieldwork period. While the effect of this clustering cancels out over each four-year period, it must be accounted for when producing annual results.

Confidence intervals have been calculated using a method to account for stratification and clustering (surveyfreq in SAS). Confidence intervals across all subgroup estimates in SSCQ are provided in the accompanying supplementary tables.

Confidence intervals are plotted around point estimates on all charts and figures in this report. If the intervals surrounding two different point estimates do not overlap then there is a significant difference between the two points, but if they do overlap it does not necessarily mean there is no significant difference (see further guidance). In the report text the term “significant” refers to “statistically significant” differences.
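The point about overlapping intervals can be illustrated with a small sketch. It assumes two independent estimates with known standard errors and a normal approximation, which is a simplification of the design-based calculations used for this publication. The estimates, standard errors and function names are hypothetical.

```python
import math

Z_95 = 1.96  # two-sided 95% critical value of the standard normal


def interval(estimate, se):
    """95% confidence interval under a normal approximation."""
    return estimate - Z_95 * se, estimate + Z_95 * se


def intervals_overlap(est_a, se_a, est_b, se_b):
    """True if the two 95% confidence intervals share any values."""
    lo_a, hi_a = interval(est_a, se_a)
    lo_b, hi_b = interval(est_b, se_b)
    return lo_a <= hi_b and lo_b <= hi_a


def difference_is_significant(est_a, se_a, est_b, se_b):
    """z-test on the difference of two independent estimates."""
    z = abs(est_a - est_b) / math.sqrt(se_a ** 2 + se_b ** 2)
    return z > Z_95


# Hypothetical proportions with their standard errors.
group_a = (0.50, 0.02)
group_b = (0.56, 0.02)
print(intervals_overlap(*group_a, *group_b))
print(difference_is_significant(*group_a, *group_b))
```

In this example the two intervals overlap, yet the test on the difference is significant, because the standard error of a difference is smaller than the sum of the two individual standard errors.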

A comparison of estimates of key variables across the three constituent surveys and the SSCQ is provided in Annex A.

6.4 Statistical Disclosure Control

All estimates based on one or two respondents and displayed in main and supplementary tables have been denoted with ‘*’ to safeguard the confidentiality of respondents with rare characteristics. Cells with true zero counts are denoted with ‘.’ throughout, unless denoted ‘*’ as part of disclosure control. 

For individual variables crossed with individual variables (e.g. Ethnic group by Religion), further cells with zero or low respondent numbers in the same row and column as the single response have also been suppressed with ‘*’ to ensure confidentiality. For household and geographic variables, only one further cell in the same row was suppressed, as these cross-tabulations are not transposed. 
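The primary suppression rule described in this section can be sketched as a small formatting function. It covers only the marking of cells based on one or two respondents and of true zero counts; the secondary suppression of further cells in the same row and column is not implemented. The function name and the one-decimal display format are assumptions.

```python
def display_cell(estimate, n_respondents):
    """Format one table cell under the primary disclosure control rule.

    '*' marks estimates based on one or two respondents,
    '.' marks true zero counts; other cells show the estimate.
    """
    if n_respondents in (1, 2):
        return "*"
    if n_respondents == 0:
        return "."
    return f"{estimate:.1f}"  # display precision is an assumption


# Hypothetical row of (estimate, respondent count) pairs.
row = [(12.34, 50), (0.4, 2), (0.0, 0)]
print([display_cell(est, n) for est, n in row])
```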

6.5 Presentation of Data on Country of Birth

Due to errors in coding survey fieldwork, the country of birth of individuals born outside the UK countries and Ireland was not recorded for ~400 respondents of the Scottish Crime and Justice Survey in 2017. This complicated their assignment to the “Rest of the EU” or “Rest of the World” country of birth category. We assigned respondents with “White: Polish” ethnicity to the “Rest of the EU” category, based on the country of birth of nearly all other survey respondents with this characteristic. We imputed the remaining 331 respondents’ country of birth category with a logistic regression model based on correlating variables (ethnic group, religion, tenure, age, urban-rural area). Those born in Ireland were excluded from the “Rest of the EU” group prior to the logistic regression as they had been correctly coded.

6.6 Presentation of Data on Religion

Table 20: Grouping of religion in the SSCQ 2017

Base Collection Categories | Sample | SSCQ Groups | Sample
None | 8845 | None | 8845
Church of Scotland | 5237 | Church of Scotland | 5237
Roman Catholic | 2577 | Roman Catholic | 2577
Other Christian | 1637 | Other Christian | 1637
Muslim | 201 | Muslim | 201
Buddhist | 55 | Other | 423
Sikh | 28 | |
Jewish | 30 | |
Hindu | 63 | |
Pagan | 26 | |
Another religion | 221 | |
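The grouping in Table 20 amounts to a lookup from collection category to SSCQ group. The sketch below is one way to express it and to check that the grouped sample sizes match the table; the variable names are hypothetical, and the counts are those shown in Table 20.

```python
from collections import Counter

# Collection category -> SSCQ group, as laid out in Table 20.
SSCQ_RELIGION_GROUP = {
    "None": "None",
    "Church of Scotland": "Church of Scotland",
    "Roman Catholic": "Roman Catholic",
    "Other Christian": "Other Christian",
    "Muslim": "Muslim",
    "Buddhist": "Other",
    "Sikh": "Other",
    "Jewish": "Other",
    "Hindu": "Other",
    "Pagan": "Other",
    "Another religion": "Other",
}

# Sample sizes by collection category, from Table 20.
collected = {
    "None": 8845, "Church of Scotland": 5237, "Roman Catholic": 2577,
    "Other Christian": 1637, "Muslim": 201, "Buddhist": 55, "Sikh": 28,
    "Jewish": 30, "Hindu": 63, "Pagan": 26, "Another religion": 221,
}

grouped = Counter()
for category, count in collected.items():
    grouped[SSCQ_RELIGION_GROUP[category]] += count

print(dict(grouped))
```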

6.7 Presentation of Data on Ethnic Group

Table 21: Grouping of ethnic group in the SSCQ 2017

Base Collection Categories | Sample | SSCQ Groups | Sample
A - White - White Scottish | 14908 | White: Scottish | 14908
A - White - Other British | 2428 | White: Other British | 2428
A - White - Polish | 281 | White: Polish | 281
A - White - Irish | 171 | White: Other | 709
A - White - Gypsy/Traveller | 4 | |
A - White - Any other white ethnic group | 534 | |
C - Asian, Asian Scottish or Asian British - Pakistani, Pakistani Scottish or Pakistani British | 109 | Asian | 355
C - Asian, Asian Scottish or Asian British - Indian, Indian Scottish or Indian British | 111 | |
C - Asian, Asian Scottish or Asian British - Bangladeshi, Bangladeshi Scottish or Bangladeshi British | 9 | |
C - Asian, Asian Scottish or Asian British - Chinese, Chinese Scottish or Chinese British | 65 | |
C - Asian, Asian Scottish or Asian British - Other Asian, “Asian” Scottish or “Asian” British | 61 | |
B - Mixed or Multiple Ethnic Group - Any mixed or multiple ethnic groups | 42 | All other ethnic groups | 269
D - African - African, African Scottish or African British | 73 | |
D - African - Other African background | 21 | |
E - Caribbean or Black - Caribbean, Caribbean Scottish or Caribbean British | 7 | |
E - Caribbean or Black - Black, Black Scottish or Black British | 7 | |
E - Caribbean or Black - Other Caribbean or Black background | 2 | |
F - Other Ethnic Group - Arab, Arab Scottish or Arab British | 30 | |
F - Other Ethnic Group - Other | 87 | |

6.8 Mental Wellbeing Scoring

Wellbeing is measured in the Scottish Health Survey using the Warwick–Edinburgh Mental Wellbeing Scale (WEMWBS) questionnaire[2]. It has 14 items designed to assess positive affect (optimism, cheerfulness, relaxation), satisfying interpersonal relationships and positive functioning (energy, clear thinking, self-acceptance, personal development, mastery and autonomy).[3] The scale uses positively worded statements with a five-point response scale ranging from '1 - none of the time' to '5 - all of the time'. The total score is the sum of these responses across the 14 questions. The scale therefore runs from 14 for the lowest levels of mental wellbeing to 70 for the highest.

SWEMWBS is a shortened version of WEMWBS which is Rasch compatible. This means the seven items included have undergone a more rigorous test for internal consistency than the 14 item scale and have superior scaling properties. The seven items relate more to functioning than to feeling and therefore offer a slightly different perspective on mental wellbeing[4]. However, the correlation between WEMWBS and SWEMWBS is high at 95.4%. The SWEMWBS scale runs from seven for the lowest levels of mental wellbeing to 35 for the highest.

SWEMWBS statements are as follows:

  • I've been feeling optimistic about the future
  • I've been feeling useful
  • I've been feeling relaxed
  • I've been dealing with problems well
  • I've been thinking clearly
  • I've been feeling close to other people
  • I've been able to make up my own mind about things
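A minimal sketch of the raw SWEMWBS score for the seven statements above, assuming each is answered on the same 1 to 5 response scale as WEMWBS and that the raw score is the simple sum. The function name is hypothetical, incomplete responses are rejected here as a simplifying assumption, and the metric conversion applied to published scores is not reproduced.

```python
N_ITEMS = 7
VALID_RESPONSES = (1, 2, 3, 4, 5)  # '1 - none of the time' to '5 - all of the time'


def raw_swemwbs_score(responses):
    """Sum the seven item responses, giving a raw score from 7 to 35."""
    if len(responses) != N_ITEMS:
        raise ValueError("SWEMWBS needs exactly seven item responses")
    if any(r not in VALID_RESPONSES for r in responses):
        raise ValueError("each response must be an integer from 1 to 5")
    return sum(responses)


print(raw_swemwbs_score([4, 3, 3, 4, 4, 5, 4]))
print(raw_swemwbs_score([3] * 7))  # same answer to every item
```

A respondent who gives the same answer to every statement always obtains a raw score that is a multiple of seven.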

Peaks at multiples of seven are produced by column effects, where respondents are more likely to place answers down a column giving the same response for each question. SWEMWBS scores undergo a metric conversion[5] to correct somewhat for this effect and produce a distribution that is closer to normal, also reducing the boundary effect at the scale maximum of 35. 

6.9 Age Standardisation

When comparing sub-groups for a variable on which age has an influence, differences in age distributions are likely to affect any observed differences between groups. Age standardisation enables groups to be compared after adjusting for the effects of differences in their age distributions. 

Age standardisation was carried out using the direct standardisation method: the age distribution of sub-groups was adjusted to the mid-2017 population estimates for Scotland. All age standardisation has been undertaken separately for each gender. 

The age-standardised proportion p' was calculated as follows, where p_i is the age-specific proportion in age group i and N_i is the standard population size in age group i:

$$p' = \frac{\sum_i N_i \, p_i}{\sum_i N_i}$$

Therefore p' can be viewed as a weighted mean of the p_i using the weights N_i.

Age standardisation was carried out using the age groups 16-24, 25-34, 35-44, 45-54, 55-64, 65-74, and 75 and over, each broken down by gender. 

The variance of the standardised proportion can be estimated by:

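$$\mathrm{var}(p') = \frac{\sum_i N_i^2 \, \mathrm{var}(p_i)}{\left(\sum_i N_i\right)^2}$$

where var(p_i) is the estimated variance of the age-specific proportion in age group i.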

The populations used for age standardisation are the same as those used for weighting. See the associated Weighting Base tables for details.
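The direct standardisation described in this section can be sketched as follows: the standardised proportion is the mean of the age-specific proportions weighted by the standard population sizes, and its variance is the correspondingly weighted sum of the age-specific variances. The variance step assumes the age-group estimates are independent. The function name and all input values are hypothetical.

```python
def direct_standardise(proportions, standard_pops, variances):
    """Return (standardised proportion, variance) by direct standardisation.

    proportions[i]   : age-specific proportion in age group i
    standard_pops[i] : standard population size in age group i
    variances[i]     : estimated variance of proportions[i]
    Assumes independent age-group estimates.
    """
    total_pop = sum(standard_pops)
    p_std = sum(n * p for n, p in zip(standard_pops, proportions)) / total_pop
    var_std = sum(n ** 2 * v for n, v in zip(standard_pops, variances)) / total_pop ** 2
    return p_std, var_std


# Hypothetical inputs for the seven age groups, 16-24 up to 75 and over.
p = [0.10, 0.12, 0.15, 0.18, 0.22, 0.25, 0.30]
N = [300, 360, 330, 390, 350, 280, 220]  # standard population (thousands)
var_p = [0.0004, 0.0003, 0.0003, 0.0003, 0.0004, 0.0005, 0.0007]

p_std, var_std = direct_standardise(p, N, var_p)
print(f"standardised proportion = {p_std:.4f}, standard error = {var_std ** 0.5:.4f}")
```

In practice this would be run separately for each gender, using that gender's standard population in each age group.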

6.10 Statistical Tests

Statistical tests are used throughout this publication to determine whether apparent differences are statistically significant. 

For ordinal or categorical variables, a logistic regression model is used to determine whether differences between subgroups are statistically significant. Testing is relative to a “reference group” which is always the largest subgroup (see Guide to this report). This is performed using proc surveylogistic in SAS to account for the complex design of SSCQ.

To determine changes over time we use a similar technique, coding data years as a continuous integer variable. 

  • Change “from 2016” excludes data prior to 2016 and regresses year against the indicator variable overall or within subgroup domains or geographical areas. 
  • Change “from 2012” (or 2014) retains all data years (i.e. not testing 2012 (or 2014) against 2016) and indicates whether a trend exists over the longer time base.

To determine whether a change over time is statistically significant, we examine adjusted chi-squared statistics and odds ratio confidence limits. We require 95% confidence: the p-value must be below 0.05, and the odds ratio confidence interval, which indicates the strength of the signal, must exclude the value of 1 (lying entirely above or entirely below equal odds) at the same 95% level. Cases where the two indicators disagree (i.e. where the odds ratio interval includes the value of 1 but the p-value is below 0.05, or the p-value exceeds 0.05 but the signal is strong) are taken not to be statistically significant.
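The decision rule above can be written as a short function: a change counts as statistically significant only when the p-value and the odds ratio confidence interval agree. The function name and the example values are hypothetical, and the inputs are assumed to come from the survey-weighted regression output.

```python
def change_is_significant(p_value, or_lower, or_upper, alpha=0.05):
    """Apply the two-indicator rule for change over time.

    Significant only if the p-value is below alpha AND the odds ratio
    confidence interval lies entirely above or entirely below 1.
    """
    p_value_significant = p_value < alpha
    interval_excludes_one = or_lower > 1.0 or or_upper < 1.0
    return p_value_significant and interval_excludes_one


# Hypothetical results: (p-value, lower OR limit, upper OR limit).
print(change_is_significant(0.01, 1.05, 1.30))  # both indicators agree
print(change_is_significant(0.04, 0.99, 1.25))  # interval includes 1
print(change_is_significant(0.07, 1.02, 1.40))  # p-value too large
```

Requiring both conditions makes the rule conservative: any disagreement between the two indicators is treated as no significant change.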

SWEMWBS is the only continuous indicator variable in SSCQ. A regression analysis is implemented using SAS proc surveyreg to account for the complex survey design. Testing is relative to a reference category which is always the most populated subgroup in the domain.

Formal testing between subnational geographies is produced using contrasts to compare the area in question with the combined total of all other areas.