CHAPTER 1: METHODOLOGY AND RESPONSE
Joan Corbett, Shanna Dowling, Kevin Pickering and Lisa Rutherford
1.1.1 The Scottish Health Survey series
The Scottish Health Survey (SHeS) series was established by the Scottish Office in 1995 to provide data about the health of the population living in private households in Scotland. The 1995 and 1998 surveys were carried out by the Joint Health Surveys Unit of the National Centre for Social Research and the Department of Epidemiology and Public Health, University College London Medical School (UCL). In 2003, a third organisation, the MRC Social and Public Health Sciences Unit at the University of Glasgow (MRC SPHSU) joined the consortium.
During 2005 and 2006 a comprehensive review of the survey was carried out by the then Scottish Executive.1 One of the key recommendations to emerge from the review was that the survey should be carried out on a more frequent basis. This recommendation was adopted and the survey began running continuously in 2008. A consortium made up of ScotCen Social Research, UCL and MRC SPHSU carried out the 2008-2011 surveys.
Each survey in the series consists of a set of core questions and measurements (for example, anthropometric and, if applicable, blood pressure measurements and analysis of blood and saliva samples), plus modules of questions on specific health conditions. As with the earlier surveys in the series, the principal focus of the 2011 survey was cardiovascular disease (CVD) and related risk factors. CVD is one of the leading contributors to the global disease burden. Its main components are ischaemic heart disease (IHD) and stroke. IHD is the second most common cause of death in Scotland after cancer.2 The SHeS series means that there are now trend data going back 16 years; providing the time series is an important function of the survey.
1.1.2 Key features of the survey methodology 2008-2011
A number of changes to the survey methodology were proposed during the series review and were adopted for the 2008-2011 surveys. The key changes to the survey methodology introduced in 2008 were:3
- Move to a continuous format
- Reduced Stage 2 nurse interview
- Core and modular questionnaire structure
- Unclustered sample design
- Optional NHS Health Board boost
1.1.3 The 2011 survey
The 2011 SHeS was designed to provide data at the national level about the population living in private households in Scotland. The 2011 survey covered all ages (0+).
An initial sample of 10,431 addresses was drawn from the Postcode Address File (PAF). These addresses comprised three sample types: 7,971 formed the main sample, at which adults and up to two children per household were eligible to be selected for interview; 1,944 addresses formed an additional child boost sample at which only households containing children aged 0-15 were eligible to participate (up to two children at these households were eligible to be interviewed); the remaining 516 addresses formed the Health Board boost sample at which only adults were eligible for interview. Fife and Grampian Health Boards opted to boost the number of adults (16+) interviewed in their area in 2011.
The 10,431 addresses were grouped into 439 interviewer assignments, with around 37 assignments being issued each month to interviewers between January 2011 and December 2011.
|Sample type||Number of addresses issued in 2011|
|Main||7,971|
|Child boost||1,944|
|Health Board boost||516|
|Total||10,431|
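The figures quoted above can be cross-checked with a few lines of arithmetic. This is only an illustrative sketch: the function and variable names are hypothetical, and the averages it derives are simple ratios, not figures reported by the survey.

```python
# Address counts and assignment total as quoted in the text for 2011.
ISSUED_2011 = {"main": 7971, "child boost": 1944, "Health Board boost": 516}
ASSIGNMENTS_2011 = 439


def summarise_issue(issued, n_assignments, n_months=12):
    """Return the total issued and the implied averages (illustrative only)."""
    total = sum(issued.values())
    return {
        "total_addresses": total,
        "assignments_per_month": n_assignments / n_months,
        "addresses_per_assignment": total / n_assignments,
    }


summary = summarise_issue(ISSUED_2011, ASSIGNMENTS_2011)
print(summary)
```

The three sample types sum to the 10,431 addresses issued, and 439 assignments spread over 12 months is consistent with the "around 37" per month quoted above.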
Data collection involved a Stage 1 interview, and if applicable, adults also had a follow-up interview from a specially trained nurse. Of the 7,971 main addresses issued, 2,180 were flagged as the 'nurse sample'. At these addresses all adults (16+) interviewed at Stage 1 were eligible to take part in the Stage 2 follow-up nurse interview. There were no nurse interviews at the remaining addresses or for the child boost or health board boost samples.
1.1.4 The 2011 reports
The 2011 SHeS report consists of three volumes, published as a set as 'The Scottish Health Survey 2011'. Volume 1 presents results for adults; Volume 2 presents results for children and Volume 3 provides methodological information and survey documentation. All three volumes are available on the Scottish Government's SHeS website along with a short summary report of the key findings from Volumes 1 and 2 (www.scotland.gov.uk/scottishhealthsurvey).
1.1.5 Comparisons with previous surveys in the SHeS series
This report is based on data collected in all the survey years to date (1995, 1998, 2003, and 2008 to 2011). It takes advantage of the continuous sample design since 2008 to include analysis based on a number of pooled datasets:
- The 2008, 2009, 2010 and 2011 surveys combined - this enables more detailed analysis of sub-groups to be conducted, for example by age group or socio-economic groups.
- The 2008/2009 and 2010/2011 surveys combined - these enable short-term trends to be examined, while still providing greater precision for the estimates than is the case with the single years' figures.
- The 2009 and 2011 surveys combined - some topics, such as accidents, were only included in the 2009 and 2011 survey years. The combined sample allows more detailed reporting of sub-group differences.
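The gain in precision from pooling can be illustrated with the textbook standard error of a proportion. The sketch below assumes simple random sampling and ignores the weighting and design effects that apply to SHeS estimates; the prevalence and sample sizes are hypothetical values chosen only for illustration.

```python
import math


def se_proportion(p, n):
    """Standard error of a proportion under simple random sampling."""
    return math.sqrt(p * (1 - p) / n)


# Hypothetical prevalence and sample sizes, not survey results.
single_year = se_proportion(0.25, 1000)
two_years = se_proportion(0.25, 2000)  # two years of equal size pooled

# Doubling the sample size shrinks the standard error by a factor of sqrt(2).
ratio = single_year / two_years
print(round(ratio, 3))
```

Under these simplifying assumptions, combining two years of similar size narrows confidence intervals by roughly 30%, which is why pooled datasets support finer sub-group analysis.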
1.1.6 Health Board level analysis
The sample for the 2008-2011 surveys was designed to produce a large enough sample to allow analysis at NHS Health Board level for all Boards every four years. The publication of the 2011 data gives us the first opportunity since 2003 to publish results for all fourteen NHS Boards in Scotland. This report is accompanied by a set of web tables and an interactive mapping tool breaking down the key results by NHS Board. The web tables can be accessed via the Scottish Health Survey website (www.scotland.gov.uk/scottishhealthsurvey).
1.1.7 Access to data
The 2011 SHeS data will be deposited at the Data Archive at the University of Essex, from where earlier years' datasets and combined years datasets can also be obtained (www.data-archive.ac.uk).
1.2 Sample design
1.2.1 Overview of the main sample
The 2011 survey followed the methodology of the 2008, 2009 and 2010 surveys in using a two-stage stratified probability sampling design with datazones selected at the first stage and addresses (delivery points) at the second.
Three samples were selected for the survey:
1. a general population (main) sample in which all adults (16+) and up to two children (aged 0-15) were eligible to be selected in each household;
2. a child boost sample in which up to two children (aged 0-15) were eligible to be selected in each household; and,
3. a Health Board boost sample in which all adults (16+) were eligible to be selected in each household (in Grampian and Fife).
The sample of addresses was selected from the small user Postcode Address File (PAF). This is a list of nearly all the residential addresses in Scotland and is maintained by The Royal Mail. The population surveyed was therefore people living in private households in Scotland. People living in institutions, who are likely to be older and, on average, in poorer health than those in private households, were not covered. The very small proportion of households living at addresses not on the PAF was not covered.
All areas of Scotland where fieldwork could feasibly be carried out were covered, but some inhabited islands with very small populations were excluded. The inhabited islands that were included were mainland Orkney, mainland Shetland, Lewis, Harris, Skye, Bute, Islay, Mull and Arran.
1.2.2 Selecting the core sample
Twenty-five strata were created - each of the three island Health Boards (Orkney, Shetland, and Western Isles) was a stratum, and 22 other strata were constructed by dividing the 11 mainland Health Boards into separate strata containing "deprived" and "non-deprived" datazones. A deprived area was defined as being within the most deprived 15% of areas according to the 2009 Scottish Index of Multiple Deprivation.4 Having these separate strata allowed us to over-sample deprived areas.
The sampling was constructed so that each year's sample was clustered but the four-year sample for 2008-2011 was unclustered. This meant that the design of the four-year sample had to be considered at the start of the four-year period. However, it was not possible to select a sample of addresses at the start of the period. Had this been done, the sample in the later years would probably have contained a high proportion of ex-residential addresses (i.e. demolitions and conversions to other uses), and any new residential properties built over the four-year period would not have been included. The solution was to sample datazones for the four-year period and sample addresses each year. The following sampling procedure was used:
- (i) Firstly, the numbers of addresses needed to be issued in each stratum over a four-year period were calculated.
- (ii) Next, the number of addresses needed to be sampled in each datazone over the four-year period to achieve the numbers in (i) was calculated.
- (iii) To ensure that each year's sample was geographically clustered the datazones were put into batches, with each batch containing datazones geographically close to each other. A quarter of the batches (approximately) were randomly assigned to each of the four survey years.
- (iv) In each year addresses were selected from the batches assigned to that year and once the addresses were chosen they were clustered into interviewer assignments. Each assignment consists of approximately 20 addresses.
- (v) Finally, each assignment was allocated at random to a quarter, and then to a survey month. (Year 4 consisted of 12 survey months - January to December).
The random assignments (of batches of datazones to years, and of interviewer assignments to quarters and months) were not implemented using simple random samples, but by using systematic random list samples chosen after ordering the list by the SIMD 2006 variable. This ensured an even spread of addresses (by deprivation variable) among the four years, and within each year by month.
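A systematic random allocation of this kind can be sketched as follows. This is a minimal illustration rather than the survey's actual procedure: the function name, the batch labels and the deprivation scores are all hypothetical. The list is ordered by score and then dealt cyclically into groups from a random starting point, so that each group is a systematic sample of the ordered list.

```python
import random


def systematic_split(items, score, n_groups, rng):
    """Order items by score, then deal them cyclically into n_groups.

    Each group is a systematic sample (interval n_groups) of the ordered
    list, so every group spans the full range of scores.
    """
    ordered = sorted(items, key=score)
    start = rng.randrange(n_groups)  # random start gives the random element
    groups = [[] for _ in range(n_groups)]
    for position, item in enumerate(ordered):
        groups[(start + position) % n_groups].append(item)
    return groups


rng = random.Random(2011)  # fixed seed so the sketch is reproducible
# Hypothetical batches as (label, deprivation score) pairs.
batches = [(f"batch{i:02d}", s) for i, s in enumerate(range(5, 85, 5))]
year_groups = systematic_split(batches, score=lambda b: b[1], n_groups=4, rng=rng)

for year, group in enumerate(year_groups, start=1):
    print(year, [label for label, _ in group])
```

Because consecutive items in the ordered list always go to different groups, each survey year receives batches from across the whole deprivation range, which a simple random split would not guarantee.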
The next sections describe the process of (i) - (v) in more detail.
(i) and (ii) Sample sizes
The survey was designed to allow analysis at Health Board level and SIMD15 level every four years. In order to do this the sampling fraction (the proportion of addresses sampled) varied by Health Board and SIMD15 area. Smaller Health Boards and the SIMD15 areas were over-sampled. The sampling fraction also varied according to expected response rate (areas with an expected low response rate were over-sampled).
The number of addresses initially planned to be issued over a four-year period is given below. These figures were calculated based on assumptions made about response rates, and were therefore modified once the Year 1 data were collected.
Figure 1A: Number of main-sample addresses selected in each Health Board (initial 4-year allocation)
|Health Board||Non-deprived datazone||Deprived datazone||Total|
|Ayrshire & Arran||1285||298||1583|
|Dumfries & Galloway||809||67||876|
|Greater Glasgow & Clyde||4020||2652||6672|
The number of addresses that needed to be sampled from each datazone was proportional to the size of the datazone (typically 3-5 addresses would be chosen in each datazone). Choosing the number proportional to the size ensured that within each stratum each address had an equal probability of being chosen. The selection probabilities varied by stratum.
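The equal-probability property follows directly from the proportional allocation: if a datazone with N addresses contributes f x N sampled addresses, each address is selected with probability (f x N) / N = f. The sketch below checks this with hypothetical datazone sizes and a hypothetical sampling fraction; exact fractions are used to avoid rounding noise.

```python
from fractions import Fraction


def addresses_per_datazone(datazone_sizes, sampling_fraction):
    """Number of addresses to draw in each datazone, proportional to its size."""
    return {dz: size * sampling_fraction for dz, size in datazone_sizes.items()}


# Hypothetical stratum: datazone label -> number of addresses on the PAF.
stratum = {"dz_a": 300, "dz_b": 400, "dz_c": 500}
fraction = Fraction(1, 100)  # hypothetical sampling fraction for this stratum

sampled = addresses_per_datazone(stratum, fraction)
probabilities = {dz: sampled[dz] / stratum[dz] for dz in stratum}

print(sampled)        # 3, 4 and 5 addresses respectively
print(probabilities)  # every address has probability 1/100
```

In practice the number drawn per datazone has to be a whole number, so the equality holds only approximately once rounding is applied.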
(iii) Assigning datazones to batches
The datazones were then grouped into 1,865 initial batches, each consisting of datazones geographically close to each other. Each batch was chosen so that it was small enough to form an assignment. A typical batch consisted of 3-5 datazones and contained approximately 13-18 addresses.
The mean SIMD 2006 score of the datazones in each batch was used as a measure of deprivation of the batch. Within each Health Board the batches were ordered according to their deprivation measures and put into groups of four batches. One batch from each group was then randomly allocated to each of the four years. This ensured that each year's sample would be representative of Scotland as a whole.
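The grouping step can be sketched as below. The function name, batch labels and scores are hypothetical, and the report does not say how an incomplete final group was handled; here the leftover batches simply receive distinct randomly chosen years.

```python
import random


def allocate_batches_to_years(batch_scores, rng, n_years=4):
    """Order batches by deprivation, group them in fours, and give each
    batch in a group a different, randomly chosen survey year."""
    ordered = sorted(batch_scores, key=batch_scores.get)
    allocation = {}
    for i in range(0, len(ordered), n_years):
        group = ordered[i:i + n_years]
        years = list(range(1, n_years + 1))
        rng.shuffle(years)  # random permutation of the years for this group
        for batch, year in zip(group, years):
            allocation[batch] = year
    return allocation


# Hypothetical batches within one Health Board: label -> mean SIMD score.
example_scores = {f"b{i}": float(i) for i in range(8)}
example_allocation = allocate_batches_to_years(example_scores, random.Random(1))
print(example_allocation)
```

Each group of four similarly deprived batches contributes exactly one batch to every year, so the years are balanced on deprivation by construction.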
Due to a lower than expected response rate in Year 1, the number of addresses chosen in the Year 2, 3 and 4 batches was increased slightly and the fourth year's sample consisted of 7,971 addresses (compared to 6,947 in Year 1), allocated as shown in Figure 1B.
Figure 1B: Number of main-sample addresses selected in each Health Board (Year 4)
|Health Board||Non-deprived datazone||Deprived datazone||Total|
|Ayrshire & Arran||372||98||470|
|Dumfries & Galloway||179||15||194|
|Greater Glasgow & Clyde||1155||800||1955|
(iv) Selection of addresses and assignments
Once the fourth year's batches of datazones were chosen, a sample of addresses was selected from these datazones using the small user Postcode Address File (PAF). There can be small discrepancies between different versions of the PAF, and some addresses assigned to one Health Board in one version may be assigned to a different Health Board in another.
Addresses were then combined into interviewer assignments (points). It would have been possible to make each interviewer assignment a batch. However, this would have created interviewer assignments on the basis of the chosen datazones and it is more efficient to create them on the basis of the chosen addresses, so once the addresses had been obtained the interviewer assignments were created from the sampled addresses.
(v) Allocating assignments to months
The Year 4 assignments were then ordered according to their SIMD 2006 scores and randomly allocated to quarters of the year so that the sample for each quarter was representative of the population. The sample within each quarter was then randomly allocated to fieldwork months.
One issue when sampling addresses in Scotland is the presence of tenement blocks and other multi-residence buildings, some of which have only one address entry in the PAF but contain a number of different flats (dwelling units). Such addresses are identified in the PAF by the Multiple Occupancy Indicator (MOI), which is an estimate of the number of dwelling units at an address. To ensure that households in tenement blocks without an individual entry in the PAF had the same chance of selection as other households, the likelihood of selecting each address was increased in proportion to the MOI.
Where interviewers found more than one dwelling unit at an address they chose one dwelling unit at random.5 If the chosen dwelling unit contained two or more households they chose one of them at random for inclusion in the survey.
In most cases this meant that every household in a stratum had the same probability of selection - the exceptions being households at addresses with an incorrect MOI or at a dwelling unit containing two or more households. In these cases equal probability could be restored by applying a corrective weight at the analysis stage.
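One way such a corrective weight could be derived is sketched below. The report does not give the formula, so this is an assumption based on the selection steps described above: the address is selected with probability proportional to its MOI, one dwelling unit is then chosen at random from those actually found, and one household is chosen at random within that unit. The function and parameter names are hypothetical.

```python
def corrective_weight(moi, dwelling_units_found, households_in_unit):
    """Weight that restores equal selection probability for a household.

    The household's relative chance of selection is
    moi / (dwelling_units_found * households_in_unit); the weight is its
    inverse. A correct MOI and a single household give a weight of 1.
    """
    if min(moi, dwelling_units_found, households_in_unit) < 1:
        raise ValueError("all counts must be at least 1")
    return dwelling_units_found * households_in_unit / moi


print(corrective_weight(4, 4, 1))  # MOI correct, one household
print(corrective_weight(1, 3, 1))  # MOI understated: three flats found
print(corrective_weight(2, 2, 2))  # two households share the chosen unit
```

Households that were less likely to be selected than intended receive weights above 1, and the weight equals 1 in the usual case where the MOI is accurate and the dwelling unit holds a single household.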
Sampling individuals within households
For the main sample all adults aged 16 years and over at each household were selected for the interview (up to a maximum of ten adults). However, in order to limit the burden on households with three or more children (aged 0-15), two of the children were randomly selected for inclusion in the survey. No interviews were attempted with the other children in the household.
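The child selection rule can be expressed in a few lines. This is an illustrative sketch with hypothetical names; in the survey the selection was made within the interviewing program, whose implementation is not described here.

```python
import random


def select_children(children, rng, max_children=2):
    """Return all children if there are at most max_children, otherwise
    a simple random sample of max_children of them."""
    if len(children) <= max_children:
        return list(children)
    return rng.sample(list(children), max_children)


rng = random.Random(0)  # fixed seed so the sketch is reproducible
household = ["child 1", "child 2", "child 3", "child 4"]  # hypothetical
selected = select_children(household, rng)
print(selected)
```

Children in larger households have a lower chance of selection under this rule, which is the kind of unequal probability that weighting at the analysis stage can correct.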
1.2.3 Selecting the child boost sample
In addition to the main sample, a child boost sample of 1,944 addresses was issued in 10 of the 14 Health Boards. Whereas the main sample had been chosen to allow analysis of Health Boards in each four year period, the child sample is designed only to allow national estimates. Because of this, addresses were not issued in the smaller Health Boards that had been over-sampled in the main sample.
The following numbers of addresses were chosen:
Figure 1C: Number of addresses selected for the child boost in each Health Board (Year 4)
|Health Board||Non-deprived datazone||Deprived datazone||Total|
|Ayrshire & Arran||102||30||132|
|Dumfries & Galloway||25||3||28|
|Greater Glasgow & Clyde||317||244||561|
1.2.4 Selecting the Health Board boost sample
In addition to the main sample, two of the Health Boards (Fife and Grampian) opted to boost the number of adults interviewed in their areas. The sampling scheme for the Health Board boosts differed slightly from that of the main sample. In order to minimise fieldwork costs, a two-stage design was used, with postcode sectors selected at the first stage and addresses at the second. Twelve postcode sectors were chosen in each Health Board to form the primary sampling units, and addresses were then selected from each postcode sector (21 addresses per postcode sector in Fife and 22 in Grampian). Thus, the Health Board boost consisted of 252 addresses in Fife and 264 in Grampian. In Grampian the postcode sectors for the Health Board boost were chosen via a simple random sample, but the sampling scheme in Fife differed slightly. Fife addresses were stratified by Community Health Partnership (CHP) before selection, and selection probabilities were chosen to enable analysis of data at the CHP level at the end of the four-year period.
The method of selecting households and individuals within households followed that of the main sample.
1.2.5 Selecting the nurse sample
Some addresses from the main sample were selected as nurse addresses. At these addresses all adults interviewed in the main interview were eligible to take part in a follow-up nurse interview. A total of 2,180 addresses were sampled, as shown in section 1.1.3.
Figure 1D: Number of addresses selected for nurse interviews in each Health Board (Year 4)
|Area||Non-deprived datazone||Deprived datazone||Total|
|Ayrshire & Arran||112||30||142|
|Dumfries & Galloway||48||4||52|
|Greater Glasgow & Clyde||349||242||591|
The addresses assigned to the nurse interview were selected using the following randomisation schemes:
- In each year, in the island Health Boards (Orkney, Shetland and Western Isles) clustered samples were used. Two points (interviewer assignments) were chosen from each Health Board and addresses were selected at random from these points to be eligible for a nurse interview. These six points were chosen at random while ensuring that each Health Board's points had been assigned to consecutive months (to help reduce costs), and the six points covered all seasons of the year.
- In mainland Scotland, an unclustered sample of addresses was taken over four years, but was clustered within each year (as was the main sample).
Every adult at these addresses who participated in the Stage 1 interview was eligible for a nurse interview.
1.2.6 Selecting the knowledge, attitudes and motivations to health (KAM) sample
Between 2008 and 2011, NHS Health Scotland funded a module of questions on knowledge, attitudes and motivations to health (the KAM module). The 7,971 addresses selected for the main sample were classified as being either version A (the Scottish Government rotating module), or version B (the KAM module) addresses - 2,831 were version A addresses, 5,140 were version B. Random allocation was used to choose the version assigned. Core questions were asked of all participants in both version A and B. In addition, participants at version A addresses were also asked module A questions. At version B addresses, in addition to the core questions a single adult, chosen at random, was also asked the KAM module of questions.
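A random allocation of this kind could be carried out as sketched below. The report states only that allocation was random, so the specific method (a simple random sample of addresses for version A) is an assumption, and all names are hypothetical. The counts are those quoted above.

```python
import random


def allocate_versions(addresses, n_version_a, rng):
    """Randomly assign n_version_a addresses to version A and the rest to B."""
    version_a = set(rng.sample(addresses, n_version_a))
    return {a: ("A" if a in version_a else "B") for a in addresses}


def select_kam_adult(adults, rng):
    """Choose one adult at random to answer the KAM module."""
    return rng.choice(adults)


rng = random.Random(2011)  # fixed seed so the sketch is reproducible
main_addresses = list(range(7971))  # stand-in identifiers for the main sample
versions = allocate_versions(main_addresses, 2831, rng)

n_a = sum(1 for v in versions.values() if v == "A")
n_b = sum(1 for v in versions.values() if v == "B")
print(n_a, n_b)
```

Selecting a single adult per household for the KAM module means that adults in larger households are less likely to be asked it, so analyses of that module would normally need a weight reflecting household size.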
1.3 Topic coverage
As part of the SHeS review a consultation on which questions should be included in the survey was carried out in 2007.6 Many of the topics included in previous years were included again in the 2011 survey and, as in previous years, the survey had a focus on cardiovascular disease (CVD) and its risk factors.
Copies of all the survey data collection documents, including the full Stage 1 and Stage 2 questionnaire documentation, are included in Appendix A. Protocols for measurements and for the collection of saliva, urine and blood samples are included in Appendix B. The content of both stages is summarised below.
1.3.3 Stage 1 interview
Information was collected at both the household and individual level. The table that follows summarises the content of the individual level interviews for all participants. The topics a participant was asked about depended on both their age and the sample type they were allocated to. The age criteria for each topic are given in brackets after the topic name.
Figure 1E: Content of the 2011 Stage 1 interview
|CORE SAMPLE - Stage 1 interview outline|
|Version A||Version B|
|Household questionnaire including household composition|
|General health including caring (0+)|
|General CVD (16+)|
|Use of health services (0+)|
|Physical activity adults (16+) and children (2-15)|
|Eating habits children (2-15)|
|Fruit and veg consumption (2+)|
|Smoking and Drinking (16+) [16-19 in a self completion]|
|Dental health (16+)|
|Dental services (16+)|
|Social capital (16+)|
|Discrimination and harassment (16+)||-|
|Economic activity (16+)|
|Stress at work (16+)||-|
|Ethnic background, national identity and religion (0+)|
|Family health background (16+)|
|Self-completions (13+ & parents of 4-12 yr olds)|
|Height (2+) and Weight (0+)|
|Data linkage and follow-up research consents (0+)|
|-||Attitudes to Health (16+) - 1 adult per household|
The core topics (those that span both version A and version B in Figure 1E), including the questions on CVD, did not change between 2008 and 2011. The topics in the Core Version A interview were: core interview topics plus accidents, dental services, social capital, discrimination and harassment, and stress at work.
Children aged 13-15 were interviewed directly, and parents/guardians of children aged 0-12 were asked to answer on behalf of their children.
Participants aged 13 and over were asked to fill in a self-completion booklet during the interview. There were four different booklets for different age groups (listed below). The booklet for young adults aged 16-17 asked about smoking and drinking behaviour. Interviewers also had the option of using this booklet for those aged 18-19 if they felt that it would be difficult for anyone in this age group to give honest answers in the face-to-face interview with other household members present.
|Booklet for adults||CAGE questions on drinking experiences, GHQ12, Warwick Edinburgh Mental Well-being scale (WEMWBS), use of contraception and sexual orientation|
|Booklet for young adults||Smoking, drinking, CAGE questions on drinking experiences, GHQ12, WEMWBS, use of contraception and sexual orientation|
|Booklet for 13-15 year olds||GHQ12|
|Booklet for parents of 4-12 year olds||Strengths and Difficulties Questionnaire (SDQ) designed to detect behavioural, emotional and relationship difficulties in children.|
Interviewers measured the height and weight of all participants aged 2 and over.
1.3.4 Stage 2 interview
Nurse interviews were offered to adults (aged 16+) at a sub-sample of households in the main sample.
In the nurse interview, participants were asked about their use of prescribed medication, vitamin supplements, nicotine replacement therapy, and about recent experiences of food poisoning. A module of questions about depression, anxiety, suicidal attempts and self-harm (taken from the Adult Psychiatric Morbidity Survey) has been included since 2008.7 The nurse also took the following measurements: blood pressure; waist and hip circumference; and arm-length (demi-span) for those aged 65 and over. Lung function was measured via a spirometer. With written agreement, a small sample of blood was taken by venepuncture and was analysed for Total and HDL-cholesterol, C-reactive protein, fibrinogen, glycated haemoglobin and vitamin D. Nurses also sought agreement for the storage of a small sample of blood for possible future analysis. Written agreement was also sought to take samples of saliva (for the analysis of cotinine, a derivative of nicotine) and spot urine samples (for the analysis of dietary sodium).
Figure 1F: Content of the 2011 Stage 2 nurse interview
|Outline of Stage 2 nurse interview|
|Prescribed medicines (age 16+)|
|Vitamin supplements (age 16+)|
|Nicotine replacement therapy (age 16+)|
|Blood pressure (age 16+)|
|Depression, anxiety, suicidal attempts and self-harm (age 16+)|
|Food poisoning (age 16+)|
|Waist and hip measurements (age 16+)|
|Demi-span (arm length) (age 65+)|
|Lung function (age 16+)|
|Blood sample (age 16+)|
|Saliva sample (age 16+)|
|Urine sample (age 16+)|
1.4 Fieldwork procedures
1.4.1 Advance letters
Each sampled address was sent an advance letter that introduced the survey and stated that an interviewer would be calling to seek permission to interview. There were two versions of the advance letter; one for the main and Health Board boost addresses in the sample and a separate version for the child boost addresses. A copy of the survey leaflet was included with every advance letter. The survey leaflet introduced the survey, described its purpose in more detail and included some summary findings from previous surveys.
1.4.2 Making contact
At initial contact, the interviewer established the number of dwelling units (DUs) and/or households at an address and made any necessary selections (see Section 1.2).
The interviewer then made contact with each household. In the main sample they attempted to interview all adults (up to a maximum of ten) and up to two children aged 0-15 (see Section 1.2). At child boost sample addresses, interviewers first screened for children aged 0-15 and within such households up to two children were selected for interview. The interviewer sought parents' and children's consent to interview selected children. Interviewers attempted to interview a maximum of ten adults at selected households in the Health Board boost sample.
1.4.3 Collecting data
Both interviewers and nurses used computer assisted interviewing.
At each co-operating eligible household in all sample types, the interviewer first completed a household questionnaire, information being obtained from the household reference person8 or their partner wherever possible. This questionnaire obtained information about all members of the household, regardless of age. The program created individual questionnaires for adults in the main and Health Board boost samples, and for selected children in the main and child boost samples.
An individual interview was carried out with all selected adults and children. In order to reduce the amount of time spent in a household, interviews could be carried out concurrently, the program allowing for up to four participants to be interviewed in a session.
Height and weight measurements were obtained towards the end of the interview.
In addition to an advance letter and general survey leaflet, participants were also given a more detailed leaflet describing the contents and purpose of the Stage 1 interview. Adults in households eligible for a nurse interview were given a longer version of this leaflet which also included an explanation of the purpose of the Stage 2 nurse interview. There was a separate version of this leaflet for children in main and child boost households. Parents at child boost addresses were also given a leaflet containing background information on the survey. Stage 1 leaflets are included in Appendix A.
1.4.4 Introducing the Stage 2 nurse interview
Only a sub-sample of adults in the main sample was eligible to take part in the Stage 2 nurse interview in 2011. At the end of the Stage 1 interview, adult participants at the 'nurse sample' addresses were asked for their agreement to take part in the second stage of the survey. Wherever possible an appointment was made for the nurse to visit within a few days of the Stage 1 interview. At this visit the nurse carried out the measurements described in Section 1.3.4 and obtained the saliva, blood and urine samples from those adults eligible and willing to provide these samples.
Before blood, saliva and urine samples were taken, written consent was obtained from the participant. Nurses also asked participants for consent to store part of the blood sample for additional analyses at some future date. If the participant agreed, written consent was obtained.
1.4.5 Interviewing and measuring children
Children aged 13-15 were interviewed directly by interviewers, permission having first been obtained from the child's parent or guardian. Interviewers were instructed to ensure that the child's parent or guardian was present in the home throughout the interview. Information about younger children was collected directly from a parent/guardian. Whenever possible, younger children were present while their parent/guardian answered questions about their health. This was partly because the interviewer had to measure their height and weight and it also ensured that the child could contribute information where appropriate.
1.4.6 Feedback to participants
If participants wished, interviewers recorded height and weight measurements on their information leaflet.
At the Stage 2 nurse interview each participant was given a Measurement Record Card on which the nurse entered the participant's waist and hip measurements, demi-span measurement (if applicable), blood pressure measurements and lung function results.
If they wished, participants were sent the results of their blood sample analyses. They were also given the option of having their blood pressure, lung function readings and blood sample analyses sent to their GP. Written consent for results to be passed on to GPs was required for each of the measurements.
Nurses were issued with a set of guidelines to follow when commenting on participants' blood pressure readings (see Appendix B for details). If an adult's blood pressure reading was severely raised, nurses were instructed to contact the Survey Doctor at the earliest opportunity. Where permission had been given for results to be sent to a participant's GP, the Survey Doctor contacted the GP if any blood pressure, lung function or blood sample results were abnormal. In the absence of permission to contact GPs, the Survey Doctor contacted participants directly if they had abnormal results.
1.5 Fieldwork quality control and ethical clearance
1.5.1 Training interviewers and nurses
Interviewers were fully briefed on the administration of the survey, including screening for households with children in the child boost sample. They were given training in measuring height and weight, including practice sessions.
All nurses were professionally qualified and proficient in taking blood before joining the Health Survey team. They attended a one and a half day training session at which they received equipment training and were briefed on the specific requirements of the survey with respect to taking blood pressure, anthropometric and lung function measurements, and taking blood, saliva and urine samples.
Full sets of written instructions, covering both survey procedures and measurement protocols, were provided for both interviewers and nurses (Appendix B contains a copy of the measurement protocols).
All nurses and interviewers who had not previously worked on SHeS were accompanied by a nurse or interviewer supervisor during the early stages of their work to ensure that interviews and protocols were being correctly administered.
1.5.2 Checking interviewer and measurement quality
A large number of quality control measures were built into the survey at both data collection and subsequent stages to check on the quality of interviewer and nurse performance.
Recalls to check on the work of both interviewers and nurses were carried out at 10% of productive households.
The computer program used by interviewers had in-built soft checks (which can be suppressed) and hard checks (which cannot be suppressed) which included messages querying uncommon or unlikely answers as well as answers outside an acceptable range. For example, if someone aged 16 or over had a height entered in excess of 1.93 metres, a message asked the interviewer to confirm that this was a correct entry (a soft check), and if someone said they had carried out an activity on more than 28 days in the last four weeks the interviewer would not be able to enter this (a hard check). For children, the checks were age specific. Some infants were weighed by having an adult hold them; the weight of the adult on their own was entered into the computer followed by the combined weight of the infant and adult. A hard check was used to ensure that the weight entered for the adult alone did not exceed the weight of the infant and adult combined.
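The difference between soft and hard checks can be illustrated with the three examples given above. This is a sketch only: the function names and messages are hypothetical and do not reproduce the computer-assisted interviewing program used in the survey. A soft check returns a message that the interviewer can confirm and override, while a hard check raises an error so the value cannot be entered.

```python
class HardCheckError(ValueError):
    """Raised when an entry fails a check that cannot be suppressed."""


def soft_check_adult_height(height_m, limit_m=1.93):
    """Return a message asking for confirmation, or None if no query is needed."""
    if height_m > limit_m:
        return f"Height {height_m} m exceeds {limit_m} m: please confirm."
    return None


def hard_check_activity_days(days, max_days=28):
    """Reject more activity days than there are days in the last four weeks."""
    if days > max_days:
        raise HardCheckError(f"{days} days exceeds {max_days} in four weeks")
    return days


def infant_weight(adult_kg, combined_kg):
    """Derive an infant's weight from the adult-only and combined weights."""
    if adult_kg > combined_kg:
        raise HardCheckError("adult weight exceeds adult plus infant weight")
    return combined_kg - adult_kg


print(soft_check_adult_height(1.95))  # query shown, entry still allowed
print(hard_check_activity_days(28))   # accepted
print(infant_weight(72.5, 80.0))      # infant weight in kg
```

Keeping the height query soft reflects the fact that very tall adults do exist, whereas more than 28 active days in four weeks or a negative infant weight can only be an entry error.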
At the end of each survey month, the measurements made by each interviewer and nurse were inspected. Any problems (such as higher than average proportions of measurements not obtained, insufficient samples and so on) were discussed with the relevant nurse or interviewer by their supervisor.
1.5.3 Ethical clearance
Ethical approval for the 2008-2011 surveys was obtained from the Multi-Centre Research Ethics Committee for Wales (REC reference numbers: 07/MRE09/55 and 08/MRE09/62).
1.6 Survey response
1.6.1 Introduction to response analysis
This section looks at the response for sampled households (section 1.6.2), and then at the response of eligible individuals within those households, firstly for adults (section 1.6.3) and then for children (section 1.6.4). Individual response for adults and children is examined in two ways: overall response for all eligible individuals in the 'set' sample, and response for individuals within co-operating households.
Participants were asked to co-operate in a sequence of operations, beginning with a face-to-face interview, height and weight measurements, and if applicable, progressing to a nurse interview and ending with requests for blood, saliva and urine samples. Individual non-response accumulated through the survey stages.
Not every measurement obtained by an interviewer or a nurse was subsequently considered valid for analysis purposes. Full details of the numbers of measurements used for analysis, the number of exclusions and the reasons for them are given at the start of each relevant chapter.
1.6.2 Household response
Tables 1.1 and 1.2 show household response by Health Board, for the main and Health Board boost samples combined (Sample A) in 2011 and in 2008-2011 combined. Table 1.3 shows the child boost sample (Sample B) household response in 2011. The interviews conducted as part of the two Health Board boost samples have been integrated into the main 2011 datafile as they were not intended to form stand alone samples in their own right. For this reason separate analysis of their response rates was not conducted. The row labelled 'Total eligible households' shows the number of private residential households found at the selected addresses (after selection of a single dwelling unit and up to three households when necessary).
Households described as 'co-operating' are those where at least one eligible person was interviewed at Stage 1, the interviewer stage. Households described as 'all interviewed' are those where all eligible persons were interviewed, and 'fully co-operating' are those where all eligible persons were interviewed, had height and weight measured and, if applicable, agreed to a nurse interview. Households where a participant was ineligible for a height or weight measurement because of a functional impairment or pregnancy are not counted as fully co-operating for this response analysis.
66% (5,010) of eligible households in sample A took part in the 2011 Scottish Health Survey. This is slightly higher than the average household response for the four years combined (2008-2011). Between 2008 and 2011, 63% of eligible households in sample A responded to the survey. In 2011 all eligible adults and children were interviewed at 49% of households in this sample. This is similar to the four year average which was 50% of eligible households. In sample B, the child boost sample, 65% of eligible households (299) co-operated with the survey, and in all but five of these households, all eligible children were interviewed. Table 1.1-Table 1.3
1.6.3 Individual response for adults
Overall response among adults
There were 7,544 individual interviews with adults in the 2011 SHeS. A sub-sample of adults in the main sample were eligible to take part in the Stage 2 nurse interview. A total of 972 adults saw a nurse and 725 gave a blood sample.
To calculate the response rate for individuals, rather than households, the total number of productive individual interviews should be expressed as a proportion of the total number of adults in the sampled households. However, as not all sampled households participated in the survey the total number of adults in the sampled households is not known, and must be estimated. There are three groups of households to consider:
- Co-operating households (9,110 adults in 5,010 households, average 1.82 per household),
- Non co-operating households where information on the number of adults is known (3,106 adults in 1,896 households, average 1.64) and
- Non co-operating households about which nothing is known (731 households).
The most reasonable assumption is to attribute to the last group the same average number of adults (1.77) as for all households where the number is known (the sum of the first two groups). This assumption gives an estimated total of 13,509 eligible adults, known as the 'set' sample.
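The arithmetic behind the adult 'set' sample can be reproduced with a short sketch. The figures are those quoted above; the variable names are illustrative only.

```python
# Households where the number of adults is known
coop_adults, coop_households = 9_110, 5_010
noncoop_known_adults, noncoop_known_households = 3_106, 1_896
# Non co-operating households about which nothing is known
unknown_households = 731

known_adults = coop_adults + noncoop_known_adults
known_households = coop_households + noncoop_known_households

# Average adults per household across all households where the number is known
avg_adults = known_adults / known_households

# Attribute that average to the unknown households to estimate the 'set' sample
set_sample = round(known_adults + unknown_households * avg_adults)

print(round(avg_adults, 2))  # 1.77
print(set_sample)            # 13509
```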
Evidence suggests that unproductive households tend to be smaller on average than productive households, so this estimate of the total number of eligible adults is likely to be too large, and response rates based on it will therefore be underestimates.
A further assumption is needed to provide separate 'set' samples for men and women. In non co-operating households where the number of adults was known, the numbers of men and women were not usually obtained. However, it can be assumed that the proportion of men and women in the estimated total sample is the same as for the adults in the 5,010 co-operating households. The proportions are 47.7% men and 52.3% women. Applying these proportions to the estimated total of adults gives 'set' samples of 6,437 men and 7,072 women.
Using the estimated total number of adults in sampled households, the adult 'set' sample, as a denominator, minimum response rates for the various stages were as follows:
|Stage||Men (%)||Women (%)||All adults (%)|
|Saw a nurse||26||30||28|
|Waist and hip measured||25||29||27|
|Blood pressure measured||26||29||27|
|Agreed to give a blood sample||21||24||23|
|Blood sample obtained||20||21||21|
Response to the interview was 60% among women and 51% among men. Table 1.4
Adult response in co-operating households
As adults' ages and other personal characteristics are not known in non co-operating households, indications of differences in response by these characteristics are confined to co-operating households. Tables 1.5 and 1.6 show the proportion of men and women in co-operating households who participated in the key survey stages, by age. These are summarised below:
|Stage||Men (%)||Women (%)||All adults (%)|
|Saw a nurse||40||47||44|
|Blood pressure measured||39||46||43|
|Blood sample obtained||31||34||33|
|Saliva sample given||38||44||41|
|Lung function measured||39||45||42|
|Urine sample given||36||42||39|
In co-operating households, response was lowest among those aged 16-24 for both sexes (52% for men and 66% for women), with young men standing out as having particularly low co-operation rates. Among men, response increased with age: it ranged from 70% to 81% for those aged 25-64 and rose to its highest rate among those aged 75 and over (93%). There was a more even pattern among women, with a consistently high response rate of over 90% among women aged 35 and over (ranging between 92% and 96%).
It should be noted that the lower levels of response to the height and weight measurements, and of agreement to nurse interviews, among men are largely a result of the fact that fewer men than women took part in the survey overall. Based on those participating, women's refusal rates for the height and weight measurements were actually slightly higher than men's, and the proportions of women and men who were interviewed and refused a nurse interview or could not be contacted by the nurse were very similar (18% and 17% respectively).
1.6.4 Individual response for children (0-15)
Overall response among children
Interviews were carried out with 1,987 children aged 0-15. This includes 1,538 children interviewed in the main sample, and 449 interviewed in the child boost sample.
To calculate the response rate for children, the number of eligible children in sampled households (the 'set sample') is needed as the denominator. This was estimated by assuming that the households where the numbers of children were not known had the same average number of boys and girls as those where it was known (and that the proportion of boys and girls was the same). This results in a 'set' sample of 3,392 children in total, comprising 2,683 in the main sample and 709 in the child boost. This is likely to be an over-estimate, since non-contacted households have fewer children, on average, than those contacted. Response rates computed for children, like those for adults, are therefore conservative. Most non-responding children were in households where no-one (child or adult) co-operated with the survey. The total number of children in the sampled households would be slightly greater than the set sample as some households would have had more than two children.
In the main sample, response to the interview was 57% among boys and girls, while in the boost response was 64% for boys, 62% for girls and 63% for all children. Combining the two samples, this gives an overall response to the interview of 59% for boys, 58% for girls and 59% for all children. Height measurements were limited to those aged 2 and over. On the assumption that the age distribution of children in the 'set sample' is the same as that of children living in interviewed households, responses to these measurements were: Table 1.8
|Measurement||Boys (%)||Girls (%)||All children (%)|
|Height measured (aged 2 and over)||39||38||39|
|Weight measured (aged 2 and over)||39||38||39|
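The child interview response rates quoted in this section follow directly from the interview counts and the estimated 'set' samples. A minimal sketch, using only the figures given above (the dictionary names are illustrative):

```python
# Productive child interviews and estimated 'set' samples, by sample type
interviews = {"main": 1_538, "boost": 449}
set_sample = {"main": 2_683, "boost": 709}

# Response rate (%) within each sample
rates = {k: round(100 * interviews[k] / set_sample[k]) for k in interviews}

# Response rate (%) for the two samples combined
overall = round(100 * sum(interviews.values()) / sum(set_sample.values()))

print(rates)    # {'main': 57, 'boost': 63}
print(overall)  # 59
```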
Child response in co-operating households
Child response rates, like adult response rates, have also been calculated on a co-operating household base. Among selected children aged 0-15 in co-operating households, the proportion who were interviewed was high at 90% of eligible boys and 91% of eligible girls. The proportion interviewed was lower among children aged 11-15 (80% of boys and 82% of girls) than among those aged under 11. This may in part be accounted for by the fact that parents acted as proxy participants for all children aged 12 and under whereas from 13 onwards children were interviewed directly in person.
|Measurement||Boys (%)||Girls (%)||All children (%)|
|Height measured (aged 2 and over)||69||70||69|
|Weight measured (aged 2 and over)||69||70||69|
1.6.5 Regional variations in survey response
As in previous years, response to the main sample (sample A) varied by Health Board. In 2011, household response was highest in Orkney, Western Isles and Dumfries and Galloway. Greater Glasgow and Clyde and then Lanarkshire and Lothian had the lowest response.
1.6.6 Age and sex profile of the sample
According to the 2011 household population estimates, men form 48% of all adults (aged 16 and over) in Scotland and women form 52%, while in the SHeS 2011 men form 43% of all interviewed adults and women form 57%. Men and women aged under 35 are under-represented at the interviewer interview relative to their proportions in the household population estimates, while men aged 55 and over and women aged 45 and over are over-represented. Men and women aged under 35 were also slightly under-represented in the nurse interview while men and women aged 45 and over were slightly over-represented. Table 1.10
Table 1.11 compares the age and sex profile of responding children at the Stage 1 interviewer interview with the mid-2011 population estimates for Scotland (the estimates for children are based on the total population, not the household population, as the two measures are very similar for children and more detailed breakdowns are available for the total population). The proportion of boys aged 0-15 in SHeS 2011 was similar to the total population estimates (50% compared to 51% respectively), and the same was true for girls aged 0-15 (50% in SHeS 2011 compared to the population estimate of 49%). Boys aged under 8 were over-represented, while boys aged 8-9 and 12-15 were under-represented. The proportion of boys aged 10-11 in SHeS 2011 equalled the population estimates for this age group. Girls aged under 6 were over-represented and girls aged 6-7 and 12-15 were under-represented relative to the population estimates for these groups. The proportion of girls aged 8-11 in SHeS 2011 equalled the population estimates for this age group. Table 1.11
1.7 Weighting the data
The SHeS 2011 comprised a general population sample (main sample), a child boost sample of children screened from additional addresses and a Health Board boost sample in two Health Board areas. As a result, several different sets of weights have been provided for the 2011 survey. In addition, weights have been provided to allow analysis of the combined data outlined in Section 1.1.5. This section describes the weighting procedures in more detail.
1.7.2 Adult weights - summary
Weights are provided to allow analysis of adult responders (including responders from both the main sample and the Health Board boost sample). The weighting strategy for the adult sample was:
- calculate weights (w1) for the differential selection of addresses;
- calculate weights for the selection of dwelling units at each address (w2) and for the selection of households at each dwelling unit (w3);
- calibrate the combined household weight (w1×w2×w3) so that the weighted sample of household members matched population estimates for age/sex and health board (w4);
- generate weights for whether an adult within a participating household responded (w5);
- combine (w5) with the household weight and calibrate the combined weight (w4×w5) to the population estimates and scale this to give the final adult interview weight, int11wt.
1.7.3 Address, dwelling unit and household selection weights
Address selection weights (w1)
Selection weights were required to ensure that each area was in the correct proportion for national estimates. The selection weights varied between Health Boards (smaller Health Boards were over-sampled so had smaller selection weights), and within each Health Board they varied by SIMD area (areas in the most deprived 15% of areas based on the 2006 SIMD were over-sampled so also had smaller selection weights).
For each stratum the selection weights were calculated as the number of addresses in the PAF divided by the number of addresses issued.
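As a minimal sketch of this calculation, the address selection weight for a stratum is the ratio of addresses on the PAF to addresses issued. The stratum labels and counts below are hypothetical and are used only to show that an over-sampled stratum receives a smaller weight.

```python
def selection_weight(paf_addresses, issued_addresses):
    """Address selection weight (w1): PAF addresses divided by addresses issued."""
    if issued_addresses <= 0:
        raise ValueError("at least one address must be issued in the stratum")
    return paf_addresses / issued_addresses

# Hypothetical strata: (addresses on the PAF, addresses issued)
strata = {
    ("Board X", "SIMD15"): (20_000, 400),    # over-sampled: 1 in 50
    ("Board X", "other"): (100_000, 1_000),  # 1 in 100
}

w1 = {stratum: selection_weight(*counts) for stratum, counts in strata.items()}
print(w1[("Board X", "SIMD15")])  # 50.0
print(w1[("Board X", "other")])   # 100.0
```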
Dwelling unit and household selection weights (w2 and w3)
In a very small number of addresses the number of dwelling units found was not equal to the MOI. In these cases a dwelling unit weight was calculated to correct for this discrepancy. A household weight was also calculated to correct for the selection of households. Without these weights households at multi-occupied addresses would be under-represented in the sample.
1.7.4 Calibrating household weights (w4)
To generate the household weights the combined selection weights (w1×w2×w3) were adjusted by using calibration weighting. Calibration weighting was used to ensure that the weighted achieved sample of households matched the National Records of Scotland's (NRS's) estimated age/sex distribution of the household population, while at the same time matching the Health Board totals.
The estimates of the household population were provided by NRS. The household population is the estimated population in private households, so excludes people living in institutions. The household population estimates used are given in Figure 1G and Figure 1H.
In addition to calibrating to the totals given in Figure 1G and Figure 1H, the weights were calibrated to ensure that the number of responding households in the deprivation areas matched the number of issued eligible households. This ensured that the SIMD15 areas were not under-represented because of non-response.
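The report does not state which calibration algorithm was used. As an illustration only, the sketch below uses raking (iterative proportional fitting), one common way of adjusting weights so that weighted totals match two sets of population margins at once. All data, group labels and totals are hypothetical, and the additional deprivation-area constraint described above is omitted for brevity.

```python
def rake(weights, groups_a, groups_b, totals_a, totals_b, iterations=50):
    """Rescale weights in turn so weighted counts match both sets of totals."""
    w = list(weights)
    for _ in range(iterations):
        for groups, totals in ((groups_a, totals_a), (groups_b, totals_b)):
            sums = {}
            for wi, g in zip(w, groups):
                sums[g] = sums.get(g, 0.0) + wi
            # Multiply each weight by (target total / current weighted total)
            w = [wi * totals[g] / sums[g] for wi, g in zip(w, groups)]
    return w

def weighted_total(weights, groups, label):
    return sum(wi for wi, g in zip(weights, groups) if g == label)

# Hypothetical household members with combined selection weights (w1*w2*w3)
selection_w = [100.0, 100.0, 50.0, 50.0, 100.0, 50.0]
sex = ["men", "women", "men", "women", "women", "men"]
board = ["A", "A", "B", "B", "A", "B"]

# Hypothetical population margins (both sum to the same grand total)
sex_totals = {"men": 480.0, "women": 520.0}
board_totals = {"A": 600.0, "B": 400.0}

w4 = rake(selection_w, sex, board, sex_totals, board_totals)
print(round(weighted_total(w4, sex, "men"), 3))
print(round(weighted_total(w4, board, "A"), 3))
```

After raking, the weighted sample reproduces both margins (to within numerical tolerance), while weights for cases in the same cell stay in proportion to their selection weights.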
1.7.5 Adult non-response weights (w5)
It is likely that the characteristics of household members that do not take part in surveys are different from those that do. By using logistic regression it is possible to model the difference between responding and non-responding household members and, from that model, obtain weights to reduce the bias from the differential non-response.
Responding households that contained more than one adult were selected and the household weight (w4) was applied. A logistic regression model was then fitted using variables from the household interview to model whether a household member responded or not. The final model included the following variables: the Health Board; the age/sex of the household member; the number of adults in the household; an indicator for whether the household was in an SIMD15 area; an indicator for whether the household reference person was in paid employment or self-employed; a variable indicating how frequently the family ate a main meal together; a variable for the person's marital status; whether anyone regularly smoked inside the dwelling; and whether the household owned or were buying the dwelling.
The parameters in the model were used to estimate the probability of response for each individual. The adult non-response weight (w5) was simply the reciprocal of this probability. (The adult non-response weight in households containing only one adult was set to 1.)
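The final step, turning a modelled response probability into a weight, can be sketched as follows. The fitted model itself is not reproduced here: the linear predictor passed to `response_probability` is a hypothetical stand-in for the output of the logistic regression, and the function names are illustrative.

```python
import math

def response_probability(linear_predictor):
    """Inverse logit: converts a logistic model's linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-linear_predictor))

def nonresponse_weight(prob_response, adults_in_household):
    """Adult non-response weight (w5): reciprocal of the response probability.

    Households containing only one adult are given a weight of 1.
    """
    if adults_in_household == 1:
        return 1.0
    return 1.0 / prob_response

# Hypothetical adult in a two-adult household with linear predictor 1.4
p = response_probability(1.4)
w5 = nonresponse_weight(p, adults_in_household=2)
print(round(p, 3), round(w5, 3))
```

Adults with a low estimated probability of responding receive larger weights, so that responders stand in for similar non-responders.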
1.7.6 Adult interview weights (int11wt)
The final adult interview weights were calculated by combining the household weight with the adult non-response weight (w4×w5) and calibrating to the totals given in Figure 1G and Figure 1H.
Calibrating to these totals ensured that when national estimates are required the age/sex and regional distributions of the adult sample match those of the population. It does not ensure that age/sex proportions are correct within each Health Board. The sample was not designed to allow yearly estimates at Health Board level, but it is likely that it will be used for this in some of the larger Health Boards, so adjusting so that the age/sex distribution was correct in these large Health Boards was investigated. This proved to be possible only in Greater Glasgow and Clyde.
Figure 1G: 2011 Mid-year household population estimates for Scotland by Health Board (a)
|Health Board||Adults (16+)||Children (0-15)||Total|
|Ayrshire & Arran||299,880||62,920||362,800|
|Dumfries & Galloway||122,140||24,240||146,380|
(a) Total figures may not be exact due to rounding
Figure 1H: 2011 Mid-year household population estimates for Scotland by age and sex (a)
(a) Total figures may not be exact due to rounding
1.7.7 Adult nurse interview weights
The sample of adults having a nurse interview was weighted to take account of differential probabilities of selection and non-response. The weighting strategy for the nurse sample was:
- calculate a calibrated household weight (w6) (this was calculated in exactly the same way as w4 but was calculated for the main sample only);
- generate a correction for whether the household was selected to be in the nurse sample, (w7);
- generate non-response weights for whether a responding adult gave a nurse interview, (w8);
- combine the weights with the adult non-response weight (w5) to calculate w9 = w6×w7×w5×w8; and then
- calibrate the combined weight (w9) to the population estimates and scale this to give the final nurse weight, nurs11wt.
Only weights (w7) and (w8) need any description. Weight (w7) was simply the probability a household had been selected for the main sample divided by the probability the household had been selected for the nurse sample. Weight (w8) was calculated by using logistic regression modelling to model non-response. The variables considered for the model included variables from the sampling frame, variables from the household grid and household interview, and variables from the adult interview. The final model used the following variables:
- a health board variable;
- an age/sex variable;
- an SIMD indicator;
- an indicator of the person's marital/cohabitation status;
- a variable for the number of adults in the household;
- an indicator of the working status of the household reference person;
- a variable indicating whether there were any barriers to entry to the household;
- a variable indicating whether the respondent had any long-term illness;
- a variable indicating whether the respondent had done any physical sporting activity in the previous four weeks;
- an indicator of the respondent's work status;
- a variable indicating whether the respondent had done any housework in the previous four weeks;
- a variable indicating whether the respondent had done any gardening, DIY or building work in the previous four weeks;
- an indicator for whether the respondent currently smokes; and
- an indicator for whether the respondent drinks alcohol.
This model was used to estimate the probability that any selected adult would have a nurse interview. The nurse non-response weight (w8) was simply the reciprocal of this probability.
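The way the component weights combine into the pre-calibration nurse weight can be sketched as below. The function and argument names are hypothetical, and the input values are invented for illustration; the final calibration and scaling steps are not shown.

```python
def nurse_precalibration_weight(w6, p_main, p_nurse, w5, p_nurse_response):
    """Combine the component weights into the pre-calibration nurse weight (w9)."""
    w7 = p_main / p_nurse          # correction for selection into the nurse sample
    w8 = 1.0 / p_nurse_response    # nurse non-response weight
    return w6 * w7 * w5 * w8

# Hypothetical adult: household weight 400, selected for the main sample with
# probability 0.002 and for the nurse sample with probability 0.0005,
# adult non-response weight 1.25, modelled nurse response probability 0.5
w9 = nurse_precalibration_weight(400.0, 0.002, 0.0005, 1.25, 0.5)
print(round(w9, 1))
```

In this example w7 is approximately 4 and w8 is 2, so the combined weight is about ten times the household weight.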
1.7.8 Adult blood weights
A similar method was used to generate the adult blood weights. A blood sample was not obtained from every adult who had a nurse interview so a weight was calculated to correct for non-response. The method used was to start with the nurse sample and use logistic regression to model the probability that a respondent from the nurse sample would give a blood sample. The non-response weight, w10, was combined with the pre-calibration nurse weight, (w9), and then calibrated to population totals. The final adult blood weight (blod11wt) is the calibrated weight scaled to sum to the sample size.
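Scaling a set of calibrated weights so that they sum to the sample size is a simple rescaling. A minimal sketch, with hypothetical weights:

```python
def scale_to_sample_size(weights):
    """Rescale weights so that they sum to the number of cases."""
    n = len(weights)
    total = sum(weights)
    return [w * n / total for w in weights]

# Hypothetical calibrated weights for three respondents
calibrated = [1_200.0, 800.0, 2_000.0]
scaled = scale_to_sample_size(calibrated)
print([round(w, 2) for w in scaled])  # [0.9, 0.6, 1.5]
```

Scaling leaves the relative sizes of the weights unchanged, so weighted proportions and means are unaffected; only the weighted base changes.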
1.7.9 Weights for the knowledge, attitudes and motivations to health (KAM) module
KAM weights were calculated in a similar way to the main adult weights in that they combined selection weights, non-response weights and calibration.
The process was:
- start with the calibrated household weight (w6) already calculated for the nurse sample;
- generate a correction for whether the household was selected to be in the KAM sample (w11);
- generate an additional selection weight, (w12), for whether the respondent was selected to be given the KAM module (this was simply the number of adults in the household);
- generate non-response weights for whether a responding adult gave a nurse interview (w13);
- combine the weights with the adult non-response weight (w5) to calculate w14 = w6×w11×w5×w12×w13; and then
- calibrate the combined weight (w14) to the population estimates and scale this to give the final KAM weight (kam11wt).
1.7.10 Weights for Version A
Weights were also calculated for analysis of core version A data for adults and children. These were calculated by taking the calibrated household weight (w6), defined above, multiplying it by a correction for allocation of the address as a Version A address, multiplying by the adult non-response weight (w5), and calibrating to population totals. The final version A weights are called vera11wt (adults) and cvera11wt (children).
1.7.11 Child weights - summary
The weighting strategy for the child sample was:
- calculate weights (cw1) for the differential selection of addresses;
- calculate weights for the selection of dwelling units at each address (cw2) and for the selection of households at each dwelling unit (cw3);
- calculate weights (cw4) for the selection of children within each household;
- calibrate the combined child selection weight (cw1×cw2×cw3×cw4) so that the weighted sample of children matched population estimates for age/sex and Health Board. Scale this to give the final child interview weight (cint11wt)
1.7.12 The child interview weights
Address selection weights, dwelling unit and household selection weights (cw1, cw2 and cw3)
The selection weights for the addresses, dwelling units and households were generated in the same way as for the adult sample.
Weights for the selection of children at each household (cw4)
A maximum of two children were selected in each household so a selection weight (cw4) was calculated as the number of children in the household divided by the number of children selected. Without this selection weight children in larger households would have been under-represented in the final sample.
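A minimal sketch of the child selection weight, assuming only the rule stated above (at most two children selected per household); the function name is illustrative:

```python
MAX_CHILDREN_SELECTED = 2

def child_selection_weight(children_in_household):
    """Child selection weight (cw4): children in household / children selected."""
    if children_in_household < 1:
        raise ValueError("household must contain at least one child")
    selected = min(children_in_household, MAX_CHILDREN_SELECTED)
    return children_in_household / selected

for n in (1, 2, 3, 4):
    print(n, child_selection_weight(n))
# 1 1.0
# 2 1.0
# 3 1.5
# 4 2.0
```

Children in households with one or two children are all selected and so have a weight of 1; the weight rises only where some children could not be selected.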
Child interview weights (cint11wt)
The final child interview weights were calculated by taking the combined child selection weight (cw1×cw2×cw3×cw4) and calibrating it to the totals given in Figure 1I and Figure 1J. A high proportion of children in participating households took part in the survey (91% of all children selected for interview), so weighting for non-response was not needed. Therefore, the child weight was simply the scaled calibration weight.
Calibrating to these totals ensured that when national estimates are required, the age/sex and regional distributions of the child sample match those of the population. It does not ensure that age/sex proportions are correct within each Health Board. The sample was not designed to allow child estimates at Health Board level at yearly intervals or across the four survey years.
Figure 1I: 2011 Mid-year household population estimates for Scotland by Health Board
|Ayrshire & Arran||62,920|
|Dumfries & Galloway||24,240|
|Greater Glasgow & Clyde||96,330|
Figure 1J: 2011 Mid-year household population estimates for Scotland by age and sex (for children)
1.7.13 Combined weights
Several weights have also been calculated to allow for analysis of various combinations of data from the 2008-2011 surveys.
The weights provided for combined years of data are:
|Weight name||Purpose of combined weight|
|vera0911wt||For analysis of 2009 and 2011 combined version A adult data|
|cvera0911wt||For analysis of 2009 and 2011 combined version A child data|
|int1011_wt||For analysis of 2010 and 2011 combined adult data|
|cint1011_wt||For analysis of 2010 and 2011 combined child data|
|int08091011_wt||For analysis of 2008 to 2011 combined adult data|
|cint08091011_wt||For analysis of 2008 to 2011 combined child data|
|nurs1011_wt||For analysis of 2010 and 2011 combined nurse data|
|blod1011_wt||For analysis of 2010 and 2011 combined blood data|
|nurs0811wt||For analysis of 2008 to 2011 nurse data|
|blod0811wt||For analysis of 2008 to 2011 blood data|
In each case, the calculation of the weights followed the same procedure. Pre-calibration weights had already been calculated for the individual years. (These took into account selection weighting and, except for the child weights, non-response weighting.) The pre-calibration weights for the relevant years were combined and calibrated to Health Board and age/sex population totals. The population totals used were the average populations across the relevant years; for example, the average of the 2009 and 2011 population estimates was used for the combined 2009/2011 version A weight.
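The averaging of population totals across years can be sketched as follows. The population figures and group labels are hypothetical; the function simply takes the mean of each calibration total over the years being combined.

```python
def average_totals(*yearly_totals):
    """Average each population total across the survey years being combined."""
    groups = yearly_totals[0].keys()
    years = len(yearly_totals)
    return {g: sum(t[g] for t in yearly_totals) / years for g in groups}

# Hypothetical age/sex population totals for two survey years
pop_2009 = {"men 16-24": 300_000, "women 16-24": 295_000}
pop_2011 = {"men 16-24": 310_000, "women 16-24": 301_000}

targets = average_totals(pop_2009, pop_2011)
print(targets)  # {'men 16-24': 305000.0, 'women 16-24': 298000.0}
```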
1.8 Data analysis and reporting
SHeS is a cross-sectional survey of the population. It examines associations between health states, personal characteristics and behaviour. However, such associations do not necessarily imply causality. In particular, associations between current health states and current behaviour need careful interpretation, as current health may reflect past, rather than present, behaviour. Similarly, current behaviour may be influenced by advice or treatment for particular health conditions.
1.8.2 Reporting age variables
Defining age for data collection
A considerable part of the data collected in the 2011 SHeS is age specific, with different questions directed to different age groups. During the interview the participant's date of birth was ascertained. For data collection purposes, a participant's age was defined as their age on their last birthday before the interview. The nurse, who interviewed them later, treated them as being of the same age as at the interview, even if they had an intervening birthday.
Age as an analysis variable
Age is a continuous variable, and an exact age variable on the data file expresses it as such (so that, for example, someone whose 24th birthday was on January 1 2010 and who was interviewed on October 1 2010 would be classified as being aged 24.75 (24¾)).
The presentation of tabular data involves classifying the sample into year bands. This can be done in two ways, age at last birthday and 'rounded age', that is, rounded to the nearest integer. In this report all references to age are age at last birthday.
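The three ways of expressing age can be compared in a short sketch. It assumes that exact age is counted in completed months divided by 12, which reproduces the 24.75 figure in the example above; the rule actually used to derive the variable on the data file is not stated in this section, and the function names are illustrative.

```python
from datetime import date

def age_last_birthday(dob, interview):
    """Whole years completed at the interview date."""
    years = interview.year - dob.year
    if (interview.month, interview.day) < (dob.month, dob.day):
        years -= 1
    return years

def exact_age(dob, interview):
    """Age in years, counting completed months as twelfths (an assumption)."""
    months = (interview.year - dob.year) * 12 + (interview.month - dob.month)
    if interview.day < dob.day:
        months -= 1
    return months / 12

dob = date(1986, 1, 1)          # 24th birthday on January 1 2010
interview = date(2010, 10, 1)

print(exact_age(dob, interview))          # 24.75
print(age_last_birthday(dob, interview))  # 24
print(round(exact_age(dob, interview)))   # 25 ('rounded age')
```

The same participant falls into different year bands depending on which definition is used, which is why the report uses age at last birthday throughout.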
Some of the adult data included in the 2011 report have been age-standardised to allow comparisons between groups after adjusting for the effects of any differences in their age distributions. If data reported has been age-standardised this is highlighted in the title to the table or chart. When different sub-groups are compared in respect of a variable on which age has an important influence, any differences in age distributions between these sub-groups are likely to affect the observed differences in the proportions of interest.
It should be noted that all analyses in the report are presented separately for men and women and on some occasions data for all adults is also presented. All age standardisation has been undertaken separately within each sex, expressing male data to the overall male population and female data to the overall female population. When comparing data for the two sexes, it should be remembered that no age standardisation has been introduced to remove the effects of the sexes' different age distributions.
Age standardisation was carried out using the direct standardisation method. The standard population to which the age distribution of sub-groups was adjusted was the mid-year 2011 household population estimates for Scotland. The age-standardised proportion $p'$ was calculated as follows, where $p_i$ is the age-specific proportion in age group $i$ and $N_i$ is the standard population size in age group $i$:
$$p' = \frac{\sum_i N_i p_i}{\sum_i N_i}$$
Therefore $p'$ can be viewed as a weighted mean of $p_i$ using the weights $N_i$. Age standardisation was carried out using the age groups: 16-24, 25-34, 35-44, 45-54, 55-64, 65-74 and 75 and over. The variance of the standardised proportion can be estimated by:
$$\operatorname{var}(p') = \frac{\sum_i N_i^2 \, p_i q_i / n_i}{\left(\sum_i N_i\right)^2}$$
where $q_i = 1 - p_i$ and $n_i$ is the sample size in age group $i$.
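Direct standardisation can be sketched in a few lines. The standardised proportion is the weighted mean of the age-specific proportions, with the standard population sizes as weights. The variance line assumes the usual estimator for a directly standardised proportion: the sum of N_i² p_i (1 − p_i) / n_i over age groups, divided by the square of the total standard population. All figures below are hypothetical.

```python
def age_standardise(p, n, N):
    """Directly standardised proportion and its estimated variance.

    p: age-specific proportions in the sub-group
    n: sub-group sample sizes in each age group
    N: standard population sizes in each age group
    """
    total = sum(N)
    p_std = sum(Ni * pi for Ni, pi in zip(N, p)) / total
    var = sum(Ni**2 * pi * (1 - pi) / ni for Ni, pi, ni in zip(N, p, n)) / total**2
    return p_std, var

# Hypothetical sub-group, age groups 16-24, 25-34, ..., 75 and over
p = [0.30, 0.28, 0.25, 0.22, 0.18, 0.12, 0.08]
n = [60, 90, 110, 120, 115, 100, 70]
N = [600_000, 650_000, 700_000, 780_000, 660_000, 480_000, 380_000]

p_std, var = age_standardise(p, n, N)
print(round(100 * p_std, 1), round(100 * var ** 0.5, 2))  # estimate and standard error, %
```

Because every sub-group is weighted to the same standard age distribution, differences between standardised proportions are not driven by differences in the sub-groups' age profiles.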
1.8.3 Standard analysis breakdowns
National Statistics Socio-Economic Classification (NS-SEC)
SHeS 2011 measured socio-economic status using the National Statistics Socio-Economic Classification (NS-SEC) which was introduced in 2001. NS-SEC was introduced to SHeS in 2003 and replaced the social class measures used in the two previous rounds of the survey, Registrar General's Social Class (SC) and Socio-economic Group (SEG).10
NS-SEC was classified in two ways: on the basis of participants' own current or most recent occupation, and on the basis of the occupation details of the household reference person. The household reference person (HRP) was defined as the householder (the person in whose name the property was owned or rented) with the highest income. If there was more than one householder and they had equal incomes, then the household reference person was the eldest. The identity of the HRP was established in the household questionnaire and details about their occupation were collected at this point. If the HRP occupational details were collected by proxy from another household member these were collected again directly from the HRP during their individual interview (if one took place). Children were assigned the NS-SEC value of the HRP.
NS-SEC is an occupational based classification that uses the Standard Occupational Classification 2000 (SOC 2000) which replaced the Standard Occupational Classification 1990 (SOC 90) schema. The combination of SOC 2000 and information collected about employment status (whether an employer, self-employed or employee; whether a supervisor; number of employees at the workplace) for current or last job generates the following NS-SEC analytic classes:
- Employers in large organisations, higher managerial and professional
- Lower professional and managerial; higher technical and supervisory
- Intermediate occupations
- Small employers and own account workers
- Lower supervisory and technical occupations
- Semi-routine occupations
- Routine occupations.
The remaining categories include those who have never worked, or who gave no occupational details or whose information was inadequately described or unclassifiable for other reasons. Most of the analysis in the 2011 report was based on a five level version of this classification which combined the first two groups and the last two. Analysis is also possible using a three level classification which combines the intermediate and small employers and own account worker categories, and combines the lower supervisory group with the routine categories. All analysis was conducted using the NS-SEC of the HRP.
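The collapsing of the seven analytic classes into the five and three level versions can be expressed as simple lookups. The numeric codes 1 to 7 are an assumption that follows the order of the list above; the codes actually held on the dataset may differ.

```python
# Seven analytic classes, numbered in the order listed above (assumed coding)
FIVE_LEVEL = {1: 1, 2: 1,   # higher and lower managerial/professional combined
              3: 2,         # intermediate
              4: 3,         # small employers and own account workers
              5: 4,         # lower supervisory and technical
              6: 5, 7: 5}   # semi-routine and routine combined

THREE_LEVEL = {1: 1, 2: 1,        # managerial and professional
               3: 2, 4: 2,        # intermediate, small employers and own account
               5: 3, 6: 3, 7: 3}  # lower supervisory, semi-routine and routine

def collapse(nssec_class, mapping):
    """Return the collapsed class, or None for never worked or unclassifiable."""
    return mapping.get(nssec_class)

print(collapse(2, FIVE_LEVEL))      # 1
print(collapse(7, FIVE_LEVEL))      # 5
print(collapse(4, THREE_LEVEL))     # 2
print(collapse(None, THREE_LEVEL))  # None
```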
NS-SEC is a conceptually based schema which was developed from a sociological classification, the Goldthorpe Schema.11,12 The measure used in the 1995 and 1998 surveys, SC, used levels of occupational skill as the basis for its classification, whereas NS-SEC aims to differentiate between positions in the labour market in terms of aspects such as sources of income, job security, career advancement, authority and autonomy. A version of SC, derived from NS-SEC, has been produced by the Office for National Statistics and is available on the dataset.
The 2011 survey included questions designed to measure participants' household income. While household income alone can be used as an analysis variable, the analysis conducted for this report used an adjusted measure which took account of the number of persons within the household. The McClements method was used to equivalise incomes; this is detailed in the Glossary at the end of this report. The equivalised income measure was divided into quintiles for the presentation of analysis within the report, but the full continuous data is available on the dataset.
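The two steps described above, adjusting income for household composition and then dividing the adjusted measure into quintiles, can be sketched as follows. This is a simplified illustration under stated assumptions: the household's equivalence score is taken as an input rather than computed (the McClements weights themselves are set out in the Glossary and are not reproduced here), the quintiles are formed by unweighted rank, and quintile 1 is taken to be the lowest fifth. The function names are hypothetical.

```python
def equivalise(household_income, equivalence_score):
    """Adjust household income by dividing by the household's equivalence score.

    The score is assumed to have been derived beforehand from the
    household's composition using the chosen equivalence scale.
    """
    if equivalence_score <= 0:
        raise ValueError("equivalence score must be positive")
    return household_income / equivalence_score


def assign_quintiles(values):
    """Return a quintile for each value (1 = lowest fifth, 5 = highest fifth).

    Unweighted and rank-based: a sketch only, since survey quintiles
    would normally take the survey weights into account.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    quintiles = [0] * len(values)
    for position, index in enumerate(order):
        quintiles[index] = position * 5 // len(values) + 1
    return quintiles
```

For example, a household income of 30,000 with an equivalence score of 1.5 gives an equivalised income of 20,000; both figures are invented for illustration.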
Scottish Index of Multiple Deprivation (SIMD)
The analysis was based on the 2009 version of the Scottish Index of Multiple Deprivation.13 It is based on 38 indicators in seven individual domains: current income; employment; housing; health; education, skills and training; geographic access to services; and crime. SIMD is calculated at data zone level, enabling small pockets of deprivation to be identified. The data zones are ranked from most deprived (1) to least deprived (6505) on the overall SIMD index. The result is a comprehensive picture of relative area deprivation across Scotland. The index was divided into quintiles for the presentation of analysis within the report; a version divided into deciles is also available on the dataset. The full index is not available on the archived dataset due to concerns about its potential for identifying individual respondents or households.
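As an illustration of how a data zone's rank relates to the quintile and decile versions of the index, the sketch below splits the 6,505 ranks into equal-sized bands. It assumes that bands are formed purely by rank and that band 1 contains the most deprived data zones; the cut-points and numbering used in the published variables may differ, and the function name is hypothetical.

```python
N_DATA_ZONES = 6505  # ranks run from 1 (most deprived) to 6505 (least deprived)


def simd_band(rank, bands=5):
    """Return the band (1 = most deprived) for a data zone rank.

    Use bands=5 for quintiles or bands=10 for deciles. Integer
    arithmetic gives the ceiling of rank * bands / N_DATA_ZONES.
    """
    if not 1 <= rank <= N_DATA_ZONES:
        raise ValueError("rank must be between 1 and 6505")
    return (rank * bands + N_DATA_ZONES - 1) // N_DATA_ZONES
```

Releasing only the banded versions, as the report describes, means that a data zone's exact position in the ranking cannot be recovered from the archived dataset.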
Regression modelling has been used in a number of chapters to examine the factors associated with selected outcome variables, after adjusting for other predictors. For instance, in Volume 2 Chapter 1, binary logistic regression analyses have been performed to examine the association between children's Strengths and Difficulties Questionnaire (SDQ) scores and a variety of predictor variables including age, household income, number of children in the household and level of physical activity. Models were run for boys and girls separately. Chapter 7 also uses binary logistic regression to examine the association between being in a high risk health category and various predictor variables. The model was run twice, once for men and once for women. A wide range of possible predictor variables were tested in each model. Including the predictors together gives an estimate of the independent effect of each predictor variable on the outcome when all the other independent variables are included in the model.
The results of the binary logistic regression analyses are presented in tables showing odds ratios for the final models, together with the probability that the association is statistically significant. A predictor variable is considered significantly associated with the outcome variable if p<0.05. The models show the odds of being in the particular category of the outcome variable (e.g. having a high SDQ score) for each category of the independent variable (e.g. quintiles of equivalised household income). Odds are expressed relative to a reference category, which is assigned a value of 1. Odds ratios greater than 1 indicate higher odds than in the reference category, and odds ratios less than 1 indicate lower odds. Also shown are the 95% confidence intervals for the odds ratios. Where the interval does not include 1, the category is significantly different from the reference category.
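The relationship between a logistic regression coefficient, its odds ratio and the 95% confidence interval can be sketched as below. The sketch assumes a standard Wald-type interval, in which the coefficient plus or minus 1.96 standard errors is exponentiated; the report does not state how its intervals were derived, so this is an illustration of the interpretation rule rather than a reproduction of the report's method. The function name and any input values are hypothetical.

```python
import math


def odds_ratio_summary(coefficient, standard_error, z=1.96):
    """Summarise one category of a predictor against its reference category.

    Returns (odds ratio, lower limit, upper limit, significant), where
    `significant` is True when the interval excludes 1, the value
    assigned to the reference category.
    """
    odds_ratio = math.exp(coefficient)
    lower = math.exp(coefficient - z * standard_error)
    upper = math.exp(coefficient + z * standard_error)
    significant = not (lower <= 1.0 <= upper)
    return odds_ratio, lower, upper, significant
```

A coefficient of zero corresponds to an odds ratio of exactly 1, that is, the same odds as the reference category. In a complex survey design the standard error passed in should be one that takes the clustering, stratification and weighting into account.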
Missing values were included in the analyses; that is, people were included even if they did not have a valid answer, score or classification for one or more of the explanatory variables. Where a large number of people had a missing value for a variable, the missing values were included as a separate category (e.g. income). Where there were few records with a missing value, these individuals were included with the category containing the largest number of cases (e.g. those meeting physical activity recommendations in Volume 2 Chapter 1). The treatment of missing values in the regression models is explained in the footnote section of the relevant tables.
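The two rules for handling missing values in a categorical predictor can be sketched as follows. The threshold separating "a large number" from "few" missing records is not stated in the report, so it is left as a parameter that the caller must supply; the function name, the use of None to mark a missing value and the "Missing" label are likewise assumptions made for illustration.

```python
from collections import Counter


def recode_missing(values, threshold, missing_label="Missing"):
    """Recode None entries in a categorical predictor.

    If at least `threshold` records are missing, they become a separate
    category; otherwise they are merged into the category containing
    the largest number of valid cases.
    """
    n_missing = sum(value is None for value in values)
    if n_missing == 0:
        return list(values)
    valid = [value for value in values if value is not None]
    if n_missing >= threshold or not valid:
        fill = missing_label
    else:
        # Most frequent valid category; ties go to the first one encountered.
        fill = Counter(valid).most_common(1)[0][0]
    return [fill if value is None else value for value in values]
```

Either rule keeps every participant in the model, which is the stated aim; the choice between them affects only how the missing cases are represented among the categories of the predictor.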
1.8.5 Design effects and true standard errors
SHeS 2011 used a clustered, stratified multi-stage sample design. In addition, weights were applied when obtaining survey estimates. One of the effects of using the complex design and weighting is that standard errors for survey estimates are generally higher than the standard errors that would be derived from an unweighted simple random sample of the same size. The calculations of standard errors shown in tables, and comments on statistical significance throughout the report, have taken the clustering, stratification and weighting into account. The ratio of the standard error of the complex sample to that of a simple random sample of the same size is known as the design factor. Put another way, the design factor (or 'deft') is the factor by which the standard error of an estimate from a simple random sample has to be multiplied to give the true standard error of the complex design. The true standard errors and defts for SHeS 2011 have been calculated using a Taylor Series expansion method. The deft values and true standard errors (which are themselves estimates subject to random sampling error) are shown in Tables 1.13 to 1.27 for selected survey estimates presented in volumes 1 and 2.
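The definition of the design factor can be illustrated with a short sketch. It covers only the arithmetic linking the simple random sample standard error, the deft and the true standard error; it does not implement the Taylor Series expansion used to estimate the true standard errors in the report. The simple random sample formula shown is the usual one for a proportion, and the function names are hypothetical.

```python
import math


def srs_standard_error(proportion, sample_size):
    """Standard error of a proportion under simple random sampling."""
    return math.sqrt(proportion * (1 - proportion) / sample_size)


def design_factor(se_complex, se_srs):
    """Deft: ratio of the complex-design SE to the simple random sample SE."""
    return se_complex / se_srs


def true_standard_error(se_srs, deft):
    """Scale a simple random sample SE by the deft to get the true SE."""
    return se_srs * deft
```

A deft above 1 indicates that the complex design and weighting have increased the standard error relative to a simple random sample of the same size, so confidence intervals based on the true standard error are correspondingly wider.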