4. Sample Design
Eligible people were identified using an extract from the Community Health Index (CHI) database provided to the Survey Team of Public Health Scotland on 6th October 2021. Public Health Scotland receive daily updates to the CHI database and therefore the most up-to-date information available was used for the sample. People eligible to be sampled for the survey were those who were registered with a Scottish GP practice and aged 17 or over on 6th October 2021, the date when the sampling procedure commenced. Patients with non-Scottish postcodes were excluded from the sampling frame. All data were accessed, managed and stored in accordance with the data confidentiality protocols described in the privacy notice for the survey.
A small number of special practices, run by NHS Boards to provide primary care services to particular small groups of people (e.g. practices for homeless people and practices associated with universities), were excluded from the survey.
Sampling Design and Sample Size Calculation
Sampling was carried out within GP practice lists, with the aim of obtaining enough responses to give a reasonably reliable result for each practice. The reliability of the result depends on the number of questionnaires returned and on the variability of the responses.
The sample size calculated for each practice was based on the minimum number of responses required to estimate a percentage with a 95 per cent confidence interval of +/- eight percentage points, when sampling from a finite population.
The formula for the minimum number of responses required (M) is
$$M = \frac{B}{1 + (B - 1)/N}$$
where:
- N is the number of people registered with a practice on the sampling frame (i.e. the number of people aged 17 and over);
- $B = z^2 p(1-p) / c^2 \approx 150$, using the following definitions:
  - p is the proportion answering in a certain way, assumed to be 0.5 to give maximum variability;
  - z is 1.96 for a 95 per cent confidence interval (using the standard normal distribution);
  - c is the maximum acceptable half-width of the confidence interval, in this case 0.08 (eight percentage points).
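The calculation above can be sketched in a few lines of Python. This is an illustrative sketch only, not the code used for the survey: the function name `min_responses` is hypothetical, and rounding up to the next whole response is an assumption (the report does not state the rounding rule). With the default values, B evaluates to about 150.06, which the report rounds to 150.

```python
import math


def min_responses(n, p=0.5, z=1.96, c=0.08):
    """Minimum number of responses M for a practice with n eligible people.

    p: assumed proportion (0.5 gives maximum variability)
    z: standard normal quantile for the confidence level (1.96 for 95%)
    c: maximum acceptable half-width of the confidence interval
    """
    b = z**2 * p * (1 - p) / c**2  # about 150.06 with the defaults
    m = b / (1 + (b - 1) / n)      # finite population adjustment
    return math.ceil(m)            # assumption: round up to a whole response


for n in (200, 500, 1000, 2000, 5000, 10000, 20000):
    print(n, min_responses(n))
```

For a practice list of 1,000 this gives 131 responses, and the required number approaches B as the list size grows.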
Table 4 shows the minimum number of responses required (M) based on the assumptions above for some example practice population sizes.
| Practice list size (N) | 200 | 500 | 1,000 | 2,000 | 5,000 | 10,000 | 20,000 |
|---|---|---|---|---|---|---|---|
| Min. required responses (M) | 86 | 116 | 131 | 140 | 146 | 148 | 149 |
| Percentage of GP practice population required to respond | 43% | 23% | 13% | 7% | 3% | 1% | 1% |
In practice, if the underlying proportion is actually higher or lower than 0.5, then these numbers of responses would give narrower confidence intervals (or fewer responses would be required for the same accuracy).
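To illustrate this point, the formula for M can be rearranged to give the confidence interval half-width achieved by a given number of responses: c = z * sqrt(p(1-p)/M * (N-M)/(N-1)). The Python sketch below applies that rearrangement; the function name `ci_half_width` and the example proportion of 0.8 are illustrative choices, not values from the survey.

```python
import math


def ci_half_width(m, n, p=0.5, z=1.96):
    """Half-width of the confidence interval for a proportion p,
    estimated from m responses out of a practice list of n people."""
    fpc = (n - m) / (n - 1)  # finite population correction
    return z * math.sqrt(p * (1 - p) / m * fpc)


# 131 responses from a list of 1,000, as in the table above
print(round(ci_half_width(131, 1000, p=0.5), 3))  # just under 0.08
print(round(ci_half_width(131, 1000, p=0.8), 3))  # narrower interval
```

With 131 responses from a list of 1,000, the half-width is just under eight percentage points when p is 0.5 and about 6.4 percentage points when p is 0.8.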
The minimum number of responses required was adjusted upwards to allow for expected non-response to the survey. The estimated response rate to the 2021/22 survey for each GP practice was the average of its response rates for the 2017/18 and 2019/20 surveys. Where response rates were not available (i.e. for a new practice), an assumed response rate was used, based on the proportion of the eligible population living in the most deprived 15% of data zones (as defined by the Scottish Index of Multiple Deprivation 2020), since the level of deprivation affects the likelihood of a person responding to the survey. Estimated required sample sizes were capped at a maximum of 1,000 for individual practices.
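A minimal Python sketch of this adjustment follows. The report does not give the exact adjustment formula, so dividing the minimum number of responses by the expected response rate is an assumption, as are the function and parameter names. The sketch also limits the sample to the number of eligible people, since a practice cannot be sampled beyond its list.

```python
import math

MAX_SAMPLE = 1000  # cap on the sample size per practice, as stated in the report


def required_sample(min_resp, eligible, rate_2017_18, rate_2019_20):
    """Number of people to sample so that about min_resp responses are expected.

    Assumption: the upward adjustment divides the minimum number of
    responses by the expected response rate.
    """
    # Expected rate: average of the two previous surveys' response rates
    expected_rate = (rate_2017_18 + rate_2019_20) / 2
    needed = math.ceil(min_resp / expected_rate)
    # Apply the cap and never sample more people than are eligible
    return min(needed, MAX_SAMPLE, eligible)


print(required_sample(146, 5000, 0.25, 0.35))  # expected rate of 30%
print(required_sample(146, 5000, 0.10, 0.10))  # hits the cap of 1,000
```

Under these assumptions, a practice needing 146 responses with an expected response rate of 30% would have 487 people sampled, while one with an expected rate of 10% would be capped at 1,000.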
The addresses from CHI were cross-checked against the Scottish Postcode Directory to ensure that they were complete. Records with invalid, deleted or incomplete postcodes were removed prior to sample selection, as were a small number of people who had requested not to be included in this or other surveys. A total of 538,993 people were sampled for inclusion in the Health and Care Experience Survey 2021/22.
PHS checked for cases where the same name (first name, middle name and surname) and address appeared more than once. All records in each such "duplicate" group were removed prior to sample selection. The duplicates may be relatives living together or errors in the source data. Removing them ensures there is no ambiguity as to who is being asked to participate in the survey and reduces the risk of questionnaires being sent out in error.
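The duplicate check can be sketched as follows. This is an illustrative Python sketch, not the procedure PHS ran: the record layout and the field names (`first_name`, `middle_name`, `surname`, `address`) are hypothetical, and exact string matching is assumed. Note that every record in a duplicate group is dropped, rather than one copy being kept.

```python
from collections import Counter


def drop_duplicate_groups(records):
    """Remove every record whose name and address appear more than once."""

    def key(record):
        # Hypothetical field names; exact matching is an assumption
        return (
            record["first_name"],
            record["middle_name"],
            record["surname"],
            record["address"],
        )

    counts = Counter(key(r) for r in records)
    # Keep only records whose name and address combination is unique
    return [r for r in records if counts[key(r)] == 1]


people = [
    {"id": 1, "first_name": "Ann", "middle_name": "", "surname": "Reid", "address": "1 High St"},
    {"id": 2, "first_name": "Ann", "middle_name": "", "surname": "Reid", "address": "1 High St"},
    {"id": 3, "first_name": "Ann", "middle_name": "", "surname": "Reid", "address": "2 High St"},
]
print([p["id"] for p in drop_duplicate_groups(people)])
```

In this made-up example, records 1 and 2 share a name and address, so both are removed and only record 3 remains.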
For the majority of practices in Scotland, Public Health Scotland took a random sample of the required number of people from each practice, using the sampling frame drawn from the CHI database. For some practices with very small numbers of eligible people, all eligible people were included in the survey in order to meet the minimum sample size requirements identified from the calculation above. The sample was selected using the statistical software package SPSS version 24.0.
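The selection rule for a single practice can be sketched as below. The survey sample was drawn in SPSS; this Python version is only an illustration of the logic, assuming simple random sampling without replacement, and the function name `select_sample` and the `seed` parameter are hypothetical.

```python
import random


def select_sample(eligible_ids, sample_size, seed=None):
    """Select people from one practice's list of eligible people.

    If the practice has no more eligible people than the required
    sample size, everyone is included.
    """
    eligible_ids = list(eligible_ids)
    if len(eligible_ids) <= sample_size:
        return eligible_ids
    rng = random.Random(seed)  # seed only makes the illustration repeatable
    # Simple random sample without replacement (an assumption)
    return rng.sample(eligible_ids, sample_size)


large_practice = range(1, 5001)
small_practice = range(1, 121)
print(len(select_sample(large_practice, 487, seed=1)))
print(len(select_sample(small_practice, 487, seed=1)))
```

The first call returns 487 distinct people from a list of 5,000, while the second returns all 120 people on the small list.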
Further references for this methodology are: Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.