10 Confidence Intervals and Statistical Significance
SCJS estimates are based on a representative sample of the population of Scotland aged 16 or over living in private households. A sample, as used in the SCJS, is a small-scale representation of the population from which it is drawn.
Any sample survey may produce estimates that differ from the values that would have been obtained if the whole population had been interviewed (the true population value). The magnitude of these differences is related to the size and variability of the estimate, and the design of the survey, including sample size.
It is, however, possible to calculate a range of values within which the population figure is estimated to lie, known as the confidence interval (its half-width is commonly referred to as the margin of error). At the 95 per cent confidence level, when assessing the results of a single survey, it is assumed that there is a one in 20 chance that the true population value will fall outside the 95 per cent confidence interval calculated for the survey estimate. Equivalently, over many repeats of a survey under the same conditions, one would expect the confidence interval to contain the true population value 95 times out of 100.
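As a rough illustration of the idea, the sketch below computes a 95 per cent confidence interval for a proportion. It assumes a simple random sample, which is not how SCJS intervals are produced (they use complex standard errors), and the function name and the figures passed to it are hypothetical.

```python
import math

def srs_confidence_interval(p, n, z=1.96):
    """Confidence interval for a proportion p from a simple random sample of size n.

    z=1.96 corresponds to the 95 per cent confidence level.
    Assumption: simple random sampling, with no design effect or weighting.
    """
    standard_error = math.sqrt(p * (1 - p) / n)
    margin = z * standard_error
    return p - margin, p + margin

# Hypothetical figures: an estimate of 20 per cent from 16,000 interviews.
low, high = srs_confidence_interval(0.20, 16000)
print(f"95% CI: {low:.4f} to {high:.4f}")
```

Larger samples narrow the interval because the standard error shrinks with the square root of the sample size.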
Because of sampling variation, changes in reported estimates between survey years or between population subgroups may occur by chance. In other words, the change may simply be due to which respondents were randomly selected for interview.
Whether this is likely to be the case can be assessed using standard statistical tests. These tests indicate whether differences are likely to be due to chance or represent a real difference. In general, only differences that are statistically significant at the five per cent level (and are therefore likely to be real as opposed to chance) are described in the 2009/10 SCJS Main Findings report and the other supplementary SCJS reports.
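One common way to carry out such a test is a two-sided z-test on the difference between two estimates. The sketch below is an illustration only: the report does not specify which test is applied, the function name and input figures are hypothetical, and the two estimates are assumed to come from independent samples.

```python
import math

def difference_is_significant(est_a, se_a, est_b, se_b, z_crit=1.96):
    """Two-sided z-test for the difference between two independent estimates.

    se_a and se_b are the standard errors of the two estimates
    (ideally complex standard errors that reflect the survey design).
    z_crit=1.96 corresponds to the five per cent significance level.
    Returns (significant, z_statistic).
    """
    se_diff = math.sqrt(se_a ** 2 + se_b ** 2)
    z = (est_a - est_b) / se_diff
    return abs(z) > z_crit, z

# Hypothetical figures for two survey years.
significant, z = difference_is_significant(0.20, 0.004, 0.18, 0.004)
print(significant, round(z, 2))
```

A difference is flagged only when it is large relative to the combined sampling error of the two estimates.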
10.2 SCJS confidence intervals
Confidence intervals around SCJS estimates are based on sampling variation calculations which reflect the stratified and, in some areas, clustered design of the survey, and also the weighting applied. The resulting standard errors are often referred to as complex standard errors (CSEs). The values for these were calculated using the SAS Surveymeans module (http://www.sas.com).
Statistical significance for change in SCJS estimates for 'all SCJS crime' cannot be calculated in the same way as for other SCJS estimates. This is because 'all SCJS crime' combines personal and household crime, and the personal crime rate involves an extra stage of sampling (selecting the adult respondent for interview) compared with the household crime rate (where the respondent represents the whole household) (sections 7.2.1 and 8.1). Technically these are estimates from two different, although obviously highly related, surveys. The Office for National Statistics (ONS) methodology group has provided an approximation method to overcome this problem. This method is also used by the British Crime Survey (BCS).
The approach involves producing population-weighted variances associated with two approximated estimates for overall crime. The first approximation is derived by apportioning household crime equally among adults within the household (in other words, converting households into adults). The second apportions personal crimes to all household members (converting adults into households).
The variances are calculated in the same way as for the standard household or personal crime rates (i.e. taking into account the complex sample design and weighting). An average is then taken of the two estimates of the population-weighted variances. The resulting approximated variance is used in the calculation of confidence intervals for the estimate of 'all SCJS crime'. It is also used in the calculation of the sampling error around changes in estimates of 'all SCJS crime', which makes it possible to determine whether such differences are statistically significant.
This method incorporates the effect of any covariance between household and personal crime. By taking an average of the two approximations, it also counteracts any possible effect on the estimates of differing response rates by household size.
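The final averaging step of the approach described above can be sketched as follows. This is a minimal illustration, not the ONS or SCJS implementation: it assumes the two population-weighted variances have already been calculated with the complex design taken into account, and the function name and figures are hypothetical.

```python
import math

def all_crime_interval(estimate, var_adult_based, var_household_based, z=1.96):
    """Approximate variance and confidence interval for an 'all crime' estimate.

    var_adult_based: variance from the approximation that apportions
        household crime among adults in the household.
    var_household_based: variance from the approximation that apportions
        personal crime to all household members.
    Both variances are assumed to be precomputed elsewhere.
    """
    approx_variance = (var_adult_based + var_household_based) / 2
    standard_error = math.sqrt(approx_variance)
    margin = z * standard_error
    return approx_variance, (estimate - margin, estimate + margin)

# Hypothetical figures: a rate per 10,000 and two variance approximations.
variance, (low, high) = all_crime_interval(2000.0, 3600.0, 4400.0)
print(variance, round(low, 2), round(high, 2))
```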
If confidence intervals are not provided, then an approximation may be used. The standard error should be calculated assuming a simple random sample and then multiplied by an appropriate design factor; the adjusted standard error is used to calculate the confidence interval. Design factors will differ for different types of crime and characteristics. Examination of the data indicates that most design factors that have been calculated have values of less than 1.2. This suggests that the use of a design factor of 1.2 would provide conservative (that is, slightly too wide) confidence intervals for most estimates from the survey, including the main and self-completion data.
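The approximation can be sketched as below for a proportion. The default design factor of 1.2 is the conservative value suggested in the text; the function names and example figures are hypothetical, and the simple random sample formula shown applies to proportions rather than to incidence rates.

```python
import math

def design_factor(se_complex, se_srs):
    """Design factor: ratio of the complex standard error to the SRS standard error."""
    return se_complex / se_srs

def approx_confidence_interval(p, n, deft=1.2, z=1.96):
    """Approximate 95 per cent CI for a proportion when no CI is published.

    Step 1: standard error under a simple random sample assumption.
    Step 2: inflate it by the design factor (deft).
    Step 3: build the interval from the inflated standard error.
    """
    se_srs = math.sqrt(p * (1 - p) / n)
    se_adjusted = se_srs * deft
    margin = z * se_adjusted
    return p - margin, p + margin

# Hypothetical figures: an estimate of 20 per cent from 16,000 interviews.
low, high = approx_confidence_interval(0.20, 16000)
print(f"Approximate 95% CI: {low:.4f} to {high:.4f}")
```

Because the design factor scales the standard error directly, a factor of 1.2 widens the interval by 20 per cent relative to the simple random sample interval.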
Table 10.1 shows the following for the key crime groups:
- The estimates for incidence rates per 10,000 adults / households;
- The 95% confidence intervals;
- The simple random sample (SRS) standard error;
- The complex, or SCJS sample, standard error;
- The design factor.
Table 10.1: Rates, confidence intervals, standard errors and design factors for key crime groups (incidence rate per 10,000)
Base: Adults (16,036).
Variable name: incidence variables (see section 7.3 for details).
Columns: Crime rates per 10,000; SRS Stand. Err.; SCJS Stand. Err.
Rows:
- ALL SCJS CRIME
- Motor vehicle vandalism
- All mv theft related incidents
- Theft of a motor vehicle
- Theft from a motor vehicle
- Attempted theft of / from mv
- Other h'hold thefts inc. bicycles
- Other household theft
- Personal theft excl. robbery
- Theft from the person
- Other personal theft