### 9. Analysis techniques

**Significance testing**

9.1 Where this report discusses a difference between two percentages (either across time, or between two different groups of people within a single year), the difference is significant at the 95% level or above, unless otherwise stated. Differences between two years were tested using standard z-tests, taking account of complex standard errors arising from the sample design. Differences between groups within a given year were tested using logistic regression analysis, which shows the factors and categories that are significantly (and independently) related to the dependent variable (see below for further detail). This analysis was done in PASW 18, using the Complex Samples logistic regression (CSLOGISTIC) procedure to take account of the sample design in calculations.
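As an illustration of the z-test described in 9.1, the following is a minimal sketch in Python, not the procedure actually run for this report. It assumes two independent samples and approximates the effect of the sample design by inflating each variance with a design effect (`deff1`, `deff2`). The function name, the design effect values and the example percentages are all hypothetical.

```python
import math


def two_proportion_z_test(p1, n1, p2, n2, deff1=1.0, deff2=1.0):
    """Two-sided z-test for the difference between two independent proportions.

    p1, p2       : proportions (0 to 1) in each sample
    n1, n2       : sample sizes
    deff1, deff2 : assumed design effects; 1.0 means simple random sampling
    """
    # Variance of each proportion, inflated by the assumed design effect
    var1 = deff1 * p1 * (1 - p1) / n1
    var2 = deff2 * p2 * (1 - p2) / n2

    # Standard error of the difference (unpooled)
    se_diff = math.sqrt(var1 + var2)

    z = (p1 - p2) / se_diff
    # Two-sided p-value from the standard normal distribution
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Illustrative figures only: 60% in one year, 50% in another, 1,000 respondents each
z_srs, p_srs = two_proportion_z_test(0.60, 1000, 0.50, 1000)
z_adj, p_adj = two_proportion_z_test(0.60, 1000, 0.50, 1000, deff1=1.5, deff2=1.5)

print(f"Ignoring the design:       z = {z_srs:.2f}, p = {p_srs:.4f}")
print(f"Design effect 1.5 assumed: z = {z_adj:.2f}, p = {p_adj:.4f}")
print("Significant at the 95% level:", p_adj <= 0.05)
```

A design effect above 1 widens the standard error, so the same observed difference gives a smaller z statistic than it would under simple random sampling.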

**Regression analysis**

9.2 Regression analysis aims to summarise the relationship between a 'dependent' variable and one or more 'independent' explanatory variables. It shows how well we can estimate a respondent's score on the dependent variable from knowledge of their scores on the independent variables. This technique takes into account relationships between the different independent variables (for example, between education and income, or social class and housing tenure). Regression is often undertaken to support a claim that the phenomena measured by the independent variables cause the phenomenon measured by the dependent variable. However, the causal ordering, if any, between the variables cannot be verified or falsified by the technique. Causality can only be inferred through special experimental designs or through assumptions made by the analyst.

9.3 All regression analysis assumes that the relationship between the dependent and each of the independent variables takes a particular form. Logistic regression analysis is a method that summarises the relationship between a binary 'dependent' variable (one that takes the values '0' or '1') and one or more 'independent' explanatory variables.
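The form assumed by logistic regression can be sketched as follows: the probability that the dependent variable equals 1 is modelled as a logistic function of a linear combination of the independent variables. The example below is a simplified illustration with a single binary independent variable, fitted by Newton-Raphson in plain Python. It does not adjust for the sample design as the analysis in this report does, and the data and function names are hypothetical.

```python
import math


def sigmoid(t):
    """Logistic function: maps a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-t))


def fit_logistic(xs, ys, iterations=25):
    """Fit P(y = 1) = sigmoid(b0 + b1 * x) by Newton-Raphson.

    xs : values of one independent variable
    ys : binary dependent variable (0 or 1)
    """
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = 0.0              # gradient of the log-likelihood
        h00 = h01 = h11 = 0.0      # information matrix (negative Hessian)
        for x, y in zip(xs, ys):
            p = sigmoid(b0 + b1 * x)
            w = p * (1 - p)
            g0 += y - p
            g1 += (y - p) * x
            h00 += w
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system explicitly
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


# Hypothetical data: 30 of 100 people in group 0 and 50 of 100 in group 1 have y = 1
xs = [0] * 100 + [1] * 100
ys = [1] * 30 + [0] * 70 + [1] * 50 + [0] * 50

b0, b1 = fit_logistic(xs, ys)
odds_ratio = math.exp(b1)
print(f"intercept = {b0:.4f}, coefficient = {b1:.4f}, odds ratio = {odds_ratio:.4f}")
```

With a single binary independent variable, the fitted odds ratio equals the ratio of the odds in the two groups, which gives a simple check on the fit.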

9.4 The significance of each independent variable is indicated by its p-value ('P'). A p-value of 0.05 or less indicates that there is a 5% chance or less that we would have found these differences between the categories just by chance if in fact no such difference exists, while a p-value of 0.01 or less indicates that there is a 1% chance or less. P-values of 0.05 or less are generally considered to indicate that the difference is statistically significant, while a p-value above 0.05 and up to 0.10 may be considered marginally significant.
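The thresholds in 9.4 can be sketched as a small helper. The example assumes, purely for illustration, that the p-value comes from a Wald z-test (a coefficient divided by its standard error); the report does not state which test statistic the software used. The function names and the example estimates are hypothetical.

```python
import math


def wald_p_value(estimate, std_error):
    """Two-sided p-value for a coefficient, assuming a Wald z-test."""
    z = estimate / std_error
    return math.erfc(abs(z) / math.sqrt(2))


def describe_significance(p_value):
    """Label a p-value using the 0.05 and 0.10 thresholds given in 9.4."""
    if p_value <= 0.05:
        return "significant"
    if p_value <= 0.10:
        return "marginally significant"
    return "not significant"


# Hypothetical coefficients and standard errors
for estimate, std_error in [(0.85, 0.30), (0.50, 0.28), (0.10, 0.30)]:
    p = wald_p_value(estimate, std_error)
    print(f"estimate = {estimate:.2f}, p = {p:.3f}: {describe_significance(p)}")
```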

### Contact

Email: Donna Easterlow