Chapter 2: Methods
2.1 The Growing Up in Scotland sample
The analysis in this report uses information from families in both birth cohorts, predominantly from when the cohort child was aged 3. Some families who initially took part in GUS did not do so at all of the subsequent sweeps. There are a number of reasons why respondents drop out of longitudinal surveys, and such attrition is not random. All of the statistics have therefore been weighted by a specially constructed longitudinal weight to adjust for non-response and sample selection. Only unweighted sample sizes are given in the tables. Standard errors have been adjusted to take account of the cluster sampling.
The study has been designed so that the sample of children is representative of all children living in Scotland at age 10 months who were born within a specific 12-month period. For BC1, this is June 2004 to May 2005 and for BC2 it is March 2010 to February 2011. As such, at age 3 the weighted sample is considered to be representative of all children living in Scotland aged 3. Thus BC1 is used interchangeably with 'children aged 3 in 2007/08' and BC2 is used similarly with 'children aged 3 in 2013'.
At each sweep/year of fieldwork, interviews took place around six weeks before the child's next birthday. In the first year of the study, children were therefore 10 months old. For the purposes of this report, beyond the first interview, the child's age is referred to in years. It is worth bearing in mind, however, that a '3-year-old' child was actually 34 months old, or just under 3.
Language ability was measured in both GUS birth cohorts via the naming vocabulary subtest of the British Ability Scales (BAS). This subtest is part of a cognitive assessment battery designed for children aged between 2 years and 6 months and 17 years and 11 months. Numerous tests of ability and intelligence exist but the BAS is particularly suitable for administration in a social survey like GUS.
The naming vocabulary assessment measures a child's language development. The test requires the child to name a series of pictures of everyday items and assesses the expressive language ability of children. There are 36 items in total in the naming vocabulary assessment. However, to reduce burden and to avoid children being upset by the experience of repeatedly failing items within the scale, the number of items administered to each child is dependent on their performance. For example, one of the criteria for terminating the naming vocabulary assessment is if five successive items are answered incorrectly.
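The stopping criterion mentioned above can be sketched as follows. This is only an illustration of a "five successive incorrect items" rule, not the BAS administration procedure, which has further starting and stopping criteria not described here. The function name `administer` and the response pattern are hypothetical.

```python
def administer(responses, max_items=36, stop_after=5):
    """Return (items_administered, raw_score) under a consecutive-failure rule.

    responses: iterable of booleans, True when the child names the item correctly.
    max_items: total number of items in the assessment (36 for naming vocabulary).
    stop_after: number of successive incorrect answers that ends the assessment.
    """
    administered = 0
    correct = 0
    consecutive_wrong = 0
    for answer in responses:
        if administered == max_items:
            break
        administered += 1
        if answer:
            correct += 1
            consecutive_wrong = 0  # a correct answer resets the run of failures
        else:
            consecutive_wrong += 1
            if consecutive_wrong == stop_after:
                break  # stop to avoid repeated failure
    return administered, correct


# Hypothetical child: three correct, one wrong, one correct, then five wrong in a row.
answers = [True, True, True, False, True] + [False] * 5 + [True] * 26
print(administer(answers))
```

In this invented example the assessment ends after the tenth item, so the remaining 26 items are never shown to the child.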
Children in both cohorts have been asked to complete the same assessments when they were aged 3 years old (34 months) and when they were aged 5 years old (58 months - fieldwork undertaken during 2015).
On completion, a range of scores is available for each child: raw score, ability score and standardised score or 't-score'. The raw score counts the number of items a child answers correctly. As different children are asked different item sets depending on their age and performance, raw scores cannot be compared directly between children. To allow comparison between children, the raw score is therefore converted to an ability score. The range of ability scores varies from one sub-test to the next. To allow comparison of a child's performance on different BAS sub-tests, t-scores are derived. T-scores for each assessment have an average of 50 and a standard deviation of 10. A child with a t-score of 50 therefore has average ability relative to all children in that age group. Those with a t-score greater than 50 scored above average and those with a t-score of less than 50 scored below average. By using the standardised scores it is possible to compare ability at ages 3 and 5 and to consider whether children who scored above, below or about average at age 3 continued to do so at age 5.
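As a rough illustration of what a scale with an average of 50 and a standard deviation of 10 means, the sketch below rescales an ability score against a reference mean and standard deviation. This is an assumption-laden simplification: BAS t-scores are derived from the published norms for each age group, and `norm_mean` and `norm_sd` are hypothetical stand-ins for those norms.

```python
def t_score(ability, norm_mean, norm_sd):
    """Rescale an ability score so the reference group has mean 50 and SD 10.

    norm_mean and norm_sd are hypothetical reference values for the child's
    age group; they are not taken from the BAS norm tables.
    """
    return 50 + 10 * (ability - norm_mean) / norm_sd


# A child scoring half a standard deviation above the reference mean.
print(t_score(80, norm_mean=70, norm_sd=20))  # 55.0
```

On this scale, a difference of 10 points corresponds to one standard deviation in the reference group.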
Whilst the same BAS assessment - naming vocabulary - was used for both cohorts at the same age, the edition of BAS was different. For BC1, the 2nd edition assessment was used, whereas for BC2 the 3rd edition was used. Whilst the assessments are almost identical, there are a small number of differences - for example in the individual items, the order of the items and the stopping points - which mean that ability scores cannot be compared straightforwardly between the cohorts. To allow a comparison, the assessment authors provided a calibration formula which permitted comparison of the standardised ability scores (t-scores). Note that, because of this adjustment, it is not possible to convert differences in average cognitive ability scores to developmental age in months, as has been done in a previous GUS report (Bradshaw, 2011).
2.3 Analytic approach
Much of this report is concerned with exploring changes between the two cohorts, both at an overall level and within major socio-economic sub-groups. The relationship between the outcome being examined (e.g. home learning activities or language ability) and the socio-economic indicator was examined separately for each cohort. This allowed us to identify any noteworthy differences in outcomes - within each cohort - between children in different groups. By then comparing the results for BC1 and BC2 using analysis which combined the cohorts, we were able to assess whether there had been any change in the nature of the relationship between the outcome variable and the socio-economic indicator across the cohorts: for example, whether there had been a narrowing or widening of the differences between outcomes for children in the different sub-groups.
The cohort or sub-group being examined in each table is clearly described and the numerical base is also shown. While all results have been calculated using weighted data, the bases shown provide the unweighted counts. It should therefore be noted that the results and bases presented cannot be used to calculate how many respondents gave a certain answer.
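The following sketch shows why a weighted percentage combined with an unweighted base does not recover the number of respondents who gave an answer. The answers and weights are invented for illustration and are not GUS data; `weighted_percentage` is a hypothetical helper.

```python
def weighted_percentage(answers, weights):
    """Weighted share (in %) of respondents for whom the answer is True."""
    total = sum(weights)
    yes = sum(w for a, w in zip(answers, weights) if a)
    return 100 * yes / total


# Four invented respondents: the two who said "yes" carry smaller weights.
answers = [True, True, False, False]
weights = [0.5, 0.5, 1.5, 1.5]

pct = weighted_percentage(answers, weights)  # weighted result, as reported
base = len(answers)                          # unweighted base, as reported
naive_count = pct / 100 * base               # what a reader might compute
actual_count = sum(answers)                  # what actually happened

print(pct, base, naive_count, actual_count)
```

Here the weighted percentage is 25% on a base of 4, which would suggest one respondent, although two respondents actually gave that answer.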
Many of the factors we are interested in are related to each other as well as being related to participation in home learning activities or early language development. For example, younger mothers are more likely to have lower qualifications, to be lone parents, and to live in areas of high deprivation. Simple analysis may identify a relationship between maternal age and home learning activities - for example, that younger mothers read with their children less often. However, this relationship may be occurring because of the underlying association between maternal age and education. In that case, it is the lower education levels amongst younger mothers which are driving the association with frequency of reading, rather than the fact that they are younger. To avoid this difficulty, multivariable regression analysis has been used. This analysis allows the examination of the relationships between an outcome variable (e.g. frequent parent-child reading or language ability score) and multiple explanatory variables (e.g. parental education, parental employment status, child gender, cohort) whilst controlling for the inter-relationships between the explanatory variables. This means it is possible to identify an independent relationship between any single explanatory variable and the outcome variable; to show, for example, that there is a relationship between parental employment status and home learning activities that does not simply occur because both are related to parental education and maternal age.
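The maternal age example can be made concrete with a small sketch. The data below are invented so that reading frequency depends only on education, while younger mothers are more likely to have lower qualifications; they are not GUS data. The `ols` helper is a minimal least-squares solver written for this illustration, not the software used for the report, and it ignores weights and clustering.

```python
def ols(X, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


# Invented cells: (younger mother, lower education, families, reading days/week).
# Reading days depend on education only: 3 if lower, 5 if higher.
cells = [(1, 1, 30, 3.0), (1, 0, 10, 5.0), (0, 1, 10, 3.0), (0, 0, 30, 5.0)]
rows = [(young, low, days) for young, low, n, days in cells for _ in range(n)]
y = [days for _, _, days in rows]

# Simple analysis: reading days on maternal age only.
simple = ols([[1.0, young] for young, _, _ in rows], y)
# Multivariable analysis: maternal age and education together.
multi = ols([[1.0, young, low] for young, low, _ in rows], y)

print("age coefficient, simple model:       ", round(simple[1], 6))
print("age coefficient, multivariable model:", round(multi[1], 6))
print("education coefficient:               ", round(multi[2], 6))
```

In the simple model younger mothers appear to read about one day a week less. Once education is included, the age coefficient falls to approximately zero and the difference is attributed to education, which is the pattern described in the text.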
Previous research has shown that socio-economic characteristics such as household income, parental level of education and social class are closely interrelated. Therefore, for analysis purposes we selected only one measure to reflect the child's social background, namely parental level of education. Parental level of education was chosen because previous GUS analysis (Bradshaw, 2011) showed that this was the socio-economic factor most strongly related to language development.
- Parental level of education (highest level in household)
- Number of children in household (one, two or three, or four or more)
- Whether child was first born
- Family type (whether one- or two-parent household)
- Languages spoken in the household (whether English only, English and other language, or other languages only)
- Child's sex
- Employment status of child's main carer (whether working full-time, working part-time, or not working)
It is worth noting that the influence of these factors is likely to vary as the child gets older. For example, a main carer who is 'not working' when the child is aged 10 months may reflect the fact that he or she is on maternity or paternity leave. As such, at this age, 'not working' may arguably be used as an indicator of how much time the main carer has available to spend with the child. At age 3, however, a main carer who is 'not working' may be indicative of socio-economic disadvantage - something which has often been shown to be associated with a lower frequency of home learning activities. This means that employment status of the child's main carer may have a different relationship with the frequency of home learning activities at different ages. This should be borne in mind when interpreting the results.
The main factors influencing a child's enjoyment of reading at age 8 are likely to be different to those influencing frequency of parent-child activities undertaken with a baby or toddler. Therefore, the multivariable models used in chapter 7 to explore the relationship between early reading and enjoyment of reading at age 8 controlled for parental level of education (highest level in household) and child's sex only.
For certain analyses - for example, to consider whether the relationship between parental education and language ability was different in each cohort or whether the relationship between home learning activities and early language was different for parents with different education levels - 'interactions' were included in the multivariable models. Where an interaction is statistically significant, this indicates that the relationship between the explanatory variable (e.g. home learning activities) and the outcome variable (e.g. language ability) differs either between the cohorts or according to the value of the other explanatory variable (e.g. parental level of education). This may suggest, for example, that whilst frequent reading with the child is generally associated with improved language ability, the relationship is stronger amongst children whose parents have lower qualifications.
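One way to read an interaction between two two-category variables is as a 'difference in differences'. The sketch below uses invented mean t-scores, not GUS results, and the category labels are hypothetical. In a linear model containing both variables and their interaction, the interaction coefficient equals this quantity; judging whether it is statistically significant additionally requires standard errors, which are not shown.

```python
# Invented mean t-scores by parental education and reading frequency.
means = {
    ("lower", "infrequent"): 44.0,
    ("lower", "frequent"): 50.0,
    ("higher", "infrequent"): 52.0,
    ("higher", "frequent"): 55.0,
}


def reading_gap(education):
    """Difference in mean score between frequent and infrequent reading."""
    return means[(education, "frequent")] - means[(education, "infrequent")]


# How much larger the reading gap is for lower-educated households.
interaction = reading_gap("lower") - reading_gap("higher")

print(reading_gap("lower"), reading_gap("higher"), interaction)
```

In this invented example frequent reading is associated with higher scores in both groups, but the gap is 3 points larger where parents have lower qualifications.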
The multivariable analysis uses both linear and logistic regression models. Full results of the models are included in the Technical Annex along with notes on how to interpret them.
The statistical analysis and approach used in this report represent one of many available techniques capable of exploring these data. Other analytical approaches may produce different results from those reported here.