Appendix B – Methodology
Phase 2 of the SSELC was designed to provide baseline data on several specific child and parent outcomes as well as information about socio-economic characteristics, family and household circumstances, characteristics of childcare use and a range of additional circumstances, experiences and behaviours known to be associated with child outcomes. In addition, observations were made to provide a snapshot of the everyday experiences of children in their ELC settings and to generate data in order to control for the effect of settings on children's outcomes in the study.
The cohort consisted of children aged between 4 years 3 months and 5 years 6 months who would be starting P1 in August 2019 ('ELC leavers') and who were receiving up to 600 hours of government-funded or local-authority-funded ELC provision, and the parents of those children. Participants were recruited via ELC settings in 30 local authority areas.
The required sample size was determined by estimating the difference in ASQ scores on the communication domain between four- and five-year-old children living in the least and most deprived areas across Scotland, and was calculated so that a closing of this difference could be measured. A main sample and two reserve samples of ELC settings were drawn separately for deprived and non-deprived areas. As some local authorities were unable to identify which settings were still providing 600 hours at the time of drawing the sample, more settings than expected in the main sample proved to be ineligible for this reason. Consequently, both reserve samples were used. This did not affect the geographical representativeness of the achieved sample.
Within those local authorities still offering 600 hours of funded ELC and willing to participate, a two-stage, 'cluster' sampling approach was then taken in order to identify the sample: the first stage involved the selection of settings and the second stage involved the selection of children within settings. Up to 10 children were selected within each sampled setting. In settings with fewer than 10 children, all parents of eligible children were invited to participate. In settings with 10 or more children, 10 children were selected at random by ELC staff following instructions from the research team. Only parents of the selected children were then invited to participate.
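The second-stage selection rule described above can be sketched in a few lines of Python. This is only an illustration: the function name, the child identifiers and the use of a simple random sample without replacement are assumptions, since the instructions given to ELC staff are not reproduced in this appendix.

```python
import random

def select_children(eligible_ids, max_per_setting=10, seed=None):
    """Return the children whose parents are invited to take part.

    Settings with fewer than `max_per_setting` eligible children invite
    everyone; larger settings draw a random sample without replacement
    (assumed to be a simple random sample).
    """
    rng = random.Random(seed)
    eligible = list(eligible_ids)
    if len(eligible) < max_per_setting:
        return eligible
    return rng.sample(eligible, max_per_setting)

# Hypothetical settings: one with 3 eligible children, one with 25.
small_setting = ["c01", "c02", "c03"]
large_setting = [f"c{i:02d}" for i in range(1, 26)]

invited_small = select_children(small_setting, seed=1)
invited_large = select_children(large_setting, seed=1)
```

In this sketch the small setting invites all three children, while the large setting invites 10 distinct children drawn from its 25.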
Unlike Phase 1, the Phase 2 achieved sample is nationally representative of ELC leavers in settings providing 600 hours of state-funded ELC. To ensure data were collected from a large enough sample of children living in deprived areas, settings in the 20% most deprived areas (based on SIMD score) were deliberately oversampled.
Data were gathered on children in the cohort via three methods: a survey of parents / carers; a survey of the children's ELC keyworkers (primarily to measure child development); and observations of ELC settings attended by sampled children, carried out by Care Inspectorate inspectors.
Within participating settings, all children within the specific age range who would be starting P1 in August 2019 were eligible for inclusion in the study, irrespective of whether they received all of their funded entitlement at that ELC setting. Parents were recruited by ELC staff and provided with information about the study before being asked to complete a paper self-administered questionnaire that collected a wide range of information about themselves, their child and their household. Parents were also asked for their permission for the child's keyworker to complete a questionnaire about the child's development. The keyworker questionnaire largely consisted of the Ages and Stages (ASQ) and Strengths and Difficulties (SDQ) questionnaires but also collected information about the number of hours the child attended the ELC setting in the previous week.
Fieldwork was conducted in May and June 2019. Response rates to the surveys are not easy to estimate because information about the eligibility of every setting was not available. Questionnaires were sent to 345 ELC settings and at least one questionnaire was returned from 223 of these. Many of the other 122 reported that they were not eligible for inclusion in the sample. A total of 1,382 questionnaires were received from parents / carers and 1,846 from keyworkers. This gave a total of 1,318 paired questionnaires, 666 from settings in deprived areas and 652 from settings in non-deprived areas, exceeding the target of 600 in each. Nearly all participating settings had 10 or more eligible children, so the response rate among keyworkers in these settings was around 83%, while for parents / carers it was around 62%.
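The approximate response rates are consistent with a simple calculation in which each of the 223 settings that returned at least one questionnaire is assumed to have selected 10 children. The denominator used below is an assumption made for illustration, not a figure stated in this appendix.

```python
SETTINGS_RETURNING = 223     # settings returning at least one questionnaire
CHILDREN_PER_SETTING = 10    # assumed: every such setting selected 10 children

def response_rate(returned, issued):
    """Share of issued questionnaires that were returned."""
    return returned / issued

issued = SETTINGS_RETURNING * CHILDREN_PER_SETTING   # 2,230 children
keyworker_rate = response_rate(1846, issued)
parent_rate = response_rate(1382, issued)

print(f"keyworkers: {keyworker_rate:.0%}, parents / carers: {parent_rate:.0%}")
```

Under that assumption the rates round to 83% and 62%, matching the figures quoted above.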
Observations were conducted of 150 participating ELC settings using the Early Childhood Environment Rating Scale (ECERS-3). This is a widely recognised and highly regarded instrument designed for use in settings where most children are aged between three and five. It provides an observational measure of the quality of ELC settings for pre-school children across six domains: space and furnishings, personal care routines, language and literacy, learning activities, interaction and programme structure, as well as other observations around numbers of children and staff and access to outdoor space.
Observations were conducted by Care Inspectorate staff seconded to the study and involved a single visit lasting between 2 and 3 hours. It was emphasised to ELC setting managers and staff before and during these observations that they were not formal inspections of the kind routinely undertaken by the Care Inspectorate.
Weights are commonly applied to survey data to make the achieved sample representative of the population it was drawn from, and to help produce unbiased survey estimates. Groups that are under-represented in the achieved sample are given larger weights than those that are over-represented, so that the weighted data match the population on key characteristics. Estimates produced using the weighted data should then be closer to estimates that would have been obtained from a representative sample.
There are two main motivations for weighting: to compensate for unequal sampling probabilities, and to reduce non-response bias. In this survey, nurseries in deprived areas were deliberately oversampled, in order to allow robust estimates for children attending such nurseries. When looking at national figures, it is therefore necessary to weight down those in deprived areas and weight up those in other areas. Non-response bias occurs where there is a differential level of non-response between different groups. In this survey there was a high level of response from certain nurseries and a lower level from others. As children attending the same nursery are likely to have had a more similar experience than those attending different nurseries, children attending nurseries with a high level of response were weighted down, and those with a low level of response were weighted up. Because of different response rates for keyworker questionnaires and parent questionnaires, separate weights were calculated for use with data from each questionnaire.
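The effect of weighting for unequal sampling probabilities can be illustrated with a small sketch. All numbers here are hypothetical: the selection probabilities, the scores and the function names are invented for illustration and are not taken from the study.

```python
# Hypothetical selection probabilities: the deprived stratum is oversampled.
selection_prob = {"deprived": 0.20, "non_deprived": 0.05}

def design_weight(stratum):
    """Design weight = inverse of the probability of selection."""
    return 1.0 / selection_prob[stratum]

def weighted_mean(values, weights):
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

# Hypothetical sample: (stratum, outcome score) for five children.
sample = [
    ("deprived", 40.0),
    ("deprived", 42.0),
    ("deprived", 38.0),
    ("deprived", 40.0),
    ("non_deprived", 50.0),
]

scores = [score for _, score in sample]
weights = [design_weight(stratum) for stratum, _ in sample]

unweighted = sum(scores) / len(scores)
weighted = weighted_mean(scores, weights)
```

In this example the unweighted mean is 42, pulled towards the oversampled deprived group, while the weighted mean is 45 because the single non-deprived child is weighted up to stand for a larger share of the population.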
Weights were calculated in two stages. First, setting weights were calculated and adjusted for setting non-response. Next, at the individual level, keyworker and parent weights were calculated to adjust for non-response within settings and then post-stratified to population totals of the number of children by quintile of the Scottish Index of Multiple Deprivation.
Setting weights were calculated initially as the inverse of the selection probability for each setting. These were then scaled to have a mean of one across responding settings. A final setting weight was then calculated to adjust for setting non-response by post-stratifying to strata totals (the strata being the different elements of the sample design – i.e. deprived and non-deprived, with separate strata for deprived in Glasgow and for East Dunbartonshire as these samples were drawn separately).
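The three steps for setting weights can be sketched as follows. This is a minimal illustration, not the code used for the study: the function name, the selection probabilities and the stratum totals are hypothetical, and "strata totals" is assumed here to mean the number of eligible settings in each stratum.

```python
from collections import defaultdict

def setting_weights(settings, stratum_totals):
    """Compute final setting weights for responding settings.

    settings: list of dicts with keys 'id', 'stratum', 'p_select'.
    stratum_totals: assumed population total of settings per stratum.
    """
    # Step 1: inverse of the selection probability.
    base = {s["id"]: 1.0 / s["p_select"] for s in settings}

    # Step 2: scale to a mean of one across responding settings.
    mean_base = sum(base.values()) / len(base)
    scaled = {sid: w / mean_base for sid, w in base.items()}

    # Step 3: post-stratify so weights sum to the total in each stratum.
    stratum_sum = defaultdict(float)
    for s in settings:
        stratum_sum[s["stratum"]] += scaled[s["id"]]
    return {
        s["id"]: scaled[s["id"]] * stratum_totals[s["stratum"]]
        / stratum_sum[s["stratum"]]
        for s in settings
    }

# Hypothetical responding settings and stratum totals.
settings = [
    {"id": "A", "stratum": "deprived", "p_select": 0.20},
    {"id": "B", "stratum": "deprived", "p_select": 0.10},
    {"id": "C", "stratum": "non_deprived", "p_select": 0.05},
    {"id": "D", "stratum": "non_deprived", "p_select": 0.05},
]
stratum_totals = {"deprived": 30, "non_deprived": 40}

final = setting_weights(settings, stratum_totals)
```

After post-stratification the weights in each stratum sum to that stratum's total, while settings with a lower selection probability keep a proportionally larger weight within their stratum.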
To produce the keyworker questionnaire weights, each child was initially assigned the setting weight. These were then adjusted for non-response to the keyworker questionnaire within settings. Extreme weights were trimmed and weights were then scaled to a mean of one. A final weight was created by post-stratifying to population totals (four- and five-year-olds attending eligible ELC centres) by deprivation quintile of the setting. Parent questionnaire weights were produced in a similar manner.
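The sequence for individual-level weights can be sketched in the same way. Several details are assumptions made for illustration: the non-response adjustment is taken to be the ratio of selected to responding children in each setting, and the trimming rule (capping weights at three times the median) is hypothetical, since the appendix does not say how extreme weights were identified. All data values are invented.

```python
import statistics
from collections import defaultdict

def child_weights(children, setting_weight, selected, responded,
                  setting_quintile, quintile_totals, cap_ratio=3.0):
    """children: list of (child_id, setting_id) with a returned questionnaire."""
    # Start from the setting weight, adjusted for within-setting non-response
    # (assumed adjustment: selected / responded in that setting).
    raw = {cid: setting_weight[sid] * selected[sid] / responded[sid]
           for cid, sid in children}

    # Trim extreme weights (assumed rule: cap at cap_ratio times the median).
    cap = cap_ratio * statistics.median(raw.values())
    trimmed = {cid: min(w, cap) for cid, w in raw.items()}

    # Scale to a mean of one.
    mean_w = sum(trimmed.values()) / len(trimmed)
    scaled = {cid: w / mean_w for cid, w in trimmed.items()}

    # Post-stratify to population totals by quintile of the setting.
    quintile_sum = defaultdict(float)
    for cid, sid in children:
        quintile_sum[setting_quintile[sid]] += scaled[cid]
    return {
        cid: scaled[cid] * quintile_totals[setting_quintile[sid]]
        / quintile_sum[setting_quintile[sid]]
        for cid, sid in children
    }

# Hypothetical inputs.
children = [("k1", "A"), ("k2", "A"), ("k3", "B"), ("k4", "C"), ("k5", "C")]
setting_weight = {"A": 1.0, "B": 2.0, "C": 1.5}
selected = {"A": 2, "B": 4, "C": 2}
responded = {"A": 2, "B": 1, "C": 2}
setting_quintile = {"A": 1, "B": 1, "C": 3}
quintile_totals = {1: 300, 3: 200}

weights = child_weights(children, setting_weight, selected, responded,
                        setting_quintile, quintile_totals)
```

In this example child k3 comes from a setting where only one of four selected children responded, so its raw weight of 8 is trimmed to 4.5 before scaling. Parent weights would follow the same steps using parent response counts.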
Data analysis has been conducted using SPSS version 25. All analysis uses weighted data, except where discussing the characteristics of the cohort and findings from the observations by Care Inspectorate staff. Tests for statistical significance have been conducted using logistic regression, and all differences discussed within the text are statistically significant unless otherwise stated.