Scottish Study of Early Learning and Childcare: Three-year-olds (Phase 3) Report - Updated 2021

Findings from the third phase of the Scottish Study of Early Learning and Childcare (SSELC), a research project established to evaluate the expansion of early learning and childcare in Scotland.

Appendix B: Methodology

Aims

Phase 3 of the SSELC was designed to provide baseline data on several specific child and parent outcomes as well as information about socio-economic characteristics, family and household circumstances, characteristics of childcare use and a range of additional circumstances, experiences and behaviours known to be associated with child outcomes. The aim was to follow up children who had taken part at Phase 1 of the study at age two, to gather data on progress following one year of funded childcare, and to gather data on a nationally representative sample of children of the same age.

Sampling

Sampling was done separately for the two cohorts of children.

The Eligible 2s

At Phase 1 of the study, data were collected about 586 children aged between 2 years and 2 years 6 months who were eligible for and receiving up to 600 hours of government-funded or local-authority-funded ELC provision, and about their parents. Participants were recruited via ELC settings in 17 local authority areas. The sample of settings was provided by the Scottish Government in consultation with local authority ELC leads. Most settings that met the eligibility criteria in the relevant local authorities were included in the sample. Within participating settings, all children within the specific age range receiving the funded entitlement were eligible for inclusion in the study. The achieved sample was not geographically representative of all eligible 2-year-old children in Scotland and is therefore best described as a specific cohort of children rather than as nationally representative, even though there are significant similarities between the two.

Phase 1 fieldwork was conducted between October and December 2018. A total of 428 questionnaires were received from parents/carers and 574 from keyworkers in 151 different settings. Attempts were made to follow up all of these children, regardless of which type of questionnaire had been returned at Phase 1.

As part of the recruitment process for Phase 1, setting heads and parents/carers were informed that they would be contacted again regarding further participation in the study. In August 2019, at the start of the school/nursery term, letters were sent to the heads of all settings that had participated in Phase 1 asking which of the children who had been involved at Phase 1 were still attending the setting, and for those children who had moved to another setting, contact details for the new setting.

Of the 586 children who took part at Phase 1, 416 were believed to be attending the same setting or another setting which took part at Phase 1 (139 separate settings); 133 were traced to new settings (97 settings) and 37 could not be traced (mostly recorded as not attending ELC in Scotland).

The Comparator 3s

The aim of the Comparator 3s sample was to achieve a nationally representative sample of 600 children who were the same age as the Eligible 2s at the time of the survey and who were eligible for and receiving 600 hours of government-funded ELC.

The sample of Comparator 3s was drawn from settings which took part at Phase 2 or indicated that they would be happy to take part at Phase 3 even if they were not able to take part at Phase 2. This was for three main reasons:

  • As most of these settings had previously participated, or attended an information session at Phase 2, efficiencies were made by not repeating information sessions for these settings.
  • Similarly, most of the settings involved at Phase 2 had also been observed by the Care Inspectorate and assessed using the Early Childhood Environment Rating Scale (ECERS-3), which was designed for evaluating ELC provision for children from age two and a half to five. Hence further efficiencies were made by not repeating this exercise.
  • The achieved sample at Phase 2 was nationally representative of four- and five-year-olds attending ELC settings, once weighting had been applied to take account of the deliberate oversampling of settings in deprived areas. All of the Phase 2 settings also catered for children from the age of three, and the distribution of children across settings was similar for both age groups. Hence with small adjustments to the weighting of data, the Phase 3 sample could be said to be nationally representative of three-year-olds attending ELC settings.

A small number of settings which participated at Phase 2 had since moved on to providing 1140 hours of funded ELC. These settings were not removed from the sample as the children attending these settings would have only recently started at the setting, and it was assumed that the larger number of hours would not yet have had much impact on their development. This also allowed a closer match with the sample for the Eligible 2s, who were not removed from the sample if they were attending a setting offering 1140 hours at the time of the Phase 3 survey.

At Phase 2, settings in deprived areas were deliberately oversampled. This was not an aim of the Phase 3 sample, so proportionally fewer settings from deprived areas were selected at Phase 3, with the aim of achieving a nationally representative sample. To calculate the size of the issued sample, it was assumed that the response would be around 80% of that achieved at Phase 2. This was a rough estimate, based on the fact that fewer than half as many children would meet the age criteria at Phase 3 as at Phase 2, so the proportion of settings with fewer than 10 eligible children would be higher. There was also an expectation that some settings which had participated at Phase 2 would be unwilling or unable to do so at Phase 3.

Based on these sampling assumptions, all 122 settings in the four least deprived quintiles of the Scottish Index of Multiple Deprivation which either took part at Phase 2 or indicated a willingness to take part were invited to take part again. Settings from the most deprived quintile which took part at Phase 2 or indicated a willingness to take part were stratified by size, and around a quarter of them – 31 in total – were selected randomly.
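The selection of settings in the most deprived quintile can be illustrated with a short sketch. It is not the procedure the research team used: the size bands, the number of settings in each band and the `stratified_sample` helper are all hypothetical, chosen only so that sampling a quarter of each band yields 31 settings.

```python
import random
from collections import defaultdict

def stratified_sample(settings, fraction=0.25, seed=None):
    """Randomly select a fixed fraction of settings within each size band."""
    rng = random.Random(seed)
    by_band = defaultdict(list)
    for setting_id, size_band in settings:
        by_band[size_band].append(setting_id)
    selected = []
    for band in sorted(by_band):
        members = by_band[band]
        # Sample the same fraction in every band so each size is represented
        n_select = max(1, round(len(members) * fraction))
        selected.extend(rng.sample(members, n_select))
    return selected

# Hypothetical frame of settings in the most deprived quintile (not real data)
frame = (
    [(f"S{i}", "small") for i in range(40)]
    + [(f"M{i}", "medium") for i in range(48)]
    + [(f"L{i}", "large") for i in range(36)]
)

chosen = stratified_sample(frame, fraction=0.25, seed=2019)
print(len(chosen))  # 10 + 12 + 9 = 31 settings
```

Stratifying by size before sampling keeps the mix of small, medium and large settings in the selected group close to the mix in the frame.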

It is recognised that this sample is not as close to a true random sample as one drawn afresh from all settings offering 600 hours of funded childcare to three-year-olds, but it is a good approximation of this.

The second stage of the sampling process was to sample within the settings. Up to 10 children were selected within each sampled setting. In settings with fewer than 10 eligible children, all parents of eligible children were invited to participate. In settings with 10 or more children, 10 children were selected at random by ELC staff following instructions from the research team. Only parents of the selected children were then invited to participate.
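The within-setting rule can be sketched as follows. This is an illustration only: the instructions actually given to ELC staff are not reproduced in this appendix, and the `select_children` function and the child identifiers are hypothetical.

```python
import random

def select_children(eligible_ids, max_per_setting=10, seed=None):
    """Return the children whose parents are invited to take part in one setting."""
    rng = random.Random(seed)
    eligible = list(eligible_ids)
    if len(eligible) < max_per_setting:
        # Fewer than 10 eligible children: all parents are invited
        return eligible
    # 10 or more eligible children: simple random sample without replacement
    return rng.sample(eligible, max_per_setting)

small_setting = ["c01", "c02", "c03", "c04"]          # hypothetical IDs
large_setting = [f"c{i:02d}" for i in range(1, 26)]   # 25 hypothetical IDs

print(len(select_children(small_setting, seed=1)))  # 4
print(len(select_children(large_setting, seed=1)))  # 10
```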

Two settings were removed from the Comparator 3s sample because they also had ten or more children in the Eligible 2s sample. A number of other settings were included in both samples. No such setting had more than 6 children from the Eligible 2s sample, so they were instructed to complete the survey for all children from the Eligible 2s sample and for up to 10 additional randomly selected children in the same way as the other settings in the Comparator 3s sample.

Data collection

Data were gathered on children in the cohort via two methods: a survey of parents/carers; and a survey of the children's ELC keyworkers (primarily to measure child development). Data about the settings were also available, including observations of ELC settings attended by sampled children at Phase 1 and Phase 2 carried out by Care Inspectorate inspectors[37].

Parents were recruited by ELC staff and provided with information about the study before being asked to complete a paper self-administered questionnaire that collected a wide range of information about themselves, their child and their household. Parents were also asked for their permission for the child's keyworker to complete a questionnaire about the child's development. This largely consisted of the Ages and Stages (ASQ) and Strengths and Difficulties (SDQ)[38] questionnaires but also collected information about the number of hours the child attended the ELC setting in the previous week.

Fieldwork was conducted between October and December 2019. For the Eligible 2s, questionnaires were sent to 236 settings for a total of 549 of the 586 children who took part at Phase 1. 

  • At least one questionnaire was returned for 391 children, including 376 keyworker questionnaires and 269 parent questionnaires; 254 children had both questionnaires completed
  • 372 children had keyworker questionnaires for both Phases – 65% of the 574 keyworker questionnaires returned at Phase 1
  • 228 children had parent questionnaires for both phases – 53% of the 428 parent questionnaires returned at Phase 1
  • In total, 212 children had both questionnaires completed at both phases – 51% of the 416 with both questionnaires completed at Phase 1
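The Eligible 2s figures above can be checked with a few lines of arithmetic. The sketch below uses only numbers quoted in the text. The one assumption is that every one of the 586 Phase 1 children had at least one questionnaire returned, which is what allows the 416 to be derived.

```python
# Figures quoted in the text for the Eligible 2s
children_p1 = 586
parent_p1, keyworker_p1 = 428, 574
parent_p3, keyworker_p3, both_p3 = 269, 376, 254

# Children with at least one questionnaire = parent + keyworker - both
any_p3 = parent_p3 + keyworker_p3 - both_p3

# Rearranged for Phase 1 (assumes all 586 children had at least one questionnaire)
both_p1 = parent_p1 + keyworker_p1 - children_p1

def pct(numerator, denominator):
    """Percentage rounded to the nearest whole number."""
    return round(100 * numerator / denominator)

print(any_p3, both_p1)                   # 391 416
print(pct(372, keyworker_p1))            # 65
print(pct(228, parent_p1))               # 53
print(pct(212, both_p1))                 # 51
```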

For the Comparator 3s, questionnaire packs were sent to 151 ELC settings and at least one questionnaire was returned from 112 of these. Response rates for this group of children are not as easy to estimate because information about the number of eligible children in every setting was not available. 

  • At least one questionnaire was returned for 851 children, including 811 keyworker questionnaires and 565 parent questionnaires; 515 children had both questionnaires completed
  • Based on the limited available evidence[39], the response rate among keyworkers in the 112 responding settings was around 90%, while for parents/carers it was around 60%.

Weighting

Weights are commonly applied to survey data to make the achieved sample representative of the population it was drawn from, and to help produce unbiased survey estimates. Groups that are under-represented in the achieved sample are given larger weights than those that are over-represented, so that the weighted data matches the population on key characteristics. Estimates produced using the weighted data should then be closer to estimates that would have been gained from a representative sample.
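A toy example may help to show the principle. All of the numbers below are hypothetical and are not taken from the study: a group making up 20% of the population is assumed to form 40% of the achieved sample, so its members receive a weight below one.

```python
# Hypothetical population shares and achieved sample counts (not study data)
population_share = {"most deprived": 0.20, "other": 0.80}
sample_counts = {"most deprived": 40, "other": 60}
# Hypothetical proportion of children with some outcome in each group
outcome_rate = {"most deprived": 0.50, "other": 0.30}

n = sum(sample_counts.values())

# Weight = population share / sample share, so over-represented groups are weighted down
weights = {g: population_share[g] / (sample_counts[g] / n) for g in sample_counts}

unweighted = sum(sample_counts[g] * outcome_rate[g] for g in sample_counts) / n
weighted = (
    sum(weights[g] * sample_counts[g] * outcome_rate[g] for g in sample_counts)
    / sum(weights[g] * sample_counts[g] for g in sample_counts)
)

print({g: round(w, 2) for g, w in weights.items()})  # {'most deprived': 0.5, 'other': 1.33}
print(round(unweighted, 2), round(weighted, 2))      # 0.38 0.34
```

The weighted estimate (0.34) matches what the population shares imply (0.20 × 0.50 + 0.80 × 0.30), whereas the unweighted estimate is pulled towards the over-represented group.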

There are two main motivations for weighting: to compensate for unequal sampling probabilities, and to reduce non-response bias. Because the Eligible 2s were not a random sample, weighting was not applied. The sample of settings was not geographically representative of Scotland, and because there was no participation at all in 15 local authority areas, it was not possible to compensate for this unequal sampling probability via weighting. Therefore it was most appropriate to treat the Phase 1 sample as a specific cohort of children, rather than weight the data and claim representativeness of children eligible for funded ELC at age two.

At Phase 3, this cohort had reduced in number because of non-response and non-contact. However, as Table B1 shows, in terms of most demographics, such as area deprivation and sex of the child, there was no significant bias in the response, so the sample continues to represent the same cohort of children. The proportion of non-white children in the sample had decreased, although this is not thought to significantly affect results. The proportion of children for whom further assessment was needed on the Phase 1 ASQ communication and problem solving domains was also lower among those who participated at Phase 3 than those who did not participate. This does not affect results presented in this document for change between Phase 1 and Phase 3, as analysis has been restricted to only those who participated at Phase 3. However, it is worth bearing in mind for future comparisons with children who completed 1140 hours of ELC as Eligible 2s. The proportion of children on schedule on these domains was similar among participants and non-participants.

Table B1: Characteristics of participants and non-participants at Phase 3, Eligible 2s

Phase 1 characteristics | Participated at Phase 3 (%) | Did not participate at Phase 3 (%) | All who participated at Phase 1 (%)
Sex
  Boys | 52 | 55 | 53
  Girls | 48 | 45 | 47
Household type
  Single parent | 53 | 54 | 53
  Couple parent | 47 | 46 | 47
Number of children in household
  One | 28 | 29 | 28
  Two | 41 | 35 | 39
  Three or more | 31 | 35 | 33
Highest qualification of respondent
  None | 10 | 15 | 12
  Standard Grade or equivalent lower school qualification | 37 | 35 | 36
  Higher, Advanced Higher or equivalent upper school qualification | 16 | 17 | 17
  HNC, HND or equivalent post-school, pre-higher education qualification | 20 | 17 | 19
  Degree, PhD, or other HE qualification, or professional qualification | 17 | 15 | 17
Area deprivation (Scottish Index of Multiple Deprivation)
  Most deprived 20% | 45 | 48 | 46
  Other | 55 | 52 | 54
Equivalised income
  Bottom 10% | 48 | 50 | 49
  2nd | 20 | 21 | 20
  3rd | 12 | 14 | 13
  Top 70% | 20 | 15 | 19
Ethnicity
  White | 97 | 93 | 96
  Non-white | 3 | 7 | 4
Funding type
  Government funded | 78 | 76 | 78
  Local authority funded (referred) | 21 | 22 | 21
  Both | 1 | 2 | 1
Long-term health condition
  Yes | 13 | 13 | 13
  No | 87 | 87 | 87
ASQ Communication domain
  Further assessment may be needed | 33 | 42 | 36
  Monitoring suggested | 20 | 16 | 18
  Child's development appears on schedule | 47 | 42 | 45
ASQ Gross motor domain
  Further assessment may be needed | 22 | 25 | 23
  Monitoring suggested | 17 | 11 | 15
  Child's development appears on schedule | 61 | 64 | 62
ASQ Fine motor domain
  Further assessment may be needed | 26 | 29 | 27
  Monitoring suggested | 34 | 31 | 33
  Child's development appears on schedule | 40 | 40 | 40
ASQ Problem solving domain
  Further assessment may be needed | 43 | 48 | 45
  Monitoring suggested | 23 | 19 | 22
  Child's development appears on schedule | 34 | 32 | 34
ASQ Personal-Social domain
  Further assessment may be needed | 33 | 43 | 37
  Monitoring suggested | 27 | 18 | 24
  Child's development appears on schedule | 40 | 38 | 39
SDQ total difficulties score
  Close to average | 44 | 42 | 43
  Slightly raised | 26 | 25 | 26
  High | 13 | 15 | 14
  Very high | 16 | 17 | 16
Unweighted base (keyworker questionnaire Phase 1) | 386 | 188 | 574
Unweighted base (parent questionnaire Phase 1) | 288 | 140 | 428

Base: All children who participated at Phase 1
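As a consistency check on Table B1, the "All who participated at Phase 1" column should be roughly the base-weighted average of the other two columns. The sketch below illustrates this for two rows. It assumes that the ASQ rows use the keyworker questionnaire bases and that ethnicity uses the parent questionnaire bases; the appendix does not state which base applies to each row. Because the inputs are already rounded percentages, the results are approximate.

```python
def pooled_pct(pct_a, base_a, pct_b, base_b):
    """Combine two group percentages into an overall percentage, weighting by base size."""
    return (pct_a * base_a + pct_b * base_b) / (base_a + base_b)

# ASQ Communication, "further assessment may be needed" (keyworker bases assumed)
asq_comm = pooled_pct(33, 386, 42, 188)

# Ethnicity, "White" (parent bases assumed)
white = pooled_pct(97, 288, 93, 140)

print(round(asq_comm), round(white))  # 36 96
```

Both values agree with the published figures in the final column of the table.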

For the Comparator 3s, an assumption was made that the sampling frame for Phase 2 was also complete for Phase 3. This was not strictly true, as settings which only opened in August 2019, or which did not cater for four- and five-year-olds, would have been excluded, but it largely held. A further assumption was made that those who declined to participate at Phase 2 and did not indicate at the time that they were willing to participate at a later phase would also have declined to participate at Phase 3. These assumptions allow us to treat the Phase 3 sample as a random sample.

Non-response bias occurs where there is a differential level of non-response between different groups. In this survey among settings for the Comparator 3s there was a high level of response from certain nurseries and a lower level from others. As children attending the same nursery are likely to have had a more similar experience than those attending different nurseries, children attending nurseries with a high level of response were weighted down, and those with a low level of response were weighted up. Because of different response rates for keyworker questionnaires and parent questionnaires, separate weights were calculated for use with data from each questionnaire.
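A minimal sketch of a within-setting non-response adjustment is given below. The counts are hypothetical, and the appendix does not give the exact formula used, so the simple ratio of children selected to questionnaires returned is an assumption made for illustration.

```python
# Hypothetical counts for two settings (not study data)
settings = {
    "Setting A": {"selected": 10, "keyworker": 10, "parent": 8},
    "Setting B": {"selected": 10, "keyworker": 8, "parent": 4},
}

def nonresponse_factor(selected, returned):
    """Factor by which each returned questionnaire is weighted up."""
    if returned == 0:
        raise ValueError("no questionnaires returned, so no adjustment is possible")
    return selected / returned

factors = {
    name: {
        "keyworker": nonresponse_factor(counts["selected"], counts["keyworker"]),
        "parent": nonresponse_factor(counts["selected"], counts["parent"]),
    }
    for name, counts in settings.items()
}

for name, f in factors.items():
    print(name, f)
# Setting A {'keyworker': 1.0, 'parent': 1.25}
# Setting B {'keyworker': 1.25, 'parent': 2.5}
```

Because keyworkers and parents respond at different rates within the same setting, the two factors differ, which is why a separate weight is needed for each questionnaire.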

Calculation of weights happened in two stages. First, setting weights were calculated and adjusted for setting non-response. Next, at the individual level, keyworker and parent weights were calculated to adjust for non-response within settings and then post-stratified to population totals of number of children by quintile of the Scottish Index of Multiple Deprivation.

Setting weights were calculated initially as the inverse of the selection probability for each setting at Phase 2. These were then scaled to have a mean of one for each responding setting. A final setting weight was then calculated to adjust for setting non-response by post-stratifying to strata totals (the strata being the different elements of the sample design – i.e. deprived and non-deprived, with separate strata for deprived in Glasgow and for East Dunbartonshire as these samples were drawn separately).
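The three steps for the setting weights can be sketched as below. The selection probabilities, strata and stratum totals are hypothetical, and scaling each stratum so that its weights sum to the stratum total is one common reading of "post-stratifying to strata totals", not a confirmed description of the calculation used.

```python
from collections import defaultdict

# Hypothetical responding settings with their Phase 2 selection probabilities
responding = [
    {"id": "s1", "stratum": "deprived", "p_select": 0.50},
    {"id": "s2", "stratum": "deprived", "p_select": 0.50},
    {"id": "s3", "stratum": "non-deprived", "p_select": 0.25},
    {"id": "s4", "stratum": "non-deprived", "p_select": 0.25},
    {"id": "s5", "stratum": "non-deprived", "p_select": 0.20},
]
# Hypothetical number of settings in each stratum of the sampling frame
stratum_totals = {"deprived": 30, "non-deprived": 90}

# Step 1: design weight is the inverse of the selection probability
for s in responding:
    s["weight"] = 1 / s["p_select"]

# Step 2: scale so the weights average one across responding settings
mean_w = sum(s["weight"] for s in responding) / len(responding)
for s in responding:
    s["weight"] /= mean_w
scaled_mean = sum(s["weight"] for s in responding) / len(responding)

# Step 3: within each stratum, rescale so the weights sum to the stratum total
stratum_sum = defaultdict(float)
for s in responding:
    stratum_sum[s["stratum"]] += s["weight"]
for s in responding:
    s["final_weight"] = (
        s["weight"] * stratum_totals[s["stratum"]] / stratum_sum[s["stratum"]]
    )

for s in responding:
    print(s["id"], round(s["final_weight"], 2))
```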

To produce the keyworker questionnaire weights, each child was initially assigned the setting weight. These were then adjusted for non-response to the keyworker questionnaire within settings. Extreme weights were trimmed and weights were then scaled to a mean of one. A final weight was created by post-stratifying to population totals (three-year-olds attending eligible ELC centres) by deprivation quintile of the setting. Parent questionnaire weights were produced in a similar manner.
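The trimming and scaling steps might look like the sketch below. The raw weights are hypothetical, and the trimming bounds of 0.5 and 3.0 are assumptions, as the appendix does not state the thresholds used. The final post-stratification by deprivation quintile is omitted.

```python
def trim(weights, lower, upper):
    """Cap extreme weights so no single child dominates the estimates."""
    return [min(max(w, lower), upper) for w in weights]

def scale_to_mean_one(weights):
    """Rescale weights so that they average one."""
    mean_w = sum(weights) / len(weights)
    return [w / mean_w for w in weights]

# Hypothetical child-level weights: setting weight x within-setting adjustment
raw = [0.4, 0.8, 1.0, 1.2, 6.0]

trimmed = trim(raw, lower=0.5, upper=3.0)  # bounds are illustrative assumptions
final = scale_to_mean_one(trimmed)

print(trimmed)                       # [0.5, 0.8, 1.0, 1.2, 3.0]
print([round(w, 2) for w in final])  # [0.38, 0.62, 0.77, 0.92, 2.31]
```

Trimming trades a small amount of bias for lower variance. Rescaling to a mean of one keeps the weighted base equal to the unweighted number of respondents.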

Data analysis

Data analysis has been conducted using SPSS version 25. All analysis uses weighted data for the Comparator 3s, except where discussing the characteristics of the cohort, and unweighted data for the Eligible 2s. Tests for statistical significance have been conducted through the use of logistic regression, and all differences for the Comparator 3s discussed within the text are statistically significant unless otherwise stated. Because the Eligible 2s were not a random sample, it is not meaningful to talk of statistical significance for that group. However, tests have been applied as if they were a random sample, although strict rules for their interpretation have not been followed, particularly given the relatively small sample size of this group.

Contact

Email: socialresearch@gov.scot
