Scottish Study of Early Learning and Childcare: Phase 5 Report

This report outlines findings from the 5th phase of the Scottish Study of Early Learning and Childcare (SSELC), focusing on 4- and 5-year-olds who are accessing up to 1140 hours of funded ELC. The SSELC forms a major part of the strategy for the evaluation of the expansion of funded ELC in Scotland.


Appendix B: Methodology

Phase 5 of the SSELC had multiple aims. Firstly, it was designed to provide robust, nationally-representative data on a number of specific child and parent outcomes for those aged between four years three months and five years six months who were receiving up to 1140 hours of funded ELC. It was also intended to collect information about the household circumstances of these children, childcare use, the socio-economic characteristics of the family and a range of additional circumstances, experiences and behaviours known to be associated with child outcomes.

Secondly, it was intended to ensure that, at the final analysis and reporting stage, data from these children was comparable with data from children who took part at Phase 2 of the study, in order to provide a pre- and post-expansion comparison. This full comparative analysis, which controls for relevant factors and assesses whether differences found between 2019 and 2024 are statistically significant, will form part of the final report.

Finally, it was intended to gather information from setting heads about their opinions of the expansion of ELC and how easy or difficult it had been to meet the requirements.

Sampling

A sample of children aged between four years three months and five years six months with a funded ELC place in Scotland was drawn via a two-stage process. First, a stratified sample of ELC settings was drawn. For larger settings, a second stage was involved. If there were more than 10 eligible children at the setting, a sample of 10 children was drawn by setting staff from those eligible. The large majority of settings had no more than 10 eligible children. In these cases, all eligible children at the setting were invited to take part.
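
The second-stage rule can be sketched in a few lines of Python. This is an illustration only: the text does not say how setting staff drew the 10 children, so simple random selection is assumed here, and the function name and child identifiers are hypothetical.

```python
import random

MAX_CHILDREN_PER_SETTING = 10  # cap described in the text


def select_children(eligible_children, rng=None):
    """Return the children invited to take part at one setting.

    Assumption: where a setting has more than 10 eligible children,
    staff pick 10 by simple random sampling without replacement.
    """
    rng = rng or random.Random()
    children = list(eligible_children)
    if len(children) <= MAX_CHILDREN_PER_SETTING:
        # Most settings: every eligible child is invited.
        return children
    return rng.sample(children, MAX_CHILDREN_PER_SETTING)


# Hypothetical settings with 7 and 25 eligible children.
small = select_children(["child_%d" % i for i in range(7)])
large = select_children(["child_%d" % i for i in range(25)], rng=random.Random(1))
```

In this sketch the smaller setting returns all 7 children, while the larger one returns 10 distinct children.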

Selection of settings

A systematic sample of 499 ELC settings was drawn (including a reserve sample of 84 settings), stratified by local authority (LA) (Glasgow vs other LAs) and by two deprivation groups (ELC settings in the most deprived 20% of areas, based on SIMD, vs the rest). Settings in the most deprived quintile were oversampled to maximise the ability to analyse data in relation to the poverty-related outcomes gap. Before selection, the sampling frame was ordered by setting size, LA and deprivation score. To give all eligible children an equal chance of being selected, all settings were given equal selection probabilities, except for those with more than 10 eligible children, which were given a proportionately higher probability of selection. Settings that opted out were excluded.
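
The reasoning behind the higher selection probability for larger settings can be checked with a short Python sketch. It assumes, as one plausible reading of the text, that a setting's selection probability is proportional to its number of eligible children once that number exceeds 10. The `base_rate` value and function names are hypothetical, and the calculation applies within a single stratum.

```python
CAP = 10  # maximum number of children sampled per setting


def setting_size_measure(n_eligible):
    """Size measure: constant up to 10 children, then proportional to size."""
    return max(n_eligible, CAP)


def child_selection_probability(n_eligible, base_rate):
    """Overall chance that one eligible child is selected.

    base_rate is a hypothetical per-unit sampling rate for the stratum.
    """
    if n_eligible < 1:
        raise ValueError("setting must have at least one eligible child")
    p_setting = min(1.0, base_rate * setting_size_measure(n_eligible))
    p_child_within_setting = min(1.0, CAP / n_eligible)
    return p_setting * p_child_within_setting


base_rate = 0.01  # hypothetical value, not taken from the report
probabilities = {
    n: child_selection_probability(n, base_rate) for n in (4, 10, 20, 40)
}
```

Under these assumptions every child has the same overall probability of selection whatever the size of their setting, because the higher chance of a large setting being picked is offset by the lower chance of a given child being among the 10 sampled there.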

A list of all ELC group settings in Scotland[26] with four- and five-year-olds was provided by the Scottish Government and local authorities, including figures for the number of eligible four- and five-year-olds at each setting.

East Dunbartonshire and North Ayrshire opted not to participate in this phase, so no settings were sampled for Phase 5 from these LAs.

Reserve samples

A reserve sample of 84 settings was also drawn using the same method, in case more settings than expected were found, before the start of the fieldwork period, to have no eligible children or to be otherwise unable to take part. As East Dunbartonshire and North Ayrshire did not participate in Phase 5, the full reserve sample was issued with the main sample.

Population figures and setting sample sizes by strata

Stratum (deprivation group and local authority) | Settings with eligible 4/5-year-olds | Total estimated eligible 4/5-year-olds attending | Issued sample (number of settings)[27]
Non-deprived | 2,005 | 38,015 | 255
Deprived (non-Glasgow) | 366 | 8,395 | 182
Deprived (Glasgow) | 110 | 2,850 | 62
All | 2,481 | 49,260 | 499
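
The extent of oversampling can be read off the table by dividing the issued sample by the number of settings in each stratum. The Python sketch below does this with the figures above. The variable names are illustrative, and the issued sample includes the reserve sample.

```python
# Figures copied from the table of population and issued sample sizes.
strata = {
    "Non-deprived": {"settings": 2005, "children": 38015, "issued": 255},
    "Deprived (non-Glasgow)": {"settings": 366, "children": 8395, "issued": 182},
    "Deprived (Glasgow)": {"settings": 110, "children": 2850, "issued": 62},
}

# Column totals, which should match the "All" row of the table.
totals = {
    key: sum(stratum[key] for stratum in strata.values())
    for key in ("settings", "children", "issued")
}

# Share of each stratum's settings that was issued.
fractions = {
    name: stratum["issued"] / stratum["settings"]
    for name, stratum in strata.items()
}

for name, fraction in fractions.items():
    print(f"{name}: {fraction:.1%} of settings issued")
```

Around 13% of non-deprived settings were issued, compared with roughly half of the settings in the two deprived strata.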

Data collection

Data were gathered on children in the cohort via two methods: a survey of parents/carers; and a survey of the children's ELC keyworkers (primarily to measure child development). Data about the settings were also collected, by a short online questionnaire for setting heads and observations of ELC settings attended by sampled children carried out by Care Inspectorate inspectors.[28]

Parents were recruited by ELC staff and provided with information about the study before being asked to complete a paper self-administered questionnaire that collected a wide range of information about themselves, their child and their household. Parents were also asked for their permission for the child's keyworker to complete a questionnaire about the child's development. This largely consisted of the Ages and Stages (ASQ) and Strengths and Difficulties (SDQ)[29] questionnaires but also collected administrative information, including the number of hours the child attended the ELC setting in the previous week and whether the child had Additional Support Needs.

The setting heads' questionnaire, introduced at Phase 4, was also included at Phase 5 of the study. It asked about support provided by the setting to families of four- and five-year-olds, rather than about specific children. It also asked about food provision by the setting and about challenges faced in relation to the expansion.

Response rates to the surveys are not easy to estimate because information about the exact number of eligible children within each setting was not available (see section on Sampling above). Of the 499 settings sampled, 39 were withdrawn by the local authority before the start of fieldwork due to, for example, concerns about other pressures on the settings. A further 8 withdrew after questionnaires were sent out but before the start of fieldwork. Twenty-five settings informed ScotCen that they were unable to take part, including six who refused, mainly because of the burden on staff, nine who had no eligible children, and 10 for whom no reason was recorded. A further 126 settings did not participate. The proportion of these that were eligible is not known. At least one questionnaire (keyworker or parent) was returned from 300 settings.

A total of 1,648 completed questionnaires were received from parents and 2,217 from keyworkers. Forty keyworker questionnaires were removed from the data as it was not clear from the information provided by the settings that the parents were aware that the questionnaires were being completed. This gave a total of 1,533 paired questionnaires. As a rough estimate, the response rate for the parent questionnaire was 37%, for the keyworker questionnaire was 50% and for both was 35%.[30] The setting heads' questionnaire was completed by 271 setting heads.

Observations were conducted of 150 participating ELC settings using the Early Childhood Environment Rating Scale (ECERS). This is a widely recognised and highly regarded instrument designed for use in settings where most children are aged between three and five. It provides an observational measure of the quality of ELC settings for pre-school children across six domains: space and furnishings, personal care routines, language and literacy, learning activities, interaction and programme structure, as well as other observations around numbers of children and staff and access to outdoor space.

Thirty percent of settings in the sample were randomly selected to receive a visit. Where settings were unable to participate in either the observations or the whole study, the selected setting was replaced by another from the sample, as similar as possible in terms of size, service type and location to ensure a representative subsample of 150 settings.

Observations were conducted by Care Inspectorate staff seconded to the study and involved a single visit lasting between 2 and 3 hours. It was emphasised to ELC setting managers and staff before and during these observations that they were not formal inspections of the kind routinely undertaken by the Care Inspectorate.

Data analysis and statistical significance

Data analysis has been conducted using the complex samples package of SPSS version 29. Using this, the clustering of children within settings can be taken into account, without the need to use multi-level models. All analysis uses weighted data for Phase 5, except where discussing the characteristics of the cohort or the characteristics of the settings. Different weights were applied, depending on the variables included in the analysis (see section on Weighting below). Tests for statistical significance have been conducted through the use of regression analysis, and all differences between subgroups at Phase 5 discussed within the text are statistically significant unless otherwise stated.

For the significance tests, categorical outcome variables have been reduced to binary variables, so that logistic regression analysis can be used. For example, ASQ scores have been reduced to "on schedule"/"not on schedule", as whether the child is on schedule with their development is the outcome of interest. This allows us to say that girls were more likely to be on schedule for the communication domain. Conducting a chi-square test using all categories of the ASQ variable would only allow us to say that there was an association between ASQ score and sex, with girls tending to do better. It also allows multiple independent variables to be included in the modelling, so relationships between ASQ scores and sex/area deprivation can be tested. For continuous outcome variables, such as SWEMWBS score, linear regression was used.
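
The dichotomising step can be illustrated with a small Python sketch. The records below are invented for illustration and are not study data. The category labels other than "on schedule" are hypothetical, and the sketch is unweighted, so it does not reproduce the complex samples analysis carried out in SPSS.

```python
import math


def is_on_schedule(asq_category):
    """Collapse a categorical ASQ result into a binary outcome."""
    return asq_category == "on schedule"


# Invented illustrative records, NOT study data:
# (sex, ASQ communication category).
records = (
    [("girl", "on schedule")] * 45
    + [("girl", "monitoring")] * 3
    + [("girl", "referral")] * 2
    + [("boy", "on schedule")] * 36
    + [("boy", "monitoring")] * 8
    + [("boy", "referral")] * 6
)


def count(sex, outcome):
    """Number of records for one sex with the given binary outcome."""
    return sum(
        1 for s, category in records
        if s == sex and is_on_schedule(category) == outcome
    )


girls_on, girls_off = count("girl", True), count("girl", False)
boys_on, boys_off = count("boy", True), count("boy", False)

# Odds of being on schedule for girls relative to boys.
odds_ratio = (girls_on * boys_off) / (girls_off * boys_on)
log_odds_ratio = math.log(odds_ratio)
```

In an unweighted logistic regression with sex as the only predictor, the fitted coefficient for sex equals this log odds ratio, which is why the binary form of the outcome supports a directional statement such as "girls were more likely to be on schedule".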

Weighting

Weights are commonly applied to survey data so that the achieved sample better represents the population it was drawn from. Groups that are under-represented in the achieved sample are given higher weights than those that are over-represented, with the aim of weighted data matching the population distribution by key characteristics. Survey estimates produced using the weighted data should then be closer to estimates that would have been gained from the whole population of interest.

As Phase 5 included multiple questionnaires, three sets of weights have been produced. These are for analysis of: setting head responses, keyworker responses and parent responses. The same basic weighting approach was used for all three sets of weights, with specific modifications where required. The approach was consistent with the weighting of Phases 2, 3, and 4 of the project.

The basic weighting approach consisted of two elements: selection weighting and non-response modelling. The first stage adjusted for differential probability of selection (for settings and children) resulting from the sample design, which oversampled settings within the most deprived SIMD quintile. The second stage adjusted for differences in the profiles of sampled and responding settings, using logistic regression modelling. Calibration weighting, which adjusts the profile of the weights to match estimates of the population, could not be used due to the absence of detailed population estimates for eligible four-year-olds.
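
As a rough sketch of how the two elements fit together, the Python below combines a selection weight with a non-response weight. The selection probabilities and response propensities are hypothetical values chosen for illustration. In the study the propensities came from a fitted logistic regression model, which is not reproduced here.

```python
import math


def predicted_propensity(linear_predictor):
    """Logistic transform: turn a model's linear predictor into a probability."""
    return 1.0 / (1.0 + math.exp(-linear_predictor))


def combined_weight(p_selection, propensity):
    """Selection weight multiplied by non-response weight."""
    selection_weight = 1.0 / p_selection    # inverse probability of selection
    nonresponse_weight = 1.0 / propensity   # reciprocal of response propensity
    return selection_weight * nonresponse_weight


# Hypothetical settings: one from an oversampled stratum, one from elsewhere.
oversampled = combined_weight(p_selection=0.5, propensity=0.8)
other = combined_weight(p_selection=0.125, propensity=0.5)
```

The oversampled setting receives the smaller weight, so that it does not count for more than its share of the population in weighted estimates.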

Further details of the methods used to produce each set of weights are provided in the subsections below.

Setting Head Weights

Setting head weights were created for the 271 settings where a head responded. Only one setting head response was allowed from each setting. First, selection weights for the settings were created from the inverse selection probability during sampling. Second, a setting-level logistic regression model was run, weighted by the selection weight. The outcome for this model was response from the setting head and the covariates included were SIMD quintiles, setting type (LA or private/voluntary/non-profit), size band, and whether the setting was in Glasgow. Non-response weights were calculated as the reciprocal of the propensity to respond estimated from this model. Finally, the non-response weights were combined with the setting selection weights and checked for outliers. The four top weights were trimmed to improve efficiency. The design effect of the final setting head weights is 1.77 and the efficiency 57%.

Keyworker Weights

Keyworker weights were created for 2,177 keyworker responses. Up to 10 keyworker responses from each setting were allowed. First, selection weights for the settings were created from the inverse selection probability of each setting during sampling. Second, a setting-level logistic regression model was run, weighted by the selection weight. The outcome for this model was any keyworker responses from the setting and the covariates included were SIMD quintiles, setting type (LA or private/voluntary/non-profit), size band, and whether the setting was in Glasgow. Non-response weights were calculated for the 266 settings with keyworker responses as the reciprocal of the propensity to respond estimated from this model. Third, the setting-level non-response weights were combined with the setting selection weights and matched onto the 2,177 keyworker responses.

As the final step, child selection weights were calculated to adjust for children's differential probability of selection between settings. These were calculated as the inverse of the number of children selected per setting (if recorded on the response sheet), or the number of children sampled (if not), divided by the estimated number of eligible children at the setting. The setting-level weights were combined with the child selection weights and checked for outliers. The top weights were trimmed at the 99.5th percentile to improve efficiency. The design effect of the final keyworker weights is 1.45 and the efficiency 69%.
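
The child selection weight and the percentile trimming can be sketched as follows in Python. The percentile helper uses linear interpolation, which is an assumption, since the report does not say which percentile definition was used. The example weights are invented.

```python
def child_selection_weight(n_selected, n_eligible):
    """Inverse of the within-setting selection probability."""
    return n_eligible / n_selected  # 1 / (n_selected / n_eligible)


def percentile(values, q):
    """q-th percentile (0 to 100) using linear interpolation."""
    ordered = sorted(values)
    position = (len(ordered) - 1) * q / 100.0
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = position - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


def trim_weights(weights, q=99.5):
    """Cap every weight at the q-th percentile of the weights."""
    cap = percentile(weights, q)
    return [min(w, cap) for w in weights]


# A setting with 25 eligible children, 10 of whom were selected.
large_setting_weight = child_selection_weight(n_selected=10, n_eligible=25)

# Invented weights: 199 typical values and one extreme outlier.
raw_weights = [1.0] * 199 + [50.0]
trimmed_weights = trim_weights(raw_weights)
```

Each selected child in the larger setting stands for 2.5 eligible children, while a child in a setting where everyone was invited has a child selection weight of 1.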

Parent Weights

Parent weights were created for 1,648 parent responses. Up to 10 parent responses from each setting were allowed. First, selection weights for the settings were created from the inverse selection probability of each setting during sampling. Second, a setting-level logistic regression model was run, weighted by the selection weight. The outcome for this model was any parent responses from the setting and the covariates included were SIMD quintiles, setting type (LA or private/voluntary/non-profit), size band, and whether the setting was in Glasgow. Non-response weights were calculated for the 295 settings with parent responses as the reciprocal of the propensity to respond estimated from this model. The non-response weights were checked for outliers and the top weight trimmed.

As the third step, setting-level non-response weights were combined with the setting selection weights and matched onto the 1,648 parent responses. Finally, child selection weights were calculated to adjust for children's differential probability of selection between settings. These were calculated as the inverse of the number of children selected per setting (if recorded on the response sheet), or the number of children sampled (if not), divided by the estimated number of eligible children at the setting. The setting-level weights were combined with the child selection weights and checked for outliers. The two top weights were trimmed to improve efficiency. The design effect of the final parent weights is 1.37 and the efficiency 73%.

Contact

Email: socialresearch@gov.scot
