2 Research methods
This section summarises the methods used in sampling, surveying and analysing data from ELC providers. It also outlines the key limitations of this exercise that need to be taken into account when interpreting or using the data tables.
2.1 The sample
The Scottish Government provided Ipsos MORI with a list of 966 private and not-for-profit partner providers recorded as providing government-funded places for two, three and four year-olds in the Care Inspectorate's records. This list included contact information (provider name, manager name, e-mail address, phone number, address) and a range of demographic and other details about the provider (e.g. number of funded places for two, three and four year-olds, and number of staff) and their location (e.g. local authority area; Scottish Index of Multiple Deprivation score; Scottish Government six-fold urban-rural classification).
Ipsos MORI carried out some further cleaning and editing of the initial list, including:
- Identifying and dropping duplicate variables and duplicate cases (the final issued sample included 965 cases)
- Filling in missing data (e.g. missing phone numbers) where possible
- Adding additional variables, in particular adding a variable to identify nurseries that appeared to be part of a chain or group of nurseries (since we wanted to give these nurseries the option of completing the survey on behalf of the whole group if easier).
A sample of childminders was provided to Ipsos MORI by the Scottish Childminding Association (SCMA). The sample was purposive rather than random - we aimed to get a spread of childminders from different areas to test how much possible variation there is in their costs. As such, the 10 interviewed were recruited from a relatively large initial list of contacts.
2.2 Questionnaire design
In order to inform the Scottish Government's planning for the extension of government-funded ELC hours, the survey needed to collect detailed information about private and not-for-profit partner providers' costs, income, capacity and occupancy. The key challenge in designing the questionnaire was how to balance the need for as much detail as possible, with making it as simple and quick as possible for providers to complete. Given the range of partner providers involved - from small charitable playgroups to large private nurseries - the questionnaire also needed to be equally relevant to providers that operate on different models in terms of hours, fee structures, etc. As the level of cost information the survey requested was relatively detailed, providers were advised that it might be useful to have their annual accounts to hand while completing the survey.
Development and piloting
An initial draft questionnaire - drafted following discussions with the Scottish Government and reviewing relevant materials - was piloted with six partner providers to ensure it was clear and easy to complete. Pilot providers were recruited to reflect a range of different settings (private and not-for-profit, large and small) and areas (in terms of deprivation and urban-rural). Four providers in Edinburgh and two in East Lothian (one accessible rural and one small town) were visited by members of the research team, who reviewed the draft questionnaire with them and identified issues and areas for improvement. Changes made as a result of piloting included:
- Introducing an option to complete the survey on behalf of a group of nurseries, following feedback from people who managed a group of several nurseries that this would be easier than having to fill in questionnaires for each individually
- Adding instructions about how to pause the questionnaire and pass it to someone else to complete, following feedback that more than one person might need to be involved (for example, the day-to-day manager plus the treasurer/finance manager)
- Adding questions to more accurately identify the level of service providers offer (and so they could be routed to more relevant follow-up questions) - for example, checking whether providers only provide the 600 funded hours each year or whether they offer more hours of ELC than this, and asking whether or not their start and end times vary between government-funded and privately-funded places
- Dropping questions that were impossible to answer accurately - for example, feedback from providers was that it was impossible to say what proportion of parents take up their full 600 hours of government-funded ELC, since in some cases parents may split their hours between that provider and a school nursery. Providers also struggled to answer a question about the percentage of staff contact time (spent directly with children) vs non-contact time (for example, time spent on administration), so this was also dropped.
- Changes to terminology, such as: replacing 'pre-school' (generally interpreted as only referring to three to five year-olds) with 'early learning and childcare for children aged 5 and under'; amending descriptions of staff categories to better reflect how providers referred to them; and changing the qualification categories to try and better reflect how providers referred to early years qualifications
- Amends to options for the time-period costs relate to, following feedback that for some providers it was easier to give costs/salaries per term, quarter or hour than per week, month or year
- Amends to routing to try and ensure providers were only asked follow-up questions that were definitely relevant to them
- Amends to range-checks (checks built into a computer script which either bring up queries or do not let people move to the next question if they enter a response that appears unlikely).
Feedback was also sought from the National Day Nurseries Association (NDNA) and Early Years Scotland (EYS) and minor changes made following this, including clarifying and adding further reassurances about confidentiality and how the data would be stored and used.
Scripting and fieldwork
Once the content of the main providers' questionnaire was finalised, Ipsos MORI's online scripting team transferred it into IBM Dimensions. The online script was subject to further testing by the research team at Ipsos MORI. All 965 partner providers were then sent an e-mail invitation to complete the survey along with a unique link to their online questionnaire. They were also sent a letter invitation, in case their e-mail address was out of date (see Appendix B). Reminder e-mails were sent 10 days after launch, and again 10 days before the survey closed. The survey was open from 4th June to 10th July 2016.
An e-mail address and phone number for the survey were in operation throughout, so that the research team could answer queries and resolve technical problems. Several minor changes to the script were made during fieldwork (identified as a result of comments from respondents).
Ipsos MORI's telephone centre encouraged providers to respond to the online survey by calling them, checking they had received information about the survey, emphasising the importance of the study to future ELC policy in Scotland, and finding out whether there was anything else they needed to help them take part.
The childminders' questionnaire is a simplified version of the online survey that was developed with input from the Scottish Childminding Association (SCMA). The questionnaire includes standard questions on registration, capacity and occupancy as well as questions on various categories of costs derived from an example childminder cash book provided by the SCMA. The questionnaire was formatted in Excel so that the information could easily be pulled together for analysis. Interviews were conducted over the phone by a member of the research team.
2.3 Response rates and achieved sample profile
Table 2.1 shows the profile of the issued and achieved sample. The achieved sample is shown separately for a) all respondents and b) all those respondents who gave sufficient detail at questions about costs to be included in the main analysis of provider costs (18 responses did not include sufficient detail at these questions and were excluded from cost analysis).
Broadly, the profile of the achieved sample - both overall and among those providers who gave sufficient detail to be included in cost analysis - was reasonably close to that of the issued sample of all partner providers in terms of provider type, size, area of Scotland and deprivation. In terms of the urban-rural location of providers, however, the achieved sample included relatively more providers in small towns and rural areas and relatively fewer in large urban and other urban areas compared with the profile of all partner providers. Other smaller variations between the issued and achieved sample included:
- The achieved sample included slightly more not-for-profit providers (30% compared with 24%) and slightly fewer private providers than the issued sample (70% compared with 76%)
- The achieved sample, particularly for those who gave sufficient detail to be included in the cost analysis, included slightly fewer medium-sized providers (with 15-39 funded ELC places) than in the population of all providers (46% compared with 51%)
- There were slightly fewer providers in the least deprived areas of Scotland (18% compared with 24% in the least deprived quintile, as measured by the Scottish Index of Multiple Deprivation), though as there were more in the second least deprived quintile (31% compared with 27%), overall the profile of participating providers broadly reflects that of all partner providers in terms of the deprivation of areas they are located in.
Table 2.1: Sample profile (issued vs. achieved)
| | Issued sample: N | Issued sample: % | Achieved sample (all): number of responses | Achieved sample (all): number of providers covered | Achieved sample (all): % of providers (excluding 'varies') | Achieved sample (all providing detailed cost information): number of responses | Achieved sample (all providing detailed cost information): number of providers covered | Achieved sample (all providing detailed cost information): % of providers (excluding 'varies') |
|---|---|---|---|---|---|---|---|---|
| Single provider or part of a chain? | | | | | | | | |
| Part of a chain | 224 | 23% | 20 | 51 | 23% | 15 | 46 | 23% |
| Size of provider? | | | | | | | | |
| Small (<15 funded places) | 297 | 31% | 58 | 58 | 32% | 54 | 54 | 33% |
| Medium (15-39 funded places) | 494 | 51% | 87 | 87 | 48% | 76 | 76 | 46% |
| Large (40+ funded places) | 174 | 18% | 38 | 38 | 21% | 35 | 35 | 21% |
| Varies (chain) (note 1) | | | 8 | 39 | | 8 | 39 | |
| Highlands and Islands | 138 | 14% | 29 | 33 | 16% | 28 | 32 | 17% |
| North Eastern Scotland | 120 | 12% | 18 | 18 | 9% | 16 | 16 | 9% |
| South Western Scotland | 346 | 36% | 72 | 76 | 37% | 66 | 70 | 37% |
| SIMD (note 2) quintile | | | | | | | | |
| 1 - most deprived | 119 | 12% | 22 | 22 | 12% | 21 | 21 | 13% |
| 5 - least deprived | 230 | 24% | 36 | 36 | 19% | 31 | 31 | 18% |
| 1 Large urban | 339 | 35% | 49 | 49 | 26% | 43 | 43 | 25% |
| 2 other urban | 285 | 30% | 52 | 52 | 28% | 45 | 45 | 26% |
| 3 small town (access) | 87 | 9% | 25 | 27 | 14% | 23 | 25 | 15% |
| 4 small town (remote) | 43 | 5% | 12 | 14 | 7% | 12 | 14 | 8% |
| 5 accessible rural | 138 | 14% | 32 | 32 | 17% | 30 | 30 | 18% |
| 6 remote rural | 73 | 8% | 15 | 15 | 8% | 14 | 14 | 8% |
1. In some cases where respondents answered on behalf of a chain or group of nurseries, it was clear that all members of the chain/group were in the same (type of) geographic area or were of a similar size, but in other cases this was not clear. Responses that fall into these latter categories were therefore coded 'varies'.
2. Scottish Index of Multiple Deprivation
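The comparison between the issued and achieved profiles can be reproduced from the figures in Table 2.1. The short Python sketch below is illustrative only: the variable names are our own, the figures are copied from the table rows for provider size and urban-rural location, and it is not the procedure used by the research team.

```python
# Responses and providers covered, by provider size (all achieved responses),
# copied from Table 2.1.
size_rows = {
    "Small": (58, 58),
    "Medium": (87, 87),
    "Large": (38, 38),
    "Varies (chain)": (8, 39),
}
total_responses = sum(responses for responses, _ in size_rows.values())
total_providers = sum(providers for _, providers in size_rows.values())

# Percentage of providers by urban-rural category: (issued, achieved - all).
urban_rural_pct = {
    "1 Large urban": (35, 26),
    "2 other urban": (30, 28),
    "3 small town (access)": (9, 14),
    "4 small town (remote)": (5, 7),
    "5 accessible rural": (14, 17),
    "6 remote rural": (8, 8),
}

# Percentage-point difference: positive means over-represented in the
# achieved sample relative to the issued sample.
pp_difference = {
    area: achieved - issued
    for area, (issued, achieved) in urban_rural_pct.items()
}
over_represented = [area for area, diff in pp_difference.items() if diff > 0]

print(total_responses, total_providers)
print(pp_difference)
```

The totals agree with the 191 responses covering 222 providers reported elsewhere in this section, and the differences show the pattern described above: small towns and accessible rural areas are over-represented, while large urban and other urban areas are under-represented.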
2.4 Data cleaning and analysis
The main aim of this survey was to provide the Scottish Government with an estimated cost per hour of providing ELC for children, supported by detailed information about the costs that feed into this. In order to estimate costs per hour, we needed to calculate:
- Overall costs incurred by ELC providers. This was collected by the survey across various cost headings, including: staff costs, mortgage/rent, utilities, consumables, external catering costs, play and learning equipment, play and learning activities and services, course fees and expenses for staff training, ICT equipment and office supplies, transport costs, maintaining or improving buildings, contracts for building services, business rates, other taxes excluding payroll taxes, and anything else not covered by these. These figures were all taken from responses to sections C and D of the questionnaire.
- The total number of hours of ELC being provided. This was calculated by multiplying: the number of weeks a year providers were open for, by the average hours provided per child per session, by how many children they currently had attending. These component figures were derived using the data provided in response to sections A and B of the questionnaire. Providers were able to give the number of children attending either on a daily basis, or separately for morning and afternoon sessions. They were also asked how long their day or half-day sessions were.
The costs per ELC hour were then calculated by dividing the total annual costs incurred, by the total annual number of ELC hours being provided.
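As a minimal sketch of the calculation described above, the Python below multiplies weeks open by session length and attendance to give annual ELC hours, then divides annual costs by those hours. The function and parameter names are hypothetical, the figures are made up for illustration and are not survey results, and we assume here that 'children attending' is expressed as the total number of child attendances at sessions across one week.

```python
def annual_elc_hours(weeks_open, hours_per_session, child_sessions_per_week):
    """Total ELC hours delivered in a year.

    child_sessions_per_week is an assumed input: the number of child
    attendances at sessions summed over one week.
    """
    return weeks_open * hours_per_session * child_sessions_per_week


def cost_per_elc_hour(total_annual_costs, total_annual_hours):
    """Total annual costs divided by total annual ELC hours."""
    if total_annual_hours <= 0:
        raise ValueError("total annual hours must be positive")
    return total_annual_costs / total_annual_hours


# Illustrative figures only.
hours = annual_elc_hours(
    weeks_open=38, hours_per_session=3.0, child_sessions_per_week=100
)
rate = cost_per_elc_hour(total_annual_costs=57_000.0, total_annual_hours=hours)
print(hours, rate)
```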
In order to carry out this analysis, a significant amount of data cleaning was required to deal with missing and incomplete data and outliers (data that looks to be outside the plausible range of responses for a particular question). The data cleaning process involved a combination of manually inspecting and making decisions about how to treat missing/unusual data (e.g. whether to include or exclude outliers from calculations based on whether or not they appeared to be within a plausible range of responses), and imputation of missing data where possible and appropriate. Imputation involved estimating a missing value based on what is known about the other characteristics of that provider and the values provided by other providers with similar characteristics - for example, if the number of children per session was missing for a particular provider, we imputed (estimated) this number based on (a) the number they were registered to take (which they had provided) and (b) average occupancy rates for providers of a similar size. As discussed above, 18 of 191 responses were excluded from the main analysis of costs, as they did not provide sufficient information (for example, they provided information about capacity but did not answer questions on costs).
More detailed information about cleaning and data processing is provided where relevant in the notes to individual data tables. However, this section summarises key decisions taken in relation to cleaning (a) costs data and (b) hours data.
Cleaning cost data
- Imputing missing staff costs - in most cases, staff costs were based on responses to a question which asked providers about their actual total staff costs (which they could provide on a weekly, monthly, termly or yearly basis). This question asked them to include costs for all categories of staff, and to include temporary and permanent staff. However, in addition to asking providers to give their total staff costs as a single amount (at question C1a), we also asked them about how many ELC staff they had at different levels, and the average salary paid to staff at these levels (questions C2a to C8c). These more detailed questions were used in two ways:
- As a check on the response to the total staff costs - we used the detailed questions to derive an estimated annual salary bill (by multiplying the number of staff in each category by the average salary for staff in that category, and totalling these together), and compared this with the annual salary bill derived from C1a, which asked for overall staff costs as a single amount. The two figures were not completely comparable - the more detailed questions only asked about average salaries, and may therefore over or underestimate actual salary bills depending on how accurate an average providers were able to give. They also excluded 'other' staff - for example, administrative support staff or drivers - who should have been included in the total staff salary bill. However, being able to compare the two helped identify outliers for further inspection, where the difference between the annual staff costs derived from these two methods looked particularly large.
- To impute overall staff costs where this was not given separately - in 13 cases, providers were able to give numbers of staff and average salaries by level, but did not give an overall figure for their total staff bill. In these cases, the overall staff bill was derived based on the responses given about numbers of staff and average salaries.
- Imputation of other missing costs - non-staff costs were asked about in a standard way - providers were asked to complete a table and for each cost heading (listed above), to enter a value for costs incurred, and to indicate the period this covers. Where costs were not given in annual amounts, they were converted to annual amounts for analysis. There was a sizeable volume of missing data under 'other' costs (where providers had left the cell blank). It was unclear whether providers did not incur any costs under these headings, or whether they were simply unable to estimate these costs. We had to make some assumptions about this in order to calculate overall costs.
- For mortgage and rental costs, where no answer was given we have imputed that their costs were the same as the mean mortgage/rental costs for providers in the same tenure
- For all other non-staff costs, where no answer was given costs were imputed for blank cells based on the mean costs for other providers of the same type (private or not-for-profit).
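The cost-cleaning steps listed above can be sketched in Python as follows. This is an illustration under stated assumptions rather than the code used for the study: all function and field names are hypothetical, the multipliers used to annualise costs (for example, three terms a year) are our own assumptions because the conversion rules are not specified here, and the figures are invented.

```python
from statistics import mean

# Assumed number of periods per year for converting costs to annual amounts.
PERIODS_PER_YEAR = {"week": 52, "month": 12, "quarter": 4, "term": 3, "year": 1}


def annualise(amount, period):
    """Convert a cost reported for a given period to an annual amount."""
    return amount * PERIODS_PER_YEAR[period]


def derived_staff_bill(staff_levels):
    """Sum of (number of staff x average annual salary) across staff levels."""
    return sum(count * avg_salary for count, avg_salary in staff_levels)


def relative_gap(stated, derived):
    """Gap between stated and derived staff bills, as a share of the stated bill."""
    return abs(stated - derived) / stated


def impute_group_mean(records, cost_key, group_key):
    """Replace missing (None) costs with the mean reported cost in the same group.

    Raises KeyError if a group has no reported values to average.
    """
    reported = {}
    for record in records:
        if record[cost_key] is not None:
            reported.setdefault(record[group_key], []).append(record[cost_key])
    group_means = {group: mean(values) for group, values in reported.items()}
    return [
        {**record, cost_key: group_means[record[group_key]]}
        if record[cost_key] is None
        else dict(record)
        for record in records
    ]


# Illustrative data only.
providers = [
    {"id": "A", "type": "private", "utilities": 2400.0},
    {"id": "B", "type": "private", "utilities": 3600.0},
    {"id": "C", "type": "private", "utilities": None},
    {"id": "D", "type": "not-for-profit", "utilities": 1200.0},
]
filled = impute_group_mean(providers, "utilities", "type")
staff_bill = derived_staff_bill([(1, 30_000.0), (4, 18_000.0)])
gap = relative_gap(stated=100_000.0, derived=staff_bill)
```

The same `impute_group_mean` helper would serve for mortgage and rental costs by grouping on tenure instead of provider type. A large `gap` would mark a response for manual inspection; what counts as 'large' was a matter of judgement and no threshold is assumed here.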
Cleaning ELC hours data
- Number of weeks open each year was calculated as 52 minus the total number of weeks closed (asked at QA2)
- Number of children attending per session - partner providers were asked to say how many children currently attended each session, either per full day session for each day Monday to Friday, or split into morning and afternoon sessions. However, this information was missing for four partner providers. For these four cases, the number of children per session was imputed based on the number of children they were registered to take (collected at QB1a), multiplied by the average occupancy rates (that is, the numbers actually attending, divided by the numbers they are registered to take) for providers of the same size. (In 11 cases where the number of registered places was missing or unknown, we referred to the latest published inspection report for that provider to ascertain how many places they were registered for.)
- Calculating occupancy rates - for every partner provider, we calculated occupancy rates - that is, what proportion of registered places were actually taken up by children attending - for three age groups (under twos, two year-olds, and three to five year-olds). This was calculated by dividing the number of children they reported attending in a week by the number of registered places per week (taken from their response to QB1a). For a small number of providers, occupancy rates calculated on this basis were above 100%. In other words, they appeared to have more children attending than they were registered to take. These cases were examined, and the most likely cause of error was inaccurate completion of attendance levels by age group. These cases were visually inspected and obvious errors were corrected, leaving no cases where the overall occupancy rate was still over 105%. All cases bar one had a final overall occupancy rate of 100% or less (with one having an occupancy rate of 101%). Where the occupancy rate for under twos or two year-olds separately was higher than 100% but the overall occupancy rate was below 100%, it was assumed that this either reflected the current position at the provider or that the attendance figures given by respondents had been correct in terms of the overall number but may have been attributed to the wrong age group. These were not edited.
- Hours per child per week - the number of hours of ELC provided per child per week was calculated based on multiplying the number of children attending per session by the session lengths. Based on this, the average hours of ELC provided per child per week ranged from 5 hours to 55 hours across providers, with a median of 23 hours. This is where the most cleaning by visual inspection was undertaken. For example, in some cases providers indicated (in open text responses) that the length of sessions varied on different days of the week (e.g. 2 hours on a Monday, 5 on a Tuesday, etc.), so it was necessary to derive an average session length from this for analysis. It is possible that some over-estimation of ELC hours in total has occurred, since we have had to estimate session lengths in some cases based on opening and closing times, and some children may only attend for a part of this time (particularly where providers offer hourly rates) - although of course, providers will incur costs for the hours they are open even when children are not present.
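The hours-cleaning steps above can be illustrated with a short Python sketch. It takes occupancy as the number of children attending divided by the number of registered places, and imputes missing attendance as registered places multiplied by an average occupancy rate. The names are hypothetical, the figures are invented, and averaging the individual providers' rates (rather than, say, dividing total attendance by total places) is an assumption made for illustration.

```python
from statistics import mean


def weeks_open(weeks_closed):
    """Weeks open each year: 52 minus the number of weeks closed."""
    return 52 - weeks_closed


def occupancy_rate(children_attending, registered_places):
    """Share of registered places taken up by children attending."""
    if registered_places <= 0:
        raise ValueError("registered places must be positive")
    return children_attending / registered_places


def impute_children_attending(registered_places, average_occupancy):
    """Estimate attendance where it is missing."""
    return registered_places * average_occupancy


def flag_over_capacity(rates, threshold=1.0):
    """Return provider ids whose occupancy rate exceeds the threshold."""
    return [provider_id for provider_id, rate in rates.items() if rate > threshold]


# Illustrative data: (children attending, registered places) for providers
# of a similar size that did report attendance.
similar_providers = {"P1": (18, 24), "P2": (27, 30), "P3": (26, 25)}
rates = {pid: occupancy_rate(a, r) for pid, (a, r) in similar_providers.items()}

to_inspect = flag_over_capacity(rates)  # rates above 100% need checking
valid_rates = [rate for pid, rate in rates.items() if pid not in to_inspect]
estimated = impute_children_attending(
    registered_places=20, average_occupancy=mean(valid_rates)
)
print(weeks_open(6), to_inspect, estimated)
```

Excluding the flagged provider from the average is a design choice in this sketch only; in the study, cases above 100% were inspected manually and corrected where an obvious error was found.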
2.5 Key limitations and issues for interpretation
The analysis carried out for this study gives as accurate information as is possible about provider costs per ELC hour, based on the information we were able to collect. However, there are inevitably some limitations to this data. In particular, in interpreting and using the data for further modelling, it is important to keep the following issues in mind.
- A modest response rate - 22% of partner providers invited to participate in this survey took part. While this is a reasonable response rate for surveys of this type - issued to busy businesses and asking for detailed information that they may not have immediately to hand - we cannot be completely sure that there are no differences between average costs based on this survey and average costs incurred by partner providers who did not take part in the survey. However, as described above, the profile of the achieved sample was, overall, broadly similar to that of the issued sample. This gives us reasonable confidence that the findings based on this sample are likely to be broadly representative of partner providers. The main exception to the relatively good match between the issued and achieved sample profile was with respect to the balance of providers in urban vs. rural areas.
- Sample size for sub-groups - while the overall sample size (191 responses covering 222 providers) is sufficient for the analysis required, the number of cases within specific sub-groups is smaller and the degree of confidence we can attach to any figures based on these sub-groups is consequently lower. For example, there were only 18 responses from providers in North Eastern Scotland (of a possible 120 in the issued sample). As such, any analysis or modelling based on this sub-group will have a much higher degree of uncertainty attached to it compared with analysis based on the achieved sample as a whole.
- A relatively high volume of missing data, particularly in relation to non-staff costs - as described above, in a relatively large number of cases, providers left particular cost cells blank. We therefore had to make decisions about imputing amounts based on the mean amount for similar kinds of provider. While this is standard practice for dealing with missing data, it is of course possible that this means that the total costs are either slightly higher or slightly lower than they are in reality. However, this should make only minimal difference to the overall average costs (particularly given that the largest share of the total costs are staff costs).
- Challenges around measuring session times and estimating ELC hours - the most difficult element of both the questionnaire design and the data cleaning related to estimating session times and ELC hours. Piloting indicated that the ways in which providers offered sessions varied widely, with some offering combinations of hourly, half-day and full-day sessions and others offering no standard sessions at all. It was practically impossible to include all possible variations of session lengths within the questionnaire, so we asked providers to estimate half day and full day session lengths. However, if high numbers of children for a given provider are on flexible hours that do not conform to these session lengths, then the total ELC hours derived from this may be under or (more likely) over estimates of the actual number of hours delivered. To the extent that the hours delivered are an over-estimate of actual hours, the hourly costs will be under-estimated (since these are derived by dividing total costs by total hours of ELC provided).
- Exclusion of profit from costs - in using these cost estimates, it is important to note that the questionnaire asked about costs but did NOT ask about profits. This is relevant in terms of discussions about funding - private companies are unlikely to continue to operate if they are not generating a profit in addition to their costs, although actual/desired profit margins will vary widely. It is also worth noting that for some private nurseries, the owner/managers' own income is taken as a draw-down from profits rather than as a salary. It is possible that some owner/managers may have excluded these payments to themselves when asked about 'staff costs'. While we do not have any evidence that this caused widespread difficulties in completing the survey, if it did occur it may have resulted in some underestimation of total staff costs.
- Future additional costs that may need to be taken into account in modelling - it is also important to note that there are some additional costs that providers will be required to meet in the near future - in particular, pension costs relating to auto-enrolment. Providers who are currently providing lower numbers of hours may also incur some new costs (e.g. rental costs where venues are currently being provided rent free on a 600 hour per year basis) if they are to extend their opening hours to accommodate the extension of the ELC entitlement to 1,140 hours.
- Degree of variation in childminder costs - the aim of interviewing a small sample of childminders for this study was to establish whether or not their costs appeared to be fairly consistent. In fact, there was a large degree of variation in both the costs reported by the 10 childminders interviewed and in their estimated income after costs. Given this, there may be a need for further work with childminders to gain a more accurate understanding of their cost base and operating assumptions, particularly if they may be asked to be more involved in delivery of government funded ELC in Scotland.
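The limitation noted above about session times, namely that over-estimated ELC hours lead to under-estimated hourly costs, follows directly from dividing total costs by total hours. The Python sketch below works through the arithmetic. The function name is hypothetical and the 10% figure is purely illustrative; the study does not estimate the actual size of any over-estimate.

```python
def hourly_cost_understatement(hours_overestimate):
    """Fractional understatement of cost per hour when hours are overstated.

    hours_overestimate is the fractional overstatement of ELC hours
    (0.10 means estimated hours are 10% above actual hours).
    If true cost per hour is C / H, the estimate is C / (H * (1 + e)),
    so the estimate falls short of the true figure by 1 - 1 / (1 + e).
    """
    if hours_overestimate <= -1:
        raise ValueError("hours_overestimate must be greater than -1")
    return 1 - 1 / (1 + hours_overestimate)


# If hours were overstated by 10%, hourly costs would be understated
# by roughly 9%.
shortfall = hourly_cost_understatement(0.10)
print(round(shortfall * 100, 1))
```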
Email: Sasha Maguire, Sasha.firstname.lastname@example.org