Nature-based early learning and childcare - influence on children's health, wellbeing and development: literature review

This review of global evidence aimed to understand the extent to which nature-based early learning and childcare (ELC) influences young children’s physical, cognitive, and social and emotional development


Step 1: Searching the literature

To ensure transparency and scientific rigour, the methodology of the present review was registered to the International Prospective Register of Systematic Reviews (CRD42019152582) on 2nd October 2019 prior to the commencement of the literature search. The planned methodology has also been peer-reviewed and published in a scientific journal (22).

This comprehensive systematic review aimed to gather global evidence on the effect of nature-based ELC on children's health, wellbeing and development from both scientific and non-scientific sources:

Scientific sources: nine relevant electronic databases were searched:

1) Education Research Information Centre (ERIC) – (EBSCOhost),
2) Australian Education Index – (ProQuest),
3) British Education Index – (EBSCOhost),
4) Child Development and Adolescent Studies – (EBSCOhost),
5) Applied Social Sciences Index and Abstracts – (ProQuest),
6) PsycINFO – (EBSCOhost),
7) MEDLINE – (EBSCOhost),
8) SportDiscus – (EBSCOhost) and
9) Scopus – (Elsevier).

Search strategies for the nine electronic databases were constructed by the review team (VW, AM and AJ). An example search strategy for the ERIC database, which was adapted for the other eight databases, can be found in Appendix A. To capture as much relevant evidence as possible, the searches were not restricted by year of publication or publication language.

To capture non-peer-reviewed evidence, such as dissertations and reports, Open Grey, the Dissertation and Theses Database (ProQuest) and the Directory of Open Access Journals were searched. Researchers in the field of children, nature and play were contacted directly to highlight articles. Finally, the first 10 pages of Google Scholar were checked. Literature citing studies published from 2019 onwards was screened to identify recently published evidence that may have been missed in the initial searches.

Non-scientific sources: Relevant organisations and practitioners in the field were contacted via Twitter and email to obtain additional evidence. Websites of relevant organisations, professional bodies and other groups involved in outdoor education and outdoor play were also searched.

Step 2: Defining the inclusion and exclusion criteria

We followed the PI(E)COS framework for defining the eligibility criteria. PI(E)COS stands for Population, Intervention or Exposure, Comparison, Outcomes and Study design. This provides a systematic approach to capturing evidence relating to the research question.

Population: Children attending ELC settings (i.e. nurseries, preschool) who have not started primary school education were included. The age at which children start primary school (or elementary school, as it is known in some other countries) varies globally and, as this is a review of international evidence, children in eligible studies had to be between 2 and 7 years of age. Studies which included children younger than 2 years or older than 7 years were excluded because these age groups would not typically attend ELC settings. Studies which included solely a child population with disease conditions (for example, autism, physical disability, attention deficit hyperactivity disorder) were excluded.

Exposure/Intervention: The exposure of interest was nature-based ELC, an umbrella term that encompasses different types of early years education internationally, including nature-based preschool, kindergarten and daycare (23). These can vary depending on country context, approach used, level of nature, and duration (half day, full day), but are related through their integration of nature in their curriculum and/or environment. To be eligible for inclusion in this review, studies therefore had to include nature-based ELC; that is, interventions that provided children with nature-based experiences or explored specific natural elements (e.g. hills, trees, water, snow). ELC settings that did not integrate nature into their curriculum and/or environment were excluded. For example, studies where settings utilised a more traditional indoor approach or where the playground was predominantly concrete and featured manmade structures (swings, slide, climbing frame etc.) were excluded.

Comparison: Attendance at traditional, indoor ELC (preschool, daycare) where children had fewer outdoor opportunities and where the outdoor environment was predominantly concrete and consisted of manmade elements such as swings, slides and climbing frames.

Outcomes: To capture the possible wide-ranging outcomes of nature-based ELC, any child-level outcome related to health, wellbeing and development was included. Specifically, this included outcomes related to children's physical (e.g. physical activity, motor development), cognitive (e.g. executive functions, attention), social (e.g. prosocial behaviour), emotional (e.g. stress reduction) and environmental (connectedness to nature) health, wellbeing and development. Studies were excluded if they included outcomes which were not child-level. Studies which assessed outcomes using unvalidated questionnaires were also excluded (for both quantitative and qualitative designs).

Study designs: Both quantitative and qualitative designs were eligible. Qualitative studies that explored perceptions (from parent, practitioner or child) at a time when the child was attending nature-based ELC were included. All quantitative study designs were included: cross-sectional and case-control studies with outcomes measured when the child was attending nature-based ELC; longitudinal, quasi-experimental and experimental studies with at least two time points; and retrospective studies if outcomes were assessed at a time when the child attended nature-based ELC. Studies were excluded where the timepoint of outcome measurement could not be readily associated with the exposure (for example, if studies measured effect once the child had left the nature-based ELC). Case studies reviewing only one child were also excluded. Qualitative studies were also excluded if they did not have a comparator (exposure, control group or pre/post).

Step 3: Selecting the studies

Only studies that met the above criteria were included. References from the nine electronic databases and other searches were imported into the referencing software EndNote, and one reviewer (AJ) removed duplicates. Titles and abstracts were screened once (AJ, PM, RC, IF, SI, FL, BJ, VW) and 10% were screened in duplicate independently (AM). Two researchers independently screened full text articles in duplicate. A third reviewer was brought in to discuss and resolve any disagreement. Multiple publications for the same study were combined and reported as a single study.

Step 4: Extracting the data

Quantitative Data: Data from eligible studies was extracted by one reviewer (AJ) with another reviewer cross-checking all extracted data (AM, PM). The following information was extracted:

  • Study ID (authors, year of publication)
  • Country
  • Study design (cross-sectional, controlled cross-sectional, controlled before and after etc.)
  • Participants (age, gender, socioeconomic status, sample size etc.)
  • Intervention/exposure type and duration (nature-based ELC, naturalised playgrounds etc.). Details of what any comparator groups received were also recorded (for example, characteristics of traditional preschool).
  • Outcome measures (type, assessment tool, unit and time point of assessment etc.)
  • Outcomes and results (effect estimates, standard deviation, confidence intervals etc.)

Qualitative Data: One reviewer read through each eligible qualitative study (AJ) and provided a summary of the main themes as reported by the study author and any other relevant information. A second reviewer read the study and summary provided by reviewer one and added any additional information (HT, PM). The following information was extracted:

  • Study ID (authors, year of publication)
  • Country
  • Participants (gender, socioeconomic status, sample size etc.)
  • Intervention/exposure type
  • Intervention/exposure duration
  • Research aims
  • Outcome measures (interviews, focus groups etc.)
  • Outcomes and results (summary of key themes).

Step 5: Assessing the quality of the studies

The quality of all included studies was assessed by two reviewers independently (AJ/PM, AJ/AM), cross-checked and disagreement resolved through discussion with a third reviewer.

The quality of quantitative studies was assessed using the Effective Public Health Practice Project (EPHPP) Quality Assessment Tool (24). This assesses six components of study quality: selection bias; study design; confounders; blinding; data collection methods; withdrawals and drop-outs (in before and after studies only). Each component was rated 1–3 to give a total global rating of weak, moderate, or strong quality.

Why assess the quality of studies?

Assessing the quality of studies is important because it guides the interpretation of findings. For example, if a study demonstrates a significant positive health impact but is of weak design, then we would interpret its findings with caution. This might be because bias has been introduced into the study by including only a small number of children from one or two schools and/or because the data collection methods used are not valid or reliable.

When we assess the quality of the evidence, we can make judgements on confounding. Confounding relates to other factors which may influence the findings of the study, for example, the child's age, gender or socioeconomic status. It is important in any study that these are considered in the design (the group receiving nature-based ELC are matched to a control group with the same characteristics) or in the statistical analysis. If confounding has been considered, then we can have more confidence in the findings presented.

Finally, the type of study design is also factored in. Studies which assess outcomes at baseline in an intervention group and control group and then assess outcomes again at follow-up (before and after studies) are generally of stronger design and we can have more confidence in the findings. However, before and after studies can still be rated weak if there is bias or confounding has not been considered. Cross-sectional studies have a weaker design. This is because they only assess outcomes at one timepoint and we cannot be sure that findings reported are a result of attending nature-based ELC.

For qualitative data, the trustworthiness of the study was assessed using the Dixon-Woods (2004) checklist (25). This tool assesses whether research questions are clear and suited to qualitative enquiry, whether sampling, data collection and analysis are described and appropriate, whether claims are supported by sufficient evidence, whether data is integrated, and whether the study makes a useful contribution to the review question(s). Qualitative studies were excluded if the research questions were not suited to qualitative enquiry or if the paper did not make a useful contribution to the review question.

See Appendix B for the EPHPP and Dixon-Woods quality assessment tools.

Step 6: Synthesising the data

Synthesis Without Meta-analysis (SWiM) guidance was followed for reporting findings (26). For synthesising the findings, studies that had the same exposure and reported on similar outcomes were grouped and presented in summary tables. Outcomes were grouped into similar outcome domains (physical, cognitive, social and emotional, and environmental) and sub-domains. SWiM aims to provide a summary of the effect direction and to address whether the evidence favoured nature or favoured the comparison. A narrative synthesis was conducted to report on findings grouped by outcome domain, with the better quality evidence prioritised in any conclusions drawn.
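As an illustration of an effect direction summary, the sketch below tallies, for each outcome domain, how many findings favoured nature, favoured the comparison, or showed no difference. It is a minimal sketch only: the function name, the direction labels and the example findings are hypothetical and do not reproduce the review's data or its summary tables.

```python
from collections import Counter, defaultdict


def summarise_effect_direction(findings):
    """Count effect directions within each outcome domain.

    findings is an iterable of (domain, direction) pairs, where
    direction is a free-text label such as 'favours nature'.
    """
    summary = defaultdict(Counter)
    for domain, direction in findings:
        summary[domain][direction] += 1
    # Convert to plain dicts so the result is easy to print or compare.
    return {domain: dict(counts) for domain, counts in summary.items()}


# Hypothetical findings, one pair per study outcome.
example_findings = [
    ("physical", "favours nature"),
    ("physical", "favours nature"),
    ("physical", "no difference"),
    ("cognitive", "favours comparison"),
    ("cognitive", "favours nature"),
]

for domain, counts in summarise_effect_direction(example_findings).items():
    print(domain, counts)
```

A tally like this records only the direction of each finding, not its size or precision, so it would be read alongside the quality ratings rather than as a measure of effect.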

For qualitative studies, a thematic analysis of reported themes was conducted, grouping them into lower and higher order themes.

A logic model was created to summarise the findings of the qualitative and quantitative studies. The purpose of the logic model is to present a testable theory of change that will allow comparison and examination of how the different data types relate to each other and to enable readers to identify gaps for future research.

Step 7: Assessing the certainty of evidence

Assessing the certainty of evidence for each outcome allows us to draw conclusions about our confidence that the observed findings reflect true associations and effects, and that future research is unlikely to change the results. The Grading of Recommendations, Assessment, Development and Evaluation (GRADE) framework was used to assess the certainty of the evidence for each of the assessed outcomes by judging the study quality (risk of bias), precision, consistency, and directness across studies (27). Risk of bias relates to the quality of all studies that assessed the same outcome and exposure. Precision refers to the range around an effect estimate, where a small range indicates high precision. Consistency considers whether or not studies reported conflicting results. GRADE was applied when two or more studies reported on the same outcome and exposure. The certainty of evidence was rated up or down depending on the risk of bias, precision and consistency across studies to provide an overall rating for the certainty of the evidence for each outcome: very low (true effect is different from estimated effect; very likely to change with new evidence emerging), low, moderate and high (true effect is similar to estimated effect; unlikely to change with new evidence emerging) (27).
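The idea of rating certainty up or down along the four-level scale can be sketched as follows. This is an illustration under stated assumptions, not the review's procedure: the starting level, the number of levels moved for each concern, the function name and the example values are all hypothetical, and in practice each GRADE judgement is made by reviewers rather than computed.

```python
# The four certainty levels named in the text, from lowest to highest.
LEVELS = ["very low", "low", "moderate", "high"]


def grade_certainty(start, downgrades=0, upgrades=0):
    """Move from a starting level by whole steps, staying on the scale.

    start      -- assumed starting level (one of LEVELS)
    downgrades -- number of levels to rate down (e.g. for risk of bias)
    upgrades   -- number of levels to rate up
    """
    if start not in LEVELS:
        raise ValueError(f"unknown certainty level: {start!r}")
    if downgrades < 0 or upgrades < 0:
        raise ValueError("downgrades and upgrades must not be negative")

    index = LEVELS.index(start) - downgrades + upgrades
    # Clamp so the result never falls below 'very low' or above 'high'.
    index = max(0, min(index, len(LEVELS) - 1))
    return LEVELS[index]


# Hypothetical outcome: starts at 'low', rated down once for imprecision.
print(grade_certainty("low", downgrades=1))
```

The clamping step reflects that the scale has fixed end points: however many concerns are identified, the rating cannot fall below very low.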

Quality of studies versus certainty of evidence:

Assessing the quality of the studies (see Step 5) relates to the design and conduct of each study. Judgements are made on selection bias, study design, confounders, blinding, data collection methods, and withdrawals and drop-outs for each eligible quantitative study.

By contrast, the certainty of evidence looks at a single outcome which has been reported in more than one study. Study quality (above and Step 5), precision, consistency, and directness are assessed across studies to provide a rating that enables us to draw conclusions about the findings reported. For example, if the certainty of evidence is low for a specific outcome, we need to be cautious in our interpretation of the findings and subsequently the recommendations.


