Carers Census, Scotland, 2021-22

Third publication of the Carers Census, covering unpaid carers being supported by local services across Scotland in 2021-22.

Annex 1: De-duplication of Carer Census Records

As unpaid carers can sometimes be supported by more than one local service, it is possible for information on the same carer to be submitted by multiple organisations. To ensure that carers are not being double counted in the final results, the figures presented in this report refer only to records that have been de-duplicated.

De-duplication process

The de-duplication stage of the analysis involves retaining only one record per unpaid carer for inclusion in the final results.

First, instances where an organisation has returned more than one record for the same carer are examined. If information for the same carer is split over several records, these are combined in order to obtain a single record for the carer that contains all the information that has been returned.
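The combining step described above can be sketched as follows. This is a minimal illustration only, not the analysts' actual code: the field names (`organisation`, `record_id`, `gender`, `data_zone`), the use of `None` for a value that was not returned, and the "keep the first non-missing value" rule are all assumptions made for the example.

```python
from typing import Optional

# Hypothetical record layout: a dict per submitted record,
# with None marking a field that was not returned.
Record = dict[str, Optional[str]]


def combine_split_records(records: list[Record]) -> list[Record]:
    """Merge records that share (organisation, record_id) into one record,
    keeping the first non-missing value seen for each field."""
    merged: dict[tuple, Record] = {}
    for rec in records:
        key = (rec["organisation"], rec["record_id"])
        target = merged.setdefault(key, {})
        for field, value in rec.items():
            # Only fill a field that is still absent or missing.
            if target.get(field) is None:
                target[field] = value
    return list(merged.values())


example = [
    {"organisation": "Org A", "record_id": "Carer 1", "gender": "Female", "data_zone": None},
    {"organisation": "Org A", "record_id": "Carer 1", "gender": None, "data_zone": "S01000001"},
    {"organisation": "Org A", "record_id": "Carer 2", "gender": "Male", "data_zone": None},
]
combined = combine_split_records(example)
```

Here the two partial records for "Carer 1" collapse into a single record holding both the gender and the data zone, while "Carer 2" is left as a separate record.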

Next, records in which month and year of birth, gender or data zone was unknown or missing were removed. All three identifiers are required to create a de-duplication ID accurate enough to determine whether records submitted by different organisations refer to the same person. Of the records submitted in 2021-22, around 10% were removed due to missing identifiers.

De-duplication IDs were then created for each remaining record by combining the three identifiers: month and year of birth, data zone and gender. In cases where the de-duplication ID was not unique, further analysis of the data was undertaken to identify where those records with the same de-duplication ID referred to different carers.
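A rough sketch of how the filtering and ID creation might look is given below. The field names, the `|` separator, the markers treated as missing (`None`, an empty string, `"Unknown"`) and the sample data zone codes are assumptions for illustration; the source does not specify how the identifiers are encoded or joined.

```python
from collections import defaultdict

# Hypothetical field names for the three identifiers named in the text.
IDENTIFIERS = ("birth_month_year", "data_zone", "gender")
# Assumed markers for an unknown or missing value.
MISSING = {None, "", "Unknown"}


def has_all_identifiers(record: dict) -> bool:
    """True only if none of the three identifiers is missing or unknown."""
    return all(record.get(field) not in MISSING for field in IDENTIFIERS)


def dedup_id(record: dict) -> str:
    """Join the three identifiers into a single de-duplication ID."""
    return "|".join(record[field] for field in IDENTIFIERS)


def group_by_dedup_id(records: list[dict]) -> dict[str, list[dict]]:
    """Drop records with missing identifiers, then group the rest by ID."""
    groups: dict[str, list[dict]] = defaultdict(list)
    for rec in records:
        if has_all_identifiers(rec):
            groups[dedup_id(rec)].append(rec)
    return dict(groups)


sample = [
    {"record_id": "Carer 1", "birth_month_year": "1980-04", "data_zone": "S01000001", "gender": "Female"},
    {"record_id": "Carer 7", "birth_month_year": "1980-04", "data_zone": "S01000001", "gender": "Female"},
    {"record_id": "Carer 3", "birth_month_year": "1975-11", "data_zone": "S01000002", "gender": "Unknown"},
    {"record_id": "Carer 4", "birth_month_year": None, "data_zone": "S01000003", "gender": "Male"},
]
groups = group_by_dedup_id(sample)
# Any ID with more than one record is a candidate for further analysis.
non_unique = {key: recs for key, recs in groups.items() if len(recs) > 1}
```

Any ID that maps to more than one record is only a candidate duplicate; whether those records refer to the same carer still has to be decided separately.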

If multiple records submitted by a single organisation had the same de-duplication ID, but different record IDs (e.g. Carer 1 and Carer 2), it was assumed that these records referred to different carers. In cases where the same system was used by multiple providers (e.g. Carer Centres run by VOCAL) and so used the same record IDs, a single record was taken for each carer. If providers each returned different parts of the data, these were combined into a single record.
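The rules in this paragraph can be expressed as a pairwise check. The sketch below is one possible reading rather than the method actually used: the `system` field (the organisation name, or the shared system name where several providers use one system) is hypothetical, and treating matching IDs from different systems as the same carer is an assumption based on the stated purpose of the de-duplication ID.

```python
def same_carer(a: dict, b: dict) -> bool:
    """Decide whether two records are treated as referring to the same carer."""
    if a["dedup_id"] != b["dedup_id"]:
        return False
    if a["system"] == b["system"]:
        # Same organisation or shared system: record IDs tell carers apart,
        # so "Carer 1" and "Carer 2" are assumed to be different people.
        return a["record_id"] == b["record_id"]
    # Different systems: record IDs are not comparable, so a matching
    # de-duplication ID is taken to mean the same carer (assumption).
    return True


rec_a = {"dedup_id": "1980-04|S01000001|Female", "system": "Org A", "record_id": "Carer 1"}
rec_b = {"dedup_id": "1980-04|S01000001|Female", "system": "Org A", "record_id": "Carer 2"}
rec_c = {"dedup_id": "1980-04|S01000001|Female", "system": "Org B", "record_id": "X-17"}

within_org = same_carer(rec_a, rec_b)     # different record IDs in one system
across_orgs = same_carer(rec_a, rec_c)    # same ID from two systems
```

Records judged to be the same carer would then be combined into a single record, as described for providers that each return different parts of the data.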

As a result of the de-duplication process outlined above, 79% of the records submitted in 2021-22 were included in the final data analysis. This is a slight improvement on 2020-21, when 76% of the records submitted were included in the final analysis.

Table 2: Number of records included in analysis following de-duplication


Columns: Records submitted | Unique number of carers (de-duplicated records) | Duplicates and records unable to be de-duplicated

In future years, we intend to link the Carers Census data with the National Records of Scotland’s population spine, which contains the personal identifiers of everyone in the Scottish Census, in order to obtain an accurate number of individual carers from the information submitted.

Analysis of duplicate records and records unable to be de-duplicated

The de-duplication process removed 10,880 records (21% of all records submitted) from the dataset in 2021-22. Further analysis was carried out on these records in order to ascertain if certain groups of carers were impacted more than others.
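As a rough consistency check, the two figures quoted above bound the number of records submitted. The sketch below assumes the 21% is rounded to the nearest whole percentage point; the function name is hypothetical, and the result is a derived range, not a published figure.

```python
import math


def implied_total_range(removed: int, rounded_pct: int) -> tuple[int, int]:
    """Smallest and largest totals for which removed / total rounds to
    rounded_pct, assuming rounding to the nearest whole percentage point."""
    low_share = (rounded_pct - 0.5) / 100   # inclusive lower bound on the share
    high_share = (rounded_pct + 0.5) / 100  # exclusive upper bound on the share
    smallest = math.floor(removed / high_share) + 1
    largest = math.floor(removed / low_share)
    return smallest, largest


smallest_total, largest_total = implied_total_range(10_880, 21)
# Records retained are whatever is left after removal.
retained_range = (smallest_total - 10_880, largest_total - 10_880)
```

Under that rounding assumption, 10,880 removed records at 21% implies between 50,605 and 53,073 records submitted, and so between 39,725 and 42,193 records retained.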

Effects of de-duplication on different areas

Some areas were more impacted than others by the de-duplication process. However, this is not necessarily due solely to data quality issues such as missing identifiers. Areas where organisations work together to provide unpaid carers with support will be more likely to return information on the same people, which would lead to more records being removed during the de-duplication process.

For instance, in some areas the carers centre will have conversations with the carer to put a support plan in place, while the Local Authority will provide the support needed. In this situation, both organisations would return information on the same carer. To avoid double counting, the information would therefore be combined into a single record for inclusion in the final analysis.

Effects on equality groups

In 2021-22, less than 20% of records were removed for each of the adult age groups (14% of records for 18 – 64 year olds and 19% of records for 65+ year olds). This is higher than for the 0 – 18 year old age group, for which 5% of records were removed during the de-duplication process. This means that the de-duplication process affected adult carer records more than young carer records.

Similar proportions of records for male (16%) and female (18%) carers were removed in 2021-22. However, there was slightly more variation across ethnic groups with the proportion of records removed varying between 13% and 17% for each ethnic group (not including records where ethnic group was missing or not known).

The proportion of records removed for each deprivation decile (as measured using the Scottish Index of Multiple Deprivation (SIMD)) varied between 12% and 24%, with slightly higher proportions of records being removed for the less deprived SIMD deciles.
