Public sector - understanding equality data collection: main report

This research describes and explores the range of equality and socio-economic disadvantage data collected by public sector organisations. Findings offer insights into what works best in terms of collecting, utilising and safeguarding robust data, highlighting major barriers to its collection or use.

4 What equality data are collected?

4.1 This section discusses the equality data that are collected. It is based on information derived from the examination and exploration of 55 separate data collections (see Annex 2), using the research tools set out in Annexes 3 to 5.[30]

4.2 Participants in the research emphasised that their organisations decided what data to collect with reference to the Data Protection Act 2018. Thus, equality data collection was undertaken for 'explicit purposes' and limited to 'only what is necessary' for business purposes. Participants also stressed the importance of being clear from the outset how the equality data collected would be used.


It is not sufficient justification that something is 'interesting to know'. It has to be something which is required for the delivery of a service.

We don't feel we should be asking for more - the information needs to be relevant, and needs to meet DPA requirements.

Collection of information about protected characteristics

4.3 Table 4.1 below shows how many of the 55 collections considered in this study collected each of the nine (9) protected characteristics. It can be seen that:

  • Age and Sex were collected in almost every case
  • Race and Disability were collected in a large majority of cases
  • Religion and belief and Sexual orientation were collected in a substantial minority of cases (more than a third but less than a half)
  • Marriage and civil partnership, and Gender reassignment were collected in around one-quarter of cases
  • Pregnancy and maternity was collected in a small minority of cases (7).

Table 4.1 - Number of data collections containing each protected characteristic
Protected characteristic          Number of data collections (n)
Age                               53
Religion and belief               22
Race                              43
Disability                        42
Sex                               48
Sexual orientation                24
Pregnancy and maternity           7
Marriage and civil partnership    12
Gender reassignment               14
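
The proportions described in paragraph 4.3 can be checked against the counts in Table 4.1. The following is a minimal sketch of that arithmetic, assuming the total of 55 collections stated in the text; the names `COUNTS`, `share` and `SHARES` are illustrative and do not come from the research itself.

```python
# Counts transcribed from Table 4.1. The total of 55 collections is taken
# from paragraph 4.3. All names here are illustrative, not from the report.
TOTAL_COLLECTIONS = 55

COUNTS = {
    "Age": 53,
    "Religion and belief": 22,
    "Race": 43,
    "Disability": 42,
    "Sex": 48,
    "Sexual orientation": 24,
    "Pregnancy and maternity": 7,
    "Marriage and civil partnership": 12,
    "Gender reassignment": 14,
}


def share(count, total=TOTAL_COLLECTIONS):
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100 * count / total, 1)


# Percentage of the 55 collections that contained each characteristic.
SHARES = {name: share(n) for name, n in COUNTS.items()}

if __name__ == "__main__":
    # Print characteristics from most to least frequently collected.
    for name, pct in sorted(SHARES.items(), key=lambda item: -item[1]):
        print(f"{name:32s}{COUNTS[name]:4d}  {pct:5.1f}%")
```

On these figures, Religion and belief and Sexual orientation both fall between one third and one half of collections, which matches the description in the text.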

4.4 The following points should be noted with regard to the information in Table 4.1:

  • 'Marriage and civil partnership' is covered by the Public Sector Equality Duty (PSED) only in relation to unlawful discrimination in employment. Although this study did not consider workforce / employment data (see paragraph 3.2 above), this protected characteristic was included in the current study for completeness.
  • In addition to the numbers reported in Table 4.1 above, interviewees explained that, in relation to some data collections, information on a particular protected characteristic might be collected if it were relevant, and recorded in narrative or open text fields on case management systems or within files associated with an individual's case. However, the organisation's data collection protocol did not require this information to be systematically collected (and recorded) in all cases. (As a result, these cases are not included in the counts presented in Table 4.1.)
  • In a few cases, public sector bodies sought information from organisations as well as individuals. In one case, for example, where both individuals and organisations were able to apply for grants, organisations were asked how many staff, board members, volunteers, etc. had a certain equality characteristic, with the options offered being the same as those offered to individual grant applicants. This point is returned to later in the report (see paragraph 7.18). Group level information was also sought in a few cases, for example, asking if anybody in a group of individuals had a disability.

4.5 Table 4.2 below presents the information gathered in this research in a slightly different way, by looking at the number of protected characteristics collected in each data collection. It can be seen that just one (1) collection contained information about all nine (9) protected characteristics. However, 15 collections contained information about seven (7) or more characteristics. By contrast, seven (7) collections contained information about two (2) or fewer characteristics.

Table 4.2 - Number of protected characteristics collected in each data collection
Number of protected characteristics    Number of data collections
0-2                                    7
3                                      13
4                                      7
5                                      8
6                                      5
7                                      10
8                                      4
9                                      1

Collection of information about socio-economic disadvantage

4.6 The indicators of socio-economic disadvantage considered in relation to this research were (i) area deprivation as measured by the Scottish Index of Multiple Deprivation (SIMD), and (ii) (any measure of) household income.

4.7 The SIMD classifies small geographical areas (called 'data zones') based on information across seven domains: income, employment, education, health, access to services, crime and housing. Thus, in order for SIMD analysis to be possible, postcode information is required to link an individual to the data zone in which they live.[31]
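
The linkage described above (postcode to data zone, then data zone to SIMD category) can be sketched as a two-step lookup. This is an illustration only: the postcodes, data zone codes and quintile values below are invented placeholders, and the function names are hypothetical. In practice the two tables would be populated from published postcode and SIMD lookup files.

```python
# Hypothetical lookup tables with invented placeholder values.
# Keys in the first table are postcodes with spaces removed, in upper case.
POSTCODE_TO_DATA_ZONE = {
    "ZZ11ZZ": "S01000001",
    "ZZ22ZZ": "S01000002",
}

# Hypothetical mapping from data zone code to SIMD quintile (1 to 5).
DATA_ZONE_TO_QUINTILE = {
    "S01000001": 1,
    "S01000002": 4,
}


def normalise_postcode(raw):
    """Remove all whitespace and upper-case the postcode for matching."""
    return "".join(raw.split()).upper()


def simd_quintile(raw_postcode):
    """Return the SIMD quintile for a postcode, or None if it cannot be linked."""
    data_zone = POSTCODE_TO_DATA_ZONE.get(normalise_postcode(raw_postcode))
    if data_zone is None:
        return None  # postcode missing from the lookup, so no SIMD analysis
    return DATA_ZONE_TO_QUINTILE.get(data_zone)


if __name__ == "__main__":
    for postcode in ["zz1 1zz", "ZZ2 2ZZ", "XX9 9XX"]:
        print(postcode, "->", simd_quintile(postcode))
```

The sketch also shows why holding a postcode is not the same as using it analytically: without the second lookup step, the postcode carries no information about area deprivation.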

4.8 As far as the collection of these data is concerned, it was found that:

  • 41 of the collections included postcode information - although it was often clear that the postcode was not being used by the organisation as an indicator of socio-economic disadvantage (i.e. the postcode information was not being used analytically).
  • 14 of the collections included information about household incomes.

4.9 Whilst it was not part of this study to explore indicators of socio-economic disadvantage other than SIMD category / postcode, or household income (see paragraph 4.6 above), it was found that in 15 of the data collections, other information relating to socio-economic disadvantage was collected.[32] This included information about household type; type of property; occupation (of the service user, or parental occupation in some cases); employment status; parental education; location of secondary school; and (non-income related) eligibility for assistance.

Collection of data about other 'equality' characteristics

4.10 In addition to the characteristics described above, the collections that were reviewed also gathered information on a range of other characteristics which can be seen as 'equality-related characteristics' in a broad sense, including:

  • (Household) vulnerabilities
  • Primary reason for contacting a service
  • Communication requirements (including need for an interpreter / preferred language / language spoken in the home / British Sign Language requirement)
  • Whether an individual is a Gaelic speaker
  • Whether an individual is care experienced / or is a care leaver
  • Whether a child is on the Child Protection Register
  • Caring responsibilities
  • Whether a university / college student is estranged from their family
  • Whether an individual is an armed services veteran.

4.11 Care experience / care leaver and caring responsibilities were the most frequently reported additional characteristics collected; these were particularly common in relation to education and other services with a primary focus on children and young people.

Equality characteristics - definitions and response categories

4.12 For each data collection containing information about a specific protected characteristic or socio-economic disadvantage indicator, the research ascertained (i) the question which was used (including any definitional issues), and (ii) the response categories offered.

4.13 The question wording, accompanying definitions, and response categories were diverse, detailed and often complex.

4.14 In general, there was a lack of standardisation in the way these items were collected, in terms of (i) the wording of / terminology used in the question, (ii) the definitions supplied to support the question, and (iii) the range of response categories offered. However, there were also many similarities in the questions asked, and the differences were often matters of detail.

4.15 Whilst some of the variation reflects the requirements of a specific data collection (for example, the definition of 'disability' employed and the response options offered may vary depending on whether the information is required to support decision making at individual level - e.g., in relation to a student's access to educational materials or a patient's requirement for transport to a health care setting), in many cases there was no obvious rationale for the differences.

4.16 It is also notable that there were variations within (as well as between) organisations in the way items were collected.

4.17 Two further general points should also be noted:

  • Interviewees often said that the Census 2011 questions had provided a reference point for gathering equality data. However, even where questions were clearly based on census questions, there were often slight variations in wording or phrasing used or the response options offered.
  • A 'prefer not to say' response option was often included, across the full range of equality characteristics. However, it was least likely to be offered in relation to age, and most likely to be offered in relation to religion, sexual orientation, and gender reassignment - with regard to these characteristics, there were very few examples of questions which did not offer this option. (Note that unless a question is set as mandatory, an individual can, of course, simply choose to not answer a question without a 'prefer not to say' option being offered.)[33]

4.18 Annex 6 presents an overview of the questions, definitions and response categories - and variations on these - in relation to each individual protected characteristic.

Response rates and data 'completeness'

4.19 It was not always possible to get information about the completeness of equality data held by public sector bodies. Completion / response rates often included 'prefer not to say' options and were also affected by whether a question was mandatory or voluntary. Organisations often reported good data completeness - 100%, or close to 100% - for questions or fields related to age, sex, and postcode. However, levels of completeness were much more varied, and occasionally much lower, for other characteristics - for example, a completion rate of 57% was reported for sexual orientation in one case, and in another a rate of 18% was reported for religion and belief.
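
Because reported completion rates often counted 'prefer not to say' as a response, the same field can yield quite different completeness figures depending on how that option is treated. The following is a minimal sketch of the two calculations. The records are invented for illustration, and the function name and the `count_prefer_not_to_say` parameter are assumptions rather than anything used by the organisations in the study.

```python
PREFER_NOT_TO_SAY = "prefer not to say"


def completeness(values, count_prefer_not_to_say=True):
    """Percentage of records holding a response in a field.

    Missing values (None or blank strings) never count as complete.
    'Prefer not to say' counts as complete only when
    count_prefer_not_to_say is True.
    """
    if not values:
        return 0.0

    def is_complete(value):
        if value is None or not str(value).strip():
            return False
        is_pnts = str(value).strip().lower() == PREFER_NOT_TO_SAY
        return count_prefer_not_to_say or not is_pnts

    complete = sum(is_complete(value) for value in values)
    return round(100 * complete / len(values), 1)


# Invented example records for a single field (not data from the study).
records = [
    "Heterosexual", "prefer not to say", None, "Gay or lesbian",
    "", "Prefer not to say", "Bisexual", "Heterosexual",
]

if __name__ == "__main__":
    print(completeness(records))                                 # 75.0
    print(completeness(records, count_prefer_not_to_say=False))  # 50.0
```

In this invented example, six of the eight records hold some response, but only four hold a substantive one, so the rate falls from 75% to 50% once 'prefer not to say' is excluded.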

4.20 Where organisations used separate equality monitoring forms, these typically had low response rates if their completion was voluntary. However, it was also noted by one interviewee that if an individual completed the form, they tended to complete all the questions.


