Public sector personal data sharing: framework and principles

This report considers frameworks and practices of providing access to personal data by public sector organisations to private organisations.

1. What is 'Personal Data'?

Data that relates to an identified or identifiable individual is called 'personal data'. Our report only looks at the pathways for sharing personal data. Personal data cannot be made freely available as open data unless it is first processed to become fully anonymized, so that individuals are not identifiable. However, fully anonymizing data is difficult, because pieces of data that are each non-identifying may still be linked together, or combined with other datasets, to re-identify individuals (Henriksen-Bulmer & Jeary 2016, Bampoulidis et al 2020). Personal data can also be 'pseudonymized'. This is where personally identifiable data fields are replaced by unique identifiers. For example, a researcher may need to know which hospital check-in records in a dataset relate to the same individual. Instead of releasing individual patient names or NHS/CHI numbers, this information can be replaced by a unique identifier, such as a random number or a random sequence of letters and numbers. This process means that less personal data is disclosed, and thus reduces the risk of sharing data. However, pseudonymized data is still a form of personal data under GDPR (which will be discussed in the next section).
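The hospital check-in example above can be illustrated with a minimal sketch in Python. It is an illustration under stated assumptions, not a description of how any particular organisation pseudonymizes data: the function name `pseudonymize`, the field names and the sample records are all hypothetical, and a real process would involve further safeguards.

```python
import secrets


def pseudonymize(records, id_field):
    """Replace the direct identifier in each record with a random pseudonym.

    The same individual always receives the same pseudonym, so records can
    still be linked to one another without disclosing who they belong to.
    """
    lookup = {}    # real identifier -> pseudonym; to be held separately and securely
    released = []  # records that are safe(r) to share
    for record in records:
        real_id = record[id_field]
        if real_id not in lookup:
            # 16 random hexadecimal characters, unrelated to the real identifier
            lookup[real_id] = secrets.token_hex(8)
        # Copy every field except the direct identifier
        safe = {key: value for key, value in record.items() if key != id_field}
        safe["pseudonym"] = lookup[real_id]
        released.append(safe)
    return released, lookup


# Hypothetical check-in records; the identifiers are invented for illustration.
checkins = [
    {"patient_id": "A-001", "hospital": "North", "date": "2021-03-01"},
    {"patient_id": "B-002", "hospital": "North", "date": "2021-03-02"},
    {"patient_id": "A-001", "hospital": "South", "date": "2021-04-10"},
]

released, lookup = pseudonymize(checkins, "patient_id")
```

In this sketch the first and third released records carry the same pseudonym, so a researcher can tell that they relate to the same individual without seeing the original identifier. Anyone holding the lookup table can reverse the process, which illustrates why pseudonymized data is still treated as personal data.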

Personal data also includes 'special category data', which covers information relating to a person's race or ethnic origin, sexual orientation, political opinions, religious beliefs, trade union membership and health data (including genetic and biometric data). A full list is given in Table 1. These types of data are defined under UK GDPR as particularly sensitive and so require extra protection.

Terms surrounding data and data sharing are sometimes used in different ways; for example, 'anonymized' and 'pseudonymized' are sometimes used interchangeably. To clarify how terms are used in this report, Table 1 provides a summary of the key terms.

Table 1: Key data terms used in this report

Personal data

Data that relates to an identified or identifiable individual. This can be directly identifiable information, or information about individuals that can be indirectly identified through combining it with other information (UK Information Commissioner's Office n.d.b). Understood as such, this includes pseudonymized data.

Special category data

This is a special category of personal data which is defined under UK GDPR as:

  • "personal data revealing racial or ethnic origin;
  • personal data revealing political opinions;
  • personal data revealing religious or philosophical beliefs;
  • personal data revealing trade union membership;
  • genetic data;
  • biometric data (where used for identification purposes);
  • data concerning health;
  • data concerning a person's sex life; and
  • data concerning a person's sexual orientation."

(UK Information Commissioner's Office n.d.c).

Data controllers

Data controllers exercise control over the purposes for which personal data is processed and make the decisions about that processing (UK Information Commissioner's Office n.d.d).

Data processors

Data processors "act on behalf of, and only on the instruction of, the relevant controller" (Ibid.)

Anonymized data

Data from which all identifying information has been removed. Anonymized data cannot be linked back to an individual; because this standard is so strict, fully anonymized data is hard to achieve.

Pseudonymized data

This is personal data that has undergone further processing to remove directly identifying information and replace it with artificial identifiers.



