Data linkage for research in Scotland

Data linkage allows for the true value of our existing data to be realised.


In Scotland, publicly-held data from different organisations can only be linked:

  • if it is in the public interest to do so
  • for clearly specified research and statistical purposes
  • if data controllers (those responsible for the data) approve the linkage

In addition, the data linkage approval process must be open and accountable to the public. Only the minimum amount of data needed to produce the statistics or answer the proposed research question is linked. This reduces the potential risk to privacy by limiting the scope for unauthorised sharing of large quantities of data.

Privacy is a major consideration in any data linkage work, and the potential benefits of the statistical research must be weighed against the potential risk of a researcher being able to identify individuals. One approach is to ‘pseudonymise’ the data that will be linked. This process takes the most identifying fields within a dataset and replaces them with one or more artificial identifiers, or pseudonyms (for example, replacing a name with a unique identifier that is totally unrelated to the individual). Pseudonymisation is a common approach to maintaining privacy when undertaking data linkage for research and statistical purposes.
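As an illustration only, pseudonymisation of this kind might be sketched in Python as follows. The function name `pseudonymise`, the field names and the sample records are all hypothetical and are not taken from any Scottish linkage service. The sketch assumes that pseudonyms are random tokens, so that they are unrelated to the individual, and that the lookup table connecting pseudonyms to identities stays with the data controller.

```python
import secrets


def pseudonymise(records, identifying_fields):
    """Replace the identifying fields of each record with a random pseudonym.

    Returns (pseudonymised_records, lookup). The lookup maps each pseudonym
    back to the identifying values; in this sketch it is assumed to remain
    with the data controller and is never passed to researchers.
    """
    identity_to_pseudonym = {}
    lookup = {}
    output = []
    for record in records:
        identity = tuple(record[field] for field in identifying_fields)
        if identity not in identity_to_pseudonym:
            # Random token: carries no information about the individual.
            pseudonym = secrets.token_hex(8)
            while pseudonym in lookup:  # guard against an unlikely collision
                pseudonym = secrets.token_hex(8)
            identity_to_pseudonym[identity] = pseudonym
            lookup[pseudonym] = dict(zip(identifying_fields, identity))
        # Keep only the non-identifying fields, plus the pseudonym.
        cleaned = {k: v for k, v in record.items() if k not in identifying_fields}
        cleaned["pseudonym"] = identity_to_pseudonym[identity]
        output.append(cleaned)
    return output, lookup


# Hypothetical sample data: two records refer to the same person.
records = [
    {"name": "Person A", "date_of_birth": "1980-01-01", "admissions": 2},
    {"name": "Person B", "date_of_birth": "1975-06-30", "admissions": 0},
    {"name": "Person A", "date_of_birth": "1980-01-01", "admissions": 1},
]
pseudonymised, lookup = pseudonymise(records, ["name", "date_of_birth"])
```

In this sketch the same person always receives the same pseudonym within a run, so their records can still be brought together, while the name and date of birth no longer appear in the data that moves on to linkage.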

A further precaution is that researchers do not link data themselves; they only see the ‘outputs’ from the data linkage, via one of Scotland’s safe havens. The process of linking data using this method is described in further detail below under ‘separation of functions’. This process ensures that:

  • the data controllers remain in control of the original data
  • researchers only see the findings brought together to answer their specific research question (termed ‘outputs’), and never the original datasets
  • researchers can only access the ‘outputs’ in a secure environment termed a ‘safe haven’ which has restricted access
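The idea that researchers receive only the ‘outputs’ can be sketched in Python as below. This is a simplified illustration under stated assumptions, not a description of how any safe haven actually operates: the function `link_for_research`, the field names and the sample rows are hypothetical, and both datasets are assumed to already share a common `pseudonym` key.

```python
def link_for_research(dataset_a, dataset_b, requested_fields):
    """Join two pseudonymised datasets and release only the requested fields.

    Assumes each row is a dict with a shared 'pseudonym' key. The linking
    party runs this function; the researcher receives only its return value.
    """
    b_by_pseudonym = {row["pseudonym"]: row for row in dataset_b}
    outputs = []
    for row_a in dataset_a:
        row_b = b_by_pseudonym.get(row_a["pseudonym"])
        if row_b is None:
            continue  # no match in the second dataset, so nothing to link
        merged = {**row_a, **row_b}
        # Release the minimum needed: only the approved fields, and not the
        # pseudonym itself unless it was explicitly requested.
        outputs.append({field: merged[field] for field in requested_fields})
    return outputs


# Hypothetical pseudonymised extracts from two data controllers.
health = [
    {"pseudonym": "p1", "admissions": 2},
    {"pseudonym": "p2", "admissions": 0},
]
education = [
    {"pseudonym": "p1", "qualification": "degree", "school_id": "S-17"},
]
outputs = link_for_research(health, education, ["admissions", "qualification"])
```

Here the researcher would see a single linked row containing only `admissions` and `qualification`; the unmatched record, the `school_id` field and the pseudonyms stay behind with the linking party.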

Safe havens offer a secure physical location with strict operational standards to ensure that publicly-held data (including ‘pseudonymised’ data) is handled safely, with appropriate levels of physical and electronic security. Access is granted only to researchers who have been vetted and can demonstrate that they have completed appropriate training, giving them the knowledge and skills required to undertake data linkage in a legal, ethical and efficient manner.

This approach to data linkage is supported by the Guiding Principles for Data Linkage.
