
The data linkage process

In Scotland, publicly-held data from different organisations can only be linked:

• if it is in the public interest to do so
• for clearly specified research and statistical purposes
• if data controllers (those responsible for the data) approve the linkage
• for the duration of the work as agreed by the associated data controllers (linked data is not held indefinitely)

In addition, the data linkage approval process must be open and accountable to the public. Only the minimum amount of data is linked to produce statistics or answer the research question proposed. This reduces any potential risk to privacy by limiting unauthorised sharing of large quantities of data.
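As a rough illustration of linking only the minimum amount of data, the sketch below trims each dataset to the fields needed for a research question before joining them. The datasets, the field names and the `minimise` and `link` helpers are hypothetical and are not taken from any real linkage system.

```python
def minimise(records, approved_fields):
    """Keep only the fields approved for the research question."""
    return [
        {field: record[field] for field in approved_fields if field in record}
        for record in records
    ]


def link(left, right, key):
    """Join two lists of records on a shared key, keeping matched records only."""
    right_by_key = {record[key]: record for record in right}
    return [
        {**record, **right_by_key[record[key]]}
        for record in left
        if record[key] in right_by_key
    ]


# Hypothetical datasets from two organisations, sharing an artificial "person_id".
health = [
    {"person_id": "p1", "admissions": 2, "postcode": "AB1 2CD", "gp_practice": "X"},
    {"person_id": "p2", "admissions": 0, "postcode": "EF3 4GH", "gp_practice": "Y"},
]
education = [
    {"person_id": "p1", "qualification_level": 6, "school": "S1"},
    {"person_id": "p3", "qualification_level": 4, "school": "S2"},
]

# Fields such as postcode, GP practice and school are dropped before linkage
# because this (assumed) research question does not need them.
linked = link(
    minimise(health, ["person_id", "admissions"]),
    minimise(education, ["person_id", "qualification_level"]),
    key="person_id",
)
```

In this sketch the linked result contains one record per matched person, holding only the approved fields.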

Privacy is a major consideration in any data linkage work, and the potential benefits of the statistical research must be weighed against the potential risk of a researcher being able to identify individuals. One approach is to anonymise the data that will be linked. There are different degrees of anonymisation. At one end is complete anonymisation, where all identifying data (for example, name, address, date of birth) is removed. At the other is what is termed ‘pseudonymisation’, which takes the most identifying fields within a dataset and replaces them with one or more artificial identifiers, or pseudonyms (for example, replacing a name with a unique identifier that is totally unrelated to the individual). Pseudonymisation is a common approach to maintaining privacy when undertaking data linkage for research and statistical purposes.
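A minimal sketch of pseudonymisation, assuming records are held as simple dictionaries, might look like the following. The field names, the `pseudonymise` function and the use of the name as the matching key are illustrative assumptions only; real linkage work would follow the procedures agreed with the data controllers.

```python
import secrets

# Hypothetical field names; the text gives name, address and date of birth
# only as examples of identifying data.
IDENTIFYING_FIELDS = ("name", "address", "date_of_birth")


def pseudonymise(records, key_field="name"):
    """Replace identifying fields with a random pseudonym.

    Returns (pseudonymised_records, lookup). The lookup maps each original
    key value to its pseudonym and is assumed to stay with the data
    controller, never with the researcher.
    """
    lookup = {}
    output = []
    for record in records:
        original = record[key_field]
        if original not in lookup:
            # A random token is unrelated to the individual it stands for.
            token = secrets.token_hex(8)
            while token in lookup.values():  # keep pseudonyms unique
                token = secrets.token_hex(8)
            lookup[original] = token
        cleaned = {k: v for k, v in record.items() if k not in IDENTIFYING_FIELDS}
        cleaned["pseudonym"] = lookup[original]
        output.append(cleaned)
    return output, lookup


# Made-up example records; the first and third refer to the same person.
records = [
    {"name": "A. Example", "address": "1 Sample Street",
     "date_of_birth": "1980-01-01", "visits": 3},
    {"name": "B. Example", "address": "2 Sample Street",
     "date_of_birth": "1975-06-15", "visits": 1},
    {"name": "A. Example", "address": "1 Sample Street",
     "date_of_birth": "1980-01-01", "visits": 2},
]
pseudonymised, lookup = pseudonymise(records)
```

Here the same person receives the same pseudonym across records, so their records can still be brought together, while the name, address and date of birth no longer appear in the pseudonymised data.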

A further precaution to maintain privacy is that researchers see only the ‘outputs’ from the data linkage, via one of Scotland’s safe havens. The process of linking data using this method is described in further detail below under ‘separation of functions’. This process ensures that:

• the data controllers remain in control of the original data
• researchers only see the final results of the linkage, but never the original datasets
• researchers only have access to the findings brought together to answer their specific research question (termed ‘outputs’)
• researchers can only access the ‘outputs’ in a secure environment with restricted access, termed a ‘safe haven’

Safe havens offer a secure physical location with strict operational standards to ensure that publicly-held data (including pseudonymised data) is handled safely, with appropriate levels of physical and electronic security. Only researchers who have been vetted, and who can demonstrate that they have undertaken appropriate training giving them the knowledge and skills required to undertake the data linkage in a legal, ethical and efficient manner, will be granted access.

This approach to data linkage is supported by the Guiding Principles for Data Linkage.

Additional safeguards are in place to further protect publicly-held data, ensuring that it is accessed in a responsible manner and that privacy is considered at every step of the data linkage process. These include:

• Proportionate risk management
• Separation of functions
• Statistical disclosure control