A Scotland-wide Data Linkage Framework for Statistics and Research: Consultation Paper on the Aims and Guiding Principles

The main purpose of this consultation is to seek views on the aims of the Data Linkage Framework and a draft set of guiding principles.

1. A brief explanation of data linkage for research and statistical purposes

In this framework, data linkage is the joining of two or more administrative or survey datasets in order to greatly increase the power of the analysis that is then possible with the data.

This framework is concerned exclusively with data linkage for research and statistical purposes. It does not cover the sharing of personal information about an individual between organisations in order to deliver a co-ordinated service to that person. The following examples are all beyond the scope of this framework:

  • A Child Protection Officer sharing a particular family's case file with a School and the Police, in order that all three can work together to protect a child at risk.
  • A Local Authority sharing information about named individuals claiming Housing Benefit with any other organisation for the purpose of combating fraud.
  • A GP sharing information about an individual patient's symptoms or diagnosis with a hospital in order that the patient receives a co-ordinated service from all parts of the NHS.

This framework is concerned exclusively with the linkage of data for research and statistical purposes where there is no direct impact on an individual because of information about that individual being linked. Many examples are given throughout section 2, and can be seen as falling into three categories:

  1. Development and production of Official Statistics, including development of alternatives to the Census and the production of aggregate statistical information.
  2. Production and dissemination of research resources, such as longitudinal statistical products like the Scottish Longitudinal Study.
  3. Ad-hoc research projects, or linkages conducted to answer specific research questions using statistical analyses, such as the West of Scotland Coronary Outcomes Prevention Study.

Example of how Data Linkage for Research and Statistical Purposes can work: The Scottish Health Survey

The Scottish Health Survey is a sample survey of approximately 6,000 adults and 2,000 children per year. It is conducted through face-to-face interviews, and respondents are asked about a wide range of health issues including smoking, alcohol intake, diet, levels of physical activity, self-assessed health and mental well-being, prescribed medicines, and symptoms of ill-health. Some biological measures are also taken, such as waist and hip circumference, height and weight, and blood, urine and saliva samples.

All aspects of the Scottish Health Survey, including data linkage, are approved by The National Research Ethics Service before being conducted.

All respondents are asked to consent to their name, address and date of birth being sent to the Information Services Division of NHS Scotland (ISD) so that their responses to the Health Survey can be linked with records holding data on medical diagnoses, in-patient and out-patient visits to hospital, and other information about cancer registration, GP registration and mortality.

Where the respondent gives consent for linkage the following process then occurs:

  • First, the respondents' names, addresses, dates of birth and a unique serial number for each respondent (which is different to that used on the publicly available survey dataset) are separated from the rest of the health survey dataset (all the responses to the health survey questions) and sent by the survey contractors to ISD.
  • ISD then link the respondents' names, addresses, dates of birth and unique serial numbers with the health records, and delete the names, addresses and dates of birth. This leaves a file of unique serial numbers and administrative health data. This file is then sent to a named analyst in Scottish Government.
  • The Scottish Government analyst then merges that file with the data collected through the Health Survey, using the unique serial number. The unique serial number is then deleted and a new random one is added.
  • This dataset is then analysed, results are checked for risk of disclosure, and the aggregate results and conclusions are disseminated as widely as possible.
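
The separation of roles in the steps above can be illustrated with a short sketch in Python. It is an illustration only, not a description of the systems actually used. The field names (serial, name, address, dob, consent), the function names and the example records are all hypothetical and invented for this sketch. It also assumes that records are matched on an exact agreement of name, address and date of birth, which is a simplification; real linkage may use other matching methods.

```python
import secrets

# Hypothetical field names for the personal identifiers.
ID_FIELDS = ("name", "address", "dob")


def contractor_split(survey_rows):
    """Step 1 (contractor): separate identifiers from the survey answers.

    Only consenting respondents' identifiers are passed on for linkage.
    """
    to_isd = [
        {"serial": r["serial"], **{f: r[f] for f in ID_FIELDS}}
        for r in survey_rows
        if r["consent"]
    ]
    answers = [
        {k: v for k, v in r.items() if k not in ID_FIELDS}
        for r in survey_rows
    ]
    return to_isd, answers


def isd_link(to_isd, health_records):
    """Step 2 (ISD): match on identifiers, then delete them.

    The output holds only serial numbers and health data.
    Assumption: exact matching on all three identifiers.
    """
    by_identity = {tuple(h[f] for f in ID_FIELDS): h for h in health_records}
    linked = []
    for person in to_isd:
        match = by_identity.get(tuple(person[f] for f in ID_FIELDS))
        if match is not None:
            health = {k: v for k, v in match.items() if k not in ID_FIELDS}
            linked.append({"serial": person["serial"], **health})
    return linked


def analyst_merge(answers, linked):
    """Step 3 (analyst): merge on serial, then replace it with a random ID.

    Rows with no linked health data are left out of this linked dataset
    (a simplification made for the sketch).
    """
    health_by_serial = {h["serial"]: h for h in linked}
    dataset = []
    for a in answers:
        health = health_by_serial.get(a["serial"])
        if health is None or not a["consent"]:
            continue
        row = {**a, **health}
        del row["serial"]
        del row["consent"]
        row["analysis_id"] = secrets.token_hex(8)  # new random identifier
        dataset.append(row)
    return dataset


# Invented example records, for illustration only.
survey = [
    {"serial": "S1", "name": "A Smith", "address": "1 High St",
     "dob": "1970-01-01", "consent": True, "smoker": False},
    {"serial": "S2", "name": "B Jones", "address": "2 Low Rd",
     "dob": "1985-05-05", "consent": False, "smoker": True},
]
health = [
    {"name": "A Smith", "address": "1 High St", "dob": "1970-01-01",
     "admissions": 2},
    {"name": "B Jones", "address": "2 Low Rd", "dob": "1985-05-05",
     "admissions": 0},
]

to_isd, answers = contractor_split(survey)
linked = isd_link(to_isd, health)
dataset = analyst_merge(answers, linked)
```

In this sketch no single function ever handles survey answers, health data and personal identifiers together: the contractor step sees identifiers and answers, the ISD step sees identifiers and health data, and the analyst step sees answers and health data joined only by a serial number.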

All data is sent between the three organisations by secure FTP (File Transfer Protocol) servers and can be accessed only by a small number of named people in each organisation.

None of the three organisations - Scottish Government, the contractor, or ISD - has access to both survey and health records with personal identifiers attached at any time.

This is one example of how linkage projects for statistics and research purposes can be conducted in a way that protects privacy. Other projects and systems may use different procedures.


Email: Andrew Paterson
