Joined-up data for better decisions: Guiding Principles for Data Linkage

These principles accompany the associated publication 'Joined-up data for better decisions: A Strategy for Improving Data Access and Analysis'. The guiding principles are a key element of the Data Linkage Framework for Statistical and Research Purposes. They are designed to support data custodians, researchers and other stakeholders in taking decisions about safe and effective data linkage and sharing.

The Principles

1. Public Interest

Protection of privacy, efficient use of data, and scientifically sound and ethically robust research and statistics are all in the public interest.

The public interest principles should be considered for all data linkage activity, regardless of the application of other principles.

1.1. The adequate protection of personal privacy should be a central consideration in all deliberations about the sharing and linkage of data.

1.2. The rights of individuals should be respected with adequate and appropriate privacy protection, recognising that data sharing and linkage is never risk-free. Acceptable risks are those that are proportionate to the benefits for all from the appropriate use of data for research and statistical purposes.

1.3. The production and dissemination of statistics through data linkage should be in accordance with the Code of Practice for Official Statistics, the Pre-release Access to Official Statistics (Scotland) Order 2008 and the National Statistician's Guidance on Confidentiality of Official Statistics.

1.4. Benefits arising from linkage of personal data are public goods and should be shared as widely as possible.

1.5. Where linkages resulting in commercial gain are envisaged, this should be clearly and publicly articulated and widely communicated.

2. Governance and Public Transparency

Clear decision making processes that are open and accountable to the public will help to ensure the appropriate balance of privacy protection, efficient use of data, and scientifically sound and ethically robust research and statistics.

The governance and public transparency principles should be considered for all data linkage activity, regardless of the application of other principles.

2.1. Data sharing and linkage should be carried out under transparent and proportionate controls and security processes, and the purposes and protection mechanisms should be communicated publicly and to oversight bodies/individuals with responsibility for data processing.

2.2. Information about all approved linkages, all privacy impact assessments and all data sharing agreements for linkage purposes, together with accessible summaries of plans for linked-data analysis, should be made publicly available.

2.3. All practices, including all data linkages, shall be appropriately monitored and regulated by a relevant individual, organisation or governance body. It is possible that these activities will be monitored at an individual and organisational level simultaneously.

2.4. There should be a clear distinction in roles between those carrying out linkages and analyses and those policing governance and enforcing sanctions.

2.5. As far as possible, account should be taken of the full range of stakeholder positions in the development and implementation of governance arrangements.

2.6. The interests of one (or a few) stakeholder(s) should not dominate use/linkages or the conditions of the same, especially where this might be at the expense of other stakeholder interests.

3. Privacy

The law does not give absolute value to privacy, and a balance is needed between respect for privacy, through the proportionate mitigation of risk, and the potential benefits to all through the use of data for statistical and research purposes.

Methods for mitigating risks to privacy include anonymisation and security. Where data subjects consent to their personal data being shared or linked, privacy risk must still be considered.

3.1. Data controllers should demonstrate their commitment to privacy protection through the development and implementation of appropriate and transparent policies and procedures and show how these operate relative to the public interest in promoting safe data sharing and linkage.

3.2. Every reasonable effort should be made to consider and minimise risks of identification (or re-identification) to data subjects and their families arising from all aspects of data handling.

3.3. Serious consideration should be given to carrying out Privacy Impact Assessments (PIAs), following the most up-to-date guidance from the Information Commissioner's Office (ICO). Where a PIA is not considered feasible or necessary, this should be clearly and publicly articulated. PIAs should be made publicly available (excluding sections as necessary for reasons of security) well ahead of linkage occurring, so that data subjects have an opportunity to raise concerns.

3.4. Linked datasets should be kept for the minimum time necessary for the original purpose of the linkage to be met. The onus is on those wishing to hold datasets for longer to justify this, e.g. by demonstrating that adequate anonymisation takes the data outside the remit of the data protection regime. If a secondary purpose arises, a new Privacy Impact Assessment should be considered, and data sharing agreements revised.

3a Consent

Consent of data subjects is an important consideration, although it is not a necessary requirement for data linkage under the Data Protection Act.

The consent principles should be departed from only where there is a strong justification and approval has been granted by an appropriate oversight body.

3a.i Where practicable, consent should be obtained from each data subject prior to the linkage of personal data for statistical and research purposes. Personal data are those from which an individual is identifiable or is likely to be identifiable.

3a.ii Where practicable, individuals or organisations collecting data should adequately inform data subjects of all material issues relating to the storage and use of their data. Material issues are those likely to affect a person in a non-trivial way.

3a.iii The minimum amount of personal data should be used to achieve the stated objective; the reasons and justification for its use should be adequate and clearly explained; and reasonable efforts should be made to inform data subjects of the purposes of the use.

3a.iv Where obtaining consent is not practicable, then (a) removal of direct identifiers should occur as soon as is reasonably practicable and/or (b) approval from an appropriate oversight body should be obtained which can confirm that the public interest in data linkage is met and appropriate safeguards are in place.

3b Anonymisation

There are degrees of data anonymisation and it may not be possible to completely remove the risk of re-identification. Nevertheless, data can be anonymised sufficiently for data controllers to make a reasonable risk-based judgement that data can be shared.

The anonymisation principles may have less importance if consent for linkage of non-anonymised data has been given or if linkage has been approved by an appropriate oversight body.

3b.i Procedures to link data should involve the separation of identifiers (e.g. name, or unique reference number) from the rest of the data, and consideration should be given to separating the indexing, linking and analysis functions and personnel.

3b.ii The linkage method used should be the one that requires the minimum necessary identifiable data.

3b.iii The default position should be that data users have access only to data from which names and direct identifiers have been removed, and data users should be subject to an obligation not to attempt to re-identify individual data subjects. Any requirement for researchers to have access to data containing identifiers should be fully justified and risk assessed.

3b.iv Data controllers should determine and agree upon the appropriate extent of anonymisation to be applied to any given dataset or linkage exercise. Particular consideration should be given to indirect identifiers (e.g. individual reference numbers), combinations of data (e.g. gender, date of birth and qualifications) and geo-references (e.g. postcode). The balance to be struck is between the level of risk to privacy and the likely benefits from linkage.

3b.v The risk of re-identification of data subjects, including risks arising from indirect means such as statistical disclosure, must be assessed by a body/individual with the relevant expertise to make such judgements.
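
As an illustration only, the separation of identifiers described in 3b.i and the attention to combinations of data described in 3b.iv could be sketched as follows. This is a minimal sketch under stated assumptions, not a prescribed method: the field names, the use of a keyed hash (HMAC-SHA256) as a link key, and the helper functions split_record and smallest_group_size are all hypothetical and do not come from the Framework. A small group size is only one indicator of re-identification risk and does not replace the expert assessment required by 3b.v.

```python
import hashlib
import hmac
from collections import Counter

# Hypothetical field names, chosen for illustration only.
DIRECT_IDENTIFIERS = ("name", "reference_number")
QUASI_IDENTIFIERS = ("gender", "date_of_birth", "postcode")


def split_record(record, secret_key):
    """Split one record into an identifier part and a payload part.

    The two parts share only a link key derived with a keyed hash, so the
    payload can be passed on without names or reference numbers.
    The secret key (bytes) is assumed to be held by the indexer alone.
    """
    id_string = "|".join(str(record[field]) for field in DIRECT_IDENTIFIERS)
    link_key = hmac.new(
        secret_key, id_string.encode("utf-8"), hashlib.sha256
    ).hexdigest()

    identifiers = {field: record[field] for field in DIRECT_IDENTIFIERS}
    identifiers["link_key"] = link_key

    payload = {
        key: value
        for key, value in record.items()
        if key not in DIRECT_IDENTIFIERS
    }
    payload["link_key"] = link_key
    return identifiers, payload


def smallest_group_size(payloads, quasi_identifiers=QUASI_IDENTIFIERS):
    """Return the size of the rarest combination of quasi-identifier values.

    A result of 1 means at least one record is unique on that combination.
    Returns 0 when there are no records.
    """
    counts = Counter(
        tuple(payload[field] for field in quasi_identifiers)
        for payload in payloads
    )
    return min(counts.values(), default=0)
```

In this sketch, a data controller might call split_record on each record before release and then call smallest_group_size on the resulting payloads to decide whether further generalisation (for example, replacing date of birth with year of birth) should be considered before the data are shared.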

3c Security

Security of data transfer, storage and use is vital for the protection of privacy, especially where there is any risk of re-identification.

3c.i Appropriate and proportionate physical and technical security measures should be applied to ensure the confidentiality, integrity and availability of information and should reflect the assessed risk level of information assets.

3c.ii All personnel involved in data linkage activities should be properly trained on the data security policies and procedures, and should undertake periodic refresher training.

3c.iii The importance of data security should be reflected in the business objectives of all organisations involved in data linkage.

3c.iv Information about data security policies and procedures should be highly visible within organisations conducting indexing or linking or sharing of personal data.

4. Access and Personnel

These principles are important for publicly demonstrating security and respect for privacy, and for avoiding any one person or organisation having access to large quantities of personal data.

4.1. Roles and responsibilities of parties with regard to data linkages should be identified from the outset, and all personnel involved in data linkages should be fully aware of their roles and responsibilities.

4.2. Terms and conditions for data sharing should be set out in the form of a data sharing agreement. Where researchers wish to deviate from or modify the terms of the data use/sharing agreement, new terms must be agreed by all parties.

4.3. All data recipients should be appropriately vetted to ensure they have adequate training. Vetting procedures should be robust and transparent and proportionate to the requests made and the sensitivity of the data requested.

4.4. Whether or not there is a single data controller, a clear distinction should be maintained between each of the functions of linker, indexer, and recipient. Linkers should be responsible only for linking data.

5. Clinical Trials

Data linkage as a method to support or enhance clinical trials presents specific requirements.

5.1. Mechanisms for linkages involving clinical trials must permit re-identification by the principal data source; this is particularly important for pharmacovigilance purposes.

5.2. The specific circumstances and conditions governing whether or not patients involved in clinical trials can be contacted, and by whom, should be clearly set out in transparent policies.

5.3. Researchers should only seek to contact participants directly with respect to information arising from a clinical trial in which they took part where prior consent to be contacted for specific purposes has been obtained. Any dilemmas in this regard should be referred to the ethics committee that approved the original protocol.

6. Sanctions

Where organisations or individuals break the law, legal sanctions apply. Other sanctions should be considered where the Guiding Principles are breached.

6.1. Sanctions for failure to respect terms and conditions should be clearly stipulated in the data use/sharing agreement and should be proportionate to the sensitivity and quantity of data in question.

6.2. Sanctions should relate to both the individual and the organisation, and should be both proportionate and specific in relation to financial penalty; period of time that the organisation will be refused further data access; and reports of improper use of data to senior management and/or funding bodies.


Email: Kirsty MacLean
