A Scotland-wide Data Linkage Framework for Statistics and Research: Consultation Analysis

Analysis of consultation responses to a Scottish Government consultation on the aims of the Data Linkage framework and a draft set of guiding principles.

Consultation Question 3: Guiding Principles

The consultation paper presented a set of draft guiding principles for data linkage activity and asked: Are the guiding principles sufficient and appropriate?

The table below shows that the majority of data custodian respondents stated that they thought the guiding principles were sufficient and appropriate, whereas data subjects were more likely to say they were not.

Are the guiding principles sufficient and appropriate?
Type of respondent              Yes    No    No answer
Data custodian                    8     3            0
Data user                        14    15            1
Data subject                      1     4            0
Multiple categories selected      1     2            5
No selection                      0     1            6
Total count                      24    25           12
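
The proportions behind the statement above can be checked with a short calculation. The following is a minimal sketch using only the counts in the table; the function name and the choice to exclude 'No answer' from the denominator are illustrative assumptions, not part of the consultation analysis.

```python
# Counts transcribed from the table above: (yes, no, no answer).
responses = {
    "Data custodian": (8, 3, 0),
    "Data user": (14, 15, 1),
    "Data subject": (1, 4, 0),
    "Multiple categories selected": (1, 2, 5),
    "No selection": (0, 1, 6),
}


def yes_share(counts):
    """Share of 'Yes' among those who answered (assumption: 'No answer' excluded)."""
    yes, no, _no_answer = counts
    answered = yes + no
    return yes / answered if answered else None


# Column totals, which should match the 'Total count' row of the table.
totals = tuple(sum(column) for column in zip(*responses.values()))

if __name__ == "__main__":
    for group, counts in responses.items():
        share = yes_share(counts)
        label = "n/a" if share is None else f"{share:.0%}"
        print(f"{group}: {label} answered 'Yes'")
    print("Totals (yes, no, no answer):", totals)
```

On these figures, 8 of the 11 data custodians who answered said 'Yes', compared with 1 of the 5 data subjects.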

General Comments

Respondents provided both general comments on the principles overall and comments on individual principles. The most common comment was that additional information and, in particular, definitions of specific terms are required. The terms for which definitions were sought included: every effort; proportionate; sound and robust; data controller; appropriate oversight body.

A number of respondents were broadly supportive of the principles but went on to suggest improvements. One respondent stated:

'A general comment is that the principles are all reasonable, however in places they read more like a set of detailed requirements rather than a set of guiding principles. Ideally the principles should leave more latitude for proportionality rather than be too detailed and prescriptive.' (The Medical Research Council)

The counter-view was also expressed, with one respondent proposing a number of changes which sought to strengthen the wording with the effect of making a number of the principles into requirements.

Another respondent stated that there was overlap between the principles and there would be scope for reducing them in number. In contrast, some of the respondents suggested that additional principles could be added to those presented in the consultation document. These additional principles related to the following areas:

  • Commitment to encourage the use of a single identifier on all databases.
  • Management of the risks of indirect as well as direct identification when data are disseminated as there is a potential increase in identifiability when datasets are combined.
  • Specification of the process to agree the nature of valid uses that can be made of the linked datasets and the approval mechanism to be applied to applications using the datasets, as well as any control mechanisms to be applied to such use.
  • Removal of personal identifiers as soon as they are no longer required - where they need to be kept they should be kept separate from the integrated dataset.
  • Assertion that the type of matching used should be the minimum needed and range of attributes used to establish a common identity should be the minimum necessary for the linking operation to succeed.
  • Development of minimum standards for secure management of information.
  • Management of data access (and risk of privacy and/or confidentiality breaches) for research projects, ensuring: confidentiality, data protection, information security, record management, data access agreements, and compliance with international information security standards.
  • Acknowledgement of the valuable contribution that research populations could make to research design. They can offer significant insights to research teams and help more effective dissemination.

A summary of the main comments on specific principles is provided below under the headings given to the principles in the consultation document. In the analysis of the comments on the principles, the type of respondent (data user, subject or custodian) was examined; however, in the majority of cases there was no relationship between the type of respondent and the principle commented on. Indeed, many of the comments were made by respondents who did not select any of the respondent type options.

Public Interest
Respondents made comments on the balance between individuals' rights to privacy protection and the public benefits from linking data. One respondent stated that principles 2 and 3 must explicitly state that the public interest does not override the individuals' right to privacy and the right to withhold consent. A further respondent requested clarification on the mechanism that would be used to achieve the balance.

There were a number of comments in relation to principle 5 (Where linkages resulting in commercial gain are envisaged, this should be clearly and publicly articulated and widely communicated). These included requests for clarification and further debate on the reference to 'commercial gain'.

It was also suggested by two respondents that principles for private sector data linkage might be different to those for public sector linkage.

Finally, one of the respondents suggested that findings from research should be widely publicly disseminated in a way which is accessible to the widest audience possible.

Governance and Transparency
Two respondents enquired how information about linkages would be made available to the public (principle 8). One further respondent suggested that, rather than making complete Privacy Impact Assessments and data sharing agreements available to the public, a subset of key information could be made available as the full documents are study specific and can be complicated.

In relation to monitoring and regulating practices (principle 9) one respondent suggested that there might be a risk of conflicting requirements, delays and duplications as a result of multiple overseers. There was also a query about how the costs of the governance body monitoring data linkages will be met. Another respondent suggested that there should be appropriate public representation for any organisation or governance body to ensure that the general public and patients have confidence in the use of their data for research.

For principle 14, one respondent stated that it is important that project planners have access to guidance on what measures are required to minimise risks of identification. One respondent expressed concern that applying all measures to achieve data privacy could be an overreaction and would make the data harder to use. In contrast, another respondent stated that the wording needs to be stronger to protect public privacy and that data linkage must never be allowed where there is a possibility of identification or re-identification.

A number of respondents stated that more information should be provided in relation to Privacy Impact Assessments (PIA). Specific comments and queries on PIAs included:

  • PIAs should be a requirement – rather than only suggesting that 'serious consideration' be given to their completion (principle 15).
  • The robustness of PIAs should be part of the evaluation process.
  • Where are the PIAs submitted to? Would it be the PAS?
  • Would the ICO's version of the PIA be used?
  • What are the implications and procedures if a PIA is not carried out?
  • Limited resources will have an impact on the consideration of completing a PIA – it could become common practice to not carry out a PIA if there is no sanction for not doing so.

One particular respondent stated:
'This, or relevant supporting information, should contain a clear statement about the extent and nature of Privacy Impact Assessments (PIAs) required; e.g. as part of a current cross sectoral data linkage work with which NSS is involved, a PIA of over 100 pages has been produced. By contrast, this is not the sort of PIA that has previously been required of academic researchers prior to granting access to de-identified linked data.' (NHS National Services Scotland)

Four respondents commented on principle 16 (Linked datasets should be kept for the minimal time necessary...), suggesting that there might be requirements to hold the data for longer for research purposes.

Removal of names and direct identifiers
Several respondents raised issues associated with indirectly identifying variables in relation to principles 17 to 19. It was suggested that the principles should include an explicit reference to indirect identifiers, as the removal of direct identifiers alone is not sufficient to guarantee anonymity. Furthermore, one respondent stated that the more datasets are linked, the easier it is to re-identify anonymous data subjects. One respondent suggested that a definition of 'direct identifier' would help, as de-identifying a dataset is complex.
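
The concern about indirect identifiers can be illustrated with a small calculation. The following is a minimal sketch, not a method proposed in the consultation: the records, variable names and the function `smallest_group` are all hypothetical and invented purely for illustration. It counts how many records share each combination of indirectly identifying variables; a group of size 1 means a record is unique on those variables even though no name or direct identifier is present.

```python
from collections import Counter


def smallest_group(records, variables):
    """Size of the smallest group of records sharing the same values
    for the given (hypothetical) indirectly identifying variables."""
    keys = [tuple(record[v] for v in variables) for record in records]
    return min(Counter(keys).values())


# Hypothetical, invented records with direct identifiers already removed.
records = [
    {"age_band": "30-39", "postcode_area": "EH", "occupation": "teacher"},
    {"age_band": "30-39", "postcode_area": "EH", "occupation": "nurse"},
    {"age_band": "30-39", "postcode_area": "G", "occupation": "teacher"},
    {"age_band": "30-39", "postcode_area": "G", "occupation": "teacher"},
    {"age_band": "40-49", "postcode_area": "EH", "occupation": "nurse"},
    {"age_band": "40-49", "postcode_area": "EH", "occupation": "nurse"},
]

if __name__ == "__main__":
    # Each additional linked variable can only keep groups the same size
    # or split them further, never merge them.
    for variables in (
        ["age_band"],
        ["age_band", "postcode_area"],
        ["age_band", "postcode_area", "occupation"],
    ):
        print(variables, "-> smallest group:", smallest_group(records, variables))
```

In this invented example every record shares its age band and postcode area with at least one other record, but once occupation is linked in, some records become unique. This is the sense in which combining datasets can increase identifiability.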

Two respondents sought clarification on the role of 'Data Controllers'. For the assessment of the risk of re-identification, two respondents requested information on what would constitute a 'suitable body'.

The highest number of comments on the principles were in relation to the consent principles, with 19 of the 61 respondents providing comments specifically relating to principles 20 to 23 in the consultation document. All types of respondents (data users, custodians and subjects) provided comments in relation to the consent principles.

Several respondents stated that additional guidance and definition of terms would be beneficial for the principles on consent. In particular, respondents requested further information on what could be deemed 'practicable' in relation to explicit consent, how this would be determined, and what would constitute an 'appropriate oversight body' (principle 23). One respondent suggested that an example could be included to demonstrate the application of the principles on consent.

Opposing views were expressed regarding informed consent. For example, one respondent argued:

'Informed consent (opt-in, not opt-out) must be at the heart of any good privacy-respecting system.' (No2ID)

With another observing:

''Opt-in' processes on a study by study basis would result in very low uptake. This usually invalidates and removes the reason for the linkage. Thus explicit opt-in consent should only be sought when there is an overwhelming reason to seek this in the interests of privacy.' (Anonymous)

There were also comments on the practicalities associated with explicit consent in terms of how questions seeking explicit consent are worded and how individuals are informed about uses of data. Two of the respondents suggested that the principles in the consent section might conflict with the Data Protection Act in relation to use of data without consent. A number of respondents stated that it would not be possible to obtain explicit consent for the data they already hold.

One respondent requested further information on the role of the National Data Linkage Centre in relation to principles 24 to 28, in terms of access, data management, retention, standards for storage and transfer, and information on how a security breach would be managed.

Access and Personnel
One respondent expressed concern at the statement in principle 33 that linkers should be separate from data custodians, as there are currently organisations which perform both functions, namely ISD and education services. Conversely, one respondent stated that principle 33 is ineffectual because 'a clear distinction' is open to any convenient interpretation, and suggested that where data from multiple controllers are linked the functions must be physically, technically, financially and organisationally separate. There was also a request for more information in relation to the 'robust governance mechanisms' referenced in principle 32.

Clinical Trials
In commenting on the principles associated with clinical trials, two respondents suggested that principles 34 to 36 should not apply only to health data and clinical trials, as the need for re-contact might arise in other scenarios. There were also requests from respondents for clarity on who has responsibility in relation to re-contact (principle 35). It was suggested by one respondent, in relation to principle 36, that improved data linkage is likely to increase the need for re-contact of individuals who have participated in a clinical trial.


Email: Michael Davidson
