Unlocking the value of data - Independent Expert Group: final report

This report is the final output of the Independent Expert Group (IEG) on the Unlocking the Value of Data programme, delivered to the Scottish Government. It is a Ministerial commission, originally commissioned by the former Minister for Business, Trade, Tourism and Enterprise.

3. Annex: Context

The Scottish Government's Unlocking the Value of Data programme, and the IEG which forms part of it, take place within a broader context in Scotland, the UK, Europe and internationally of personal data use and sharing, digital policymaking, and ethical and political discussions on data. Here we provide an illustration of this context. These topics are vast and we cannot cover them exhaustively; instead, we give a taste of some of these themes and their relevance to the IEG.

There is extensive debate and activity around data access, sharing and use, and how to value it. Some of this relates to non-personal data (which includes data about deceased individuals and anonymous data), which is not the topic of the IEG's work; instead, we have been looking at personal data, which is defined in the UK GDPR (Article 4(1)) as:

any information relating to an identified or identifiable natural person ('data subject'); an identifiable natural person is one who can be identified, directly or indirectly, in particular by reference to an identifier such as a name, an identification number, location data, an online identifier or to one or more factors specific to the physical, physiological, genetic, mental, economic, cultural or social identity of that natural person

We have been considering personal data held by the public sector in Scotland. This includes personal data collected from interactions with the NHS, the Scottish Government and local authorities in Scotland (including local council services such as education, social care and council tax payments). UK public sector organisations also collect personal data about people in Scotland, such as through immigration and the benefits system. While we would like these agencies to follow our policy statement, principles and recommendations, their activities relate to reserved powers and are beyond the jurisdiction of the Scottish Government.

Here we are also specifically considering access to public sector personal data by the private sector. The private sector can broadly be defined as organisations that are usually profit-making companies, spanning a wide range of sectors from pharmaceutical companies to supermarkets, energy companies and farms. The boundaries between the public sector, the private sector and other sectors such as the third sector (which includes charities, community organisations and universities) are not always clear. We restrict our analysis to private sector organisations' access to public sector personal data as this is the mandate the Scottish Government gave us, but we consider that many of the points we make could equally apply to public sector organisations accessing other public sector organisations' personal data.

3.1 Data categories and types

Personal data can be further divided into various categories, some of which appear in data protection law and some of which do not. Data protection law (UK GDPR Art 9) recognises 'special category personal data' as:

personal data revealing racial or ethnic origin, political opinions, religious or philosophical beliefs, or trade union membership, and the processing of genetic data, biometric data for the purpose of uniquely identifying a natural person, data concerning health or data concerning a natural person's sex life or sexual orientation.

The principles of data protection do not apply to the processing of 'anonymous' data, including for statistical or research purposes. UK GDPR Recital 26 defines anonymous information as:

information which does not relate to an identified or identifiable natural person or to personal data rendered anonymous in such a manner that the data subject is not or no longer identifiable.

Pseudonymous data is considered to be personal data (and therefore falls within the scope of data protection law), and is also explained in UK GDPR Recital 26:

Personal data which have undergone pseudonymisation, which could be attributed to a natural person by the use of additional information should be considered to be information on an identifiable natural person. To determine whether a natural person is identifiable, account should be taken of all the means reasonably likely to be used, such as singling out, either by the controller or by another person to identify the natural person directly or indirectly. To ascertain whether means are reasonably likely to be used to identify the natural person, account should be taken of all objective factors, such as the costs of and the amount of time required for identification, taking into consideration the available technology at the time of the processing and technological developments.
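To make the Recital's logic concrete, a minimal, hypothetical sketch follows (the identifier, key and record values are all invented, and this is not a production scheme). Replacing a direct identifier with a keyed hash produces pseudonymous, not anonymous, data, because the key is precisely the 'additional information' by which a person can be re-identified:

```python
import hmac
import hashlib

# Toy sketch only: the CHI-style number and the key below are invented.
# The secret key is the "additional information" that still allows
# re-identification, which is why pseudonymised data remains personal
# data under the UK GDPR.
SECRET_KEY = b"held-separately-by-the-data-controller"

def pseudonymise(identifier: str) -> str:
    """Derive a stable pseudonym from an identifier using HMAC-SHA256."""
    return hmac.new(SECRET_KEY, identifier.encode(), hashlib.sha256).hexdigest()

record = {"chi_number": "0101011234", "diagnosis": "asthma"}
pseudonymised_record = {
    "patient_pseudonym": pseudonymise(record["chi_number"]),
    "diagnosis": record["diagnosis"],
}

# The mapping is deterministic, so records about the same person can still
# be linked across datasets -- and reversed by whoever holds the key.
assert pseudonymise("0101011234") == pseudonymised_record["patient_pseudonym"]
```

The determinism that makes pseudonymised data useful for linkage across datasets is the same property that keeps it within the definition of personal data.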

Other terms are used with regard to personal data, especially in research contexts, such as 'raw data', 'aggregated data' and 'synthetic data'. These terms are not legally defined. Data described by any of these terms may still constitute personal data if the requirements above from data protection law are met.

Raw data is data that has not been processed for use. Aggregated data, according to IBM, is the outcome of a process 'where raw data is gathered and expressed in a summary form for statistical analysis'.
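As a toy illustration of that summary process (all names and values are hypothetical), individual-level raw records can be aggregated into per-group counts and means:

```python
from collections import defaultdict
from statistics import mean

# Toy sketch with invented values: individual-level 'raw' records are
# aggregated into summary statistics per (hypothetical) health board.
raw_records = [
    {"health_board": "Lothian", "age": 34},
    {"health_board": "Lothian", "age": 58},
    {"health_board": "Grampian", "age": 47},
]

groups = defaultdict(list)
for row in raw_records:
    groups[row["health_board"]].append(row["age"])

# The summary form no longer lists individuals row by row -- though whether
# it is truly anonymous still depends on identifiability (e.g. a group of
# size one can point to a single person).
aggregated = {
    board: {"count": len(ages), "mean_age": mean(ages)}
    for board, ages in groups.items()
}
```

Note that aggregation does not automatically make data anonymous: small group counts, as in the single-record group above, can still identify individuals.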

Synthetic data, according to the ICO, is 'data which does not relate to real people, it has been generated artificially'. If this data cannot be 'related to identifiable living individuals', it will not be personal data. However, synthetic data is likely to have 'real' data as its basis; if that underlying data is personal data, then its processing must comply with data protection law. Furthermore, according to the ICO:

it may be possible to infer information about the real data which was used to estimate those realistic parameters, by analysing the synthetic data. For example, if the real data contains a single individual who is unusually tall, rich, and old, and your synthetic data contains a similar individual (in order to make the overall dataset statistically realistic), it may be possible to infer that the individual was in the real dataset by analysing the synthetic dataset. Avoiding such re-identification may require you to change your synthetic data to the extent that it would be too unrealistic to be useful for machine learning purposes.
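The ICO's outlier example can be illustrated with a deliberately naive, hypothetical synthesiser (all ages invented) that preserves the real data's range so the synthetic dataset stays 'statistically realistic' at the extremes:

```python
# Toy sketch of the ICO's outlier point, with invented ages.
real_ages = [31, 35, 38, 40, 42, 104]  # one unusually old individual

def naive_synthesise(values, n):
    """Generate n synthetic values spanning the real data's exact range."""
    lo, hi = min(values), max(values)
    step = (hi - lo) / (n - 1)
    return [round(lo + i * step) for i in range(n)]

synthetic_ages = naive_synthesise(real_ages, 50)

# The unusual value survives into the synthetic data: anyone who knows of
# a person aged ~104 in the population can infer that this person was in
# the real dataset, even though no 'real' record was released.
assert max(synthetic_ages) == max(real_ages)
```

As the ICO notes, removing such signals may require distorting the synthetic data to the point where it is no longer realistic enough to be useful.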

3.2 Relevant laws

Our work takes place in a complex and multi-layered legal and policy environment. In Scotland, we have three relevant levels of government: local authorities, the devolved administration (the Scottish Government) and the UK Government. The competences of the UK and Scottish Governments are part of the devolution settlement and governed by the Scotland Act 1998. Until 2021, European Union (EU) law also applied in the UK, including Scotland, but since the UK left the EU, this has ceased to be the case unless the specific law is 'retained'. There are certain key areas of legislation which relate to personal data and data sharing, namely data protection law, equality and human rights law, and the Digital Economy Act. There is also the common law of confidentiality in Scotland, and more recently a common law right to privacy was recognised by the Court of Session.

3.2.1 Data protection law

Data protection law, which governs the processing of personal data, originates in EU law, notably the General Data Protection Regulation (GDPR), which the UK implemented before it left the EU. Data protection law is highly significant for our work given that the subject-matter of the IEG is 'personal data'. Currently, data protection is a matter 'reserved' to the UK Government, and the GDPR remains implemented in the UK post-Brexit via the UK GDPR in conjunction with the Data Protection Act 2018 (although this may be subject to change with the proposed Data Protection and Digital Information (DPDI) Bill). Organisations in Scotland, whether public, private or third sector, must comply with UK data protection law, and the UVOD and IEG work takes place within the framework of UK data protection law.

Within the definition of 'personal data' there is a subset termed 'special category personal data', listed in Art 9 UK GDPR (quoted in full above). Some personal data held by the Scottish public sector will constitute special category personal data, which is subject to further requirements and restrictions on its processing and use.

EU data protection standards, while not perfect, do represent a current global 'gold standard' or 'best practice' in the absence of substantive international law on this topic.

Data protection law allows the processing of personal data so long as certain conditions are met. Contrary to a commonly-held belief, data protection law does not always require the consent of individuals to process their data. There are six lawful bases for processing personal data, of which consent is one. Consent may not always be the appropriate lawful basis for processing.

Data protection law also does not prohibit per se the accessing of personal data by third parties. This is possible so long as its requirements are met. As current data protection law has been in force since 2018, Scottish public sector organisations are experienced in complying with these rules and standards, and compliance remains fundamental to how personal data is handled. Data protection compliance in the UK is overseen by the regulator, the Information Commissioner's Office (ICO), which also issues guidance on data protection issues.

The ICO has issued guidance on the research provisions of data protection law. It states that:

These provisions recognise the importance of scientific and historical research and technological development to society. They ensure that data protection requirements enable technological innovation and the advancement of knowledge.

'Scientific research' is elaborated on in Recital 159 UK GDPR:

the processing of personal data for scientific research purposes should be interpreted in a broad manner including for example technological development and demonstration, fundamental research, applied research and privately funded research.

Private sector use of public sector personal data within the UVOD programme's scope is likely to fall within this definition of research. The ICO has an indicative list of criteria for scientific or historical research and an indicative list as to what would constitute 'statistical purposes'.

Data protection law contains various research-related exceptions to data subjects' rights, such as within the right to be informed where data is collected from a source other than the individual (UK GDPR Art 14(5)(b)) and within the right to erasure (UK GDPR Art 17(3)(d)).

Two of the data protection principles in UK GDPR Art 5 (purpose limitation and storage limitation) contain research-related provisions: personal data can be further processed for research purposes without this being considered incompatible with the original purposes, and can be stored indefinitely so long as appropriate measures are in place to safeguard the rights and freedoms of data subjects.

For research, the usual lawful bases used are 'public task' or 'legitimate interests'. For special category personal data, both a lawful basis for processing and a special category condition for processing in compliance with UK GDPR Art 9 are needed. This processing of special category data must also be 'in the public interest'. Public interest is not defined in the legislation, but the ICO says: 'you should broadly interpret public interest in the research context to include any clear and positive public benefit likely to arise from that research'. The ICO provides some indicative examples of what may constitute public interest/public benefit processing and also notes that the avoidance of harm is 'a key factor in determining whether or not your research is in the public interest'.

Section 19(2) DPA 2018 stipulates that the research provisions cannot be used if the processing is likely to cause substantial harm or substantial distress to a data subject. This is one of a number of safeguards (in Art 89 UK GDPR and section 19 DPA 2018) which must be put in place in order to use the research provisions of data protection law. Other safeguards include:

  • technical and organisational measures to ensure respect for data minimisation;
  • the use of anonymous information where possible;
  • where not possible the use of pseudonymous information; and
  • not carrying out research for the purposes of measures or decisions about particular people (unless the research is approved medical research).

Among the technical and organisational measures, the ICO suggests, among others:

  • the carrying out of data protection impact assessments (DPIAs) where necessary;
  • the use of privacy enhancing technologies such as trusted research environments (TREs - see below for more detail); and
  • accountability frameworks such as the Five Safes (see below for more detail).

Data protection impact assessments, or DPIAs, are 'a process to help you identify and minimise the data protection risks of a project' and must be carried out if processing is 'likely to result in a high risk' to individuals. The ICO also considers it 'good practice' to carry out DPIAs 'for any other major project which requires the processing of personal data'.

The ICO has obligations under the Data Protection Act 2018 to produce various codes of practice, including one on Data Sharing, whose current version is from 2021. The Data Sharing Code aims:

to give individuals, businesses and organisations the confidence to share data in a fair, safe and transparent way in this changing landscape. This code will guide practitioners through the practical steps they need to take to share data while protecting people's privacy.

Among various proposed reforms, including changing the definition of 'personal data', the DPDI Bill also has provisions on research. According to the UK Government Department for Science, Innovation and Technology (DSIT), which has responsibility for data protection:

Unleashing more scientific research

Current data laws are unclear on how scientists can process personal data for research purposes, which holds them back from completing vital research that can improve the lives of people across the country.

The Bill has updated the definition of scientific research to clarify that commercial organisations will benefit from the same freedoms as academics to carry out innovative scientific research, such as making it easier to reuse data for research purposes. This will reduce paperwork and legal costs for researchers, and will encourage more scientific research in the commercial sector. The definition of scientific research in the new Bill is non-exhaustive, in that it remains any processing that 'could reasonably be described as scientific' and could include activities such as innovative research into technological development.

These proposals have received differing receptions from different stakeholder groups. Concerns have been raised by some commentators (e.g. Dr Chris Pounder of Amberhawk) that this may lead to 'unethical' research taking place. Questions have also been raised about whether the UK will retain its adequacy decision with the EU if the DPDI Bill reforms are implemented.

3.2.2 Human rights and equality law

The UK, and Scotland as a constituent part of it, remains a party to the European Convention on Human Rights (ECHR, which is a separate legal regime from EU law). ECHR rights are given some effect in the UK via the Human Rights Act 1998. Compliance with the Human Rights Act 1998 is a condition of the Scottish Parliament passing legislation, as per the Scotland Act 1998. There are other pieces of legislation such as the Equality Act 2010 (which implements anti-discrimination rights and the Public Sector Equality Duty (PSED)) and the aforementioned Data Protection Act (which implements the right to privacy from Article 8 ECHR).

The Scottish Parliament passed a bill to incorporate the UN Convention on the Rights of the Child into Scots law in 2021, but this is being challenged by the UK Government at the time of writing. The Scottish Government has also committed to introducing a Human Rights Bill for Scotland, incorporating four more UN treaties (the International Covenant on Economic, Social and Cultural Rights; the Convention on the Elimination of All Forms of Discrimination against Women (CEDAW); the Convention on the Elimination of All Forms of Racial Discrimination (CERD); and the Convention on the Rights of Persons with Disabilities (CRPD)) and including the right to a healthy environment, and rights for older people and LGBTI people.

The human rights legislation in force is relevant to personal data, to ensure that personal data is handled in ways which do not cause discrimination or infringe other human rights. Public sector bodies making decisions about data access must also adhere to the positive legal obligations designed to advance equality placed on them by the Public Sector Equality Duties, both within the Equality Act 2010 and the associated Scotland specific legal regulations.

The Equality Act protects against discrimination, victimisation and harassment due to one or more of nine protected characteristics: age; disability; gender reassignment; marriage and civil partnership; pregnancy and maternity; race; religion or belief; sex; and sexual orientation. The PSED requires public authorities to have due regard to the need to eliminate discrimination, advance equality of opportunity and foster good relations between different groups when they are carrying out their activities. In accordance with the Equality Act 2010 (Specific Duties) (Scotland) Regulations 2012, Scottish public sector organisations must carry out Equality Impact Assessments (EQIAs) 'to assess the impact of applying a proposed new or revised policy or practice against the needs of the general equality duty'. The Equality and Human Rights Commission publishes guidance on the Equality Act 2010, including specific guidance on the relationship between PSED and data protection law.

3.2.3 Digital Economy Act

The UK-wide Digital Economy Act 2017 (henceforth DEA) includes provisions in Part 5 on data sharing in the public sector, in the context of five chapters: (i) public service delivery; (ii) civil registration; (iii) debts owed to the public sector; (iv) fraud against the public sector; and (v) data sharing for research purposes. Data protection law must also be complied with in these circumstances. The DEA provisions are supplemented by statutory codes of practice. The DEA does not cover the sharing of health and social care data.

In 2020, the Scottish Government consulted on a list of Scottish public authorities to be considered for inclusion in the debt and fraud schedules of the DEA. A further consultation was conducted later in 2020. These consultations fed into the development of the Digital Government (Scottish Bodies) Regulations 2022 which added the Scottish bodies to the schedules. This provides the bodies with access to the powers to share information to better manage debt and fraud against the public sector. Data can only be shared in accordance with the specific purposes set out in the DEA and not for other purposes. Public bodies must have regard to a Code of Practice which provides detail on how these powers should operate.

3.2.4 NHS data sharing

The Scottish public sector holds personal data on a number of topics and issues. Prominent among those data are health and social care data from NHS services in Scotland.

For general NHS data processing, including sharing, the relevant legislation and guidance include:

  • statute law, including the aforementioned data protection law (including the Data Protection Act 2018 and UK GDPR) and Human Rights Act 1998, as well as the National Health Service (Scotland) Act 1978, the Infectious Disease (Notification) Act 1889, the Adults with Incapacity (Scotland) Act 2000 and the Abortion Act 1967, among other pieces of legislation;
  • the common law in Scotland on confidentiality (which, in summary, requires either consent or a legal or public interest requirement for disclosure);
  • professional standards such as the Good Medical Practice principles for doctors, and equivalent professional standards for other registered professions; and
  • the policies and organisational standards of the Scottish Government (Directorate of Health and Social Care) and NHS Scotland including Chief Medical Officer guidance.

3.2.5 Beyond the law

Our work proceeds on the basis that the aforementioned legal requirements in data protection, equality and human rights and other areas will be adhered to by all involved in private sector access to public sector personal data. We acknowledge that this is not always the case in practice. For instance, the ICO reprimanded the Scottish Government and NHS National Services Scotland in 2022 over the NHS Scotland COVID Status app, specifically the information provided to the public in the app's Privacy Notice about how their data would be used. In 2022, we also saw the Department for Education in Westminster reprimanded by the ICO for its 'poor due diligence' in permitting access to a database of personal data by an employment screening company, Trust Systems Software UK (Trustopia), which then used the database to build age verification systems for online gambling.

Even if we assume that all organisations in all sectors fully comply with data protection law, and other relevant laws such as the Human Rights Act and Equality Act, we also have to look beyond these pieces of legislation in order to understand how and whether private sector organisations should be able to access public sector personal data. For one, data protection law may leave some discretion on how certain provisions could be complied with, which opens up an ethical choice between two compliant outcomes, one of which may be more ethical than the other (O'Keefe & O'Brien, 2018). There are also a number of ethical, social and political issues which relate to this topic that are not clearly covered or resolved by the law as it stands. Among these are the role of the public and the value that the use of personal data could bring. Furthermore, data and corporate infrastructures may be international or transnational, whereas legislation is regionalised to particular jurisdictions.

However, before we look in more detail at ethical issues, we now look at a series of policies around data and digital issues in Scotland which help implement and explain some of the legal requirements.

3.3 Policy

The Scottish Government has a number of policies in relevant areas for our work, some of which we refer to below.

3.3.1 Open Data

Scotland has had an Open Data strategy since 2015. Open data is non-personal and non-commercially sensitive, and so would exclude personal data held by the public sector, unless it is anonymised. The aim of the strategy is:

to create a Scotland where non-personal and non-commercially sensitive data from public services is recognised as a resource for wider societal use and as such is made open in an intelligent manner and available for re-use by others.

The Strategy envisaged that making data open would achieve the following:

1) Delivery of improved public services through public bodies making use of the data

2) Wider social and economic benefits through innovative use of the data

3) Accountability and transparency of delivery of our public services

In lieu of a single national official portal, a volunteer-run portal, Open Data Scotland, helps people to find open data in Scotland held by a range of public sector organisations, including local councils and Scottish Government agencies. More information about the importance of open data to a healthy data use ecosystem, along with a list of recommendations, was published by the David Hume Institute in early 2022 (Watt, 2022). A Statement released from the Scottish Open Data Unconference in 2022 recognises the Scottish Government's public commitment to open data but argues that its implementation 'lags far behind what should be delivered', as 'the majority of public bodies… publish no, or very little open data'. The Statement also points to a lack of clear leadership, accountability and responsibility over open data in Scotland, and to deficiencies in the open data which is published.

3.3.2 Data and Intelligence Network

In May 2020 at the beginning of the COVID-19 pandemic, the Scottish Government established the Data and Intelligence Network (D&IN), a 'community of data experts' from across the Scottish public sector and academia whose aim is to provide evidence and analysis to inform decision-making and governance on data in the pandemic context. Among its more specific aims are to ensure information security and ethical data use in its projects, to develop 'frameworks and guidance on the data ecosystem, public participation and ethics', and to combine 'data from across the public sector, to generate actionable insights to make improvements for the people of Scotland, in a safe and transparent way, trusted by the public'.

The Data and Intelligence Network produced an Ethics Framework in 2021:

a set of values and principles that can be used by the D&IN either to apply to strategic decisions or to help frame problems or solutions for which members of the D&IN are seeking to use data or digital technology

The Values are: Competency; Transparency; Fairness; Purpose; Trust; and Voice and Agency. The Principles are: Responsible; Accountable; Insightful; Necessary; Beneficial; Observant; and Widely Participatory.

3.3.3 Digital Strategy

In February 2021, the Scottish Government published its Digital Strategy, A changing nation: how Scotland will thrive in a digital world, which emphasised the Scottish Government's aspirations to be an 'Ethical Digital Nation' which engenders trust in how it uses data and digital technologies. It set out its vision as:

… a society where people can trust public services and businesses to respect privacy and be open and honest in the way data is being used. But this is about more than the use of data. It is about trust, fair and rewarding work, democratic, social and cultural inclusion, climate change, the circular economy and making sure that the raw materials used in production are ethically sourced.

A place where children and vulnerable people are protected from harm. Where digital technologies adopt the principles of privacy, resilience and harm reduction by design and are inclusive, fair and useful. This is not simple, nor quick work – but it is what we must work towards.

The Scottish Government convened an independent expert group in digital ethics which published a report, Building Trust in the Digital Era: Achieving Scotland's Aspirations as an Ethical Digital Nation, in November 2022. Like the IEG, this was supplemented by public engagement in the form of 'a broadly representative group of 30 people from across Scotland to learn, discuss and deliberate on key aspects of digital ethics'. The report produced a series of recommendations including on environmental aspects of digital ethics, which is referred to above.

3.3.4 AI Strategy

The Scottish Government's AI Strategy for Scotland has been in development since 2021, as part of a partnership with the Data Lab and a multistakeholder steering committee and working groups. The Strategy sets out a vision and principles for AI in Scotland, with the aim of making Scotland 'a leader in the development and use of trustworthy, ethical and inclusive AI'. One of the actions the AI Strategy aims to deliver is to:

Secure safe, proportionate and privacy-preserving access to data for research and innovation in the public interest, including Open Data and Research Data Scotland

Public sector personal data may be desirable for training AI, especially machine learning models. Work conducted by the GRAIMatter project at the University of Dundee in 2022 looks at this issue in more detail in the context of data from TREs/Safe Havens, and makes a series of recommendations about how TREs could better accommodate this research while continuing to ensure the privacy and security of TRE data (Jefferson et al., 2022).

3.3.5 Health and social care: data strategy

In February 2023, the Scottish Government and COSLA published the first data strategy for health and social care data, Greater access, better insight, improved outcomes: a strategy for data-driven care in the digital age. As mentioned many times in this document, health data is a very significant and valuable kind of personal data held by the Scottish public sector. While the Scottish public sector also holds personal data unrelated to health and social care, this strategy is nonetheless very important given the significance and value of health and social care data.

Among other things, the Strategy 'sets a framework for the ethical, transparent use of data by health and social care providers' and introduces a 'shared set of ethical principles' which would apply to all kinds of organisations, including 'an NHS organisation, a social care organisation, an academic body or a research company looking to utilise health and social care data' (p. 6). Of particular relevance to the IEG is the following principle:

We will always be clear about the intended benefits and potential risks that arise from our use of health and social care data for individual care, performance, and research.

There is also the following Commitment:

As set out in Scotland's Digital Strategy, we will make more of our health and social care data available openly where it is safe, practical and lawful to do so. This will include providing an improved framework for open data to enable non-public sector organisations to access data in a safe way. This will support linking and usage of data to develop new insights and support innovation.

Specifically on the theme of Supporting Research and Innovation, the Strategy makes the following commitments:

  • 'We will seek to maximise the opportunities for data-driven research and innovation, with broad public support, to accelerate realisation of the public benefits.'
  • 'We will openly demonstrate and describe the uses, safeguards, and benefits of the use of health and care data for research and innovation.'
  • 'We will support access to health and social care data through trusted research and innovation environments, such as Scotland's 'Safe Havens', with appropriate approval processes providing assurance that data is used in line with ethical principles.'
  • 'We will consider the use of data for research and innovation in the design of all new developments set out in this Strategy to maximise the opportunities and public benefits.'

Among the Strategy's deliverables, the following are most relevant to the IEG and UVOD programme:

  • 'We will work to create clarification of the terms for access and use of data for industry projects including the approval and controlled access pathways to ensure ethical use in the public interest. This will be refined with the conclusions of the Scottish Government's Unlocking the Value of Data programme once completed.'
  • 'We will examine how we could support collaborative data-driven research and innovation across the UK and internationally where this has public benefits for Scotland, there is suitable agreement and it is ethical to do so' (p. 74).

3.3.6 Equality Evidence Strategy 2023-2025

In March 2023, the Scottish Government published its new Equality Evidence Strategy, whose aim is:

to enable policymakers to develop sound and inclusive evidence-based policies to improve service delivery and outcomes for Scotland's people

The Scottish Government's Vision is:

To tackle structural and intersectional inequality of outcomes, Scotland's equality evidence base will become more accessible, wide-ranging and robust. A stronger evidence base will enable the development and delivery of sound, inclusive policies and services and enable the measurement of improvements in the lives of all of Scotland's people.

The Vision is supported by three core principles, of which the first is most relevant to the IEG's work:

1. More robust and comprehensive data and evidence will be gathered on the intersecting characteristics of people in Scotland across a range of outcomes.

2. Equality evidence will be made more easily accessible so users will be able to access what they need, when they need it.

3. Good practice will be shared and promoted to support increased confidence and competence in the production and use of robust equality evidence.

This new strategy follows the establishment of the Equality Data Improvement Programme in 2021, which 'aims to strengthen Scotland's equality evidence base which will in turn enable policy makers to develop sound and inclusive policy to improve service delivery and outcomes for people in Scotland with protected equality characteristics'.

3.3.7 Relationship to IEG and UVOD

The UVOD programme may contribute to these other policies by making data available for research and innovation, which may include AI development, and by facilitating Scotland's ambition to be an ethical digital nation through the governance of public sector personal data use by the private sector. Already, as mentioned, there are requests to use public sector personal data in Scotland for AI development, in particular machine learning model training (see Jefferson et al 2022). The UVOD programme is referenced explicitly by the health and social care data strategy, and there are clear synergies and alignments there. Ensuring that public sector datasets are inclusive of protected characteristics will be key to realising the Enabling Conditions in Principle #6.

At a broader level, ethical and appropriate governance of public sector personal data, and its use by the private sector in certain circumstances, may help fulfil the Scottish Government's National Performance Framework, which aims to 'create a more successful country' that increases wellbeing, decreases inequality and creates 'sustainable and inclusive growth' across a series of national outcomes.

3.4 Relevant organisations and initiatives in Scotland

For personal data held by the public sector, some mechanisms already exist for access for research purposes, by both academic and commercial researchers. One main mechanism is the Safe Havens, also known as Trusted Research Environments (TREs), 'Data Safe Havens' or 'Secure Data Environments': highly secure computing environments that provide safe and secure access to personal data such as population data, census data and health data for approved researchers to use in research and development. In some cases, health data is linked to non-health data in TREs; in other cases, non-health data is available for research via TREs without being linked to health data.

There are a series of user governance checks and controls that researchers and their projects must pass before they are granted access to the TRE and its data. There are also export controls that must be satisfied before they can withdraw any materials or outputs from the TRE. Typically, no identifying personal data is permitted to leave a TRE. TREs conduct Statistical Disclosure Checks to ensure that personal data cannot be inferred from publication-ready charts and tables. All data egress from the TRE is subject to inspection to make sure no original data, pseudonymised or otherwise, leaks into the public domain. In this way, TREs strike a balance between facilitating research through data access and preserving privacy and security. TREs in the UK are governed in accordance with aforementioned legal frameworks and the 'Five Safes' model: 1. Safe People; 2. Safe Projects; 3. Safe Outputs; 4. Safe Data; and 5. Safe Setting.

Research Data Scotland (RDS) is a collaborative initiative launched in 2020, involving the Scottish Government, Scottish public bodies and Scottish universities, which aims to 'facilitate insight from data and promote and advance health and social wellbeing in Scotland'. RDS is underpinned by the following principles:

1. RDS will only enable access to data for research that is for the public good and considers equalities

2. RDS will ensure that researchers and RDS staff can only access data once it is de-identified

3. RDS will ensure that all data is always kept in a controlled and secured environment, using the FAIR principles of Findability, Accessibility, Interoperability, and Reuse of digital assets, and building upon the 'Five Safes' data privacy framework

4. RDS will be user- and problem-led, not data-led.

5. As a charity, all income that RDS generates will be re-invested into services to help researchers continue to access data, and firms that access public sector data for research in the public good through RDS will share any commercial benefits back into public services

6. RDS will be transparent about what data has been made available for research through its services and how it is being used for public benefit

7. Aligned with the Scottish data strategy, we will support people's appropriate choice over the use of their data in research.

At the outset, RDS aims to provide a single point of access for researchers wanting to access data held by the Scottish public sector, with later stages involving RDS developing and cataloguing a dataset portfolio, and helping to streamline and clarify the processes by which researchers can access datasets.

Another initiative which works with RDS is Administrative Data Research Scotland (ADR Scotland). This is a partnership between the Scottish Government's Data for Research Unit and researchers in the Scottish Centre for Administrative Data Research at the University of Edinburgh, which 'help[s] to make administrative datasets more readily linkable and conducting research on a suite of critical issues in Scotland'. This research must be compliant with the ADR Scotland Strategy including the need for public benefit.

In the healthcare space, Public Health Scotland's electronic Data Research and Innovation Service (eDRIS) provides support to researchers who wish to access administrative datasets. eDRIS is another partner involved in RDS. Scotland hosts one of the six Health Data Research UK (HDR-UK) sites, co-ordinated by the University of Edinburgh, whose aim is to unite 'UK's health and care data to enable discoveries that improve people's lives'.

RDS envisages the use of data not only by academic and not-for-profit researchers but also by private sector organisations that may make a profit. However, the following conditions would apply:

All results from research conducted through RDS must be capable of being published for the public benefit. Any commercial benefits will be shared back into RDS to improve public services. For private sector organisations, the same conditions apply for research institutions but a licensing model may be considered which will allow the benefits from the use of the data to be shared with other researchers.

There are other services such as DataLoch: a collaboration between the University of Edinburgh and NHS Lothian, with other NHS boards in South-East Scotland joining later. The DataLoch service brings together health and social care data from South-East Scotland for research and service-management purposes. In 2021, a Ferret investigation raised some privacy concerns about GP practices contributing patient data to the DataLoch service. The ICO engaged in positive discussion with the NHS Lothian team to consider the concerns raised, particularly those concerning transparency and the data protection impact assessment, and steps were then taken to improve practice accordingly.

In July 2022, the DataLoch service extended its governance framework to cover the new possibility of private-sector access to data extracts (to date, no such access has been granted). The governance extension was informed by a DARE UK-funded public consultation focused on the principles of trustworthy access for a range of organisations. Furthermore, all research applications undergo a Public Value Assessment, in which members of the public from DataLoch's Public Reference Group assess proposals to ensure there is sufficient public value to warrant support by the DataLoch service.

The Industrial Centre for Artificial Intelligence Research in Digital Diagnostics (iCAIRD) is another important collaboration in Scotland. iCAIRD was an Innovate UK research programme funded through the Industrial Strategy Challenge Fund and was active between 1st February 2018 and 31st March 2023. iCAIRD was one of five AI Centres established across the UK. It brought together a pan-Scotland collaboration of 15 founding partners from across industry, the NHS and academia, including four SMEs. At its peak it had '30 active partner companies, with over 30 current research projects across radiology and pathology'. The industry leads were Canon Medical Research Europe (radiology) and Royal Philips (digital pathology).

iCAIRD established secure analytic research environments through the NHS Safe Havens in Glasgow, hosted by NHS Greater Glasgow and Clyde, and Aberdeen, hosted by NHS Grampian and the University of Aberdeen. Through the programme collaborations, Canon Medical Research developed the Safe Haven AI Platform (SHAIP), a TRE software system suitable for hosting secure access to de-identified and pseudonymised patient healthcare data (including imaging and non-imaging electronic healthcare records, reports, letters and pharmacy data) within a high-performance compute environment for the development of AI and machine learning solutions within a Federated Data Network. The SHAIP system is managed by the Safe Havens and works within the established governance and guidelines.

The key principle was to facilitate research on key health challenges in Scotland, providing access to NHS, academic and industry researchers within a safe and secure environment where the data never left the NHS network. As part of the access arrangements, the appropriate system security procedures were reviewed, data protection impact assessments (DPIAs) were performed, and organisations were required to agree and sign suitable data sharing agreements. Before individual researcher access could be granted, researchers were also required to complete user governance checks.

In many ways, the approaches and advances made by the iCAIRD collaboration have helped to establish the concept and definition of a TRE. Indeed, the governance frameworks and processes established by the iCAIRD programme can provide an exemplar for private sector access to pseudonymised data held by the public sector in an established TRE. At the time of writing, SHAIP continues to support AI and machine learning research and development projects for healthcare applications at sites in Glasgow and Aberdeen.

The Scottish Government has set up two Public Benefit and Privacy Panels to scrutinise requests for public sector data. The first, set up in 2015, is the Public Benefit and Privacy Panel for Health and Social Care (HSC-PBPP), which reviews requests for secondary use of NHS Scotland (NHSS) data held by NHSS health boards and is managed and run within NHSS. The other panel, also set up in 2015, is the Statistics Public Benefit and Privacy Panel (S-PBPP), which reviews requests for Scottish Government and National Records of Scotland census data. Requesters must also demonstrate alignment between their data requests and the National Performance Framework (NPF). The two PBPPs work in similar ways, and collaboratively where possible, but are independent because of the different legislation relating to the data involved and the different data controllership. Once approval has been given by either panel, the data are usually made available in the Scottish National Safe Haven (for HSC-PBPP) or released directly to the requester (for S-PBPP).

To access health data held by the NHS in Scotland, especially data from more than one NHS Board, applicants must have their applications scrutinised by the NHS Scotland Public Benefit and Privacy Panel for Health and Social Care (HSC-PBPP):

The HSC-PBPP provides robust, transparent, consistent, appropriate and proportionate IG and scrutiny of data access requests to ensure the IG principles of safe people, safe projects, safe data, safe places are maintained.

The HSC-PBPP is a 'patient advocacy panel' which ensures that public benefit and privacy implications of proposals to access data have been properly addressed and articulated in applications:

The HSC-PBPP need to balance public benefit with potential risk to privacy and ensure that the public interest will be furthered by the proposal, detailed in an application, and demonstrate that the social need for the processing of the data requested will result in a reasonable likelihood that it will result in a tangible benefit for society.

The panel includes representatives of different stakeholders from across Scotland including the general public, NHS representatives, research representatives and technical specialists. Among the panel are also NHS Scotland Caldicott Guardians, who are senior members of organisations which process health and social care personal data and who ensure 'that the personal information about those who use the organisation's services is used legally, ethically and appropriately, and that confidentiality is maintained'. Caldicott Guardians employ eight Caldicott Principles 'to ensure people's information is kept confidential and used appropriately', which are:

  • Justify the purpose(s) for using confidential information
  • Use confidential information only when it is necessary
  • Use the minimum necessary confidential information
  • Access to confidential information should be on a strict need-to-know basis
  • Everyone with access to confidential information should be aware of their responsibilities
  • Comply with the law
  • The duty to share information for individual care is as important as the duty to protect patient confidentiality
  • Inform patients and service users about how their confidential information is used.

The HSC-PBPP has devised its own set of principles to address issues raised by private sector use of public sector personal data. The purpose of the principles is for the HSC-PBPP to try to ensure consistency across applications it receives and considers. These principles are not currently publicly available on the HSC-PBPP website but their release is planned as part of the next iteration of updates to HSC-PBPP's application form and guidance notes. The principles include the need for clarity of the partnership and roles of each actor, clarity on data protection law and policy compliance (a DPIA must be in place), clear justifications for the data request, a clear statement of public benefit, clarity about IP allocation over outcomes, ethics approval (where needed) and external independent scientific peer review, and public engagement and transparency.

Individual NHS Boards in Scotland have their own Caldicott Guardians for reviewing access requests to personal data for the purposes of service evaluation, audit and research.

3.5 Scottish Government commissioned research and engagement

As part of the UVOD programme, the Scottish Government has commissioned three literature reviews to accompany and inform the IEG's work. We summarise them here, as they will also be published in full on the Scottish Government's website to accompany this Report.

3.5.1 Public Engagement around the Access of Public Sector Data with or by Private Sector Organisations – August 2021

This literature review was conducted by Sonja Erikainen and Sarah Cunningham-Burley from the University of Edinburgh Centre for Biomedicine, Self and Society, and delivered to the Scottish Government in August 2021, prior to the start of the IEG. The review considered public engagement activities on the use of public sector data by or with the private sector over the ten years prior to 2021, looking at developments in Scotland, the UK and internationally. While ostensibly the review looks at 'public sector data', in practice the resources cited mostly concern 'personal data' held by the public sector, and much of that personal data relates to health data.

The authors identified the different kinds of public engagement and research methods used, which were grouped under 'deliberative', 'dialogic' and 'qualitative', and generated eight key themes from the literature: low public awareness, 'gut reactions', and changing perceptions around data use; acceptability of private sector data uses; the centrality of public benefit; the importance of benefit-sharing and distribution; trust and distrust; oversight, governance, and safeguards; public involvement and engagement; and the impact of demographic differences on people's views.

The key findings of the review are:

  • 'Deliberative and dialogue based qualitative public engagement and research methods are effective in identifying informed and considered public views on private sector use of public sector data, and they can enable the construction of a public consensus that can be used to inform decision making.
  • 'There is a low level of public awareness and understanding of private sector access to public sector data and how this data is used. Publics tend to express negative 'gut reactions' towards the topic, but when provided with more information and opportunities to reflect on or deliberate it, they often change their minds.
  • 'There is widespread conditional acceptance of private sector use of public sector data especially among informed publics. Acceptability is most conditioned by the rationales for the data use, but also by the type of data being used and the type of the private sector organisation using it. Public benefit is the primary driver of acceptability and commercial gain or private profit the primary driver of unacceptability.
  • 'Demonstrable public benefit is the most prevalent consideration that publics have around private sector access to and use of public sector data. While the definition and scope of 'public benefit' is open and contested, publics want to see evidence that public benefit of some kind is the primary driver of public sector data access, that it can actually be achieved, and that it outweighs any possible private benefits.
  • 'Publics want to see the development of equitable benefit-sharing models for collaborations or partnerships between private and public sector organisations, as they expect benefits – including profits – to be returned to publics and reinvested into the public sector.
  • 'Public trust and distrust are key factors around private sector access to public sector data. While publics tend to be relatively distrustful of private sector organisations, they generally have a high level of trust in the public sector, and this is shaped by perceptions that the public sector is acting for public benefit whereas the private sector is motivated by private interests. Publics are more trusting of private sector uses of public sector data when public sector organisations retain control over the data during collaborations with the private sector.
  • 'Publics expect to see stringent oversight, governance, and safeguard arrangements around private sector use of public sector data, especially concerning an oversight or governance body, transparency and accountability processes, and arrangements for data security and safety, consent, and confidentiality. However, the precise nature of what the safeguards should be is contested, and it may be that the nature of the safeguards is less important than the fact that effective safeguards exist.
  • 'Publics want there to be public involvement or engagement processes and activities around private sector use of public sector data, but the precise nature of what this should look like, who should be involved and in what ways is contested: while some want to be actively involved in decision making, others prefer more passive forms of communication and information distribution, and proportionality matters.
  • 'There is no singular 'public perspective' on private sector use of public sector data, but rather, while overarching patterns can be identified, publics are plural, and individuals' views are shaped by a diverse range of intersecting demographic and attitudinal variables.'

3.5.2 Public sector personal data sharing: Framework and Principles – December 2021

This literature review was conducted by another team from the University of Edinburgh and delivered to the Scottish Government in December 2021, before the start of the IEG. The research team, comprising Steven Earl, Morgan Currie, Matjaz Vidmar and Victoria Gorton, brought multidisciplinary backgrounds to the work.

The review contained an analysis of frameworks and practices to provide access to public sector personal data for private sector organisations. The research team noted that 'this practice is extremely rare as it involves considerable legal, moral or ethical risks, including damage to public trust in the private sector'. The team identified two 'broad pathways' for facilitating public to private sector personal data sharing: data sharing agreements (which were the most common); and specific legislation for data sharing. They also identified an emerging third pathway being developed for artificial intelligence (AI) applications. The team also interviewed nine individuals from the public, private and third sectors in Scotland, Europe and globally to gain a deeper understanding of how these pathways worked in practice.

Data sharing agreements are the most common pathway in the UK, other European countries and elsewhere, involving the identification of the public interest and the drawing up of a data sharing agreement. The team notes that 'this pathway is currently predominantly used to facilitate sharing personal data held by the public sector to accredited research organisations'. On the occasions on which public sector personal data is shared with the private sector, this is the pathway used. The research team notes that the technical approaches used to manage the data sharing process vary, from the personal data being further processed and stored in a separate safe environment such as the Safe Havens, to data remaining in the original public sector organisation's database and researchers only being returned the results of their analysis.

Supplementing data protection law (the EU General Data Protection Regulation and Law Enforcement Directive in the European Union; the Data Protection Act 2018 and UK GDPR in the UK; see discussion above), the second pathway involves further, specific legislative frameworks being used to facilitate or restrict data sharing. UK examples include the Serious Crime Act 2007, which permits the disclosure of personal data to specific anti-fraud organisations to prevent fraud. Another example, from Finland, is the Act on the Secondary Use of Health and Social Data, which facilitates the reuse of health and social care data and is at an early stage of implementation; it involves:

A separate permit authority will be set up – Findata – that will enable a centralized system for the issuing of the data requests and permits, rather than requiring sharing agreements with each data controller (as is the case in the UK).

The third emerging pathway relates to the 'new demands for larger-scale data sharing' implicated by AI development. Currently, much of the personal data used in AI projects is processed on the basis of consent. However, the EU's draft AI Act includes provisions (Articles 53-55) for a 'regulatory sandbox' with terms for the re-use of personal data within the sandbox, and Earl et al. (2021) consider that this might form a third pathway applicable to crime, public security, public health and safety, and environmental issues.

The team note technical barriers to data sharing (a lack of harmonisation across agencies), legal barriers (overlapping and complex legislative frameworks, e.g. for health data), conservatism about sharing data among public sector organisations, a lack of willingness at times to explain the reasoning behind decisions to refuse access, and public concern about private sector access to public sector personal data. They consider that the current situation can be improved:

Existing pathways for data sharing with researchers – and by implication, other parties – can be improved by creating shared data standards and protocols across agencies, demonstrating public value and involving the public in the designs of infrastructure and data sharing models, marketing the value of data sharing to immediate stakeholders and users, developing a central resource that facilitates data sharing and makes these procedures transparent, and sharing ethical standards and best practices internationally.

3.5.3 Private sector access to public sector personal data: exploring data value and benefit-sharing – December 2022

The third literature review, conducted by Anna Berti Suman (European Commission Joint Research Centre) and Stephanie Switzer (University of Strathclyde Law School), concerned how costs and benefits can be shared between the public and private sector vis-à-vis access to personal data, and intellectual property (IP) and royalty schemes relating to private sector use of public sector personal data.

A summarised version of key findings from their work is:

  • 'Public-sector bodies generally lag behind in developing and implementing data sharing regimes compared to the private sector'; however, there are existing good practices in policy areas such as public health.
  • 'Studies demonstrate that ordinary people are supportive of health and social care data being used for public benefit but wish those public benefits to outweigh private profits and interests.'
  • When assessing costs and benefits, these 'should not be conceived of as solely financial but understood in broader, more social terms'.
  • Prerequisites for achieving public benefit include transparency and public engagement to ensure a social license.
  • There is, however, some concern that a 'lack of a definition of public benefit may enable the concept to be exploited to facilitate […] commercialisation of government-held personal data'.
  • 'There is a vast literature on data value in general', with the notion of such value informed by the 'underlying context and socio-technical settlement' prevalent within society.
  • There were concerns that the '"assetisation" of personal data may influence conceptions of value, thereby potentially resulting in a lack of public scrutiny and inequity'. To overcome such issues, 'value co-creation and exchange beyond the market' was suggested.
  • 'Benefit-sharing is a concept typically associated with international environmental law and in particular, international biodiversity law to deliver commutative and distributive justice. Benefit-sharing is thus linked to justice and emphasises the optimisation of benefits to society, together with the minimisation of harm, and the achievement of equity.'
  • 'If data has the potential to benefit the public, it should be shared. Public benefit cannot be obtained in the absence of such sharing. However, the absence of common principles for trusted government-to-businesses (G2B) personal data sharing may lead to restrictions on data flows resulting in detrimental economic impacts.[...] Legal certainty is key for such sharing to take place.'
  • Creative personal data sharing schemes are 'being established between civic organisations and private actors (at times also engaging the public sector)' for certain citizen science activities, 'with creative commons licensing schemes, value co-creation, and the reality of "data cooperatives"'.
  • 'There is growing attention in the literature for the concept of 'data altruism' as also incorporated in the European Union Data Governance Act; this reflects a tendency to embrace a fair and open sharing of personal data for public benefit.' However, data altruism is restricted to not-for-profit uses of data, which may exclude much, if not all, private sector involvement.

The researchers also identified a set of key guiding principles:

  • 'proportionality,
  • transparency,
  • public engagement,
  • co-creation of the concept of value,
  • legal certainty, and
  • respect for ethical values and norms'.

3.6 DemSoc Public Engagement

The Scottish Government commissioned the Democratic Society (DemSoc) to conduct initial public and stakeholder engagement activity on the principles in their draft form (published in August/September 2022). Two co-creation workshops with stakeholders and some IEG members, and two public workshops, were held between November 2022 and January 2023. DemSoc conducted a feedback questionnaire with workshop participants and carried out desk-based research on potential methods for future participatory engagement.

Key messages from participants at the two public workshops are:

  • 'Building public trust is really important. To build trust, do not turn the principles into a box-ticking exercise and build public awareness on the security of data and how it is used, stored, shared.
  • 'It's important to make it clear and transparent how someone can follow data and what data is publicly available. There needs to be a robust system in place on how to monitor the data for accountability.
  • 'Review whether our current model of consent, ownership and privacy is efficient and informed. There needs to be a re-evaluation of how people gain access and rights to their personal data and the ability to say 'no' to storage of their personal data.
  • 'Clear and specific language needs to be used. For example, when referring to ethical standards, laws and guidelines, they need to be clearly stated and referenced.
  • 'Need an international approach in order to consider how international laws might impact the UVOD programme in Scotland and also take inspiration and best practice from other countries.'

3.7 Public benefit, public interest and value

What are public benefit, public interest and value? These terms are key to our work, but they are contested and open to different interpretations. We cannot take account of all these contestations and interpretations here, but we put forward a summary.

What can be characterised as public benefit, interest and value is deeply context-specific, depending on the values and objectives of a society, community, nation or individual. As Harvey and Laurie (2021) put it: 'Actions taken in the public interest can be broadly described as those that promote objectives valued by society'.

3.7.1 Public benefit

The first literature review conducted by Erikainen and Cunningham-Burley (2021) clearly sets out the importance of public benefit in the public sector data context:

Demonstrable public benefit is the most prevalent consideration that publics have around private sector access to and use of public sector data. While the definition and scope of 'public benefit' is open and contested, publics want to see evidence that public benefit of some kind is the primary driver of public sector data access, that it can actually be achieved, and that it outweighs any possible private benefits. (Erikainen & Cunningham-Burley, 2021, p.1)

This relationship – between acceptability and public benefit – is further illustrated by a Wellcome study (Ipsos MORI, 2017) in relation to health data and public attitudes to commercial access. This specifically considered private sector access to data and the conditions under which this may or may not be permissible, and describes how participants applied four key tests (Figure 1.3) when considering the acceptability of data usages.

Decisions around acceptability may exist on a sliding scale, with uses that have clear public benefit at one end and those that have solely private benefit at the other. Further, the study points to a space where these benefits may be 'mixed' in nature.

In the academic sphere, Aitken and colleagues have written extensively about the public engagement work they have conducted in relation to health data sharing, including in the Scottish context (Aitken et al., 2016). In particular, their work in relation to public expectations of public benefits from data-intensive health research (Aitken et al., 2018) has indicated that the term 'public' may be construed broadly, so that data usage can benefit as many people as possible. However, understandings of relevant publics may also be needs-led: in other words, there may be broader public benefit in research using data that benefits a smaller group or number of people in need (for example, research in relation to rare diseases). Similarly, Aitken and colleagues found that participants' preference in terms of the types of benefits was to keep this broad – so, in the context of health research, these benefits were not just seen as medicalisation, but also related to living longer, happier, and healthier lives. Perhaps more notably, publics were also concerned that such benefits should be measurable, and that these would actually be realised through the actions of key policy and government stakeholders.

The Office for National Statistics gives some guidance on how public benefit can be demonstrated by those seeking to conduct research using its data, stipulating that at least one of the following criteria must be met. The research must:

  • provide or improve evidence bases that support the formulation, development or evaluation of public policy or public service delivery
  • provide an evidence base for decisions that are likely to significantly benefit the UK economy, society or quality of life of people in the UK
  • significantly extend existing understanding of social or economic trends or events, either by improving knowledge or challenging accepted analyses
  • replicate, validate, challenge or review existing research (including official statistics) in a way that leads to improvements in the quality, coverage or presentation of existing research.

In 2022, the ONS and Administrative Data Research UK (ADR UK) published a report comprising insights gleaned from research conducted with publics in the UK (including in Glasgow) on what they considered to be 'public good' (treated as interchangeable with 'public benefit' and 'public interest') use of data for research and statistics. The research produced five 'key findings', which emerged from discussions with a diverse sample of participants:

  • 'Public involvement: Members of the public want to be involved in making decisions about whether public good is being served'
  • 'Real-world needs: Research and statistics should aim to address real-world needs, including those that may impact future generations and those that only impact a small number of people'
  • 'Clear communication: To serve the public good, there should be proactive, clear, and accessible public-facing communication about the use of data and statistics (to better communicate how evidence informs decision-making)'
  • 'Minimise harm: Public good means data collected for research and statistics should minimise harm (and not contribute to anything harmful), including an awareness of unintended harmful consequences of the misrepresentation of data research and statistics'
  • 'Best practice safeguarding: Universal application of best practice safeguarding principles to ensure secure access to data should help people feel confident to disclose data.'

Another recent public dialogue, which was co-funded by the National Data Guardian for Health and Social Care (for England) amongst others, provides a deep dive into public benefit, exploring how this might be assessed in the data context (Hopkins Van Mil, 2021). This was conducted in the context of health and care data with around 100 participants, and its findings underline the need for transparency throughout the data lifecycle, and for authentic public engagement with a cross-section of society, amongst other matters.

In late 2022, the National Data Guardian issued guidance on evaluating public benefit for uses of health and social care data for purposes beyond individual care, which include but are not limited to research and innovation. While this guidance is not applicable to Scotland, it may be useful for us to take on board in considerations of public benefit. The NDG's public dialogue informed this definition of 'public benefit':

Public benefit means that there should be some 'net good' accruing to the public; it has both a benefit aspect and a public aspect. The benefit aspect requires the achievement of good, not outweighed by any associated risk. Good is interpreted in a broad and flexible manner and can be direct, indirect, immediate or long-term. Benefit needs to be identifiable, even if it cannot be immediately quantified or measured. The public aspect requires demonstrable benefit to accrue to the public, or a section of the public.

The NDG recognises that its definition of public benefit also reflects the Charity Commission's interpretation of the public benefit required in charity law, discussed in more detail below. The Guidance reiterates the need for transparency and public engagement for earning public trust in secondary uses of unconsented data, along with 'proportionate governance processes and building in ongoing evaluation and learning'.

As regards use of data by the private sector, the NDG states that:

If the only benefit of a specific data use is the generation of profit by a commercial organisation, that use cannot be deemed for public benefit. However, the generation of proportionate commercial profit may be acceptable to the public if the use also delivers a public benefit, such as improved services or improved NHS knowledge and insights. When assessing proportionality, the public benefit evaluation process should ask the data applicant to provide a transparent assessment of how the commercial interests are proportionately balanced with the benefits to the public.

The NDG points to guidance for NHS (England) organisations entering data sharing agreements with third parties to help realise patient and NHS benefits. The NDG also points to the importance of 'fairness' in weighing public and private benefit, which is further elaborated in a report from Understanding Patient Data and the Ada Lovelace Institute, and DHSC Guidance on creating frameworks for realising patient and NHS benefit. The NDG also points to the Centre for Improving Data Collaboration, part of the former NHSX in England (which has now been integrated into the NHS Transformation Directorate), whose remit is to support fair data sharing partnerships. In terms of understanding benefit, the NDG public dialogue findings demonstrated that:

people think the concept of public benefit should be broad and flexible and include direct, indirect, and long-term benefits. People also told us the benefit needs to be identifiable, even if it cannot be quantified or measured.

From the dialogue, a list of indicative questions was formulated to help determine whether an intended purpose can be considered for public benefit, which range from very concrete and measurable benefits to more abstract benefits such as the support of knowledge creation and exploratory research.

For partnerships with the private sector, the NDG, drawing on the public dialogue, presents three (illustrative, non-exhaustive) suggestions for discerning public rather than private benefit:

  • 'Will any private profit, or progress made by a commercial organisation, also lead to benefits for the health and care system that will ultimately benefit patients? For example, improving how the NHS operates by increasing service or administrative efficiency?
  • 'Where a commercial organisation makes private profit or progress that serves its own interest, is the agreement that underpins its partnership with the NHS based on fair terms? Does that agreement recognise and safeguard the value of the NHS data on which the organisation's profit or progress is founded?
  • 'Will research findings be openly shared with others who can use them to maximise benefits to patients, the wider public, and the health and social care system?'

The NDG further recommends that data users should be prepared to demonstrate the public benefit being delivered, as specified by the public sector organisation providing the data, and that this should be shared with the public, for example in a data uses register. This is particularly important when the user is seeking renewed or additional access to data, in which case public benefit up to that point should be demonstrable.

Once public benefit has been established, the NDG recommends a consideration of the risks inherent in that data use. Risk should be avoided where possible; where it cannot be avoided, it should be minimised with sufficient safeguards; and where residual risk remains, there should be an assessment of whether 'on balance, the public benefit is sufficient to justify running that residual risk'. Using anonymous data will significantly reduce risks to privacy, and such residual risks 'are unlikely to outweigh a public benefit'. Furthermore, the risks of not using data may be more detrimental to public benefit than the risks of using it, and this should also be taken into consideration (see e.g. Jones et al., 2017). The risks of non-use can be economic in nature, in that placing overly burdensome barriers in the way of accessing public sector personal data could damage favourable economic activities related to research, development and innovation.

Public benefit is not a concept confined to data issues, as recognised by the NDG above. In Scottish charity law (Charities and Trustee Investment (Scotland) Act 2005 section 7) a charity is a body which only has charitable purposes and which provides or intends to provide public benefit in Scotland or elsewhere. Section 8 of the 2005 Act stipulates that public benefit cannot be presumed from any particular purpose, and in determining whether a body provides or intends to provide public benefit, regard must be had to:

(a) how any—

(i) benefit gained or likely to be gained by members of the body or any other persons (other than as members of the public), and

(ii) disbenefit incurred or likely to be incurred by the public, in consequence of the body exercising its functions compares with the benefit gained or likely to be gained by the public in that consequence, and

(b) where benefit is, or is likely to be, provided to a section of the public only, whether any condition on obtaining that benefit (including any charge or fee) is unduly restrictive.

The Scottish Charity Regulator (OSCR) explains that:

To see whether an organisation provides public benefit or (in the case of applicants) intends to provide public benefit, we look at what it does or plans to do to achieve its charitable purposes.

Public benefit under charity law relates to a subcategory of activities providing public benefit in a charitable sense, namely those which advance an organisation's charitable purposes. Nevertheless, looking at charity law can be useful for considerations of public benefit in other contexts.

OSCR takes a broad view of what 'benefit' and 'public' mean, acknowledging many forms of benefit, tangible and intangible, but requiring that they be identifiable. 'Public' can refer to the general public but also to subsets of the public, e.g. a particular community, children or people with specific needs. To demonstrate that it provides public benefit, OSCR states that a charity must describe the work it does and its achievements in its annual report, which is publicly available as well as subject to review by trustees. In assessing public benefit, OSCR adopts the following process:

  • A comparison between the benefit to the public from an organisation's activities and:
    • any disbenefit (which is interpreted as detriment or harm) to the public from the organisation's activities
    • any private benefit (benefit to anyone other than the benefit they receive as a member of the public).
  • The other factor that we must take into account in reaching a decision on public benefit is whether any condition an organisation imposes on obtaining the benefit it provides is unduly restrictive. This includes fees and charges. See 'undue restrictions' for more information.

It considers public benefit from a holistic perspective, 'based on all the facts and circumstances applying to the organisation'.

In England and Wales, the regulator, the Charity Commission, has also provided guidance on the public benefit requirement in the context of the Charities Act (Charity Commission, 2013; plus updated format 2017). Again, to satisfy the 'benefit aspect' of public benefit, 'a purpose must be beneficial' and 'any detriment or harm that results from the purpose must not outweigh the benefit'; to satisfy the 'public aspect' of public benefit the purpose must 'benefit the public in general, or a sufficient section of the public' and 'not give rise to more than incidental personal benefit' (Charity Commission, 2013, p. 5). As noted above, this is also a distinction explored in research conducted by Aitken et al. (2018), and adopted and adapted by the NDG in its guidance for England discussed above.

3.7.2 Public interest

To turn next to notions of 'public interest', it is apparent that this term can be equally, if not more, elusive. In the context of health research regulation, it has been claimed that 'actions taken in the public interest can be broadly described as those that promote objectives valued by society' (Harvey & Laurie, 2021).

More specifically, in the context of data use, the public interest is a prominent feature of the policy and legal regimes that govern the use of confidential data – for example in data protection legislation and the common law duty of confidentiality in the UK. However, neither this legislation nor case law provide a definition of what is, or is not, 'in the public interest'. Indeed, what emerges from these discussions is that, much like public benefit, the public interest is deeply contextual, and so perhaps we should consider what the public interest 'does', rather than solely what it 'is', and how it may relate to other similar terminology, such as the public benefit.

The Information Commissioner's Office (ICO) recently (2022) consulted on guidance on the research provisions in the UK's DPA 2018 and GDPR, and stated in the published guidance:

The legislation does not define the 'public interest'. However, you should broadly interpret public interest in the research context to include any clear and positive public benefit likely to arise from that research.

The public interest covers a wide range of values and principles about the public good, or what is in society's best interests. In making the case that your research is in the public interest, it is not enough to point to your own private interests.

The 'public interest' is not defined in the legislation, although, as mentioned in section 3.2.1 above, the ICO has given some indicative examples of what it may constitute. For special category data, a 'substantial public interest' is one of the grounds on which special category data can be processed (UK GDPR Art 9(2)(g), see also section 10(3) of the DPA 2018). There are 23 specific substantial public interest conditions set out in Schedule 1 of the DPA 2018:

  • Statutory and government purposes
  • Administration of justice and parliamentary purposes
  • Equality of opportunity or treatment
  • Racial and ethnic diversity at senior levels
  • Preventing or detecting unlawful acts
  • Protecting the public
  • Regulatory requirements
  • Journalism, academia, art and literature
  • Preventing fraud
  • Suspicion of terrorist financing or money laundering
  • Support for individuals with a particular disability or medical condition
  • Counselling
  • Safeguarding of children and individuals at risk
  • Safeguarding of economic well-being of certain individuals
  • Insurance
  • Occupational pensions
  • Political parties
  • Elected representatives responding to requests
  • Disclosure to elected representatives
  • Informing elected representatives about prisoners
  • Publication of legal judgments
  • Anti-doping in sport
  • Standards of behaviour in sport

As 'public interest' is broader than 'substantial public interest', the public interest may encompass these substantial public interest conditions but may also encompass other conditions which are not listed here. We also note the limitations of these high-level conditions, which provide some detail but little in the way of context. Furthermore, some of these public interest conditions, such as 'insurance', may not be appropriate for the reuse or further use of public sector personal data by the private sector, as opposed to the private sector collecting personal data directly for its own services and products.

Turning to other research, the connection is made between public interest and public benefit, to argue that a principal function of the public interest in law is 'to carve out a legally legitimate space within which [research] activities that infringe on individual interests but have potential public benefits can be lawfully conducted, which otherwise would not be permitted' (Sorbie, 2022). However, the argument is also made for a conception of the public interest that is socially (as well as legally) legitimate, pointing to the difficulties of defining this term on the basis of a homogenised conception of who 'the public' are, and in the absence of engagement with actual publics' views (for example, see Sorbie, 2020; 2021).

Indeed, as Erikainen and Cunningham-Burley (2021) recognise:

There is no singular 'public perspective' on private sector use of public sector data, but rather, while overarching patterns can be identified, publics are plural, and individuals' views are shaped by a diverse range of intersecting demographic and attitudinal variables.

Taken together, it has been argued that the public interest is best understood in ways that foreground relationality, temporality and accountability (Sorbie, 2022). In short, relationality requires that, as noted above, the diversity of and within 'publics' should be explored, as well as how context can shape these interests. Temporality points to the ways in which data use, on the one hand, and the public interest, on the other, overlap and intersect each other throughout the entire data lifecycle, therefore underlining the need for ongoing review. Finally, accountability emphasises the nuanced role of transparency in multifactorial decision making, yet underlines that mere transparency is in no way a synonym for accountability. These are all features that are reflected in our principles.

Furthermore, in the UK context, Cheung (2020, pp. 7-8) points to the misalignment between what publics may consider to be of public benefit in health data use and government priorities 'of stimulating economic growth through maximising value from NHS data, particularly through private-sector collaboration'. Indeed, ultimately what the public interest and public benefit are may be, as Scassa and Vilain (2019, p. 11) put it, 'perceived differently depending on social circumstance or ideology'.

In our formulations of public interest and public benefit in the Guiding Principles we have aimed to take account of the diversity of the publics, the need for meaningful and ongoing engagement on these and other issues, as well as the need for transparency and accountability. We hope this also goes some way to correcting the misalignment identified by Cheung (2020) above between what the public views as beneficial and what the government may view as beneficial when public sector personal data is used by the private sector.

3.7.3 Value

We view the concept of 'value' in a very broad sense encompassing economic, social and environmental aspects. We do not define what 'value' is beyond this, only noting that the value produced by private sector access to public sector personal data should not be solely economic or financial but should also encompass social and environmental value. Yet, what value is in any of these contexts, like the terms public benefit and public interest discussed above, will be contextual and dependent on the values and objectives of the society, community and individual.

At a societal or national level, we find a vision of value in the Scottish Government's National Performance Framework (which is also Scotland's localisation of the UN Sustainable Development Goals) with its aims to:

  • create a more successful country
  • give opportunities to all people living in Scotland
  • increase the wellbeing of people living in Scotland
  • create sustainable and inclusive growth
  • reduce inequalities and give equal importance to economic, environmental and social progress.

The NPF also contains three values:

  • treat all our people with kindness, dignity and compassion
  • respect the rule of law
  • act in an open and transparent way.

These could guide what 'value' means when it comes to private sector access to public sector personal data.

Nevertheless, the idea of 'value' or the kinds of steps which are required to achieve it may be deeply ideological and individualised. Notions such as 'growth' are not universally accepted; indeed, especially in the context of environmental economics, there is a rich discussion of the need for 'degrowth' (see e.g. Kallis et al., 2018; Enough! Collective, 2022). Even if the NPF is followed, there may be differing opinions on how the aims are achieved e.g. via more government intervention in markets, by the private sector leading economic activity with minimal interference, or by individuals and local communities taking a lead, etc.

We will not engage further in these debates here (as there is unlikely to be consensus on these issues among IEG members). Suffice it to say that if 'unlocking' the 'value' of public sector personal data for use by the private sector in Scotland is to be achieved, questions about what 'value' this is require resolution as part of broader democratic (political economic) discussions involving the Scottish Government, Parliament and people in Scotland. Equally, there need to be parallel conversations about what 'harm' is and about any harm and costs which might also be generated by data use. Conflicting values must also be taken account of in such democratic debates.

3.8 Critical views on (digital) data

There is a critical vein of research, especially from humanities and social sciences, on the involvement and role of (large) private sector organisations, especially transnational corporations, in digital data and technologies. Some of these companies, such as Google (Alphabet), Meta (Facebook), Apple, Amazon and Microsoft possess significant economic - and political - power in many countries including Scotland and the UK, and in some cases they are economically bigger and more powerful than countries (see e.g. Daly, 2016). From this economic power also comes computational power, especially in the form of the resources and infrastructures needed to facilitate the level of data processing capacities that increasingly only the private sector can provide (Durante, 2021).

Concern has also been raised about such large transnational private sector organisations in digital technologies offshoring their tax obligations and paying only minimal amounts in countries such as the UK (see e.g. Klinge et al., 2022). Furthermore, there is concern about the labour practices of some of these large companies including Amazon in the UK (Briken & Taylor, 2018). While it is the labour practices of low-waged Amazon workers in fulfilment centres which have caused most concern, it is of note that the company is an infrastructure provider for some TREs through its Amazon Web Services cloud service.

Although beyond the scope of the IEG's work, there is also concern about the role of governments/the public sector in digital technologies and data gathering, including surveillance activities (see e.g. Keenan, 2021) and/or the often unintentional monetisation of personal data via third party platforms used in the public sector (e.g. Microsoft), situated within the larger global digital infrastructure (see e.g. Srnicek, 2017; Van Dijck et al., 2018). These concerns often relate directly to issues surrounding inequality and human rights, from biased predictive policing based on digital data (Browning & Arrigo, 2021; Eubanks, 2018) to data gathered for health purposes then used for immigration enforcement, which particularly affects asylum seekers and undocumented migrants (Waterman et al., 2021; see also Papageorgiou et al., 2020). Existing debates thus critique such discriminatory practices through the weaponisation of digital data against vulnerable groups already marginalised in society (see e.g. O'Neil, 2017).

The involvement of large digital private sector organisations in using public sector personal data, especially in health, has proved controversial in other parts of the UK. One example is the Google DeepMind-Royal Free partnership (Powles & Hodson, 2017), which the ICO ultimately found did not comply fully with data protection law. Current controversies relate to the involvement of Palantir in providing data infrastructure for the NHS in England (see Dyer, 2021; Iliadis & Acker, 2022; Salisbury, 2023). Cheung (2020, p.1) has pointed to the involvement of such players in the health data space as rendering public sector health data 'potentially subject to the logics of data accumulation seen elsewhere in the digital economy'.

While commentators have critiqued and raised concerns about these (and other) aspects of data, there is also a vein of research on what more progressive and inclusive data and digital futures might look like, including concrete proposals for models and approaches which would better serve the public interest. Among these are the work on Good Data (Daly et al., 2019; see also Hartman et al., 2020) and Data Justice (Dencik et al., 2022). We can also look to Indigenous Data Sovereignty (IDS), developed by First Nations scholars to ensure that the creation and use of data realises their rights under the United Nations Declaration on the Rights of Indigenous Peoples (Kukutai & Taylor, 2016). The approaches and models developed by IDS scholars can inform more equitable data collection and use for non-Indigenous people and communities as well (Carroll Rainie et al., 2019).

We have very briefly touched on models which may facilitate greater public participation and control over (personal) data in Recommendation #10, which include personal data stores, data cooperatives and data trusts (Nanada & Narayan, 2022). However it is important to note the limits and shortcomings of certain applications of these models as well, such as the example of a data trust given by Scassa (2020) which was top-down and originating from a single stakeholder. While there was no consensus among the IEG on recommending an opt-out function, opt out can be considered as an 'ultimate' form of individual control, especially vis-a-vis private sector use of public sector personal data.

Another issue on which there was no IEG consensus was intellectual property (IP) arrangements between the public and private sector over (aspects of) the process and outputs of using public sector personal data. Private sector use of IP has a long and contested history, including as regards (personal) data and benefit-sharing (see e.g. Lucas et al., 2013; Andanda 2019).

Issues have arisen more recently around IP, especially commercial confidentiality, blocking access to public sector personal data including use by other public sector organisations (see e.g. Goldacre & MacKenna, 2020 on this issue in NHS England). The Financial Times, in an investigation of data sharing from NHS England in 2021, found that:

insights from the data were often shared or sold on to other commercial entities and providers that use it to price products being sold back to the NHS, or conversely restrict the NHS's access to analysis of its own data, creating conflicts of interest. Among the biggest criticisms focused on the opacity around the data's fate after it leaves the NHS's servers, and the lack of an auditing trail beyond the companies on the [Data Use] register (Murgia & Harlow 2021).

To remedy this, Pasquale (2013, p. 683) advocates that:

Policymakers need to skillfully navigate areas of law often used to stop the sharing of data, including intellectual property rights and contractual obligations.

This should be taken into account in devising contracts and equitable benefit-sharing for the use of public sector personal data by the private sector in Scotland.


Email: christopher.bergin@gov.scot
