Unlocking the value of data - public engagement: literature review

This report highlights the findings from a literature review commissioned by the Scottish Government on public engagement regarding the use of public sector data by or with the private sector over the last 10 years both in the UK and internationally.


Included studies

Forty-four documents were included in the analysis, of which 9 were focused on private sector use of public sector data as a primary topic. Most of the documents retrieved reported studies conducted in the UK. Out of the 44 documents, 26 studies involved UK populations only (6 in Scotland, 12 in England, 1 in Northern Ireland, and 8 across the UK), and 3 were comparative studies involving populations in the UK and other countries (UK and Sweden; US, Canada; and Australia). The rest involved populations in other high-income countries (5 in the US, 2 in Canada, 2 in Australia, 1 in Singapore, 1 across Europe, and 1 comparative study between the US and Japan). None involved populations in low- and middle-income countries. Out of the 9 primary topic documents, 7 studies involved UK populations only (3 in England, 1 in Scotland, and 2 across the UK), and the other two involved US and Australian populations. The documents also mainly reported studies focused on health or medical (including genomics) data. 36 of all the included documents and 8 of the primary topic documents were exclusively focused on health or medical data. The 9 documents that covered other kinds of data all also included health and medical data. The research or engagement activity participants included representatives of general publics and members of specific communities, patients and people with long-term health conditions, and medical or health research and databank participants, with a diverse range of demographic characteristics and backgrounds.

Eight analytical themes were identified across the included documents, and we outline these below. While data across all documents was included in the analysis, in discussing the themes here, we place more emphasis on the 9 documents for which private sector use of public sector data was a primary topic. This is because these documents considered the related issues in more depth and their findings cover a more comprehensive range of considerations relevant to answering our research questions.

Public engagement and research methods used

The included studies used a range of different qualitative public engagement and research methods, mixed qualitative and quantitative methods, and quantitative survey, questionnaire, and, in one case, polling methods.

Fourteen used quantitative survey and questionnaire methods. Most were surveys or questionnaires designed to assess public attitudes, beliefs, knowledge, or awareness of data-related topics, generally based on random sampling across a national population or from specific groups such as regional or patient populations, but 4 used stated preference, discrete and scenario-based choice experiments. For example, Aitken et al (2018) and Tully et al (2020) used discrete choice experiment studies that enabled participants to select their preferred choice from a set of fixed options around questions concerning data sharing and linkage, and allowed the relative importance of different considerations around each option to be established. The survey and questionnaire based studies generally did not, however, provide participants with information about the topic beforehand, with the exception of one study where a short informational video was provided for survey participants to watch before completing the survey. In addition, 3 of the included documents were based on mixed methods that combined survey or questionnaire methods (in 2 cases) or polling methods (in 1 case) with qualitative workshop or interview methods.[2]

The majority of the included documents were based on qualitative research and public engagement methods. This included focus groups, but many studies and engagement activities used deliberative and group dialogue methods (4 used Citizen Juries, 1 used a Citizen Summit, 6 used workshop methods, 3 used other deliberative methods, and 5 used a combination of qualitative methods including one or more of the above deliberative and dialogue methods as well as round tables combined with focus groups or interviews). Most of these were based on the formation of ‘mini publics’ (cf. Setälä & Smith, 2018), which bring together randomly selected members of publics who are broadly representative of the population in terms of key demographic and, often, attitudinal characteristics in a deliberative environment. These methods generally involved a range of deliberation activities and discussion, and often included prompts, scenarios or case studies of actual or proposed uses of public sector data to scope and assess public perspectives around different examples of possible data uses, which allowed the identification of nuances, complexities and contextual considerations that members of publics had. They generally included facilitators and expert witnesses who provided background information on the topic from different perspectives. Deliberative methods including citizen juries and the citizen summit also enabled the participants to collectively develop recommendations that could be used to inform decision making.

Nearly all studies using these kinds of methods highlighted that they enable informed and in-depth discussion, reflection and engagement from publics, resulting in more considered and consequently more nuanced views, and that they enabled consensus on recommendations. For example, Tully et al (2019) highlighted that deliberative processes, like citizen juries, enable publics to learn more about an issue, be exposed to a range of evidence and perspectives, and arrive at informed conclusions about it. Street et al (2021) similarly emphasised the informed nature of the deliberations that these methods enable, and the ability of citizen juries to rapidly increase participants’ knowledge and understanding, which allows participants to effectively engage with policy-relevant challenges and generate policy-guiding recommendations that are collectively produced and agreed. Understanding Patient Data and Ada Lovelace Institute (2020) noted specifically that public deliberation is a valuable method for public involvement in good data governance and can be purposely designed to ensure public views feed into the development of policy frameworks, while Teng et al (2019) among others emphasised that deliberative methods can enable people with diverse views, opinions, and interests to reach a consensus and contribute to policy through the diverse life experiences they bring to bear on a topic.

Further, Tully et al (2018, 2019) among others highlighted that publics are often unaware how data is used and why, which can mean that responses to quantitative surveys and questionnaires are given from a position of relative ignorance on the topic, which can have an impact on the nature of those responses. They also noted that qualitative methods including deliberative and dialogue methods, but also focus groups that involve the provision of information and reflection on the topic, enable publics to arrive at and provide more informed views. Indeed, this is supported by our review, which found that quantitative studies tended to report more negative reactions towards private sector access to public sector data than studies and engagement activities that provided participants with information on the topic beforehand and opportunities for discussion and reflection (see the low public awareness, ‘gut reactions,’ and changing perceptions theme below). Tully et al (2018) also noted that the citizen jury process in particular has been specifically designed to enable decision makers to hear publics’ views on complex topics, making it ideal for situations where decision makers wish to learn more about what informed publics think.

Combining quantitative methods with qualitative deliberative and dialogue methods also allowed some studies and engagement programmes to specifically identify how and to what extent the deliberative and dialogue activities influenced or shifted publics’ views. For example, the Royal Academy of Engineering (2010) combined electronic polling with focus groups and deliberative activities for their public engagement programme around the Breathing Country performance, which was designed to explore young people’s views on the development of electronic patient records. This allowed the identification of how publics’ perceptions changed from before to after the performance.

Overall, across the included documents, qualitative deliberative and group dialogue methods enabled the emergence of more nuanced and considered public views, likely because they enable participants to become more informed about the topic and reflect on and discuss it with others, whereas quantitative methods generally uncovered views that were based on publics’ pre-existing knowledge and spontaneous attitudes towards the topic.

Findings from public engagement and studies on public views: themes

1. Low public awareness, ‘gut reactions,’ and changing perceptions

Many studies found that public awareness and understanding of data in general, and of private sector use of public sector data in particular, is low. For example, TNS (2012) and Ipsos MORI (2016) found that there is a limited understanding of how data is used in and shared beyond the public sector and what constitutes ‘data’ in the first place, including confusion between data and concepts like ‘information’ and ‘knowledge.’ Several studies found low awareness of the kinds of data sharing that already occur between the public and private sectors and about the role that private sector companies play in public sector service delivery. Hopkins, Kinsella, and Hopkins Van Mil (2020), Castell, Robinson, and Ashford (2018), and Ipsos MORI (2016), for example, found that most members of general publics are unaware of existing partnerships and data sharing between the NHS and private sector organisations, and when they are aware of it, they tend to underestimate the extent to which cross-sector data sharing already occurs.

Several studies also found that when members of publics are provided with more information about and opportunities to reflect more deeply on private sector uses of public sector data, many – albeit not all – change their minds about the topic, generally towards more positive views. Tully et al (2019), for example, found that over half of their study participants changed their views from feeling negative to feeling positive about private sector use of public sector data after deliberating on and becoming more informed about it, especially about the safeguards that are in place, while Street et al (2021) found that many of their study participants shifted from complete opposition to recognition of the value of cross-sectoral data sharing. It should be noted, however, that while the shift in perspectives was generally found to be from negative to positive, some, including Chico, Hunn, and Taylor (2019), also found that a small minority change their minds in the opposite direction.

Several studies also suggest that there is a link between low awareness and understanding of private sector use of public sector data on the one hand and negative perceptions of private sector access to public sector data on the other. Ipsos MORI (2016), for example, found that lack of awareness of the current state of affairs led to concerns over private sector access to public sector data and to negative ‘knee jerk reactions’ towards it, while participants who changed their minds over the course of Chico, Hunn, and Taylor’s (2019) study from negative to positive reported that their initial views indicated their spontaneous ‘gut reaction’ towards the topic. Their views changed to become more accepting after they became more informed. This is an important caveat for studies that report findings about the acceptability of private sector access to public sector data, especially when these studies report public views at the ‘baseline,’ where participants were not provided with further information or opportunities to reflect on the topic before their views were collected.

2. Acceptability

Several studies found that the spontaneous views of members of publics are generally negative towards private sector access to public sector data, and significantly higher numbers of people find it either unacceptable or undesirable than acceptable when their initial views are collected. This is linked to public perceptions about the rationales or motives that drive private sector organisations, which tend to be perceived as connected to profit making. For example, Rand Europe (2015) found that an overall pattern in their large cross-European survey was that Europeans are averse to the idea of private sector actors accessing their health data, while Davidson et al (2013) found that in the UK, there is strong spontaneous opposition to public data being used by the private sector for the purpose of profit maximisation. Hill et al (2013), among others, similarly found that when it comes to using data for research, there is a dichotomy between uses that are seen as acceptable versus uses that are seen as unacceptable, and this is based on perceptions concerning whether the research is for public benefit or for profit, respectively; these are seen as clashing motives.

All studies that considered public perceptions of acceptability beyond spontaneous reactions found that when publics’ views are unpacked in more depth, they become more nuanced than simple aversion or opposition. Publics actually tend to find private sector use of public sector data acceptable under certain conditions, depending on the rationales for the use, the type of the data used, and the type of the private sector organisation using it. Illustrative of this is Ipsos MORI’s (2016) finding that participants in their study applied four key tests to decide how acceptable any given case of health data use is, with acceptability depending on why the data is being used, who is using it, what kind of data it is, and how it is used. They found that acceptability is assessed on a scale where uses of aggregate data by public health providers for public benefit in secure and regulated environments are most acceptable, while uses of identifiable personal data by private organisations with no link to improving public health for purely private benefit in environments with poor security or regulatory oversight are least acceptable.

Relatedly, several studies found that the ‘why’ question – the rationales and motives for private sector use of data – was central. Generally, if the rationales and motives were directed towards public benefit, private sector use of public sector data was considered acceptable overall, and if the rationales and motives were perceived to be profit driven, it was considered less acceptable or unacceptable. Indeed, while Ipsos MORI (2016) found that their participants applied the above-mentioned four key tests to decide how acceptable any case of data use is, they also found that the overall purpose was the key consideration: if the overall purpose for which data is used is considered acceptable, concerns over the involvement of private sector companies fade. This was supported by many other studies: Tully et al (2018) and Grande et al (2013), for example, found that the reasons why data is being used matter more than who is using it. Publics care about the user, but less than they care about the use and rationales for use.

Nonetheless, some studies found hierarchies of acceptability around the types of data that were being accessed and the type of organisation accessing it. When it comes to data type, anonymised and passively collected aggregate data is generally seen as most acceptable, and identifiable, personal, and sensitive data as least acceptable. For example, Davidson et al (2013) found that postcode data, sexual orientation data, and commercial data were particularly sensitive, while Hopkins Van Mil (2021) found that genomics and mental health data as well as qualitative data were seen as especially sensitive because they were considered personal in nature and because more care is needed to interpret qualitative data. When it comes to the acceptability of different private sector organisations accessing public data, the hierarchy of acceptability is shaped by perceptions of the organisations’ motives and ability to deliver public benefits. Studies that found a hierarchy of organisations consistently found that insurance companies were at the bottom, with marketing companies a close second from bottom. Hill et al (2013), for example, found that their study participants had concerns that insurance companies using public data would affect premiums or cover. Ipsos MORI (2016, 2021) similarly found that their participants opposed access being granted to insurance companies in particular, and that insurance and marketing companies were perceived as detrimental to individual and public interests. Because of this, they were seen as a red line, although some considered targeted marketing of health products to be potentially acceptable in cases where it might serve public interests in some way. Use of public sector data by internet service, social media and retail companies was found less acceptable than use by pharmaceutical companies. Davidson et al (2013), for example, found that while internet service providers and social media companies were associated with online fraud, the majority of their participants felt that pharmaceutical companies contribute to understanding of health and disease and the development of drugs and treatments, and thus should be able to access data, even though there were concerns over profit making. Atkin et al (2021) additionally found that across different kinds of private organisations, participants in their study were likely to see healthcare-focused companies’ uses of data as more acceptable than those of non-healthcare-focused companies.

Overall, there were different levels of acceptability for private sector use of public sector data that were generally determined by perceptions of what the rationale for data use is and who profits or benefits from the use, where public benefit was the primary driver of acceptability and commercial gain or private profit was the driver of unacceptability. Uses that were perceived as oppositional or harmful to public benefit (such as insurance uses) were consistently found to be a red line that should not be crossed. It should be noted, however, that even if publics generally support conditional private sector access to public sector data when it is perceived to carry public benefit, Ipsos MORI (2016), among others, found that there remains a segment of the population that will find it hard to accept whatever the conditions or circumstances are.

3. Public benefit

Public benefit (for which terms like public interest and social benefit were also sometimes used) was the most prominent theme across the included documents and, as highlighted above, a key condition of acceptability for private sector uses of public sector data. Overall, there is widespread public support for private sector uses of the data if these uses can be convincingly demonstrated to deliver or have high potential to deliver public benefit, and this was generally the case even if these uses also carry private or commercial benefit as long as public benefit outweighs private benefit.

TNS (2012), for example, found that public interest arguments generally trumped all other concerns, while Tully et al (2018, 2019) found not only that public benefit is a key condition of acceptability for private sector uses of public sector data but also that most of their participants accepted commercial gain if it accrued secondary to public benefit. Indeed, across different studies, a key underlying reason for opposition to private sector use of public sector data was the concern that it would be motivated by private or commercial rather than public interest: these were seen as incompatible. Davidson et al (2013), among others, found that their participants opposed private sector involvement when they perceived it as profit driven or did not trust private sector organisations to act in the public interest, while Tully et al (2018) found that the reasons their participants gave for why organisations should not have access centrally included failure to clearly indicate that data use would principally be for public benefit rather than private gain. Some studies also found that the potential for public benefit from any proposed private sector use of public sector data must be convincingly demonstrated rather than merely speculated about. Connected Health Cities (2017), for example, found that participants in their study required that public benefit as opposed to commercial gain was satisfactorily demonstrated, and generally did not support proposals of private sector use when there remained doubts about whether public benefit would be realised, while Chico, Hunn, and Taylor (2019) found that a major consideration was whether there was evidence that any alleged public benefit would materialise.

What, precisely, ‘public benefit’ is taken to entail is, however, variable and contested. Several definitions of the concept were proposed by participants across the different studies, and different groups of participants placed emphasis on different kinds of benefit. In some studies, such as that undertaken by Hopkins Van Mil (2021), public benefit was defined expansively to include both direct and indirect benefits, such as improving health and social care outcomes and knowledge development. Participants in the study undertaken by Aitken et al (2018) used wide-ranging conceptualisations of public benefit but considered that it should be as broad as possible, including benefit to individuals, specific groups, and society as a whole. Ipsos MORI (2016) similarly found examples of public benefit to include a wide range of areas and activities, from developing new drugs and treatments, to improvements in care, to higher levels of service and greater access to services for vulnerable groups.

Several studies conducted in the UK specifically identified that private sector access to or use of NHS data should bring both outcome based and financial benefits directly to the NHS. Tully et al (2019) and Connected Health Cities (2017) both found that their participants wanted to see demonstrable potential for improvements in areas such as cost savings and targeted use of resources as well as treatments, while participants in the study undertaken by Hopkins, Kinsella, and Hopkins Van Mil (2020) wanted the NHS to reap the benefits from any private sector partnership through improvements in patient care as well as resource and administrative efficiency. Chico, Hunn, and Taylor’s (2019) participants, on the other hand, identified two main ways in which the NHS should benefit: any products or services developed using NHS data should be made available to the NHS at a preferential rate, and the NHS should have unlimited access to any knowledge or insight arising out of the use of NHS data.

There was some ambivalence around the relationship between public benefit and individual benefit, with some studies subsuming individual benefit under public benefit while others considered it separate, and some finding that public benefit outweighs individual benefit and others finding that it may not do so. Chico, Hunn, and Taylor’s (2019) participants could perceive data use to be of public benefit even where they could not see clear and direct benefits to individuals, while Hopkins, Kinsella, and Hopkins Van Mil’s (2020) participants saw a clear link between benefits for individual patients and wider societal benefit, and Ipsos MORI (2016) found that individual benefit was seen as valuable but less so than wider societal benefit. Participants in Castell, Robinson, and Ashford’s (2018) study found it difficult to assess the idea that uses of personally identifiable patient data may create wider public benefits without benefitting the individuals whose data is used, but Davidson et al (2013) found that whether benefits should accrue to individuals or wider publics depended on the type of data. They found that uses of data that require individuals’ proactive participation (e.g. research involving genetic data) should directly benefit individuals whose data is used, whereas uses of routinely collected aggregate data need not benefit individuals directly but should be of wider public benefit.

While the conceptualisations of ‘public benefit’ and what it should encompass were variable, the general lesson across the included studies is that demonstrable public benefit, broadly conceived, is either one of the most important considerations or the most important consideration that publics have around private sector use of public sector data. This theme overlaps with but can be distinguished from another prevalent theme, that of benefit-sharing and distribution.

4. Benefit-sharing and distribution

Beyond the general demand that private sector use of public sector data should deliver or have demonstrable high potential to deliver public benefit, a related but separate theme across the included studies pertained to the question of how benefits should be shared or distributed to enable public benefit to be realised most appropriately or equitably. Considerations included fair distribution of benefits, profit sharing and reinvestment, and, for private sector uses of NHS data in the UK, the question of whether the NHS should charge for data access.

Participants in several studies expressed the view that benefits should be distributed fairly across the population. For example, studies undertaken by Understanding Patient Data and Ada Lovelace Institute (2020) and by Hopkins, Kinsella, and Hopkins Van Mil (2020) found that participants generally felt that fair distribution entailed an even distribution of outcomes, benefits and rewards across the country, at least in the longer term, even if short term benefits would first be realised at local or regional levels. Hopkins, Kinsella, and Hopkins Van Mil’s (2020) participants also considered that any benefits that accrue from the use of NHS data should be fairly distributed across the NHS, but also that in case of partnerships between private and public organisations, there should be mutual benefit for all parties, including for the private sector organisations involved.

Davidson et al (2013) additionally found that their participants considered benefit-sharing to be more relevant and important for private sector access to public sector data than for other kinds of public data use because of the prevailing perception that private sector organisations were especially motivated by profit, and thus benefit-sharing models are particularly needed to ensure public benefit. Some studies also specifically considered questions around profit sharing and reinvestment, including whether public sector organisations should charge for access to their data. Davidson et al (2013) and Ipsos MORI (2021) found support for the idea that any products or services developed using public sector data and especially NHS data should be provided at a lower cost for the NHS. Ipsos MORI’s (2021) participants generally felt that it was important for the NHS to receive a return on investment and recover costs of providing and maintaining data, while participants in the Davidson et al (2013) study considered that since the public sector invests considerable funds in data collection, it should recoup some of these funds from those wishing to use the data. They also considered that the cost could vary depending on the type of the organisation, and third sector and other public organisations could be given data for free or at a lower cost than private sector companies, who were presumed to generally make more profit out of data use.

Members of publics generally felt that benefits from private sector use of public sector data should be equitably shared and returned to publics and patients.

5. Trust and distrust

The extent to which publics trust and distrust different types of organisations across the private, public and third sectors, and why, were prevalent questions across the included studies, and many found that trust in and the perceived trustworthiness of the organisations involved with the data are key considerations for publics.

As noted above, several studies found that the spontaneous views of members of publics towards private sector companies accessing public sector data tend to be negative. This is likely connected with many studies finding that publics tend to express spontaneous distrust towards private sector companies especially because of worries over commercial motives but also because of concerns over security and unethical practices. Ipsos MORI (2016), for example, found that the private sector in general was distrusted and participants were concerned about conflicting commercial priorities and about the ability of private organisations to store data safely. Tully et al (2019) similarly found that many of their participants were uncomfortable with commercial access to public sector data because they did not trust private companies’ motives or ‘agenda,’ because the companies did not have a track record of protecting data, or because they worried that companies might use data to manipulate or exploit individuals or populations in ways that support their own agenda. The question of trust was also connected with the above discussed hierarchy of acceptability around different private sector organisations accessing public data, with Krahea, Milligan, and Reilly (2019) and Robinson and Dolk (2016) finding that organisations whose access to data was considered least acceptable were the same organisations that were least trusted and perceived to be the most untrustworthy, namely, insurance companies. Chico, Hunn, and Taylor (2019) found that whether private sector organisations were perceived as trustworthy had two elements: members of publics needed to be able to trust that the organisation was committed to public benefit and that the organisation was able to protect the data they were using.

However, several studies also found that members of publics became more positive about initiatives involving private sector uses of public sector data (regardless of the type of the private organisation) if public sector organisations were involved as key partners in these initiatives and especially if public sector organisations retained control over the partnership and data use. This is linked to higher levels of public trust in public sector organisations, which were generally perceived to be working for public benefit and were consequently perceived as more trustworthy than private sector organisations. In the UK, publics placed an especially high level of trust in the NHS, and if the NHS was involved as a partner and retained control over the data and its uses, private sector involvement was perceived more positively. For example, Atkin et al (2021) found that the NHS was consistently identified as the most trusted entity to make decisions about data use, while Tully et al (2019) and Chico, Hunn, and Taylor (2019) found that when the NHS was closely involved, publics had greater trust in the ability of an initiative to deliver public benefit despite private sector involvement, including because the private sector companies would need to accept and adopt NHS principles. Castell, Robinson, and Ashford (2018), among others, also found that participants believed that the NHS owns health data and is ultimately responsible for both the data itself and how it is used, and while trust in the different organisations accessing the data was important, participants believed that the NHS would protect public interest and ensure no harm comes to individuals. They noted that publics generally trust the NHS unequivocally to protect their data, and because of this, the NHS bears the weight of expectations when it comes to private sector uses of it.

While similar patterns of trust and distrust were evident in other countries as well, notable exceptions were some studies conducted in the US where participants were generally found to have lower levels of trust in the government. Brown Trinidad et al (2010), for example, found that their US based participants expressed widespread distrust in federal agencies and the government in general, while a comparative study undertaken by Ghafur et al (2020) found that distrust in US agencies was significantly higher than in UK agencies, and that this is likely linked with concerns that data might not be protected from commercial end-use within the largely privatised US health system. In ways that may be related to this, Chico, Hunn, and Taylor (2019) found that their participants in the UK expressed lower levels of trust in US based private companies than in private companies based in the UK or Europe, because they did not feel confident that their data would be given the same level of protection in the US.

Overall, the questions around trust, distrust, and trustworthiness were key considerations across different studies, and whether an organisation is trusted or perceived as trustworthy was centrally linked to perceptions of whether the organisation is acting in the public interest or for private benefit. In the UK, there is a high level of trust in public sector organisations, and in the NHS in particular, and private sector access to data is perceived more positively, especially if the NHS oversees and retains control over the process, largely because the NHS is seen as driven by public benefit and because it is trusted to act in the public interest.

6. Oversight, governance, and safeguards

Across the included studies, oversight, governance and safeguards around public sector data in general, and private sector access to and use of public sector data in particular, were a key recurring consideration. This encompassed questions around the different mechanisms that could or should be implemented to govern data use and who does or should have control over data sharing and partnerships; questions around transparency, accountability, confidentiality and consent processes; and questions around safety and security arrangements.

Participants across different studies wanted there to be a governing body or an oversight committee responsible for overseeing how public sector data is shared and used outside the public sector. Understanding Patient Data and Ada Lovelace Institute’s (2020) and Hopkins, Kinsella, and Van Mil Hopkins’s (2020) participants wanted a single governing body to be held accountable for NHS data sharing, which would undertake proactive, reactive, and monitoring activities including establishing a governance framework, taking regulatory action when needed, and auditing and reporting. Hopkins, Kinsella, and Van Mil Hopkins (2020) also found that participants wanted this body to be governed by both legal requirements and guiding principles that are binding and subject to ongoing review.

There were, however, some differences in views on what the oversight or governing body should look like and who it should encompass, with some studies finding support for an independent oversight body with diverse representatives and others finding support for public sector bodies to take responsibility for oversight. For example, Hopkins, Kinsella, and Van Mil Hopkins’ (2020) participants wanted non-hierarchical representation from the NHS, academia, charities and industry, and members of communities and publics, and Davidson et al (2013) found strong support for the establishment of an oversight body that would comprise a range of different stakeholders. Aitken et al (2018), however, found that their participants were divided on whether data linkage processes should be overseen by the relevant public service or by another (non-governmental) independent body, while some studies in the UK found that publics want the NHS to retain oversight and control over health data. For example, Castell, Robinson, and Ashford (2018) found that their participants were relatively uninterested in questions around safeguards and governance around health data, but this was because they devolved this responsibility to the NHS, reflecting their high level of trust in the NHS as an organisation presumed to act in the public interest. Their participants presumed that the NHS should retain control and ownership of the data throughout the process as well as decide and ensure adherence to rules of data use.

Several studies found that participants considered transparency and accountability around private sector use of public sector data important. For example, Aitken et al (2016) found that publics’ perceptions of different organisations’ level of transparency and accountability were crucial for shaping their views, while Hopkins, Kinsella, and Van Mil Hopkins’ (2020) participants considered that transparency and accountability to publics should be prerequisites for all NHS data sharing partnerships. They saw this to include a public register of organisations using NHS data and transparency around the aims, process, results, and benefits (financial and otherwise) that all partners would receive. Similarly, Davidson et al (2013) found that transparency was seen as key to mitigating the possibility of data misuses and instilling public confidence, but also that their participants wanted mechanisms in place that would enable publics to hold data users accountable and monitor or challenge instances of data use, including the extent to which they are contributing to public benefit, the level of profit to private companies, and whether they are achieving their intended outcomes. More generally, Hopkins Van Mil (2021) found that transparency is not an ‘add-on’ to but rather a part of realising public benefit, including because it is required to demonstrate how public benefit is or will be achieved.

Some studies also raised issues around safety and security arrangements as well as around consent, confidentiality or privacy. Davidson et al (2013) and Hopkins, Kinsella, and Van Mil Hopkins (2020) found that concerns about data security and privacy featured prominently in their studies: participants worried over the risk of data falling into the ‘wrong hands,’ and generally wanted reassurance that security and privacy would be safeguarded. Ipsos MORI’s (2016) participants, on the other hand, generally felt that no amount of security could fully remove the risks involved in data sharing (such as leaks and hacking) and the involvement of private sector organisations intensified this concern because it was perceived to entail that public sector data is moving beyond the public systems that are or should be accountable for it. Ipsos MORI (2016), among others, also found nuances and complexities around publics’ views on the issue of consent: while a majority of their participants would prefer that consent was sought whenever their NHS data is shared with private sector organisations, they also found that participants shifted their views when they had more information, especially when they were presented with projects that they wanted to happen but for which it would be impractical to seek consent from everyone whose data is used. Taylor and Taylor (2014) relatedly found that their participants were willing to accept consent models other than those which they may ideally prefer, including consent models with a lower level of individual control over how their data is used, if there are other safeguard arrangements in place to govern how the data can be used. These included provisions such as security, controls over detrimental use and commercialisation, transparency, the existence of an independent ‘watchdog’ body, and confidentiality protections.
Indeed, confidentiality was identified as particularly significant in the context of private sector access in some studies, including by Aitken et al (2016) who found that confidentiality and anonymity were seen as having a higher level of importance when data might be accessed by private companies.

Overall, Ipsos MORI (2016) found that having safeguards in place makes a difference to publics’ views on private sector use of public sector data, but there is no clear preference on the precise nature of what the safeguards should be – rather, the nature of the safeguards is less important than knowing that safeguards exist in the first place. This is to some extent reflected across the other studies as well, which generally found that publics want to see stringent oversight, governance, and safeguard arrangements, especially concerning an oversight or governance body, transparency and accountability processes, and arrangements for data security and safety, consent, and confidentiality. However, the precise nature of what these should look like varied across the different studies.

7. Public involvement and engagement

Many studies identified a need for further public engagement and involvement around data use in general and private sector access to public sector data in particular. This partially overlaps with but can be distinguished from the theme of oversight, governance and safeguards, as public involvement and engagement was often identified as a component of oversight or transparency, but it was also raised as a separate element that should be considered.

Several studies identified a need to involve publics in decision making about how their data is used, but there were different ideas about the nature and extent of this involvement and what it should look like in practice. For example, Davidson et al (2013) found unanimous agreement that there should be public involvement in decision making on data sharing, including how benefit-sharing models should be developed and in setting the rules for how data is governed. They found, however, that participants favoured feedback and consultation models over more active forms of engagement, such as participation in agenda-setting and representation, partly because the participants felt that publics may not have the specific skills required to be involved in an oversight body. Yet, they also found that views on this varied and participants often struggled to articulate who should be involved and what involvement should look like. Hopkins, Kinsella, and Van Mil Hopkins (2020) and Understanding Patient Data and Ada Lovelace Institute (2020) on the other hand found that their participants strongly supported active forms of public involvement in decision making about data use. Hopkins, Kinsella, and Van Mil Hopkins (2020) reported a desire for public involvement at various levels, including policy and practice and the establishment and management of partnerships, via forms such as Citizen Juries and deliberation activities for key decisions, public votes for partnerships, and representation in governance boards. Understanding Patient Data and Ada Lovelace Institute (2020) highlighted however that some participants also felt that proportionality matters, and they wanted to ensure that public involvement did not result in overly complex bureaucracy that might hinder positive developments. 
There was a suggestion that public involvement could focus on ‘grey area’ cases of data use that attract a diversity of views and perspectives, whereas involvement may not be needed for broadly acceptable cases, where there is clear public benefit and low risks, or for broadly unacceptable ‘red line’ cases, such as cases involving insurance or marketing uses or clear commercial exploitation of data.

Generally, while several studies identified a need for public involvement or engagement, the precise nature of what this should look like, who should be involved and in what ways was contested.

8. Impact of demographic differences

Finally, many studies found that demographic differences had an impact on public perceptions and views on private sector use of public sector data, and several different demographic variables were found to be significant. Ipsos MORI (2016), among others, found that many interconnected factors including educational attainment, age, socioeconomic status as well as existing knowledge and awareness of data uses are linked to the extent to which individuals accept private sector access to public sector data. Other studies found links especially pertaining to age, socioeconomic and employment status, education and work sector, gender and ethnicity, and the status of having a long term health condition.

For example, Atkin et al (2021), Castell, Robinson, and Ashford (2018) and Chico, Hunn, and Taylor (2019) found differences linked to age and life stage. They found that generally younger people tended to be more accepting of data sharing and use by private sector organisations than older adults, while Aitken et al (2018) found that older participants were most concerned with how profits are managed. Ipsos MORI (2016), however, also found that the relationship between age and acceptance of private sector use of data is complex and non-linear, and acceptance could be seen as being most shaped by individuals’ mindset concerning issues such as risks and benefits of data use. Castell, Robinson, and Ashford (2018), among others, found that socioeconomic and employment status and work sector impacted participants’ views, while Aitken et al (2018) found that those who are not in full time work were more concerned with oversight and type of data used whereas those in full time work were less concerned with oversight and more with the purpose of data use. Aitken et al (2018) also found that men were more concerned about oversight arrangements in place than women, while a comparative study across the UK, US, and Australia undertaken by Milne et al (2019) on genomic data sharing arrangements found that men were more likely than women to have high overall trust across different kinds of organisations involved in genomic data sharing, including private sector organisations. In the US, Grande et al (2013) found that African American and Hispanic participants were less willing to share their health information for private sector (especially marketing) uses than white participants, while Castell, Robinson, and Ashford (2018), among others, found that people with long term health conditions and a patient status tended to be more positive and open to data sharing than general publics.
Chico, Hunn, and Taylor (2019), on the other hand, found that patients were less likely to support scenarios involving targeted marketing of health products and pharmaceutical companies accessing public sector health data than members of general publics.

Overall, these differences show that individuals’ social positions and life experiences shape their perceptions of and attitudes towards private sector uses of public sector data, but the ways in which they do so are complex and not always consistent across different studies. This highlights that publics do not have a homogeneous perspective or shared understanding of this topic, and while patterns can be identified across several demographic variables, there are likely to be complicated intersections across the different variables.


Email: christopher.bergin@gov.scot
