A Health and Biomedical Informatics Research Strategy for Scotland

This strategy sets out key areas for action and specific recommendations from the Health Informatics Research Advisory Group (HIRAG) on how Scotland should respond to the opportunities and challenges around the secure use of routinely collected patient data for research.

3 Moving Forward: What is needed now?

3.1 Optimising the infrastructure for health informatics research

A flexible federated network of accredited safe havens

Safe havens are being established within NHS NSS (the National Safe Haven) and within the four lead regional NHS Research Scotland nodes, using similar data and security architectures and operating according to SHIP information governance and data sharing principles. This should allow them to function as a federated network in which each node develops particular resources, datasets, skills or analytical methods, with the potential to provide site-specific services across the Safe Haven Network or to approved external researchers. The governance standards applied to the network of safe havens must reassure data controllers that data can be safely shared between nodes of the federated network, in order to facilitate access to data for researchers from external organisations. Safe havens are therefore distinct from the data repositories used by Health Boards to manage and analyse their own data (although they may be involved in that activity on behalf of their Health Board). Work is already in hand to develop a Safe Haven Charter that will underpin the establishment and operation of such a network. Independent accreditation of safe havens would give data controllers, patients and the wider public additional confidence in how safe havens operate. The Scottish Government, through the eHealth Directorate and CSO, should play a lead role in establishing independent accreditation standards for safe havens, and in establishing systems for monitoring compliance. In addition to the accreditation of safe havens (safe places), national safe researcher (safe people) training is being established, providing further standardisation and controls.

The National Safe Haven is best placed to take the lead when the research requires the linkage of national datasets with supplementary data from multiple Health Boards or other public bodies. The node safe havens may be better placed to handle datasets where the majority of the data are derived from a local source. Specific projects could be led from any node in the network, with the direction of data flows reflecting volume requirements, specialised expertise and capability, and available capacity to take on new projects. However, the handling and processing of the data will require the express agreement of the Data Controller for the source data. Appropriate data and income sharing arrangements, and transparent mechanisms to ensure data security, will be needed to underpin relationships within the network. The key is to find a balance between centralisation and standardisation on the one hand, and the potential for each safe haven to develop and innovate on the other.

The services provided by safe havens encompass a range of activities such as creating research datasets abstracted from clinical databases, adding capability to enhance the visualisation and analysis of data, or promoting interoperability between analogous and heterologous data sources. Due to the incremental way in which hospital clinical information systems have developed, work will be needed to make analogous data held in different node safe havens interoperable. This will require an understanding of the underlying data structures and how the data relate to real world clinical information and current knowledge. It will also require close interaction between data analysts and domain experts (clinical specialists, medical physicists, and other colleagues).

Health data within NHS Scotland range from data held within national datasets (e.g. on hospital admissions, discharges, cancer registration, prescriptions) to the much more detailed biomedical and clinical data recorded in various formats in distributed hospital clinical information systems and bespoke research datasets. Appreciating the range of data that may be available for research, and how to access it, can be challenging for researchers. The Electronic Data Research and Innovation Service (eDRIS) (Box 5) aims to provide a single point of entry to the wide range of national datasets that may be accessible for research, and to assist researchers in study design, obtaining approvals and performing linkages. As well as increasing overall capacity, the node safe havens can potentially add specific analytical, programming or clinical expertise and access to specialist regional datasets.

A potential drawback of the safe haven approach is that by attempting to guarantee data security, it distances the researcher from the data themselves. Researchers are used to holding copies of linked datasets in restricted access areas of university servers. Systems such as the one used by the Scottish Longitudinal Study (SLS), where researchers specify analyses which are implemented within a safe haven but cannot remove data from the safe haven, are still the exception. Arrangements for permitting remote access, or for enabling researchers to use data within the safe haven, will have to be flexible, efficient and cost effective to win the confidence of researchers. Safe havens must be able to provide excellent metadata so that, if researchers cannot see the raw data, they can at least understand how the data were processed and cleaned, how analytical variables were derived, and which methods were used.

A common complaint from both industry and university-based researchers is that there appears to be a multiplicity of entry points, and a confusing array of data providers, governance bodies and permissions processes.

eDRIS (Box 5), established to support researchers and facilitate access to national datasets, but also working in conjunction and collaboration with other safe havens within the federated network of accredited safe havens, can potentially offer a single point of entry for researchers wishing to access data. However, there are a number of technical challenges to be overcome to achieve interoperability between safe havens. A particular challenge is to define and agree the practical details of how a network of safe havens should operate. Some basic issues of principle must also be settled, such as whether there should be a single point of entry for each study (which could differ from study to study), or one point for all studies, and whether the network will include only the NHS NSS and NRS node safe havens. The advantages of a single point of entry need to be balanced against the need for local knowledge and expertise to maintain inventories of datasets, generate metadata, support local researchers and to liaise with data providers. It is important that sufficient incentives are in place for the node safe havens to drive up the quality of the data they manage and develop specialist expertise.

RECOMMENDATION 1: Establish a Charter to set out the principles, and address at a high level the practical, technical and governance challenges that need to be overcome to establish a strong and efficient federated network of safe havens, and to provide a basis for the development of an accreditation framework for NHS safe havens in Scotland. Future funding of NHS partners including National Services Scotland (NSS) and the NHS Research Scotland (NRS) nodal safe havens in Aberdeen, Dundee, Edinburgh and Glasgow should be conditional on their agreement to the Charter, and ability to fulfil the standards it specifies. Safe havens may be established within NHS Boards other than the NRS nodes, or to support specific projects, and consideration should be given to whether they should also be able to join the network.

Proportionate Governance

Robust, efficient and proportionate governance to mitigate risks to patient confidentiality and privacy is a key component in promoting public trust and maintaining public confidence in the use of patient data by the NHS and researchers. Loss of public trust would undermine the use, and reduce the completeness and availability, of health data for all purposes, whether healthcare delivery, patient safety or research. Data controllers, processors and researchers each have a responsibility for Information Governance.

The Academy of Medical Sciences investigated the regulations and governance of health research. Although its report was primarily focussed on systems in England, it made observations that could be applied to Scotland, including duplication in ethics, Research and Development (R&D) and Information Governance approvals. It proposed that these systems be streamlined and simplified[29].

A more recent review of information governance, again on behalf of the Department of Health in England but with relevance to Scotland (Box 9), referred to 'a growing perception that information governance was being cited as an impediment to sharing information, even when sharing would have been in the patient's best interests', and acknowledged researchers' concerns about the 'complexity, confusion and lack of consistency in the interpretation of the requirements they have to satisfy before research projects can proceed'[19].

Box 9: The Caldicott 2 Review

In response to concerns that the legislative and regulatory environment was impeding appropriate sharing of information in the interest of the patient and the wider public, the Secretary of State for Health in England commissioned Dame Fiona Caldicott to review the balance between protecting and sharing information[19].

The review was conducted against the background of the UK Government's Information strategy, the Open Data White Paper, and the Health and Social Care Act 2012, which created a new legal basis for NHS bodies in England to share confidential information with the Health and Social Care Information Centre (HSCIC). The report, published in April 2013, endorsed the approach of the new NHS constitution in England to facilitate patients' access to their medical records, and give them the right to opt out of sharing personal information beyond their care pathway, including with HSCIC. It acknowledged that protecting confidentiality should be balanced against the benefits of exploiting electronic patient data for research and statistical purposes.

The report upheld the existing Caldicott principles and recommended that a seventh principle be added to the six set out in the original Caldicott Report, published in 1997[30]. The new principle states that "the duty to share information can be as important as the duty to protect patient confidentiality".

Outline Information Governance Principles from the Caldicott Reviews:

1. Justify the purpose(s)

2. Don't use personal confidential data unless it is absolutely necessary

3. Use the minimum necessary personal confidential data

4. Access to personal confidential data should be on a strict need-to-know basis

5. Everyone with access to personal confidential data should be aware of their responsibilities

6. Comply with the law

7. The duty to share information can be as important as the duty to protect patient confidentiality

Requests to use data held by NHS Boards in Scotland are reviewed by Caldicott Guardians appointed by each Board. Caldicott Guardians, who combine their information governance role with other significant responsibilities within their Boards, have had to cope with a sharply increasing volume of requests for national or cross-Board datasets in recent years. A National Caldicott Scrutiny Panel process was established in 2010 to streamline Caldicott review procedures through the use of a common application form, and the implementation of a system whereby applications to use data from multiple Boards, or projects judged to have national implications, are referred to the national panel. Any system of proportionate governance requires a commitment of time from, and the complementary skills of, information governance experts and Caldicott Guardians. Information governance leads within the Scottish Government (who currently provide the secretariat for the panel) have taken on some of the burden of risk assessments and administration.

The panel works to a target of dealing with requests within 20 working days. However, data suggests that obtaining permission from Caldicott Guardians still takes markedly longer than other NHS R&D decision-making processes, where performance is closely monitored with routine publication of performance statistics[31]. Despite the changes, obtaining permission from Caldicott Guardians and other approval bodies to use non-consented patient data for research can still be a lengthy and frustrating process for researchers (Boxes 10 and 11), and access to NHS data for commercially funded studies is a source of particular difficulty.

Box 10: The challenges of governance

For studies involving linkage of a number of datasets, for example to enable long term follow-up of a patient cohort to assess levels of service use, the regulatory burden can be substantial because of the number of separate approvals required. In a recent CSO-funded study to assess the feasibility of following a cohort of patients from a single GP practice, it took nine months to obtain the full range of approvals.

In the course of the study, for which prior R&D and ethical approval had been granted, the researchers were obliged to approach four separate Caldicott Guardians (two concerned with national, two with Board level datasets), the CHI Advisory Group, the Practitioner Services Division of NHS NSS and the Local Medical Committee, to obtain approval to approach patients to seek consent for the use of their primary care data in the study.

In some cases, approval was granted quickly, but in others decisions took several weeks, and were only obtained after considerable effort spent by the researchers on chasing responses. A number of promising avenues were abandoned altogether. In several cases approval was only granted to approach patients in ways that ruled out further attempts to contact non-respondents.

The overall picture is of a cumbersome process, tilted towards restricting rather than facilitating access to data for research purposes, even for ethically approved studies. The researchers concluded that conducting a study like theirs on a large scale would be too expensive and time-consuming, largely because of the requirement to obtain consent from cohort members to use their primary care data, and that the creation of a wholly anonymous cohort (via eDRIS) should, where possible, be the preferred route (Information Governance Principle 2 from the Caldicott Review[19]).

A recent review by the panel found that one of the key reasons for delays is insufficient information from the applicant in key areas required to scrutinise information governance (e.g. no clarity on how and where the data will be transferred, stored and deleted; absence of information on technical and personnel security; absence of information on local records management policy, deletion and back-up). Thus, as well as Government and the NHS ensuring proportionate governance is in place, researchers themselves can contribute to speedier processing by ensuring that applications are fully completed with clear and up-to-date information about their own institutions' information governance arrangements.

Box 11: Overlap and duplication in the governance process

Revitalisation of historic cohorts is one of the most exciting uses of linked health records for epidemiology, with potential to provide unique insights into early life influences on health over the whole life-course. In one such study, researchers planned to link information on intelligence tests conducted in the late 1940s to the Scottish Morbidity Record (and equivalent datasets in England and Wales) and to recontact a random subsample who had been selected to undergo more extensive testing during childhood and early adult life.

This is clearly an ambitious and highly sensitive study for which the highest standards of governance are required. It illustrates the challenges of addressing privacy concerns within such a complex, multi-stranded study and the close working with various bodies that feed into developing the study design to meet these concerns.

The researchers' experience illustrates a particular drawback with the current system of governance. In all, the researchers had to make a total of 16 applications or resubmissions to nine separate bodies (across England, Wales and Scotland) and submit over 250 supporting documents. Within Scotland alone, the researchers had to apply in turn for ethical, NHS R&D and Privacy Advisory Committee approval, in addition to securing research passports for university-based researchers to work within the NHS.

Up to 30 supporting documents had to be submitted with some of these applications, and as separate approvals were required for the same documents, amendments required by one body required resubmission of the documents that had already been approved by others.

The researchers found each individual body to be helpful and efficient. All their aims were eventually approved and the study is progressing well. But, in a process like this, with such a high degree of overlap between the responsibilities of the different bodies, delays are almost inevitable even if each body reaches decisions quickly.

Further work is underway to simplify the national governance structure and have a single information governance process with the creation of a Public Benefits and Privacy Panel for Health and Social Care (PBPP). This panel has the support of Chief Executives of NHSS, Health Board Caldicott Guardians and Chief Executives (who are the Data Controllers as defined by the Data Protection Act 1998). The panel will have delegated authority from Health Board Chief Executives to scrutinise how national data is used. Given the variety of requests, the panel will operate a two-tier structure to triage projects, so that straightforward or non-contentious projects can be expedited. Information governance resources will be pooled so that personnel from each Health Board contribute to the new national governance structure, and the inclusion of lay and research representatives in the panel reduces the risk of misunderstanding with the public as to how data is used beyond direct care. The merging of three distinct advisory groups (NHS NSS Privacy Advisory Committee (PAC), National Caldicott Scrutiny Panel and CHI Advisory Group (CHIAG)) removes duplication in the information collected and in scrutiny, and adds consistency to decision making. This process is designed to cope with the growing workload and demand for cross-sector research or health/social care integration.

In support of streamlined information governance, the accreditation of safe havens should provide a means to demonstrate that robust controls and safeguards are in place[18],[19] to minimise the risk to confidentiality and privacy, addressing key areas of interest to the PBPP. Their use will reduce the need for further scrutiny by the Information Governance leads or Caldicott Guardians in these specific aspects, building on the proportionate governance work already developed by NSS PAC, where applications are filtered depending on the sensitivity of the data and the level of risk. Combined with this, training will help users understand the basic objectives of protecting privacy and confidentiality and help them design studies that meet these objectives[23]. There are a number of sources of training, but work is underway to establish a standardised national training programme.

A major frustration and source of delay with the current system is the requirement to submit the same information in the course of several different approvals (Box 11). An obvious way to address this would be a common application form for REC, NHS R&D and Caldicott approval. This could be implemented by incorporating the information required for Caldicott approval within IRAS, the Integrated Research Application System, which already captures the information needed for R&D, REC and a range of other approvals. However, this would only be possible on a UK-wide basis. Until that is possible, or deemed feasible, the information governance application form in Scotland needs to be simplified and streamlined. This should be combined with the development of clear guidance and a source of advice.

A potentially efficient model for deciding applications to access data held by safe havens would be for accredited safe havens to handle requests through a bespoke access committee or governance board linked to the safe haven, rather than the researcher applying to the safe haven for the data and seeking permission separately from a Caldicott Guardian. The model could involve the Caldicott Guardian sitting on the access committee, or delegating decision making to the access committee. A similar approach is already used by Research Tissue Banks, such as the UK Biobank and by the Secure Anonymised Information Linkage (SAIL) service in Wales[32]. SAIL operates a streamlined review process with a single decision point covering research ethics, Caldicott Guardians and privacy advisory committees. Caldicott Guardians sign data access agreements that give permission for anonymised data held within SAIL to be used for research purposes approved by the Information Governance Review Panel. The creation of the PBPP provides the additional streamlining and simplification of information governance scrutiny and would be in a position to inform the decision making of the safe haven along the lines of the Information Governance Review Panel for SAIL.

The NHS Scotland Caldicott Guardian manual[33] allows for delegation of decisions involving research projects to others in the NHS Board (a senior colleague or a defined post in the R&D Office). Responsibilities could be further clarified by considering whether the NHS Board should always remain the data controller after the data have been transferred to a data processor (e.g. a safe haven or a researcher), and therefore remain liable for any breaches in security that occur. Legal advice is being sought by the NSS Caldicott Guardian on the roles of Data Controllers and Data Processors in relation to the National Safe Haven.

For this system to work, Caldicott Guardians must have confidence in the access procedures and information security arrangements implemented by the safe havens, and these should be key considerations in the accreditation of safe havens (see Recommendation 1).

A federated network of safe havens, operating with clearly defined information governance procedures, has the potential to deliver substantial efficiencies, in the context of a streamlined system of Caldicott approval for the use of national and cross-Board datasets. It is also worth considering whether legislative changes could be used to deliver further improvements. Unlike in England, there is no legislation defining the status of accredited safe havens, but the review of the Patients' Rights Act, due in 2016, may provide an opportunity to make clear in law the status of the safe havens.

RECOMMENDATION 2: Remove duplication in the research governance process, and improve the speed and consistency of decision-making. This should be facilitated by the Safe Haven and accreditation framework. The success of any new structures in terms of streamlining decision-making should be closely monitored and national benchmarks and performance metrics established; further changes should be implemented if necessary to bring performance into line with other approvals processes.

Improve Provisioning of National datasets

New national datasets: General Practice, laboratory and imaging datasets

NHS NSS is currently developing a National GP primary care dataset (Scottish Primary Care Information Resource (SPIRE)) (See Box 2) that aims to provide an efficient means of providing information to support policy, planning and evaluation at national and NHS Board levels, and that also meets the needs of researchers. SPIRE is conducting an engagement strategy with GP groups to ensure strong endorsement of the project with concomitant agreement to provide a rich minimum dataset, generic consent for bespoke extracts and a high rate of opt-in, all of which will be essential to create a valuable research resource. Local safe havens may also contain additional sources of GP data linked to local enhanced service contracts, which may be made available for research subject to the agreement by local GP management groups.

NHS Research Scotland (NRS) node safe havens are working to provide local datasets from Board PACS systems and SCI store laboratory data for linkage to other datasets (see Box 12). These clinical datasets from Health Boards across Scotland would create an extremely valuable national research resource. Possible uses include supporting novel ways of matching patients with clinical trial protocols and more efficient surveillance of licensed medicines. Through the Farr Institute Scotland, there will be consultation with Health Boards on the creation of analysis-ready extracts of SCI store laboratory data and PACS images to enable linkage to existing national datasets for supporting NHS activity and research. The consultation will consider which organisation(s) should support and maintain these datasets. Health Board cooperation, leadership and collaboration will be essential, as Health Boards' laboratory data in SCI store has not been developed on common standards and will require local expertise to support migration and provide metadata. It is proposed that the dataset provider will supply node safe havens with local extracts of these data sources to maintain the capability to link these datasets to other locally-held datasets.

Box 12: Linking biochemistry, clinical care and demographic data to understand the epidemiology of chronic kidney disease

Not everyone with poor kidney function will go on to develop kidney disease requiring dialysis or transplantation. Understanding who will or will not progress is important for planning care for individual patients and their families, and for planning services for the population as a whole. Researchers at the University of Aberdeen and NHS Grampian created a cohort including all patients in Grampian on renal replacement therapy, all those with chronic kidney disease (i.e. impaired renal function for 3 months or more) and all those who had one test showing impaired function, plus samples of patients who had been tested but found to have adequate kidney function and people who had not undergone testing.

Following ethical, Caldicott Guardian and Privacy Advisory Committee approval, the biochemistry data was linked with hospitalisation and mortality data by ISD, and the dataset stored in the Grampian Safe Haven. The cohort, with a total size of ~70,000 participants, was followed for six years, from 2003 to 2009, to ascertain incidence and progression of kidney disease, other major health events requiring hospitalisation, and deaths.

Using the linked data, the researchers were able to identify factors associated with risk of death, heart attack and progression to more severe kidney disease, and to develop accurate models of need for renal replacement therapy.

Maximising the potential for research to enhance the delivery of healthcare will require greater convergence with the eHealth strategy. This should ensure that in the implementation of new clinical and administrative information systems, the ability to efficiently use and share routinely collected NHS data for research and other purposes is built into the design. This will require the specification of in-built system queries, the mapping of diverse standards to equivalent data definitions and automated workflows to ensure that data can be accessed quickly and reliably for research. Investment in initiatives led by clinical groups committed to building effective electronic patient records to drive quality improvement should also be encouraged. Such initiatives should seek to make data available for research, based on internationally agreed standards.
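As an illustration only, the mapping of local data standards to consistent data definitions could look something like the Python sketch below, which translates Board-specific laboratory test codes into a shared definition and a common unit. The Board names, local codes, common test name and the `to_common` helper are all hypothetical; they are not actual NHS Scotland codes or SCI store structures. The only factual input is the standard conversion of creatinine from mg/dL to µmol/L (a factor of 88.4).

```python
from dataclasses import dataclass

# Hypothetical lookup from (board, local test code) to a common definition.
# Each entry holds: common test name, common unit, and the factor that
# converts the locally reported value into the common unit.
LOCAL_TO_COMMON = {
    ("board_a", "CREA"): ("serum_creatinine", "umol/L", 1.0),
    # Assumed for illustration: this board reports creatinine in mg/dL.
    ("board_b", "CRE_S"): ("serum_creatinine", "umol/L", 88.4),
}


@dataclass(frozen=True)
class LabResult:
    board: str
    local_code: str
    value: float


@dataclass(frozen=True)
class CommonResult:
    test: str
    value: float
    unit: str
    source_board: str


def to_common(result: LabResult) -> CommonResult:
    """Translate a locally coded result into the common definition."""
    key = (result.board, result.local_code)
    if key not in LOCAL_TO_COMMON:
        # Unmapped codes are surfaced rather than silently dropped, so that
        # local experts can extend the mapping and its metadata.
        raise KeyError(f"No common definition for {key}")
    test, unit, factor = LOCAL_TO_COMMON[key]
    return CommonResult(test, round(result.value * factor, 1), unit, result.board)
```

Keeping the mapping as data rather than code means it can be maintained by the local staff who understand how each system records its results, and published alongside the extract as part of its metadata.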

RECOMMENDATION 3: Improve the provisioning of national primary care, prescribing and clinical datasets for research and in support of NHS healthcare activity by ensuring that provision for both system query and data extraction is built into the specification of new systems. At the same time, this should promote more efficient integration of data across NHS Boards through specification of common data standards, and/or initiatives to map local data standards to consistent data definitions.

3.2 Engagement with Industry

Active engagement of industry will be an important gauge of the success of the new investment in health informatics research. Developing a clear understanding of commercial partners' requirements is a priority. Work commissioned by Scottish Enterprise will identify the mechanisms by which health informatics research capability can generate economic benefit, and the opportunities to ensure that the earnings from such activity are recycled into further infrastructure developments or research activity. Models of engagement are required that balance commercial partners' requirements for the protection of intellectual property, with the delivery of public and patient benefit through placing the findings of research into the public domain. The Farr Institute Scotland and the Safe Haven Network should work with their partners in industry to develop exemplars of the benefits of engagement for both economic and social benefit.

RECOMMENDATION 4: The Farr Institute Scotland, the Safe Haven Network, the Chief Scientist Office and eHealth Strategy Board should consider the ways in which health informatics research capability can generate economic benefit, and work with partners in industry to develop early exemplars of how engagement can deliver economic, health and social benefit.

3.3 Engaging patients and the public

Managing public concern about the risks to privacy and confidentiality associated with innovative uses of medical records for research is vital to continued progress. The available evidence suggests that the use of anonymised patient data for publicly funded research is accepted by the public, so long as confidentiality is protected and the research is intended to improve health[34]. Attitudes towards commercial use of patient data are much more ambivalent, and depend on factors such as the aims of the research and how the benefits will be shared[35],[36]. The Administrative Data Taskforce found that data controllers in the public sector also had concerns about commercial access to data[18], and its potential impact on their continued ability to obtain information from members of the public. Research on public attitudes to the use of cross-sectoral data suggests that it is viewed in the same way as the use of health data, with recognition of the potential benefits and a confidence that public bodies will protect confidentiality[36].

Maintaining public confidence will be essential as the use of health and cross-sectoral data increases. Public understanding of the uses that the NHS makes of patient data needs to be encouraged and developed in order to promote best patient care, quality assurance, research and development applications. The NHS and others who make use of data also need to listen and respond to citizens' opinions about how data are managed and used. Continual, effective two-way dialogue is essential in promoting the nation's health and economic transformation. Successful models of benefit sharing, especially with the private sector, also need to be identified. However, transparency in the sharing of public-sector data and independent oversight appear to be more important to the public than direct involvement in decision-making. A recent consultation exercise[33] has suggested that the Scottish Government's current approach to involving the public in decision-making, primarily through consultation, is broadly in line with expectations but should be maintained as an on-going process. Communicating the types of research that are being conducted, and their positive outcomes, will form part of this dialogue. For example, the role of the public and the importance of public trust are built into the proposed Public Benefits and Privacy Panel (PBPP). A communications and engagement strategy is currently being developed for the Scottish Informatics and Linkage Collaboration (SILC) (Box 7). This will address the issues associated with cross-sectoral data linkage, and it is important that developments within health are part of this wider strategy.

RECOMMENDATION 5: The eHealth Strategy Board should work with the Scottish Informatics and Linkage Collaboration and Data Management Board to develop a programme of public engagement activities to widen understanding of how data is used in research to improve population health and the quality and effectiveness of healthcare.

3.4 Building capacity

The new and emerging infrastructure is expected to generate a marked increase in the volume of health informatics research in Scotland. This will only materialise if there is an increase in the number of information specialists and researchers with the relevant training and expertise. SHIP has in the past provided a variety of training courses, as well as an online information governance toolkit (http://www.scot-ship.ac.uk/toolkit), and there is a range of Masters courses across Scotland relevant to health informatics research, including research ethics and data linkage.

Resources have been sought within the Scottish e-HIRC bid to fund a PhD programme plus a number of research positions intended to provide a broad range of career development opportunities for data managers, software engineers, informatics scientists and data analysts. These are important developments, which should, in time, provide the basis for further grant acquisition. There is also an urgent need for the NHS to increase capacity and make investments in key expertise to enhance capability to use health informatics to promote the development of 'learning' healthcare systems that use data efficiently to manage services, monitor outcomes and improve policy.

RECOMMENDATION 6: Research funders and the NHS should be encouraged to prioritise investment in health informatics research expertise through doctoral and postdoctoral training schemes, and by increasing the capacity of the NHS to use patient data to inform service improvement.


Email: Pamela Linksted
