A Health and Biomedical Informatics Research Strategy for Scotland

This strategy sets out key areas for action and specific recommendations from the Health Informatics Research Advisory Group (HIRAG) on how Scotland should respond to the opportunities and challenges around the secure use of routinely collected patient data for research.

2 The Health Informatics Research Landscape in Scotland

2.1 The Unique Patient Identifier

The foundation of Scotland's success in the use of health data for research was the adoption of the Community Health Index (CHI) in the 1970s. Every person registered with a GP in Scotland is allocated a 10-digit CHI number from a centrally maintained register. The register contains data on address, postcode, GP, date of birth, region of registration and, where relevant, date of death. The CHI number is the unique patient identifier used in all primary health care activities and hospital-based clinical information systems throughout NHS Scotland, including the emergency care summary (ECS). The ECS links the CHI register with prescribing and other information documenting known adverse reactions. Data in Scotland are currently coded in GP and secondary care systems according to Read 2 and the International Classification of Diseases (ICD-10) respectively. Where the CHI number is unavailable (e.g. in historical or non-health datasets), probabilistic matching can be used to link records.
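The probabilistic matching mentioned above can be illustrated with a minimal sketch of the widely used Fellegi-Sunter approach, in which each compared field adds a positive weight when it agrees and a negative weight when it disagrees. The field names, the m- and u-probabilities and the two thresholds below are illustrative assumptions; they are not taken from any Scottish linkage system.

```python
import math

# Assumed, illustrative parameters (not from any real linkage system):
# m = P(field agrees | both records belong to the same person)
# u = P(field agrees | records belong to different people)
FIELD_PARAMS = {
    "surname": (0.95, 0.01),
    "date_of_birth": (0.98, 0.003),
    "postcode": (0.85, 0.001),
}


def match_weight(record_a, record_b, params=FIELD_PARAMS):
    """Sum log2 agreement/disagreement weights over the compared fields."""
    total = 0.0
    for field, (m, u) in params.items():
        a, b = record_a.get(field), record_b.get(field)
        if a is None or b is None:
            continue  # a missing value contributes no evidence either way
        if a == b:
            total += math.log2(m / u)              # agreement weight (positive)
        else:
            total += math.log2((1 - m) / (1 - u))  # disagreement weight (negative)
    return total


def classify(weight, lower=0.0, upper=10.0):
    """Map a total weight to a decision using two assumed thresholds."""
    if weight >= upper:
        return "match"
    if weight <= lower:
        return "non-match"
    return "possible match"  # would normally go to clerical review


a = {"surname": "MACLEOD", "date_of_birth": "1950-03-14", "postcode": "AB1 2CD"}
b = {"surname": "MACLEOD", "date_of_birth": "1950-03-14", "postcode": "EF3 4GH"}
print(classify(match_weight(a, b)))
```

In this sketch a pair that agrees on surname and date of birth but not on postcode still scores above the upper threshold, which is the kind of tolerance for changed or mistyped fields that makes probabilistic matching useful when no CHI number is present.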

2.2 Clinical Information

The Scottish Government's eHealth Directorate works with NHS Boards to promote convergence across NHS Scotland in the use of clinical systems and how data are stored and managed. PACS (Picture Archiving and Communications System) is a national repository of clinical images and radiological reports in which all the contributing Health Board PACS use Digital Imaging and Communications in Medicine (DICOM) standards. The use of these common data standards in PACS should facilitate the development of a national clinical images dataset. However, many existing clinical systems have evolved independently as local initiatives. Differences in data structures and terminologies need to be resolved to link these rich datasets across (and sometimes within) Boards. Health Board Scottish Care Information stores (SCI stores) are repositories which receive data from multiple laboratory systems. There is no standardisation across Health Board SCI stores, resulting in systems with different data standards and laboratory reference ranges. This constitutes a significant challenge to making laboratory data from Health Boards readily and securely available to researchers in the form of a national dataset.

Ownership and control of NHS data are shared between the NHS Boards and the Information Services Division (ISD) of NHS National Services Scotland (NHS NSS). NHS Boards are the data controllers, under the Data Protection Act (1998), for their patients' clinical information, whether held on local systems or held centrally. NHS NSS is the data controller of the national datasets. ISD holds information centrally, updated on a monthly basis, on hospital admissions across Scotland (see Box 1) and community prescribing (Prescribing Information System for Scotland - PRISMS dataset) and provides NHS Boards with local extracts. ISD also uses the datasets to provide health information, health intelligence, statistical services and advice to support the NHS in planning, decision-making and quality improvement.

Box 1: The Scottish Morbidity Record (SMR)

The SMR is a set of national datasets compiled by the Information Services Division of NHS NSS, derived from information collected about patients treated in Scottish hospitals.

  • Outpatient Attendance dataset (SMR00) comprises data from 1997 on patients attending new and follow-up appointments at outpatient clinics in all specialities (except A&E and Genitourinary Medicine).
  • General Acute / Inpatient dataset (SMR01) comprises episode level data from 1981 on hospital inpatient and day case discharges from acute specialities.
  • Maternity Inpatient and Day Cases dataset (SMR02) comprises episode level data from 1975, recorded every time a mother is admitted for an obstetric event, and includes information on mother and baby characteristics, birth weight, gestational age, mode of delivery, induction, outcome of pregnancy and where the baby is delivered.
  • Mental Health Inpatient and Day Case dataset (SMR04) comprises episode level data since 1981 on patients receiving care at psychiatric hospitals, recorded at the point of both admission and discharge.
  • Scottish Cancer Registry (SMR06) comprises information from 1958 on Scottish residents diagnosed with malignant (and some benign) tumours.
  • Scottish Birth Record (SBR), introduced in 2002, is a universal record for all babies born in Scotland. SBR replaced the Neonatal Inpatient dataset (SMR11), which provided episode level data on babies discharged from hospital from 1975 to 2002 and supplemented the mother's delivery information as recorded in the Maternity Inpatient and Day Cases dataset (SMR02).

SMR datasets are commonly linked to research datasets (e.g. clinical trial cohorts) and population based studies such as the Scottish Longitudinal Study (see Box 3) and the Scottish Health Survey.

2.3 Primary Care Data

Scottish GP information systems are a rich repository of consistently recorded patient level clinical information in electronic form, with great potential for research. However, the data are distributed across nearly one thousand practices and accessibility for research is very limited. Each practice controls its own patients' information, which requires researchers to approach large numbers of data controllers for all but the most local studies.

For 26 years until 2013, when the GPASS system (General Practice Administration System for Scotland) on which it was based was superseded by commercial systems, the Primary Care Clinical Informatics Unit (PCCIU) at the University of Aberdeen extracted details of patient encounters, diagnoses, test results and issued prescriptions from some 20-30% of Scottish volunteer general practices. Better use could be made of the information held within the primary care system for both primary care services and research through establishing a national system to simplify and standardise the process for data extraction and analysis. Currently, GP practices in Scotland are subject to multiple electronic data extractions for the purposes of audit and making performance payments linked to GP general medical services and local enhanced services contracts. Quality and Outcomes Framework (QoF) payments are currently managed on behalf of Health Boards by Practitioner Services Division (PSD) of NHS NSS, using a UK-wide data extraction mechanism. ISD manages a limited data extraction involving 6% of practices in Scotland where the focus is to record consultations with practice clinical staff. Health Boards directly manage the additional payments to practices linked to local enhanced services contracts via data extraction mechanisms conducted by Board information service departments or via private sector partners. These data are used by Health Boards to manage these contracts and for the provision and improvement of local services.

Following a recommendation by the Delivering Quality in Primary Care Steering Group, ISD are implementing a National GP Information service to manage regular data extraction from GP practices in Scotland (see Box 2)[16].

Box 2: Scottish Primary Care Information Resource (SPIRE)

SPIRE, a collaboration between the Scottish Government and NHS National Services Scotland (NHS NSS), will provide a national information resource to inform on the provision of primary care across Scotland, and facilitate payments to GP practices against the Quality and Outcomes Framework. SPIRE will create and maintain a new National Primary Care dataset with NHS NSS as the data controller. Information Services Division (ISD) will manage an automated extraction of a defined dataset, designed to be of broad utility, at regular intervals from participating GP Practices in Scotland. The data will be stored securely in the National Safe Haven at NHS NSS. Additionally, SPIRE will be able to perform approved ad hoc data extractions where the national dataset does not meet requirements. SPIRE will also be available for research purposes. Participating GP Practices can elect to opt-out of any particular data extraction. Patients will also be able to decline the use of data from their health records for research. An Independent Advisory Body including GP and patient representation will manage requests for data and approve linkages to other datasets. Extensive engagement with a range of stakeholders, including the British Medical Association (BMA) and Royal College of General Practitioners (RCGP), is expected to act as impetus to encourage wide participation of GP practices to make their patients' data available to SPIRE.

SPIRE has been primarily conceived as an information resource for the management and planning of primary care services, but is also configured to facilitate secure access to primary care data in the National Safe Haven for research purposes.

2.4 Safe havens

Safe havens are now widely accepted as the preferred method of providing access to de-identified data for research and other secondary uses. The Thomas-Walport Data Sharing review, published in July 2008[17], recommended the establishment of safe havens to ensure that de-identified data could be used for research and analysis in the public interest. The use of accredited safe havens has since been endorsed by the Administrative Data Taskforce[18], and by the recent Caldicott 2 Information Governance Review[19] on behalf of the Department of Health, England.

For several years, the Scottish Longitudinal Study (Box 3), which combines Census, Scottish Morbidity Record, Mortality and other routinely collected data, has been accessible via a safe setting within National Records of Scotland[20].

Box 3: The Scottish Longitudinal Study (SLS)

The Scottish Longitudinal Study (SLS) is a large-scale record linkage study which pulls together Census, Vital Events (births, deaths, and marriages), National Health Service Central Register (NHSCR), and NHS data on 274,000 members (5.3%) of the Scottish population. The study is a replica of the England and Wales Longitudinal Study (LS) which has been running successfully for the past 30 years, but with the added advantage of a wider range of non-Census data. The linkage of individual social, demographic and health records through time on such a large sample creates a unique and powerful resource for health and social research in Scotland, which is designed to be used widely. But because the sample, and much of the valuable information, is derived from the Census, special procedures have been put in place to ensure confidentiality.

The SLS data are held within National Records of Scotland (NRS), and can be accessed only from a secure room using NRS stand-alone computers. Researchers who need to work with individual-level data may visit the SLS safe setting in Edinburgh, where support officers are available to help users extract and use the data in the correct way. Alternatively, the researcher may obtain a version of the database from which all the data, except variable names and labels, have been removed. The researcher specifies the analyses required and returns the code to the SLS team to be run on the original dataset. The only outputs that can be released to users are aggregated ones: tabulations and model outcomes (such as regression coefficients). Users are instructed thoroughly about the confidentiality rules, and must sign an SLS Undertaking Form describing how they must hold and use any data received from the SLS before they can begin analysis.

Use of SLS data is covered by the National Statistics Code of Practice and the Protocol on Data Access and Confidentiality. Specific legislation also covers the release of information held in the SLS: for example, the 1920 Census Act, the 1938 Population (Statistics) Act, the Data Protection Acts and Freedom of Information legislation.

Strengths of the SLS include its comprehensive coverage of the Scottish population, low levels of attrition, incorporation of health as well as demographic information, and its configuration from the outset as a freely accessible (subject to the constraints of good governance) resource for all bona fide researchers.

Full information on the use of the SLS can be obtained from the Longitudinal Studies Centre Scotland.

A national safe haven, hosted by NHS NSS and funded through SHIP, became operational in 2013 (Box 4).

Box 4: ScottisH Informatics Programme (SHIP - formerly the Scottish Health Informatics Programme)

SHIP, a collaboration between the Universities of Dundee, Edinburgh, Glasgow and St Andrews, and NHS National Services Scotland (NSS), was funded by a £3.7 million grant from the Wellcome Trust, the Medical Research Council and the Economic and Social Research Council between 2009 and 2013. SHIP has been instrumental in creating an infrastructure and governance framework (SHIP Blueprint, Box 6) to promote secure data sharing across institutional boundaries and in enhancing the capability in Scotland to conduct research using data in electronic patient records.

The National Safe Haven and eDRIS research portal (Box 5), hosted by NHS NSS and central elements of this infrastructure, became operational in January 2013 and provide a state-of-the-art technical facility with high-end performance computing. Infrastructure investments have been supported by a programme of work centred on a core set of four generic activities: provisioning of datasets for research; governance; engaging researchers; and engaging the public. The core programmes have supported a related series of research projects covering clinical trials, national epidemiology, pharmacovigilance, and the linkage of electronic patient records to socioeconomic, geospatial and environmental data.

The success of SHIP has paved the way for subsequent investment by the Medical Research Council (MRC) and others, first in a Scottish Health Informatics Research Centre and then in the Farr Institute Scotland. SHIP has now come to an end and the programme has moved on to the Farr Institute Scotland.

Through Chief Scientist Office (CSO) infrastructure investments, NHS Research Scotland (NRS) safe havens have now been established in the four lead NHS Boards, the nodes of NHS Research Scotland (Greater Glasgow & Clyde, Lothian, Grampian and Tayside), in addition to the National Safe Haven in NHS NSS, which from April 2014 is also funded by CSO as an NRS safe haven. These safe havens are in varying stages of development and so far have evolved with little central co-ordination or standardisation. Each will have individual responsibility to operate at all times in full compliance with all relevant codes of practice, legislation and statutory orders, and in accordance with current good professional practice. There is, however, an opportunity to collaborate to share best practice; this will be co-ordinated by the Scottish Informatics and Linkage Collaboration (SILC; see Box 7).

The National Safe Haven is supplemented by a research advisory service, eDRIS (Electronic Data Research and Innovation Service - Box 5)[21].

Box 5: Electronic Data Research and Innovation Service (eDRIS)

eDRIS was established by NHS NSS in January 2013 as a service to support the use of health data and electronic medical records held across NHS Scotland institutions for the purpose of research, service and quality improvement, planning, public health, health surveillance and epidemiology. A key remit is to support the NHS in making better use of its own data to develop and improve service delivery. eDRIS is a portal to national data and the National Safe Haven, and also works in collaboration with the local safe havens established in partnership with the main academic institutions in Scotland. An important role of eDRIS will be to facilitate access to datasets held by other bodies in the NHS or other areas of the public sector.

The service is part of NHS NSS's contribution to SHIP and seeks to support researchers from project conception to completion. Each study receives a dedicated research coordinator to ensure that the project runs smoothly by providing support with: study design and feasibility; advice on metadata, coding and terminology; liaison between data suppliers; approvals to procure and link datasets; and, when required, data analyses and interpretation.

2.5 Governance Framework

The SHIP Blueprint[22] (Box 6) and associated governance framework[23] define standards and processes for the use of non-consented linked data for health-related research purposes in Scotland. They seek to provide data controllers and Privacy Advisory Committees (PACs) with a common framework of reference for deciding which linkages should be approved and which checks and balances should be in place.

Box 6: Promoting efficient and secure data sharing: the SHIP Blueprint

The Walport/Thomas Data Sharing Review Report[17] recommended specific actions to reduce the regulatory burden to secure access to health service data. These included simplifying the legal framework governing data sharing and providing authoritative guidance to its interpretation, and the establishment of safe havens, which would provide a technical and administrative solution to the proportionate and safe sharing of data for research.

The SHIP Blueprint[22], addressing challenges raised in the Walport/Thomas report, delineates data sharing and linkage governance standards to serve as a benchmark for data controllers and researchers, and an infrastructure that supports a network firmly embedded within NHS Scotland comprising a national and local safe haven(s) that operate in conjunction with a research portal to facilitate efficient secure access to electronic health data.

Figure 1: Overview of the SHIP Infrastructure showing the interactions between individuals and organisations.

The network of NHS safe havens is expected to conform to a charter of agreed principles to provide a secure environment for the linkage, storage and analysis of non-consented patient data. These principles are based on the framework of principles and good practice set out in the Scottish Government's Guiding Principles for Data Linkage[23] and the SHIP Blueprint[22]. This Charter will underpin the establishment and operation of the network of Safe Havens (see 3.1).

The technical infrastructure outlined in the SHIP Blueprint is designed to provide confidence to data controllers that only authorised researchers will have access to anonymised patient-level data and only summary data can be removed from the facility following statistical disclosure control. Furthermore, safe havens will not per se maintain 'data warehouse' functions, but may do so to efficiently manage their own data. Linked data sets created for the purposes of particular projects will be held in an accredited safe haven for a specific duration. They will be subject to a plan of analysis and curation as specified in the project's data sharing agreement between contributing data controllers.
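One common element of statistical disclosure control is the suppression of small cells in a table before it leaves the secure facility. The sketch below shows that idea only; the threshold of 5 and the function name are illustrative assumptions, not rules taken from the SHIP Blueprint, and real disclosure control involves further checks (for example, secondary suppression so that hidden cells cannot be recovered from totals).

```python
from collections import Counter


def safe_tabulation(records, key, min_cell_count=5):
    """Count records per category and suppress any cell below the threshold.

    `min_cell_count` is an assumed, illustrative threshold. Suppressed cells
    are returned as None so that no small count leaves the safe haven.
    """
    counts = Counter(record[key] for record in records)
    return {
        category: (n if n >= min_cell_count else None)
        for category, n in counts.items()
    }


# Hypothetical patient-level rows that would stay inside the safe haven.
rows = [{"board": "A"}] * 12 + [{"board": "B"}] * 3
print(safe_tabulation(rows, "board"))  # {'A': 12, 'B': None}
```

Only the aggregated table is returned; the patient-level rows never appear in the output.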

A different approach was taken in England, in accordance with the UK Government's Open Data Policy, to support the use of anonymised NHS data for both academic and commercial research. The Clinical Practice Research Datalink (CPRD: www.cprd.com/home/), established by the Department of Health and the Medicines and Healthcare products Regulatory Agency to facilitate access to English NHS data, is able, with its partner the Health and Social Care Information Centre (HSCIC), to authorise the release of appropriately anonymised patient level datasets to enable researchers to conduct their own in-house analysis of the data. This may promote the emergence of private sector service providers enabling access to, and analysis of, NHS data. The implications of diverging governance models for research collaboration need to be kept under review.

2.6 National Data Linkage Framework

The Scottish Government has signalled its commitment to using data linkage for statistics and research through the development of a National Data Linkage Framework[24],[25],[26] that aims to promote the linkage of health with non-health administrative data. This initiative, jointly led by the Scottish Government's Chief Statistician, National Records of Scotland's Registrar General, and the Director of NHS NSS's Information Services Division, will contribute to the Scottish Informatics and Linkage Collaboration (Box 7). The administrative and research centre for this initiative will be co-located with the Farr Institute Scotland.

Box 7: Scottish Informatics and Linkage Collaboration

Published in November 2012, Joined Up Data for Better Decisions (the Data Linkage Framework)[24] sets out an ambitious pathway to realise the benefits from linking cross-sectoral administrative and survey data at a population level. The key benefits include accelerating cycles of improvement based on evidence to inform public policy and strategic planning, spending and delivery decisions. The framework seeks to overcome barriers to data linkage by acting as a focus, first, to improve the quality and consistency of existing administrative data systems so that they deliver data that are capable of being linked and, second, to expand access to facilities that enable secure data sharing, linkage and analysis. This should also help data controllers and other decision makers to take a more proportionate approach to managing risks associated with data linkage. It aims to build on existing successful programmes, such as SHIP, to create a culture where legal, ethical and secure data linkage is widely understood and accepted by the public.

The foundation stone of the Data Linkage Framework is a set of Guiding Principles which are intended to promote the public interest in scientifically sound, ethically robust research while appropriately protecting privacy[25]. Building on these Principles, a Scottish Informatics and Linkage Collaboration (SILC) has been established, incorporating Farr Institute Scotland and the Scottish Administrative Data Research Centre (ADRC). SILC is an overarching structure responsible for the provision of an integrated national data service to support health and non-health statistical and research activity. It includes eDRIS (Box 5), the National Safe Haven, an Indexing Service provided by National Records of Scotland (NRS) that handles all personally identifying data (and has no access to "payload" or characteristic data), and a Linking Service provided by NHS NSS (which will have no access to personally identifying data). The Indexing Service matches data provided by data controllers to a population spine so that personal identifiers can be replaced with anonymous keys. SILC will also act as a forum to co-ordinate sharing of best practice within the network of accredited safe havens across Scotland.
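The separation of functions described above can be sketched in a few lines: one component sees identifiers but never payload, and another sees payload and anonymous keys but never identifiers. All names below (`split_record`, `IndexingService`, `link_payloads`, the `person_id` field) are hypothetical, and exact matching on a single identifier stands in for the real matching against a population spine.

```python
import secrets


def split_record(row_id, record, id_fields=("person_id",)):
    """Data controller step: separate identifiers from payload for one row."""
    identifiers = {f: record[f] for f in id_fields}
    payload = {k: v for k, v in record.items() if k not in id_fields}
    return (row_id, identifiers), (row_id, payload)


class IndexingService:
    """Sees identifiers only; issues one anonymous key per person."""

    def __init__(self):
        self._keys = {}  # identifier value -> anonymous key

    def assign_keys(self, identifier_rows):
        row_keys = {}
        for row_id, identifiers in identifier_rows:
            person = identifiers["person_id"]  # assumed exact-match identifier
            if person not in self._keys:
                self._keys[person] = secrets.token_hex(8)  # random, non-derivable key
            row_keys[row_id] = self._keys[person]
        return row_keys


def link_payloads(payload_rows, row_keys):
    """Linking step: sees payload and anonymous keys only."""
    linked = {}
    for row_id, payload in payload_rows:
        linked.setdefault(row_keys[row_id], []).append(payload)
    return linked


# Hypothetical extracts from two data controllers.
hospital = [{"person_id": "A", "diagnosis": "I21"}, {"person_id": "B", "diagnosis": "J45"}]
prescribing = [{"person_id": "A", "drug": "aspirin"}]

id_rows, payload_rows = [], []
for name, dataset in (("hospital", hospital), ("prescribing", prescribing)):
    for i, record in enumerate(dataset):
        ids, payload = split_record((name, i), record)
        id_rows.append(ids)
        payload_rows.append(payload)

row_keys = IndexingService().assign_keys(id_rows)
linked = link_payloads(payload_rows, row_keys)
```

In this sketch the linked output groups each person's hospital and prescribing rows under a random key, and the `person_id` value never reaches the linking step.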

The oversight and terms of reference of SILC will be published in Spring 2015.

2.7 Connectivity with UK-wide health informatics research and administrative data linkage initiatives

In Sections 2.1 to 2.6 we have described key aspects of the development of health informatics research in Scotland. The potential benefits and challenges of electronic health records (EHR) research have also been recognised by the major UK health and social research funders and there have been a number of important recent developments at UK level. In 2012, the Medical Research Council (MRC), in collaboration with a range of other funders (including the CSO), invested £19 million in four research centres through the Health Informatics Research Centres initiative. Scotland was awarded £4 million, and was also invited to lead the UK Network, for which a further £1.5 million has been allocated. In February 2013, the MRC invited the Health Informatics Research Centres to bid for a further £20 million to create a UK Institute of Health Informatics. This award, plus the identification of an additional £2.5 million by the Scottish Government and NHS NSS, provides an opportunity to drive a step-change in the scale of health informatics research in Scotland and participate fully in the development of a UK wide Institute. This was named as the Farr Institute after William Farr, the 19th century epidemiologist, regarded as one of the founders of medical statistics.

The Scottish hub, Farr Institute Scotland, encompasses four areas of investment: (1) a building in the Edinburgh Bioquarter and refurbishment of accommodation at Ninewells Hospital, Dundee, to provide a multidisciplinary environment for health informatics research; (2) high performance statistical computing facilities; (3) investment in enabling datasets; and (4) linkage to the safe havens located at the NHS Research Scotland nodes, in order to create a single networked Institute.

The Institute will focus on the following areas of priority:

  • Providing strong leadership and governance in health and biomedical informatics
  • Providing a cohesive, high quality research and data infrastructure for transformational health informatics research based in Scotland
  • Promoting close integration with health service infrastructure, with opportunities for capacity building in key skills
  • Anticipating new models of research encompassing genetic data, imaging data, integration of unstructured and heterogeneous data, and non-health data
  • Creating a multidisciplinary environment that brings together NHS Scotland, academia and industry to deliver novel approaches to discovery from data
  • Developing strong links to other centres of excellence across the UK and internationally, through initiatives such as the Global Alliance[27], to promote standards for secure sharing of clinical and genetic data

Having a physical focus for the Institute is essential for scientific leadership, delivery-focused management, capacity building and world-class facilities for collaborative research, but NHS-academic-industry collaboration will be the core of all its activities.

The Farr Institute Scotland will be closely integrated with NHS infrastructure and will be managed as a joint academic/NHS collaboration. It will be configured as one hub of a Scottish network of safe havens, with regional hubs in the NRS regional nodes in Aberdeen, Dundee, Edinburgh and Glasgow. A senior management team has been created to ensure good governance and delivery of the vision of the Institute. Leadership has been strengthened by co-opting experienced NHS and University leaders, who can bring a distinctive set of skills, including lay representation, informatics, clinical medicine, general practice, the law, ethics, NHS, epidemiology, social science and the pharmaceutical industry. An International Advisory Board has been established to provide independent oversight of the Institute, and ensure the highest standards of research excellence are combined with clear strategic direction.

A final important development to note is that the UK Administrative Data Taskforce (ADT), led by the Economic and Social Research Council (ESRC), MRC and the Wellcome Trust, called for the establishment of Administrative Data Research Centres (ADRC) in each of the four UK countries[18] (Box 8). In 2013, Scotland was awarded £7.9 million to establish the Scottish ADRC.

Box 8: Administrative Data Taskforce (ADT)

ADT was established by ESRC in collaboration with the MRC and the Wellcome Trust, with the purpose of improving access to public sector administrative data (social security, tax and education records, for example) for research and policy purposes. Its report, published in December 2012, noted that the UK has an opportunity to be a world leader in research using de-identified administrative data routinely collected by government departments, agencies and statutory bodies, but accepted that access to such data has been difficult, due mainly to the concerns that data controllers have over the identification of individuals, and to legal restrictions on data sharing between government departments. It concluded that improvements in procedures for access to and linking between such data are urgently required, supported by new legislation to permit sharing and linkage of data.

ADT proposed the establishment of an Administrative Data Research Centre (ADRC) in each of the four countries of the UK. The ADRCs will be responsible for undertaking secure linkage of de-identified data. Access to data will be managed through an Information Gateway. A UK Governing Board has been established to provide the governance structure for the ADRCs.

ADT recommends that government administrative data should be made available at no cost to publicly-funded researchers. Initially, ADRCs will not engage with the private sector, though the Governing Board will, at an early stage, investigate guidelines for such engagement.

ADT proposed that the ADRCs collaborate to produce plans for public engagement and debate about the academic and wider social and economic benefits of research using administrative data.

2.8 Connectivity between Universities, the NHS and Industry in Scotland

Scotland has a strong international profile in life sciences and medicine, with significant MRC and EU grant portfolios and national pooling initiatives. It is also home to leading international institutions in informatics and bioinformatics. The Scottish Government has long recognised the importance of the life science sector to national economic development. Universities in Scotland have built strong long standing partnerships with the NHS, both at the level of local NHS Boards and nationally. SHIP itself is a good example of collaboration between leading universities and NHS NSS. Interaction between the universities and NHS and other centres of excellence in Scotland and the UK will be essential to facilitate the development of new clinical information systems to record and analyse the new genetic datasets and other biologic data as they emerge. Scotland has also recently committed substantial funds to innovation centres in digital healthcare and stratified medicine (led by informatics and medical scientists).

The potential for Scotland to be world-leading in health and biomedical informatics was acknowledged by the Council of Economic Advisers First Annual Chair's Report to the First Minister, published on 28th March 2013[28]. The report supported the development of a Health and Biomedical Informatics Research Strategy for Scotland and the creation of a Scottish Health and Biomedical Informatics Research Institute, which would focus on addressing the challenges of handling the linking of medical and genetic information and other heterogeneous data types in order to maximise the value of these unique sources of information across Scotland. It emphasised that, in order to strengthen Scotland's international position in this area, there should be a clear link between the ambitions and milestones set out in the Strategy and its funding.


Email: Pamela Linksted
