Introduction International sharing of health data opens the door to the study of so-called ‘Big Data’, which holds great promise for improving patient-centred care. Failure of recent data sharing initiatives indicates an urgent need to invest in societal trust in researchers and institutions. Key to an informed understanding of such a ‘social license’ is identifying the views patients and the public may hold with regard to data sharing for health research.
Methods We performed a narrative review of the empirical evidence addressing patients’ and public views and attitudes towards the use of health data for research purposes. The literature databases PubMed (MEDLINE), Embase, Scopus and Google Scholar were searched in April 2019 to identify relevant publications. Patients’ and public attitudes were extracted from selected references and thematically categorised.
Results Twenty-seven papers were included for review, comprising qualitative and quantitative studies and systematic reviews. Results suggest widespread, though conditional, support among patients and the public for data sharing for health research. Although participants recognised actual or potential benefits of data research, they expressed concerns about breaches of confidentiality and potential abuses of the data. Studies showed agreement on the following conditions: value, privacy, risk minimisation, data security, transparency, control, information, trust, responsibility and accountability.
Conclusions Our results indicate that a social license for data-intensive health research cannot simply be presumed. To strengthen the social license, identified conditions ought to be operationalised in a governance framework that incorporates the diverse patient and public values, needs and interests.
- information technology
- patient perspective
- scientific research
This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/.
Large-scale, international data sharing opens the door to the study of so-called ‘Big Data’, which holds great promise for improving patient-centred care. Big Data health research is envisioned to take precision medicine to the next level through increased understanding of disease aetiology and phenotypes, treatment effects, disease management and healthcare expenditure.1 However, lack of public trust is proven to be detrimental to the goals of data sharing.2 The case of care.data in the UK offers a blatant example of a data sharing initiative gone awry. Criticism predominantly focused on limited public awareness and lack of clarity on the goals of the programme and ways to opt out.3 Citizens are becoming increasingly aware and critical of data privacy issues, and this warrants renewed investments to maintain public trust in data-intensive health research. Here, we use the term data-intensive health research to refer to a practice of grand-scale capture, (re)use and/or linkage of a wide variety of health-related data on individuals.
Within the European Union (EU), the recently adopted General Data Protection Regulation (GDPR) (EU 2016/679) addresses some of the concerns the public may have with respect to privacy and data protection. One of the primary goals of the GDPR is to give individuals control over their personal data, most notably through consent.4 Other lawful grounds for the processing of personal data are listed, but it is unclear how these would exactly apply to scientific research. Legal norms remain open to interpretation and thus offer limited guidance to researchers.5 6 In Recital 33, the GDPR actually mentions that additional ethical standards are necessary for the processing of personal data for scientific research. This indicates a recognised need for entities undertaking activities likely to incite public unease to go beyond compliance with legal requirements.7 Complementary ethical governance then becomes a prerequisite for securing public trust in data-intensive health research.
A concept that could be of use in developing ethical governance is that of a ‘social license to operate’.7 The social license captures the notion of a mandate granted by society to certain occupational groups to determine for themselves what constitutes proper conduct, under the condition that such conduct is in line with society’s expectations. The term ‘social license’ was first used in the 1950s by American sociologist Everett Hughes to address relations between professional occupations and society.8 The concept has been used since to frame, for example, corporate social responsibility in the mining industry,9 governance of medical research in general8 and of data-intensive health research more specifically.7 10 As such, adequate ethical governance then becomes a precondition for obtaining a social license for data sharing activities.
Key to an informed understanding of the social license is identifying the expectations society may hold with regard to sharing of and access to health data. Here, relevant societal actors are the subjects of Big Data health research, comprising both patients and the general public. Identification of patients’ and public views and attitudes allows for a better understanding of the elements of a socially sanctioned governance framework. Several research papers have captured these views using quantitative or qualitative methods or a combination of both. So far, systematic reviews of the literature have limited their scope to citizens of specific countries,11 12 qualitative studies only13 or the sharing of genomic data.14 Therefore, we performed an up-to-date narrative review of both quantitative and qualitative studies to explore predominant patient and public views and attitudes towards data sharing for health research.
We searched the literature databases PubMed (MEDLINE), Embase, Scopus and Google Scholar in April 2019 for publications addressing patients’ and public views and attitudes towards the use of health data for research purposes. Synonyms of the following terms (connected by ‘AND’) were used to search titles and/or abstracts of indexed references: patient or public; views; data sharing; research (see box 1 and online supplementary appendix 1). To merit inclusion, an article had to report results from an original research study (qualitative, quantitative or mixed methods) on attitudes of individuals regarding use of data for health research. We restricted eligibility to records published in English and studies performed between 2009 and 2019. We chose 2009 as a lower limit because we assume that patients’ and public perspectives might have changed substantially with increasing awareness and use of digital (health) technologies. Systematic reviews and meta-analyses synthesising the empirical literature on this topic also qualified for review. Reports from stakeholder meet-ups and workshops were eligible as long as they included patients or the public as participants. Since we were only interested in empirical evidence, expert opinion and publications merely advocating for the inclusion of patients’ and public views in Big Data health research were excluded. Studies that predominantly reported on views of other stakeholders (such as clinicians, researchers, policy makers or industry) were excluded. Articles reporting on conference proceedings, or on views regarding (demographic) data collection in low-income or middle-income countries or for public health and care/quality improvement, were not considered relevant to this review. Despite our specific interest in data sharing within the European context, we broadened eligibility criteria to include studies performed in the USA, Canada, Australia and New Zealand.
Additional articles were identified through consultation with experts and review of references in the manuscripts identified through the literature database searches. Views and attitudes of patients and the public were identified from selected references and reviewed by means of thematic content analysis.
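The eligibility criteria described above can be read as a simple screening rule. The following is a minimal sketch in Python, not the procedure the authors actually ran: the `Record` fields, the category labels and the example records are all hypothetical, the year bounds are assumed to be inclusive, and the sketch omits the criteria on country, workshop reports and conference proceedings.

```python
from dataclasses import dataclass


@dataclass
class Record:
    """Hypothetical screening record; field names are illustrative only."""
    language: str
    year: int         # year the study was performed
    design: str       # e.g. "qualitative", "expert opinion"
    population: str   # e.g. "patients", "public", "clinicians"


# Assumed labels for the designs and populations named in the text.
ELIGIBLE_DESIGNS = {
    "qualitative", "quantitative", "mixed methods",
    "systematic review", "meta-analysis",
}
ELIGIBLE_POPULATIONS = {"patients", "public"}


def is_eligible(record: Record) -> bool:
    """Apply the language, period, design and population criteria."""
    return (
        record.language == "English"
        and 2009 <= record.year <= 2019   # inclusive bounds are an assumption
        and record.design in ELIGIBLE_DESIGNS
        and record.population in ELIGIBLE_POPULATIONS
    )


# Made-up records for illustration; none is taken from the review.
records = [
    Record("English", 2015, "qualitative", "patients"),
    Record("English", 2007, "quantitative", "public"),         # too early
    Record("English", 2018, "expert opinion", "public"),       # not empirical
    Record("English", 2016, "systematic review", "clinicians"),  # other stakeholders
]
included = [r for r in records if is_eligible(r)]
print(len(included))
```

In practice, screening of this kind relies on reviewer judgement (for example, whether a study "predominantly" reports other stakeholders' views), which a rule like this cannot capture.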
Box 1 Key search terms
(patient* OR public OR citizen*)
(attitude* OR view* OR perspective* OR opinion* OR interview* OR qualitative* OR questionnaire* OR survey*)
(“data sharing” OR “data access” OR “data transfer”)
Asterisks (“*”) are used as wildcards so that a truncated search term also matches any word beginning with it.
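As an illustration of how such term groups are typically combined, the sketch below joins the synonyms within each group with ‘OR’ and the groups with ‘AND’. It is an assumption-laden example rather than the exact query submitted to any database: the function name `build_query` is hypothetical, database-specific field tags (for example, title/abstract restrictions) are omitted, and the fourth concept (‘research’) is left out because its synonyms are not listed in the box.

```python
# Term groups copied from the box; straight quotes mark exact phrases.
TERM_GROUPS = [
    ["patient*", "public", "citizen*"],
    ["attitude*", "view*", "perspective*", "opinion*", "interview*",
     "qualitative*", "questionnaire*", "survey*"],
    ['"data sharing"', '"data access"', '"data transfer"'],
]


def build_query(groups):
    """Join synonyms with OR inside each group, then join groups with AND."""
    blocks = ["(" + " OR ".join(terms) + ")" for terms in groups]
    return " AND ".join(blocks)


query = build_query(TERM_GROUPS)
print(query)
```

Keeping the groups as data makes it straightforward to add a further concept or to adapt the syntax for a particular database.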
Searches in PubMed (MEDLINE), Embase, Scopus and Google Scholar resulted in a total of 1153 non-unique records (see online supplementary appendix 1). We identified 27 papers for review, including 12 survey or questionnaire studies (quantitative), 8 interview or focus group studies (qualitative), 1 mixed methods study and 6 systematic reviews (see table 1). Most records were excluded because they were not relevant to our research question or because they did not report on findings from original (empirical) research studies. Ten studies reported on views of patients, 11 on views of the public/citizens and 6 studies combined views of patients, research participants and the public.
Willingness to share data for health research
Reviewed papers suggest widespread support for the sharing of data for health research.
Four systematic reviews synthesising the views of patients and the public report that willingness for data to be linked and shared for research purposes is high11–14 and that people are generally open to and understand the benefits of data sharing.15
Outpatients from a German university hospital who participated in a questionnaire study (n=503) expressed a strong willingness (93%) to give broad consent for secondary use of data,16 and 93% of a sample of UK citizens with Parkinson’s disease (n=306) were willing to share their data.17 Wide support for sharing of data internationally18 19 and in multicentre studies20 was reported among patient participants. Goodman et al found that most participants in a sample of US patients with cancer (n=228) were willing to have their data made available for ‘as many research studies as possible’.21 Regarding the use of anonymised healthcare data for research purposes, a qualitative study found UK rheumatology patients and patient representatives in support of data sharing (n=40).22
Public respondents in survey studies recognised the benefits of storing electronic health information,23 and 78.8% (n=151) of surveyed Canadians felt positive about the use of routinely collected data for health research.24 The majority (55%) of a sample of older Swiss citizens (n=40) were in favour of placing genetic data at disposal for research.25 Focus group discussions convened in the UK showed that just over 50% of the members of the Citizens Council of The National Institute for Health and Care Excellence (NICE) said they would have no concerns about NICE using anonymised data derived from personal care records to evaluate treatments,26 and all participants in one qualitative study were keen to contribute to National Health Service (NHS)-related research.27
Motivations to share data
Patients and public participants expressed similar reasons and motivations for their willingness to share data for health research, including contributing to advancements in healthcare, returning incurred benefits and the hope of future personal health benefits (tables 2–4).
In the two systematic reviews that addressed this topic, sharing data for ‘the common good’ or ‘the greater good’ was identified as one of the most prevalent motivations.12 14
For patients specifically, helping future patients or people with similar health problems was an important reason.14 16 One survey study conducted among German outpatients found that 72% listed returning their own benefits incurred from research as a driver for sharing clinical data.16 Patients with rare diseases were also motivated by ‘great hope and trust’ in the development of international databases for health research.19 Among patients, support of research in general,16 the value attached to answering ‘important’ research questions,20 and a desire to contribute to advancements in medicine14 were prevalent reasons in favour of data sharing. Ultimately, the belief that data sharing could lead to improvements in health outcomes and care was reported.20
Only one original research paper addressed public motivations. This study found that older citizens mentioned altruistic reasons and the greater good in a series of interviews as reasons to share genetic data for research.25 In these interviews, citizens expressed no expectations of an immediate impact or beneficial return but ultimately wanted to help the next generation.
Perceived benefits of data sharing
Patients and the public perceive that data sharing could lead to better patient care through improved diagnosis and treatment options and more efficient use of resources. Patients seem to also value the potential of (direct) personal health benefits.
Two systematic reviews reported on perceived benefits of data sharing for health research purposes. Howe et al mentioned perceived benefits to research participants or the immediate community, benefits to the public and benefits to research and science.15 Shabani et al also listed accelerating research advancement and maximising the value of resources as perceived benefits.14
Surveyed patients perceived that data sharing could help their doctor ‘make better decisions’ about their health (94%, n=3516)28 or result in an increased chance of receiving personalised health information (n=228).21
In the original studies reviewed, advantages and potential benefits of data sharing were generally recognised by public and patient participants.22 29 Data sharing was believed to enable the study of long-term treatment effects and rare events, as well as the study of large numbers of people,24 to improve diagnosis25 and treatment quality,20 23 as well as to stimulate innovation30 and identify new treatment options.25 A cross-sectional online survey among patient and citizen groups in Italy (n=280) also identified the perception that data sharing could reduce waste in research.30
Perceived risks of data sharing
The most significant risks of data sharing were perceived to result from breaches of confidentiality, commercial use and potential abuse of the data.
Systematic reviews report on patients’ and public concerns about confidentiality in general,13 15 sometimes linked to the risk of reidentification,14 concerns about a party's competence in keeping data secure,12 and concerns that personal information could be mined from genomic data.14 A systematic review by Stockdale et al identified concerns among the public (UK and Ireland) about the motivation a party might have to use the data.12
Patients in a UK qualitative study (n=40) perceived ‘detrimental’ consequences of data ‘falling into the wrong hands’, such as insurance companies.22 Respondents from the online patient community PatientsLikeMe were fearful of health data being ‘stolen by hackers’ (87%, n=3516).28
Original research studies flagged data security and privacy as major public concerns.16 18 20 25 26 29–32 More specifically, many studies found that participants worried about who would have access to the data and about the risk of misuse or abuse.13 15 18 25 27 33 A large pan-European survey among respondents from 27 EU member states revealed public concerns about different levels of access by third parties (48.9%–60.6%, n=20 882).23 Overall, reviewed papers suggest that patients and the public are concerned about the use of their data for commercial purposes.14 27 For example, the NICE Citizens Council expressed concerns about the potential for data to be sold to other organisations and used for profit and for purposes other than research.26 The Citizens Council also highlighted the need for transparency about how data are used and how they might be used in the future, and for ensuring that the research is conducted according to good scientific practice and that data are used to benefit society. Concerns were also identified about control and ownership of data13 33 and about re-use of data for purposes that participants do not agree to.30 Fear of discrimination, stigmatisation, exploitation or other repercussions as a consequence of data being shared was widely cited by individuals.14 15 18
Barriers to sharing data
Studies showed that patients and the public rarely mention barriers to data sharing in absolute terms. Rather, acceptance seemed to decrease if data sharing was financially motivated, and if people did not know how and with whom their data would be shared.
First, individuals often opposed data sharing if it was motivated by financial gain or profit20 or if the data were shared with commercial/private companies.14 15 In one large pan-European survey (n=20 882), respondents were found to be strongly averse to health insurance companies and private sector pharmaceutical companies viewing their data.23 Second, lack of understanding and awareness around the use of data was viewed as a barrier to data sharing.15 22 Third, lack of transparency and controllability in releasing data were mentioned as factors compromising public trust in data sharing activities.14 22
Factors affecting willingness to share data
A wide range of factors were identified from the literature that impacted individuals’ willingness to share data for health research, including geographical factors, age, individual-specific and research-specific characteristics.
McCormack et al found that European patients’ expressions of trust and attitudes to risk were often affected by the regulatory and cultural practices in their home countries, as well as by the nature of the (rare) disease the patient participant had.18 Shah et al conducted a survey among patients in four Northern European countries (n=855) and found a significant association between country and attitudes towards sharing of deidentified data.34 Interestingly, Dutch respondents were less likely to support sharing of their deidentified data compared with UK citizens.
Among a sample of surveyed patients with Parkinson’s disease (UK), a significant association was found between higher age and increased support for data sharing.17 According to a study based on semistructured interviews with older Swiss citizens, generational differences impacted willingness to share.25 With respect to public attitudes towards data sharing, findings of one systematic review suggest that males and older people are more likely to consent to sharing their medical data.27 A systematic review by Shabani et al suggests that patient and public participants with higher mean age are substantially less worried about privacy and confidentiality than other groups.14
A systematic review into patients’ and public perspectives on data sharing in the USA suggests that individuals from under-represented minorities are less willing to share data.11 A large multisite survey (n=13 000) among the US public found that willingness to share was associated with self-identified white race, higher educational attainment and lower religiosity.31 In another systematic review, race, gender, age, marital status and/or educational level all seemed to influence how people perceived sensitivity of genomic data and the sharing thereof.14 However, a UK study among patients with Parkinson’s disease found no clear relationship between data sharing and the number of years diagnosed, sex, medication class or health confidence.17
Factors that clearly positively affected attitudes towards data sharing were perceptions of the (public) benefits and value of the research,13 20 fewer concerns and fewer information needs,31 and higher trust in and reputation of individuals or organisations conducting and/or overseeing data sharing.12–14 35 Conversely, willingness decreased with higher privacy and confidentiality concerns11 and higher distrust of the government as an oversight body for (genetic) research data.35
Privacy measures, such as removal of social security numbers (90%, n=3516) and insurance ID (82%, n=3516), the sharing of only summary-level or aggregate data20 and deposition of data in a restricted access online database,29 increased people’s willingness to share their data for health research. Expressions of having control over what data are shared and with whom positively affected attitudes towards data sharing.34 In one study, being asked for consent for each study made participants (81%) feel ‘respected and involved’, and 74% agreed that they would feel that they ‘had control’.14 With respect to data sharing without prospective consent, participants became more accepting after being given information about the research processes and selection bias.27 Less support was observed for data sharing due to financial incentives25 and, more specifically, if data would be shared with private companies, such as insurance or pharmaceutical companies.11 25
Conditions for sharing
Widespread willingness to share data for health research very rarely led to participants’ unconditional support. Studies showed agreement on the following conditions for responsible data sharing: value, privacy, minimising risks, data security, transparency, control, information, trust, responsibility and accountability.
One systematic review reported that participants considered it important that research resulting from data sharing be in the public’s interest and reflect participants’ values.15 The NICE Citizens Council advocated for appropriate systems and good working practices to ensure a consistent approach to research planning, data capture and analysis.26
Privacy, risks and data security
The need to protect individuals’ privacy was considered paramount11 14 21 34 and participants often viewed deidentification of personal data as a top privacy measure.11 24 30 36 One survey among US patients with cancer found that only 20% (n=228) of participants found linkage of individuals with their deidentified data acceptable for return of individual health results and to support further research.21 Secured access to databases was considered an important measure to ensure data security in data sharing activities.30 34 A systematic review of participants’ attitudes towards data sharing showed that people established risk minimisation as another condition for data sharing.15 Findings by Mazor et al suggest that patients only support studies that offer value and minimise security risks.20
Transparency and control
Conditions regarding transparency were information about how data will be shared and with whom,14 35 the type of research that is to be performed, by whom the research will be performed,16 information on data sharing and monitoring policies and database governance,35 conditions framing access to data and data access agreements,24 28 30 and any partnerships with the pharmaceutical industry.19 More generally, participants expressed the desire to be involved in the data sharing process,35 to be notified when their data are (re)used and to be informed of the results of studies using their data.15 Spencer et al identified use of an electronic interface as a highly valued means to enable greater control over consent choices.22 When asked about the use of personal data for health research by the NHS, UK citizens were typically willing to accept models of consent other than the ones they would prefer.37 Acceptance of consent models with lower levels of individual control was found to be dependent on a number of factors, including adequate transparency, control over detrimental use and commercialisation, and the ability to object, particularly to any processing considered to be inappropriate or particularly sensitive.37
Information and trust
One systematic review identified trust in the ability of the original institution to carry out the oversight tasks as a major condition for responsible data sharing.14 Appropriate education and information about data sharing was thought to include public campaigns to inform stakeholders about Big Data32 and information communicated at open days of research institutions (such as NICE) to ensure people understand what their data are being used for and to reassure them that personal data will not be passed on or sold to other organisations.26 The informed consent process for study participation was believed to include information about the fact that individuals’ data could potentially be shared,15 30 the objectives of data sharing and (biobank) research, the study’s data sharing plans,29 governance structure, logistics and accountability.33
Responsibility and accountability
Participants often placed the responsibility for data sharing practices on the shoulders of researchers. Secondary use of data collected earlier for scientific research was viewed to require a data access committee that involves a researcher from the original research project, a clinician, a patient representative and a participant in the original study.36 Researchers of the original study were required to monitor data used by other researchers.36 In terms of accountability, patient and public groups in Italy (n=280) placed high value on sanctions for misuse of data.30 Information on penalties or other consequences of a breach of protection or misuse was considered important by many.31 35
In this study, we narratively reviewed 27 papers on patients’ and public views on and attitudes towards the use of health data for scientific research. Studies reported widespread, though conditional, support for the linkage and sharing of data for health research. The only outlier seems to be the finding that just over half (n=25) of the NICE Citizens Council answered ‘no’ to the question whether they had any concerns if NICE used anonymised data to fill in the gaps if NICE was not getting enough evidence in ‘the usual ways’.26 However, we hasten to point out that the question about willingness to share is different from the question of whether people have concerns or not. In addition, after a 2-day discussion meeting Council members were perhaps more sensitised to the potential concerns regarding data sharing. Therefore, we suggest that the way questions are phrased and the context in which they are asked may influence the answers people give.
Overall, people expressed similar motivations to share their data, perceived similar benefits (despite some variation between patients and citizens), yet at the same time displayed a range of concerns, predominantly relating to confidentiality and data security, awareness about access and control, and potential harms resulting from these risks. Both patient and public participants conveyed that certain factors would increase or reduce their willingness to have their data shared. For example, the presence of privacy-protecting measures (eg, data deidentification and the use of secured databases) seemed to increase willingness to share, as well as transparency and information about data sharing processes and responsibilities. The identified views and attitudes appeared to come together in the conditions stipulated by participants: value, privacy and confidentiality, minimising risks, data security, transparency, control, information, trust, responsibility and accountability.
In our Introduction, we mentioned that identifying patients’ and public views and attitudes allows for a better understanding of the elements of a socially sanctioned governance framework. In other words, what work should our governance framework be doing in order to obtain a social license? This review urges researchers and institutions to address people’s diverse concerns and to make an effort to meet the conditions identified. Without these conditions, institutions lack trustworthiness, which is vital for the proceedings of medicine and biomedical science. As such, a social license is not a ‘nice to have’ but a ‘need to have’. Our results also confirm that patients and the public indeed care about more than legal compliance alone, and wish to be engaged through information, transparency and control. This work supports the findings of a recent systematic review into ethical principles of data sharing as specified in various international ethical guidelines and literature.38 What this body of research implies is considerable diversity of values and beliefs both between and within countries.
The goal of this narrative review was to identify the most internationally dominant, aggregated patient and public views about the broad topic of data sharing for health research. We deliberately opted for the methodology of a narrative review rather than a systematic review. Most narrative reviews deal with a broad range of issues related to a given topic rather than addressing a particular topic in depth.39 This means narrative reviews may be most useful for obtaining a broad perspective on a topic, and that they often are less useful in generating quantitative answers to specific clinical questions. However, because narrative reviews do not require specification of the search and selection strategy and the way of critically appraising literature can be variable, the connection between evidence generated by narrative reviews and (clinical) recommendations is less rigorous and risk of bias exists. This is something to take into account in this study. A risk of bias assessment was not possible due to the heterogeneity of the findings. We acknowledge that our methodological choices may have affected the discriminative power or granularity of our findings. For example, there is a difference between sharing of routinely collected health data and secondary use of health data collected for research purposes. Moreover, we can only make loose assumptions about potential differences between patient and public views.
In addition, we should mention that this work is centred around studies conducted in Western countries as the whole Big Data space and literature is dominated by Western countries, higher socioeconomic status and Caucasians. However, most of the disease burden globally and within countries is most probably not represented in the ‘Big Data’ and so we have to stress the lack of generalisability to large parts of the world.
Nevertheless, we believe our findings point towards essential elements of a governance framework for data sharing for health research purposes. If we are to conclude that the identified conditions ought to act as the pillars of a governance framework, the next step is to identify how these conditions could be practically operationalised. For example, if people value information, transparency and control, what type of consent is most likely to valorise these conditions? And what policy for returning research results would be desirable? Once we know what to value, we can start thinking about the ways to acknowledge that value. A new challenge arising here, however, is what to do when people hold different or even conflicting values or preferences. Discrete choice experiments could help to test people’s preferences regarding specific topics, such as preferred modes of informed consent. Apart from empirical work, conceptual analysis is needed to clarify how public trust, trustworthiness of institutions and accountability are interconnected.
This narrative review suggests widespread—though conditional—support among patients and the public for data sharing for health research. Despite the fact that participants recognise actual or potential benefits of health research, they report a number of significant concerns and related conditions. We believe identified conditions (eg, social value, data security, transparency and accountability) ought to be operationalised in a value-based governance framework that incorporates the diverse patient and public values, needs and interests, and which reflects the way these same conditions are met, to strengthen the social license for Big Data health research.
We thank Susanne Løgstrup (European Heart Network) and Evert-Ben van Veen (Medlaw) for their valuable feedback during various stages in drafting the manuscript.
Contributors JvD and GvT conceived the idea for this work. SK designed the study and performed acquisition of the data. SK and GvT analysed the data. All authors contributed substantially to interpretation of the data. SK drafted the manuscript, and JvD, AB, BT, MM and GvT substantively revised it. All authors approved the submitted version of the manuscript.
Funding The results presented in this short report were part of Work Package 7 of the BigData@Heart consortium, which received funding from the Innovative Medicines Initiative 2 Joint Undertaking under grant agreement No. 116055. This Joint Undertaking receives support from the European Horizon 2020 research and innovation programme and European Federation of Pharmaceutical Industries and Associations.
Competing interests None declared.
Provenance and peer review Not commissioned; externally peer reviewed.