Should research samples reflect the diversity of the population?
- Correspondence to: P Allmark, Samuel Fox House, Northern General Hospital, Sheffield S5 7AU, UK
- Accepted 12 September 2003
- Revised 15 August 2003
Recent research governance documents say that the body of research evidence must reflect population diversity. The response to this needs to be more sophisticated than simply ensuring minorities are present in samples.
For quantitative research looking primarily at treatment effects of drugs and devices four suggestions are made. First, identify where the representation of minorities in samples matters—for example, where ethnicity may cause different treatment effects. Second, where the representation of a particular group matters then subgroup analysis of the results will usually be necessary. Third, ensuring representation and subgroup analysis will have costs; deciding on whether such representation is worthwhile will involve cost benefit analysis. Fourth, the representation of minorities should not be seen as mainly a locality issue.
For qualitative research it is argued that the representation of diversity is often important. Given the small samples of many qualitative projects, however, the best way to ensure representation occurs is to allow a proliferation of such research, not to stipulate such representation in samples.
Paragraph 2.2.7 of the Research Governance Framework for Health and Social Care (henceforth, RGF 2.2.7) states1:
Research and those pursuing it should respect the diversity of human culture and conditions and take full account of ethnicity,
gender, disability, age, and sexual orientation in its design, undertaking, and reporting. Researchers should take account
of the multicultural nature of society. It is particularly important that the body of research evidence available to policy
makers reflects the diversity of the population.
The companion document, Governance Arrangements for Research Ethics Committees, recommends the provision of information for research participants in languages other than English as a “locality issue” for local research ethics committees (LRECs).2 It also charges all research ethics committees, both LRECs and multicentre research ethics committees (MRECs), with the task of reassuring themselves concerning “the characteristics of the population from which the research participants will be drawn (including gender, age, literacy, culture, economic status, and ethnicity) and the justification for any decisions made in this respect” (Department of Health, para 9.14).2
These recommendations arise from a concern that the ethnic and cultural mix of research participants differs from that of the general population. The most recent census indicated that around 90% of the population of England and Wales see themselves as white whilst around 10% do not.3 Figures from the USA suggest there is under-representation of some groups in research samples; African Americans and Hispanics are among these groups.4,5 It is thought that a similar situation exists in the UK—for example, with people of Asian and West Indian origin under-represented; there is some research to support this view.6
Mastroianni et al provide evidence from the USA that women are under-represented in research samples.7 They suggest that the desire to protect the fetus, for example in the light of the thalidomide scandal, has resulted in women who are, or could become, pregnant being excluded from studies. Children8 and the elderly9 have also been excluded from samples. Again, the desire to protect research participants may be a factor here. There may also be less noble reasons, however, such as the exclusion of the elderly in order to test drugs on those in whom one is most likely to see beneficial effects and least likely to see harmful ones.
Anecdotally it seems that thus far RECs have interpreted the guidelines on diversity largely to mean that documents should be translated for use in localities where a high proportion of the population does not have English as a first language. They could, however, set more rigorous requirements. They might request that translators be available for interviews. They could also request the specific targeting of a population, such as gay men, in order to ensure the sample reflects cultural diversity.
In this article I argue that the response of RECs and researchers to RGF 2.2.7 needs to be more sophisticated. Simply ensuring that groups are present in samples is inadequate. I begin by suggesting that what lies at the heart of the concern about the proportionate representation of diverse groups in research samples is fairness. I go on to argue that exclusion from simple participation in research is not unfair in itself. The real concern is that the results of research from which minorities are excluded will not necessarily be applicable to those minorities. It is this that would be unfair. I then examine the extent to which exclusion from taking part in research amounts to exclusion from the benefits of its results.
FAIRNESS AND TAKING PART IN RESEARCH
The underlying thought of RGF 2.2.7 is that it is unfair for minorities not to be represented in research. In this context, “fairness” concerns distributive justice. The idea is that society’s members should get their fair share of its burdens and benefits.10 Unemployment, for example, is one of society’s burdens. Thus if, say, Afro-Caribbeans were disproportionately unemployed, then that would be prima facie unfair and a cause for concern. Similar arguments apply in relation to the burden of imprisonment and the benefit of further education.
Is being excluded from research samples a similar cause for concern? I shall divide this question into two further ones. First, I shall ask whether simply taking part in research projects is a benefit of which excluded minorities are unfairly deprived. Second, I shall ask whether this exclusion might lead to minorities being deprived of the benefits of the research results.
In answering both questions it is useful to divide research projects into two broad groups: quantitative and qualitative studies. Quantitative studies, such as randomised controlled trials, are usually concerned primarily with the effects of different treatments, such as drugs or devices. They ask questions such as whether treatment A is better than treatment B in terms of an end point, such as mortality or morbidity. These end points can be measured in figures; statistical work can then deliver an answer to the overall research question. Qualitative studies, by contrast, gather data that are primarily verbal rather than numerical. Their focus is on social phenomena of which the researcher is trying to develop an understanding. This division is a simplification. It is possible—for example—to do quantitative research into social phenomena and qualitative research into treatments. I do not think, however, that the simplification masks any important ethical issues.
In the case of quantitative research it seems unlikely that groups excluded from, say, a trial comparing a new treatment are missing out on a benefit. Clearly, those in the trial stand a chance of benefiting from a new, better treatment but the treatment could turn out to be worse. Qualitative research is almost always non-therapeutic or, to use the preferred language of the latest Helsinki declaration, it confers neither possible great benefit nor possible great harm.11 This being the case it is clear that those excluded from such research are not being deprived of a benefit and, as such, are not suffering unfairness.
There is, of course, the Hawthorne effect12; as a rule, participants in research projects do better than non-participants, whatever the research and whatever arm of that research the participants are in. The Hawthorne effect is fairly small, however, and its existence is hardly sufficient to drive a campaign to ensure research samples reflect cultural diversity. At most it would seem to imply only that we should recruit as many participants as possible, whatever their cultural background.
Thus it seems that the exclusion of a group from taking part in research is not, as such, unfair; however, this conclusion rests upon an important assumption, namely that the research itself is ethically sound. In particular, it is important that the researchers are at a point of equipoise or, to use Bayesian terminology, that their prior probabilities do not suggest that one arm of the study is significantly superior to another. If this condition were not met then at least two types of problem could emerge. The first is that an unrepresented group could be denied cutting edge treatment that we have strong grounds to believe superior. The second is that a represented group could be exposed to a risky new treatment that the researchers hope is superior but which we have strong grounds to believe may have significant negative effects. Both types of problem appear to have arisen in the USA. In particular, the economically poor (who are often disproportionately from particular ethnic groups) are either deprived of treatments or are used in research for treatments that are then only available, once proven effective, to the rich (Mastroianni et al,7 pp 75ff).
I conclude, therefore, that the exclusion of groups from taking part in research is not, as such, unfair provided the research itself is ethically sound. I doubt that this is a controversial conclusion. The more important question is whether such exclusion deprives minorities of the benefits of research results. It is this to which I turn next.
FAIRNESS AND THE BENEFITS OF QUANTITATIVE RESEARCH RESULTS
I shall focus first on quantitative research examining the treatment effect of drugs. Drugs vary in their effects on individuals. If they did not, there would be no need for large scale drug trials; one could simply look at the effect on one person and know that the effect will apply to all others. These differences in drug effects are called “polymorphism”. There seem to be three causes of polymorphism.13,14
The first is environmental—for example, differences in diet (such as the presence of high fat or low salt) make some drugs more effective.
The second reason is cultural. This is closely linked to the first reason. As an example, there may be some cultures in which compliance with drug treatment is an issue, others in which alternative medicines, such as acupuncture, are likely to be in use and may disrupt pharmacokinetics, and others in which a high fat diet is common.
The third reason is genetic. In particular, people inherit genes that control their liver metabolism and can result in them being either slow or fast metabolisers of drugs. Slow metabolisers will find a drug less effective and are more likely to experience drug toxicity.
Environmental and cultural reasons are probably only markers for a primary cause. Thus, when we see a difference in treatment effect based on culture we look for a specific behaviour that is the cause of that difference—for example, a high salt diet negating the effect of an antihypertensive drug. When we see a difference based on genes, however, we believe it is the genes themselves that are the cause. For this reason most recent research into polymorphism has centred on genetics as a cause.
There are several types of genetic variability.15 Race or ethnicity is said to be implicated in some of these types such that certain racial or ethnic groups will have higher proportions of, say, slow metabolisers than others. Japanese and Inuit populations—for example—have a high proportion of rapid acetylation metabolisers; European and African populations have an equal proportion of slow and rapid metabolisers. Examples of drugs affected include clonazepam (an anti-epilepsy agent) and the stimulant, caffeine.
Thus, we should expect to see differences in the treatment effects of some drugs broadly based around racial or ethnic divisions.16 This gives us grounds to believe it would be unfair if the affected ethnic and racial groups were excluded from drug research participation.
However, caution is necessary here. Some American authors have recently questioned the idea that racial or ethnic differences based on genetics are likely to be the true basis for large differences in treatment effect. It is already well established that the genetic differences between racial groups are minute17 and that there are greater differences within racial groups than between them.18
From the field of biological anthropology, Marks is scathing concerning the use of racial categories as markers for genetic difference. Essentially he argues that race is a social construct with no firm basis in the real world and certainly not in genetics. Marks does go on to say, however, that people exhibit genetic variation based on geography; they are most like their near neighbours and least like people distant from them.19 From this it follows that in the case of ethnic groups it is more plausible that there will be a genetic line from child to parents to grandparents and so on. As such, genetic differences between ethnic groups may be more marked than between racial ones. The cultural processes that maintain the idea that there exist different ethnic groups lead to some genetic differences between those groups.20
Ethnicity is, then, largely a matter of self perception rather than objective fact. This leads to a further problem of how people end up in particular categories—for example, people categorised as Afro-Caribbean may have very different ethnic backgrounds; some may have one “white” parent. MacBeth looks at various attempts to define ethnic groups and concludes that accurate definitions are impossible “because of the absence of meaningful boundaries”.21
Defining someone’s gender is generally less problematic, and the existence of gender-based polymorphism is fairly well established for some drugs, for example alcohol. Indeed, not only do men and women process some drugs differently, women may exhibit polymorphism during different stages of the menstrual cycle as well as pre- and postmenopause (Mastroianni et al,7 pp 140–1). Finally, there is evidence that age is a factor in polymorphism. Children8 and the elderly22 process drugs differently from other groups.
With these points in place, then, I turn back to RGF 2.2.7 and the question of how RECs and researchers should respond to the fact that differences in treatment effect are present between groups, whatever the reason. I have a number of suggestions.
Identify the research where representation matters
In the first place, most of the available research is from the USA and is concerned with ethnic and racial groups. Research Governance Framework for Health and Social Care para 2.2.7 also picks out other groups, such as those based on sexuality, gender, and age. Of these, gender and age are both plausible bases for polymorphism.
Among research proposals it would seem useful to identify those where a difference in treatment effect is likely to be present. If one were undertaking research into a new anti-anxiety drug of a type like diazepam, for example, then it seems plausible that the treatment effects will be different for those of Chinese or Japanese origin. For other drugs it will seem unlikely that there will be differences. Johnson16 tells us:
“[It] should be possible to identify those drugs most likely to exhibit differences in their pharmacokinetics. For example,
a drug which is eliminated entirely by the kidneys through filtration and reabsorption and is not highly bound to plasma proteins...is
highly unlikely to exhibit racial differences in its kinetics.”
Similarly, knowledge of the way groups based on gender or age process drugs should allow identification of new products where polymorphism is a possibility. If, then, the research is into a drug where it is highly unlikely that there will be different treatment effects among particular groups we should be unconcerned about the make up of the research sample. The results of the research will apply across all groups no matter who the research participants were.
Up to now I have focused on the treatment effects of drugs; I would suggest that the argument applies a fortiori to medical devices, where I can think of very few examples where one would expect different treatment effects (although age might be a factor in some cases, perhaps).
If representation matters then proper subgroup analysis is necessary
If there are grounds to expect different treatment effects between groups then simple representation in the sample will be insufficient. For example, if our sample includes 3% of an ethnic group for whom treatment effects differ from the rest of the sample then this result will be swallowed up by the overall result. Thus, the researcher will need to do a specific subgroup analysis in order to identify the different treatment effects. This has two significant implications for RECs and researchers.
The first is that if a REC specifies that a group must be present in a quantitative trial then they must also specify that subgroup analysis must be done or, at least, that the data must be collected in such a way that subgroup analysis can be done in the future. It is tokenism to insist on the participation of particular groups unless something is to be done with the results. The second significant implication is that the proportion of such groups in the sample is irrelevant. What matters is whether there are sufficient numbers to perform the subgroup analysis.
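Both implications can be illustrated with a rough numerical sketch. All of the figures below are hypothetical and chosen only for illustration: a drug that lowers blood pressure by 10 mmHg in most participants but not at all in a group making up 3% of the sample, and an assumed standard deviation of 15 mmHg. The function names are likewise illustrative.

```python
import math

def pooled_effect(effect_majority, effect_minority, minority_share):
    """Average treatment effect across a sample made up of two groups."""
    return (1 - minority_share) * effect_majority + minority_share * effect_minority

def subgroup_standard_error(sd, n):
    """Standard error of a subgroup's mean response, given its size n."""
    return sd / math.sqrt(n)

# Hypothetical: 10 mmHg reduction in 97% of the sample, none in the other 3%.
overall = pooled_effect(10.0, 0.0, 0.03)
print(f"Pooled effect: {overall:.1f} mmHg")  # the 3% barely moves the overall figure

# The same 3% share gives very different precision depending on absolute numbers.
for total in (1_000, 10_000):
    n_subgroup = round(0.03 * total)
    se = subgroup_standard_error(15.0, n_subgroup)
    print(f"Trial of {total}: subgroup n = {n_subgroup}, standard error = {se:.2f} mmHg")
```

Under these assumptions the pooled figure hides the absence of effect in the minority group almost completely, and the precision of any subgroup estimate depends on how many members of the group are recruited, not on the share of the sample they make up.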
There may also be an issue for the trial steering committee. If the REC has specified recruitment of particular groups then one task for the committee would be to ensure that the necessary recruitment occurred. This could require targeted recruitment in some cases.
(There may be very rare exceptions to this rule. For example, a researcher might be looking at how acetylator status affects the therapeutics of a drug. The researcher would be checking the status of each participant and comparing it against the reaction to a drug. In such a case it would be a good idea for the researcher to ensure a high representation of ethnic minorities because this would ensure a wide spread of slow to fast acetylators; however, subgroup analysis would be unnecessary.)
The decision on whether representation of minorities should be sought will involve a judgement of cost and benefit
The potential number of groups on whom subgroup analysis could be performed is immense. Obtaining the representation and performing the analysis will, however, cost research money. Simply translating an information sheet into another language will be costly and having an interpreter much more so. These costs should not be imposed unless there is a plausible benefit (as already stated in point 1). Furthermore, such costs should only be imposed if the likely benefit is worthwhile. As an extreme example, seeking Inuit representation in UK samples would seem to give little benefit for great cost. Conversely, seeking the representation of South Asian populations would seem worthwhile, but (to restate) only where there is a plausible expectation of treatment differences. More difficult decisions would lie with groups such as those of Chinese or Japanese origin who are not greatly represented in the UK population.
There is at least one other cost implication. Quantitative trials are powered to give results that are statistically significant for the population as a whole. Where subgroup analysis is performed the results may suggest differences between the main population and the subgroup, but further research will be necessary to confirm these. Thus a decision must be made whether to power the original research sufficiently to do subgroup analysis in the first place, and then whether to do further research in the light of a suggestive but underpowered result.
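The scale of this cost can be sketched with a standard normal-approximation formula for comparing two means, n per arm = 2((z₁₋α/₂ + z_power)σ/δ)². The figures are again hypothetical (a 5 mmHg difference to detect, a standard deviation of 15 mmHg, a subgroup making up 3% of recruits), and the formula is a textbook approximation, not a substitute for a statistician's calculation.

```python
import math
from statistics import NormalDist

def participants_per_arm(difference, sd, alpha=0.05, power=0.80):
    """Approximate participants needed in each arm to detect `difference`
    between two means (two-sided test, normal approximation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_power) * sd / difference) ** 2)

# Hypothetical: detect a 5 mmHg difference when the standard deviation is 15 mmHg.
n_arm = participants_per_arm(difference=5.0, sd=15.0)
subgroup_needed = 2 * n_arm  # two arms, within the subgroup alone

# If the subgroup forms 3% of those recruited without targeting, the whole
# trial would have to be this large for the subgroup to reach that number.
trial_needed = math.ceil(subgroup_needed / 0.03)

print(f"Per arm: {n_arm}; subgroup total: {subgroup_needed}; whole trial: {trial_needed}")
```

On these assumptions a trial that would otherwise need a few hundred participants would need several thousand before the subgroup comparison was adequately powered, unless members of the subgroup were recruited in a targeted way.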
Representation is only partly a locality issue
At present, ethics committees treat the representation of minorities as a locality issue. Thus, a researcher might be asked to provide translated information sheets in an area that has a high proportion of, say, South Asians. There is a danger of this being tokenism. If minorities are to be represented it is because of the potential for differences in treatment effect; this is an issue of science. Scientific matters are not, and should not be seen as, locality issues.
There is another sense, however, in which it is reasonable to treat representation as a locality issue. This would be where the researchers or the MREC have identified representation of a particular group as scientifically important. The LREC could then say that this group is present in greater than average numbers in their area and that, therefore, there should be an effort to enrol them. But this should not be a matter of course. The LREC in an area with a high proportion of a particular group should not insist on its representation in every trial: only in those trials where the group can plausibly be expected to experience different treatment effects.
Clearly, representation is not a locality issue for at least two of the other groups mentioned in RGF 2.2.7. One would assume that age and gender differences are fairly evenly distributed throughout the population. The same would not be true of economic status and sexuality; however, for these types of group it seems very unlikely that there will be genetic-based differences in treatment effect. As such, their representation is more likely to be of import for qualitative research. It is to that topic I turn next.
FAIRNESS AND THE RESULTS OF QUALITATIVE RESEARCH: LET A THOUSAND FLOWERS BLOOM
Factors such as ethnicity, sexuality, gender, and economic status are likely to be markers for significant cultural differences. These differences may be of great importance for the social phenomena studied by qualitative research. As such, groups excluded from qualitative research may be deprived of its benefits.
A hypothetical example may illustrate this point. Imagine that a series of interview studies suggest that cancer patients like to receive information about their diagnosis without family members present. These studies have been conducted only upon a sample that excludes the members of one group who feel very strongly that they want family members present when diagnostic information is discussed. The interview studies are used, however, to make a policy recommendation—namely that patients should be alone when their cancer diagnosis is discussed. Clearly this disadvantages the unrepresented group.
Thus we have a prima facie case to say that the exclusion of a group from qualitative research that has policy implications is unfair. And although most qualitative research is small scale and does not have policy implications on its own, each small scale study contributes to an overall picture. If groups are systematically excluded, that overall picture will be deceptive and will lead to unfair policy decisions.
I doubt, however, that the stipulation of particular cultural mixes in samples would be an appropriate response to this problem. The sample sizes in qualitative research are often very small; set this alongside the fact that the number of cultures is very great and the impossibility of meeting such a stipulation becomes apparent. People vary in their ethnicity, gender, disability, age, and sexual orientation, as the Research Governance document tells us; they also vary in their class, urban or rural status, and in subcultures (such as the “drug culture” or the “gun culture”). Qualitative researchers could not reflect such diversity in each of their samples.
In fact, RGF 2.2.7 does not need to be interpreted as requiring such a stipulation. The final sentence tells us:
It is particularly important that the body of research evidence available to policy makers reflects the diversity of the population.1
I suggest the best way to ensure this is simply to encourage qualitative research. Such researchers are curious about the world; left to their own devices, social scientists want to examine cultural differences. A glance through the journal Social Science & Medicine should be enough to convince the reader of this. There are, of course, many examples elsewhere.23–26 In general, if we “let a thousand flowers bloom” in qualitative research, then the body of research evidence that reflects the diversity of the population will, for the most part, develop. Where there are gaps then the appropriate response would be to commission specific research, not to prevent other research.
One caveat is necessary. Qualitative research is not always small scale—for example, a government department might commission research into attitudes to organ donation with a view to changing the policy from one where it requires consent from relatives to one of presumed consent. Were groups to be excluded from this study then the recommendations that it makes could be entirely inappropriate for those groups. In cases such as these there seem to be much stronger grounds for RECs to say that the researchers must ensure that the samples in this type of qualitative research are culturally representative.
CONCLUSION
This paper has focused on one question: whether research samples should reflect the diversity of the population. This question arose, in part, from RGF 2.2.7. I have suggested that taking account of the multicultural nature of society, as RGF 2.2.7 requires, is not a straightforward matter. There is a danger of the arbitrary and pointless imposition of barriers to both qualitative and quantitative research. One example would be insisting that every research project that takes place in areas with a high proportion of some ethnic groups uses translated information sheets. Such a response is inadequate and could be viewed as mere tokenism.
For quantitative research, careful analysis is needed to identify those projects where factors such as age, gender or ethnicity may significantly alter treatment effects. If the analysis confirms that one such factor does, then further thought is needed to decide whether the cost of ensuring representation of that group, followed by subgroup analysis and possible further research, is worthwhile. It is perhaps worth adding that this issue may gradually fade in importance as treatments come on line that are individually tailored rather than based on fairly crude subgroupings of people (Burroughs et al13).
For qualitative research, cultural factors will be important in many, perhaps most, cases. But the best way to ensure they are taken into account is to allow the research to proliferate and to commission research into neglected areas.
Of course, this does not exhaust the issues relating to research and cultural diversity. There are a number of questions that follow from, or are related to, the one tackled here. In the first place, if we are to seek the involvement of minorities then how should this be done? Ashcroft et al27 suggest there is insufficient research evidence available on cultural barriers to, for example, participation in randomised controlled trials. They also suggest there is a need to protect groups from inappropriate involvement in research, for example where language or cultural barriers make informed consent unlikely.
There is also a further point: RGF 2.2.7 talks of the need to ensure that available research evidence reflects diversity. One possible interpretation of this is that the research evidence should meet the needs of a diverse population. At present, some argue, this does not occur—for example, some of the needs of women are not met in research because the necessary research is not done. On a global scale, it is said that “less than 10% of worldwide health research is devoted to diseases that account for 90% of the global burden of disease”.28 These are important issues concerning justice. They move beyond the question of research samples, however, to questions of research commissioning and, as such, they are beyond the scope of this paper.
I would like to thank Su Mason, Paul Ramcharan, Su Read, Steve Baker, and the two referees of this journal for their comments.