Media attention to retracted research suggests that a substantial number of papers are corrupted by misinformation. In reality, every paper contains misinformation; at issue is whether the balance of correct versus incorrect information is acceptable. This paper postulates that analysis of retracted research papers can provide insight into medical misinformation, although retracted papers are not a random sample of incorrect papers. Error is the most common reason for retraction and error may be the principal cause of misinformation as well. Still, one-quarter of retracted papers are fraudulent, and misinformation may also arise through fraud. This paper hypothesises that error and fraud are the main sources of misinformation and that error is more common than fraud. Retraction removes misinformation from the literature; bias is non-retracted misinformation. Bias arises when scientific impropriety results in false research findings. Impropriety can involve experimental design, data collection, data analysis, or data presentation. Yet impropriety also arises through earnest error or statistical naiveté; not all bias is fraud. Several measures are proposed to minimise misinformation in the medical literature, including: greater detail in the clinical trial registry, with rigorous definition of inclusion and exclusion criteria and primary endpoints; clear statistical criteria for every aspect of clinical trials, especially sample size; responsibility for data integrity that accrues to all named authors; increased transparency as to how the costs of research were paid; and greater clarity as to the reasons for retraction. Misinformation can arise without malicious intent; authors of incorrect papers are owed a presumption of incompetence, not malice.
- Data fabrication
- data falsification
- head injury
- professional misconduct
- scientific research
‘Never ascribe to malice that which is adequately explained by incompetence.’
Napoleon I (1769–1821)
The perfectly objective, totally correct, and completely unbiased research paper has not yet been written. Therefore, every published paper contains misinformation (wrong or incorrect information), and the issue is whether the balance of information versus misinformation is acceptable. Only history can judge whether a particular paper is more right than wrong, but we know that misinformation has a major impact in medicine. In 2009, Scott Reuben, an anaesthesiologist at Baystate Medical Center, was forced to retract 21 papers published over the course of 13 years, and there is a suggestion that his papers may have resulted in many patients being undermedicated for postsurgical pain.1 Therefore, we must ask: how does medical misinformation arise, and what can be done to minimise its impact on patient care?
Misinformation in the medical literature
Media attention to recent retracted research papers suggests that misinformation is rife in the medical literature. For example, a paper claiming that the measles–mumps–rubella vaccine harms children led to a clamorous anti-vaccine movement, but the paper was eventually retracted when it became clear that financial conflicts of interest had tainted the work.2 The number of papers that are retracted yearly for fraud has increased sharply over the past decade, which may reflect either a real increase in the incidence of fraud or a greater effort on the part of journals to police the literature.3 Papers retracted for fraud (fabrication or falsification) may represent a calculated effort to deceive; such papers are often written by ‘repeat offender’ authors who target journals with a wide readership.4
It has been estimated that half the medical literature is wrong.5 This jarring conclusion is the good news; findings from underpowered, early-phase clinical trials are true perhaps 25% of the time. If 30 000 genes are tested in a gene-association study, but only 30 are likely to be linked to disease (in other words, when tested relationships exceed true ones by 1000-fold), the likelihood that a claimed relationship is true is very nearly zero.5 Most study designs in most settings can lead to erroneous results because of two types of error inherent to every study. Type I (α) error is false rejection of a null hypothesis, equivalent to finding a relationship where none exists. Type II (β) error is false acceptance of a null hypothesis, equivalent to overlooking a real relationship. All studies are vulnerable to one or the other source of error.
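The gene-association example can be made concrete with a short calculation. The sketch below is illustrative only: the significance threshold (α = 0.05) and the statistical power (80%) are conventional values that we assume here, not figures taken from the cited analysis, and the function name is hypothetical.

```python
def positive_predictive_value(n_tested, n_true, alpha, power):
    """Share of 'significant' findings that reflect a real relationship.

    n_tested: number of relationships tested (e.g. genes)
    n_true:   number of relationships that are actually real
    alpha:    type I error rate (assumed, not from the source)
    power:    1 - type II error rate (assumed, not from the source)
    """
    n_null = n_tested - n_true
    true_positives = n_true * power    # real relationships that are detected
    false_positives = n_null * alpha   # null relationships flagged by chance
    return true_positives / (true_positives + false_positives)


# 30 000 genes tested, 30 truly linked to disease (figures from the text)
ppv = positive_predictive_value(n_tested=30_000, n_true=30, alpha=0.05, power=0.80)
print(f"Probability that a claimed association is true: {ppv:.3f}")
```

Under these assumed values, fewer than 2 in 100 claimed associations would be real, which is consistent with the statement that the likelihood is very nearly zero. A stricter α or a higher prior proportion of true relationships would raise the figure.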
Under what circumstances are research findings most likely to be wrong? Mathematical analysis suggests that misinformation is favoured when:5
Sample sizes are small;
Effect sizes are small;
Many hypotheses are tested in a single dataset;
There is greater flexibility as to study design and outcome assessment;
The area of research is highly competitive, so that preliminary results are reported.
Basically, the hotter the field, the more likely it is to be corrupted with misinformation.5 It is worth noting that a topic can be perceived as hot for many reasons: a devastating illness is clearly a hot topic for pharmaceutical companies; a new tool or a new hypothesis is often a hot topic for basic scientists.
Misinformation is not limited to isolated studies; established ideas and meta-analyses can also be wrong. A study evaluated the proportion of highly cited (ie, established) papers that are subsequently overturned.5 A total of 49 clinical papers was identified, each of which had been cited more than 1000 times, so these papers had an enormous impact, yet a substantial fraction of them was invalidated by later studies. Among 45 highly cited papers claiming efficacy for a particular medication, seven were subsequently contradicted, seven were weakened, 11 were unchallenged (no subsequent papers, whether confirmatory or contradictory) and 20 papers were replicated. That 11 papers went unchallenged is actually worrisome, because clinicians view these findings as established fact without replication. The replication rate of highly cited papers was 44%, while the rate at which such papers were weakened or refuted was 31%. Overall, 83% of non-randomised studies (five of six) were wholly or partly refuted, as were 23% of randomised clinical trials (RCT) (nine of 39); thus, even RCT were invalidated. The only reliable predictor of robustness to replication was sample size; large studies are more likely to be supported.5
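The percentages quoted for the highly cited papers follow directly from the counts given. A minimal tally, using only the numbers stated in the text (the variable names are ours), is:

```python
# Fate of the 45 highly cited papers that claimed efficacy (counts from the text)
outcomes = {"contradicted": 7, "weakened": 7, "unchallenged": 11, "replicated": 20}
total = sum(outcomes.values())  # 45

replicated_rate = outcomes["replicated"] / total
weakened_or_refuted_rate = (outcomes["contradicted"] + outcomes["weakened"]) / total
unchallenged_rate = outcomes["unchallenged"] / total

# Refutation by study design: (wholly or partly refuted, total studies)
non_randomised_rate = 5 / 6
rct_rate = 9 / 39

print(f"Replicated:              {replicated_rate:.0%}")           # 44%
print(f"Weakened or refuted:     {weakened_or_refuted_rate:.0%}")  # 31%
print(f"Unchallenged:            {unchallenged_rate:.0%}")         # 24%
print(f"Non-randomised refuted:  {non_randomised_rate:.0%}")       # 83%
print(f"RCT refuted:             {rct_rate:.0%}")                  # 23%
```

The tally also shows that roughly one-quarter of these influential papers had never been tested by a subsequent study.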
The published record in hepatology was evaluated, to see how often ‘established wisdom’ stands the test of time.6 A total of 474 papers was selected to cover the time from 1945 until 1999. Among 474 conclusions in these papers, 60% were deemed to be correct in 2002. Forty per cent of the papers were thus flawed by misinformation—wrong or incorrect information—although it was not known as misinformation at the time of publication. Overall, 19% of conclusions were obsolete (eg, immunoglobulins prevent hepatitis A infection, but an effective hepatitis vaccine is now available), and 21% of conclusions were incorrect. Surprisingly, half the papers had conclusions thought to be true 45 years post-publication.6
Misinformation is not randomly distributed in the medical literature.6 In general, studies thought to be correct were published more recently, but this is misleading; one cannot conclude that recent studies are right, because there has been less time to refute them. Still, research quality has improved over time; more than half the RCT—and all the meta-analyses—were published after 1980. There were more substantiated conclusions in meta-analyses (82%) than in RCT (62%) and non-randomised studies (50%), but meta-analyses were still wrong 18% of the time.6
How can a meta-analysis be vulnerable to misinformation? A distillation of relevant knowledge from many sources would seem less vulnerable to the circumstances that favour misinformation. Yet recent evidence suggests that multiple publication—duplicative publication of data in multiple papers—can seriously prejudice meta-analyses.7 A Swedish regulatory authority required submission of all patient data for evaluation, before marketing approval for selective serotonin-reuptake inhibitors for major depression. Access to full patient documentation enabled researchers to detect duplicative or selective publication, by comparing each paper with the underlying data. There was clear evidence of multiple publication; 21 patient studies contributed to at least two papers, and three studies contributed to five papers. There was also evidence of selective publication; studies that reported significant positive drug results were more often published as stand-alone studies than were studies with negative results. Meta-analyses limited to publicly available data are thus likely to be based on unreliable evidence.7
Industry sponsorship affects the likelihood of publication.8 Research funded by pharmaceutical companies is less likely to be published overall, although published studies are typically of high quality. If a company funds research on a specific medication, then published results tend to favour that product.8 These findings suggest two plausible interpretations: either the publication is misleading (eg, highlight the good news, bury the bad) or drug makers are clever enough to predict which medications are likely to be successful.
Publication bias and selective outcome reporting may affect more than 50 medical indications, including depression, attention-deficit disorder, Alzheimer's dementia, pain, diabetes, anxiety, migraine, ulcer, cardiovascular disease (CVD), irritable bowel, influenza, urinary incontinence and elevated cholesterol.9 The reason for publication bias is not only that journals reject papers with negative findings; in addition, authors tend not to submit such papers. Although the evidence in each case is not strong, the weight of evidence is convincing.9
How does misinformation arise in a RCT?
Claims about pharmaceuticals are often at the centre of charges of medical misinformation (table 1). Many medications have been associated with serious adverse events or harmful drug interactions that could or should have been identified in a RCT. Off-label marketing and misleading advertising can be used to market a medication for an indication not adequately supported by a RCT.
Yet there is a tendency to dismiss the idea that medications are associated with misinformation because pharmaceutical companies rely on RCT and it is hard to see how a RCT—the ‘gold standard’ of medical research—could be wrong. To evaluate how misinformation can arise even in the context of a large and seemingly well-designed RCT, we examine evidence that supports the use of rosuvastatin in asymptomatic people.10
The ‘Justification for Use of Statins in Prevention: an Intervention Trial Evaluating Rosuvastatin’ (JUPITER) randomly assigned 17 802 apparently healthy men and women to rosuvastatin or placebo.10 Subjects were eligible for enrolment if they had no history of CVD and normal low-density lipoprotein (LDL) cholesterol levels, but elevated levels of C-reactive protein (CRP). The use of CRP as an entry criterion was controversial; CRP was interpreted as an inflammatory marker, but critics later pointed out that the first author holds a patent on a CRP test.10
JUPITER claimed a clear benefit for healthy people taking rosuvastatin.10 The primary endpoint was the first occurrence of a CVD event and rosuvastatin resulted in a 43% reduction in this endpoint. Yet only 393 CVD events were observed, perhaps because healthy people had been enrolled. Furthermore, median follow-up was only 1.9 years because an independent data safety and monitoring board closed the trial, citing an unspecified stopping rule.10
Replication of the JUPITER trial was problematical. AURORA enrolled 2776 patients with end-stage renal disease on maintenance haemodialysis, and patients were randomly assigned to placebo or rosuvastatin.11 After 3.2 years of follow-up, rosuvastatin reduced LDL-cholesterol by 43%, but there was no corresponding reduction in morbidity or mortality, even in this sickly population.11
It was probably a major error to treat asymptomatic people in the JUPITER trial; the perception of benefits and risks associated with statins has evolved. Physicians now question the wisdom of lowering LDL-cholesterol to a specific target, because cholesterol lowering is seldom associated with benefit and can be associated with risk.12 There is also little evidence that all-cause mortality is reduced by statins; a meta-analysis of mortality in 65 229 patients concluded that statins are not associated with significant survival benefit.13 Ancillary benefits of statins are also minimal; among 2.0 million people in the UK, statins were not associated with a reduction in the risk of rheumatoid arthritis, venous thromboembolism, dementia, Parkinson's, osteoporotic fracture, or most cancers.14 Although the oesophageal cancer risk was significantly reduced by statins, the risk increased significantly for acute renal failure, liver dysfunction, myopathy and cataract.14
The early termination of JUPITER was almost certainly a second major error; truncated trials are known to have implausibly large benefits.15 16 Comparison of 91 truncated RCT with 424 similar, non-truncated RCT found that risk appears 29% lower in truncated trials.16 Risk underestimation is worse when fewer than 500 outcome events are observed.16 Truncated trials also shorten the interval during which adverse events associated with treatment can be identified. In short, early stopping overestimates benefit and underestimates harm.16 Early stopping is apparently more common in industry-funded trials; among 143 trials stopped early for benefit, average recruitment was only 63% of the planned sample, and median follow-up was just 13 months.17 Truncated trials often fail to report what informed the decision to stop early and show implausibly large treatment effects, and they are becoming more common,17 while costing the sponsor less money to complete. This is a worrisome confluence of market forces.
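Why truncation inflates apparent benefit can be illustrated with a small simulation. This is a sketch under assumed values, not a reconstruction of any cited analysis: the event rates (10% on placebo, 8% on treatment), the interim sample of 500 patients per arm, and the stopping threshold (z > 2.5) are all hypothetical, and the stopping rule is deliberately simplified to a single interim look.

```python
import random
from math import sqrt


def simulate_interim_look(rng, p_control, p_treated, n_interim, z_stop):
    """Simulate one interim analysis; return (stopped_early, observed_risk_difference)."""
    events_c = sum(rng.random() < p_control for _ in range(n_interim))
    events_t = sum(rng.random() < p_treated for _ in range(n_interim))
    rate_c = events_c / n_interim
    rate_t = events_t / n_interim
    diff = rate_c - rate_t  # positive means treatment looks beneficial
    se = sqrt(rate_c * (1 - rate_c) / n_interim + rate_t * (1 - rate_t) / n_interim)
    z = diff / se if se > 0 else 0.0
    return z > z_stop, diff


def mean_effect_when_stopped(n_trials=1000, seed=1, p_control=0.10,
                             p_treated=0.08, n_interim=500, z_stop=2.5):
    """Average observed benefit among simulated trials that stop early for benefit."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    stopped = []
    for _ in range(n_trials):
        early, diff = simulate_interim_look(rng, p_control, p_treated, n_interim, z_stop)
        if early:
            stopped.append(diff)
    if not stopped:
        return float("nan"), 0
    return sum(stopped) / len(stopped), len(stopped)


true_difference = 0.10 - 0.08  # the benefit built into the simulation
mean_diff, n_stopped = mean_effect_when_stopped()
print(f"True risk difference:              {true_difference:.3f}")
print(f"Mean observed in truncated trials: {mean_diff:.3f} ({n_stopped} trials stopped early)")
```

Only trials whose interim estimate happens to be unusually favourable cross the stopping boundary, so the truncated trials, taken as a group, report a larger benefit than the one that truly exists. The size of the inflation depends on the assumed values; the direction does not.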
It was probably a third major error to use CRP as an entry criterion for JUPITER.15 Mortality in JUPITER was less than expected, even in the placebo arm,10 which demonstrates that CRP is a poor predictor of CVD risk. Meta-analysis of 83 RCT found that elevated CRP is associated with just a 19% increase in the risk of CVD events, among 61 684 patients with coronary artery disease.18
Is JUPITER tainted by misinformation? One analysis concluded that it is:19
JUPITER was terminated for arbitrary and unspecified reasons: there were only 240 unambiguous CVD events among 17 802 study participants, which is unexpectedly low, and there was apparently no difference between medication and placebo in terms of mortality from myocardial infarction plus stroke;
Some findings may have been a statistical fluke: the proportion of myocardial infarctions that were fatal was 9% in untreated patients and 29% in treated patients, when 50% would be expected;
The role of the sponsor was pervasive: the ‘sponsor collected the trial data and monitored the study sites’, although the manuscript was not subject to sponsor approval;
Serious conflicts of interest were minimised in the disclosure statement.
In general, what problems can distort the results of a RCT? Without doubt, the worst problem is lack of statistical power, and this is aggravated if studies are truncated. Random fluctuations in endpoint occurrence among placebo-treated patients can have an enormous impact on the calculated benefits of treatment. The exclusion of specific patients from a study can mean the sample is a poor surrogate for the general population, and any protocol violation is problematical, especially if study participants are excluded in a systematic manner.
Additional problems that distort RCT have been described:5 stretching protocol-specified inclusion or exclusion criteria to accommodate the realities of patient recruitment, so that ‘found’ patients can be included or troublesome patients excluded; using novel post-hoc analyses, to highlight unanticipated relationships; reporting results for hypotheses undescribed in the protocol, especially if such hypotheses are reported as if they were primary; and altering study-related definitions to increase sample size or otherwise increase statistical significance.5
Misinformation results from error and fraud
Analysis of retracted papers may yield insights into the nature of medical misinformation.2 We do not propose that retracted papers are a random sample of papers with misinformation; retracted papers may be more naive, more egregious, or more obviously wrong, otherwise they might never be retracted. Yet, in the end, papers with misinformation may be more similar to retracted papers than to papers that stand the test of time.
Error is the most common reason for retraction,3 and it is fair to assume that error is the most common cause of misinformation overall. Nevertheless, 26.6% of retracted papers are fraudulent,3 and it is also fair to assume that at least some non-retracted misinformation arises through fraud. We postulate that error and fraud are the main sources of scientific misinformation and that error is more prevalent than fraud. Retraction of a paper removes misinformation from the literature; bias is non-retracted misinformation.
This definition of bias explicitly avoids any notion that bias relates to the correctness or relative merit of competing viewpoints. Every person who comes to a paper—whether an author, an editor, a referee, or a reader—brings a different viewpoint. Who can know which viewpoint is the more correct? Therefore, any definition of bias that hinges on a claim that one viewpoint is more correct than another is doomed to fail, because the claim of correctness is moot until long after publication.
Bias arises when scientific impropriety results in false research findings.5 Impropriety can involve experimental design, data collection, data analysis, or data presentation. Some forms of impropriety that result in bias should probably also result in retraction, as bias can potentially result from data fabrication or falsification. Yet bias can also result from earnest error, statistical naiveté, or other innocent causes; not all bias is fraud. Efforts to ‘prove’ bias in the literature have largely failed to prove anything more insidious than error,3 with a few notable exceptions.9
Journals are claimed to be part of the marketing department of pharmaceutical companies,20 yet studies demonstrating that authors have a pecuniary interest in trials are provocative, not definitive. If an investigator believes in a treatment, then he or she may invest in that treatment, either financially or emotionally. This may explain why author conclusions in RCT tend to favour interventions, even if financial conflicts are openly declared.21
The fact that drug company RCT favour intervention does not prove that such trials are biased; companies are shrewd enough to back a winner. Industry-funded trials are positive 85% of the time, whereas government-funded trials are positive 50% of the time. This may relate to the fact that clinical trials funded by industry are usually late-phase trials,22 which are more likely to have a positive outcome because more is known at trial inception. Nevertheless, industry is more likely to fund trials that use an active comparator (not placebo), and it is harder to prove efficacy against an active comparator, which may be why industry trials are often large and multicentre.22 Logistic difficulties associated with large or complex industry trials may explain why publication is often slower following completion.22 Finally, legal considerations also enforce caution; academic scientists who err in a clinical trial are largely immune from prosecution, but a company that makes an error may lose millions of dollars.
Selective outcome reporting is apparently common in RCT. Among 122 papers evaluated, which reported 3736 outcomes from 102 clinical trials, 50% of efficacy outcomes and 65% of harm outcomes per trial were incompletely reported.23 Yet complete reporting would require reporting details about 31 outcomes per paper (=3736/122); neither editors nor readers would be likely to tolerate such a laundry list of outcomes. Furthermore, when so many outcomes are evaluated, the multiple comparisons problem is intractable. Therefore, it should not be surprising that, ‘statistically significant outcomes had a higher odds of being fully reported compared with nonsignificant outcomes’.23
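The difficulty of testing some 31 outcomes per paper can be quantified. The sketch below assumes, for simplicity, that the outcomes are statistically independent and that each is tested at α = 0.05; neither assumption comes from the cited study, and correlated outcomes would give a somewhat lower figure.

```python
outcomes_reported = 3736  # from the text
papers = 122              # from the text
outcomes_per_paper = round(outcomes_reported / papers)  # 31

alpha = 0.05  # assumed per-outcome significance threshold

# Probability of at least one false-positive result among the outcomes,
# if no treatment effect exists and the tests are independent (assumption)
familywise_error = 1 - (1 - alpha) ** outcomes_per_paper

# Bonferroni-corrected threshold that would hold the overall rate at alpha
bonferroni_threshold = alpha / outcomes_per_paper

print(f"Outcomes per paper:               {outcomes_per_paper}")
print(f"Chance of >=1 spurious finding:   {familywise_error:.2f}")
print(f"Bonferroni per-outcome threshold: {bonferroni_threshold:.4f}")
```

Under these assumptions, a trial of a wholly ineffective treatment would have roughly a four-in-five chance of producing at least one nominally significant outcome, which helps to explain why significant outcomes are available to be reported preferentially.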
It has been postulated that many authors ‘spin’ results, resulting in an inaccurate description of study findings.24 Yet spin is defined in such a way that every author is guilty of it; spin occurs when authors ‘shape the impression of their results for readers….’ ‘Spin’ can involve arguments that seek to minimise the impact of negative information; among 72 published RCT in which every primary outcome was found to be statistically non-significant, several strategies were used to argue that treatment was still beneficial.24 For example, attention was directed to significant within-group comparisons that may have been accidental findings. Up to 40% of papers had multiple types of ‘spin’ identified, although the definition of spin was unacceptably broad. Nonetheless, ‘spin’ may not be very important; papers with spin were published in low-impact journals and cited just four times.24 If spin is published, it seems as much the fault of editors and referees as of the researcher called upon to write about trivia.
How can misinformation be minimised?
It is obvious why scientists care about minimising misinformation in the literature; future progress can only be built upon a solid foundation of rigorously tested, carefully replicated science. It may be less obvious why other entities—pharmaceutical companies, medical writers, marketing consultants and so on—must have the same goal, because financial interests are not as obviously contingent upon minimising misinformation. Yet medical misinformation harms everyone, not least because it reduces public trust in the enterprise.
What measures might reduce misinformation, especially in the RCT literature? Clinical trial registration was a crucial first step, but trials should probably be registered in greater detail.5 For example, central registration of experimental design and analytical strategy would help authors resist the temptation to alter a protocol a posteriori. Clear inclusion and exclusion criteria should be stipulated, with definitions of primary endpoints; if such records became available in a public database, this would make data fabrication and falsification more easily detected. It may also be necessary to grant public access to published study data,9 perhaps in an anonymised database. If anonymised patient data were routinely made available, this might prevent some fraud, because fabricating results would entail fabricating a large volume of material in a database.
Clear statistical criteria should be used in every clinical trial, most especially in calculating the required sample size. If study enrolment is calculated based on power considerations, then that enrolment total should be obtained unless patient safety precludes it. Many outcome events should accrue before stopping a trial,25 and stopping rules should be very conservative. There are ethical problems associated with early stopping; truncated trials tend to overestimate treatment efficacy, and inflated estimates of efficacy put patients at risk.25
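A sample size calculation of the kind recommended here can be sketched with the standard normal-approximation formula for comparing two proportions. The event rates, α and power below are hypothetical values chosen for illustration; a real trial would specify them in the registered protocol and might use a more refined method.

```python
from math import ceil
from statistics import NormalDist


def patients_per_arm(p_control, p_treated, alpha=0.05, power=0.80):
    """Approximate patients needed per arm to detect p_control vs p_treated.

    Uses the normal approximation for a two-sided test of two proportions.
    All inputs are planning assumptions, not observed data.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for type I error
    z_beta = NormalDist().inv_cdf(power)           # critical value for type II error
    variance = p_control * (1 - p_control) + p_treated * (1 - p_treated)
    effect = p_control - p_treated
    return ceil((z_alpha + z_beta) ** 2 * variance / effect ** 2)


# Hypothetical planning values: 10% event rate on placebo, 8% on treatment
n_modest_effect = patients_per_arm(p_control=0.10, p_treated=0.08)
# Halving the expected benefit sharply increases the required enrolment
n_small_effect = patients_per_arm(p_control=0.10, p_treated=0.09)

print(f"Per arm, 10% vs 8%: {n_modest_effect}")
print(f"Per arm, 10% vs 9%: {n_small_effect}")
```

Because the required enrolment rises steeply as the expected effect shrinks, a trial that stops at a fraction of its planned sample is left with much less power than its protocol promised.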
Responsibilities for data integrity must be clearly articulated.2 Named authors should be responsible for every aspect of a paper that bears their name; authors must be intimately familiar with the data underlying a paper and must be able to vouch for its integrity. Co-authors cannot be routinely excused for fraud on the part of the first author. The benefit of a successful paper accrues to all authors, so the blame for a fraudulent paper should diffuse as widely.2
The potentially corrosive influence of money on outcome in RCT must be acknowledged. Such influence can come in the form of how RCT are structured,19 terminated,16 17 25 analysed,5 drafted,26 reported,7–9 21–24 or publicised.20 It is not possible to purge money from the system, but it is essential to make the flow of money transparent, so that potential influences can be identified. Each published RCT must address the issues of financing: Who paid for the trial? What did each author do? Who paid the authors? Did the authors have pecuniary interests in the study? What roles were filled by non-authors? Who can vouch for the underlying data? The flow and potential influence of money must be obvious to every reader.
When the system to minimise misinformation fails, as it surely will, readers must be fully alerted; currently, 31.8% of retracted papers are not noted as such in the literature.3 There should also be greater clarity as to the reason(s) for retraction, to educate fledgling authors about normal publishing practices. This is crucial when misinformation might plausibly arise from the cultural difference between science and marketing; science typically does not acknowledge the profit motive in research, whereas marketing typically does not acknowledge the uncertainty of scientific findings.
Many questions remain to be resolved. Which medications are supported by misinformation? To whom do pharmaceutical companies owe first allegiance: shareholders or patients? How common is misinformation in the non-pharmaceutical literature? How many cases of perceived bias are actually cases of research fraud? In striving to address these questions, we must remember that misinformation can arise without malicious intent; we owe authors of incorrect papers a presumption of incompetence, not malice.
Competing interests None.
Provenance and peer review Not commissioned; externally peer reviewed.