This study provides current data on key questions about retraction of scientific articles. Findings confirm that the rate of retractions remains low but is increasing. The most commonly cited reason for retraction was research error or inability to reproduce results; the rate from research misconduct is an underestimate, since some retractions necessitated by research misconduct were reported as being due to inability to reproduce. Retraction by parties other than authors is increasing, especially for research misconduct. Although retractions are on average occurring sooner after publication than in the past, citation analysis shows that they are not being recognised by subsequent users of the work. Findings suggest that editors and institutional officials are taking more responsibility for correcting the scientific record but that reasons published in the retraction notice are not always reliable. More aggressive means of notification to the scientific community appear to be necessary.
Previous studies of retraction of scientific articles have shown that the dominant reason for retraction is research error or inability to reproduce results (generally referred to as “inability to reproduce”).1 2 Citation continues long after retraction, raising questions about the adequacy of current methods of notification.3 Here we update these findings. We report changes in rate and agent of retraction and in length of time between publication and retraction and examine the role of publicity on post-retraction citation rate.
The National Library of Medicine’s PubMed database identifies retracted articles. Retractions for articles published in the decade 1995–2004 were retrieved on 9 June 2005 using publication type “retraction in”. Each case was coded to indicate who retracted the publication, the portion (part or whole) retracted, the reason for retraction and the length of time in months between publication and retraction. Ten per cent of the records were coded independently, yielding a 3% disagreement, which was negotiated to agreement.
The impact factor for the journal in which the article was published and retracted was obtained from Journal Citation Reports (2003).4 A journal’s impact factor is the number of citations in the current year to papers published in the previous 2 years, divided by the total number published in the same 2 years.5 Those with the highest impact factor are considered the most prestigious and their papers are likely to be widely cited, making it important that retractions be noted. PubMed citations for the years 1995–2004 to the search terms “retraction”, “research misconduct”, “research fraud” and “plagiarism” reported in the news sections of Science and Nature and matched to cases in our data set were retrieved as proxy measures of publicity to the scientific community.
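The impact factor definition above reduces to a single ratio. The following is a minimal sketch of that arithmetic only; the function name and the example counts are hypothetical and are not taken from Journal Citation Reports.

```python
def impact_factor(citations_this_year: int, items_previous_two_years: int) -> float:
    """Citations received this year to papers from the previous 2 years,
    divided by the number of papers published in those same 2 years."""
    if items_previous_two_years <= 0:
        raise ValueError("the journal must have published at least one paper")
    return citations_this_year / items_previous_two_years


# Hypothetical journal: 300 citations this year to 120 papers
# published over the previous 2 years.
example = impact_factor(300, 120)
print(example)
```

A journal with no papers in the 2-year window has no defined impact factor, which is why the sketch raises an error rather than returning zero.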
Citation activity was obtained from Science Citation Index (SCI) and Social Science Citation Index (SSCI), through the Web of Science (http://portal.isiknowledge.com/portal.cgi?Init=Yes&SID=E2ni9cFg43Mik3akg6g). In addition to citation activity before and after each retraction, a subset of all cases in the biomedical sciences in which the full article was retracted for reasons of confirmed data fabrication/falsification, research error or inability to reproduce was examined. Those articles cited at least 10 times 4 years or more after retraction (that is, with high late post-retraction citation rates) were analysed. In these cases, each citation was checked for whether it acknowledged the retraction. Descriptive statistics, analysis of frequency, regression and correlation analysis were used to analyse the data.
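The selection rule for articles with high late post-retraction citation rates, and the check for acknowledgement of the retraction, can be sketched as follows. This is an illustration under stated assumptions, not the procedure actually used: the `Citation` record, its field names and the use of months as the time unit are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Citation:
    # Hypothetical record for one citing paper.
    months_after_retraction: int   # negative if the citation predates the retraction
    acknowledges_retraction: bool


def late_citations(citations: List[Citation], min_months: int = 48) -> List[Citation]:
    """Citations appearing 4 years (48 months) or more after retraction."""
    return [c for c in citations if c.months_after_retraction >= min_months]


def has_high_late_citation_rate(citations: List[Citation], min_count: int = 10) -> bool:
    """True if the article was cited at least `min_count` times late after retraction."""
    return len(late_citations(citations)) >= min_count


def acknowledgement_rate(citations: List[Citation]) -> Optional[float]:
    """Share of late citations that acknowledge the retraction, or None if there are none."""
    late = late_citations(citations)
    if not late:
        return None
    return sum(c.acknowledges_retraction for c in late) / len(late)


# Hypothetical article: 3 early citations, then 10 late ones, 1 of which notes the retraction.
sample = [Citation(6, False)] * 3 + [Citation(60, False)] * 9 + [Citation(72, True)]
print(has_high_late_citation_rate(sample), acknowledgement_rate(sample))
```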
The number of records in PubMed for the study period was 5 041 587; of these, 328 (0.0065%) were retracted. We analysed the 315 cases (96%) that were in English. As shown in table 1, the largest proportion of retractions were by all authors, followed by some authors and by editors/publishers. The primary reasons for retraction were research error, inability to reproduce, research misconduct and plagiarism. Usually (in 90% of cases) the whole article was retracted. In the span of 10 years, the 315 retracted articles cumulatively were cited 3942 times before retraction and 4501 times after retraction.
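The percentages reported above follow directly from the counts given in the text. The short sketch below reproduces that arithmetic; the helper name `percentage` is an assumption introduced for illustration.

```python
def percentage(part: int, total: int) -> float:
    """Return `part` as a percentage of `total`."""
    return 100 * part / total


TOTAL_RECORDS = 5_041_587   # PubMed records for the study period
RETRACTED = 328             # retracted records
ANALYSED = 315              # retracted records in English

retraction_rate = percentage(RETRACTED, TOTAL_RECORDS)
english_share = percentage(ANALYSED, RETRACTED)

print(f"retraction rate: {retraction_rate:.4f}%")
print(f"share analysed:  {english_share:.0f}%")
```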
Table 2 presents months between publication and retraction, citations before and after retraction and journal impact factor. Thirty-two cases of retraction (10%) resulted in publicity in the news sections of Science or Nature, most frequently reporting on cases of research misconduct.
Compared with an earlier study of retractions for the years 1966–1994,1 the present study found (1) a significantly higher rate of retraction (0.0021% versus 0.0065%, p<0.0001 by test of proportions), (2) a decrease in the mean time from publication to retraction (from 28 months to 21 months), and (3) a decrease in the proportion of retractions initiated by authors (from 81% to 67%) (see table 1), suggesting that others have become involved in oversight.
Continued citation of invalid, retracted work, already documented in 1990,3 and in 1998,1 remains a problem, even though retractions are now clearly identified in the Medline and PubMed databases.6 Our findings show that the studies highly cited before retraction remained highly cited post-retraction (r = 0.60, p<0.0001), with those in higher-impact journals more highly cited after retraction (r = 0.61, p<0.0001). Differences in the average number of citations before (F = 5.39, p<0.0001) and after retraction (F = 3.02, p = 0.0113) among the six reasons for retraction (table 3) were statistically significant, with significant differences between research misconduct and plagiarism (t = 2.90, p = 0.0040) and between inability to reproduce and plagiarism (t = 2.50, p = 0.0131) after retraction. The average number of citations of manuscripts whose retractions had publicity (29.38) was significantly larger than of those with no publicity (14.27) (t = 2.98, p = 0.0031), reflecting the high impact factor of the journals (Science, Nature) in which publication was tracked.
There were 10 papers with high late post-retraction citation rates and these were cited a total of 225 times 4 or more years after retraction (range 10–96). In nine of these cases, the rate of acknowledgement of the retraction in these late citations was less than 3%. The paper with 96 late post-retraction citations dealt with a supposedly settled controversy with significant environmental implications7 and was immediately questioned by other scientists; the retraction of this paper was acknowledged in 29% of the post-retraction citations.
DISCUSSION AND CONCLUSIONS
An early study by Parrish noted that retraction notices often obfuscated the reasons for retraction, particularly in cases of research misconduct.8 Things have not changed. We matched retractions required by findings of confirmed data fabrication and/or falsification by the US Office of Research Integrity during the time frame of this study (1995–2004). Twenty-six such retractions were found (26/315), involving 17 authors. Twelve retraction notices straightforwardly cited data fabrication or falsification as the reason for retraction. The rest (14) gave as reasons “unable to reproduce”, “data are invalid”, “for complicated reasons”, “not supportable by reproducible evidence” or “appears to have been falsified” (without identifying which author was at fault), with the remainder offering no reason for the retraction. Since other agencies also sanction authors for research misconduct, a more thorough study of this issue would extend beyond the Office of Research Integrity.
In summary, rates of retraction, although very low, have increased between the periods 1966–1997 and 1995–2004, and retractions are occurring more quickly and are more frequently initiated by parties other than authors. The post-retraction citation rate remains high, especially associated with retraction for reasons of research misconduct and inability to reproduce. Some9 suggest that journals should require authors to attest that they have checked their manuscripts’ reference list against the National Library of Medicine (NLM) master list of retracted articles, perhaps by way of a web-based program.
These findings raise implications for the scientific community, institutions, professional associations and journals. International standards are clear that the text of the retraction should explain why the article is being retracted;10 these standards are in some instances not followed. A commitment of journals to publish retraction notices for a period of time (instead of just once or not at all) would be likely to better inform the scientific community that the findings from those papers should not be relied upon. Professional associations should strongly support these efforts through their codes of conduct and through policies of the journals they sponsor. While some error is inevitable, the public’s trust that error is being minimised is important to justify their continued investment in science.
Further study of the impact of continued use of retracted data on end users such as patients (who may be harmed) and on the use of research resources (which could be wasted) as well as on downstream research that depends on upstream research will provide a perspective on costs of the current system relied upon for correction of the scientific record. Asking those who cite retracted articles why they continue to cite and testing new mechanisms for effective notification of the scientific community will provide options for improving on current practices. Better understanding of the myriad kinds of research error and inability to reproduce will be likely to help further define best scientific practices.
Competing interests: None declared.