Editorial Assessment
The author uses 12 previously reported estimates, drawn from studies that focus on different research quality characteristics and construct samples from different literatures, to estimate that approximately 1 in 7 scientific papers are “fake.” All three reviewers, however, call the estimate’s accuracy into question, and the article itself notes reasons for skepticism. Even setting aside the intrinsic difficulties posed by the available evidence, the article does not use a systematic or rigorous method to compute the reported estimate. Thus, the reported estimate could be overstated or understated. The author also argues that the proportion of scientific outputs that are fake is a more relevant statistic than the oft-cited percentage of scientists who admit to faking or plagiarizing (Fanelli, 2009). The author calls for better recognition of the problem and more funding so that metaresearchers can conduct large-scale studies capable of producing more reliable overall estimates.

The reviewers noted some strengths. For example, two reviewers noted that the research question is important and that updated estimates are needed. One reviewer noted the importance of understanding the increase in the percentage of fake scientific outputs given changes in technology that make fraud easier to commit, and found the estimate of 1 in 7 urgently concerning despite its roughness.

The reviewers also point to weaknesses. Reviewer 1 worries that no published estimate tells us much about the overall proportion of fake studies. This reviewer proposes that the author take a different approach: determine which data are needed to accurately estimate the proportion, collect those data, and use them to compute a reliable estimate. This reviewer also suggests adding references to support claims made throughout the article. The second, co-authored review report notes three concerns. First, the co-reviewers emphasize that the author calls his own claims into question. Second, they argue that the author is incorrect in claiming that his article is “in opposition” to Fanelli (2009) because both articles fail to provide a reliable estimate of the amount of scientific output that is fake. Finally, the co-reviewers draw inferences from a dataset they constructed to argue that the author mischaracterizes how others have interpreted Fanelli (2009). Reviewer 3 notes that the author’s focus on articles rather than scientists deemphasizes the important human dimension of fakery. This reviewer suggests emphasizing the reputational harm caused by false positives. In sum, all three reviewers are unpersuaded by the author’s claim that approximately 1 in 7 scientific papers are fake.
Recommendations from the Editor
The value of the article lies not in its too-rough estimate but in its attempt to highlight both an important yet unanswered question and the difficulties that hinder our ability to reliably answer it. The article also provides a useful summary of the burgeoning literature and the challenges of drawing broad inferences from it. The author should consider highlighting these points rather than the rough estimate of the rate of falsification and fabrication. The author should also change the article’s title to reflect the skepticism about the estimate that runs throughout the article, so as not to confuse readers about what we can reliably take away from it.
The following are specific suggestions:
Adding references or links to Table 1 would help readers find details related to the listed items.
p. 6 (“The following (Table 1) is a selection of events which took place after the figure above was established.”): Clarify why 2005 is the first year of interest in the events table (e.g., change sentence to “The following (Table 1) is a selection of events that took place during or after 2005, the final year of publication of the studies Fanelli used to compute the 2% figure.”).
p. 7 (“Significantly, all of the above happened after the figure of 2% was collected.”): change to “… after publication of the studies on which the 2% figure is based.”
The link in footnote 5 no longer works. The report can be found at https://digitalcommons.unl.edu/cgi/viewcontent.cgi?article=1262&context=scholcom. Consider changing all links to permalinks.
p. 16: list the 29 observations of data sleuth estimates in a footnote.
Footnote 6: the link pulls up the Retraction Watch Database for Nature Publishing Group. I accessed the link on Jan 15, 2025, and the database found 1,610 items. It’s not clear how you computed 667 (12,000 / 667 = 18). If “Retracted Article” is chosen for “Article Type(s),” the count is 1.
p. 18 (“the presumably higher number of papers containing questionable research practices (which are far more commonly admitted to) is presumably higher still.”): Consider citing to published estimates, which are mostly produced using surveys, and note that this literature suffers from the same problems as the literature that estimates FF.
p. 18: add citations to articles that address each of the harms caused by false results.
Bottom of p. 19 (“In particular, it seems likely that FF rates change by individual field – in doing so, they may present specific rather than general threats to human health and scientific progress.”): Providing some explanation for field-specific rates might help the reader assess the claim. For example, it’s possible that rates are similar across fields because those willing to commit fraud or to fabricate data likely randomly distribute themselves across fields, and journal editors and referees are roughly equally likely to fail to detect falsification and fabrication. Does any evidence call these possibilities into question?
Competing interests: none
Editors
Senior Editor
Kathryn Zeiler
Handling Editor
Kathryn Zeiler
Peer Review 1
This manuscript attempts to estimate the proportion of scientific papers that are fake. The presence of fake scientific papers in the literature is a serious problem, as the author outlines. Papers of variable quality and significance will inevitably be published, but most researchers assess manuscripts and papers on the assumption that the described research actually took place. Papers that conceal the fact that they are fake can therefore be highly damaging to research efforts, by preventing accurate assessments of research quality and significance, and by encouraging future research that could consume time and other resources. As the manuscript describes, fake papers also damage science by eroding trust in the scientific method and in communities of scientists.
It is therefore clear that knowing the proportion of fake scientific papers is important, that the author is concerned about the problem, and that the author wants to arrive at an answer. However, as the manuscript partly recognises, the question of the overall proportion of fake scientific papers is currently difficult to answer.
The overall proportion of fake papers in science will reflect the individual proportions of fake papers in different scientific disciplines. In turn, the proportion of fake papers in any single discipline will reflect many factors, including (i) researcher incentives to produce fake papers, (ii) the ease with which fake papers can be produced and (iii) published, (iv) the ease or likelihood of fake papers being detected before or (v) after publication, and (vi) the consequences for authors who are found to have published fake papers. Some of these factors are likely to vary between disciplines and between research settings. For example, it has been suggested that in some fields inventing research results is as difficult as producing genuine data, whereas in other fields it is easier to invent data than to generate data through experiments that remain difficult, expensive and/or slow. It is also likely that factors such as the capacity to invent or detect fake papers, as well as the incentives and consequences for researchers, could vary over time, particularly in response to generative AI.
As someone who studies errors in scientific papers, I don’t believe that we currently have a good understanding of the proportion of fake papers in any individual scientific field, at any time. In some fields we have estimates of the frequency of particular error types, but such estimates are likely to misestimate the overall proportion of fake papers. Rather than attempting to answer the question of the overall proportion of fake scientific papers in the absence of the necessary data, it seems preferable to describe how we could obtain the data needed to answer this question. While the overall proportion of fake scientific papers is an important statistic, most scientists will be more concerned about how many fake papers exist in their own fields. We could therefore start by trying to obtain reliable estimates of fake papers in individual fields, working out how to do this, and then carrying out the necessary research. In the absence of reliable data, it’s perhaps most important that researchers are aware that fake papers could exist in their fields, so that all researchers can assess papers more carefully.
Beyond these broad considerations, the following manuscript elements could be reconsidered.
Fake science is defined as fabricated or falsified, yet this definition is sometimes expanded to include plagiarism (page 8, Table 2). However, plagiarism doesn’t equate with faking or falsifying data, and some plagiarised articles could describe sound data. Including plagiarised articles as fake articles will inevitably inflate estimates of fake papers, particularly in fields with higher rates of plagiarism.
Table 1 was stated to represent “a selection of events that took place after the figure above (ie the figure published by Fanelli (2009)) was established”, yet some listed references/events were published/occurred between 2005 and 2008.
It is reasonable to expect that increased capacity to autogenerate text and images will increase the numbers of fake papers, but I’m not aware of any evidence to support this. No reference is cited.
Table 2; “similar survey results”: it’s not clear how the listed studies are similar.
There are many unreferenced statements, e.g. page 9 (“most rejected papers are published, just elsewhere”) and page 19.
Some estimates of fake papers arise from small sample sizes (eg page 13).
The statement “The accumulation of papers assembled here is, frankly, haphazard” doesn’t inspire confidence in the resulting estimate.
“…it would be prudent to immediately reproduce the result presented here as a formal systematic review”- any systematic review seems premature without reliable estimates.
“The false positive rate (FPR) of detecting fake science is almost certainly quite low”- this seems unlikely to be correct. False positive rates depend on the methods used. Different methods will be required to detect fake papers in different disciplines, and these different methods could have very different false positive rates, particularly when comparing the application of manual versus automated methods that are applied without manual checking.
Page 2: I could not see the n=12 studies summarised in a single Table.
Page 10: “All relevant studies were included”…. “The list below is comprehensive but not necessarily exhaustive”- these statements contradict each other.
Disclosure: Jennifer Byrne receives NHMRC grant funding to study the integrity of molecular cancer research publications.
Peer Review 2
Review written in collaboration with Maha Said (Orcid) and Frederique Bordignon (Orcid)
The title of the article makes a simple, striking claim about the state of the scientific literature, with a numerical estimate of the proportion of “fake” articles. Yet, in contrast to this title, Heathers is highly critical of his own work in the text of the article.
James’ peer review of Heathers’ article
James Heathers often mentions the limitations of his research, thus “peer-reviewing” his own article, to the extent that he admits that this work is “incomplete”, “unsystematic” and “far flung”.
“This work is too incomplete to support responsible meta-analysis, and research that could more accurately define this figure does not exist yet. ~1 in 7 papers being fake represents an existential threat to the scientific enterprise.”
“While this is highly unsystematic, it produced a substantially higher figure. Correspondents reliably estimated 1-5% of all papers contain fabricated data, and 2-10% contain falsified results.”
“These values are too disparate to meta-analyze responsibly, and support only the briefest form of numerical summary: n=12 papers return n=16 individual estimates; these have a median of 13.95%, and 9 out of 16 of these estimates are between 13.4% and 16.9%. Given this, a rough approximation is that for any given corpus of papers, 1 in 7 (i.e. 14.3%) contain errors consistent with faking in at least one identifiable element.”
“The accumulation of papers collected here is, frankly, haphazard. It does not represent a mature body of literature. The papers use different methods of analyzing figures, data, or other features of scientific publications. They do not distinguish well between papers that have small problematic elements which are fake, or fake in their entirety. They analyze both small and large corpora of papers, which are in different areas of study and in journals of different scientific quality – and this greatly changes base rates;…”
“As a consequence, it would be prudent to immediately reproduce the result presented here as a formal systematic review. It is possible further figures are available after an exhaustive search, and also that pre registered analytical assumptions would modify the estimations presented.”
Heathers has also, in an interview published in Retraction Watch (Chawla 2024), acknowledged pitfalls in this article, such as:
“Heathers said he decided to conduct his study as a meta-analysis because his figures are “far flung.””
“They are a little bit from everywhere; it’s wildly nonsystematic as a piece of work,” he said.”
“Heathers acknowledged those limitations but argued that he had to conduct the analysis with the data that exist. “If we waited for the resources necessary to be able to do really big systematic treatments of a problem like this within a specific area, I think we’d be waiting far too long,” he said. “This is crucially underfunded.”
Built in opposition to Fanelli 2009, but it’s illogical
Heathers states in the abstract that his article is “in opposition” to Fanelli’s 2009 PLOS ONE article (Fanelli 2009), yet that opposition is illogical and artificially constructed, since there is no contradiction between 2% of scientists self-reporting having taken part in fabrication or falsification and a possibly much higher proportion of “fake scientific outputs”. Like most of what is wrong with Heathers’ article, this is in fact acknowledged by the author, who notes that the 2% figure “leaves us with no estimate of how much scientific output is fake” (bias in self-reporting, possibility of prolific authors, etc.).
Fanelli 2009 is not cited in the way JH says it is cited
Whilst the opposition discussed above is illogical, it could be that the 2% figure is mis-cited by others as representing an estimate of the proportion of fake scientific outputs, thus probably underestimating the extent of fraud. Heathers suggests that this may indeed be the case, but also contradicts himself about how (Fanelli 2009), or the 2% figure coming from that publication, is typically used.
In one sentence, he writes that “the figure is overwhelmingly the salient cited fact in its 1513 citations” and that “this generally appears as some variant of “about 2% of scientists admitted to have fabricated, falsified or modified data or results at least once” (Frank et al. 2023)”,
whilst, in another sentence, he writes that “the typical phraseology used to express it – e.g. “the most serious types of misconduct, fabrication and falsification (i.e., data fraud), are relatively rare” (George 2016).
Those two sentences cited by Heathers are fundamentally different: the first accurately reports that the 2% figure relates to individuals self-reporting, whilst the second appears to relate to the prevalence of misconduct in the literature itself. How Fanelli 2009 is cited in the literature is an empirical question that can be studied by looking at citation contexts beyond the two examples given by Heathers. Given that a central justification for Heathers’ piece appears to be the misuse of this 2% figure, we sought to test whether this was the case.
A first surprise was that, whilst the sentence attributed to (George 2016) can indeed be found in that publication (in the abstract), first, it is not part of a sentence citing (Fanelli 2009) or the 2% figure, and, second, it is quoted selectively, omitting a part of the sentence that nuances it considerably: “The evidence on prevalence is unreliable and fraught with definitional problems and with study design issues. Nevertheless, the evidence taken as a whole seems to suggest that cases of the most serious types of misconduct, fabrication and falsification (i.e., data fraud), are relatively rare but that other types of questionable research practices are quite common.” In fact, (Fanelli 2009) is discussed extensively by (George 2016), and some of the caveats, e.g. on self-reporting, are highlighted.
To go beyond those two examples, we constructed a comprehensive corpus of citation contexts, defined as the textual environment surrounding a paper's citation, including several words or sentences before and after the citation (see Methods section below). 737 citation contexts could be analysed. Out of those, the vast majority (533, or 72%) did not cite the 2% figure. Instead, they often referred to this article as a general reference together with other articles to make a broad point, or focused on other numbers, in particular those related to questionable research practices (Bordignon, Said, and Levy 2024). The 28% (204) of citation contexts that did mention the 2% figure did so accurately in the majority of cases: 83% (170) of those mentioned that it was self-reporting by scientists, whilst 17% (34), or 5% of the total citation contexts analysed, were either ambiguous or misleading in that they suggested or claimed that the 2% figure related to scientific outputs.
Although the analysis above does not include all citation contexts, it is possible to conclude unambiguously that the 2% figure is not overwhelmingly the salient cited fact in relation to Fanelli 2009, and that when it is cited, it is often cited accurately, i.e. as representing self-reporting by scientists. Whilst an exhaustive analysis is beyond the scope of this peer review, it is not uncommon to find in this corpus citation contexts that have an alarming tone about the seriousness of the problem of FFPs, e.g. “…a meta-analysis (Fanelli 2009) suggest that the few cases that do surface represent only the tip of a large iceberg." [DOI: 10.1177/0022034510384627]
Thus, the rationale for Heathers’ study appears to be misguided. The supposed lack of attention for the very serious problem of FFPs is not due to a minimisation of the situation fueled by a misinterpretation of Fanelli 2009. Importantly, even if that was the case, an attempt to draw attention by claiming that 1 in 7 papers are fake, a claim which according to the author himself is not grounded in solid facts, is not how the scientific literature should be used.
Methods for the construction of the corpus of citation contexts
We used Semantic Scholar, an academic database encompassing over 200 million scholarly documents from diverse sources including publishers, data providers, and web crawlers. Using the specific paper identifier for Fanelli's 2009 publication (d9db67acc223c9bd9b8c1d4969dc105409c6dfef), we queried the Semantic Scholar API to retrieve available citation contexts. Citation contexts were extracted from the "contexts" field within the JSON response pages (see technical specifications).
The query looks like this: semanticscholar.org
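For illustration only, a minimal Python sketch of such a request is shown below. It assumes the public Semantic Scholar Graph API citations endpoint with the "contexts" field and standard offset-based pagination; it is a sketch under those assumptions rather than the exact request used for this review.

```python
import requests

# Semantic Scholar identifier for Fanelli (2009), as given above
PAPER_ID = "d9db67acc223c9bd9b8c1d4969dc105409c6dfef"
URL = f"https://api.semanticscholar.org/graph/v1/paper/{PAPER_ID}/citations"

contexts = []
offset = 0
while True:
    # Each page returns up to `limit` citing papers, each with its citation contexts
    page = requests.get(URL, params={"fields": "contexts", "offset": offset, "limit": 100}).json()
    for citation in page.get("data", []):
        contexts.extend(citation.get("contexts") or [])
    if "next" not in page:  # no further pages
        break
    offset = page["next"]

print(f"{len(contexts)} citation contexts retrieved")
```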
The broad coverage of Semantic Scholar does not imply that citation contexts are always retrieved: the Semantic Scholar API provided citation contexts for only 48% of the 1452 documents citing the paper. To get more, we identified open access papers among the remaining 52% of citing papers, retrieved their PDF locations and downloaded the files. For this we used the Unpaywall API, a database that is queried with a DOI and returns open access information about the corresponding document. The query looks like this.
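Again for illustration (with a placeholder contact email rather than the address actually used), an Unpaywall lookup of this kind can be sketched as follows:

```python
import requests

def best_pdf_url(doi: str, email: str = "name@example.org") -> str | None:
    """Query Unpaywall for a DOI and return the best open-access PDF URL, if any."""
    # Unpaywall requires an email parameter; the address here is a placeholder
    record = requests.get(f"https://api.unpaywall.org/v2/{doi}", params={"email": email}).json()
    location = record.get("best_oa_location") or {}
    return location.get("url_for_pdf")

# Example with one of the citing papers quoted above
print(best_pdf_url("10.1177/0022034510384627"))
```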
We downloaded 266 PDF files and converted them to text format using an online bulk PDF-to-text converter. These files were then processed using TXM, a specialized textual analysis tool. We used its concordancer function with the term "Fanelli" as a pivot and checked that the cited reference was the correct one (the 2009 paper in PLOS ONE). After manual cleaning, we appended these citation contexts to the previous corpus.
Through this methodology, we ultimately identified 824 citation contexts, drawn from 784 documents, i.e. 54% of all documents citing Fanelli's 2009 paper. This corpus comprised the 48% of contexts retrieved from Semantic Scholar and an additional 6% obtained through semi-manual extraction from open access documents. 87 of those contexts were excluded from the analysis for a range of reasons, including: context too short to conclude, language neither English nor French (the shared languages of the authors of this review), duplicate documents (e.g. preprints), etc., leaving us with 737 contexts. These were first classified manually into two categories: those mentioning the 2% figure and those that did not. Contexts in the first category were then further classified manually into two categories, depending on whether the figure was correctly attributed to self-reporting by researchers or misleadingly suggested that the 2% applied to research outputs.
The reviewers have no competing interests to declare.
Contributions
Investigation: FB collected the citation contexts.
Data curation and formal analysis: RL and MS
Writing – review & editing: RL, MS and FB
References
Bordignon, Frederique, Maha Said, and Raphael Levy. 2024. “Citation Contexts of [How Many Scientists Fabricate and Falsify Research? A Systematic Review and Meta-Analysis of Survey Data, DOI: 10.1371/Journal.Pone.0005738].” Zenodo. https://doi.org/10.5281/zenodo.14417422.
Chawla, Dalmeet Singh. 2024. “1 in 7 Scientific Papers Is Fake, Suggests Study That Author Calls ‘Wildly Nonsystematic.’” Retraction Watch (blog). September 24, 2024. https://retractionwatch.com/2024/09/24/1-in-7-scientific-papers-is-fake-suggests-study-that-author-calls-wildly-nonsystematic/.
Fanelli, Daniele. 2009. “How Many Scientists Fabricate and Falsify Research? A Systematic Review and Meta-Analysis of Survey Data.” PLOS ONE 4 (5): e5738. https://doi.org/10.1371/journal.pone.0005738.
Frank, Fabrice, Nans Florens, Gideon Meyerowitz-Katz, Jérôme Barriere, Éric Billy, Véronique Saada, Alexander Samuel, Jacques Robert, and Lonni Besançon. 2023. “Raising Concerns on Questionable Ethics Approvals - a Case Study of 456 Trials from the Institut Hospitalo-Universitaire Méditerranée Infection.” Research Integrity and Peer Review 8 (1): 9. https://doi.org/10.1186/s41073-023-00134-4.
George, Stephen L. 2016. “Research Misconduct and Data Fraud in Clinical Trials: Prevalence and Causal Factors.” International Journal of Clinical Oncology 21 (1): 15–21. https://doi.org/10.1007/s10147-015-0887-3.
Peer Review 3
The provocative essay written by James Heathers is a genuine attempt to quantify the current prevalence of two growing research malpractices, namely fabrication and falsification (FF for short), which are universally recognized as gross misconduct. The matter is of interest not only to researchers themselves (including meta-scientists), but also to general audiences, since taxpayers have a natural right to oversee the rewards of Science for society at large. The underlying assumption of the author is that the generally accepted figure of 2% of researchers involved at least once in FF should now be considered a lower bound. This 2% rate appeared in an article authored by Daniele Fanelli in 2009 and made an impact in the scholarly community. However, a lot of water has flowed under the bridge since then, and new actors have appeared: papermills, sophisticated digital tools (intended for both data fabrication and FF tracking), whistleblowers communicating via social networks, generative artificial intelligence, etc. The update proposed by James Heathers is thus certainly welcome.
The other premise of the author is that the assessment of the proportion of faking scientists is not a suitable proxy. Instead, he prefers to address a tangential issue: the estimation of the rate of scholarly papers containing fabricated or falsified data. According to the author, such an approach has more benefits than drawbacks and could, from an idealistic point of view, be fully automated. One could agree, although the fear of building an Orwellian machinery is never far away. At the end of the process, offending papers are retracted (assuming, again, an ideal world), while the authors of the flagged papers are jailed (metaphorically or not).
A survey of more recent studies was thus carried out. Although the author acknowledges that the small sample size for his study (N = 12), as well as the large dispersion of FF estimates retrieved from this corpus, do not allow a proper meta-analysis, an alarming figure of 14.3% for the updated FF rate emerges. Moreover, this figure is consistent with independent data reported by other sleuths engaged in the fight against questionable research practices, which are mentioned in the “discussion” section of the paper. Even if estimated in a rough way, the increase of FF in less than 15 years, if confirmed by other studies, is a real threat to Science, and should be addressed urgently.
The main value of this essay is thus to raise concerns about the fast growth of FF, rather than to provide an up-to-date FF rate, which is anyway probably impossible to obtain in a reliable manner. On the other hand, an obvious weakness of the study is the chosen target: by focusing his attention on papers, James Heathers is missing the human dimension of the academic endeavour. Indeed, authors and papers are entangled bodies, and like entangled particles, they are described by a single state involving both entities: a paper does not exist without authors, and authors are invisible if they do not publish on a regular basis.
Nowadays, scientific papers are extremely complex, and almost always impenetrable to researchers outside the field concerned. However, Homo academicus (as coined by Pierre Bourdieu) is also a very complex being. This is why, although there is an unambiguous definition of FF, the false positive and negative rates of detecting FF are unknown, as recognized by James Heathers. In particular, false positive detections can be detrimental to authors. This point is mentioned en passant in the essay, but should be emphasized: it is more than just a drawback of the methodology used, since it relates to the very human dimension of the scholarly enterprise.
Perhaps a complementary perspective on the work carried out by James Heathers could be based on the following example: James Ibers (1930-2021), an old-school chemist and influential crystallographer, wrote a memoir published by the American Crystallographic Association shortly before his death.1 He describes how, as a freshman at Caltech, he attended a mandatory one-week orientation workshop. In his own words: “The most important message I took away was the Caltech Honor Code for all undergraduates. In its simplest terms: You can’t cheat in Science because you will eventually be found out. I have adhered to that Code as a husband, a father, a scientist, a teacher, a research director, and all others I have dealt with”. How many of us can claim, without hesitation, to stand alongside Ibers? What is the tolerable threshold of cheaters in Science? 2%? 14.3%? More?
James Heathers ends his article with a worrying sentence: “Priorities must change, or science will start to die”. Perhaps, however, Science is already as dead as a dodo.
1 https://chemistry.northwestern.edu/documents/people/james_ibers.aca.memoir.2020.pdf
Declaration of competing interest. The author has no conflicts of interest to disclose.