Peer Review Cracks Under AI Paper Flood

AI-generated research papers have crossed a dangerous threshold. They’re now good enough to pass as legitimate science, and that’s breaking the academic publishing system from the inside, according to The Verge AI.

The Verge AI reports on a case at the University of Zurich, where postdoctoral researcher Peter Degen traced a sudden spike in citations of his supervisor’s 2017 statistics paper. The trail led to a Guangzhou-based company selling tutorials on Bilibili that promise publishable research in under two hours using AI writing tools. The output: hundreds of formulaic papers churning out predictions from the public Global Burden of Disease dataset, each one slightly different, each one technically novel.

What stands out here is the inversion at the heart of the problem. Better AI means worse science. When models hallucinated obvious nonsense (the infamous rat anatomy diagrams, the leftover “as an AI assistant” phrases), reviewers could catch the junk. Now the slop reads cleanly. It cites real papers. It analyzes real datasets. It just doesn’t mean anything.

The new paper mill playbook

Matt Spick, an associate editor at Scientific Reports, told The Verge AI he spotted the pattern after receiving three near-identical submissions analyzing the US NHANES dataset. The recipe is simple:

  • Grab a large public health dataset
  • Run every possible pairwise correlation
  • Find a pair nobody’s published on yet
  • Write it up as a discovery

That’s how you get “papers” linking years of education to postoperative hernia complications. Statistical noise, dressed as insight. Spick’s question lands hard: “What am I supposed to do with that? Leave school early so that I won’t get a postoperative hernia complication later?”
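
To see why this recipe manufactures findings, here is a minimal Python sketch. It is an illustration under stated assumptions, not the method any paper mill is known to use: the data is pure random noise rather than NHANES, and the function names (pearson, mine_noise), the table size (200 rows, 40 columns), and the Fisher z-approximation used as the significance test are all choices made for this example.

```python
import itertools
import math
import random
from statistics import NormalDist


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def mine_noise(n_rows=200, n_cols=40, alpha=0.05, bonferroni=False, seed=0):
    """Correlate every pair of columns in a table of pure noise.

    Returns (number_of_pairs_tested, list_of_(i, j, r)_hits).
    The columns are independent by construction, so every hit is spurious.
    """
    rng = random.Random(seed)
    # Hypothetical stand-in for a public dataset: independent Gaussian columns.
    columns = [[rng.gauss(0.0, 1.0) for _ in range(n_rows)] for _ in range(n_cols)]
    pairs = list(itertools.combinations(range(n_cols), 2))

    # Optional Bonferroni correction: divide alpha by the number of tests.
    if bonferroni:
        alpha = alpha / len(pairs)
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # two-sided threshold

    hits = []
    for i, j in pairs:
        r = pearson(columns[i], columns[j])
        # Fisher z-transform: approximately standard normal when the true
        # correlation is zero (an approximation, assumed adequate here).
        z = math.atanh(r) * math.sqrt(n_rows - 3)
        if abs(z) > z_crit:
            hits.append((i, j, r))
    return len(pairs), hits


if __name__ == "__main__":
    total, naive = mine_noise()
    _, corrected = mine_noise(bonferroni=True)
    print(f"pairs tested: {total}")
    print(f"'significant' at alpha=0.05, uncorrected: {len(naive)}")
    print(f"'significant' after Bonferroni correction: {len(corrected)}")
```

With 40 columns there are 780 pairs, so at a 5% threshold dozens of “significant” correlations are expected from noise alone. Dividing the threshold by the number of tests removes most of them, which is the kind of check a reviewer can ask for when a paper reports one correlation from a large public dataset.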

Why this matters now

Paper mills aren’t new. The black-market trade in authorship slots has run for a decade, with publishers and “science sleuths” playing whack-a-mole against fraudsters. Generative AI didn’t create the incentive structure. Academic careers still depend on publication counts, and desperate researchers still buy their way in.

What AI changed is the unit economics. Producing a convincing fake paper used to require effort. Now it requires a prompt. Degen put the consequence plainly: “It’s a huge burden on the peer-review system, which is already at the limit. There’s just too many papers being published and there’s not enough peer reviewers, and if the LLMs make it so much easier to mass produce papers, then this will reach a breaking point.”

The broader AI industry keeps pitching science as the killer use case. Accelerated discovery. Cancer cures. Drug pipelines compressed from years to months. Those promises rest on an assumption that the scientific record stays trustworthy enough to build on. If the literature fills with plausible-sounding garbage, downstream AI systems trained on that literature inherit the rot. Models eat papers. Papers feed models. The feedback loop is already running.

What practitioners should do

For anyone working at the intersection of AI and research:

  • Treat single-paper findings as weaker evidence, especially correlations mined from public datasets. Demand replication or a plausible mechanism before building on a claim.
  • If you publish, expect more scrutiny. Editors are getting paranoid, and rightly so. Pre-register hypotheses. Share code. Make your work obviously not machine-generated by being obviously specific.
  • If you build AI tools for science, ship detection alongside generation. Selling “write a paper in two hours” without selling “catch papers written in two hours” makes you part of the problem.
  • Watch the publishers. Expect submission fees to rise, expect AI-disclosure requirements, expect retraction rates to spike before the system stabilizes.

What’s coming next

The sleuths will keep evolving their tools: tortured-phrase detectors, image-duplication checks, citation-network analysis, template matching. Publishers will roll out AI-assisted screening, ironically using LLMs to filter LLM output. Some journals will collapse under the volume. Others will pivot to slower, more selective models with higher fees.

The deeper fix has to target the incentive system itself: tying careers to publication counts created the demand that AI now supplies at near-zero marginal cost. Until tenure committees and grant agencies change what they measure, the slop wins.

Full reporting at The Verge AI.
