ArXiv just drew a hard line on sloppy AI use in scientific papers. According to The Verge AI, the preprint server announced it will hand out one-year bans to researchers who submit papers containing clear evidence that the authors never checked what their LLM produced. After the ban ends, those authors face an extra hurdle: any future submission must first be accepted by a reputable peer-reviewed venue before arXiv will host it.
The Verge AI reports that arXiv’s updated Code of Conduct keeps the principle simple. Sign your name on a paper, own everything in it, no matter how the words got there. Hallucinated citations, biased passages, plagiarized text, broken references, factual errors: all of it lands on the authors, not the model.
What counts as “incontrovertible evidence”
ArXiv spelled out the kind of giveaways that will trigger penalties:
- Fabricated or hallucinated references that don’t exist in the real literature.
- Leftover meta-comments from the chatbot, like “here is a 200 word summary; would you like me to make any changes?”
- Instructional residue such as “the data in this table is illustrative, fill it in with the real numbers from your experiments.”
If reviewers spot artifacts like those, arXiv’s position is blunt: nothing else in the paper can be trusted either.
Why this matters
ArXiv has become the default front door for physics, computer science, and AI research itself. Most of the breakthrough papers practitioners read every week land there before they ever touch a journal. That makes the platform’s standards a de facto floor for the field.
The slop problem isn’t hypothetical. Over the past two years, reviewers across multiple disciplines have flagged papers stuffed with phantom citations, telltale chatbot phrases like “as an AI language model,” and entire passages that read like first-draft Claude or ChatGPT output. ArXiv had no formal teeth before. Now it does.
What stands out here is that arXiv isn’t banning AI assistance. It’s banning negligence. Use a model to draft, edit, or polish: that’s fine. Ship the output without reading it: you’re out for a year, and your path back runs through traditional peer review.
What researchers should do now
- Strip every meta-comment and prompt artifact before submission. Search the document for phrases like “as a language model,” “here is,” “would you like.”
- Verify every reference. Hallucinated citations are among the most common giveaways and one of the easiest for reviewers to catch.
- Spot-check numbers, equations, and tables against your actual experiment data, not the placeholder values a model invents.
- Treat LLM output as a draft, not a deliverable.
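The first item on that checklist is easy to automate. Below is a minimal sketch in Python, assuming the manuscript is available as plain text (for example, the .tex source). The phrase list and the function name `find_artifacts` are hypothetical, built from the examples quoted in this article; they are not an official arXiv checklist, and a clean scan does not mean the paper is clean.

```python
import re

# Hypothetical phrase list, taken from the examples quoted in this article.
# It is not an official arXiv checklist; extend it for your own workflow.
ARTIFACT_PATTERNS = [
    r"\bas an? (?:ai )?language model",   # "as a language model", "as an AI language model"
    r"\bhere is (?:a|an|the)\b",          # "here is a 200 word summary"
    r"\bwould you like\b",                # "would you like me to make any changes?"
    r"\bfill (?:it|this) in with\b",      # "fill it in with the real numbers"
]

def find_artifacts(text):
    """Return a list of (line_number, matched_phrase, line) for suspicious lines."""
    compiled = [re.compile(p, re.IGNORECASE) for p in ARTIFACT_PATTERNS]
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for pattern in compiled:
            match = pattern.search(line)
            if match:
                hits.append((lineno, match.group(0), line.strip()))
    return hits

# Example: scan a short draft and print anything that needs a human look.
draft = (
    "We propose a new estimator.\n"
    "Here is a 200 word summary; would you like me to make any changes?\n"
    "Results are reported in Table 2.\n"
)
for lineno, phrase, line in find_artifacts(draft):
    print(f"line {lineno}: found '{phrase}' in: {line}")
```

A scan like this only surfaces candidates. Phrases such as “here is the” also appear in legitimate prose, so every hit still needs a human read, and the script does nothing to verify references or numbers.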
The bigger picture
This is the first major scientific platform to attach a concrete, multi-step penalty to sloppy AI use, and it likely won’t be the last. Conference organizers and journals have been wrestling with the same problem and watching arXiv’s move closely. Expect similar policies from venues like NeurIPS, ICML, and the big publishers within the next year.
The message to researchers is clear. Generative AI is a tool, not a co-author. The signature on the paper is still yours, and so is the responsibility for everything under it.
Full details at the original report from The Verge AI.