Yesterday a build dropped on GitHub that promises to end the AI slop era once and for all.
The project is called make-no-mistakes. And according to u/1glasspaani, the Redditor who shared it to r/PromptEngineering, it’s “arguably the most comprehensive piece of code the industry has seen since gstack.”
That’s a big claim. The benchmark numbers back it up. Sort of.
What Is Make-No-Mistakes
The team at thesysdev built this as an agent skill. The premise is clean: your AI keeps generating vibecoded slop because no one told it not to. This project tells it not to.
The skill is designed to ensure models produce “accurate text” instead of hallucinated garbage. The mechanism? Add it to your agent workflow. The model now knows not to make mistakes. The instructions are clear. The logic is airtight. The feature description is thorough.
That’s the whole thing. That’s the insight.
Independent testing confirmed a 0.067% performance improvement. Testing was conducted on the 18th shot at temperature 0.0. P-values are available upon request. They are, as the README notes, large.
The Twist
This is satire. Really good satire.
Make-no-mistakes is a precise parody of how AI tools get announced in 2026. The README has the full toolkit: vague circular definitions, benchmark numbers that technically exist but carry zero information, superlatives stacked on superlatives, and a “definitive solution” to a problem everyone agrees is real. The writing style even mimics the confident, breathless cadence you see in real product launches. Every paragraph sounds like it was generated by a model trained exclusively on YC application essays and Product Hunt descriptions.
One commenter in the thread said it perfectly: “I honestly can’t tell if this is a shitpost cause the MD is super detailed.”
That confusion is the joke. The README is detailed in exactly the way that says nothing at all. It’s the AI slop era, written as a README.
The project opens by criticizing “the enshittification of applications due to vibecoded AI slop” and then offers a fix for it that is itself a form of slop. That’s a well-constructed bit. The recursion is intentional and the execution is precise.
Why This Hit a Nerve
The post split the comment section perfectly.
Some people immediately got it and piled on the joke. Others were genuinely enthusiastic, treating it as a real tool without catching the satire. A few called out the 0.067% number as a “rounding error” and asked why anyone would bother.
That split tells you something. We’ve been so conditioned to accept meaningless benchmarks and circular feature descriptions that a parody version just looks normal. The satire lands because it doesn’t need to exaggerate anything. It just copies the genre conventions exactly. No distortion required. The genre is already doing all the work.
There’s also a second layer here. The people who engaged earnestly weren’t dumb. They were responding the way years of AI marketing trained them to respond. See a detailed README, see benchmark numbers with decimal places, assume legitimacy. That’s a Pavlovian response the industry spent two years building.
That’s the actual critique inside the bit. And it’s a fair one.
The Workflow (For the Full 0.067%)
If you want to deploy make-no-mistakes yourself:
- 🔍 Find the repo by searching “make-no-mistakes thesysdev” or through the original Reddit thread
- Read the README carefully. Pay attention to the benchmark methodology. Appreciate the p-value disclosure.
- 🤖 Add the skill to your agent workflow following the instructions
- 📊 Run your own benchmarks. Compare before and after. Report your numbers with confidence.
- Ship. You have now solved the slop problem. Mention it in your next launch post.
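If you do run the before-and-after comparison from the benchmark step, here is a minimal sketch of how to check whether a bump that small is distinguishable from noise. It uses a pooled two-proportion z-test. The task counts are hypothetical and not taken from the repo, and the sketch assumes the 0.067% means absolute percentage points on a pass rate, which the README does not specify.

```python
import math

def two_proportion_p_value(pass_before, pass_after, n_tasks):
    """Two-sided p-value for a difference in pass rates (pooled z-test)."""
    rate_before = pass_before / n_tasks
    rate_after = pass_after / n_tasks
    # Pool both runs to estimate the shared pass rate under "no real change"
    pooled = (pass_before + pass_after) / (2 * n_tasks)
    std_err = math.sqrt(pooled * (1 - pooled) * (2 / n_tasks))
    z = (rate_after - rate_before) / std_err
    # erfc gives the two-sided tail probability of a standard normal
    return math.erfc(abs(z) / math.sqrt(2))

# Hypothetical counts: 3,000 tasks, 2 extra passes = +0.067 percentage points
p = two_proportion_p_value(pass_before=2400, pass_after=2402, n_tasks=3000)
print(f"p-value: {p:.2f}")
```

With these made-up counts the p-value lands far above the usual 0.05 threshold, which is the "large" the README is joking about.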
Pro Tips
The real use of this project is as a diagnostic tool.
Take your current product README or landing page and read it next to the make-no-mistakes README. Does it sound similar? Vague promises, circular feature descriptions, metrics that feel meaningful but say nothing specific?
If yes, that’s useful data. Fix it before it hits Product Hunt. A good test: can someone read your copy and explain back to you, in plain words, what your product actually does and why it beats alternatives? If the answer involves words like “comprehensive,” “state-of-the-art,” or “next-generation” without a concrete mechanism behind them, start over.
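As a rough first pass at that test, a few lines of Python can flag the three words named above in a draft. This is only a sketch: the function name and sample text are made up for illustration, and a match tells you where to look, not whether a concrete mechanism sits behind the word.

```python
import re

# The three words named above; extend the list for your own copy.
BUZZWORDS = ["comprehensive", "state-of-the-art", "next-generation"]

def find_buzzwords(copy_text):
    """Return each buzzword found in the text with its occurrence count."""
    lowered = copy_text.lower()
    counts = {}
    for word in BUZZWORDS:
        # Whole-word match so longer words that contain a buzzword are ignored
        hits = len(re.findall(r"\b" + re.escape(word) + r"\b", lowered))
        if hits:
            counts[word] = hits
    return counts

# Hypothetical landing-page copy
sample = "A comprehensive, next-generation platform. Truly comprehensive."
print(find_buzzwords(sample))
```

Anything it flags still needs the human half of the test: read the surrounding sentence and ask what the product actually does.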
Second tip: the comment thread in the original post is worth five minutes of your time. The spread of reactions, from genuine enthusiasm to “this is a rounding error,” tells you more about how AI tools get evaluated right now than any proper benchmark study would.
The community’s instinct to engage with this earnestly is the whole point of the joke.
Check It Out
The original discussion is in r/PromptEngineering, posted by u/1glasspaani. The repo is open-source, free, and methodologically transparent.
0.067% is more than zero. Sometimes that’s all you need to ship. 🚢
Frequently Asked Questions
Q: Is a 0.067% performance improvement actually worth it?
Good catch. Users in the comments asked the same thing. The value depends on your use case and scale; at enterprise scale, even 0.067% can compound. But you’re right to be skeptical without statistical validation. Look for error bars, p-values, and reproducibility details in the repository to confirm this is a real improvement, not statistical noise.
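To put “statistical validation” in concrete terms, one option is to estimate how many benchmark tasks it would take to detect a lift this small at all. The sketch below uses the standard normal-approximation sample-size formula for comparing two proportions. The 80% baseline pass rate is an assumption for illustration, as is reading 0.067% as absolute percentage points.

```python
from statistics import NormalDist

def tasks_needed(base_rate, lift, alpha=0.05, power=0.80):
    """Approximate tasks per arm to detect `lift` in a pass rate (two-sided z-test)."""
    new_rate = base_rate + lift
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for significance
    z_power = NormalDist().inv_cdf(power)          # critical value for desired power
    variance = base_rate * (1 - base_rate) + new_rate * (1 - new_rate)
    return (z_alpha + z_power) ** 2 * variance / lift ** 2

# Hypothetical baseline of 80%; lift of 0.067 percentage points
n = tasks_needed(base_rate=0.80, lift=0.00067)
print(f"about {n:,.0f} tasks per arm")
```

Under these assumptions the answer runs into the millions of tasks per arm, so a benchmark of ordinary size cannot tell this improvement apart from zero.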
Q: Is this tool legitimate or is it satire?
The detailed documentation and comprehensive approach suggest genuine work, but the bold framing (“definitive solution,” dramatic critique of “AI slop”) does read tongue-in-cheek. Either way, the best approach is to evaluate the code and methodology yourself: does it actually solve your problem? That’s what matters.
Original post: “Adding this skill gave our AI 0.067% performance boost. Announcing make-no-mistakes,” by u/1glasspaani in r/PromptEngineering