AI Coding Productivity: Myths vs. Reality in the Research

The story everyone tells about AI coding is that it makes developers faster and better. The faster part holds up, at least for generating code. The better part is where things get shaky, and according to TechCrunch AI, a growing pile of research suggests the productivity gains coders feel may not be real at all.

Here’s the detail that should stop you cold. In February 2026, the respected AI research lab METR tried to repeat a study it had run in 2025 measuring how long open-source developers took on tasks by hand versus with AI. The lab couldn’t run it. Developers refused to participate because they wouldn’t work without AI, even for a short study. That’s not a small thing, because the original 2025 study found AI actually slowed developers down. It generated code faster, sure, but then they burned the saved time hunting bugs, steering the model, and waiting on it.

What stands out here

Developers think AI made them twice as valuable. METR’s May survey, where technical staff self-reported their gains, says exactly that. But self-perception and measured output are two different animals, and the gap between them is becoming the real story of 2026.

Look at what’s happening inside big companies:

Amazon shut down Kirorank, its internal token-tracking leaderboard, after employees gamed it by running AI agents excessively and racking up costs, the Financial Times reported.
Uber blew through its entire 2026 AI budget in four months. COO Andrew Macdonald said on a podcast the spending didn’t produce a measurable bump in projects or productivity.

This is the unraveling of “tokenmaxxing,” the 2026 trend of treating token usage as a proxy for productivity. Turns out more tokens just means more spend, not more output.

The maintenance bill nobody priced in

The sharpest argument comes from programmer and author James Shore, whose blog post went viral on Hacker News. “You write code twice as quick now? Better hope you’ve halved your maintenance costs,” he wrote. “Otherwise, you’re screwed. You’re trading a temporary speed boost for permanent indenture.”
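Shore's arithmetic can be sketched as a toy model. Everything in the block below is an assumption made for illustration: the function name, the fixed monthly capacity, the per-unit upkeep cost, and the 50% threshold are hypothetical numbers, not figures from Shore's post or from any study cited here. The model only assumes that every unit of shipped code costs a fixed slice of capacity each month afterward.

```python
def months_until_maintenance_dominates(speed, upkeep_per_unit,
                                       capacity=100.0, threshold=0.5,
                                       max_months=600):
    """Return the first month in which maintenance consumes at least
    `threshold` of team capacity, or None if that never happens.

    All parameters are hypothetical illustration values.
    """
    codebase = 0.0
    for month in range(1, max_months + 1):
        # Upkeep owed on everything shipped so far, capped at capacity.
        maintenance = min(capacity, codebase * upkeep_per_unit)
        if maintenance >= threshold * capacity:
            return month
        # Whatever capacity is left goes to new code, scaled by speed.
        codebase += speed * (capacity - maintenance)
    return None


baseline = months_until_maintenance_dominates(speed=1.0, upkeep_per_unit=0.01)
faster_same_upkeep = months_until_maintenance_dominates(speed=2.0, upkeep_per_unit=0.01)
faster_half_upkeep = months_until_maintenance_dominates(speed=2.0, upkeep_per_unit=0.005)

print(baseline, faster_same_upkeep, faster_half_upkeep)
```

In this sketch, doubling speed while upkeep per unit stays the same brings the maintenance wall forward, and halving upkeep alongside the speedup puts the team back on the baseline schedule. That is the shape of Shore's warning, not a measurement of any real team.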

The numbers back the worry, even if some sources have skin in the game:

Aiswarya Sankar, CEO of reliability startup Entelligence AI, says companies spend 44% of their tokens fixing bugs their own AI generated.
CodeRabbit, a code-review tool company, analyzed open-source pull requests and found AI-written code produced 1.7x more problems than human-written code.

Yes, those are self-serving stats from firms selling AI review tools. But independent work points the same way. Researchers at Singapore Management University warned in April that AI-generated code “can introduce long-term maintenance costs into real software projects.” When the people selling the cure and the neutral academics agree on the diagnosis, it’s worth listening.

The vendor fix versus the human fix

Ask the AI coding agent companies how to solve this and the answer is convenient: use more AI agents to fix the code the AI wrote. Cognition CEO Scott Wu, who builds the agent Devin, suggests exactly that. But even he rates Devin somewhere between a junior and mid-level programmer depending on the task. That’s not hand-it-off-and-forget-it. That’s a junior dev you still have to check.

The Singapore Management University researchers offer a more grounded path, and Wu agrees with the core of it. The takeaways for anyone shipping code with AI:

Know AI’s limits as well as you know your languages. Learn precisely what it does and doesn’t do well. That knowledge is now a core skill, not a nice-to-have.
Build QA systems designed for AI output. Generic testing won’t catch the failure patterns AI introduces.
Review AI work like a junior dev’s. Carefully, every time. Speed at generation means nothing if review collapses.
Keep humans on the big calls. Architecture and security design stay human. That’s where the real leverage is.

Why this matters now: the industry is mid-pivot. The token-counting hype is fading, budgets are getting scrutinized, and the maintenance costs are starting to land on real teams. The developers gripping their AI tools the tightest may be the ones least prepared for the bill. Smart teams will treat AI as a fast but sloppy junior, not a senior they can trust unsupervised.

More detail and the full research roundup are available at the original TechCrunch AI report.
