AI Writing Code: Claude's 80% Self-Improvement Leap

Here’s a number that stopped me cold: more than 80% of the code merged into Anthropic’s own codebase is now written by Claude, up from low single digits just a year ago.

That stat comes straight from a new paper Anthropic published on recursive self-improvement, which is basically AI helping build the next version of AI. I found it through Matthew Berman, who walked through the whole thing on a livestream. Anthropic is essentially showing the world how AI is taking over more and more of its own development.

The big idea: humans are getting pushed further from the work

The author maps the evolution in stages:

🔹 2021-2023: engineers write code directly
🔹 2025-2026: a person prompts an agent, the agent writes code
🔹 Soon: agents spawn sub-agents, humans barely touch it

The final stage closes the loop. Claude improving Claude, with only one bottleneck left: compute.

The numbers behind the trend

The breakdown lays out a wild progression in the length of task AI can handle on its own:

🔹 March 2024: tasks that take a human about 4 minutes
🔹 A year later: tasks taking around 90 minutes
🔹 Recently: 12-hour tasks

The time it takes for task length to double has shrunk from seven months to four. In other words, as Berman points out, the acceleration is itself accelerating. On research reproduction benchmarks, AI went from succeeding 20% of the time to nearly 100% in about 15 months.
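
For anyone who wants to play with the doubling math, here is a rough back-of-envelope sketch in Python. The function names are mine, not from the paper or the livestream, and the inputs are the rounded figures quoted above. The 12-month projection horizon is an assumption for illustration only, since the breakdown gives no exact date for "recently".

```python
import math


def doubling_time_months(start_minutes, end_minutes, elapsed_months):
    """Months per doubling implied by two task-length measurements."""
    doublings = math.log2(end_minutes / start_minutes)
    return elapsed_months / doublings


def project_task_minutes(start_minutes, months_ahead, doubling_months):
    """Task length after months_ahead, assuming a constant doubling time."""
    return start_minutes * 2 ** (months_ahead / doubling_months)


# Rounded figures from the article: 4 minutes, then about 90 minutes a year later.
implied = doubling_time_months(4, 90, 12)
print(f"Implied doubling time: {implied:.1f} months")

# Assumed 12-month horizon, starting from a 90-minute task.
fast = project_task_minutes(90, 12, 4)  # four-month doubling
slow = project_task_minutes(90, 12, 7)  # seven-month doubling
print(f"Four-month doubling: {fast / 60:.1f} hours")
print(f"Seven-month doubling: {slow / 60:.1f} hours")
```

Because the inputs are rounded, the implied doubling time from two data points will not necessarily match the trend figure quoted in the talk. The useful part is seeing how sensitive the projection is to the doubling time you pick.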

The catch nobody’s saying out loud

Here’s the part I found most honest. Anthropic admits engineers produce 8x more lines of code but only report about 4x more actual output. So the AI-written code is roughly half as valuable per line. As the post’s author puts it, you can have a 1000-horsepower engine, but it means nothing if the tires can’t grip the road.
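
The "half as valuable per line" figure is just a ratio of the two multipliers. A tiny sketch, assuming the 8x and 4x numbers are taken at face value (the 4x is self-reported, and the function name is hypothetical):

```python
def relative_value_per_line(lines_multiplier, output_multiplier):
    """Output gained per line written, relative to the pre-AI baseline of 1.0."""
    return output_multiplier / lines_multiplier


# 8x the lines of code, about 4x the reported output.
ratio = relative_value_per_line(8, 4)
print(f"Value per line vs. baseline: {ratio:.2f}")
```

A ratio below 1.0 means each line carries less value than before, even though total output still went up.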

The other missing ingredient is taste. AI can take an underspecified problem and run with it, but deciding what to build next and spotting truly novel research directions is still human work. Berman frames it well: execution used to be the hard part, now ideas are.

Three futures to chew on

The paper sketches three outcomes:

🔹 The trend stalls, but today’s tools spread everywhere
🔹 Huge productivity gains, humans stay in the loop setting direction
🔹 Full recursive self-improvement, AI building its own successors

What struck me is the jobs angle. More code means more need for marketers, salespeople, support, and reviewers. Productivity going up can mean more human roles, not fewer.

A couple of spicy takes from the breakdown

🔹 Anthropic reportedly cut competitor xAI’s access to its models, then quietly kept a frontier model (“Mythos”) internal instead of releasing it. Berman reads this as compounding their lead while wrapping it in safety language.
🔹 The “we should all slow down” message lands differently when you’re the one in first place. Easy to call for a pause when you’re winning the race.

My favorite line from the whole thing, borrowed from Karpathy: “You can outsource your thinking, but you cannot outsource your understanding.” As we hand more to the machines, holding onto that understanding is the whole game.

Want the full walkthrough, the graphs, and the three-futures debate? Watch the complete video. It’s worth your time.
