AI Coding Loops: From Manual Prompts to Autonomous Agents

Most people who code with AI do the same dance. You prompt the agent, wait, check the result, then prompt again. Over and over. This past weekend a different idea took over the conversation, and it flips that whole routine on its head.

The video comes from Matthew Berman, who breaks down a shift two of the biggest names in AI coding are pushing right now. He points to Boris Cherny from Anthropic and Peter Steinberger, who’s tied to the OpenAI world. Boris said something that stuck with me: “I don’t prompt Claude anymore. I have loops that are running. My job is to write loops.” Peter’s tweet on the same idea hit 5 million views in under 24 hours. So let’s unpack what they actually mean.

The old way vs the new way

Here’s the contrast at the heart of it.

Old way: You’re the engine. Every step needs your prompt. The agent waits on you, you wait on the agent, repeat.
New way: You design a loop. You hand the agent a goal and a trigger, then you walk away. The loop keeps the agent running until that goal is met.

Berman frames it simply. You stop being the one who prompts. You become the one who builds the thing that prompts. That’s the mental jump, and it’s bigger than it sounds.

What a loop actually is

Berman strips it down to two ingredients:

A trigger: what kicks the loop off.
A goal: what the loop is working toward. Crucially, it has to be verifiable somehow.

That verification piece is the key. The agent needs a way to know it succeeded. He points out this is basically reinforcement learning wearing a different hat. In RL, you need a verifiable reward so the model knows it hit the target.

Goals come in two flavors:

Deterministic: All tests pass. The function runs clean. No errors. Easy to check.
Non-deterministic: An LLM looks at the work and decides “yeah, that’s done.” Fuzzier, riskier, harder to pin down.
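
Here is a rough Python sketch of the two flavors of goal check. It is my illustration, not code from the video. The function names `deterministic_goal` and `judged_goal` are hypothetical, the judge is a fake stand-in for an LLM call, and the commands are placeholders so the example runs anywhere.

```python
import subprocess
import sys
from typing import Callable


def deterministic_goal(command: list[str]) -> bool:
    """Goal is met when the command exits with status 0 (e.g. a test runner)."""
    completed = subprocess.run(command, capture_output=True)
    return completed.returncode == 0


def judged_goal(work: str, judge: Callable[[str], float], threshold: float = 0.8) -> bool:
    """Goal is met when a judge scores the work at or above a threshold.

    `judge` is a hypothetical stand-in for an LLM call returning a 0-1 score.
    """
    return judge(work) >= threshold


# Placeholder commands so the sketch runs anywhere; a real loop might pass
# something like ["pytest", "-q"] instead.
passing = [sys.executable, "-c", "raise SystemExit(0)"]
failing = [sys.executable, "-c", "raise SystemExit(1)"]

print(deterministic_goal(passing))
print(deterministic_goal(failing))


def fake_judge(text: str) -> float:
    """Fake judge that only looks at length; a real one would be a model call."""
    return 1.0 if len(text) > 20 else 0.2


print(judged_goal("A short answer", fake_judge))
```

The deterministic check gives the same answer every time for the same code. The judged check depends entirely on the judge and the threshold you pick, which is where the fuzziness and the risk come from.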

The three kinds of triggers

The creator shows this inside Cursor’s “Automations” tab and in Claude Code. He boils triggers down to just three types:

An action happens: like a pull request opening.
A schedule: a cron job. Every 30 minutes, every hour, every day.
A human kicks it off: you type the goal and say go.

His own example is clean: every time a PR opens in his project, the loop reviews it, fixes issues automatically, commits back, and makes sure all tests and CI stay green. That’s the whole loop. Genuinely that simple to start.
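
The three trigger types can be modeled in a few lines of Python. This is a sketch under my own assumptions, not how Cursor or Claude Code implement triggers. The names `TriggerKind`, `Trigger`, `should_fire`, and the event string "pull_request.opened" are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from enum import Enum
from typing import Optional


class TriggerKind(Enum):
    EVENT = "event"        # something happens, e.g. a pull request opens
    SCHEDULE = "schedule"  # a timer, e.g. every 30 minutes
    MANUAL = "manual"      # a human types the goal and says go


@dataclass
class Trigger:
    kind: TriggerKind
    event_name: Optional[str] = None      # used by EVENT triggers
    interval: Optional[timedelta] = None  # used by SCHEDULE triggers


def should_fire(
    trigger: Trigger,
    now: datetime,
    last_run: Optional[datetime] = None,
    event: Optional[str] = None,
    manual_go: bool = False,
) -> bool:
    """Decide whether the loop should start, based on the trigger kind."""
    if trigger.kind is TriggerKind.EVENT:
        return event is not None and event == trigger.event_name
    if trigger.kind is TriggerKind.SCHEDULE:
        # Fire on the first run, or once the interval has elapsed.
        return last_run is None or now - last_run >= trigger.interval
    return manual_go


pr_trigger = Trigger(TriggerKind.EVENT, event_name="pull_request.opened")
cron_trigger = Trigger(TriggerKind.SCHEDULE, interval=timedelta(minutes=30))
now = datetime(2025, 1, 1, 12, 0)

print(should_fire(pr_trigger, now, event="pull_request.opened"))
print(should_fire(cron_trigger, now, last_run=now - timedelta(minutes=10)))
```

The trigger only decides when the loop starts. Whether it keeps running is the goal's job.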

In Claude Code there’s even a literal /loop command. He shows /loop every 5 minutes, "Compare what we have built with our full spec, spec.md, and continue building until we complete the full spec." Every five minutes a fresh agent wakes up, figures out what’s left, and builds it. It keeps going until the spec is done.
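
One way to picture the "figure out what's left" step, assuming (my assumption, not something from the video) that spec.md is a checklist with Markdown-style checkboxes. This says nothing about how Claude Code's /loop works internally. The function `remaining_items` and the sample spec are hypothetical.

```python
def remaining_items(spec_text: str) -> list[str]:
    """Return the text of every unchecked '- [ ]' item in the spec."""
    remaining = []
    for line in spec_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("- [ ]"):
            remaining.append(stripped[len("- [ ]"):].strip())
    return remaining


# Made-up spec contents for illustration only.
spec = """\
- [x] User can sign up
- [ ] User can reset password
- [ ] Admin can export reports
"""

todo = remaining_items(spec)
print(todo)
print("spec complete" if not todo else f"{len(todo)} items left")
```

With a spec shaped like this, "the spec is done" becomes a deterministic check: the list of remaining items is empty.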

Loop vs automation, the difference that matters

This distinction from Berman is worth holding onto. An automation just runs a series of steps. A loop has a decision inside it. The loop itself judges whether the goal is reached. It’s not blindly executing lines of code. It’s checking “am I there yet?” and deciding to continue or stop. That decision-making is what makes it a loop.
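
The contrast is easy to see side by side. In this minimal Python sketch (mine, with hypothetical names `run_automation`, `run_goal_loop`, and `fix_one`), the "agent" is a toy function that fixes one error per call.

```python
from typing import Callable


def run_automation(steps: list[Callable[[dict], None]], state: dict) -> dict:
    """Automation: run every step once, in order, with no judgment."""
    for step in steps:
        step(state)
    return state


def run_goal_loop(
    step: Callable[[dict], None],
    goal_met: Callable[[dict], bool],
    state: dict,
    max_iterations: int = 20,
) -> dict:
    """Loop: before each step, ask "am I there yet?" and stop if so."""
    for _ in range(max_iterations):
        if goal_met(state):
            break
        step(state)
    return state


def fix_one(state: dict) -> None:
    """Toy agent step: fixes a single error."""
    state["errors"] = max(0, state["errors"] - 1)


automation_result = run_automation([fix_one, fix_one], {"errors": 5})
loop_result = run_goal_loop(fix_one, lambda s: s["errors"] == 0, {"errors": 5})

print(automation_result)
print(loop_result)
```

The automation runs its two steps and stops with errors still on the table. The loop keeps going until the check says it is done, with `max_iterations` as a safety net in case the goal is never reached.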

The honest catches

I appreciated that he didn’t oversell this. The criticism is real, and he lays it out:

Hard to set up well. Basic loops are easy. But a full “software factory” that ships features on its own? Defining the end state of a fuzzy goal is genuinely tough. Get it wrong and the agent burns tokens forever.
Expensive. Pulling the human away from the keyboard means more tokens, and more tokens means a scary bill. He notes Peter Steinberger showed around 1.3 million dollars in monthly token usage. Anthropic and OpenAI reportedly give staff effectively unlimited tokens, which he argues is why only a small slice of engineers can experiment at this level today.
Spec writing is the bottleneck. For real features, you have to define the full spec up front. He admits that’s hard for him too, since half the fun of building is exploring as you go.

His take, which I share: what’s expensive today gets cheap tomorrow. That pattern has held across the whole history of tech. So even if you can’t run this now, knowing how it works puts you ahead.

Try this without going broke

If you want a taste without a trillion-dollar bill:

Start with a deterministic goal. “Make all tests pass” beats “build me a great feature.”
Use a PR-open trigger for auto-review and fixes. Low risk, clear finish line.
Write a real spec.md before you loop on a feature, so the agent has something concrete to check against.
Watch your token spend closely on any fuzzy goal. Set limits.
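
"Set limits" can be as simple as a budget check inside the loop. This is a sketch with made-up numbers, not a feature of any specific tool. `run_with_budget` is a hypothetical name, and it assumes each agent step can report how many tokens it used.

```python
from typing import Callable


def run_with_budget(
    agent_step: Callable[[], int],
    goal_met: Callable[[], bool],
    token_budget: int,
) -> tuple[str, int]:
    """Run agent_step until the goal is met or spend reaches the budget.

    agent_step returns the tokens it consumed (hypothetical accounting).
    The check happens before each step, so the final step can overshoot.
    Returns (reason_stopped, tokens_spent).
    """
    spent = 0
    while not goal_met():
        if spent >= token_budget:
            return "budget_exhausted", spent
        spent += agent_step()
    return "goal_met", spent


progress = {"done": 0}


def step() -> int:
    """Toy agent step: makes one unit of progress for 1,500 tokens."""
    progress["done"] += 1
    return 1_500


# A goal that resolves after three steps.
print(run_with_budget(step, lambda: progress["done"] >= 3, token_budget=10_000))

# A fuzzy goal that never resolves: the budget is what stops the loop.
print(run_with_budget(lambda: 1_500, lambda: False, token_budget=5_000))
```

Without the budget check, the second call would never return, which is the "burns tokens forever" failure in miniature.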

The creator’s big-picture point is that engineers are slowly, then suddenly, going to stop writing prompts entirely and start designing these factories instead. He even ties it to recursive self-improvement, the moment AI starts designing its own loops. We’re not there yet. Humans still set the goals. But the trajectory is clear.

Watch the full video for the live demos in Cursor and Claude Code. Seeing the loop fire on a real PR makes the whole idea click. 🚀