AI models are now building better versions of themselves

Most people missed this, but we’ve quietly crossed a massive threshold in AI development. Models aren’t just getting smarter. They’re actively helping build the next version of themselves.

This hit me hard when I watched a breakdown from Matthew Berman, who laid out the evidence from multiple frontier labs. He walked through exactly how this recursive self-improvement loop is already running at MiniMax, OpenAI, Anthropic, and Google.

Here’s what’s actually happening right now:

🔄 MiniMax 2.7 helped design itself

The Chinese AI lab MiniMax released its 2.7 model and openly stated that it “deeply participated in its own evolution.” The model updated its own memory, built dozens of complex skills for reinforcement learning, and improved its own learning process based on experiment results. According to Berman, AI now handles 30-50% of the overall research workflow. A human designs experiments together with the AI, then the AI writes the code, runs it, analyzes the results, and reports back. Then the loop repeats.

🧠 OpenAI’s GPT 5.3 Codex created itself

OpenAI’s announcement was blunt: “GPT 5.3 Codex is our first model that was instrumental in creating itself.” The Codex team used early versions of the model to debug its own training, manage deployment, and diagnose test results. In other words, early checkpoints were helping optimize later checkpoints of the same model. Sam Altman previously set a goal of having an automated AI research intern by September 2026 and a true automated AI researcher by March 2028. Based on what Berman showed, we’re way ahead of schedule.

🔬 Anthropic is doing it too (quietly)

Anthropic hasn’t used the words “self-improving” directly, but the evidence is clear. They’ve built autonomous loops where Claude Code writes features, runs tests, and iterates continuously. As Berman points out, their laser focus on coding isn’t just about revenue. When your AI is great at coding and deep research, it’s building the tools to make the next version of itself better.

⚡ You don’t need a frontier lab to do this

Andrej Karpathy open-sourced a project called “auto research” that lets anyone set up autonomous AI research loops. You point a frontier model at a problem, and it designs experiments, runs them, reviews results, and keeps iterating. Karpathy achieved the fastest GPT-2 training time on Earth after a single night of running it.
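To make the shape of such a loop concrete, here is a minimal Python sketch. It is not Karpathy's code and does not call any real model or training run: `run_experiment`, `propose`, and `research_loop` are hypothetical names, and both the "experiment" (a toy loss with its minimum at a learning rate of 0.3) and the "model" (a random nudge to the best configuration so far) are stand-ins chosen only so the example runs on its own. In a real setup, a frontier model would write the proposal and an actual training job would produce the score.

```python
import random


def run_experiment(config):
    """Toy stand-in for a training run. Returns a loss (lower is better)."""
    # Assumption for illustration: the loss is minimized at learning_rate = 0.3.
    return (config["learning_rate"] - 0.3) ** 2


def propose(best_config, rng):
    """Toy stand-in for the model designing the next experiment."""
    # Nudge the best known setting by a small random step, clamped to [0, 1].
    step = rng.uniform(-0.1, 0.1)
    learning_rate = min(1.0, max(0.0, best_config["learning_rate"] + step))
    return {"learning_rate": learning_rate}


def research_loop(initial_config, iterations, seed=0):
    """Design, run, review, repeat. Keeps only changes that improve the loss."""
    rng = random.Random(seed)
    best_config = initial_config
    best_loss = run_experiment(best_config)
    history = [(best_config, best_loss)]

    for _ in range(iterations):
        candidate = propose(best_config, rng)   # design the next experiment
        loss = run_experiment(candidate)        # run it
        history.append((candidate, loss))       # log the result
        if loss < best_loss:                    # review: keep only improvements
            best_config, best_loss = candidate, loss

    return best_config, best_loss, history


if __name__ == "__main__":
    config, loss, history = research_loop({"learning_rate": 0.9}, iterations=200)
    print(f"experiments run: {len(history)}")
    print(f"best learning_rate={config['learning_rate']:.3f}, loss={loss:.5f}")
```

The important design choice is the review step: a candidate replaces the current best only when it measurably improves the score, so the loop can run unattended without drifting backwards.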

Berman took this even further with OpenClaw. Despite having no ML background, he set up overnight autonomous fine-tuning runs. A frontier model designs experiments, fine-tunes open-source models like Qwen 2.7B, tests them against baselines, and adjusts its approach automatically. All without human involvement.

This connects back to Leopold Aschenbrenner’s “Situational Awareness” essay, which predicted this self-improvement loop would arrive sooner than expected. According to Berman, we’re standing right at the bottom of an exponential curve.

The takeaway is straightforward: you no longer need deep ML expertise to run AI research. You just need to direct a model and let it iterate. The momentum is building and the acceleration is real.

👉 Watch the full video for the complete walkthrough of each lab’s approach and links to Karpathy’s auto research repo.
