🤖 An LLM just wrote an optimization algorithm that outperforms a tool we spent years crafting.
Here’s what happened. A research team handed an LLM a 9-line random-search stub, a budget of 2,000 evaluations, and 5 rounds of contrastive feedback. The LLM rewrote the stub round after round until the result beat Optuna’s TPE algorithm on 53 out of 55 standard benchmarks. That’s 96%.
No hand-tuning. No architecture decisions. Just feedback loops where the model sees what worked, what didn’t, and writes a better version. They called the method ContraPrompt.
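For a sense of the starting point, here is a minimal sketch of what a random-search stub with a fixed evaluation budget could look like. The post doesn't show the team's actual 9 lines, so the function name, signature, and the `sphere` test objective are all assumptions for illustration.

```python
import random

def random_search(objective, bounds, budget=2000, seed=0):
    """Minimize `objective` by sampling uniformly inside `bounds`.

    Hypothetical stand-in for the stub described above, not the team's code.
    """
    rng = random.Random(seed)
    best_x, best_y = None, float("inf")
    for _ in range(budget):
        # Draw one point uniformly from the box defined by (low, high) pairs.
        x = [rng.uniform(lo, hi) for lo, hi in bounds]
        y = objective(x)
        if y < best_y:
            best_x, best_y = x, y
    return best_x, best_y

def sphere(x):
    # Toy objective (an assumption): minimum of 0 at the origin.
    return sum(v * v for v in x)

bounds = [(-5.0, 5.0)] * 2
best_x, best_y = random_search(sphere, bounds)
print(best_x, best_y)
```

Everything the LLM added in later rounds would sit on top of a loop like this one.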
🔬 The win rate is impressive. What’s more impressive is what the LLM independently discovered along the way:
- Corner enumeration: probing the edges of the search space first. Most practitioners skip this entirely and lose coverage they didn’t know they were missing.
- Differential evolution seeding: borrowed from classical evolutionary optimization, without being told it existed. The model arrived at a known technique on its own.
- Multi-phase refinement: structuring the search in stages, exactly the way a seasoned expert would do it by hand.
Nobody taught it any of this. It figured it out from contrastive feedback alone.
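To make the three ideas concrete, here is a rough sketch of how they might fit together in one optimizer. This is not the code the LLM produced. The function names, population size, phase split, and constants (`f`, `cr`, the step shrink factor) are all assumptions, and the differential evolution step is the textbook rand/1/bin variant.

```python
import itertools
import random

def corners(bounds, limit=64):
    # Corner enumeration: every low/high combination, capped because a
    # d-dimensional box has 2**d corners.
    return [list(c) for c in itertools.islice(itertools.product(*bounds), limit)]

def clip(x, bounds):
    return [min(max(v, lo), hi) for v, (lo, hi) in zip(x, bounds)]

def staged_search(objective, bounds, budget=2000, seed=0, f=0.8, cr=0.9):
    rng = random.Random(seed)
    dim = len(bounds)
    evals = 0

    def evaluate(x):
        nonlocal evals
        evals += 1
        return objective(x), x

    # Phase 1: seed the population with corners, then pad with random points.
    pop = [evaluate(c) for c in corners(bounds)]
    while len(pop) < 20:
        pop.append(evaluate([rng.uniform(lo, hi) for lo, hi in bounds]))

    # Phase 2: differential evolution (rand/1/bin) for half the budget.
    while evals < budget // 2:
        i = rng.randrange(len(pop))
        others = [p[1] for j, p in enumerate(pop) if j != i]
        a, b, c = rng.sample(others, 3)
        target = pop[i][1]
        forced = rng.randrange(dim)  # at least one coordinate comes from the mutant
        trial = [
            a[k] + f * (b[k] - c[k]) if (k == forced or rng.random() < cr) else target[k]
            for k in range(dim)
        ]
        candidate = evaluate(clip(trial, bounds))
        if candidate[0] < pop[i][0]:
            pop[i] = candidate

    # Phase 3: local refinement around the best point with a shrinking step.
    best_y, best_x = min(pop, key=lambda p: p[0])
    step = [0.1 * (hi - lo) for lo, hi in bounds]
    while evals < budget:
        y, x = evaluate(clip([rng.gauss(v, s) for v, s in zip(best_x, step)], bounds))
        if y < best_y:
            best_y, best_x = y, x
        else:
            step = [s * 0.98 for s in step]
    return best_x, best_y, evals

def sphere(x):
    # Toy objective (an assumption): minimum of 0 at the origin.
    return sum(v * v for v in x)

bounds = [(-5.0, 5.0)] * 2
best_x, best_y, used = staged_search(sphere, bounds)
print(best_x, best_y, used)
```

The point of the staging is budget allocation: cheap coverage first, population-based exploration second, and the remaining evaluations spent polishing the best candidate.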
💡 If you’re working on hyperparameter tuning, prompt optimization, or any problem that involves searching over a parameter space, read the full writeup at vizops.ai. It’s one of the cleaner benchmark studies you’ll find this year.
What you’re watching here is the early stage of something bigger: AI systems that improve their own tooling through iteration. Not a concept. 53 out of 55 benchmarks.
We let an LLM write its own optimizer — it beat Optuna on 96% of standard benchmarks
by u/se4u in PromptEngineering