Amazon just redesigned a core piece of its cloud to deal with a problem that’s about to get much bigger: AI agents don’t behave like people. On Thursday, AWS launched the next generation of OpenSearch Serverless, a fully managed search and vector database built specifically for agentic workloads, according to TechCrunch AI. The pitch is simple. The system scales up within seconds when agents fire off tasks, and scales all the way down to zero when nothing’s happening.
Why this matters: the internet was built for humans who search, click, scroll, and stream at a steady pace. Agents don’t. They spin up sub-agents that query hundreds of databases, hit APIs, and read documents in seconds, then vanish just as fast. Infrastructure designed for predictable human traffic buckles under that pattern. That’s the realization spreading across the industry right now.
What AWS Actually Changed
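The headline technical shift is that the new OpenSearch Serverless decouples compute from storage. Because compute no longer has to stay running alongside the stored data, it can shut off entirely when idle. Everything else follows from that.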
- Scale to zero: Compute spins up in seconds for traffic bursts, then drops to nothing when agents go idle. Customers pay $0 during idle time.
- No more reserved waste: “Previously, even in our prior Serverless version, you had to have at least one instance operational and running because storage and compute were coupled,” Amazon OpenSearch GM Tia White told TechCrunch. You were paying for idle compute whether you used it or not.
- Native integrations: At launch it plugs into AI development platforms like Vercel and Kiro, so developers can ship production search and vector backends for agents without managing the infrastructure underneath.
White’s parking analogy lands well. The old model was paying for a parking space you reserved 24/7. The new one is a metered spot, where you pay only for the minutes you’re actually parked.
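To make the parking analogy concrete, here is a minimal sketch of the two billing models. Every number in it is an assumption for illustration: the article gives no AWS prices, so the hourly rate and the active hours are hypothetical, and the sketch models compute only, ignoring storage and any other charges.

```python
# Illustrative comparison of always-on vs. scale-to-zero compute billing.
# All rates and hours are hypothetical; they are NOT AWS prices.

HOURS_PER_MONTH = 730  # common approximation of hours in a month


def reserved_cost(hourly_rate: float, hours: int = HOURS_PER_MONTH) -> float:
    """Always-on model: at least one instance bills for every hour."""
    return hourly_rate * hours


def metered_cost(hourly_rate: float, active_hours: float) -> float:
    """Scale-to-zero model: compute bills only while agents are active."""
    return hourly_rate * active_hours


assumed_rate = 0.24         # assumed $/hour for one compute unit
assumed_active_hours = 40   # assumed hours of agent activity per month

reserved = reserved_cost(assumed_rate)
metered = metered_cost(assumed_rate, assumed_active_hours)

print(f"reserved (always on): ${reserved:.2f}")
print(f"metered (scale to zero): ${metered:.2f}")
print(f"idle month, metered: ${metered_cost(assumed_rate, 0):.2f}")
```

The gap between the two figures depends entirely on how bursty the workload is: the fewer hours agents are actually active, the more the metered model saves.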
The Numbers Behind the Shift
Agents are still a small slice of internet activity, but machine traffic is already substantial and climbing fast. TechCrunch AI cites Cloudflare data showing bots made up 31% of all HTTP traffic over the last six months, with AI crawlers, search engines, and assistants accounting for roughly a quarter of bot requests.
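Combining the two Cloudflare figures gives a rough sense of scale. The sketch below assumes that "roughly a quarter" means 25% and that both shares are measured on the same basis (requests), neither of which the article confirms. Under those assumptions, AI-related bots come to just under 8% of all HTTP traffic.

```python
# Back-of-the-envelope estimate from the figures cited above.
bot_share_of_traffic = 0.31  # bots as a share of all HTTP traffic
ai_share_of_bots = 0.25      # "roughly a quarter" of bot requests (approximation)

# Assumes both shares are measured in requests, which the article does not confirm.
ai_share_of_traffic = bot_share_of_traffic * ai_share_of_bots

print(f"AI-related bots: about {ai_share_of_traffic * 100:.2f}% of all HTTP traffic")
```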
The forecast is the part worth circling. “Non-human traffic will exceed human traffic sometime in the first half of 2027,” Cloudflare senior product manager Lai Yi Ohlsen told TechCrunch. That’s the world AWS is building for.
The demand is coming from two directions. Google said at its I/O conference that users will soon delegate tasks to AI systems, things like researching purchases, booking travel, and browsing the web. And enterprises are deploying agents internally and for their customers, generating a steady stream of machine-to-machine traffic behind the scenes.
AWS Isn’t Alone Here
This is an industry-wide repositioning, not a one-company bet. Per TechCrunch AI:
- Databricks and Snowflake are recasting themselves as AI memory and retrieval systems for enterprise data.
- Microsoft has updated Azure to handle agent bursts and let agents share memory.
- Cloudflare rolled out infrastructure last month aimed at giving agents persistent environments and instant scalability, a close cousin to what Amazon is doing.
What stands out is the timing logic. “Agents are moving from experimentation into production, and they create traffic patterns that previous infrastructure simply wasn’t designed for,” White said. “They spike without warning, they go idle without notice, and enterprise needs search that keeps up without paying for empty or idle compute.”
Why You Should Care
There’s a flywheel hiding in this story. The more companies deploy agents, the more pressure builds to redesign infrastructure around machine workloads. That redesign, in turn, makes agents cheaper and easier to run at scale, which pulls in more deployment. AWS is betting that this cost structure, pay-per-use rather than pay-to-reserve, becomes the default expectation for agentic backends.
One caveat to keep in mind: this is fresh, and the article includes neither pricing specifics beyond the $0-when-idle model nor independent benchmarks on how the scaling holds up under real production bursts. It’s worth watching as early adopters on Vercel and Kiro put it through its paces.
For the full breakdown, including more from Amazon’s team, check the original report at TechCrunch AI.