The free AI party is ending, and the invoice is being written right now. Earlier this month, Anthropic slammed the brakes on third-party agent tools running on Claude, forcing users into pricier tiers if they wanted to keep building. According to The Verge AI, this move is just the opening salvo in a much larger correction: investors who poured hundreds of billions into OpenAI, Anthropic, and the rest are finally demanding returns, and the cost is flowing downstream to everyone using these tools.
This matters because it marks the end of the subsidy era. For years, you’ve had access to models that cost more to run than you paid for them. Venture capital covered the difference, betting that scale would eventually produce profit. The math is now catching up.
The Trillion-Dollar Hole
Will Sommer, a senior director analyst at Gartner, laid out the numbers for The Verge AI, and they’re brutal. Capital investment in AI data centers between 2024 and 2029 will hit roughly $6.3 trillion. To avoid a write-down, providers would ideally deliver a 25 percent return on invested capital, in line with what Amazon, Microsoft, and Google typically earn. Drop below 12 percent and institutional money walks. Below 7 percent, you’re looking at what Sommer called “an unmitigated disaster for all of the investors in this technology.”
To clear even that 7 percent floor, AI companies would need to generate close to $7 trillion in cumulative revenue through 2029. That’s about $2 trillion a year by the end of the period. OpenAI has already trimmed its spending commitments from $1.4 trillion down to $600 billion through 2030, and Sommer’s best-case forecast still shows the company hitting only a fraction of what it needs.
The Token Math Doesn’t Work
Here’s where it gets wild. Providers make money selling tokens, the units of text, image, or audio data their models process. Google announced in October that it was handling 1.3 quadrillion tokens a month. Add up every major provider and you get 100 to 200 quadrillion tokens a year.
To hit those revenue targets, providers would need to push 10 sextillion tokens annually. That’s a 50,000 to 100,000x jump in consumption by 2030, and that’s assuming a generous 10 percent profit margin per token. Right now, most labs are losing money on every token they process, and they don’t have the compute capacity to scale that high even if they wanted to.
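The size of that jump is easy to check by hand. Here is a quick sketch of the arithmetic in Python, using only the token figures quoted above; the variable and function names are illustrative, not from the original analysis.

```python
# Token volumes as quoted in the article (short-scale number names).
QUADRILLION = 10**15
SEXTILLION = 10**21

current_low = 100 * QUADRILLION   # low estimate of tokens processed per year today
current_high = 200 * QUADRILLION  # high estimate of tokens processed per year today
required = 10 * SEXTILLION        # tokens per year needed to hit the revenue target


def growth_multiple(required_tokens: int, current_tokens: int) -> int:
    """How many times larger the required volume is than the current one."""
    return required_tokens // current_tokens


# The smaller multiple comes from the larger current baseline, and vice versa.
multiple_from_high_base = growth_multiple(required, current_high)
multiple_from_low_base = growth_multiple(required, current_low)

print(f"{multiple_from_high_base:,}x to {multiple_from_low_base:,}x")
# prints "50,000x to 100,000x"
```

The 50,000x figure assumes the industry is already at 200 quadrillion tokens a year, and the 100,000x figure assumes it is at 100 quadrillion.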
What’s Changing for Users
The squeeze is already visible if you know where to look:
- Enterprise pricing shifts at OpenAI and Anthropic, pushing heavy users into premium tiers
- Ads in the product, with OpenAI introducing in-platform advertising
- Third-party restrictions, like Anthropic cutting off agent tools that ran on consumer subscriptions
- Usage caps creeping into previously unlimited plans
“Is the era of basically free or close-to-free AI kind of coming to an end here? It’s too soon to say for certain, but there are some signs.” (Mark Riedl, Georgia Tech)
What Practitioners Should Do Now
If you’re building on top of these APIs, the cheap-token assumption is a ticking bomb in your unit economics. A few moves worth making this quarter:
- Audit your token burn. Know exactly how much each user costs you at current rates, then model what happens if that doubles.
- Build model-agnostic. If Anthropic jacks up prices, you want the ability to swap in Gemini, open-source models, or a cheaper tier without rewriting your stack.
- Cache aggressively. Prompt caching cuts the cost of repeated input and response caching skips repeat calls entirely, so both pay off more as token prices climb.
- Push logic to smaller models. Use flagship models only where they actually matter. Haiku, Flash, and the smaller Llamas handle most routine work at a fraction of the cost.
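To make the audit concrete, here is a minimal sketch of a per-user cost model in Python. Every number in it is a placeholder assumption: the prices, the tier names `flagship` and `small`, the usage figures, and the 70 percent routing share are all hypothetical, so swap in your own billing data and your provider's published rates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelPrice:
    """Price per million tokens, in USD. Values below are hypothetical."""
    input_per_mtok: float
    output_per_mtok: float


@dataclass(frozen=True)
class Usage:
    """Tokens one user consumes per month (hypothetical figures)."""
    input_tokens: int
    output_tokens: int


# Keyed by tier rather than vendor, so a provider swap is a one-line change.
PRICES = {
    "flagship": ModelPrice(input_per_mtok=3.00, output_per_mtok=15.00),
    "small": ModelPrice(input_per_mtok=0.25, output_per_mtok=1.25),
}


def cost_usd(usage: Usage, price: ModelPrice, multiplier: float = 1.0) -> float:
    """Monthly cost for one user; `multiplier` models a price change (2.0 = doubled)."""
    raw = (usage.input_tokens * price.input_per_mtok
           + usage.output_tokens * price.output_per_mtok)
    return multiplier * raw / 1_000_000


def blended_cost_usd(usage: Usage, small_share: float, multiplier: float = 1.0) -> float:
    """Cost when `small_share` of the traffic (0.0 to 1.0) goes to the small tier."""
    if not 0.0 <= small_share <= 1.0:
        raise ValueError("small_share must be between 0 and 1")
    flagship = cost_usd(usage, PRICES["flagship"], multiplier) * (1.0 - small_share)
    small = cost_usd(usage, PRICES["small"], multiplier) * small_share
    return flagship + small


user = Usage(input_tokens=2_000_000, output_tokens=500_000)

today = cost_usd(user, PRICES["flagship"])
doubled = cost_usd(user, PRICES["flagship"], multiplier=2.0)
doubled_routed = blended_cost_usd(user, small_share=0.7, multiplier=2.0)

print(f"flagship only, current prices: ${today:.2f} per user per month")
print(f"flagship only, prices doubled: ${doubled:.2f} per user per month")
print(f"70% routed to small tier, prices doubled: ${doubled_routed:.3f} per user per month")
```

With these placeholder numbers, routing most routine traffic to the small tier more than absorbs a doubling of prices. Whether that holds for a real product depends on how much of the workload the smaller model can actually handle.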
The free ride was always going to end. The question now is whether the correction comes as a gradual squeeze or a sudden repricing that breaks a lot of business models at once. Either way, the cheapest tokens you’ll ever use are the ones you’re burning today.
Full breakdown at The Verge AI.