The GPU market just broke its own rules. H100 rental prices, which had been steadily declining for over two years, reversed course in late 2025 and have been climbing sharply ever since, according to Latent Space.
This isn’t how hardware depreciation is supposed to work. NVIDIA’s H100 chips, first shipped in October 2022, followed a predictable price decline through most of their lifecycle. Latent Space tracked this depreciation cycle back in October 2024, noting it was moving faster than in previous generations. Prices bottomed out after the DeepSeek R1 shock. But since December 2025, rental prices have surged, and they are still climbing.
As Latent Space reports, the H100 is now “worth more today than it was 3 years ago.” Several forces are driving this:
- A general chip shortage that started tightening in late 2025
- The reasoning model and agent inflection point of December 2025, which dramatically increased demand for inference compute
- Better reasoning models and inference software that make the H100 far more productive than anyone expected four years ago
The practical result: a chip that was supposed to depreciate on a 4-7 year schedule is actually appreciating. For data center operators and GPU cloud providers, this changes the economics completely. Capacity that was losing value is now gaining it, as long as the trend holds.
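To make that gap concrete, here is a minimal sketch of what a depreciation schedule would predict on paper for a three-year-old chip. It assumes straight-line depreciation with zero salvage value, which is an illustrative choice (the report cites only a 4-7 year schedule, not a method), and the purchase price is a placeholder rather than a reported figure.

```python
def straight_line_book_value(purchase_price, age_years, schedule_years, salvage_value=0.0):
    """Book value under straight-line depreciation (an assumed method).

    The value falls linearly from purchase_price to salvage_value over
    schedule_years and stays at salvage_value afterwards.
    """
    if schedule_years <= 0:
        raise ValueError("schedule_years must be positive")
    depreciable = purchase_price - salvage_value
    # Clamp the elapsed fraction of the schedule to the range [0, 1].
    fraction_used = min(max(age_years / schedule_years, 0.0), 1.0)
    return purchase_price - depreciable * fraction_used


# Hypothetical inputs: the price is a placeholder, not a figure from the report.
HYPOTHETICAL_PRICE = 30_000.0
AGE_YEARS = 3.0

for schedule in (4, 5, 6, 7):
    value = straight_line_book_value(HYPOTHETICAL_PRICE, AGE_YEARS, schedule)
    share = value / HYPOTHETICAL_PRICE
    print(f"{schedule}-year schedule: book value ${value:,.0f} ({share:.0%} of purchase price)")
```

Under these assumptions, every schedule in the 4-7 year range puts a three-year-old chip well below its purchase price on paper, which is why a chip reported as “worth more today than it was 3 years ago” stands out.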
🔍 Anthropic’s “Mythos” Leak and Capybara Tier
The other major story in this Latent Space briefing: Anthropic appears to be preparing a new model tier above Opus. A leaked “Claude Mythos” post, now pulled, was preserved by community members. Fortune corroborates that Anthropic is introducing something called Capybara, described as “larger and more intelligent” than Claude Opus 4.6.
Early reports suggest Capybara posts stronger scores on coding, academic reasoning, and cybersecurity benchmarks. Rollout is reportedly constrained by cost and safety concerns. Speculation about a model in the ~10T-parameter class remains unconfirmed, but the direction is clear: Anthropic is betting hard on scale.
The timing is notable. The Financial Times reports Google is close to funding Anthropic’s data center buildout. Frontier AI competition is increasingly gated by power and capital expenditure, not just algorithms.
🌐 Open Models Keep Closing the Gap
Zhipu’s GLM-5.1 is now available to all coding plan users, and community reaction frames it as further evidence that open and semi-open Chinese models are narrowing the gap with closed alternatives.
Local deployment keeps getting more practical. Latent Space highlights several examples: swapping paid TTS subscriptions for local Qwen 3.5 14B setups, strong economics running Qwen 27B with Hermes Agent, and new quantization work that fits Qwen3.5-35B into 24GB VRAM with roughly 1% performance loss.
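A rough back-of-the-envelope sketch shows why quantization is what makes a 35B model fit on a 24GB card. It counts weight storage only and treats 1 GB as 10^9 bytes. The bit widths and the `reserved_gb` headroom for KV cache and activations are illustrative assumptions, not details of the quantization work mentioned above.

```python
def weight_memory_gb(num_params, bits_per_weight):
    """Approximate memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


def max_bits_per_weight(num_params, vram_gb, reserved_gb=0.0):
    """Largest average bits per weight that fits in the given VRAM.

    reserved_gb is a hypothetical allowance for KV cache, activations,
    and runtime overhead; tune it for a real workload.
    """
    usable_bytes = (vram_gb - reserved_gb) * 1e9
    if usable_bytes <= 0:
        raise ValueError("reserved_gb must be smaller than vram_gb")
    return usable_bytes * 8 / num_params


PARAMS_35B = 35_000_000_000

# Compare a few common bit widths (illustrative choices).
for bits in (16, 8, 4):
    print(f"{bits}-bit weights: {weight_memory_gb(PARAMS_35B, bits):.1f} GB")

# Budget on a 24 GB card with an assumed 4 GB of headroom.
budget = max_bits_per_weight(PARAMS_35B, vram_gb=24, reserved_gb=4)
print(f"Average bits per weight that fit: {budget:.2f}")
```

By this estimate, 16-bit weights for a 35B model need far more than 24GB, so fitting the model on a single consumer card implies an average of only a few bits per weight.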
🤖 Agents Are Becoming Real Products
Nous Research’s Hermes Agent is emerging as the focal point for open-source agent development. The integration with Hugging Face as a first-class inference provider, featuring 28 curated models plus broader access, signals a shift from demo-stage agents to production-ready ones with memory, persistent machine access, and model choice.
The infrastructure layer is maturing fast. LangChain pushed production-oriented materials including agent eval checklists, IDE-style UI guidance, and prompt promotion/rollback tools. Artificial Analysis introduced AA-AgentPerf, a benchmark focused on real coding-agent trajectories with 100K+ sequence lengths.
The stack is moving from “chatbot with tools” to proper software lifecycle primitives for agents: evaluation checklists, prompt promotion and rollback, and benchmarks built on realistic long-running trajectories.
What This Means
The H100 price reversal is the headline, but the underlying story is bigger. Demand for inference compute is outstripping supply at exactly the moment when reasoning models and agents need more of it. Data center operators who held onto H100 inventory are sitting on appreciating assets. Those who sold early left money on the table.
For the broader AI industry, this means compute costs aren’t going down anytime soon, even as open models and quantization techniques try to push in the other direction. The tension between scaling up and making AI accessible just got sharper.
More details are available in the full Latent Space report.