Slash AI Inference Costs: Gimlet Labs Raises $80M for Efficiency

Gimlet Labs just closed an $80 million Series A to tackle one of AI’s most expensive problems: wasted hardware. The round was led by Menlo Ventures, as reported by TechCrunch AI.

The core idea is deceptively simple. Most AI workloads use only 15 to 30 percent of the hardware they’re deployed on, leaving the remaining 70 to 85 percent idle. That translates to hundreds of billions of dollars in idle resources across the industry. Gimlet Labs built what it calls the first “multi-silicon inference cloud”: orchestration software that splits AI workloads across whatever hardware is available, simultaneously.

Why This Matters

AI inference is becoming the industry’s biggest cost headache. Training a model is largely a one-time expense; running it for millions of users is ongoing and brutal. McKinsey estimates data center spending will hit nearly $7 trillion by 2030 if the current compute-scaling trend continues.

The problem is that different parts of an AI workflow need different hardware. As Menlo’s Tim Tully explains in his investment thesis: inference is compute-bound, decode is memory-bound, and tool calls are network-bound. No single chip does it all. Gimlet’s software slices up agentic workloads, and even the underlying models themselves, so each piece runs on the best available chip for that specific task.
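To make the routing idea concrete, here is a minimal Python sketch of matching each stage of a workload to the device that is strongest on that stage's bottleneck. It is an illustration only, not Gimlet's software or API: the Device and Stage types, the device names, and the relative scores are all hypothetical, and the stage labels simply reuse the terms from Tully's thesis above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Device:
    """A hypothetical accelerator with made-up relative strengths."""
    name: str
    compute: float    # relative compute throughput
    memory_bw: float  # relative memory bandwidth
    network: float    # relative network throughput


@dataclass(frozen=True)
class Stage:
    """One piece of a workload, tagged with the resource that limits it."""
    name: str
    bottleneck: str   # "compute", "memory_bw", or "network"


def assign(stages, devices):
    """Map each stage to the device scoring highest on its bottleneck."""
    return {
        stage.name: max(devices, key=lambda d: getattr(d, stage.bottleneck)).name
        for stage in stages
    }


# Illustrative numbers only; real hardware profiles would be measured.
DEVICES = [
    Device("compute_heavy", compute=10.0, memory_bw=3.0, network=1.0),
    Device("high_bandwidth", compute=4.0, memory_bw=9.0, network=1.0),
    Device("fast_nic_host", compute=1.0, memory_bw=2.0, network=8.0),
]

# Stage labels follow the wording quoted in the article.
STAGES = [
    Stage("inference", "compute"),
    Stage("decode", "memory_bw"),
    Stage("tool_call", "network"),
]

plan = assign(STAGES, DEVICES)
for stage_name, device_name in plan.items():
    print(f"{stage_name} -> {device_name}")
```

A real scheduler would also have to weigh device availability, data-transfer costs between chips, and queueing, none of which this sketch attempts.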

The result, according to Gimlet: 3x to 10x faster AI inference at the same cost and power.

Who’s Behind It

Founder Zain Asgar is a Stanford adjunct professor who previously built Pixie, an open source observability tool for Kubernetes. That company was acquired by New Relic in 2020 just two months after launching with a $9 million Series A. His cofounders Michelle Nguyen, Omid Azizi, and Natalie Serrino all worked with him at Pixie.

The team has already locked in partnerships with NVIDIA, AMD, Intel, ARM, Cerebras, and d-Matrix. That’s a serious hardware coalition.

The Business So Far

Gimlet publicly launched in October with what it described as eight-figure revenues out of the gate: at least $10 million. According to TechCrunch AI, Asgar says his customer base has more than doubled in four months and now includes a major model maker and an “extremely large cloud computing company,” though he wouldn’t name either.

This isn’t a tool for everyday developers. Gimlet targets the largest AI model labs and data centers: the companies burning through the most compute.

The Funding Picture

With its seed included, Gimlet has raised $92 million total. The angel list reads like a who’s who:

Sequoia’s Bill Coughran
Stanford Professor Nick McKeown
Former VMware CEO Raghu Raghuram
Intel CEO Lip-Bu Tan

Other institutional investors include Factory (which led the seed), Eclipse Ventures, Prosperity7, and Triatomic. The round was oversubscribed after VCs learned Asgar had competing term sheets.

What Comes Next

What stands out here is the timing. The AI industry is entering its inference-cost reckoning. Training costs grab headlines, but inference is where the real money burns. A software layer that makes existing hardware dramatically more efficient, without requiring new chips, could reshape how data centers allocate resources.

The 30-person team now has serious capital to scale. If Gimlet’s 3x-10x efficiency claims hold up across production workloads, the company sits at a critical chokepoint in the AI infrastructure stack.

For the full details on the funding and Gimlet’s technology, check out the original report on TechCrunch AI.
