Origin Lab just closed an $8 million seed round to do something nobody else has cracked at scale: license video game data to the labs building AI world models. Lightspeed Venture Partners led the round, according to TechCrunch AI, with SV Angel, Eniac, Seven Stars, and FPV joining in. Twitch co-founder Kevin Lin and Cruise founder Kyle Vogt added angel checks on top.
The pitch is simple. World-model labs like Yann LeCun’s AMI Labs and Fei-Fei Li’s World Labs need massive amounts of physical-world data to train systems that understand how objects move, fall, collide, and behave. That data is hard to find. It’s not sitting on the open web like text. But it does exist in one place: the video game industry, which has spent decades building physics simulations and photorealistic 3D assets.
What Origin Lab Actually Does
Think of it as a marketplace with conversion services bolted on. Game studios bring assets they’ve already built and monetized once. Origin Lab transforms those assets into training-ready data, then sells access to AI labs. The conversion work ranges from straightforward (rendering existing 3D models) to involved (automating the capture of hours of in-game walkthrough footage with the right annotations).
“The AI systems that are being built now need to understand how the physical world works and how things move,” co-CEO Anne-Margot Rodde told TechCrunch AI. “That data essentially lives in video games.” Her co-founders are Antoine Gargot and Colin Carrier.
Why This Matters Now
World models are widely seen as the next frontier after LLMs. They’re what’s supposed to power humanoid robots, autonomous vehicles, and any AI that needs to predict physical outcomes. Unlike language models, which feed on text scraped from the internet, world models are starving. There’s no Common Crawl for physics.
Labs have wanted video game footage for years, but the path has been messy. In December 2024, OpenAI’s first Sora release caused a minor scandal when outputs appeared to regurgitate footage from popular games and Twitch streams. The implication was that Sora had been trained on streaming data without clear licensing. Amazon has been more open about wanting to use Twitch footage for training. Origin Lab is betting that a clean, licensed pipeline beats the grey-zone scraping that’s caused headaches across the industry.
What stands out here is the structural play. Origin isn’t trying to build a world model. It’s building the picks-and-shovels infrastructure that every world-model lab will eventually need.
The Data-Vendor Playbook
Lightspeed’s Faraz Fatemi, who led the deal, told TechCrunch AI the bet is straightforward: data vendors serving frontier labs can scale revenue fast. Scale AI proved the template. “We’ve seen how sharp the revenue scaling can be for data vendors that are serving the major labs,” he said. “These are very well-capitalized businesses, and the bottleneck for all of them is data.”
That last line is the whole thesis. Frontier labs have money. They have compute. What they don’t have is enough of the right data to train the next generation of models.
What to Watch
A few things worth tracking:
- Which game studios sign on first. The big publishers (EA, Activision, Ubisoft) sit on the most valuable archives. Indie studios will likely move faster.
- Licensing terms. Per-asset? Per-frame? Revenue share? The pricing model will shape how the rest of the market forms.
- Competitive response. Expect Scale AI, Surge, and others to look hard at this category. Game-engine companies like Unity and Epic could also move in directly.
- Quality of converted data. A rendered scene from a game isn’t automatically useful training data. The conversion layer is where Origin either wins or gets commoditized.
If the world-model thesis pans out, the companies supplying training data could be as valuable as the labs themselves. Origin Lab just placed its bet on that exact spot. Full details at TechCrunch AI.