Microsoft built its own brain, no OpenAI required

“This is about taking first steps towards true self-sufficiency in AI.”

That one line from Microsoft AI’s CEO Mustafa Suleyman tells you everything about where this week went. Microsoft doesn’t just want to rent intelligence from OpenAI anymore. It wants to grow its own.

I caught this whole breakdown from the creator behind the Future Tools channel, who flew out to Microsoft Build and actually sat down with Suleyman to ask the questions the rest of us were thinking. He drank from the firehose so we don’t have to, and there was a LOT this week.

Here’s the part that matters. Microsoft dropped seven in-house models in one go. A new flagship reasoning model, a coding model called MAI Code One Flash, image models, and two standouts the creator flagged hard:

  • MAI Transcribe 1.5, which he says is currently the best transcription model on the planet, around five times faster than rivals
  • MAI Voice 2, a speech model running across 15 languages

Suleyman was also blunt about training data. He told the creator they licensed it carefully, paid a lot for it, and deliberately skipped open-source datasets to avoid hidden security bugs. The pitch is trust. Build on us, the data is clean.

🔑 Other Microsoft moves worth knowing:

  • Microsoft Scout, an always-on personal agent. The creator points out it’s literally powered by OpenClaw’s open-source tech, now wired straight into Windows, Teams, Outlook, and SharePoint.
  • A new GitHub Copilot desktop app that looks like Codex but lets you pick any model from any provider. The creator rushed home to try it and hit a wall: signups were paused. So temper your excitement for a bit.
  • Project Solara, putting agents into physical devices like a desk speaker and a camera badge. Even the creator admits he’s not sure who it’s for yet.
  • A Mayo Clinic partnership on a healthcare model. Suleyman told him he believes “medical super intelligence” is 2 to 3 years out.

Then Nvidia crashed the same week at Computex with the RTX Spark, a combined GPU and CPU with up to 128GB of unified memory. The big idea the creator explains well: run smart models locally. That means privacy, offline use on a plane, and saving the heavy cloud models for tasks that actually need them. The catch? He expects these machines, like the new Surface Laptop Ultra, to be expensive as anything, likely starting near 4 grand.

The rapid-fire round had real gems. My favorite catch from his roundup:

  • Miso One, an open-source voice model the creator says is so emotive it would fool him while scrolling TikTok
  • Ideogram 4.0, open-weight, great at text in images and native transparency
  • Reve 2.0, which jumped to number two in text-to-image
  • Codex on Windows now controlling desktop apps, plus role-based plugins for sales, design, and analytics

I was genuinely impressed watching him test the local models running on a laptop. That shift from cloud to your own machine feels like the quiet story under all the headlines.

If you want a simple way to think about all of it, steal Suleyman’s own test that he shared on stage: does the tech make our collective lives healthier and happier? If yes, keep it. If no, question it.

The creator narrowed about 70 stories down to this handful, so the full video is where the demos, voice samples, and his Suleyman interview really land. Go watch it, then tell me which of these you want pulled apart next. 🎯

Scroll to Top