Microsoft just made its biggest Copilot move yet: it’s putting OpenAI’s GPT and Anthropic’s Claude to work together inside the same product. The Information reports that Microsoft debuted new Copilot upgrades on March 30 that combine models from both AI labs in a novel “Critique” workflow.
Here’s how it works: GPT drafts a response to a research query, then Claude reviews that draft for accuracy, completeness, and citation quality before the user ever sees it. Think of it as a built-in peer review system, except both peers are large language models from competing companies.
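To make the draft-then-review idea concrete, here is a minimal Python sketch. It is not Microsoft's implementation, and the report does not describe one. Every name in it (`critique_pipeline`, `Review`, the stub drafter, reviewer, and reviser) is hypothetical. The revision loop is also an assumption, since the report only says the draft is reviewed before the user sees it. The stubs stand in for real model calls.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Review:
    """Hypothetical reviewer verdict: approve the draft or list problems."""
    approved: bool
    notes: List[str] = field(default_factory=list)


def critique_pipeline(
    query: str,
    drafter: Callable[[str], str],
    reviewer: Callable[[str, str], Review],
    reviser: Callable[[str, str, List[str]], str],
    max_rounds: int = 2,
) -> str:
    """Draft with one model, review with another, revise until approved.

    Assumption: rejected drafts are revised and re-reviewed, for at most
    max_rounds reviews. The last draft is returned either way.
    """
    draft = drafter(query)
    for _ in range(max_rounds):
        review = reviewer(query, draft)
        if review.approved:
            break
        draft = reviser(query, draft, review.notes)
    return draft


# Stubs standing in for real model calls (illustrative only).
def stub_drafter(query: str) -> str:
    return f"Draft answer to: {query}"


def stub_reviewer(query: str, draft: str) -> Review:
    # Pretend the reviewer only checks that a citation marker is present.
    if "[source]" in draft:
        return Review(approved=True)
    return Review(approved=False, notes=["add a citation"])


def stub_reviser(query: str, draft: str, notes: List[str]) -> str:
    # Pretend the revision simply appends the missing citation marker.
    return draft + " [source]"


final = critique_pipeline("example query", stub_drafter, stub_reviewer, stub_reviser)
print(final)
```

The point of this design is that the drafter and reviewer are separate callables, so they can be backed by models from different vendors without changing the pipeline.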
Microsoft says this multi-model approach delivered a 13.8% improvement on the DRACO benchmark, which the company describes as an industry measure of deep research quality. By Microsoft’s account, that score puts Copilot’s Researcher agent ahead of the standalone deep-research tools from OpenAI, Google, Perplexity, and Anthropic.
“Having various different models from different vendors in Copilot is highly attractive, but we’re taking this to the next level, where customers actually get the benefits of the models working together,” said Nicole Herskowitz, corporate vice president of Microsoft 365 and Copilot.
Why This Matters
For years, Microsoft went all-in on OpenAI: billions in investment, exclusive model access, deep product integration. This move signals a real strategic shift. Microsoft is no longer betting on a single model provider; it’s treating AI models like components you mix and match for better outcomes.
This is significant for a few reasons:
- Model-agnostic enterprise AI is here. Microsoft is making the case that the best results come from combining models, not picking a winner. Enterprises building AI workflows should take note.
- Competition benefits everyone. When Claude checks GPT’s work, the output improves, at least by Microsoft’s benchmark numbers. This supports what many AI practitioners have argued: ensembles and model diversity beat single-model dependence.
- Anthropic gets massive distribution. Claude is now embedded in Microsoft 365, which serves hundreds of millions of enterprise users. That’s a distribution win Anthropic couldn’t easily achieve on its own.
Copilot Cowork Goes Live
Microsoft also announced that Copilot Cowork, its tool for delegating long-running, multi-step tasks inside Microsoft 365, is now available through the Frontier early access program. Cowork lets AI agents handle complex workflows that unfold over time, not just one-shot queries.
The company plans to make the Critique workflow bi-directional in the future, meaning GPT would also review Claude’s drafts. That two-way system could push quality even higher.
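As a rough illustration of what "bi-directional" could mean in practice, the Python sketch below lets each model draft an answer and has every other model critique it. This is an assumption about the shape of such a system, not a description of Microsoft's plans. `cross_critique`, the prompt strings, and the stub models are all hypothetical.

```python
from typing import Callable, Dict

# Hypothetical model interface: prompt text in, response text out.
Model = Callable[[str], str]


def cross_critique(query: str, models: Dict[str, Model]) -> Dict[str, Dict[str, str]]:
    """Each model drafts an answer; every other model critiques that draft."""
    results: Dict[str, Dict[str, str]] = {}
    for drafter_name, drafter in models.items():
        draft = drafter(f"Answer this research query: {query}")
        entry = {"draft": draft}
        for reviewer_name, reviewer in models.items():
            if reviewer_name == drafter_name:
                continue  # a model never reviews its own draft
            entry[f"critique_by_{reviewer_name}"] = reviewer(
                f"Critique this draft: {draft}"
            )
        results[drafter_name] = entry
    return results


# Stub models standing in for real vendor APIs (illustrative only).
def stub_a(prompt: str) -> str:
    return f"A({prompt})"


def stub_b(prompt: str) -> str:
    return f"B({prompt})"


results = cross_critique("q", {"a": stub_a, "b": stub_b})
print(results["a"]["critique_by_b"])
```

With two models, this yields one draft and one critique in each direction. How a real system would merge or choose between them is left open here.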
What Comes Next
This sets a new expectation for enterprise AI products. If Microsoft is blending rival models for better performance, other platforms will face pressure to do the same. The era of single-model lock-in is fading.
For AI practitioners and enterprise buyers, the takeaway is clear: the best AI system isn’t necessarily the best single model. It’s the best orchestration of multiple models. Microsoft just made that argument very hard to ignore.
Full details are available in the original report from The Information.