We usually treat AI research as a linear conversation: the model does one thing, checks the result, and moves to the next step. But a recent experiment showed what happens when you stop treating Large Language Models (LLMs) like chatbots and start treating them like dispatch centers for parallel workers. Reddit user u/MathematicianBig2071 set up a cage match between standard Claude Code and a version equipped with a specific Model Context Protocol (MCP) server, both pointed at the same large data problem.
The task was relatable to anyone who has ever applied to university: rank 446 colleges based on highly specific, personal criteria. If you have ever used tools like Naviance, you know how clunky this process usually is. The author wanted to see if an AI could handle this large-scale prioritization better than a human scrolling through spreadsheets. The results highlighted a large gap in speed and effort between a model working alone and the same model equipped with the right tool.
The Linear Struggle
The experiment ran two sessions side-by-side. On the left was “vanilla” Claude Code, the out-of-the-box experience. The author noted that this session was painful to watch. Without specialized tools, the AI tried to act like a human developer.
It spent minutes planning a strategy, reading API documentation, and attempting to query databases directly. When it hit rate limits, a common hurdle in automated research, it panicked. It switched strategies to downloading full datasets, but couldn’t find the right URLs. It bounced between GitHub repositories and file downloads, eventually settling for slightly outdated data. Even then, it had to write Python scripts to fuzzy match the data, leading to errors where schools didn’t match or filters failed. It took over 20 minutes and burned 50,000 tokens just to get a “reasonable” result.
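To make the fuzzy-matching step concrete, here is a minimal sketch of the kind of script the vanilla session might have written. The original post does not show its code, so the function names, the normalization rule, and the 0.85 cutoff are all assumptions for illustration. It uses only Python's standard-library `difflib`.

```python
import difflib


def normalize(name: str) -> str:
    """Lowercase and replace punctuation with spaces so trivial differences don't block a match."""
    cleaned = "".join(ch if ch.isalnum() or ch.isspace() else " " for ch in name.lower())
    return " ".join(cleaned.split())


def match_schools(wanted, dataset_names, cutoff=0.85):
    """Map each wanted name to its closest dataset name, or None if nothing clears the cutoff.

    The cutoff is an assumed value: too low and different schools get merged,
    too high and legitimate variants are dropped.
    """
    normalized = {normalize(n): n for n in dataset_names}
    result = {}
    for name in wanted:
        hits = difflib.get_close_matches(
            normalize(name), list(normalized), n=1, cutoff=cutoff
        )
        result[name] = normalized[hits[0]] if hits else None
    return result


if __name__ == "__main__":
    dataset = ["Massachusetts Institute Of Technology.", "Stanford University"]
    print(match_schools(["Massachusetts Institute of Technology", "Hogwarts School"], dataset))
```

Names that return `None` are exactly the "schools didn't match" failures described above, and every one of them needs a manual fix or a looser cutoff, which is where the debugging time goes.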
The Parallel Twist
The second session used the exact same prompt but had access to an “everyrow” MCP server. This is where things got interesting. Instead of trying to do the work itself linearly, the model acted as a manager.
The AI recognized it had a tool designed for this exact scenario. It immediately called a ranking tool that dispatched optimized research agents for every single row in the dataset simultaneously. It didn’t need to write Python scripts or worry about API rate limits in the same way. It simply assigned 446 digital interns to go visit websites, read news articles, and gather data independently.
How the Workflow Differed
The difference in execution speed and logic was stark. Here is how the MCP-enabled session tackled the problem:
- Instant Delegation: Rather than planning a complex research strategy, the model immediately identified the “Rank” tool as the correct path.
- Parallel Execution: The system dispatched agents to evaluate all 446 schools at the same time. ⚡
- Live Updates: Progress reports rolled in as individual agents finished their specific research tasks.
- Synthesis: Within 8 minutes, the task was done. The model printed out top picks, with each score annotated with the specific research that informed it. 📝
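The four steps above follow a familiar fan-out pattern, which can be sketched with Python's `asyncio`. This is not the everyrow server's implementation, which the post does not show. `research_school` is a hypothetical stand-in for one research agent, and the scoring rule and concurrency limit are placeholders.

```python
import asyncio


async def research_school(name: str) -> dict:
    """Hypothetical agent: a real one would browse sites and score against the criteria."""
    await asyncio.sleep(0.01)  # simulates I/O-bound research
    # Placeholder score; a real agent would derive it from what it found.
    return {"school": name, "score": len(name) % 10, "notes": f"placeholder notes for {name}"}


async def rank_all(schools, max_concurrency=50):
    """Dispatch one agent per row, report progress as each finishes, then sort by score."""
    semaphore = asyncio.Semaphore(max_concurrency)  # assumed cap on simultaneous agents

    async def bounded(name):
        async with semaphore:
            return await research_school(name)

    tasks = [asyncio.create_task(bounded(s)) for s in schools]
    results = []
    # as_completed yields in finish order, which gives the "live updates" behavior.
    for finished in asyncio.as_completed(tasks):
        row = await finished
        results.append(row)
        print(f"[{len(results)}/{len(tasks)}] finished {row['school']}")
    # Synthesis: highest score first, with each agent's notes kept alongside its score.
    return sorted(results, key=lambda r: r["score"], reverse=True)


if __name__ == "__main__":
    ranked = asyncio.run(rank_all([f"School {i}" for i in range(446)]))
    print(ranked[:3])
```

The point of the sketch is that total wall-clock time tracks the slowest batch of agents rather than the sum of all 446 lookups, which is the structural reason a parallel session can finish well ahead of a sequential one.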
Why This Matters
The quality of the final output was comparable between the two sessions. Both identified a mix of prestigious programs and underrated gems that fit the user’s criteria. However, the path to get there was completely different. The vanilla model struggled with the “how” of the problem, wasting tokens on debugging and data retrieval. The MCP-equipped model focused entirely on the “what,” leveraging agents to handle the grunt work.
This highlights a critical shift in how we should be building with LLMs. The future isn’t just about smarter models; it’s about giving those models the right interfaces, like MCP servers, to interact with data efficiently. The experiment suggests that when you decouple the reasoning engine from the execution layer, you save time, tokens, and sanity: the tool-equipped session finished in about 8 minutes, while the vanilla session took over 20.
If you want to see the video comparison of these two approaches running side-by-side, check out the original discussion in the PromptEngineering subreddit.
Source: “I Ranked 446 Colleges by the criteria I care about in under 8 Minutes” by u/MathematicianBig2071 in PromptEngineering