Prompt Routing Layer: Stop Hallucinations Reliably

Engineers have been building elaborate system prompts for years. Most of them follow the same pattern: define a persona, set some rules, add examples, ship it. PEOS Router v2 is a different animal.

It dropped on r/PromptEngineering this week, buried inside 4,000 words of dense specification. Most people skipped it. Fair. Technical specs without a clear value proposition on page one tend to get scrolled past. But the core idea is worth pulling out: instead of writing a prompt that answers your question, you write a prompt that decides which prompt should handle your question. A routing layer, not an execution layer. Think of it less like a prompt and more like a dispatcher that reads incoming requests, categorizes them, and hands them off to the right handler before a single word of output is generated.

That’s the twist. You’re not prompting for output. You’re prompting for a classification and routing decision first, every single time. The model never jumps straight to answering. It stops, reads the request against a decision tree, and determines the appropriate mode of response before engaging. That pause is intentional. It’s where most production prompts fail, because they skip straight to generation and trust that the model will self-correct if the framing is wrong. It usually doesn’t.

And the way it enforces honesty is unusual. When evidence is thin, the model can’t fill the gap with confident-sounding language. It’s required to say exactly: “I cannot confirm this.” No softening, no burying uncertainty in polished prose, no phrases like “it’s generally understood that” followed by something the model just invented. That one rule arguably does more for reliability than the persona framing most role-play prompts rely on. In practice, outputs get shorter when certainty is low, which is the opposite of what most prompts produce. Most prompts reward fluency over accuracy by default. This one punishes fluency that outpaces evidence.
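
To make the rule concrete, here is a minimal Python sketch. It assumes a hypothetical `Claim` record that carries whatever evidence was gathered for a statement. The names `Claim`, `render`, and `min_evidence` are illustrative and not taken from the spec; only the fixed phrase is.

```python
from dataclasses import dataclass, field

# The exact phrase the spec requires when evidence is thin.
CANNOT_CONFIRM = "I cannot confirm this."


@dataclass
class Claim:
    """Hypothetical record: a statement plus the evidence backing it."""
    text: str
    evidence: list[str] = field(default_factory=list)


def render(claim: Claim, min_evidence: int = 1) -> str:
    """Return the claim text only when it has enough evidence.

    Otherwise return the fixed phrase, with no softened variant.
    The threshold is an assumption made for illustration.
    """
    if len(claim.evidence) >= min_evidence:
        return claim.text
    return CANNOT_CONFIRM
```

The design choice worth noting is that the fallback is a constant, not a template. There is no room for the output to grow when certainty drops.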

Here’s a stripped-down version of the routing logic you can run yourself:

🗂️ Classify the request first. Task type (write, analyze, decide, code, research), stakes level (low/medium/high), and verification need (light/standard/strict). Do this before anything else. A low-stakes creative writing request and a high-stakes financial analysis request should never enter the same execution path, even if they’re phrased similarly. The classification step is what separates them.
📋 Pick the execution mode. Choose from: Workflow, Tool Safety, Structured System, Production Reliability, or Human Review. Pick the one that actually changes behavior, not just framing. The spec is explicit here: the mode you select must alter what the model does, not just how it describes what it’s doing. Cosmetic mode selection is listed as a failure pattern.
🔀 Route to the right action. If you have a prompt library, pull the best match. If not, generate a Prompt Card on the fly. If the input is rough fragments or underspecified, rewrite the request into something executable first. If it’s already execution-ready, skip straight to running it. The routing step saves time precisely because it prevents you from generating output against a poorly scoped request.
🏁 Define a terminal condition upfront. Decide what “done” looks like before starting. This is harder than it sounds. Most prompts have no exit condition, which is why models keep adding sections, summaries, and caveats after the useful content ends. A terminal condition forces you to define the actual deliverable, and it forces the model to stop the second that deliverable is complete.
🚫 Label every claim. Fact, Inference, or Speculation. Nothing gets to hide behind fluent language and implied certainty. This applies inside the routing layer too. If the model isn’t sure which execution mode fits, it should say so rather than defaulting to the highest-confidence-sounding answer.
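
The five steps above can be sketched in Python as a rough illustration. The category values and mode names come from the list; everything else is an assumption, including the mapping in `pick_mode`, the precedence order in `pick_action`, and all function and class names. In a real setup the model would produce the classification itself. Here it is passed in by hand.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional

TASK_TYPES = {"write", "analyze", "decide", "code", "research"}
STAKES = {"low", "medium", "high"}
VERIFICATION = {"light", "standard", "strict"}


class Mode(Enum):
    WORKFLOW = "Workflow"
    TOOL_SAFETY = "Tool Safety"
    STRUCTURED_SYSTEM = "Structured System"
    PRODUCTION_RELIABILITY = "Production Reliability"
    HUMAN_REVIEW = "Human Review"


class Action(Enum):
    REWRITE_REQUEST = "rewrite request"
    EXECUTE = "execute"
    USE_LIBRARY_PROMPT = "use library prompt"
    GENERATE_PROMPT_CARD = "generate Prompt Card"


class Label(Enum):
    FACT = "Fact"
    INFERENCE = "Inference"
    SPECULATION = "Speculation"


@dataclass(frozen=True)
class Classification:
    task_type: str
    stakes: str
    verification: str

    def __post_init__(self) -> None:
        # Reject values outside the three classification axes.
        if (self.task_type not in TASK_TYPES or self.stakes not in STAKES
                or self.verification not in VERIFICATION):
            raise ValueError(f"invalid classification: {self}")


@dataclass(frozen=True)
class Route:
    classification: Classification
    mode: Mode
    action: Action
    done_when: str  # the terminal condition, fixed before execution


def pick_mode(c: Classification) -> Mode:
    # Assumed mapping for illustration only; the spec's own rules may differ.
    if c.stakes == "high":
        return Mode.HUMAN_REVIEW
    if c.verification == "strict":
        return Mode.PRODUCTION_RELIABILITY
    if c.task_type == "code":
        return Mode.TOOL_SAFETY
    if c.task_type in ("analyze", "decide", "research"):
        return Mode.STRUCTURED_SYSTEM
    return Mode.WORKFLOW


def pick_action(underspecified: bool, execution_ready: bool,
                library_match: Optional[str]) -> Action:
    # Assumed precedence: fix a rough request before anything else.
    if underspecified:
        return Action.REWRITE_REQUEST
    if execution_ready:
        return Action.EXECUTE
    if library_match is not None:
        return Action.USE_LIBRARY_PROMPT
    return Action.GENERATE_PROMPT_CARD


def route(c: Classification, done_when: str, *, underspecified: bool = False,
          execution_ready: bool = False,
          library_match: Optional[str] = None) -> Route:
    # Refuse to route without a terminal condition.
    if not done_when.strip():
        raise ValueError("define a terminal condition before routing")
    action = pick_action(underspecified, execution_ready, library_match)
    return Route(c, pick_mode(c), action, done_when)


def tag(label: Label, text: str) -> str:
    """Prefix a claim with its label so certainty is never implied."""
    return f"[{label.value}] {text}"
```

Under these assumptions, `route(Classification("write", "low", "light"), "one 200-word draft", execution_ready=True)` and a high-stakes analysis request end up on different paths even if the requests are worded alike, which is the point of classifying first.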

Pro tip: The most underrated part of the spec is the failure patterns it explicitly bans. “Prestige prompting” (act as a world-class expert) is listed as a control failure because it substitutes authority framing for actual structural reliability. “Coherence theater” (output that sounds rigorous but has no structural strength under the surface) is another. Grafting those two exclusion rules onto a production system prompt should tighten output quality, and it costs very little. You can do it in two lines: one that bans expert persona framing, one that requires every structural claim to connect to a verifiable prior step. Test it against your current prompts. The gap between what sounds rigorous and what actually is rigorous tends to be wider than expected.
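
One way to try the two-line graft, sketched in Python. The wording of both rules and the regular expression are assumptions written for illustration, not text from the spec, and the pattern only catches the most obvious persona phrasings.

```python
import re

# Hypothetical wording for the two exclusion rules.
EXCLUSION_RULES = (
    "Do not adopt or claim an expert persona.",
    "Every structural claim must point to the verifiable prior step it depends on.",
)

# Assumed pattern: flags phrasings like "act as a world-class ..." or
# "you are an expert ...". Real prompts will need a broader list.
PRESTIGE_PATTERN = re.compile(
    r"\b(act as|you are) (a|an|the)\s+(world[- ]class|expert|leading|renowned|top)\b",
    re.IGNORECASE,
)


def find_prestige_framing(prompt: str) -> list[str]:
    """Return every matched persona phrase found in the prompt."""
    return [m.group(0) for m in PRESTIGE_PATTERN.finditer(prompt)]


def harden(prompt: str) -> str:
    """Append the two exclusion rules to an existing system prompt."""
    return prompt.rstrip() + "\n" + "\n".join(EXCLUSION_RULES)
```

Running `find_prestige_framing` over an existing prompt library is a cheap first pass before appending the rules with `harden`.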

The community response was mostly eye rolls. One comment said “just Google it.” That’s a category error: a search retrieves an answer, while this spec governs how a model decides to answer. Another said this is overengineered for problems that don’t exist in most workflows. That one has more merit, but it also misses the audience. This spec isn’t aimed at people writing one-off prompts. It’s aimed at teams building systems where a single misrouted request produces downstream failures that are expensive to diagnose. For that use case, the overhead of a routing layer pays for itself fast.

🔧 If you’re building anything where reliability matters more than speed, the full spec is worth the read. It’s long. It’s also one of the more honest framings of what prompt engineering actually is: a control discipline, not a copywriting exercise. The value isn’t in the routing logic itself. It’s in what the routing logic forces you to decide before you write a single word of your actual prompt. Most people skip that step entirely and wonder why their outputs are inconsistent.

PEOS Router v2
by u/Alternative-Body-414 in PromptEngineering
