If your AI keeps reaching for words like “clean,” “clear,” or “purify” when describing data operations, you’re dealing with the purity metaphor problem. A Redditor shared a systematic fix using hard pre-chat instructions to engineer that entire vocabulary out of your model’s output.
The Purity Metaphor Problem
Words like “clean data,” “clear output,” and “tidy results” feel intuitive. The original poster noticed they’re actually doing quiet damage in technical contexts. They’re metaphors borrowed from housekeeping, and they introduce ambiguity where precision is needed.
Consider what “clean” actually means in a data context. Depending on who you ask, it might mean removing null values, stripping duplicate rows, normalizing date formats, trimming whitespace, or dropping statistical outliers. That’s five different operations hiding behind a single word. When AI-generated documentation says “clean the dataset before processing,” a new engineer has no way to tell which of those five operations is meant. They’ll guess. Sometimes they’ll guess wrong.
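To make the ambiguity concrete, here is a minimal sketch of the five operations as five separately named functions. The sample rows, the field names, the day-first date assumption, and the median-based outlier rule are all illustrative choices, not anything from the original post.

```python
from datetime import datetime
from statistics import median

# Hypothetical sample rows; every value here is made up for illustration.
rows = [
    {"id": 1, "name": " Ada ", "joined": "2024-01-05", "score": 10.0},
    {"id": 1, "name": " Ada ", "joined": "2024-01-05", "score": 10.0},    # duplicate row
    {"id": 2, "name": "Grace", "joined": "05/02/2024", "score": None},    # null score
    {"id": 3, "name": "Linus", "joined": "2024-03-09", "score": 11.0},
    {"id": 4, "name": "Edsger", "joined": "2024-04-01", "score": 950.0},  # outlier
]

def drop_nulls(rows, field):
    """Remove rows whose `field` is None."""
    return [r for r in rows if r[field] is not None]

def deduplicate(rows):
    """Keep the first occurrence of each fully identical row."""
    seen, out = set(), []
    for r in rows:
        key = tuple(sorted(r.items()))
        if key not in seen:
            seen.add(key)
            out.append(r)
    return out

def normalize_dates(rows, field, formats=("%Y-%m-%d", "%d/%m/%Y")):
    """Rewrite `field` as YYYY-MM-DD. Assumes slash dates are day-first."""
    out = []
    for r in rows:
        for fmt in formats:
            try:
                parsed = datetime.strptime(r[field], fmt)
                break
            except ValueError:
                continue
        else:
            raise ValueError(f"unrecognized date: {r[field]!r}")
        out.append({**r, field: parsed.date().isoformat()})
    return out

def trim_whitespace(rows, field):
    """Strip leading and trailing whitespace from a string field."""
    return [{**r, field: r[field].strip()} for r in rows]

def drop_outliers(rows, field, max_ratio=10.0):
    """Drop rows above max_ratio times the median (one arbitrary rule among many)."""
    m = median(r[field] for r in rows)
    return [r for r in rows if r[field] <= max_ratio * m]

# Each call removes or changes something different, which is the point.
print(len(drop_nulls(rows, "score")), len(deduplicate(rows)))
```

Naming the function names the operation, so “clean the dataset” has to become an explicit sequence of calls.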
The author’s core observation: “split” is more accurate than “separate.” “Reduce” is harder to misinterpret than “simplify.” “State” leaves less room for confusion than “explain.” The LLM defaults to purity language because it’s everywhere in training data. A one-time instruction doesn’t stick. You need a structural fix.
The Prompt System
Here’s the exact configuration the author posted. This goes into your system prompt or pre-chat block:
STYLE:no purity metaphor
HG_STT=1
BAN:/\b(clean(\W|$)|clear(\W|$)|clar\w*|puri\w*|impure|dirty)\b/i
BLOCK:tidy,neat,refine,purify,transparent,crisp
REDIR:stable,cohere,lock,distinct,defined,structured
REWRITE:separate->split; simplify->reduce; explain->state
HIT->REGEN
"clean"→""
What Each Directive Does
There are five layers to this system, each catching what the previous one misses:
- BAN: A case-insensitive regex that flags the banned roots. “Clean,” “clarity,” “purify,” “impure,” and “dirty” are the targets. The \b anchors matter here: they prevent false positives on words like “nuclear.” Note that, as written, clean(\W|$) and clear(\W|$) match only the bare words, so inflected forms such as “cleaned” or “clearly” are not caught, while clar\w* and puri\w* do catch derived forms such as “clarify” and “purity.”
- BLOCK: Soft synonyms carrying the same metaphor. “Tidy,” “crisp,” “transparent,” “neat” are blocked as a list. These are the words that slip through regex because they don’t share the same root, but they carry the same vague housekeeping connotation.
- REDIR: The approved vocabulary the AI should use instead. Stable, cohere, lock, distinct, defined, structured. These aren’t just replacements for style; they’re terms with tighter technical definitions that force more specific thinking about what the output actually is.
- REWRITE: Specific word-level swaps. “Separate” becomes “split.” “Simplify” becomes “reduce.” “Explain” becomes “state.” Each swap removes a degree of interpretive freedom from the output.
- HIT→REGEN: The enforcement layer. If any banned word appears in the output, the whole response regenerates. No partial fixes. This is what separates the system from a style guide that gets ignored under pressure.
The final line, "clean"→"", is the hard suppression fallback: map the word directly to an empty string as a last line of defense.
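To see what the BAN and BLOCK layers would flag if they were run as real checks, here is a small sketch. It translates the posted pattern into Python's `re` syntax (the `/…/i` flag becomes `re.IGNORECASE`); the helper names `ban_hits` and `block_hits` and the sample strings are assumptions made for illustration.

```python
import re

# The BAN pattern as posted, unchanged apart from Python syntax.
BAN = re.compile(
    r"\b(clean(\W|$)|clear(\W|$)|clar\w*|puri\w*|impure|dirty)\b",
    re.IGNORECASE,
)

# The BLOCK list as posted.
BLOCK = {"tidy", "neat", "refine", "purify", "transparent", "crisp"}

def ban_hits(text):
    """Return every substring the BAN regex matches, without padding characters."""
    return [m.group(0).strip() for m in BAN.finditer(text)]

def block_hits(text):
    """Return every word in the text that appears on the BLOCK list."""
    words = re.findall(r"[a-z]+", text.lower())
    return [w for w in words if w in BLOCK]

samples = [
    "clean data",             # flagged: bare word followed by a space and a word
    "cleaned rows",           # not flagged: clean(\W|$) rejects the trailing "ed"
    "nuclear",                # not flagged: no word boundary before "clear"
    "Keep it clean.",         # not flagged: the final \b fails after the period
    "the clarinet is dirty",  # "clarinet" is a false positive of clar\w*
]
for s in samples:
    print(f"{s!r:28} -> {ban_hits(s)}")

print(block_hits("A tidy, crisp summary"))  # ['tidy', 'crisp']
```

Two behaviors stand out. The `(\W|$)` group followed by `\b` means “clean” before punctuation can slip through, and `clar\w*` also matches unrelated words such as “clarinet.” A variant like `\bclean\w*\b` would catch inflections at the cost of more false positives; which trade-off fits is a judgment call.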
Why Structural Beats Instructional
The insight here is that “use precise language” as a prompt instruction is too weak. LLMs are trained on text where purity metaphors appear constantly, and they’ll keep surfacing under pressure (long conversations, complex tasks, context drift).
Context drift is the specific failure mode worth understanding. In a short conversation, an instruction like “avoid vague terms” holds. Push past a certain length and the model starts optimizing for coherence and fluency over the rules you set at the top. It’s not ignoring you. The earlier instructions just carry less weight relative to the immediate context of producing a smooth sentence.
Pattern-based banning works differently. It targets the output, not the intention. The REGEN trigger is what makes it stick: instead of quietly accepting a slipped term, you force a full retry. The expectation is that after a few regenerations within a session, the model adjusts its register and behaves as if certain paths lead to dead ends.
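A system prompt cannot run a regex by itself, so if the check has to be guaranteed at the output level, one option is a thin wrapper outside the model. The sketch below assumes a hypothetical `generate` callable that takes a prompt and returns text; the simplified `BANNED` pattern, the retry limit, and the canned drafts are also assumptions, not part of the posted configuration.

```python
import re
from typing import Callable

# Simplified stand-in for the posted BAN pattern (an assumption of this sketch).
BANNED = re.compile(r"\b(clean|clear|clar\w*|puri\w*|impure|dirty)\b", re.IGNORECASE)

def generate_with_regen(
    generate: Callable[[str], str],
    prompt: str,
    banned: "re.Pattern" = BANNED,
    max_attempts: int = 3,
) -> str:
    """Call `generate` until a draft contains no banned term, or give up."""
    for _ in range(max_attempts):
        draft = generate(prompt)
        if banned.search(draft) is None:
            return draft  # no hit: accept the whole response
        # hit: discard the whole draft and retry (HIT -> REGEN)
    raise RuntimeError(f"no draft passed the vocabulary check in {max_attempts} attempts")

# Stand-in for a real model call: returns canned drafts in order.
drafts = iter([
    "First, clean the dataset.",
    "First, deduplicate the rows, then drop nulls.",
])

def fake_model(prompt: str) -> str:
    return next(drafts)

result = generate_with_regen(fake_model, "Describe the preprocessing step.")
print(result)  # First, deduplicate the rows, then drop nulls.
```

Capping the retries matters: a model that cannot avoid a term would otherwise loop forever, and raising an error makes that failure visible.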
The author links to an ongoing bug report for this pattern, tracking cases where AI systems misuse these metaphors in technical contexts. It’s a living document, worth checking if you’re building prompts for controlled vocabularies.
Use Cases
- Data pipeline documentation: When “cleaning” means five different operations depending on context, precise language prevents misunderstandings downstream. “Deduplicate, then normalize timestamps, then drop nulls” is a spec. “Clean the data” is a wish.
- Code explanation tools: “Clean code” means something different to every developer; “structured,” “decoupled,” or “minimal” are terms a team can define and then check code against.
- Compliance and legal drafting: Any domain with a controlled vocabulary where metaphors create liability. “Transparent reporting” sounds good until someone asks what it means in a regulatory filing.
- Technical writing assistants: Engineering docs where soft language creates ambiguity in specs or requirements. A requirement that says “the output should be clean” will generate a support ticket. A requirement that says “the output must contain no null values and conform to ISO 8601 date format” will not.
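The last requirement in the list above is precise enough to test mechanically, which is the practical payoff of the wording. A minimal sketch, assuming records are plain dictionaries and that “ISO 8601 date format” means a calendar date that `date.fromisoformat` accepts; the function name and sample records are hypothetical.

```python
from datetime import date

def violations(records, date_field):
    """List (index, field, problem) for null values and non-ISO dates."""
    problems = []
    for i, rec in enumerate(records):
        for key, value in rec.items():
            if value is None:
                problems.append((i, key, "null value"))
        raw = rec.get(date_field)
        if raw is not None:
            try:
                date.fromisoformat(raw)
            except (TypeError, ValueError):
                problems.append((i, date_field, "not an ISO 8601 date"))
    return problems

# Hypothetical output records to check.
records = [
    {"id": 1, "created": "2024-01-05"},
    {"id": 2, "created": "05/01/2024"},   # wrong date format
    {"id": None, "created": "2024-02-01"},  # null id
]

for problem in violations(records, "created"):
    print(problem)
```

No equivalent check can be written for “the output should be clean,” because the requirement never says what to test.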
Adapting the Pattern
You don’t need to ban purity metaphors specifically. The structure here generalizes: pick your domain vocabulary problem, build a BAN list of the fuzzy terms, define a REDIR list of precise replacements, set specific REWRITE rules for the most common swaps, and add the REGEN trigger to enforce. The same architecture handles medical language drift, marketing jargon creep, or any domain where you need consistent technical register across a long conversation. A medical writing assistant, for example, might ban “healthy” and “normal” in favor of “within reference range” or “asymptomatic.” A legal drafting tool might block “fair” and “reasonable” in favor of jurisdiction-specific standards language.
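One way to carry the same structure to another domain is to hold the term lists in a small policy object and build the regex from them. This is a sketch only: the `VocabularyPolicy` class, its method names, and the specific medical mapping from “normal” to “within reference range” are assumptions for illustration.

```python
from __future__ import annotations

import re
from dataclasses import dataclass, field

@dataclass
class VocabularyPolicy:
    """Hypothetical container for a domain's BAN list and REWRITE rules."""
    banned: list[str]
    rewrites: dict[str, str] = field(default_factory=dict)

    def pattern(self) -> re.Pattern:
        # Longest terms first, each escaped, wrapped in word boundaries.
        terms = sorted(self.banned, key=len, reverse=True)
        alternation = "|".join(re.escape(t) for t in terms)
        return re.compile(rf"\b(?:{alternation})\b", re.IGNORECASE)

    def hits(self, text: str) -> list[str]:
        """Return banned terms found in the text, lowercased, in order."""
        return [m.group(0).lower() for m in self.pattern().finditer(text)]

    def rewrite(self, text: str) -> str:
        """Apply each whole-word swap from the rewrite rules."""
        for old, new in self.rewrites.items():
            text = re.sub(rf"\b{re.escape(old)}\b", new, text, flags=re.IGNORECASE)
        return text

# Illustrative medical policy; the mapping is an assumption, not clinical guidance.
medical = VocabularyPolicy(
    banned=["healthy", "normal"],
    rewrites={"normal": "within reference range"},
)

print(medical.hits("Potassium is normal; the patient appears healthy."))
print(medical.hits("abnormal rhythm"))  # [] because \b blocks the partial match
print(medical.rewrite("Potassium is normal."))
```

Generating the pattern from a list keeps the policy readable and avoids hand-editing a regex each time a term is added.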
The Redditor is still refining this system and welcomes tips from the community. If you’ve tackled vocabulary drift in your own prompt engineering work, the original thread in r/PromptEngineering is the place to share what’s worked.
Original post: “Misuse of purity metaphor, how’s it going? Using a lot of hard pre-chat to try to stop misuse of words like ‘clean’ and ‘clear’ for data, my latest efforts! Any tips welcome.” by u/decofan in r/PromptEngineering