A new open-source framework just shipped that reportedly drops AI detection scores by 55% without touching temperature. The twist lands at step 2: it’s not rewriting your output. It’s poisoning the model’s working memory before a single token of the visible answer is generated. If you’ve been trying to solve the detection problem at the output layer with paraphrasing tools, temperature tweaks, or post-processing filters, you’ve been aiming at the wrong place. GhostCoT goes upstream, altering the statistical signature before the visible text even exists.
This is GhostCoT, built by u/Confident-Edge-9306 and posted to r/PromptEngineering. Detectors like Fast-DetectGPT look for smooth, high-probability token sequences. Human writing is messier. When a person types, they second-guess word choices, restart sentences mid-thought, and pull from associations that don’t follow clean probability gradients. AI models, left to their defaults, tend to produce text that clusters in the highest-probability zones of the distribution, which is exactly the pattern detection algorithms are designed to catch. GhostCoT introduces that mess upstream of the visible output rather than patching it afterward, by leaning on how the Transformer KV Cache works. The KV Cache stores the key and value vectors computed for earlier tokens, and every later token attends to them. Fill that context with the right kind of statistical noise, and the rest of the generation carries that noise forward.
Here’s the mini-workflow:
- 🧠 Wrap your prompt with a <thought_process> block that runs before the final output. This isn’t a scratchpad for your own notes. It’s a structured internal monologue that forces the model through deliberate reasoning steps, including counterarguments, false starts, and hedged assumptions, before it commits to any tokens of the final answer. The goal is to warm the KV Cache with uncertain, exploratory content rather than confident generation patterns.
- That block fills the KV Cache with low-probability “blueprint” tokens, and this is where the detector gets confused. Normal AI generation warms the cache with high-probability context that signals machine confidence. GhostCoT warms it with the kind of rougher, exploratory context a human thinker produces instead. When the actual output starts generating, it’s conditioned on a fundamentally different probability landscape. The detector reads the token probabilities, expects a smooth distribution, finds a rougher one, and can’t reliably classify the text as machine-generated.
- 🔀 Add Compute Shifting directives that separate strategy (cliché detection) from execution (writing) to keep the model from defaulting to safe patterns. In practice this means instructing the model to first identify the predictable phrases or transitions it would naturally reach for, flag them, and then consciously route around them during the actual write. Splitting metacognition from generation produces more varied syntax and lower curvature scores without making the content harder to read.
- Apply Subject Erasure and Syntactic Shredding to inject the statistical noise human writing naturally carries. Subject Erasure means occasionally dropping the grammatical subject in ways that still read naturally, mirroring how people write when they’re in a flow state. Syntactic Shredding breaks the predictable subject-verb-object templates that models over-rely on and replaces them with varied constructions. Neither technique changes your meaning. Both change the measurable fingerprint.
- ✂️ Use the Sliding Window chunking strategy on anything longer than a few paragraphs. The window moves through your document in overlapping segments, applying the full GhostCoT template to each chunk independently. This keeps KV Cache influence fresh across the entire document length rather than letting it fade as token distance from the original injection point grows.
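The overlapping-segment idea in the last step can be sketched as a plain paragraph chunker. This is a minimal illustration, not the repo’s chunking logic: the function name, the window of three paragraphs, and the one-paragraph overlap are all assumptions, and the published template may split on tokens or sentences instead.

```python
def sliding_window_chunks(paragraphs, window=3, overlap=1):
    """Split a list of paragraphs into overlapping windows.

    window and overlap are hypothetical defaults; tune them to your document.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    if not 0 <= overlap < window:
        raise ValueError("overlap must be >= 0 and smaller than window")
    step = window - overlap
    chunks = []
    start = 0
    while start < len(paragraphs):
        chunks.append(paragraphs[start:start + window])
        if start + window >= len(paragraphs):
            break  # the last window already reaches the end of the document
        start += step
    return chunks


doc = ["p1", "p2", "p3", "p4", "p5", "p6", "p7"]
chunks = sliding_window_chunks(doc)
# → [['p1', 'p2', 'p3'], ['p3', 'p4', 'p5'], ['p5', 'p6', 'p7']]
```

Each chunk shares one paragraph with its neighbor, so whatever is applied per chunk sees a little of the preceding context instead of starting cold.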
Benchmarks from 10-chunk adversarial testing, as reported by the author: AI probability dropped from 0.9546 to 0.4284. Curvature fell 68%. Logic held at 0.5482 LCS similarity. To put those numbers in context: if you treat 0.5 as the cutoff for an AI-generated flag, going from 0.9546 to 0.4284 (a 55% relative drop) puts the output just under the line. The 68% curvature reduction matters because conditional probability curvature measures how much more likely the written tokens are than the alternatives the scoring model would expect at each position, and that gap is the core signal Fast-DetectGPT uses. An LCS similarity of 0.5482 means roughly half of the original sequence survives in order, depending on how the score is normalized. That suggests the overall structure carried over, though on its own it doesn’t verify that every fact did. The detector fingerprint got scrambled while much of the content stayed recognizable.
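The arithmetic behind those figures is easy to check. The sketch below recomputes the relative drop from the two reported probabilities and shows one plausible way an LCS similarity could be computed. The word-level tokenization and the normalization by the longer text are assumptions; the post doesn’t say how the 0.5482 figure was derived.

```python
def relative_drop(before, after):
    """Fractional decrease from before to after."""
    return (before - after) / before


def lcs_length(a, b):
    """Length of the longest common subsequence of two sequences."""
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                cur.append(prev[j - 1] + 1)
            else:
                cur.append(max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def lcs_similarity(original, rewritten):
    """Word-level LCS length divided by the longer text's word count (assumed)."""
    a, b = original.split(), rewritten.split()
    if not a or not b:
        return 0.0
    return lcs_length(a, b) / max(len(a), len(b))


drop = relative_drop(0.9546, 0.4284)  # about 0.55, matching the headline 55%
example = lcs_similarity("the cat sat on the mat", "the cat lay on a mat")
# example is 4/6: "the", "cat", "on", "mat" survive in order
```

A score near 0.55 under this kind of normalization means a lot of wording changed, so it’s worth reading the rewritten text against the original rather than trusting the number alone.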
Pro tip: Don’t skip the Sliding Window on long documents. Without per-chunk application, KV Cache influence fades by the third or fourth paragraph and detection scores creep back up. The chunking strategy is what makes this stable at scale. One more thing worth knowing: the <thought_process> block works better when it includes genuine friction, not just chain-of-thought reasoning. Ask the model to argue against its own first draft, identify where it sounds formulaic, and revise its approach before writing. Performative reasoning doesn’t corrupt the cache the same way adversarial reasoning does. The difference shows up clearly in the curvature scores.
Full prompt template and chunking logic are open-sourced on GitHub. Worth pulling apart if you’re building anything that needs to stay under the detection radar. The framework is modular, so you can implement the KV Cache poisoning step alone if the full pipeline is more than your use case needs. Start with step 2. That’s where most of the lift comes from. 🚀
Source: “GhostCoT: Bypassing Fast-DetectGPT via KV Cache Pollution and Implicit CoT” by u/Confident-Edge-9306 in r/PromptEngineering