Why AI Ideation Keeps Landing in the Same Boring Cluster (And the Prompt Structure That Breaks You Out)

Tell an LLM to “be creative” and you get the same cluster of ideas with slightly different labels. Inject a distant domain’s mechanism into your prompt and you get ideas that actually surprise you. Same model. Completely different results.

A researcher on Reddit posted findings from testing this across 12 real ideation projects and roughly 23,000 generated outputs. In those tests, the prompt structure outperformed every standard alternative on originality, with no detected penalty on usefulness. Here’s the breakdown.

The Real Problem With Standard Prompts

There are two prompts most people use for AI ideation.

The basic version: “[Your brief]. Generate 10 ideas.”

The “advanced” version: “[Your brief]. Be creative, combine concepts from distant fields. Generate 10 ideas.”

The second one feels smarter. Testing shows it produces roughly the same ideas. The reason is subtle but important.

LLMs have a gravitational pull toward the center of their training distribution. Every response lands somewhere between what your prompt requests and what the model considers statistically normal. Adding more context about your problem fights that pull, but it only steers the model toward the frame you already have. You escape the generic and land in the familiar.

Telling the model to “be original” doesn’t inject anything new into the idea space. It just adds a weak instruction competing against a much stronger prior. The prior wins every time.

🧲 What Domain Injection Actually Does

The fix isn’t better instructions. It’s changing what conceptual space the model is operating in. Here’s the structure:

[Your brief]

DOMAIN INJECTION:
Domain 1: [Specialist persona] + [counter-intuitive mechanism from that domain] + [bridging question toward your problem]
Domain 2: [same structure]
Domain 3: [same structure]

For each domain, construct the bridge between the brief and that domain’s mechanism, then generate 10 ideas that apply that mechanism to the brief.

Run each domain in a separate context window. Curate across all outputs afterward.
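If you plan to script this, the structure above could be assembled with a small helper like the one below. This is a minimal sketch, not the author's code: the `Domain` class and `build_prompt` function are hypothetical names, the field labels are borrowed from the parasitology example later in this article, and the builder emits one domain per prompt on the assumption that each domain runs in its own context window.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Domain:
    """One injected domain (hypothetical container, not from the original post)."""
    persona: str    # specialist persona
    mechanism: str  # counter-intuitive mechanism from that domain
    bridge: str     # bridging question toward your problem


def build_prompt(brief: str, domain: Domain, n_ideas: int = 10) -> str:
    """Assemble a single-domain injection prompt as plain text."""
    return (
        f"{brief}\n\n"
        "DOMAIN INJECTION:\n"
        f"Persona: {domain.persona}\n"
        f"Key mechanism: {domain.mechanism}\n"
        f"Bridge question: {domain.bridge}\n\n"
        "Construct the bridge between the brief and this domain's mechanism, "
        f"then generate {n_ideas} ideas that apply that mechanism to the brief."
    )


if __name__ == "__main__":
    parasitology = Domain(
        persona="A parasitologist who studies how organisms hijack host behavior",
        mechanism="the host's decision-making is redirected without conscious awareness",
        bridge="what if Discover Weekly served the music ecosystem's health?",
    )
    print(build_prompt("Redesign Spotify's Discover Weekly.", parasitology))
```

Keeping the three parts as separate fields makes it easy to swap one domain for another while the brief and the closing instruction stay fixed.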

The logic comes from Koestler’s bisociation: distant domains sometimes share hidden causal structures with your problem. You’re not telling the model to try harder. You’re forcing it into a structurally different idea space entirely.

A Real Example Worth Looking At

Brief: Redesign Spotify’s Discover Weekly to break users out of their taste bubble.

Standard prompt output: add a diversity slider, show music from adjacent genres, let users set a novelty preference. Fine. Obvious. Expected.

Now inject parasitology as the domain:

“A parasitologist who studies how organisms hijack host behavior for their own reproductive benefit. Key mechanism: the host’s decision-making is redirected without conscious awareness, serving the parasite’s goals. Bridge question: what if Discover Weekly served the music ecosystem’s health rather than the user’s stated preferences?”

Ideas that come out of that collision:

  • A “host override” mode that temporarily removes the user’s listening history from the algorithm entirely
  • Recommendations driven by what’s statistically underplayed relative to quality, not what matches your taste profile

Same problem. Completely different solution space.

📊 What the Data Shows

The test covered four conditions across roughly 23,000 outputs:

  • A, domain injection (the structure above)
  • B, bare baseline prompt
  • C, “be original, combine distant concepts” instruction
  • D, longer in-domain brief, length-matched to A

Domain injection escaped the baseline idea cluster on all 12 projects (p = 0.0002). The creativity instruction and the longer brief barely moved the outputs away from that cluster. It’s not the extra instructions or extra tokens doing the work. It’s the structural distance of the injected content.
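The post says the comparison used embedding distance but does not publish the exact metric, so the following is only one plausible way to quantify “escaping the baseline cluster”. It assumes you already have embedding vectors for each idea (from any embedding model you like) and measures the mean cosine distance of a condition’s ideas from the centroid of the baseline ideas. All function names are hypothetical, and the vectors in the demo are toy values, not real data.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def centroid(vectors: List[Vector]) -> List[float]:
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def cosine_distance(a: Vector, b: Vector) -> float:
    """1 - cosine similarity; raises ValueError on a zero-length vector."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0.0:
        raise ValueError("cosine distance is undefined for zero vectors")
    return 1.0 - dot / norm


def mean_distance_from_baseline(baseline: List[Vector], candidate: List[Vector]) -> float:
    """Average cosine distance of candidate ideas from the baseline centroid."""
    center = centroid(baseline)
    return sum(cosine_distance(v, center) for v in candidate) / len(candidate)


if __name__ == "__main__":
    # Toy 2-D "embeddings", purely illustrative.
    baseline = [[1.0, 0.0], [0.9, 0.1]]
    injected = [[0.1, 1.0], [0.0, 0.8]]
    print(mean_distance_from_baseline(baseline, baseline))
    print(mean_distance_from_baseline(baseline, injected))
```

A larger value for one condition than for the baseline’s own ideas would suggest that condition’s outputs sit further from the default cluster, though whether this matches the author’s measurement is unknown.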

Three independent LLM judges rated originality blind. Domain injection won about 2 out of 3 comparisons against every baseline, with no detected penalty on usability.
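To make “won about 2 out of 3 comparisons” concrete, here is a small sketch of how a pairwise win rate could be tallied. The post does not say how the three judges’ ratings were combined, so majority voting is an assumption here, the function names are hypothetical, and the votes in the demo are made up for illustration.

```python
from collections import Counter
from typing import List


def majority_winner(votes: List[str]) -> str:
    """Return the label most judges picked for one head-to-head comparison.

    Assumes an odd number of judges and two labels, so there is no tie.
    """
    return Counter(votes).most_common(1)[0][0]


def win_rate(comparisons: List[List[str]], condition: str) -> float:
    """Fraction of comparisons in which `condition` takes the majority vote."""
    wins = sum(1 for votes in comparisons if majority_winner(votes) == condition)
    return wins / len(comparisons)


if __name__ == "__main__":
    # Made-up votes: each inner list holds three judges' picks between
    # condition "A" (domain injection) and condition "B" (bare baseline).
    votes = [["A", "A", "B"], ["B", "B", "A"], ["A", "A", "A"]]
    print(win_rate(votes, "A"))
```

Under this scheme a win rate near 0.67 for condition A would correspond to the “2 out of 3” figure reported above.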

🛠️ How to Run This Yourself

  1. Write a clear brief: what problem are you solving, what makes an idea actually good for this specific context
  2. Generate 5 to 8 distant domains: ask an LLM “Give me 8 domains with no obvious connection to [your topic], each with a specialist mechanism and a bridging question toward [your topic]”
  3. Run one call per domain: fresh context window for each, no cross-contamination between domains
  4. Curate across all outputs: expect mostly noise, a few genuinely non-trivial ideas mixed in

The volume is intentional. Most domain collisions produce nothing useful. A small percentage produce ideas you’d never reach otherwise. You need enough collisions to surface those rare hits.
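The steps above can be sketched as a small loop. This is an illustration under stated assumptions, not the author’s pipeline: `generate` is a hypothetical callable that wraps whatever LLM client you use and starts a new conversation on every call, each domain block is a string you have already written (persona, mechanism, bridging question), and `parse_ideas` assumes the model returns one idea per line.

```python
import re
from typing import Callable, Dict, List

# Hypothetical signature: one prompt in, the model's text reply out.
Generate = Callable[[str], str]

# Matches a leading list marker such as "1.", "2)", "-", "*" or "•".
_MARKER = re.compile(r"^\s*(?:\d+[.)]|[-•*])\s*")


def parse_ideas(reply: str) -> List[str]:
    """Split a reply into ideas, assuming one idea per non-empty line."""
    ideas = []
    for line in reply.splitlines():
        cleaned = _MARKER.sub("", line).strip()
        if cleaned:
            ideas.append(cleaned)
    return ideas


def run_domains(
    brief: str, domain_blocks: Dict[str, str], generate: Generate
) -> Dict[str, List[str]]:
    """Make one independent call per domain and collect the parsed ideas."""
    results: Dict[str, List[str]] = {}
    for name, block in domain_blocks.items():
        prompt = (
            f"{brief}\n\nDOMAIN INJECTION:\n{block}\n\n"
            "Construct the bridge between the brief and this domain's mechanism, "
            "then generate 10 ideas that apply that mechanism to the brief."
        )
        # Each prompt contains only its own domain, so nothing leaks across calls.
        results[name] = parse_ideas(generate(prompt))
    return results


if __name__ == "__main__":
    def stub_generate(prompt: str) -> str:
        """Stand-in for a real model call, so the sketch runs offline."""
        return "1. First idea\n2. Second idea"

    blocks = {"parasitology": "Persona: ... Key mechanism: ... Bridge question: ..."}
    print(run_domains("Your brief here.", blocks, stub_generate))
```

Curation stays manual in this sketch: the returned dictionary keeps ideas grouped by domain so you can scan each cluster and pull out the few worth keeping.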

The Takeaway

Prompting an AI to “think outside the box” is like telling your GPS to find a more creative route. The underlying system doesn’t change. What changes the result is handing it a completely different map.

Try this on your next ideation session. Start with just one domain injection: one unfamiliar field, a specialist mechanism, and a bridging question toward your problem. The difference from your baseline prompt should show up within the first 10 ideas.

Frequently Asked Questions

Q: Which LLMs work best with this technique?

You can use pretty much any LLM for this; the principle applies universally. That said, stronger models with better instruction-following (like Claude Sonnet or GPT-4) tend to make better cross-domain connections. Start with what you’ve got; if the ideas feel scattered, upgrading might help.

Q: Can I customize this method for my problem?

Absolutely. One commenter created their own variation by adding structured questions and a behavioral psychology angle. The core idea, forcing collision between distant domains, is flexible. Just customize the role, domains, and constraints for your problem; the goal is keeping the LLM out of generic territory.

Q: Do I need separate context windows for each domain?

The post recommends it to prevent domains from blending together, but that burns through tokens. You can run them sequentially in one conversation instead. You’ll get more synthesis between domains, which works well for unified thinking but might muddy things if you want distinct idea clusters. Try both and see what fits.

Q: What evidence backs up these claims?

The post cites testing across roughly 23,000 outputs using embedding distance, but the actual methodology isn’t published yet, and commenters have asked for it. The underlying idea (that LLMs default toward statistical averages) makes sense, but if you need the full research, you’d probably have to reach out to the author.

How to consistently get non-trivial ideas from LLMs — a prompt structure that actually works (tested on 23k outputs)
by u/Dry-Writing-2811 in ChatGPTPromptGenius
