Context scored 8. A polished framework scored 4. Same task, same model, real data.

A newsletter writer ran five prompts against the same go-to-market task on Claude Sonnet and scored each output on specificity, actionability, and strategic quality. The numbers were not close.

Structured framework with role assignment and chain-of-thought: specificity score of 4. Same task with messy, unedited context and zero framework: score of 8. The jump happened by adding information and removing structure. That is the kind of result that should change how you think about every strategic prompt you write.

What the experiment looked like

The task was a GTM memo for a fictional B2B SaaS tool: project management, targeting agencies, $49/user/month, $300k ARR, series A. Same model, same day, five separate conversations. Here’s the scorecard (specificity / actionability / strategy):

  • Prompt 1: basic request: 3 / 3 / 4
  • Prompt 2: role + chain-of-thought + format instructions: 4 / 5 / 5
  • Prompt 3: heavy context, no framework: 8 / 8 / 7
  • Prompt 4: heavy context + framework: 8 / 8 / 8
  • Prompt 5: voice-dictated, rambling, 400 words: 9 / 8 / 9

The “heavy context” in Prompts 3 and 4 wasn’t structured or clean. It included current sales velocity, which channels had been tried and what happened, average deal size, what the founding team’s background was, which customer segments were converting and which weren’t, and why the $49 price point was chosen. That kind of messy operational detail is what moved the scores. The basic request in Prompt 1 gave the model a category and a price. The heavy context gave it a company with a history.

The gap between Prompt 2 and Prompt 3 is the whole story. Adding the framework back on top of raw context (Prompt 4) barely moved anything. The framework was not the variable that mattered.

Why voice dictation beat everything

Prompt 5 was a transcription of the author talking through the situation out loud using an AI voice tool. Rambling. Some repetition. No cleanup. It picked up one detail none of the typed prompts had: the founder has a strong LinkedIn presence in the agency space.

The model used it. It recommended building the founder’s personal brand as a growth channel. That suggestion didn’t appear in any other output because the author didn’t think to type that detail when writing. It only surfaced because talking out loud doesn’t give your brain time to filter for what seems “relevant.” You just say everything.

Speaking is less filtered than typing. You include things you’d normally self-edit out. You repeat yourself, circle back, contradict, then correct. That texture is not noise to the model. It is signal. The repetition highlights what you actually care about. The contradictions reveal real constraints. The corrections show where the situation is genuinely uncertain. A model doing strategic reasoning wants all of that.

3 ways to apply this

🔹 Brain-dump before you format. Write your context the way you’d explain the situation to someone smart who knows nothing about it. Don’t organize. Don’t trim. Information you cut is information the model never had. If you find yourself deleting a sentence because it feels obvious or tangential, put it back. The model will decide what’s relevant. You’ve already proven you’re bad at that judgment.

🔹 Include history, not just specs. The experiment included what the company had already tried, what worked, what didn’t. That history is context the model cannot infer from the current situation alone. Give it. “We tried cold outbound and got a 0.4% reply rate” is worth ten times more than “we are targeting agencies.” One tells the model the landscape. The other tells it the terrain you’ve actually walked.

🔹 Add frameworks after, not instead. Prompt 4 outscored Prompt 3 by a point. Structure helps at the margins. It just doesn’t replace substance, and the experiment proves the margin is small. Use frameworks to shape the output format once your context is already rich. Don’t use them as a substitute for doing the harder work of actually dumping what you know.

Tips and pitfalls

Use: Real numbers, actual constraints, past attempts, what your best customers look like and why. Generic input produces generic output. The model works with what you give it. If your prompt could describe any company in your category, the output will sound like advice for any company in your category. Make it impossible to mistake your situation for someone else’s.

Skip: Elaborate prompt engineering when your context is thin. Role assignment in the experiment made output more verbose, not more specific. The role didn’t know anything the prompt didn’t tell it. Telling the model to act like a senior GTM strategist does not give it your data. It just changes the tone of the generic answer.

Watch for: The compression habit. Typing naturally makes you summarize. The details you cut to keep it clean are often exactly the details that would have made the output useful. This is especially true for context that feels embarrassing to include: “we tried this and it flopped” or “we honestly don’t know why this cohort churned.” Those admissions are exactly what a good advisor needs to give you real guidance rather than textbook advice.

Try this today

Before your next strategic prompt, spend two minutes talking through the situation out loud. Record it, transcribe it, paste it in raw without cleanup. Then ask for whatever you were going to ask for. Don’t fix the grammar. Don’t remove the parts where you repeated yourself. Just send it.

Compare that output to the last polished prompt you wrote for a similar task. Look specifically at whether the recommendations are things only you could act on, or generic moves that any company in your space has already heard. That’s the test.

The messy version will probably beat the polished one.

Frequently Asked Questions

Q: Does context really matter more than having a specific framework or role?

Yep. Prompt 3 (heavy context, no special framework) outperformed prompt 2 (structured framework, light context) across all three scoring dimensions. The takeaway: models need real, specific details about your situation, constraints, what you’ve actually tried, your real numbers, to produce useful output. Frameworks help organize the answer, but the substance comes from context.

Q: How do you scale this if you can’t paste 300 words into every chat?

Fair question. Some users are solving it by building reusable context bundles that travel across sessions (like seed.show’s approach) instead of copy-pasting manually each time. The real lever is persistent context via system messages, keeps you from re-entering the same background info over and over.

Q: Does dictation prompting actually work better than typing?

Users report yes. Dictation preserves natural detail and texture that typed prompts tend to compress and self-edit away. That “founder LinkedIn detail” mentioned in the comments? Exactly the kind of thing you’d naturally include while speaking but skip while typing. If you’re getting generic output, try dictating to an AI transcription service.

Q: Do these prompting approaches work the same across different AI models?

Not necessarily. Persona + chain-of-thought works great for Claude but can produce wordy output in GPT-4o, which prefers structured constraints instead. Persona prompts also work better for creative tasks (like GTM strategy) but structured formats tend to win for technical work. Test your approach on your actual model and task type rather than assuming one method works everywhere.

I gave the same task to claude with 5 different prompting approaches and tracked the results. sharing my data.
by u/Signal-Extreme-6615 in PromptEngineering

Scroll to Top