Politeness is burning your Claude credits. Not a little. A lot.
Most people using Claude right now are paying a hidden politeness tax on every single prompt. It adds up fast. If you’re running Claude through an API, that tax hits your invoice directly. If you’re on a subscription plan hitting usage limits, that tax is showing up as slowdowns and throttling. Either way, you are trading real value for nothing. You are paying to be polite to a language model that does not notice and does not care.
A developer on r/PromptEngineering stumbled onto something most users never figure out: the caveman theory of prompting. Short prompts. No pleasantries. No filler. Just the information Claude actually needs to do the job. Same output quality. A fraction of the tokens. The post blew up because it named something people had been feeling but couldn’t articulate. The model doesn’t reward effort. It rewards precision. Those are not the same thing, and confusing them is expensive.
The test case was brutal in its simplicity. Two prompts for the same bug fix:
Before: “hey Claude, i hope this makes sense but i’ve been working on this project and i’m running into an issue with the function on line 47, it keeps throwing a null error whenever i try to pass in a user object that might be undefined, not sure if this is a me problem or a code problem but any help would be great” (57 words)
After: “line 47. null error. fix.” (4 words)
Same result. Claude doesn’t have mornings. It doesn’t need the backstory of the what. It just needs the what. The model processes both prompts through the same attention mechanism. The polite version just costs more compute to arrive at the same answer, and that compute is yours.
🎯 Three Things the Caveman Framework Gets Right
- ⚡ Kill the ceremony. No good mornings, no apologies for asking a “weird question,” no closing thanks. Every “sorry if this is strange” is pure credit waste before you’ve said anything useful. You’re paying per token to say thank you to software. Stop. This sounds obvious until you actually audit your last 20 prompts and count how many started with some version of “I was wondering if maybe you could help me with…” That’s a whole sentence of tokens spent on throat-clearing. Nothing in that sentence is input. It’s all noise. Cut it completely and watch your prompt get sharper immediately.
- 🔧 Verbs and symbols over sentences. “Summarise.” “Fix.” “Rewrite shorter.” Instead of “can you compare option A versus option B,” just type “A vs B?” Claude knows exactly what that means. Symbols aren’t lazy, they’re efficient. Treat Claude like a search engine that thinks. A search engine doesn’t need you to explain why you’re searching. It needs the search term. The same logic applies here. “Pros/cons: React vs Vue for mobile” outperforms “I’m trying to decide between React and Vue for a mobile project and I’d love it if you could walk me through the main tradeoffs.” Both get you the same table. One costs about 15% of the tokens the other does.
- 💡 Know the one real exception. Complex creative work with a specific voice, nuanced emotional context, anything where vague input produces vague output. Caveman theory breaks there. A four-word prompt for a brand manifesto will get you something generic. A four-word prompt to write a eulogy will miss everything that matters. But that covers maybe 30% of what most people use Claude for. The other 70%? Pure ceremony that costs you nothing to cut. Debugging, summarising, reformatting, translating, extracting data, generating boilerplate, answering direct questions. All of that is caveman territory. All of it rewards brevity over warmth.
The uncomfortable part: the product doesn’t benefit from you being efficient. Nobody told you this when you signed up. You either figured it out or you didn’t. There’s no onboarding screen that says “by the way, every filler word costs you money.” That information lives in developer forums and Reddit threads, not in the interface where most people are spending their credits. The default behavior the product nudges you toward, the chat-like interface that looks like a conversation with a person, is also the expensive behavior. That’s not an accident. It’s just not your problem to solve for them.
💬 Try This Today
Next time you catch yourself typing “I hope this makes sense, but…” stop. Delete everything except the actual ask. Then cut half of what’s left. What remains is your real prompt. Run it. You’ll be surprised how little Claude was ever using the rest. Do this for a week across your regular workflow and track how your usage changes. Most people who run this experiment come back with the same report: output quality stays flat, token count drops by 40 to 60 percent, and the habit of verbose prompting starts to feel genuinely wasteful once you see the numbers. The caveman was onto something. Speak only when you have something to say. Claude will handle the rest.
Frequently Asked Questions
Q: Will caveman prompting make Claude’s replies shorter too?
Terse prompts mainly save input tokens by cutting fluff from your side. If you want shorter responses from Claude, you’ll need to explicitly ask for it (e.g., “answer in 2 sentences”). That said, terse prompts sometimes do trigger more direct answers , worth testing on your own workflows to see the effect.
Q: Why do I feel weird being terse to Claude if it saves tokens?
It’s just human nature. Developers constantly catch themselves writing “sorry” and “thanks” despite knowing it wastes tokens. There’s a deep instinct to be polite that overrides the rational “this costs money” part of your brain. Once you notice the habit, it gets easier to kill those words before sending.
Q: How much does caveman prompting actually save? Is there proof?
The OP’s example dropped from 57 words to 4 words for the same output , a 93% reduction in prompt length. Real savings depend on your tasks. Check your API dashboard to compare token usage before and after adopting this style, or use the caveman repo tools mentioned in comments to track automatically.
Q: Does it matter if I ask as a question vs give a command?
Small but real difference. Question marks (“can you fix this?”) put Claude in conversational mode; commands (“fix this”) feel more direct and sometimes trigger tighter responses. It’s a minor tweak, but worth experimenting with if you’re optimizing token usage.
i started talking to Claude like a caveman. my credits lasted 3x longer. i’m not joking.
by u/LoadOld2629 in PromptEngineering