47% fewer tokens. Better accuracy. Same data. That’s not a rounding error; it’s a format problem.
If you’re passing structured data to LLMs and still reaching for JSON, you’re probably burning half your token budget on curly braces, repeated keys, and quotation marks. A benchmark just dropped that makes this hard to ignore. The researcher behind it ran 1,170 LLM calls across 195 data retrieval questions, comparing three formats head to head: JSON (standard pretty-printed), LEAN (a compact tabular encoding), and YAML. The results are clear.
LEAN used 44% fewer tokens overall and 47% fewer tokens per LLM call compared to JSON, while scoring 87.9% accuracy against JSON’s 86.2%. YAML sat in the middle: 21% smaller than JSON, 87.4% accuracy.
What the Numbers Actually Mean
The methodology was tight. No LLM judge, no fuzzy scoring. Answers were checked with deterministic, type-aware string and number matching across 195 questions. Two models were tested: GPT-4o-mini and Claude Haiku. Both showed the same pattern, which matters. This isn’t a quirk of one model’s training data. When two models from different vendors converge on the same result, you’re more likely looking at something structural than accidental.
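To make "deterministic, type-aware matching" concrete, here is a minimal sketch of what such a scorer could look like. It is not the benchmark’s actual scoring code; the function names and the normalization rules (ignoring case, whitespace, and thousands separators) are assumptions chosen for illustration.

```python
import re


def _as_number(text):
    """Return text as a float if it looks like a plain number, else None."""
    cleaned = text.strip().replace(",", "")  # assumption: commas are thousands separators
    if re.fullmatch(r"[-+]?\d+(\.\d+)?", cleaned):
        return float(cleaned)
    return None


def is_correct(expected, answer):
    """Compare a model's answer string to the expected value, by type."""
    answer = answer.strip()
    # bool must be checked before int, because bool is a subclass of int in Python
    if isinstance(expected, bool):
        return answer.lower() == str(expected).lower()
    if isinstance(expected, (int, float)):
        number = _as_number(answer)
        return number is not None and abs(number - expected) < 1e-9
    # Strings: ignore case and surrounding whitespace
    return answer.casefold() == str(expected).strip().casefold()


print(is_correct(92000, "92,000"))
print(is_correct("Paul Garcia", " paul garcia "))
print(is_correct(19, "nineteen"))
```

Because every comparison is a plain equality check, the same answer always gets the same score, which is what separates this approach from an LLM judge.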
On flat tabular data, the gap widens further. Uniform employee records: LEAN used 61.6% fewer tokens than JSON. Time-series analytics: 59.5% fewer. GitHub repos: 46.2% fewer. The more uniform your data structure, the harder LEAN dominates.
Here’s how LEAN works. Take 100 employee records. JSON handles that in 6,150 tokens. LEAN does it in 2,361, which is the 61.6% reduction quoted above. Instead of repeating every field name on every row, LEAN declares the column names once in a header, then encodes rows as pipe-delimited values. No braces. No repeated keys. No quotes. Just data. The model reads the column context once and applies it across every row, which is closer to how people read a spreadsheet. That plausibly contributes to the accuracy edge, not just the token savings.
employees:
#(active|department|email|id|name|salary|yearsExperience)
true|Engineering|paul.garcia@company.com|1|Paul Garcia|92000|19
^false|Finance|aaron.davis@company.com|2|Aaron Davis|149000|18
YAML removes braces and quotes but still repeats every key on every row, which is why it lands in the middle of the efficiency rankings rather than at the top.
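The header-plus-rows idea can be sketched in a few lines of Python. This is a LEAN-style encoder modelled only on the sample above, not the benchmark author’s implementation: the function names are hypothetical, the alphabetical column order is inferred from the sample header, and the sketch neither reproduces the `^` prefix seen on the second sample row nor handles escaping, because the article does not explain either.

```python
def _format_value(value):
    """Render one cell; booleans are lowercased as in the sample."""
    if isinstance(value, bool):
        return "true" if value else "false"
    text = str(value)
    if "|" in text or "\n" in text:
        # Escaping rules are unknown here, so refuse rather than guess
        raise ValueError("escaping is not handled in this sketch")
    return text


def encode_table(name, records):
    """Encode a list of flat dicts as one header line plus pipe-delimited rows."""
    if not records:
        return f"{name}:"
    columns = sorted(records[0])  # assumption: alphabetical order, as in the sample
    lines = [f"{name}:", "#(" + "|".join(columns) + ")"]
    for record in records:
        if sorted(record) != columns:
            raise ValueError("all records must share the same fields")
        lines.append("|".join(_format_value(record[c]) for c in columns))
    return "\n".join(lines)


employees = [
    {"id": 1, "name": "Paul Garcia", "email": "paul.garcia@company.com",
     "department": "Engineering", "salary": 92000, "yearsExperience": 19,
     "active": True},
    {"id": 2, "name": "Aaron Davis", "email": "aaron.davis@company.com",
     "department": "Finance", "salary": 149000, "yearsExperience": 18,
     "active": False},
]
print(encode_table("employees", employees))
```

Each field name appears once in the header instead of once per record, which is where the savings over JSON and YAML come from.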
3 Ways to Put This to Work
- 🗃 RAG pipelines with tabular context. If you’re retrieving product records, employee data, or analytics tables and injecting them into prompts, switching to LEAN cuts the tokens that data occupies nearly in half, with no accuracy cost in this benchmark. At scale, that’s a direct reduction in API bills. For teams running RAG on large product catalogs or CRM exports, this single format change can meaningfully extend how many records fit within a single context window, reducing the need for chunking and improving retrieval coherence.
- 📋 Structured data extraction workflows. Sending CSVs or database exports to LLMs for field lookups, aggregation, or filtering? LEAN outperformed JSON on every single dataset in the benchmark. Same question, better answer, cheaper call. This is especially valuable in automated pipelines where a human isn’t reviewing each prompt before it fires. Fewer tokens in means less noise for the model to parse before it gets to the actual question.
- 📊 High-volume batch processing. A 47% token reduction compounds fast. On pipelines running thousands of LLM calls daily, that is close to half of the input-token spend on structured data, saved every day. If you’re processing nightly data exports, running scheduled enrichment jobs, or building a reporting agent that queries structured data on demand, LEAN is one of the highest-leverage format changes you can make without touching your prompts or your model.
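To see what these percentages mean for a given workload, the arithmetic can be sketched as below. The function names are hypothetical, the call volume is made up, and the price per million tokens is a placeholder you would replace with your provider’s actual rate. The estimate also assumes the whole prompt is structured data; instructions and output tokens are unaffected by the format change.

```python
def reduction(tokens_before, tokens_after):
    """Fractional token reduction, e.g. 0.25 means 25% fewer tokens."""
    return 1 - tokens_after / tokens_before


def estimate_savings(calls_per_day, json_tokens_per_call,
                     reduction_rate=0.47, price_per_million=None):
    """Estimate input tokens (and optionally cost) saved per day."""
    saved = calls_per_day * json_tokens_per_call * reduction_rate
    result = {"tokens_saved_per_day": round(saved)}
    if price_per_million is not None:
        result["cost_saved_per_day"] = saved / 1_000_000 * price_per_million
    return result


# The 100-employee example from the article: 6,150 JSON tokens vs 2,361 LEAN tokens
print(round(reduction(6150, 2361) * 100, 1))

# Hypothetical workload: 5,000 calls a day, each carrying 6,150 tokens of JSON
print(estimate_savings(5000, 6150))
```

Running it with your own call volume and measured token counts gives a quick sanity check on whether the conversion work is worth it.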
Tips and Pitfalls
LEAN was built for tabular data. It’s strongest when your objects share identical or near-identical fields. On deeply nested configs, it still saved 35% of tokens compared with JSON, but the advantage shrinks as structure gets more irregular. Know your data shape before committing. If you’re working with mixed schemas where some records have 5 fields and others have 25, you’ll either need to normalize the data first or accept a smaller efficiency gain.
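One way to "know your data shape" is to measure how uniform the records are before converting anything. The sketch below is an illustration, not part of the benchmark: `schema_uniformity` and `normalize` are hypothetical helpers, and filling missing fields with an empty string is an assumption you may want to change.

```python
from collections import Counter


def schema_uniformity(records):
    """Fraction of records whose field set matches the most common field set."""
    if not records:
        return 1.0
    shapes = Counter(frozenset(record) for record in records)
    _, count = shapes.most_common(1)[0]
    return count / len(records)


def normalize(records, fill=""):
    """Give every record the union of all fields, filling gaps with a placeholder."""
    columns = sorted(set().union(*records)) if records else []
    return [{c: record.get(c, fill) for c in columns} for record in records]


records = [
    {"id": 1, "name": "Paul Garcia"},
    {"id": 2, "name": "Aaron Davis"},
    {"id": 3},  # missing "name"
]
print(schema_uniformity(records))
print(normalize(records))
```

A uniformity score near 1.0 suggests the data is a good fit for a tabular encoding. A low score means normalization will add many empty cells, which eats into the savings.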
This benchmark tests LLM reading comprehension, not generation. The model receives formatted data and answers questions about it. That’s the exact use case where LEAN wins. Generating LEAN-formatted output from scratch is a separate challenge not covered here. Don’t assume accuracy on output tasks maps from these results without testing it on your own workflows.
Claude Haiku users get an extra edge. LEAN outperformed JSON by 2.6 percentage points on Haiku while using roughly half the tokens. If Claude is in your stack, the case for switching is even stronger than the headline numbers suggest. Haiku is a common choice for cost-sensitive production pipelines, so better accuracy at lower cost is exactly the combination that matters there.
LEAN isn’t a standard library you can npm install. You’ll need to write a format converter for your specific data source. Budget an afternoon for the initial implementation and a round of validation against your actual data before rolling it into production. The benchmark code is publicly available in the original poster’s GitHub repo, so at least you’re not building blind.
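For the validation round, a small parser that reads the converter’s output back and checks every row against the header catches most structural mistakes. The sketch below assumes a simplified LEAN-style layout (a name line, a `#(...)` header, then pipe-delimited rows). `parse_table` is a hypothetical helper, and it does not handle escaping or any other syntax the real format may define.

```python
def parse_table(text):
    """Parse a name line, a '#(a|b|c)' header, and pipe-delimited rows into dicts."""
    lines = [line for line in text.splitlines() if line.strip()]
    if len(lines) < 2:
        raise ValueError("expected a name line and a header line")
    name = lines[0].rstrip(":")
    header = lines[1]
    if not (header.startswith("#(") and header.endswith(")")):
        raise ValueError("expected a header line like #(col1|col2)")
    columns = header[2:-1].split("|")
    rows = []
    for number, line in enumerate(lines[2:], start=1):
        values = line.split("|")
        if len(values) != len(columns):
            raise ValueError(
                f"row {number} has {len(values)} fields, expected {len(columns)}"
            )
        rows.append(dict(zip(columns, values)))  # all values come back as strings
    return name, rows


sample = "employees:\n#(id|name|salary)\n1|Paul Garcia|92000\n2|Aaron Davis|149000"
name, rows = parse_table(sample)
print(name, len(rows), rows[0]["name"])
```

Comparing the parsed rows against the source records, with values converted to strings, gives a quick check that nothing was dropped or shifted into the wrong column.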
See the Full Benchmark
The original Redditor published all benchmark code and full results with reproducible seeded data, so you can run your own tests against your own datasets. If you want to dig into the methodology or start adapting LEAN for your pipelines, the full discussion is worth reading in the original Reddit post.
Source: “I benchmarked LEAN vs JSON vs YAML for LLM input. LEAN uses 47% fewer tokens with higher accuracy” by u/Suspicious-Key9719 in ChatGPTPromptGenius