Most AI tool articles are fiction. This Redditor ran the actual experiment.

Most AI tool roundups are written after a 20-minute trial. That’s not an experiment. That’s a first impression dressed up as a verdict.

When u/AdCold1610 posted in r/PromptEngineering that they’d spent 90 days testing 47 different tools before writing a tier list, I stopped scrolling. The post is the honest breakdown this space rarely gets: no affiliate links, no hype cycle, just three months of actual use with clear verdicts on what changed how they work and what didn’t.

The old approach vs. the actual test

The standard AI tool article goes like this: pick 10 tools with the best PR, describe what they claim to do, add a stock photo, publish. It gets clicks. It teaches you nothing.

This post operates differently. The author sorted 47 tools into three buckets based on one criterion: did it actually change how they work after the novelty wore off? Not what it promised. What it delivered at month three. That’s a harder question than most reviewers bother to ask.

That’s the contrast worth paying attention to.

🔧 What actually stuck

These are tools that permanently shifted the author’s daily workflow.

NotebookLM leads the list, described as “underrated to the point it’s embarrassing.” Feed it research papers, a podcast transcript, your own notes, or a folder of client documents. It synthesizes a FAQ you couldn’t write yourself, with far less room for hallucination because it answers only from what you give it. The author used it to process a dense research paper in under 10 minutes and came away with a cleaner understanding than a full read-through gave them. Their verdict: the only AI tool that makes reading faster without making you less capable.

Claude (long context) gets a strong endorsement for document analysis. The author fed it a 90-page legal document in a single pass. The summary that came out was better than what the lawyers produced. If you’re not using long-context models for document work, you’re leaving real value on the table.

Gamma turned 3-hour presentation builds into 25-minute ones. Describe the deck, it builds the structure, you edit. That’s the whole workflow.

Perplexity replaced Google entirely for anything requiring a source trail. Not for creative work. Purely for: “I need to know something true, fast.”

The tools people use wrong

This is where the post gets genuinely useful. There’s a full category of capable tools that most people underuse because the approach is off.

ChatGPT is “phenomenal if your prompts are structured. Average if they’re not.” The author’s analogy: blaming the calculator when you typed the equation wrong. Structured means giving it a role, a context, a constraint, and a format. Not just a question. The tool isn’t the problem. The input is.
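Here is one way that structure could look in practice. This is a minimal sketch, not the Reddit author’s template: the `StructuredPrompt` class, its field names, and the example text are all assumptions made for illustration. It labels the four parts named above (role, context, constraint, format) and appends the actual question as a separate `task` field.

```python
from dataclasses import dataclass


@dataclass
class StructuredPrompt:
    """Hypothetical container for the four parts named above, plus the task."""
    role: str
    context: str
    constraint: str
    output_format: str
    task: str

    def render(self) -> str:
        # Label every part so nothing is left for the model to guess.
        sections = [
            ("Role", self.role),
            ("Context", self.context),
            ("Constraint", self.constraint),
            ("Format", self.output_format),
            ("Task", self.task),
        ]
        return "\n".join(f"{label}: {text.strip()}" for label, text in sections)


# Example values are invented for illustration.
prompt = StructuredPrompt(
    role="You are a financial analyst.",
    context="The reader is a founder preparing a board update.",
    constraint="Use only the figures provided; keep it under 150 words.",
    output_format="Three bullet points, then a one-line recommendation.",
    task="Summarize last quarter's revenue trend.",
)
print(prompt.render())
```

The rendered text can be pasted into any chat window. The point of the class is only that an empty field is obvious before you hit send.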

Midjourney gets used mostly to generate random art. The actual use case is mood boarding and visual thinking. Treat it as a brainstorming tool instead of a final output tool and it becomes a different product entirely.

Zapier AI automated the author’s full weekly reporting workflow. Zero code. Two hours of setup. Saves roughly 5 hours a week since.
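The payback on those numbers is quick to check. The sketch below uses only the two figures quoted above and assumes the weekly saving stays constant, which the post does not claim.

```python
SETUP_HOURS = 2.0            # one-time setup, as reported above
SAVED_HOURS_PER_WEEK = 5.0   # approximate weekly saving, as reported above


def net_hours_saved(weeks: float) -> float:
    """Hours saved after `weeks` of use, minus the one-time setup cost."""
    return SAVED_HOURS_PER_WEEK * weeks - SETUP_HOURS


break_even_weeks = SETUP_HOURS / SAVED_HOURS_PER_WEEK
print(f"Break-even after {break_even_weeks:.1f} weeks")
print(f"Net hours saved after 12 weeks: {net_hours_saved(12):.0f}")
```

Under that assumption the setup pays for itself within the first week.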

The overhyped tier (the honest part)

Most AI writing assistants produce the same voice: “flattened, optimistic, slightly breathless.” If you’re publishing AI content without heavy editing, your work sounds like everyone else’s. That’s not a small problem.

AI video generators still have an uncanny valley problem for professional use. Browser AI extensions: the author installed and deleted 11 of them. Most just paste a chat button on top of what you’re already doing, and they ask for more permissions than they deserve.

The insight that matters more than the tier list

The author buries the real finding near the end of the post: the gap between people who get genuine ROI from AI and people who don’t isn’t the tools they pick. It’s the prompts.

Same tool. Same model. Completely different output quality depending on how you structure the input.

Context, constraints, task chaining, output formatting: that’s the skill. It’s the new version of learning Excel shortcuts or SQL queries. Most people still aren’t treating it seriously enough to actually study it. The ones who do are quietly building a compounding advantage over everyone else who just opens a chat window and types.
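Task chaining is the least self-explanatory item on that list, so here is a rough sketch of the idea: each step’s output becomes the next step’s input. The `Model` type, `run_chain`, and the sample steps are hypothetical. Nothing here calls a real API, and `echo_model` is a fake stand-in so the example runs on its own.

```python
from typing import Callable, List

# Hypothetical stand-in for any chat model: takes a prompt, returns text.
Model = Callable[[str], str]


def run_chain(model: Model, source_text: str, steps: List[str]) -> List[str]:
    """Run each instruction on the previous step's output; return every output."""
    outputs: List[str] = []
    current = source_text
    for instruction in steps:
        prompt = f"{instruction}\n\nInput:\n{current}"
        current = model(prompt)  # this output feeds the next step
        outputs.append(current)
    return outputs


def echo_model(prompt: str) -> str:
    # Fake model for a dry run: reports the instruction line it received.
    return "[" + prompt.splitlines()[0] + "]"


steps = [
    "Extract the key claims as a bulleted list.",
    "Group the claims by theme.",
    "Write a 100-word summary of the themes.",
]
results = run_chain(echo_model, "raw meeting notes", steps)
```

To try it for real, swap `echo_model` for a function that sends the prompt to whichever model you use. Keeping every intermediate output makes it easier to see which step went wrong.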

How to apply this

  1. Pick one tool from the “actually stuck” tier you haven’t seriously used yet. Give it one real task this week, not a five-minute demo.
  2. Audit your prompts before blaming the model. Mediocre output usually means mediocre input. Check the equation before you blame the calculator.
  3. Use Midjourney for brainstorming, not final production. The value difference is significant once you shift how you approach it.
  4. Treat prompting as a compound skill. Not a trick. Something worth deliberate study and consistent practice.

The original Reddit thread is worth your time. Head over and check the comments. The author even notes that the comments on posts like this are usually sharper than the post itself. Based on this one, that’s a high bar.

Frequently Asked Questions

Q: What makes NotebookLM so special if ChatGPT can analyze documents too?

NotebookLM grounds its answers in only what you upload, which sharply limits hallucinations. People feed it research papers, transcripts, or personal notes and have it synthesize distilled summaries or even podcast-format overviews. Some load personal archives (family records, technical manuals) and use it as a knowledge base. It’s built specifically for document synthesis, not general chat.

Q: Why do all AI-written posts and ChatGPT outputs sound identical?

Most AI writing assistants are trained similarly, so they produce the same flattened, optimistic voice. The solution is heavy editing and injecting your own perspective, or using structured, detailed prompts. ChatGPT and Claude both suffer from this if you don’t guide them with specificity.

Q: Are most of these tools just different UIs for the same underlying models?

Mostly, yes. Many run on Claude, GPT, or Gemini models under the hood. The real value is workflow optimization: NotebookLM for documents, Perplexity for sourced research, Gamma for presentations. You’re paying for the task-specific interface and workflow, not a fundamentally different model.

Q: How do I write better prompts if prompt engineering isn’t my strength?

Here’s a meta approach: ask the AI to rewrite or optimize your prompt before it answers the actual question. This extra step often gets better results than tweaking the original prompt yourself.
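As a two-step sketch, the meta approach could look like this. The `Model` type, the wording of `REWRITE_TEMPLATE`, and `fake_model` are assumptions for illustration. Replace the fake with a call to whatever model you actually use.

```python
from typing import Callable, Tuple

# Hypothetical stand-in for any chat model: takes a prompt, returns text.
Model = Callable[[str], str]

# Assumed wording; adjust to taste.
REWRITE_TEMPLATE = (
    "Rewrite the prompt below so it is specific and unambiguous. "
    "Add any missing role, context, constraints, and output format. "
    "Return only the rewritten prompt.\n\nPrompt:\n{draft}"
)


def answer_with_rewrite(model: Model, draft: str) -> Tuple[str, str]:
    """Step 1: have the model improve the draft. Step 2: answer the improved prompt."""
    improved = model(REWRITE_TEMPLATE.format(draft=draft)).strip()
    answer = model(improved)
    return improved, answer


def fake_model(prompt: str) -> str:
    # Fake model so the sketch runs offline: tags its input instead of reasoning.
    if prompt.startswith("Rewrite the prompt below"):
        return "IMPROVED: " + prompt.split("Prompt:\n", 1)[1]
    return "ANSWER TO: " + prompt


improved, answer = answer_with_rewrite(fake_model, "explain vector databases")
```

Returning the improved prompt alongside the answer lets you read what the model changed, which is a cheap way to learn what your drafts tend to leave out.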

i tested 47 AI tools in 90 days. here’s the honest tier list nobody writes.
by u/AdCold1610 in PromptEngineering
