Agent Verifier: Automated Security Audit for AI Agents

Yesterday a developer shipped Agent Verifier on GitHub. It reviews your AI agents for security issues, hallucinated tools, and infinite loops. Two steps to use it. Step 2 is almost embarrassingly simple.

For months, Claude Code has been shipping agents with the same classes of bugs: hardcoded API keys, tools the model referenced but invented, retry loops with no exit condition, system prompts big enough to blow a context window. Not carelessness. Just stuff that slips past every reviewer, human or AI. The hardcoded key problem is obvious in hindsight but shockingly easy to miss during a fast iteration cycle. You write the key directly into config during a test run, forget to take it back out, and it ends up in a commit. The hallucinated tool problem is subtler. The model writes a call to execute_sql() or fetch_calendar_events() with total confidence, the code looks plausible, the tests pass because they never actually run the agent, and the bug sleeps until it hits production. Unbounded loops are the third category, and the one that kills you slowly. The agent retries, retries again, burns through API budget, and only stops when something external cuts it off.

Agent Verifier runs as an automated pre-ship audit. Here’s what a real output looks like:

✅ 8 checks passed | ⚠️ 3 warnings | ❌ 2 issues

❌ Hardcoded API key at config.py:12 → Move to environment variable
❌ Hallucinated tool reference: execute_sql → Tool referenced but not defined
⚠️ Unbounded loop at agent/loop.py:45 → Add MAX_ITERATIONS constant

Notice the format: file name, line number, and a concrete fix suggestion, not a vague warning. That specificity matters when you’re triaging before a deadline. You’re not hunting through 400 lines trying to figure out what “possible security concern” means. You go straight to config.py line 12, move the key to an environment variable, and move on. The whole thing reads like a code review from a colleague who actually looked at the file.
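For a sense of what those two fixes look like in practice, here is a minimal Python sketch. It is not taken from the Agent Verifier repo. The environment variable name AGENT_API_KEY, the helper names, and the cap of 5 are all assumptions made up for illustration.

```python
import os

MAX_ITERATIONS = 5  # hypothetical cap; pick a value that suits your agent


def load_api_key(var_name: str = "AGENT_API_KEY") -> str:
    """Read the key from the environment instead of hardcoding it in config.py."""
    key = os.environ.get(var_name)
    if not key:
        raise RuntimeError(f"{var_name} is not set")
    return key


def run_with_retries(step, max_iterations: int = MAX_ITERATIONS):
    """Call step() until it succeeds or the iteration cap is reached."""
    last_error = None
    for _ in range(max_iterations):
        try:
            return step()
        except Exception as exc:  # narrow this to the errors you expect
            last_error = exc
    # The loop always terminates: after the cap, fail loudly instead of retrying.
    raise RuntimeError(f"gave up after {max_iterations} attempts") from last_error
```

The point of the cap is that the loop has an exit the agent controls, rather than relying on a rate limit or a billing alert to stop it.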

One developer in the comments lost half a day last week debugging an agent that kept calling a function the model made up. The frustrating part wasn’t the time lost. It was that the error only surfaced at runtime, deep in a multi-step pipeline, after several other tools had already executed successfully. That hallucinated tool check alone is worth the install. It catches the reference statically, before you’ve run anything, before you’ve waited for an API call to time out, before you’ve stared at a traceback trying to figure out why a function that looks totally reasonable doesn’t exist anywhere in your codebase.
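To make "catches the reference statically" concrete, here is a rough sketch of the general idea using Python’s standard ast module. This is an assumption about how such a check could work, not Agent Verifier’s actual implementation, and the function name and the registered_tools parameter are hypothetical. It only looks at bare-name calls in a single file, so method calls and names pulled in through star imports are out of scope.

```python
import ast
import builtins


def find_undefined_tool_calls(source: str, registered_tools: set[str]) -> list[tuple[str, int]]:
    """Return (name, line) for bare-name calls that are not defined anywhere we can see."""
    tree = ast.parse(source)

    # Collect every name the module itself defines, imports, assigns, or takes as an argument.
    defined = set()
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            defined.add(node.name)
        elif isinstance(node, (ast.Import, ast.ImportFrom)):
            for alias in node.names:
                defined.add((alias.asname or alias.name).split(".")[0])
        elif isinstance(node, ast.Name) and isinstance(node.ctx, ast.Store):
            defined.add(node.id)
        elif isinstance(node, ast.arg):
            defined.add(node.arg)

    known = defined | registered_tools | set(dir(builtins))

    # Flag calls like execute_sql(...) whose name is not in the known set.
    findings = []
    for node in ast.walk(tree):
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            if node.func.id not in known:
                findings.append((node.func.id, node.lineno))
    return sorted(findings, key=lambda f: f[1])
```

Nothing here executes the agent. The source is only parsed, which is why a check of this kind can run before any API call is made.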

How to set it up

Step 1. Install once across your dev environment:

npx skills add aurite-ai/agent-verifier -a claude-code

This pulls the skill into your Claude Code environment and registers it globally. You do this once, not per project. From that point forward, every agent you build in any project on that machine has access to the verifier without any additional configuration. No YAML, no config file to maintain, no environment-specific setup.

Step 2. Inside Claude Code, point it at any agent folder and say:

verify agent

Two words. Full structured report. That’s the entire workflow.

Claude Code interprets the command, runs the static analysis against your current agent directory, and returns the categorized findings inline. The report groups results by severity so you can triage fast: critical issues first, then warnings, then what passed clean. If your agent spans multiple files, the verifier traces references across the whole module, not just the entry point. A tool defined in one file but miscalled from another still gets flagged.

Works with Claude Code, Roo Code, Cursor, Windsurf, and 30+ other agents. MIT licensed. All analysis runs locally, which matters if you’re working with proprietary code or in an environment with outbound network restrictions.

Pro tip

The checks run in two tiers: pattern-matched (reliable) and heuristic (best-effort). Every finding is tagged with its confidence level. Treat pattern-matched flags as hard blockers before any commit. Use heuristic warnings as a second-pass review list. Don’t skip either category.
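The difference between the two tiers is easier to see in code. The sketch below is illustrative only: the regex, the Finding class, and the check names are assumptions, not the verifier’s real rules. The secret check is a plain pattern match on each line, while the loop check is a heuristic that flags a `while True` with no break, return, or raise anywhere inside it.

```python
import ast
import re
from dataclasses import dataclass


@dataclass
class Finding:
    check: str
    line: int
    confidence: str  # "pattern-matched" or "heuristic"


# Hypothetical pattern: a key-like variable assigned a long string literal.
SECRET_RE = re.compile(r"""(?i)\b\w*(api_?key|secret|token)\w*\s*=\s*["'][^"']{16,}["']""")


def scan(source: str) -> list[Finding]:
    findings = []

    # Tier 1: pattern-matched. A regex hit on a single line.
    for lineno, line in enumerate(source.splitlines(), start=1):
        if SECRET_RE.search(line):
            findings.append(Finding("hardcoded-secret", lineno, "pattern-matched"))

    # Tier 2: heuristic. `while True` with no visible exit statement in its body.
    for node in ast.walk(ast.parse(source)):
        if (
            isinstance(node, ast.While)
            and isinstance(node.test, ast.Constant)
            and node.test.value is True
        ):
            has_exit = any(
                isinstance(inner, (ast.Break, ast.Return, ast.Raise))
                for inner in ast.walk(node)
            )
            if not has_exit:
                findings.append(Finding("unbounded-loop", node.lineno, "heuristic"))

    return findings
```

The loop check can be wrong in both directions: a loop may exit through an exception raised by a called function, or contain a break that is never reached. That uncertainty is what the lower confidence tag is for.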

The practical way to build this into a workflow: run verify agent before every commit the same way you’d run your test suite. Pattern-matched errors are binary: fix them or don’t ship. Heuristic warnings deserve a look even when they turn out to be false positives, because the process of checking forces you to read the flagged section of code carefully, which is often where you spot something unrelated that needed attention anyway. The loop check is particularly worth taking seriously even when the confidence tag is lower. An unbounded retry in production is an expensive lesson.

Check it out

Repo: github.com/aurite-ai/agent-verifier. Open source, actively expanding the check library. The maintainers are adding new checks with each release, so the tool gets more thorough as you pull in updates. If you ship AI agents with any regularity, this is the safety net that earns its keep the first time it catches a hardcoded secret before production. And based on the comment thread under the repo, that first catch tends to happen faster than most people expect.

Frequently Asked Questions

Q: What is a hallucinated tool reference and why is it so hard to debug?

It’s when Claude confidently generates code calling a function or API that doesn’t actually exist in your codebase. Without tooling to catch it, you spend hours tracing mysterious failures. Agent Verifier flags these instantly by analyzing your available tools and catching references that don’t match.

Q: Can this prevent broken scripts from running during development?

Yes. Run verify agent before executing your agent to catch issues like hallucinated tool references, infinite loops, and hardcoded secrets in one structured report. It won’t stop execution automatically, but it catches problems in development so they don’t reach production.

Q: How reliable are the pattern-matched vs heuristic checks?

Pattern-matched checks are high-confidence (like detecting hardcoded API keys via regex), while heuristic checks are best-effort (like spotting unbounded loops). Every finding is tagged with its confidence level so you know what to trust and prioritize.

I built an open-source verification skill for Claude Code that catches security issues, hallucinated tools, and infinite loops
by u/Chance-Roll-2408 in PromptEngineering
