Scan-for-Secrets: Prevent Accidental API Key Leaks

Simon Willison just shipped a new Python tool called scan-for-secrets, built to solve a problem familiar to anyone who publishes logs or transcripts from AI coding sessions: the risk of accidentally leaking API keys and other sensitive credentials.

Willison, who regularly publishes transcripts of his local Claude Code sessions using his claude-code-transcripts tool, reports being “often paranoid” that secrets might slip through in detailed log files. So he built a dedicated scanner to catch them before they go public.

What It Does

The tool is straightforward to use. You feed it your secrets and point it at a directory:

uvx scan-for-secrets $OPENAI_API_KEY -d logs-to-publish/

Leave off the -d flag and it defaults to scanning the current directory. What makes it more than a simple string search is its awareness of encoding variations. The scanner doesn’t just look for literal secret strings. It also checks for:

Backslash-escaped versions of your secrets
JSON-encoded variants that might appear in serialized log data
Other common encodings that could disguise a key in plain sight

This matters because API keys buried inside JSON structures or escaped strings would slip right past a basic grep.
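To make the idea concrete, here is a minimal Python sketch of encoding-aware scanning. It is not the tool's actual implementation. The function names secret_variants and scan_directory are hypothetical, and the exact set of encodings the real scanner checks is documented in its README rather than here.

```python
import json
from pathlib import Path


def secret_variants(secret: str) -> set[str]:
    """Return the literal secret plus a few encoded forms (illustrative set only)."""
    variants = {secret}
    # JSON string encoding with the surrounding quotes removed.
    variants.add(json.dumps(secret)[1:-1])
    # Generic backslash escaping of backslashes and quote characters.
    escaped = secret.replace("\\", "\\\\").replace('"', '\\"').replace("'", "\\'")
    variants.add(escaped)
    return variants


def scan_directory(root: Path, secret: str) -> list[tuple[Path, int]]:
    """Return (path, line number) pairs for lines containing any variant."""
    needles = secret_variants(secret)
    hits = []
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        # errors="replace" keeps the scan going on files that are not valid UTF-8.
        text = path.read_text(encoding="utf-8", errors="replace")
        for lineno, line in enumerate(text.splitlines(), start=1):
            if any(needle in line for needle in needles):
                hits.append((path, lineno))
    return hits
```

A secret containing a quote or backslash is stored differently once it passes through a JSON serializer, so a search for the literal string misses it while the JSON-encoded variant still matches.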

Persistent Configuration

For secrets you always want to protect, scan-for-secrets supports a ~/.scan-for-secrets.conf.sh configuration file. You list shell commands that output your secrets, and the tool runs them automatically. Willison’s own config pulls keys from his llm tool for OpenAI, Anthropic, Gemini, and Mistral, plus extracts AWS credentials from ~/.aws/credentials.

This means you set it up once and every scan covers your full set of sensitive credentials without remembering to pass each one manually.
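The "commands that print secrets" pattern can be sketched in a few lines of Python. This is an illustration under stated assumptions, not how scan-for-secrets itself loads its config: collect_secrets is a hypothetical helper, and it assumes each command prints one secret per line and that failed commands should be skipped.

```python
import subprocess


def collect_secrets(commands: list[str]) -> list[str]:
    """Run each shell command and gather its non-empty output lines as secrets."""
    secrets = []
    for command in commands:
        result = subprocess.run(
            command, shell=True, capture_output=True, text=True, check=False
        )
        if result.returncode != 0:
            # Assumption: a failing command (e.g. a key that is not set) is skipped.
            continue
        for line in result.stdout.splitlines():
            line = line.strip()
            if line:
                secrets.append(line)
    return secrets
```

Keeping commands rather than literal values in the config means the secrets themselves never have to be written into the config file.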

Built with README-Driven Development

The development process here is worth noting. Willison used what he calls README-driven development: he wrote the complete README first, describing exactly how the tool should work, then fed that spec into Claude Code and had it build the implementation using red/green test-driven development (TDD), where each failing test is written before the code that makes it pass. It's a clean example of using AI coding assistants not as magic wands but as implementation partners, where the human defines the spec and the AI handles the build. The README becomes both documentation and prompt.

Why This Matters

As more developers use AI coding tools like Claude Code, Cursor, and Copilot, the practice of sharing sessions and transcripts is growing fast. These logs are valuable for teaching, debugging, and transparency. But they’re also a minefield for credential leaks.

A single exposed API key can lead to unauthorized usage, unexpected bills, or worse. Traditional secret scanning tools like Gitleaks or GitHub's built-in secret scanning focus on repository commits. Willison's tool fills a different niche: scanning arbitrary output files before you publish them anywhere.

The tool is available now as a Python package, runnable via uvx without installation. More details on encoding handling and configuration options are in the project’s README on Simon Willison’s site.
