OpenClaw Optimization: Master These Best Practices

Most people using OpenClaw are barely scratching the surface. They set it up, chat with it in a single thread, and wonder why it keeps forgetting things or feels clunky. Turns out, the gap between a basic setup and a truly optimized one is massive.

Matthew Berman, an AI professional who has spent over 200 hours and billions of tokens perfecting his OpenClaw environment, just dropped a video walking through every best practice he has learned. And honestly, some of these tips completely shifted how I think about running an agentic setup.

Let me break down the key ideas.

🧵 Threaded chats solve the memory problem

Here is probably the easiest and biggest unlock. Instead of dumping every topic into one long chat window, the creator splits conversations into separate threads using Telegram groups. Each thread gets its own context window and session, so OpenClaw only loads what is relevant to that specific topic. He has threads for his CRM, knowledge base, cron updates, and more. This is why he never had the memory issues that so many other users complain about. The context stays focused, the model stays on track, and you can jump between topics without awkwardly saying “hold that thought.” This approach works on WhatsApp and Discord too, any platform that supports threading.

🎙️ Voice memos for on-the-go interaction

This one is simple but powerful. Telegram has a built-in microphone button. You hold it down, speak your request, and OpenClaw processes it. No typing needed. The expert uses this constantly when he only has access to his phone, especially while driving. It works natively with no extra setup. If you are not using voice memos with your agent, you are leaving speed on the table.

🔀 Use the right model for the right job

This is where things get serious. Berman does not use a single model for everything. He runs a whole roster. Opus 4.6 for main chat and orchestration (because the planning model should be the best possible). Sonnet for tasks that do not need frontier intelligence. Gemini 3.1 Pro for video processing. Grok for search. GPT 5.4 as a fallback. Nomic for embeddings. Qwen 3.5 running locally for tasks like email labeling where he fine-tuned a small model to match Opus performance at zero cost. OpenClaw lets you assign specific models to specific threads, so your Q&A thread can run a lighter model while your coding thread runs a frontier one. Faster responses, lower costs, better results.

🤖 Delegate to sub-agents early and often

One of the most frustrating things about agentic workflows is when your main agent gets blocked on a long task. The solution: delegate. The creator sends all coding work, API calls, multi-step tasks, data processing, and anything taking over 10 seconds to sub-agents. His main agent stays free for conversation and planning. Sub-agents can even delegate further to agentic harnesses like Cursor Agent CLI or Claude Code. The key insight here is that your main agent should be an orchestrator, not a worker. Keep it unblocked.

📝 Prompt optimization across models

Different models respond differently to the same prompt. Opus 4.6 does not like all caps or negative instructions. GPT 5.4 actually prefers explicit “do not” statements. So the author maintains separate prompt files optimized for each model, built by referencing the official prompting guides published by each frontier lab. He even runs a nightly cron job that compares prompt files across models to keep them in sync while respecting each model’s preferences. It is complex, but the efficiency gains are real.

⏰ Cron jobs, security, logging, and documentation

The rest of the video covers four critical pillars that tie everything together:

Cron jobs: Schedule recurring tasks during off-hours to avoid quota conflicts. The creator spreads his nightly jobs across every 5 minutes to stay within rolling quota windows. Health checks, documentation drift detection, config consistency, daily backups, all automated.
Security hardening: A multi-layer defense against prompt injection. Layer one is deterministic text sanitation scanning for common injection patterns. Layer two is a frontier model reviewing anything that slips through. Layer three is outbound redaction of PII and secrets. Plus granular permissions (read email but cannot send, read files but cannot delete) and runtime governance with spending caps and loop detection to prevent wallet draining.
Logging everything: Cheap and essential. About 1 GB for two months of full logging. Every morning, the expert tells OpenClaw to review overnight logs, surface errors, and propose fixes. This turns debugging from guesswork into a simple conversation.
Documentation: A PRD describing all features, use case workflows, workspace organization, model-specific prompting guides, security docs, and a learnings file so the system never repeats the same mistake. A daily cron compares docs against actual code and commits to catch any drift.

The author also emphasizes using your subscription (via agents SDK for Anthropic, Codex OAuth for OpenAI) instead of raw API calls, which can be multiple times more expensive. And he recommends building OpenClaw in a proper IDE like Cursor rather than through Telegram, since code editors are simply better for iterating on agent configurations.

This video is packed with practical patterns that apply to any agentic setup, not just OpenClaw. If you want the full walkthrough with demos and the security prompt template, check out the original video for all the details.