OpenClaw Best Practices: 21 Rules for a Faster, Cheaper, More Reliable AI Agent (2026)
14 min read · Updated 2026-03-22
By DoneClaw Team · We run managed OpenClaw deployments and write from hands-on production experience.
Most OpenClaw users get the basics running — Telegram connected, a model picked, maybe a skill or two installed — and then stop. Their agent works, but it doesn't work *well*. It forgets context from two days ago. API bills creep up. Cron jobs silently fail. The SOUL.md file is three lines of vague instructions that the model mostly ignores. This guide fixes that. These are 21 OpenClaw best practices drawn from running production deployments — the kind of knowledge you'd normally pick up over months of trial and error. Each practice includes the *why*, the *how*, and the exact configuration or commands to implement it today. Whether you're self-hosting OpenClaw on a $5 VPS or running a managed DoneClaw instance, these rules apply universally.
Configuration Best Practices
1. Use Environment Variables for Secrets, Never Hardcode
This sounds obvious, but we see it constantly: API keys pasted directly into openclaw.yaml or checked into Git. Every secret — API keys, gateway tokens, webhook URLs — should live in environment variables or a .env file that's excluded from version control.
Why it matters: A single leaked API key can result in thousands of dollars in charges within hours. Bots scan GitHub for exposed keys continuously — the median time from commit to exploitation is under 4 minutes according to GitGuardian's 2025 report.
# .env file (add to .gitignore)
ANTHROPIC_API_KEY=sk-ant-...
OPENAI_API_KEY=sk-...
TELEGRAM_BOT_TOKEN=7123456789:AAF...
OPENCLAW_GATEWAY_TOKEN=your-secure-token-here
# Reference in openclaw.yaml
providers:
  anthropic:
    apiKey: ${ANTHROPIC_API_KEY}

2. Pin Your Model Versions Explicitly
Don't use claude-sonnet when you mean claude-sonnet-4-20260514. Model providers update their offerings constantly, and an unversioned model string can suddenly point to a different model with different pricing, capabilities, and behavior.
This prevents surprise behavior changes and makes cost forecasting reliable. When you *want* to upgrade, do it deliberately and test first.
agents:
  defaults:
    model:
      primary: anthropic/claude-sonnet-4-20260514

3. Set Resource Limits Before You Need Them
OpenClaw agents can spawn sub-agents, run shell commands, and loop through tool calls. Without limits, a single runaway task can consume your entire API budget or crash your server.
Set these limits *before* your agent does something unexpected. Raising limits later is easy. Recovering from a $200 surprise bill is not.
agents:
  defaults:
    limits:
      maxTokensPerTurn: 16000
      maxToolCalls: 25
      maxTurns: 50
    sandbox:
      enabled: true
      network: restricted

4. Structure Your Workspace Directory
A clean workspace directory makes your agent more effective. OpenClaw reads files from the workspace root — scattered files confuse both the agent and you.
The agent reads AGENTS.md, SOUL.md, and USER.md at session start. Everything else is accessed as needed. Keep the root clean — six to eight files maximum.
~/clawd/
├── AGENTS.md          # Agent behavior rules
├── SOUL.md            # Personality and tone
├── USER.md            # User context
├── MEMORY.md          # Long-term curated memory
├── TOOLS.md           # Local tool notes
├── HEARTBEAT.md       # Periodic check instructions
├── memory/            # Daily memory logs
│   ├── 2026-03-20.md
│   ├── 2026-03-21.md
│   └── heartbeat-state.json
├── skills/            # Custom skills
│   └── my-skill/
│       └── SKILL.md
└── storage/           # Working files, downloads, drafts
    └── projects/

SOUL.md and Personality Tuning
5. Write SOUL.md Like You're Training a New Employee
The most common mistake with SOUL.md is being too vague. "Be helpful and friendly" gives the model nothing to work with. Instead, write specific behavioral instructions with concrete examples.
A well-written SOUL.md typically runs 200-500 words. Shorter and the agent defaults to generic behavior. Longer and the model starts ignoring parts of it. Test your SOUL.md by asking the agent the same question before and after — the difference in response quality is immediate.
## Bad SOUL.md:
Be helpful and professional. Answer questions accurately.
## Good SOUL.md:
## Communication Style
- Lead with the answer, then explain. Never bury the conclusion.
- If the answer is one sentence, give one sentence. Don't pad.
- Use "you" language: "You should..." not "One might consider..."
- Swearing is fine when it fits. Don't force it.
## Decision Making
- If you can do something yourself (search, read a file, run a command), do it. Don't ask permission for routine tasks.
- For external actions (sending emails, posting publicly), draft and show first.
- If something looks like a bad idea, say so directly.
## What NOT to Do
- Never open with "Great question!" or "I'd be happy to help!"
- Never hedge with "It depends" when you have an opinion.
- Don't over-explain simple things.

6. Use USER.md to Eliminate Repeated Context
Every time you tell your agent "I'm in the Hong Kong timezone" or "my email is..." you're wasting tokens and time. Put all persistent user context in USER.md and let the agent reference it automatically.
The agent reads this at session start. No more repeating yourself. Update it when things change — it's your agent's cheat sheet about you.
# USER.md
- **Name:** Alex Chen
- **Timezone:** Asia/Hong_Kong (UTC+8)
- **Email:** [email protected]
- **Preferences:**
  - Prefers bullet points over paragraphs
  - Metric units
  - Responds to Slack DMs within 2 hours during work hours
- **Current projects:** Q1 product launch, hiring senior engineer

7. Give Your Agent Explicit Permissions
Ambiguity about what the agent can and can't do leads to either over-caution (asking permission for everything) or over-action (sending an email you didn't want sent). Be explicit in SOUL.md:
## Permissions
- **Files:** Read, create, edit freely. Use `trash` instead of `rm`.
- **Email:** Check inbox freely. Draft replies but never send without approval.
- **Calendar:** Read freely. Create events only with confirmation.
- **Web search:** Use freely, no permission needed.
- **Shell commands:** Non-destructive commands freely. Ask before `rm`, `drop`, or `sudo`.
- **External APIs:** Read-only freely. Write operations need approval.

This single section eliminates 90% of the "should I do this?" back-and-forth that slows agents down.

Model Selection and Routing
8. Use Multi-Model Routing (Don't Run Everything on Opus)
Running every request through Claude Opus 4 is like taking a Ferrari to buy milk. It works, but you're burning $15/M input tokens on tasks a $0.15/M model handles equally well.
Set your default to a mid-tier model and escalate only when needed.
A typical deployment spends 70% of requests on simple tasks (lookups, reminders, quick answers) that a $0.15-$0.80/M token model handles perfectly. Route those to GPT-4o Mini or Gemini Flash. Reserve Sonnet for general work and Opus for complex reasoning. This alone typically cuts API costs by 60-80%.
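To see why routing pays off, here's a back-of-envelope cost comparison. The prices and traffic mix below are illustrative assumptions, not current provider quotes:

```python
# Back-of-envelope comparison: routing vs. running everything on one model.
# Prices are illustrative (USD per million input tokens), not current quotes.
PRICE_PER_M = {"opus": 15.00, "sonnet": 3.00, "mini": 0.15}

def monthly_cost(requests_per_day: int, tokens_per_request: int, mix: dict) -> float:
    """Estimate monthly input-token spend. `mix` maps model -> traffic share."""
    tokens_per_month = requests_per_day * 30 * tokens_per_request
    return sum(
        PRICE_PER_M[model] * share * tokens_per_month / 1_000_000
        for model, share in mix.items()
    )

# 60 requests/day at ~2,000 input tokens each
all_opus = monthly_cost(60, 2000, {"opus": 1.0})
routed = monthly_cost(60, 2000, {"mini": 0.7, "sonnet": 0.25, "opus": 0.05})
print(f"all Opus: ${all_opus:.2f}/mo, routed: ${routed:.2f}/mo")
print(f"savings: {1 - routed / all_opus:.0%}")
```

With this particular mix, routing cuts the bill by roughly an order of magnitude; your own savings depend on how much of your traffic is genuinely simple.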
For a detailed model comparison with benchmarks, see our Best AI Model for OpenClaw in 2026 guide.
agents:
  defaults:
    model:
      primary: anthropic/claude-sonnet-4-20260514
  overrides:
    coding:
      model:
        primary: anthropic/claude-opus-4-20260514
    simple-tasks:
      model:
        primary: openai/gpt-4o-mini

9. Test Before Switching Models
Before changing your production model, run a side-by-side comparison. Ask both models the same 10-15 questions that represent your actual usage: a scheduling request, a code review, a research question, a casual conversation.
Score each response on accuracy, tone adherence (does it follow SOUL.md?), and tool use reliability. Models that benchmark well on generic tasks sometimes fail badly at specific agent workflows — particularly tool calling and multi-step reasoning.
Evaluation criteria and weights:
- Tool call reliability (30%) — does it call the right tools with correct parameters?
- SOUL.md adherence (20%) — does it match your configured personality?
- Response quality (20%) — accurate, helpful, appropriately detailed?
- Speed (15%) — time to first token, total response time.
- Cost per task (15%) — actual token usage for your typical requests.
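The weighted scorecard above can be sketched in a few lines. The candidate scores below are made-up examples; only the weights come from this guide:

```python
# Minimal sketch of the weighted scorecard described above.
# Weights mirror the criteria in the text; scores are 0-10 per criterion.
WEIGHTS = {
    "tool_calls": 0.30,      # right tools, correct parameters
    "soul_adherence": 0.20,  # matches configured personality
    "quality": 0.20,         # accurate, appropriately detailed
    "speed": 0.15,           # time to first token, total latency
    "cost": 0.15,            # token usage on typical requests
}

def weighted_score(scores: dict) -> float:
    """Combine per-criterion scores (0-10) into a single 0-10 number."""
    return sum(WEIGHTS[k] * v for k, v in scores.items())

# Hypothetical candidates scored on the same 10-15 representative questions
candidate_a = {"tool_calls": 9, "soul_adherence": 7, "quality": 8, "speed": 6, "cost": 9}
candidate_b = {"tool_calls": 6, "soul_adherence": 9, "quality": 9, "speed": 9, "cost": 5}
print(weighted_score(candidate_a), weighted_score(candidate_b))
```

Note how the 30% weight on tool calls lets candidate A win despite lower raw quality scores — exactly the failure mode the text warns about with generic benchmarks.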
Memory Management
10. Curate MEMORY.md — Don't Let It Grow Unbounded
MEMORY.md is your agent's long-term memory. Left uncurated, it becomes a dump of disconnected facts that wastes tokens on every session start (since it's loaded into context).
Best practice: Keep MEMORY.md under 2,000 words. Review it weekly. Remove outdated information. Organize by category.
For deeper context on how the memory system works, see our OpenClaw Memory System guide.
# MEMORY.md
## Preferences (verified)
- Prefers Claude Sonnet for daily tasks, Opus for coding
- Uses tmux for all terminal sessions
- Coffee order: oat milk flat white
## Active Projects
- Website redesign — due April 15, using Next.js
- Hiring: looking for senior backend engineer
## Important Contacts
- Sarah (designer) — [email protected], works PST hours
- Mike (CTO) — prefers Slack over email
## Lessons Learned
- Always draft emails before sending — burned once in January
- The Brave Search API has a 2000/month free quota — don't waste it on heartbeats

11. Use Daily Memory Files for Raw Context
Daily memory files (memory/YYYY-MM-DD.md) are for raw logging — what happened, what was discussed, what was decided. MEMORY.md is for curated, distilled knowledge.
Load yesterday's and today's daily files at session start. This gives the agent recent context without the cost of loading entire conversation histories.
# memory/2026-03-22.md
## Tasks Completed
- Reviewed PR #142 for authentication refactor
- Booked flights to Tokyo for April 8-15 (JAL, $680 round trip)
- Updated USER.md with new project deadline
## Decisions Made
- Switching default model from GPT-4o to Claude Sonnet — better SOUL.md adherence
- Canceling Vercel subscription — moving to Cloudflare Pages
## Notes
- Jason mentioned wanting to try local models for privacy — research Ollama setup this week

12. Set Up Memory Compaction on a Schedule
Old daily memory files accumulate. After a few months, you'll have hundreds of files that nobody reads. Set up a periodic compaction process:
- Weekly: review the past 7 daily files and extract anything important into MEMORY.md
- Monthly: archive daily files older than 30 days into memory/archive/
- Quarterly: review MEMORY.md itself and remove outdated entries

You can automate the weekly step with a cron job:
# Cron job: weekly memory compaction
schedule:
  kind: cron
  expr: "0 3 * * 0"  # Every Sunday at 3am
  tz: "Asia/Hong_Kong"
payload:
  kind: agentTurn
  message: "Review memory files from the past 7 days. Extract important decisions, preferences, and lessons into MEMORY.md. Summarize and remove redundancy."
  sessionTarget: isolated

Scheduling and Automation
13. Use Cron for Precise Timing, Heartbeats for Batched Checks
OpenClaw gives you two scheduling mechanisms. Most users use them interchangeably, but they serve different purposes:
- Cron jobs: exact timing (9:00 AM sharp), isolated sessions (clean context), best for stand-alone tasks and reminders, a separate API call per job, and the option to use a different model.
- Heartbeats: approximate timing (~every 30 minutes), the main session (with chat history), best for batched periodic checks, one API call covering multiple checks, and the main session's model.
Rule of thumb: If you need 3+ periodic checks (email, calendar, weather), batch them into a single heartbeat. If you need exact timing or isolation, use cron.
For detailed scheduling patterns, see our Cron Jobs & Heartbeats guide.
14. Make Heartbeat Checks Stateful
Without state tracking, your heartbeat checks either run too often (wasting tokens) or miss important events. Use a simple JSON file to track what was last checked:

{
  "lastChecks": {
    "email": "2026-03-22T08:00:00Z",
    "calendar": "2026-03-22T06:30:00Z",
    "weather": "2026-03-21T18:00:00Z"
  },
  "suppressUntil": null
}

Reference this state file in HEARTBEAT.md:

# HEARTBEAT.md
1. Read memory/heartbeat-state.json
2. Check email if last check > 2 hours ago
3. Check calendar if last check > 4 hours ago
4. Check weather if last check > 8 hours ago
5. Update heartbeat-state.json with new timestamps
6. Only notify if something actionable was found
7. Between 23:00 and 08:00, only check if urgent

This prevents the agent from checking email every 30 minutes when nothing has changed, saving hundreds of API calls per month.

15. Write Idempotent Cron Job Prompts
A cron job that says "send a daily summary email" will eventually send a duplicate if it retries after a timeout. Write prompts that check before acting:

payload:
  kind: agentTurn
  message: >
    Check if a daily summary email was already sent today
    (look in today's memory file, memory/YYYY-MM-DD.md, for a
    'daily-summary-sent' marker). If not sent: compile today's
    completed tasks, upcoming deadlines, and unread email count.
    Draft the summary and log 'daily-summary-sent: true' in today's
    memory file. If already sent: do nothing.

Idempotent jobs are safe to retry. Non-idempotent jobs cause duplicate emails, double bookings, and confused users.

Security Best Practices
16. Never Expose Port 18789 to the Public Internet
The OpenClaw gateway port (18789) provides full API access to your agent. Exposing it publicly is equivalent to leaving your front door open with a sign that says "everything's inside."
Use Tailscale or a reverse proxy with authentication for remote access. For a complete security audit checklist, see our Security Hardening Guide.
# Check if port 18789 is exposed
sudo ss -tlnp | grep 18789
# If it shows 0.0.0.0:18789, it's exposed to the world

# Fix: bind to localhost only in your OpenClaw config:
gateway:
  host: 127.0.0.1
  port: 18789

17. Enable Docker Sandbox Mode
OpenClaw can execute shell commands, which means a confused or prompt-injected agent could theoretically run destructive commands. Sandbox mode limits what the agent can do:
agents:
  defaults:
    sandbox:
      enabled: true
      network: restricted
      allowedCommands:
        - git
        - node
        - python3
        - curl
        - cat
        - ls
        - grep

The sandbox restricts filesystem access, network calls, and available commands. It's especially important if your agent handles input from untrusted sources (group chats, webhooks, shared Discord servers).

18. Rotate Gateway Tokens Quarterly
Your gateway token is the master key to your OpenClaw instance. Rotate it at least quarterly, and immediately if you suspect exposure.
Keep a record of when tokens were last rotated in your maintenance log. Set a cron reminder if you'll forget.
# Generate a new secure token
openssl rand -hex 32
# Update in your .env file
OPENCLAW_GATEWAY_TOKEN=new-token-here
# Restart the gateway
openclaw gateway restart

Cost Optimization
19. Monitor Token Usage Weekly
You can't optimize what you don't measure. Check your token usage weekly and identify the biggest cost drivers.
Usage patterns:
- Light (10-20 msgs/day): $3-$8/month — probably fine as-is.
- Moderate (30-60 msgs/day): $10-$30/month — route simple tasks to cheaper models.
- Heavy (100+ msgs/day): $30-$100/month — use multi-model routing and context trimming.
- With coding agents: $50-$200+/month — use sub-agent model limits and caching.
The biggest cost driver is usually context size, not message count. A 4,000-token context costs 4x more per request than a 1,000-token context. Keep your MEMORY.md, SOUL.md, and workspace files concise.
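The arithmetic behind that claim is simple: the full context is re-sent with every message, so context size multiplies directly into spend. A quick sketch with an illustrative mid-tier price:

```python
# Why context size dominates cost: the whole context is re-sent per message.
# Illustrative price: $3 per million input tokens (a mid-tier model).
PRICE_PER_TOKEN = 3.00 / 1_000_000

def monthly_input_cost(context_tokens: int, messages_per_day: int) -> float:
    """Monthly input-token cost of re-sending a fixed context every message."""
    return context_tokens * messages_per_day * 30 * PRICE_PER_TOKEN

lean = monthly_input_cost(1000, 50)   # trimmed workspace files
bloat = monthly_input_cost(4000, 50)  # untrimmed
print(f"1k-token context: ${lean:.2f}/mo, 4k-token context: ${bloat:.2f}/mo")
```

Same message volume, same model — four times the context is four times the bill.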
# Check current session usage
/status
# Review API provider dashboards
# Anthropic: console.anthropic.com/usage
# OpenAI: platform.openai.com/usage
# OpenRouter: openrouter.ai/activity20. Trim Context Before It Hits the Model
Every file loaded at session start adds to your context window — and your bill. Audit what's being loaded:
- SOUL.md: Keep under 500 words. Remove examples that aren't pulling their weight.
- AGENTS.md: Focus on behavioral rules, not documentation.
- USER.md: Only current, relevant information. Remove completed projects.
- MEMORY.md: Curate weekly. Under 2,000 words maximum.
- Daily memory files: Load only today + yesterday, not the full week.

A well-trimmed context typically runs 2,000-3,000 tokens. An untrimmed one can easily hit 8,000-10,000 tokens — tripling your effective cost per interaction.
For detailed cost reduction strategies, see our 5 Ways to Cut Your OpenClaw API Bill by 80% guide.
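A rough way to audit what session start loads is to estimate tokens from file sizes, using the common ~4-characters-per-token rule of thumb (actual tokenizers vary). This sketch assumes the workspace layout from this guide:

```python
# Rough token audit of the files OpenClaw loads at session start.
# Assumes ~4 characters per token, a rule of thumb — real tokenizers vary.
from pathlib import Path

STARTUP_FILES = ["AGENTS.md", "SOUL.md", "USER.md", "MEMORY.md"]

def estimate_tokens(text: str) -> int:
    return len(text) // 4

def audit(workspace: Path) -> dict:
    """Return estimated token counts for each startup file that exists."""
    sizes = {}
    for name in STARTUP_FILES:
        path = workspace / name
        if path.exists():
            sizes[name] = estimate_tokens(path.read_text())
    return sizes

if __name__ == "__main__":
    report = audit(Path.home() / "clawd")
    for name, tokens in sorted(report.items(), key=lambda kv: -kv[1]):
        print(f"{name:12} ~{tokens} tokens")
    print(f"total: ~{sum(report.values())} tokens at session start")
```

If the total lands well above 3,000 tokens, start trimming the largest file first — it is usually MEMORY.md.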
Monitoring and Maintenance
21. Run a Monthly Agent Health Check
Set a monthly calendar reminder or cron job to review your OpenClaw deployment:
- Memory audit: Review and compact MEMORY.md. Archive old daily files.
- Cost review: Check API spend against budget. Adjust model routing if needed.
- Security check: Verify gateway token age. Check for exposed ports. Review Docker logs for unusual activity.
- Skill update: Run openclaw skills update to check for updates to installed skills.
- Backup verification: Confirm automated backups are running. Test a restore.
- Performance review: Are responses slow? Check model latency. Is the VPS under-resourced?
- SOUL.md review: Is the agent behaving as expected? Update personality instructions based on recent interactions.
- Cron job audit: Are all scheduled jobs running successfully? Check logs for silent failures.
# Quick health check commands
# Check OpenClaw is running
openclaw gateway status
# Check disk space
df -h
# Check memory usage
free -h
# Review recent logs for errors
journalctl -u openclaw --since "7 days ago" | grep -i error | tail -20
# Verify backups exist
ls -la ~/backups/openclaw-*.tar.gz | tail -5

OpenClaw Best Practices Quick Reference
For quick reference, here's every practice in this guide ranked by impact:
Critical:
- Don't expose port 18789 (security, 5 min)
- Use env vars for secrets (security, 15 min)
High:
- Multi-model routing (cost savings 60-80%, 30 min)
- Write a specific SOUL.md (quality improvement, 1 hour)
- Enable sandbox mode (security, 10 min)
Medium:
- Set resource limits (stability, 15 min)
- Curate MEMORY.md weekly (quality + cost, 15 min/week)
- Stateful heartbeats (cost savings 40-60%, 30 min)
- Pin model versions (stability, 5 min)
Low:
- Monthly health check (maintenance, 30 min/month)
- Workspace organization (productivity, 30 min)
Start with the critical items. A properly secured, cost-optimized OpenClaw deployment with a well-written SOUL.md will outperform a vanilla installation by a wide margin — both in output quality and in monthly spend.
Conclusion
These best practices are based on production experience running managed OpenClaw deployments. Have a practice that should be on this list? Share it in the OpenClaw community. Skip the configuration? DoneClaw deploys and manages OpenClaw for you — $29/mo with a 7-day free trial. All best practices pre-configured.
Frequently asked questions
How often should I update SOUL.md?
Review it monthly or whenever your agent's behavior doesn't match your expectations. Small tweaks compound over time. If you find yourself repeatedly correcting the agent on something, that correction belongs in SOUL.md.
What's the ideal SOUL.md length?
Between 200 and 500 words. Under 200 words and the model defaults to generic behavior. Over 500 words and you start seeing diminishing returns — the model may ignore later instructions. Focus on specific behavioral rules, not general platitudes.
How do I know if my model routing is working?
Check your API provider dashboard for per-model usage. If 90% of your tokens go to your most expensive model, routing isn't effective. A well-configured setup typically sends 60-70% of traffic to the cheapest tier, 25-30% to mid-tier, and only 5-10% to premium models.
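You can compute that distribution directly from the per-model token counts on your dashboard. The numbers and model names below are illustrative:

```python
# Sanity-check routing effectiveness from per-model token counts
# (numbers as you'd read them off a provider dashboard; names illustrative).
usage = {"gpt-4o-mini": 4_200_000, "claude-sonnet": 1_600_000, "claude-opus": 300_000}

total = sum(usage.values())
shares = {model: tokens / total for model, tokens in usage.items()}
for model, share in sorted(shares.items(), key=lambda kv: -kv[1]):
    print(f"{model}: {share:.0%}")
# Healthy routing: the cheapest tier carries the majority of traffic.
```

Here the cheap tier carries roughly two-thirds of traffic and the premium model about 5% — within the healthy range described above.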
Should I use local models with Ollama or cloud APIs?
It depends on your priorities. Cloud APIs (Claude, GPT, Gemini) offer better quality, faster setup, and no hardware requirements. Local models (Ollama) offer complete privacy, zero per-token cost, and no internet dependency. Most users start with cloud APIs and add local models later for specific use cases like private document analysis.
How much RAM does OpenClaw need?
The OpenClaw gateway itself needs minimal resources — 512MB RAM is sufficient. The real resource demand comes from local model inference. If you're using cloud APIs only, a 1GB VPS works fine. For local models: 8GB RAM minimum for 7B parameter models, 16GB for 13B models, and 48GB+ for 70B models.
What's the difference between MEMORY.md and daily memory files?
MEMORY.md is curated long-term knowledge — like a personal wiki. Daily memory files (memory/YYYY-MM-DD.md) are raw logs of what happened each day — like a journal. The agent loads MEMORY.md every session but only loads the most recent daily files. Think of it as the difference between your resume (MEMORY.md) and your daily planner (daily files).
How do I prevent prompt injection in group chats?
Enable sandbox mode, restrict tool permissions for group chat sessions, and configure the agent to treat all group chat messages as untrusted input. OpenClaw's built-in content wrapping marks external content with EXTERNAL_UNTRUSTED_CONTENT tags, but defense-in-depth is essential. Limit what actions the agent can take in group contexts — read-only access to most tools is a safe default.
Can I run multiple agents on one server?
Yes. Each OpenClaw instance runs in its own Docker container with its own configuration, memory, and model settings. A 2GB VPS can comfortably run 2-3 agents using cloud APIs. Use Docker Compose to manage multiple instances and assign different ports and workspace volumes to each.