
OpenClaw Multi-Agent: How to Build and Orchestrate AI Agent Teams (2026)

22 min read · Updated 2026-03-18

By DoneClaw Team · We run managed OpenClaw deployments and write from hands-on production experience.

OpenClaw multi-agent setups let you run multiple AI agents that work together — each handling a different task, using a different model, or operating in a different context — all coordinated by a single orchestrator. Instead of one overloaded agent doing everything from email triage to code reviews to research, you split the work across specialized sub-agents that run in parallel, share results, and finish faster. This guide covers every aspect of OpenClaw multi-agent orchestration: the architecture behind it, how to configure sub-agents and ACP sessions, real-world workflow patterns, cost management, and troubleshooting. By the end, you'll have a production-ready multi-agent system running on your own infrastructure.

Why Multi-Agent? The Case for Splitting Work

Single-agent setups work fine for most personal use cases. You message your OpenClaw agent on Telegram, it responds, life is good. But once you start pushing an agent to handle complex, concurrent tasks — reviewing three PRs while monitoring your inbox while researching a topic — things break down.

The problems with a single overloaded agent:

  • Context window saturation: A single conversation thread accumulates tokens fast. Once you're deep into a coding task, asking the same agent to also check your email means the model juggles two unrelated contexts, reducing quality on both.
  • Sequential bottleneck: One agent processes one task at a time. If a code review takes 3 minutes, your email check waits. Multiply by five tasks and you're waiting 15 minutes for something that could finish in 3.
  • Model mismatch: Not every task needs the same model. Summarizing emails works great on a fast, cheap model like Gemini 2.5 Flash. Complex refactoring needs Claude Sonnet or Opus. A single agent forces you to pick one model for everything — either overpaying for simple tasks or underpowering complex ones.

Multi-agent orchestration solves all three. Each sub-agent gets its own context window, runs in parallel, and can use a different model optimized for its specific task. Anthropic's research on building effective agents identifies this as the "orchestrator-workers" pattern — one of the most powerful architectures for complex AI workflows.

The numbers back this up. In our testing across DoneClaw deployments, multi-agent setups complete complex workflows 2.5–4x faster than single-agent approaches while using 15–30% fewer total tokens (because each sub-agent has a cleaner, more focused context).
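The speedup from parallelism is simple arithmetic: wall time drops from the sum of all task durations to the duration of the longest single task. A quick simulation makes this concrete (sleep-based stand-ins, no real agents involved):

```python
import asyncio
import time

async def fake_task(name: str, seconds: float) -> str:
    # Stand-in for a sub-agent call; it sleeps instead of doing real work.
    await asyncio.sleep(seconds)
    return name

async def run_both() -> tuple[float, float]:
    durations = {"code-review": 0.3, "email-check": 0.1, "research": 0.2}

    # Sequential: wall time is roughly the SUM of all task durations.
    start = time.monotonic()
    for name, secs in durations.items():
        await fake_task(name, secs)
    sequential = time.monotonic() - start

    # Parallel: wall time is roughly the MAX of the task durations.
    start = time.monotonic()
    await asyncio.gather(*(fake_task(n, s) for n, s in durations.items()))
    parallel = time.monotonic() - start
    return sequential, parallel

sequential_s, parallel_s = asyncio.run(run_both())
print(f"sequential ~{sequential_s:.2f}s, parallel ~{parallel_s:.2f}s")
```

With these toy durations the sequential pass takes about 0.6s while the parallel pass takes about 0.3s, which is the same shape as the 3-minutes-versus-15-minutes example above.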

OpenClaw Multi-Agent Architecture: How It Works

OpenClaw's multi-agent system is built on two core primitives: sub-agents and ACP sessions. Understanding the difference is key to choosing the right approach for your workflow.

Sub-agents are isolated sessions spawned by your main OpenClaw agent. They inherit the parent's workspace directory, run the same OpenClaw runtime, and use the sessions_spawn tool with runtime="subagent". When a sub-agent finishes, its result is automatically delivered back to the parent session.

ACP (Agent Client Protocol) sessions connect OpenClaw to external coding agents like Codex, Claude Code, or Gemini CLI. They're more powerful for coding tasks because the external agent has its own tool set (file I/O, shell access, git operations) and can maintain persistent state across restarts.

For most multi-agent workflows, you'll use sub-agents for research, analysis, and orchestration tasks, and ACP sessions for anything involving code.

  • Sub-agent key characteristics: Fast to spawn (built-in runtime), shared workspace, model flexibility per agent, two modes (run for one-shot, session for persistent), automatic cleanup
  • ACP session key characteristics: External runtime (Codex, Claude Code, Gemini CLI via acpx), session persistence with resume, thread binding for Discord/Telegram, real-time streaming
  • Use sub-agents for: research, summaries, analysis — no setup needed, any model via model param
  • Use ACP sessions for: coding, file operations, PRs — needs CLI install, full persistence and thread binding

Step-by-Step: Setting Up Your First Multi-Agent Workflow

Before configuring multi-agent workflows, ensure you have:

  • A running OpenClaw instance (self-hosted or DoneClaw managed)
  • At least one AI model configured
  • Coding agent CLIs installed for ACP sessions (npm install -g @openai/codex @anthropic-ai/claude-code)
  • At least 2GB of free RAM (4GB recommended for parallel agents)
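A small preflight script can confirm the prerequisites before you start spawning agents. This is an illustrative sketch, not an OpenClaw tool; the CLI names (codex, claude) match the install command above, and total physical RAM is used as a rough proxy for headroom (the sysconf call works on Linux/macOS only):

```python
import os
import shutil

def preflight() -> dict:
    # Check that the coding-agent CLIs are on PATH and report available RAM.
    report = {
        "codex_cli": shutil.which("codex") is not None,
        "claude_cli": shutil.which("claude") is not None,
    }
    # Total physical RAM in GB (not available on Windows).
    try:
        ram_bytes = os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")
        report["ram_gb"] = round(ram_bytes / 1024**3, 1)
    except (ValueError, OSError, AttributeError):
        report["ram_gb"] = None
    return report

report = preflight()
print(report)
```

If either CLI check comes back False, ACP sessions for that agent will fail to start (see the troubleshooting section for the fix).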

Step 1: Configure Your Agent Allowlist

OpenClaw requires you to explicitly allow which agents can be spawned. Edit your openclaw.json:

{
  "agents": {
    "allowlist": ["research-agent", "code-reviewer", "email-agent"],
    "list": [
      {
        "id": "research-agent",
        "model": "google/gemini-2.5-flash",
        "description": "Fast research and web searches"
      },
      {
        "id": "code-reviewer",
        "model": "anthropic/claude-sonnet-4",
        "description": "Deep code analysis and PR reviews"
      },
      {
        "id": "email-agent",
        "model": "google/gemini-2.5-flash",
        "description": "Email triage and draft responses"
      }
    ]
  }
}

Restart your gateway after editing:

openclaw gateway restart
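A mismatch between the allowlist and the agent list is an easy typo to make. This quick check is not an OpenClaw tool, just an illustrative sketch assuming that every allowlisted id should have a matching entry in agents.list (as in the example config above):

```python
import json

def validate_agents_config(raw: str) -> list[str]:
    # Returns a list of problems; an empty list means allowlist and list agree.
    cfg = json.loads(raw)
    agents = cfg.get("agents", {})
    allow = set(agents.get("allowlist", []))
    defined = {a.get("id") for a in agents.get("list", [])}
    problems = []
    for missing in sorted(allow - defined):
        problems.append(f"allowlisted agent '{missing}' has no entry in agents.list")
    for unused in sorted(defined - allow):
        problems.append(f"agent '{unused}' is defined but not allowlisted")
    return problems

config = """
{
  "agents": {
    "allowlist": ["research-agent", "code-reviewer", "email-agent"],
    "list": [
      {"id": "research-agent", "model": "google/gemini-2.5-flash"},
      {"id": "code-reviewer", "model": "anthropic/claude-sonnet-4"},
      {"id": "email-agent", "model": "google/gemini-2.5-flash"}
    ]
  }
}
"""
problems = validate_agents_config(config)
print(problems)  # []
```

Run it against your edited file before restarting the gateway; catching a bad id here is cheaper than debugging a spawn failure later.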

Step 2: Spawn Your First Sub-Agent

From your main agent session (or via the API), spawn a sub-agent:

The sub-agent runs in isolation, completes the research, and returns the result to your main session. You don't need to poll or check — it's push-based.

{
  "task": "Search the web for the latest benchmarks comparing Claude Sonnet 4 vs GPT-5 on coding tasks. Summarize the top 3 findings with specific numbers.",
  "runtime": "subagent",
  "agentId": "research-agent",
  "mode": "run"
}

Step 3: Run Multiple Sub-Agents in Parallel

Here's where multi-agent shines. Spawn several agents simultaneously:

All three run simultaneously. Results arrive as each completes — typically within 30–90 seconds for most tasks.

// Agent 1: Research
{
  "task": "Find the top 5 competitors to our product and summarize their pricing",
  "runtime": "subagent",
  "agentId": "research-agent",
  "mode": "run",
  "label": "competitor-research"
}

// Agent 2: Code review (simultaneously)
{
  "task": "Review the latest PR on our main repo. Focus on security issues.",
  "runtime": "acp",
  "agentId": "codex",
  "mode": "run",
  "label": "pr-review"
}

// Agent 3: Email triage (simultaneously)
{
  "task": "Check the inbox for urgent emails from the last 4 hours. Summarize anything that needs immediate attention.",
  "runtime": "subagent",
  "agentId": "email-agent",
  "mode": "run",
  "label": "email-check"
}
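The three spawn payloads above map naturally onto an async fan-out/fan-in. The sketch below simulates that shape; spawn() is a hypothetical stub standing in for the real sessions_spawn call, which the source does not specify as a client library:

```python
import asyncio

async def spawn(payload: dict) -> dict:
    # Hypothetical stand-in for sessions_spawn: a real client would call the
    # OpenClaw API. Here each "agent" just sleeps and echoes a canned result.
    await asyncio.sleep(0.05)
    return {"label": payload["label"], "result": f"done: {payload['task'][:30]}"}

async def fan_out(payloads: list[dict]) -> dict[str, str]:
    # Spawn all agents at once; gather returns once every agent has finished.
    finished = await asyncio.gather(*(spawn(p) for p in payloads))
    return {r["label"]: r["result"] for r in finished}

payloads = [
    {"task": "Find the top 5 competitors and summarize pricing", "label": "competitor-research"},
    {"task": "Review the latest PR for security issues", "label": "pr-review"},
    {"task": "Check the inbox for urgent emails", "label": "email-check"},
]
results = asyncio.run(fan_out(payloads))
print(sorted(results))
```

The labels double as keys for collecting results, which is one reason the management section below recommends labeling every spawn.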

Step 4: Orchestrate with the Main Agent

Your main OpenClaw agent acts as the orchestrator. After spawning sub-agents, it can:

  • Yield and wait: call sessions_yield to pause the main turn and receive sub-agent results as the next message
  • Continue working while sub-agents run in the background
  • Steer running agents with follow-up instructions via sessions_send
  • Kill runaway agents that are stuck or no longer needed

// Steer a running sub-agent
{
  "action": "steer",
  "target": "competitor-research",
  "message": "Also include their free tier details"
}

// Kill a stuck agent
{
  "action": "kill",
  "target": "pr-review"
}

Real-World Multi-Agent Patterns

Here are four production-tested patterns for multi-agent workflows.

Pattern 1: The Morning Command Center

Spawn four agents simultaneously at 7am via cron:

This replaces 20+ minutes of manual morning triage with a single consolidated briefing delivered to your Telegram, assembled from four specialized agents that each know exactly what to look for.

{
  "schedule": { "kind": "cron", "expr": "0 7 * * *", "tz": "America/New_York" },
  "payload": {
    "kind": "agentTurn",
    "message": "Run the morning command center: spawn 4 sub-agents in parallel — (1) check email inbox for urgent items, (2) review today's calendar and prep meeting notes, (3) check GitHub notifications and summarize open PRs, (4) get weather and commute conditions. Compile all results into a single morning briefing.",
    "timeoutSeconds": 300
  },
  "sessionTarget": "isolated",
  "delivery": { "mode": "announce" }
}

Pattern 2: Parallel Code Review Pipeline

When a PR lands, spawn multiple reviewers that each focus on different aspects:

Three reviewers finish in the time it takes one to run. The orchestrator combines their findings into a single consolidated review comment.

// Security reviewer
{
  "task": "Review PR #142 for security vulnerabilities. Check for SQL injection, XSS, auth bypass, and insecure dependencies. Output a severity-rated list.",
  "runtime": "acp",
  "agentId": "claude",
  "mode": "run",
  "label": "security-review"
}

// Performance reviewer
{
  "task": "Review PR #142 for performance issues. Check for N+1 queries, unnecessary re-renders, missing indexes, and memory leaks. Include benchmarks if possible.",
  "runtime": "acp",
  "agentId": "codex",
  "mode": "run",
  "label": "perf-review"
}

// Style reviewer
{
  "task": "Review PR #142 for code style, readability, and adherence to project conventions. Check naming, documentation, and test coverage.",
  "runtime": "subagent",
  "agentId": "research-agent",
  "mode": "run",
  "label": "style-review"
}

Pattern 3: Research → Analyze → Act Pipeline

Chain agents sequentially when each step depends on the previous: (1) Research agent gathers raw data from 5+ sources, (2) Analysis agent processes the data, identifies patterns, computes metrics, (3) Action agent drafts a report, email, or PR based on the analysis.

// Step 1: Research (spawned first)
{
  "task": "Research the latest trends in AI agent frameworks. Find 5 recent articles, extract key data points, and save raw findings to /tmp/research-output.md",
  "runtime": "subagent",
  "mode": "run",
  "label": "research-step"
}

// After research completes, the orchestrator reads the output and spawns:

// Step 2: Analysis
{
  "task": "Read /tmp/research-output.md and analyze the findings. Identify the top 3 trends, compute growth percentages, and rank frameworks by adoption. Save analysis to /tmp/analysis-output.md",
  "runtime": "subagent",
  "mode": "run",
  "label": "analysis-step"
}

// After analysis completes:

// Step 3: Action
{
  "task": "Read /tmp/analysis-output.md and draft a 500-word blog post summarizing the top AI agent framework trends for our audience. Include specific numbers from the analysis.",
  "runtime": "subagent",
  "mode": "run",
  "label": "draft-step"
}
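The three-stage pipeline above is sequential by design: each stage reads the file the previous stage wrote. Stripped of the agents themselves, the data flow looks like this (plain functions stand in for spawned sub-agents; file names are illustrative):

```python
import tempfile
from pathlib import Path

# Each function is a stand-in for a spawned sub-agent: it reads the previous
# stage's output file, transforms it, and writes its own output file.
def research(out: Path) -> None:
    out.write_text("frameworks: A, B, C\n")

def analyze(inp: Path, out: Path) -> None:
    names = inp.read_text().split(":")[1].strip()
    out.write_text(f"top trend among {names}: orchestration\n")

def draft(inp: Path) -> str:
    return "Draft post based on: " + inp.read_text().strip()

workdir = Path(tempfile.mkdtemp())
research_out = workdir / "research-output.md"
analysis_out = workdir / "analysis-output.md"

research(research_out)               # step 1
analyze(research_out, analysis_out)  # step 2 (waits for step 1)
post = draft(analysis_out)           # step 3 (waits for step 2)
print(post)
```

Because each handoff is a file with an absolute path, the orchestrator only needs to wait for one stage to finish before spawning the next, exactly as described above.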

Pattern 4: Multi-Model Arbitration

Use different models for the same task and pick the best result:

The orchestrator receives all 15 taglines and can either pick the best or run a fourth agent to evaluate and rank them.

// Same task, three different models
{
  "task": "Write a marketing tagline for our AI agent platform. Give 5 options.",
  "runtime": "subagent",
  "model": "anthropic/claude-sonnet-4",
  "mode": "run",
  "label": "taglines-claude"
}

{
  "task": "Write a marketing tagline for our AI agent platform. Give 5 options.",
  "runtime": "subagent",
  "model": "openai/gpt-5",
  "mode": "run",
  "label": "taglines-gpt"
}

{
  "task": "Write a marketing tagline for our AI agent platform. Give 5 options.",
  "runtime": "subagent",
  "model": "google/gemini-2.5-pro",
  "mode": "run",
  "label": "taglines-gemini"
}
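The arbitration step at the end can be as simple as a scoring function over the pooled candidates. This sketch uses a toy heuristic; in practice you would spawn a fourth agent to judge, as the text suggests, and the candidate taglines here are invented placeholders:

```python
# Pooled candidates keyed by the spawn labels used above (contents invented).
candidates = {
    "taglines-claude": ["Your agents, orchestrated.", "One brain, many hands."],
    "taglines-gpt": ["Automate everything, everywhere."],
    "taglines-gemini": ["Agents that work while you sleep."],
}

def score(tagline: str) -> float:
    # Toy heuristic: reward on-topic keywords, lightly penalize length.
    bonus = 2.0 if any(w in tagline.lower() for w in ("agents", "orchestrated")) else 0.0
    return bonus - len(tagline) / 100

ranked = sorted(
    ((label, t) for label, taglines in candidates.items() for t in taglines),
    key=lambda pair: score(pair[1]),
    reverse=True,
)
best_label, best = ranked[0]
print(best_label, "->", best)
```

The orchestrator keeps the winning tagline and can discard or archive the rest; the label tells you which model produced it.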

Skip 60 minutes of setup — deploy in 60 seconds

DoneClaw handles Docker, servers, security, and updates. Your OpenClaw agent is ready to chat in under a minute.

Deploy Now

ACP Multi-Agent Configuration Deep Dive

For teams running multiple coding agents, here's a production-ready ACP configuration:

{
  "acp": {
    "enabled": true,
    "backend": "acpx",
    "defaultAgent": "codex",
    "dispatch": { "enabled": true },
    "allowedAgents": ["codex", "claude", "gemini"]
  },
  "agents": {
    "list": [
      {
        "id": "codex",
        "runtime": {
          "type": "acp",
          "acp": {
            "agent": "codex",
            "backend": "acpx",
            "mode": "persistent",
            "cwd": "/workspace/projects"
          }
        }
      },
      {
        "id": "claude",
        "runtime": {
          "type": "acp",
          "acp": {
            "agent": "claude",
            "backend": "acpx",
            "mode": "persistent"
          }
        }
      },
      {
        "id": "gemini",
        "runtime": {
          "type": "acp",
          "acp": {
            "agent": "gemini",
            "backend": "acpx",
            "mode": "persistent"
          }
        }
      }
    ]
  }
}

Thread Binding for Multi-Agent Coding

Enable thread bindings so each coding agent gets its own dedicated thread:

With thread binding enabled, telling your agent "start a Codex session for the auth module" creates a new Discord thread (or Telegram topic) dedicated to that coding agent. Every message in that thread goes directly to Codex, and all Codex responses appear there. Meanwhile, you can start a Claude Code session in another thread for a different part of the codebase — both running simultaneously without interference.

{
  "channels": {
    "discord": {
      "threadBindings": {
        "spawnAcpSessions": true
      }
    },
    "telegram": {
      "threadBindings": {
        "spawnAcpSessions": true
      }
    }
  }
}

Cost Management for Multi-Agent Setups

Running multiple agents multiplies your API costs. Here's how to keep spending under control.

Cost per task type (multi-agent):

  • Research sub-agent: 3K–8K tokens — $0.01–$0.02 (Claude Sonnet) or $0.001–$0.003 (Gemini Flash)
  • Email triage sub-agent: 5K–15K tokens — $0.02–$0.05 (Claude Sonnet) or $0.002–$0.006 (Gemini Flash)
  • Code review (ACP): 15K–50K tokens — $0.05–$0.15 (Claude Sonnet)
  • Full coding task (ACP): 30K–100K tokens — $0.10–$0.30 (Claude Sonnet)
  • Morning briefing (4 agents): 20K–40K tokens total — $0.06–$0.12 (Claude Sonnet) or $0.008–$0.016 (Gemini Flash)

Monthly cost estimates (assuming a mix of cheap models for simple tasks and premium models for complex work):

  • Light personal use (5–10 sub-agents/day): $5–$15/month
  • Moderate developer use (15–30 sub-agents + ACP/day): $20–$60/month
  • Heavy team/business use (50+ sub-agents + ACP/day): $80–$200+/month

  • Use the cheapest model that works: Research and email agents don't need Claude Opus. Gemini 2.5 Flash handles them at 1/10th the cost.
  • Set timeouts on every sub-agent: Prevent runaway sessions that burn tokens with runTimeoutSeconds.
  • Use mode: "run" for one-shot tasks: Don't create persistent sessions when you just need a quick answer.
  • Batch related requests: Instead of spawning 5 agents for 5 similar tasks, spawn 1 agent for all 5.
  • Monitor with session_status: Check token usage after multi-agent runs to identify expensive patterns.
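The monthly estimates above are just per-task cost times daily volume times days. A back-of-envelope sketch, using midpoints of the Claude Sonnet ranges quoted earlier (the task mix is an invented example):

```python
# Midpoints of the per-task dollar ranges quoted above (Claude Sonnet).
TASK_COST_USD = {
    "research": 0.015,
    "email_triage": 0.035,
    "code_review_acp": 0.10,
}

def monthly_cost(tasks_per_day: dict, days: int = 30) -> float:
    # Sum the daily spend across task types, then project over the month.
    daily = sum(TASK_COST_USD[t] * n for t, n in tasks_per_day.items())
    return round(daily * days, 2)

# Example mix: ~10 research + 5 email + 3 code-review runs per day.
estimate = monthly_cost({"research": 10, "email_triage": 5, "code_review_acp": 3})
print(f"~${estimate}/month")  # ~$18.75/month
```

Swapping the research and email tasks to Gemini Flash prices drops the daily figure by roughly 90% for those task types, which is why the first tip above matters most.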

Managing Running Agents

Check what's currently running with the list action and recentMinutes parameter. View what a sub-agent has been doing with session history. Kill stuck agents with the kill action.

  • Always use labels: Give every sub-agent a descriptive label so you can identify and manage them later
  • Set timeouts: Every spawned agent should have a runTimeoutSeconds to prevent indefinite execution
  • Don't poll in loops: Sub-agent completion is push-based — don't waste tokens polling subagents list repeatedly
  • Use sessions_yield: After spawning agents, yield your turn to receive results as the next message instead of blocking

// List active sub-agents
{
  "action": "list",
  "recentMinutes": 30
}

// Monitor session history
{
  "sessionKey": "agent-session-key",
  "limit": 20,
  "includeTools": true
}

// Kill stuck agents
{
  "action": "kill",
  "target": "stuck-agent-label"
}

Troubleshooting Multi-Agent Issues

Common issues and their fixes for multi-agent setups.

Problem: Sub-Agent Returns Empty Result

Cause: The task description was too vague, or the agent hit a timeout before producing output.

Fix: Make task descriptions specific and actionable. Increase runTimeoutSeconds for complex tasks. Check the session history for errors using sessions_history with includeTools: true.

Problem: ACP Session Won't Start

Cause: The external coding agent CLI isn't installed, or acpx isn't configured.

Fix: Verify installations and test a simple ACP session.

# Verify installations
codex --version
claude --version
acpx --version

# If acpx is missing:
npm install -g @openclaw/acpx

# Test a simple ACP session:
acpx codex 'echo hello'

Problem: Too Many Concurrent Agents Crash the Server

Cause: Each agent consumes memory and CPU. Running 10+ simultaneously on a small VPS will cause issues.

Fix: Limit concurrent sub-agents to 3–5 on 2GB RAM, 5–8 on 4GB RAM. Use mode: "run" instead of mode: "session" to free resources after completion. Stagger agent spawning with delays if needed. Monitor server resources with htop or free -m.
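One way to enforce a concurrency cap is to gate every spawn through a semaphore. This is an illustrative pattern, not an OpenClaw feature; spawn_capped() wraps a stubbed agent run, and MAX_CONCURRENT follows the RAM guidance above:

```python
import asyncio

MAX_CONCURRENT = 4  # e.g. 3-5 on a 2GB box, 5-8 on 4GB, per the guidance above
peak = 0
active = 0

async def spawn_capped(sem: asyncio.Semaphore, task_id: int) -> int:
    # Gate each spawn through the semaphore so that no more than
    # MAX_CONCURRENT "agents" run at once; track the observed peak.
    global active, peak
    async with sem:
        active += 1
        peak = max(peak, active)
        await asyncio.sleep(0.01)  # stand-in for the actual agent run
        active -= 1
    return task_id

async def main() -> None:
    sem = asyncio.Semaphore(MAX_CONCURRENT)
    # 10 tasks requested, but only MAX_CONCURRENT ever run simultaneously.
    await asyncio.gather(*(spawn_capped(sem, i) for i in range(10)))

asyncio.run(main())
print(f"10 tasks, peak concurrency: {peak}")
```

The same idea applies whether the gate lives in a cron payload, a wrapper script, or the orchestrator's own instructions: queue the excess rather than spawning it.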

Problem: Sub-Agent Can't Access Files

Cause: Sub-agents inherit the parent workspace, but ACP sessions may use a different working directory.

Fix: For sub-agents, files in /root/clawd/ are automatically accessible. For ACP sessions, set cwd explicitly in your agent config. Use absolute paths when sharing files between agents.

Problem: Results Don't Arrive After Spawning

Cause: The main session didn't yield, so results queue up.

Fix: Call sessions_yield after spawning to receive results as the next message. Or check subagents list to see if agents completed.

Security Considerations for Multi-Agent Setups

Running multiple agents with tool access requires careful security planning.

  • Principle of least privilege: Each agent should only have access to the tools it needs. Don't give your email-triage agent shell access.
  • Sandbox ACP sessions: Codex's --full-auto mode runs in a sandboxed environment. Use it over --yolo mode for untrusted tasks.
  • Validate agent outputs: When one agent's output feeds into another's input, validate intermediate results. Don't let a research agent inject arbitrary instructions into a coding agent.
  • Separate API keys: Use different API keys for different agent types so you can revoke access granularly and track spending per agent.
  • Network isolation: If possible, run coding agents in Docker containers with restricted network access to prevent data exfiltration.

Comparing OpenClaw Multi-Agent to Other Frameworks

How OpenClaw compares to CrewAI, LangGraph, and AutoGPT:

  • OpenClaw: built-in sub-agents + ACP orchestration; any model per agent; full persistence with memory and sessions; native ACP coding agents (Codex, Claude Code); Telegram/Discord/WhatsApp channel integration; low setup complexity (JSON config); high production readiness (Docker, managed hosting)
  • CrewAI: role-based crews with any model; limited persistence; no native coding agent support; API only (no channels); medium complexity (Python code)
  • LangGraph: graph-based workflows with checkpoints; no native coding agents or channels; high complexity (Python + graph logic)
  • AutoGPT: task-based chains, primarily with GPT; file-based persistence; web UI only; low production readiness

OpenClaw's advantage is the combination of multi-agent orchestration with messaging channel integration and persistent memory. You can spawn agents from a Telegram message and get results back in the same chat — no separate dashboard or API required.

Conclusion

OpenClaw multi-agent orchestration transforms a single AI assistant into a team of specialized agents that work in parallel, use different models for different tasks, and deliver results faster than any single agent could alone. The setup is straightforward — define your agents in JSON, spawn them with sessions_spawn, and let the orchestrator coordinate.

Start simple: pick one workflow (like the morning briefing) and split it across two sub-agents. Measure the time savings and cost. Then expand to more complex patterns like parallel code reviews or research-analyze-act pipelines. The key insight is that multi-agent isn't about replacing your single agent — it's about giving it a team to delegate to.

Ready to build your multi-agent system? Deploy OpenClaw with DoneClaw and skip the infrastructure setup, or follow our deployment guide to self-host.

Skip the setup? DoneClaw deploys OpenClaw for you — $29/mo with 7-day free trial, zero configuration.


Frequently asked questions

How many sub-agents can I run simultaneously?

There's no hard limit in OpenClaw, but your server resources are the practical constraint. On a 2GB RAM VPS, plan for 3–5 concurrent sub-agents comfortably. On 4GB RAM, you can run 8–10. Each sub-agent consumes approximately 100–200MB of memory depending on the model and task complexity.

Can sub-agents communicate with each other directly?

Not directly. Sub-agents communicate through the parent (orchestrator) agent. Agent A's results flow to the parent, which can then pass relevant information to Agent B. This is by design — it maintains a clear chain of command and prevents agents from creating circular dependencies.

Do I need separate API keys for each sub-agent?

No. Sub-agents use the same API keys configured in your OpenClaw instance. They inherit the parent's model configuration unless you explicitly override it with the model parameter. ACP sessions may use separate keys if the external coding agent has its own API key configuration.

Is multi-agent faster than a single powerful agent?

For parallelizable tasks, yes — significantly faster. Three sub-agents running simultaneously finish in the time of the slowest one, not the sum of all three. In benchmarks, parallel multi-agent workflows complete 2.5–4x faster than equivalent single-agent approaches.

Can I mix sub-agents and ACP sessions in the same workflow?

Absolutely. This is the recommended approach for complex workflows. Use sub-agents for research, analysis, and orchestration tasks (they're faster to spawn and cheaper to run), and ACP sessions for coding tasks that need file access, shell commands, and persistent workspace state.