Multi-Agent Architecture — Fleet Setup & Best Practices

Configure multiple OpenClaw agents running 24/7, set up persistent memory, handle OAuth token issues, use multiple models, and fix exec session timeouts.

⚠️ The Problem

Users running multi-agent fleets (e.g., coordinator + developer + content agents) encounter several challenges:

  1. Context overflow - Long-running sessions exceed model context limits
  2. Memory persistence - Agent state doesn't survive session resets
  3. OAuth token errors - Requests fail with "OAuth token refresh failed for anthropic: Failed to refresh OAuth token"
  4. Exec session timeouts - CLI commands work in a terminal but time out when an agent runs them
  5. Agent misbehavior after upgrade - Crons not firing, soul not updating, unintended skill publishing
  6. Multiple model configuration - Different tasks call for different models

🔍 Why This Happens

Context Overflow: Running agents 24/7 in a single session accumulates context until it exceeds model limits (typically 100k-200k tokens). Without compaction, every message adds to the context.

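OAuth Token Errors: The Anthropic credential that OpenClaw holds is no longer usable. For API key users, the key has expired, been revoked, or reached its rate limits. Claude Max subscriptions authenticate through OAuth rather than an API key, so the same error appears when the stored OAuth token is stale or when the subscription is configured as if it were API access.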

Exec Timeouts: The Gateway binds to 127.0.0.1 by default, but exec sessions may run in isolated network namespaces where localhost routing differs. Commands fail with:

bash
Error: gateway timeout after 10000ms

Post-Upgrade Issues: The clawdbot → openclaw migration may leave orphaned configs, cron schedules, or session state that reference old paths.

Memory Loss: Without explicit file-based memory (MEMORY.md, daily notes), agent state only exists in the current session context.

The Fix

Production Architecture for 24/7 Agent Fleet

The recommended architecture uses OpenClaw as the control plane with Markdown files as source-of-truth memory:

1. Memory Layer Setup

Create a structured memory hierarchy in your workspace:

bash
~/clawd/
├── MEMORY.md           # Curated durable facts, principles, "how we work"
├── IDENTITY.md         # Agent personality and instructions
├── memory/
│   └── YYYY-MM-DD.md   # Daily logs (auto-created)
└── projects/
    ├── project-a/
    └── project-b/
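A minimal sketch for scaffolding that layout, assuming only the directory and file names shown above. `scaffold_workspace` is a hypothetical helper, not an OpenClaw command, and the placeholder headings it writes are illustrative.

```bash
#!/usr/bin/env bash
set -euo pipefail

# scaffold_workspace DIR
# Creates the memory layout described above under DIR (hypothetical helper).
scaffold_workspace() {
  local root="$1"

  # -p makes this safe to re-run on an existing workspace.
  mkdir -p "$root/memory" "$root/projects"

  # Seed the top-level files only if missing, so existing memory is never overwritten.
  local name
  for name in MEMORY IDENTITY; do
    if [ ! -e "$root/$name.md" ]; then
      printf '# %s\n' "$name" > "$root/$name.md"
    fi
  done
}

# Usage: scaffold_workspace "$HOME/clawd"
```

Because existing files are left alone, the function can be run again after an upgrade without touching curated memory.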

Each agent (Jarvis, CLU, Cortana) gets its own workspace:

bash
mkdir -p ~/agents/jarvis ~/agents/clu ~/agents/cortana

2. Per-Task Sessions (Not Forever-Sessions)

Don't run one eternal session. Instead:

  • Coordinator (Jarvis): Long-lived main session for human chat
  • Workers (CLU, Cortana): Spawned as subagents per gig/task
  • Crons/Hooks: Isolated sessions that complete and exit
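One way to picture the "complete and exit" pattern for crons and hooks, as a sketch: a wrapper runs a single task under a time limit, appends the outcome to the day's memory file, and returns. `run_task` and `WORKSPACE` are hypothetical names, the wrapped command is whatever your job would invoke, and nothing here is an OpenClaw API. It assumes GNU `timeout` is available.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Assumption: same layout as the memory hierarchy above.
WORKSPACE="${WORKSPACE:-$HOME/clawd}"

# run_task LABEL TIMEOUT_SECONDS COMMAND [ARGS...]
# Runs one command with a hard time limit and logs the result to today's note.
run_task() {
  local label="$1" limit="$2"
  shift 2
  local note
  note="$WORKSPACE/memory/$(date +%F).md"
  mkdir -p "$WORKSPACE/memory"

  local status=0
  timeout "$limit" "$@" || status=$?

  # timeout(1) exits with 124 when the limit is hit.
  printf -- '- %s task=%s exit=%s\n' "$(date +%T)" "$label" "$status" >> "$note"
  return "$status"
}

# Usage: run_task "disk-check" 30 df -h /
```

The point of the design is that the only state that outlives the task is the line written to the Markdown file, which matches the file-based memory approach above.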

3. Enable Auto-Compaction

Prevent context overflow by enabling compaction:

bash
openclaw config set agent.compaction.enabled true
openclaw config set agent.compaction.threshold 50000

This summarizes old context when approaching limits. See: https://docs.openclaw.ai/concepts/compaction
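To get a rough sense of how much context a set of files would consume, a sketch like the following can help. It assumes the common rule of thumb of about four bytes per token, which is only an approximation and varies by model and language. `estimate_tokens` and `THRESHOLD` are illustrative names, and treating the threshold as a token count is also an assumption.

```bash
#!/usr/bin/env bash
set -euo pipefail

THRESHOLD=50000   # assumption: same number as the compaction threshold, read as tokens

# estimate_tokens FILE...
# Prints an approximate token count using ~4 bytes per token (heuristic).
estimate_tokens() {
  local bytes
  bytes=$(cat -- "$@" | wc -c)
  echo $(( bytes / 4 ))
}

# Example: check the curated memory file if it exists.
memory_file="${WORKSPACE:-$HOME/clawd}/MEMORY.md"
if [ -f "$memory_file" ]; then
  tokens=$(estimate_tokens "$memory_file")
  if [ "$tokens" -gt "$THRESHOLD" ]; then
    echo "MEMORY.md is ~${tokens} tokens, above ${THRESHOLD}: consider trimming it"
  else
    echo "MEMORY.md is ~${tokens} tokens"
  fi
fi
```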


Fix OAuth Token Refresh Errors

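When you see the following error, refresh your Anthropic credentials using whichever path below matches your account type: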

bash
⚠️ Agent failed before reply: OAuth token refresh failed for anthropic: Failed to refresh OAuth token

For API Key Users

  1. Get a fresh API key from https://console.anthropic.com/
  2. Update your config:
bash
openclaw configure anthropic

Or manually:

bash
openclaw config set anthropic.apiKey "sk-ant-api03-..."
  3. Restart the gateway:
bash
openclaw gateway restart

For Claude Max Subscribers

Claude Max (subscription) uses OAuth, not API keys. If the stored OAuth tokens have gone stale, sign in again:

  1. Re-authenticate:
bash
openclaw auth anthropic
  2. Follow the browser OAuth flow

  3. Restart:

bash
openclaw gateway restart

Note: If the bot still responds despite the error, it is most likely working from a cached session. The error means the token refresh is failing on every request, so fix it now rather than waiting for the cached session to stop working.
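To confirm the fix took effect, one option is to count how often the error still appears in whatever log output you capture from the gateway. The sketch below assumes you already have that output in a file. The log path is a placeholder, since the actual location depends on your installation, and `count_oauth_failures` is a hypothetical helper.

```bash
#!/usr/bin/env bash
set -euo pipefail

# count_oauth_failures LOGFILE
# Prints how many lines contain the refresh error quoted above.
count_oauth_failures() {
  # grep -c exits 1 when there are no matches, so tolerate that case.
  grep -c -- 'OAuth token refresh failed' "$1" || true
}

# Usage (placeholder path, adjust to wherever your gateway logs go):
#   count_oauth_failures /path/to/gateway.log
```

A count that stops growing after re-authenticating and restarting suggests the refresh is working again.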


Fix CLI Timeout in Exec Sessions

When terminal commands work but agent exec fails:

Terminal (works):

bash
$ openclaw cron list
# Returns immediately

Agent exec (fails):

bash
Error: gateway timeout after 10000ms

Diagnosis

Check what the gateway is listening on:

bash
ss -tulpn | grep 18789

If you see 127.0.0.1:18789, the gateway is listening on the loopback interface only, which is the likely cause.
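The same check can be scripted. This sketch reads `ss` output on standard input and prints the local address bound to a given port. `listen_addr` is a hypothetical helper. It assumes only that the local address appears as the first field ending in `:<port>`, which holds for `ss` listings of listening sockets.

```bash
#!/usr/bin/env bash
set -euo pipefail

# listen_addr PORT
# Reads ss output on stdin and prints the local address bound to PORT, one per match.
listen_addr() {
  local port="$1"
  awk -v port="$port" '
    {
      for (i = 1; i <= NF; i++) {
        # The local address is the first field that ends in ":<port>".
        if ($i ~ (":" port "$")) {
          addr = $i
          sub(":" port "$", "", addr)
          print addr
          next
        }
      }
    }'
}

# Example: only runs where ss is installed.
if command -v ss >/dev/null 2>&1; then
  ss -tuln | listen_addr 18789 || true
fi
```

An output of `127.0.0.1` means loopback only, while `0.0.0.0` means all IPv4 interfaces.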

Solution: Bind to All Interfaces

Edit ~/.openclaw/config.json5:

json5
{
  "gateway": {
    "host": "0.0.0.0",  // Changed from 127.0.0.1
    "port": 18789
  }
}

Or use the bind shortcut:

bash
openclaw config set gateway.bind "lan"

Restart:

bash
openclaw gateway restart

Verify:

bash
ss -tulpn | grep 18789
# Should show: tcp LISTEN 0 511 0.0.0.0:18789

Why This Happens

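Exec sessions may run in an isolated network namespace (for example, inside a container). Each network namespace has its own loopback interface, so 127.0.0.1 inside the exec session is not the same 127.0.0.1 the Gateway is listening on. Binding to 0.0.0.0 makes the Gateway listen on every interface, so it can be reached from those contexts through a non-loopback address. It also makes the port reachable from other hosts on your network, so restrict access with a firewall if the machine is not on a trusted network.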
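To confirm whether a given context can reach the Gateway at all, a quick probe run from inside that context is useful. This sketch relies on bash's built-in `/dev/tcp` redirection and GNU `timeout`. `port_open` is a hypothetical helper, and 18789 is the port used throughout this guide.

```bash
#!/usr/bin/env bash
set -euo pipefail

# port_open HOST PORT
# Succeeds if a TCP connection can be opened within 2 seconds.
port_open() {
  # Inside the child shell, $0 is HOST and $1 is PORT.
  timeout 2 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}

if port_open 127.0.0.1 18789; then
  echo "Gateway reachable on 127.0.0.1:18789 from this context"
else
  echo "Gateway NOT reachable on 127.0.0.1:18789 from this context"
fi
```

If the probe succeeds in your terminal but fails when the agent runs it, the two are not sharing a loopback interface.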


Configure Multiple Models

Use different models for different tasks:

Interactive Model Selection

bash
openclaw config

Choose "models", then press Space to select multiple models, Enter to confirm.

Recommended Multi-Model Setup

json5
{
  "models": {
    "default": "anthropic/claude-sonnet-4-20250514",
    "coding": "anthropic/claude-sonnet-4-20250514",
    "creative": "anthropic/claude-opus-4-0",
    "fast": "moonshotai/kimi-k2.5"
  }
}

Free/Cheap Model Options


Factory Reset (Preserve Key Configs)

If your agent is misbehaving after upgrade:

1. Backup Critical Files

bash
mkdir -p ~/openclaw-backup
cp -r ~/.openclaw/config.json5 ~/openclaw-backup/
cp -r ~/clawd/MEMORY.md ~/openclaw-backup/
cp -r ~/clawd/IDENTITY.md ~/openclaw-backup/
cp -r ~/clawd/memory/ ~/openclaw-backup/
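A slightly more defensive variant, as a sketch: it writes each backup to a timestamped directory so repeated runs do not overwrite each other, skips anything that does not exist, and saves the output of `openclaw cron list` so jobs can be re-added later. `backup_openclaw`, `SRC_HOME`, and `OPENCLAW_BIN` are illustrative names introduced here, and the paths are the ones listed above.

```bash
#!/usr/bin/env bash
set -euo pipefail

# backup_openclaw DEST_ROOT
# Copies key config and memory files into DEST_ROOT/<timestamp>/ and prints that path.
backup_openclaw() {
  local src_home="${SRC_HOME:-$HOME}"        # assumption: override hook for testing
  local cli="${OPENCLAW_BIN:-openclaw}"      # assumption: override hook for testing
  local dest
  dest="$1/$(date +%Y%m%d-%H%M%S)"
  mkdir -p "$dest"

  local item
  for item in \
    "$src_home/.openclaw/config.json5" \
    "$src_home/clawd/MEMORY.md" \
    "$src_home/clawd/IDENTITY.md" \
    "$src_home/clawd/memory"
  do
    if [ -e "$item" ]; then
      cp -R "$item" "$dest/"
    else
      echo "skip (not found): $item" >&2
    fi
  done

  # Keep a copy of the current cron list, if the CLI is available.
  if command -v "$cli" >/dev/null 2>&1; then
    "$cli" cron list > "$dest/cron-list.txt" || echo "could not list crons" >&2
  fi

  echo "$dest"
}

# Usage: backup_openclaw "$HOME/openclaw-backup"
```

Run it while the gateway is still up, since listing crons may need a running gateway.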

2. Clean State Reset

bash
openclaw gateway stop
rm -rf ~/.openclaw/sessions/
rm -rf ~/.openclaw/cache/

3. Reset Crons

bash
openclaw cron clear
openclaw cron list  # Should be empty

4. Restore and Restart

bash
openclaw gateway start

5. Re-add Crons

Manually re-add your scheduled tasks:

bash
openclaw cron add "0 9 * * *" "Good morning check-in"
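The schedule in that example has five fields, which in standard cron syntax read as minute, hour, day of month, month, and day of week, so `0 9 * * *` means 09:00 every day. Assuming `openclaw cron add` expects that five-field form (the example above is consistent with it, but this guide does not state it), a small pre-check can catch malformed schedules before you add them. `is_five_field_cron` is a hypothetical helper that only counts fields and does not validate their values.

```bash
#!/usr/bin/env bash
set -euo pipefail

# is_five_field_cron EXPR
# Succeeds when EXPR has exactly five whitespace-separated fields:
# minute, hour, day of month, month, day of week.
is_five_field_cron() {
  local -a fields
  # read -a splits on whitespace without glob-expanding "*".
  read -r -a fields <<< "$1"
  [ "${#fields[@]}" -eq 5 ]
}

schedule="0 9 * * *"   # minute 0, hour 9, every day
if is_five_field_cron "$schedule"; then
  echo "schedule looks well-formed: $schedule"
else
  echo "schedule should have five fields: $schedule" >&2
fi
```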

Full Nuclear Reset (Start Fresh)

bash
openclaw gateway stop
rm -rf ~/.openclaw/
rm -rf ~/.config/openclaw/
openclaw gateway start
# Re-run initial setup
openclaw configure

📋 Quick Commands

Command                                             Description
openclaw config set agent.compaction.enabled true   Enable auto-compaction to prevent context overflow
openclaw configure anthropic                        Reconfigure Anthropic API key interactively
openclaw auth anthropic                             Re-authenticate OAuth for Claude Max
openclaw config set gateway.bind "lan"              Bind gateway to all interfaces (fixes exec timeouts)
ss -tulpn | grep 18789                              Check what interface the gateway is listening on
openclaw gateway restart                            Restart the gateway after config changes
openclaw cron list                                  List all scheduled cron jobs
openclaw cron clear                                 Remove all cron jobs (for reset)
openclaw config                                     Interactive configuration menu (select multiple models)
rm -rf ~/.openclaw/sessions/                        Clear session cache (soft reset)
