Multi-Agent Architecture — Fleet Setup & Best Practices
Configure multiple OpenClaw agents running 24/7, set up persistent memory, handle OAuth token issues, use multiple models, and fix exec session timeouts.
⚠️ The Problem
Users running multi-agent fleets (e.g., coordinator + developer + content agents) encounter several challenges:
- Context overflow - Long-running sessions exceed model context limits
- Memory persistence - Agent state doesn't survive session resets
- OAuth token errors - OAuth token refresh failed for anthropic: Failed to refresh OAuth token
- Exec session timeouts - CLI commands work in the terminal but time out when the agent runs them
- Agent misbehavior after upgrade - Crons not firing, soul not updating, unintended skill publishing
- Multiple model configuration - Using different models for different tasks
🔍 Why This Happens
Context Overflow: Running agents 24/7 in a single session accumulates context until it exceeds model limits (typically 100k-200k tokens). Without compaction, every message adds to the context.
OAuth Token Errors: The Anthropic API key has expired, been revoked, or reached rate limits. This happens even with Claude Max subscriptions if using API access incorrectly.
Exec Timeouts: The Gateway binds to 127.0.0.1 by default, but exec sessions may run in isolated network namespaces where localhost routing differs. Commands fail with:

    Error: gateway timeout after 10000ms

Post-Upgrade Issues: The clawdbot → openclaw migration may leave orphaned configs, cron schedules, or session states that reference old paths.
Memory Loss: Without explicit file-based memory (MEMORY.md, daily notes), agent state only exists in the current session context.
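To make the idea of file-based memory concrete, here is a minimal sketch that appends a timestamped line to today's daily note. It assumes daily notes live in a memory/ directory inside the workspace and are named by date (memory/YYYY-MM-DD.md). The helper name append_daily_note and the line format are illustrative assumptions, not part of OpenClaw.

```shell
# Hypothetical helper: append a timestamped note to today's daily log.
# Usage: append_daily_note WORKSPACE "note text"
# Prints the path of the file that was written.
append_daily_note() (
    workspace=$1
    note=$2
    # Take date and time from one call so they cannot straddle midnight.
    now=$(date '+%Y-%m-%d %H:%M')
    day=${now% *}
    time=${now#* }
    mkdir -p "$workspace/memory" || exit 1
    printf '%s %s %s\n' '-' "$time" "$note" >> "$workspace/memory/$day.md"
    printf '%s\n' "$workspace/memory/$day.md"
)
```

For example, `append_daily_note ~/clawd "Rotated the Anthropic key"` leaves a line on disk that a fresh session can read back, which is the point of keeping state outside the session context.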
✅ The Fix
Production Architecture for 24/7 Agent Fleet
The recommended architecture uses OpenClaw as the control plane with Markdown files as source-of-truth memory:
1. Memory Layer Setup
Create a structured memory hierarchy in your workspace:

    ~/clawd/
    ├── MEMORY.md          # Curated durable facts, principles, "how we work"
    ├── IDENTITY.md        # Agent personality and instructions
    ├── memory/
    │   └── YYYY-MM-DD.md  # Daily logs (auto-created)
    └── projects/
        ├── project-a/
        └── project-b/

Each agent (Jarvis, CLU, Cortana) gets its own workspace:

    mkdir -p ~/agents/jarvis ~/agents/clu ~/agents/cortana

2. Per-Task Sessions (Not Forever-Sessions)
Don't run one eternal session. Instead:
- Coordinator (Jarvis): Long-lived main session for human chat
- Workers (CLU, Cortana): Spawned as subagents per gig/task
- Crons/Hooks: Isolated sessions that complete and exit
3. Enable Auto-Compaction
Prevent context overflow by enabling compaction:

    openclaw config set agent.compaction.enabled true
    openclaw config set agent.compaction.threshold 50000

This summarizes old context when the session approaches the threshold. See: https://docs.openclaw.ai/concepts/compaction
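To get a feel for how quickly files loaded into a session eat into that budget, the sketch below estimates a file's token count. It assumes the threshold is measured in tokens (the source does not state the unit) and uses the common rule of thumb of roughly four bytes per token, which is only an approximation and not a real tokenizer. Both function names are hypothetical.

```shell
# Rough token estimate for a file: bytes / 4, rounded up.
# This is a heuristic, not the model's actual tokenizer.
estimate_tokens() (
    bytes=$(wc -c < "$1" | tr -d ' ')
    echo $(( (bytes + 3) / 4 ))
)

# Succeeds (exit 0) when the estimate for FILE is greater than THRESHOLD.
# Usage: exceeds_threshold FILE THRESHOLD
exceeds_threshold() (
    [ "$(estimate_tokens "$1")" -gt "$2" ]
)
```

Something like `exceeds_threshold ~/clawd/MEMORY.md 50000 && echo "MEMORY.md alone is near the threshold"` can flag a memory file that has grown too large to load comfortably.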
Fix OAuth Token Refresh Errors
When you see:

    ⚠️ Agent failed before reply: OAuth token refresh failed for anthropic: Failed to refresh OAuth token

For API Key Users
- Get a fresh API key from https://console.anthropic.com/
- Update your config:

    openclaw configure anthropic

Or manually:

    openclaw config set anthropic.apiKey "sk-ant-api03-..."

- Restart the gateway:

    openclaw gateway restart

For Claude Max Subscribers
Claude Max (subscription) uses OAuth, not API keys. The OAuth flow may have stale tokens:
- Re-authenticate:

    openclaw auth anthropic

- Follow the browser OAuth flow
- Restart:

    openclaw gateway restart

Note: If the bot still responds despite the error, it's using cached context. The error indicates that token refresh fails on each request while the cached session still works. Fix it anyway to prevent future failures.
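Because the bot can keep replying while the refresh keeps failing, it helps to check for the error text directly rather than trusting the chat. The sketch below counts occurrences of the message in a log file. Where your gateway output is captured is an assumption you must fill in (the source does not name a log path), and count_oauth_failures is a hypothetical helper.

```shell
# Count lines in LOGFILE that contain the OAuth refresh failure message.
# Usage: count_oauth_failures LOGFILE
count_oauth_failures() (
    # grep -c prints the number of matching lines; it exits non-zero when
    # there are none, so "|| true" keeps the function's status at 0.
    grep -c 'OAuth token refresh failed for anthropic' "$1" || true
)
```

Running it before and after re-authenticating shows whether the count is still growing.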
Fix CLI Timeout in Exec Sessions
When terminal commands work but agent exec fails:
Terminal (works):

    $ openclaw cron list
    # Returns immediately

Agent exec (fails):

    Error: gateway timeout after 10000ms

Diagnosis
Check what the gateway is listening on:

    ss -tulpn | grep 18789

If you see 127.0.0.1:18789, the Gateway is listening on loopback only, and that's the problem.
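The same check can be scripted. This sketch reads `ss -tulpn` output on standard input and reports how a port is bound. It assumes the column layout shown in this guide, where the local address is the fifth field (tcp LISTEN 0 511 0.0.0.0:18789). The function name classify_bind and its four result labels are illustrative.

```shell
# Read `ss -tulpn` output on stdin and report how PORT is bound.
# Prints one of: all-interfaces, loopback, other, not-listening.
# Usage: ss -tulpn | classify_bind 18789
classify_bind() (
    port=$1
    result=not-listening
    # Field 5 of `ss -tulpn` output is the local address:port.
    while read -r _netid _state _recvq _sendq localaddr _rest; do
        case $localaddr in
            0.0.0.0:"$port"|"*:$port"|"[::]:$port")
                # A wildcard bind wins over anything else we have seen.
                result=all-interfaces
                break ;;
            127.0.0.1:"$port"|"[::1]:$port")
                result=loopback ;;
            *:"$port")
                # Bound to one specific non-loopback address.
                [ "$result" = loopback ] || result=other ;;
        esac
    done
    printf '%s\n' "$result"
)
```

A result of loopback corresponds to the failing case described above, and not-listening suggests the gateway is not running on that port at all.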
Solution: Bind to All Interfaces
Edit ~/.openclaw/config.json5:

    {
      "gateway": {
        "host": "0.0.0.0", // Changed from 127.0.0.1
        "port": 18789
      }
    }

Or use the bind shortcut:

    openclaw config set gateway.bind "lan"

Restart:

    openclaw gateway restart

Verify:

    ss -tulpn | grep 18789
    # Should show: tcp LISTEN 0 511 0.0.0.0:18789

Why This Happens
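Exec sessions may run in isolated network namespaces (for example, inside containers). Each network namespace has its own loopback interface, so 127.0.0.1 inside the exec session is not the same endpoint as the 127.0.0.1 the Gateway listens on. Binding to 0.0.0.0 makes the Gateway listen on all interfaces, so it can be reached from other network contexts through a non-loopback address.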
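On Linux you can test whether two processes share a network namespace by comparing the /proc/PID/ns/net links, which is a standard kernel interface. The sketch below wraps that comparison. The function name same_netns is hypothetical, and how you find the gateway's PID is left to you, since the source does not say.

```shell
# Succeeds (exit 0) when two processes share a network namespace.
# Exits 1 when they differ, 2 when a namespace link cannot be read.
# Linux only; reading another user's process may require root.
# Usage: same_netns PID_A PID_B
same_netns() (
    a=$(readlink "/proc/$1/ns/net") || exit 2
    b=$(readlink "/proc/$2/ns/net") || exit 2
    [ -n "$a" ] && [ "$a" = "$b" ]
)
```

If a command run from inside an exec session reports a different namespace than the gateway process, a loopback-only bind cannot work for that session.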
Configure Multiple Models
Use different models for different tasks:
Interactive Model Selection

    openclaw config

Choose "models", then press Space to select multiple models and Enter to confirm.
Recommended Multi-Model Setup

    {
      "models": {
        "default": "anthropic/claude-sonnet-4-20250514",
        "coding": "anthropic/claude-sonnet-4-20250514",
        "creative": "anthropic/claude-opus-4-0",
        "fast": "moonshotai/kimi-k2.5"
      }
    }

Free/Cheap Model Options
- Kimi K2.5 - Free tier available via Kilo Gateway: https://blog.kilo.ai/p/kilo-gateway-supercharges-moltbot-fka-clawdbot
- Gemini - Google's models with generous free tier
- Groq - Fast inference, free tier available
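The identifiers in the multi-model setup above follow a provider/model pattern, so a mixed setup usually means credentials for more than one provider. As a small sketch, the hypothetical helper below lists the distinct providers in a set of identifiers. It only parses the strings and does not read or validate any OpenClaw configuration.

```shell
# Print the unique providers from "provider/model" identifiers,
# one identifier per line on stdin. Lines without a slash are ignored.
# Usage: printf '%s\n' ID... | list_providers
list_providers() (
    while read -r id; do
        case $id in
            */*) printf '%s\n' "${id%%/*}" ;;
        esac
    done | sort -u
)
```

Fed the four identifiers from the example config, it prints anthropic and moonshotai, which are the two providers that setup depends on.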
Factory Reset (Preserve Key Configs)
If your agent is misbehaving after upgrade:
1. Backup Critical Files

    mkdir -p ~/openclaw-backup
    cp -r ~/.openclaw/config.json5 ~/openclaw-backup/
    cp -r ~/clawd/MEMORY.md ~/openclaw-backup/
    cp -r ~/clawd/IDENTITY.md ~/openclaw-backup/
    cp -r ~/clawd/memory/ ~/openclaw-backup/

2. Clean State Reset

    openclaw gateway stop
    rm -rf ~/.openclaw/sessions/
    rm -rf ~/.openclaw/cache/

3. Reset Crons

    openclaw cron clear
    openclaw cron list # Should be empty

4. Restore and Restart

    openclaw gateway start

5. Re-add Crons
Manually re-add your scheduled tasks:

    openclaw cron add "0 9 * * *" "Good morning check-in"

Full Nuclear Reset (Start Fresh)

    openclaw gateway stop
    rm -rf ~/.openclaw/
    rm -rf ~/.config/openclaw/
    openclaw gateway start
    # Re-run initial setup
    openclaw configure
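Both resets delete state, so it is worth making the backup step repeatable. The sketch below copies the same four paths named in the backup step into a timestamped directory, so repeated backups do not overwrite each other. The function name backup_openclaw, the timestamp format, and the choice to skip missing paths are assumptions for illustration.

```shell
# Copy the files from the backup step into a timestamped directory.
# Usage: backup_openclaw [DEST_ROOT]   (default: ~/openclaw-backup)
# Prints the directory that was created.
backup_openclaw() (
    dest="${1:-$HOME/openclaw-backup}/$(date +%Y%m%d-%H%M%S)"
    mkdir -p "$dest" || exit 1
    for path in "$HOME/.openclaw/config.json5" "$HOME/clawd/MEMORY.md" \
                "$HOME/clawd/IDENTITY.md" "$HOME/clawd/memory"; do
        # Skip paths that do not exist instead of failing the whole backup.
        if [ -e "$path" ]; then
            cp -R "$path" "$dest/" || exit 1
        fi
    done
    printf '%s\n' "$dest"
)
```

Running `backup_openclaw` before either reset gives you a dated snapshot to restore MEMORY.md, IDENTITY.md, and the daily notes from.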
📋 Quick Commands
| Command | Description |
|---|---|
| openclaw config set agent.compaction.enabled true | Enable auto-compaction to prevent context overflow |
| openclaw configure anthropic | Reconfigure Anthropic API key interactively |
| openclaw auth anthropic | Re-authenticate OAuth for Claude Max |
| openclaw config set gateway.bind "lan" | Bind gateway to all interfaces (fixes exec timeouts) |
| ss -tulpn \| grep 18789 | Check what interface the gateway is listening on |
| openclaw gateway restart | Restart the gateway after config changes |
| openclaw cron list | List all scheduled cron jobs |
| openclaw cron clear | Remove all cron jobs (for reset) |
| openclaw config | Interactive configuration menu (select multiple models) |
| rm -rf ~/.openclaw/sessions/ | Clear session cache (soft reset) |