Rate Limits & Quota Management — Avoid Downtime
Getting HTTP 429 rate limit errors? Learn how to configure model fallbacks, rotate API keys, understand cooldown periods, and keep your agent running when quotas are exhausted.
⚠️ The Problem
Users encounter rate limit errors that block their agent from responding. Common error messages include:
- `HTTP 429: rate_limit_error: This request would exceed your account's rate limit. Please try again later. (request_id: req_011CXaaeqVtep3vVgdpEj83N)`
- `429 You exceeded your current quota for generate_content_paid_tier_inp`
- `Rate limit reached for organization org-BOvpEHVcDPTe8h4lZnwMO5Ly on tokens per min (TPM): Limit 250000, Used 250000, Requested 10339. Please try again in 2.481s.`
- Extreme cooldown periods (one user reported 5365 minutes!)
- Agent fails to fall back to alternative providers when the primary is rate-limited
🔍 Why This Happens
Rate limits occur when you exceed your API provider's quota for requests or tokens per minute. Key points to understand:
- **Rate limits don't automatically trigger provider fallback.** When Claude or GPT hits a 429, OpenClaw puts that provider in cooldown and retries later. It does NOT automatically switch to a different provider unless you've configured fallbacks.
- **OpenClaw's failover works in two stages:**
  - Stage 1: Auth profile rotation within the same provider (if you have multiple API keys)
  - Stage 2: Model fallback to the next model in your configured fallback chain
- **Cooldown periods can stack.** If you continue sending requests during rate limiting, the cooldown period extends. This is how users end up with 5,000+ minute cooldowns.
- **Provider quotas vary by tier:**
  - Free tiers: Very low limits (e.g., 15 RPM, 1M tokens/day)
  - Paid Tier 1: Moderate limits (e.g., 250K tokens/min for OpenAI, 1M for Google)
  - Higher tiers: Require spending thresholds or enterprise agreements
- **OAuth vs. API keys:** OAuth tokens (like ChatGPT Plus subscriptions) have different, often lower, rate limits than direct API keys.
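The two-stage failover and the stacking cooldowns described above can be sketched in a few lines of Python. This is an illustrative model only, not OpenClaw's actual implementation; the class name, the 60-second base cooldown, and the doubling rule are assumptions for demonstration.

```python
import time

class ProviderPool:
    """Illustrative model of two-stage failover with stacking cooldowns."""

    def __init__(self, providers, base_cooldown=60.0):
        # providers: e.g. {"anthropic": ["key1", "key2"], "openai": ["key1"]}
        self.providers = providers
        self.base_cooldown = base_cooldown
        self.cooldown_until = {}   # (provider, key) -> timestamp when usable again
        self.strikes = {}          # (provider, key) -> consecutive 429 count

    def pick(self, now=None):
        """Return the first (provider, key) not in cooldown.

        Stage 1: rotate keys within a provider; Stage 2: next provider."""
        now = time.time() if now is None else now
        for provider, keys in self.providers.items():
            for key in keys:
                if self.cooldown_until.get((provider, key), 0) <= now:
                    return provider, key
        return None  # everything is cooling down

    def report_429(self, provider, key, now=None):
        """Record a rate limit; repeated 429s double the cooldown (stacking)."""
        now = time.time() if now is None else now
        slot = (provider, key)
        self.strikes[slot] = self.strikes.get(slot, 0) + 1
        cooldown = self.base_cooldown * 2 ** (self.strikes[slot] - 1)
        self.cooldown_until[slot] = now + cooldown
        return cooldown
```

Under this doubling assumption, about a dozen consecutive 429s on one key grows a 60-second cooldown into the thousands of minutes (60 × 2¹² s ≈ 4,096 min), which is how multi-thousand-minute cooldowns can arise.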
✅ The Fix
Step 1: Diagnose the Rate Limit
First, understand which provider is being rate-limited and why:
```
# Check current model status and any active cooldowns
openclaw models status
```

Look at the error message for clues:

- `org-BOvp...` = OpenAI organization limit
- `generate_content_paid_tier_inp` = Google Gemini input tokens
- `anthropic` in the error = Claude rate limit
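Those signatures can be matched programmatically if you're triaging logs. The function below is a hypothetical helper for illustration, not part of the OpenClaw CLI:

```python
def classify_429(message: str) -> str:
    """Guess which provider produced a 429 from its error message text."""
    msg = message.lower()
    if "generate_content" in msg or "gemini" in msg:
        return "google"        # Gemini quota names like generate_content_paid_tier_inp
    if "org-" in msg or "tokens per min (tpm)" in msg:
        return "openai"        # OpenAI reports the organization ID and TPM limit
    if "anthropic" in msg or "rate_limit_error" in msg:
        return "anthropic"     # Claude uses the rate_limit_error type
    return "unknown"
```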
Step 2: Configure Model Fallbacks
The most important fix is setting up fallback models so your agent continues working when one provider is rate-limited:
```json5
// In ~/.config/openclaw/config.json5
{
  "agents": {
    "defaults": {
      "model": {
        "primary": "anthropic/claude-sonnet-4-5",
        "fallbacks": [
          "anthropic/claude-sonnet-4-5",     // Try same model with different key
          "openai/gpt-4o",                   // Fall back to OpenAI
          "google/gemini-2.5-flash-preview"  // Then Google
        ]
      }
    }
  }
}
```

Or add fallbacks via CLI:
```
# Add OpenAI as a fallback
openclaw models fallbacks add openai/gpt-4o

# Add Google as another fallback
openclaw models fallbacks add google/gemini-2.5-flash-preview

# Verify your fallback chain
openclaw models status
```

Step 3: Set Up Multiple API Keys (Same Provider)
For the same provider, you can configure multiple auth profiles to rotate when one is rate-limited:
```json5
// In ~/.config/openclaw/config.json5
{
  "models": {
    "anthropic": {
      "auth": [
        { "apiKey": "sk-ant-api03-key1..." },
        { "apiKey": "sk-ant-api03-key2..." },
        { "apiKey": "sk-ant-api03-key3..." }
      ]
    },
    "openai": {
      "auth": [
        { "apiKey": "sk-proj-key1..." },
        { "apiKey": "sk-proj-key2..." }
      ]
    }
  }
}
```

Step 4: Clear Stuck Cooldowns
If you're stuck in an extremely long cooldown (like 5000+ minutes), you may need to reset:
```
# Check current cooldowns
openclaw status

# Update OpenClaw (sometimes fixes stuck states)
openclaw update

# If that doesn't work, restart the gateway
openclaw gateway restart
```

Step 5: Reduce Token Consumption
Preventing rate limits is better than handling them. Reduce your token burn rate:
```json5
// Limit context size to reduce tokens per request
{
  "agents": {
    "defaults": {
      "contextTokens": 50000
    }
  }
}
```

Also consider:
- Use Gemini Flash instead of Pro (10x cheaper)
- Disable thinking mode
- Start fresh sessions regularly (`/session new`)
- Use subagents for heavy tasks
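A `contextTokens`-style cap works by trimming old conversation turns until the history fits the budget. Here is a rough sketch using the common ~4 characters per token heuristic; the function name and the heuristic are assumptions for illustration, not OpenClaw internals:

```python
def trim_to_budget(messages, max_tokens=50_000, chars_per_token=4):
    """Keep the most recent messages that fit within the token budget.

    Walks the history newest-first, so the oldest turns are dropped first."""
    budget_chars = max_tokens * chars_per_token
    kept, used = [], 0
    for msg in reversed(messages):
        if used + len(msg) > budget_chars:
            break  # this message (and everything older) no longer fits
        kept.append(msg)
        used += len(msg)
    return list(reversed(kept))  # restore chronological order
```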
Step 6: Upgrade Your API Tier
If you legitimately need higher limits:
OpenAI:
```
open "https://platform.openai.com/account/rate-limits"
```

Google AI:
```
open "https://aistudio.google.com/app/apikey"  # Check tier, upgrade if eligible
open "https://forms.gle/ETzX94k8jf7iSotH9"     # Request limit increase
```

Anthropic:
```
open "https://console.anthropic.com/settings/limits"
```

Step 7: OpenRouter for Automatic Rotation
OpenRouter can automatically route to available providers:
```json5
// Use OpenRouter as a meta-provider
{
  "models": {
    "openrouter": {
      "apiKey": "sk-or-v1-..."
    }
  },
  "agents": {
    "defaults": {
      "model": {
        "primary": "openrouter/anthropic/claude-sonnet-4-5"
      }
    }
  }
}
```

OpenRouter handles failover across providers automatically, though you still need credits loaded.
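For context, OpenRouter exposes an OpenAI-compatible chat completions endpoint, so a direct request (outside OpenClaw) is just a standard JSON POST. The sketch below only builds the request; the model slug shown is an assumption, so check OpenRouter's model list for current names:

```python
import json

def openrouter_request(api_key: str, prompt: str) -> dict:
    """Build an OpenAI-style chat completions request for OpenRouter.

    POST the body to the URL with the headers shown to send it for real."""
    return {
        "url": "https://openrouter.ai/api/v1/chat/completions",
        "headers": {
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        "body": json.dumps({
            "model": "anthropic/claude-sonnet-4-5",  # OpenRouter routes across hosts
            "messages": [{"role": "user", "content": prompt}],
        }),
    }
```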
📋 Quick Commands
| Command | Description |
|---|---|
| openclaw models status | Check current model configuration, fallbacks, and any active cooldowns |
| openclaw models fallbacks add openai/gpt-4o | Add OpenAI GPT-4o as a fallback model |
| openclaw models fallbacks add google/gemini-2.5-flash-preview | Add Google Gemini Flash as a fallback model |
| openclaw update | Update OpenClaw to latest version (can fix stuck cooldowns) |
| openclaw gateway restart | Restart the gateway daemon to clear stuck states |
| openclaw status | Check overall system status including rate limit cooldowns |
| /session new | Start a fresh session with clean context to reduce token usage |