
# Rate Limits & Quota Management — Avoid Downtime

Getting HTTP 429 rate limit errors? Learn how to configure model fallbacks, rotate API keys, understand cooldown periods, and keep your agent running when quotas are exhausted.

## ⚠️ The Problem

Users encounter rate limit errors that block their agent from responding. Common error messages include:

- `HTTP 429: rate_limit_error: This request would exceed your account's rate limit. Please try again later. (request_id: req_011CXaaeqVtep3vVgdpEj83N)`
- `429 You exceeded your current quota for generate_content_paid_tier_inp`
- `Rate limit reached for organization org-BOvpEHVcDPTe8h4lZnwMO5Ly on tokens per min (TPM): Limit 250000, Used 250000, Requested 10339. Please try again in 2.481s.`

Other symptoms:

- Extreme cooldown periods (one user reported 5365 minutes!)
- The agent fails to fall back to alternative providers when the primary is rate-limited

## 🔍 Why This Happens

Rate limits occur when you exceed your API provider's quota for requests or tokens per minute. Key points to understand:

1. Rate limits don't automatically trigger provider fallback — when Claude or GPT hits a 429, OpenClaw puts that provider in cooldown and retries later. It does NOT automatically switch to a different provider unless you've configured fallbacks.
2. OpenClaw's failover works in two stages:
   - Stage 1: Auth profile rotation within the same provider (if you have multiple API keys)
   - Stage 2: Model fallback to the next model in your configured fallback chain
3. Cooldown periods can stack — if you continue sending requests during rate limiting, the cooldown period extends. This is how users end up with 5000+ minute cooldowns (see the sketch after this list).
4. Provider quotas vary by tier:
   - Free tiers: very low limits (e.g., 15 RPM, 1M tokens/day)
   - Paid Tier 1: moderate limits (e.g., 250K tokens/min for OpenAI, 1M for Google)
   - Higher tiers: require spending thresholds or enterprise agreements
5. OAuth vs API keys — OAuth tokens (like ChatGPT Plus subscriptions) have different, often lower, rate limits than direct API keys.
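
To get a feel for why stacked cooldowns grow so large, here is a minimal illustration. It assumes each consecutive 429 roughly doubles the wait, which is not necessarily OpenClaw's exact backoff policy, but it shows how a 1-minute cooldown can balloon into the multi-thousand-minute waits reported above:

```bash
# Illustration only: assumes each consecutive 429 doubles the cooldown.
# OpenClaw's actual backoff policy may differ.
cooldown=1   # minutes
for attempt in $(seq 1 12); do
  cooldown=$(( cooldown * 2 ))
  echo "After 429 #${attempt}: cooldown = ${cooldown} min"
done
# After 12 consecutive 429s the wait is already 4096 minutes --
# the same order of magnitude as the 5000+ minute cooldowns above.
```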

## The Fix

## Step 1: Diagnose the Rate Limit

First, understand which provider is being rate-limited and why:

```bash
# Check current model status and any active cooldowns
openclaw models status
```

Look at the error message for clues:

- `org-BOvp...` = OpenAI organization limit
- `generate_content_paid_tier_inp` = Google Gemini input tokens
- `anthropic` in the error = Claude rate limit
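
Providers also report quota state in HTTP response headers, which can confirm how close you are to a limit. The sketch below reads OpenAI's `x-ratelimit-*` headers from a minimal chat completion; header names and availability vary by provider and endpoint, so treat this as a generic diagnostic aid rather than an OpenClaw feature:

```bash
# Send a tiny request and print only the rate-limit headers (OpenAI shown).
# Assumes OPENAI_API_KEY is set; adapt the URL and headers for other providers.
curl -s -D - -o /dev/null https://api.openai.com/v1/chat/completions \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"gpt-4o","messages":[{"role":"user","content":"ping"}],"max_tokens":1}' \
  | grep -i '^x-ratelimit'
```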

## Step 2: Configure Model Fallbacks

The most important fix is setting up fallback models so your agent continues working when one provider is rate-limited:

```json5
// In ~/.config/openclaw/config.json5
{
  "agents": {
    "defaults": {
      "model": {
        "primary": "anthropic/claude-sonnet-4-5",
        "fallbacks": [
          "anthropic/claude-sonnet-4-5",     // Try same model with different key
          "openai/gpt-4o",                   // Fall back to OpenAI
          "google/gemini-2.5-flash-preview"  // Then Google
        ]
      }
    }
  }
}
```

Or add fallbacks via CLI:

```bash
# Add OpenAI as a fallback
openclaw models fallbacks add openai/gpt-4o

# Add Google as another fallback
openclaw models fallbacks add google/gemini-2.5-flash-preview

# Verify your fallback chain
openclaw models status
```

## Step 3: Set Up Multiple API Keys (Same Provider)

For the same provider, you can configure multiple auth profiles to rotate when one is rate-limited:

```json5
// In ~/.config/openclaw/config.json5
{
  "models": {
    "anthropic": {
      "auth": [
        { "apiKey": "sk-ant-api03-key1..." },
        { "apiKey": "sk-ant-api03-key2..." },
        { "apiKey": "sk-ant-api03-key3..." }
      ]
    },
    "openai": {
      "auth": [
        { "apiKey": "sk-proj-key1..." },
        { "apiKey": "sk-proj-key2..." }
      ]
    }
  }
}
```

## Step 4: Clear Stuck Cooldowns

If you're stuck in an extremely long cooldown (like 5000+ minutes), you may need to reset:

```bash
# Check current cooldowns
openclaw status

# Update OpenClaw (sometimes fixes stuck states)
openclaw update

# If that doesn't work, restart the gateway
openclaw gateway restart
```

## Step 5: Reduce Token Consumption

Preventing rate limits is better than handling them. Reduce your token burn rate:

```json5
// Limit context size to reduce tokens per request
{
  "agents": {
    "defaults": {
      "contextTokens": 50000
    }
  }
}
```

Also consider:

- Use Gemini Flash instead of Pro (10x cheaper)
- Disable thinking mode
- Start fresh sessions regularly (`/session new`)
- Use subagents for heavy tasks
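
To see why trimming context pays off, here is some back-of-the-envelope arithmetic using the Tier 1 figure quoted above (250K tokens/min for OpenAI); your own quota and average request size will differ:

```bash
# Rough requests-per-minute budget under a 250K TPM quota,
# assuming each request sends roughly the full configured context.
tpm_quota=250000
echo $(( tpm_quota / 50000 ))   # ~5 requests/min with a 50K-token context
echo $(( tpm_quota / 10000 ))   # ~25 requests/min with a 10K-token context
```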

## Step 6: Upgrade Your API Tier

If you legitimately need higher limits:

OpenAI:

```bash
open "https://platform.openai.com/account/rate-limits"
```

Google AI:

```bash
open "https://aistudio.google.com/app/apikey"   # Check tier, upgrade if eligible
open "https://forms.gle/ETzX94k8jf7iSotH9"      # Request limit increase
```

Anthropic:

```bash
open "https://console.anthropic.com/settings/limits"
```

## Step 7: OpenRouter for Automatic Rotation

OpenRouter can automatically route to available providers:

```json5
// Use OpenRouter as a meta-provider
{
  "models": {
    "openrouter": {
      "apiKey": "sk-or-v1-..."
    }
  },
  "agents": {
    "defaults": {
      "model": {
        "primary": "openrouter/anthropic/claude-sonnet-4-5"
      }
    }
  }
}
```

OpenRouter handles failover across providers automatically, though you still need credits loaded.
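
Since routing stops the moment your OpenRouter balance runs out, it's worth checking your credits before blaming a provider. OpenRouter exposes a key-info endpoint for this; the exact path and response fields may change, so verify against their API docs before scripting around it:

```bash
# Query OpenRouter for the current key's usage and limits
# (endpoint per OpenRouter's public docs; confirm it still matches your account).
curl -s https://openrouter.ai/api/v1/auth/key \
  -H "Authorization: Bearer $OPENROUTER_API_KEY"
```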


## 📋 Quick Commands

| Command | Description |
| --- | --- |
| `openclaw models status` | Check current model configuration, fallbacks, and any active cooldowns |
| `openclaw models fallbacks add openai/gpt-4o` | Add OpenAI GPT-4o as a fallback model |
| `openclaw models fallbacks add google/gemini-2.5-flash-preview` | Add Google Gemini Flash as a fallback model |
| `openclaw update` | Update OpenClaw to the latest version (can fix stuck cooldowns) |
| `openclaw gateway restart` | Restart the gateway daemon to clear stuck states |
| `openclaw status` | Check overall system status, including rate limit cooldowns |
| `/session new` | Start a fresh session with clean context to reduce token usage |


## Still stuck?

Join our Discord community for real-time help.
