Rate Limits & Quota Management — Avoid Downtime
Getting HTTP 429 rate limit errors? Learn how to configure model fallbacks, rotate API keys, understand cooldown periods, and keep your agent running when quotas are exhausted.
⚠️ The Problem
HTTP 429: rate_limit_error: This request would exceed your account's rate limit. Please try again later. (request_id: req_011CXaaeqVtep3vVgdpEj83N)
- 429 You exceeded your current quota for generate_content_paid_tier_inp
- Rate limit reached for organization org-BOvpEHVcDPTe8h4lZnwMO5Ly on tokens per min (TPM): Limit 250000, Used 250000, Requested 10339. Please try again in 2.481s.
- Extreme cooldown periods (one user reported 5365 minutes!)
- Agent fails to fall back to alternative providers when primary is rate-limited
🔍 Why This Happens
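The tokens-per-minute message in the list above carries its own explanation: the tokens already used in the current window plus the tokens requested exceed the limit. As a rough sketch, assuming the message keeps the exact wording shown above (other providers and versions may word it differently), the numbers can be pulled out like this. The names `TpmError`, `parse_tpm_error`, and `tokens_over_limit` are hypothetical helpers for illustration, not part of OpenClaw or any provider SDK.

```python
import re
from typing import NamedTuple, Optional


class TpmError(NamedTuple):
    limit: int            # tokens allowed per minute
    used: int             # tokens already consumed in the current window
    requested: int        # tokens the rejected request would have added
    retry_after_s: float  # wait time suggested by the provider, in seconds


# Matches only the wording quoted in this article (an assumption).
_TPM_RE = re.compile(
    r"Limit (\d+), Used (\d+), Requested (\d+)\. Please try again in ([\d.]+)s"
)


def parse_tpm_error(message: str) -> Optional[TpmError]:
    """Return the parsed numbers, or None if the message has another shape."""
    match = _TPM_RE.search(message)
    if match is None:
        return None
    return TpmError(int(match[1]), int(match[2]), int(match[3]), float(match[4]))


def tokens_over_limit(err: TpmError) -> int:
    """How many tokens the request would have pushed past the limit."""
    return max(0, err.used + err.requested - err.limit)
```

For the example message above, the window was already full (250000 of 250000 used), so the whole 10339-token request was over the limit and the suggested wait was about 2.5 seconds.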
✅ The Fix
## Step 1: Diagnose the Rate Limit
First, understand which provider is being rate-limited and why:

    # Check current model status and any active cooldowns
    openclaw models status

Look at the error message for clues:
- org-BOvp... = OpenAI organization limit
- generate_content_paid_tier_inp = Google Gemini input tokens
- anthropic in the error = Claude rate limit
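If you triage many logs, the three clues above can be turned into a small heuristic. This is only a sketch built from the patterns listed in this step; `guess_provider` is a hypothetical helper, and the patterns are assumptions rather than an official mapping from any provider.

```python
import re

# Heuristic patterns taken from the clues listed above (not an official mapping).
_PROVIDER_CLUES = [
    (re.compile(r"\borg-[A-Za-z0-9]+"), "openai"),       # OpenAI organization ID
    (re.compile(r"generate_content"), "google"),         # Gemini quota metric name
    (re.compile(r"anthropic", re.IGNORECASE), "anthropic"),
]


def guess_provider(error_message: str) -> str:
    """Return a best-guess provider name, or "unknown" if no clue matches."""
    for pattern, provider in _PROVIDER_CLUES:
        if pattern.search(error_message):
            return provider
    return "unknown"
```

A message that names no provider, such as the bare `rate_limit_error` text at the top of this page, comes back as `"unknown"`, in which case `openclaw models status` is the better source of truth.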
## Step 2: Configure Model Fallbacks
The most important fix is setting up fallback models so your agent continues working when one provider is rate-limited:

    // In ~/.config/openclaw/config.json5
    {
      "agents": {
        "defaults": {
          "model": {
            "primary": "anthropic/claude-sonnet-4-5",
            "fallbacks": [
              "anthropic/claude-sonnet-4-5",      // Try same model with different key
              "openai/gpt-4o",                    // Fall back to OpenAI
              "google/gemini-2.5-flash-preview"   // Then Google
            ]
          }
        }
      }
    }

Or add fallbacks via CLI:

    # Add OpenAI as a fallback
    openclaw models fallbacks add openai/gpt-4o
    # Add Google as another fallback
    openclaw models fallbacks add google/gemini-2.5-flash-preview
    # Verify your fallback chain
    openclaw models status

## Step 3: Set Up Multiple API Keys (Same Provider)
For the same provider, you can configure multiple auth profiles to rotate when one is rate-limited:

    // In ~/.config/openclaw/config.json5
    {
      "models": {
        "anthropic": {
          "auth": [
            { "apiKey": "sk-ant-api03-key1..." },
            { "apiKey": "sk-ant-api03-key2..." },
            { "apiKey": "sk-ant-api03-key3..." }
          ]
        },
        "openai": {
          "auth": [
            { "apiKey": "sk-proj-key1..." },
            { "apiKey": "sk-proj-key2..." }
          ]
        }
      }
    }

## Step 4: Clear Stuck Cooldowns
If you're stuck in an extremely long cooldown (like 5000+ minutes), you may need to reset:

    # Check current cooldowns
    openclaw status
    # Update OpenClaw (sometimes fixes stuck states)
    openclaw update
    # If that doesn't work, restart the gateway
    openclaw gateway restart

## Step 5: Reduce Token Consumption
Preventing rate limits is better than handling them. Reduce your token burn rate:

    // Limit context size to reduce tokens per request
    {
      "agents": {
        "defaults": {
          "contextTokens": 50000
        }
      }
    }

Also consider:
- Use Gemini Flash instead of Pro (10x cheaper)
- Disable thinking mode
- Start fresh sessions regularly (/session new)
- Use subagents for heavy tasks
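To see why capping context helps, it is useful to do the arithmetic. The sketch below assumes the worst case, where every request sends the full context, and it ignores output tokens. `max_requests_per_minute` is a hypothetical helper for illustration; the 250000 figure is the TPM limit from the error example earlier on this page, and 50000 is the `contextTokens` value from this step.

```python
def max_requests_per_minute(tpm_limit: int, tokens_per_request: int) -> int:
    """Upper bound on requests per minute that fit inside a TPM limit."""
    if tokens_per_request <= 0:
        raise ValueError("tokens_per_request must be positive")
    return tpm_limit // tokens_per_request


# Worst-case budgets under a 250000 TPM limit.
capped = max_requests_per_minute(250_000, 50_000)     # context capped at 50k tokens
uncapped = max_requests_per_minute(250_000, 200_000)  # a long, uncapped session
```

Under these assumptions the capped configuration leaves room for several full-context requests per minute, while a session that has grown to 200000 tokens fits only one. This is also why starting fresh sessions regularly reduces 429 errors.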
## Step 6: Upgrade Your API Tier
If you legitimately need higher limits:

OpenAI:

    open "https://platform.openai.com/account/rate-limits"

Google AI:

    open "https://aistudio.google.com/app/apikey"   # Check tier, upgrade if eligible
    open "https://forms.gle/ETzX94k8jf7iSotH9"      # Request limit increase

Anthropic:

    open "https://console.anthropic.com/settings/limits"

## Step 7: OpenRouter for Automatic Rotation
OpenRouter can automatically route to available providers:

    // Use OpenRouter as a meta-provider
    {
      "models": {
        "openrouter": {
          "apiKey": "sk-or-v1-..."
        }
      },
      "agents": {
        "defaults": {
          "model": {
            "primary": "openrouter/anthropic/claude-sonnet-4-5"
          }
        }
      }
    }

OpenRouter handles failover across providers automatically, though you still need credits loaded.
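To make the failover idea behind Steps 2, 3, and 7 concrete, here is a conceptual sketch of a fallback chain with per-model cooldowns. It is not OpenClaw's or OpenRouter's actual implementation. `RateLimited`, `FallbackChain`, and the `send` callback are all hypothetical names, and the cooldown simply reuses whatever wait time the failing call reports.

```python
import time
from typing import Callable, Dict, List


class RateLimited(Exception):
    """Raised by a (hypothetical) send function when a provider returns 429."""

    def __init__(self, retry_after_s: float):
        super().__init__(f"rate limited, retry after {retry_after_s}s")
        self.retry_after_s = retry_after_s


class FallbackChain:
    """Try models in order, skipping any that are still cooling down."""

    def __init__(self, models: List[str],
                 clock: Callable[[], float] = time.monotonic):
        self.models = models
        self.clock = clock
        self.cooldown_until: Dict[str, float] = {}

    def call(self, send: Callable[[str], str]) -> str:
        for model in self.models:
            # Skip models whose cooldown has not expired yet.
            if self.cooldown_until.get(model, float("-inf")) > self.clock():
                continue
            try:
                return send(model)
            except RateLimited as exc:
                # Remember when this model may be tried again, then move on.
                self.cooldown_until[model] = self.clock() + exc.retry_after_s
        raise RuntimeError("all models are rate-limited or cooling down")
```

A chain built from the models in Step 2, for example `FallbackChain(["anthropic/claude-sonnet-4-5", "openai/gpt-4o"])`, would answer from the second model while the first is cooling down and return to the first once its cooldown expires. A stuck cooldown like the one in Step 4 corresponds to a `cooldown_until` entry far in the future, which is why a restart that clears this state can help.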
📋 Quick Commands
| Command | Description |
|---|---|
| openclaw models status | Check current model configuration, fallbacks, and any active cooldowns |
| openclaw models fallbacks add openai/gpt-4o | Add OpenAI GPT-4o as a fallback model |
| openclaw models fallbacks add google/gemini-2.5-flash-preview | Add Google Gemini Flash as a fallback model |
| openclaw update | Update OpenClaw to latest version (can fix stuck cooldowns) |
| openclaw gateway restart | Restart the gateway daemon to clear stuck states |
| openclaw status | Check overall system status including rate limit cooldowns |
| /session new | Start a fresh session with clean context to reduce token usage |