OpenClaw Rate Limiting: How to Prevent a $500 Overnight Surprise (And Cap Your Monthly Bill)
The $500 Overnight Surprise
Here's a scenario that happens more than people admit: you kick off an automation before bed — a research loop, a lead-scraping agent, an outreach pipeline — and wake up to find it's been running all night. Not because it finished. Because it got stuck.
A single stuck loop that keeps retrying, keeps searching, keeps spinning can burn through your entire monthly budget in six hours. No warning. No circuit breaker. Just an API bill that looks like a typo.
Rate limits and budget controls are the circuit breaker. They're not about slowing your agent down — they're about making sure one runaway task can't blow your whole month.
Here's exactly how to set them up.
What Causes Runaway Token Burns
Before you can prevent the problem, you need to know what actually triggers it. There are three main culprits:
1. Search Spirals
Your agent runs a web search, gets a partial result, searches again to fill in the gap, gets another partial result, searches again. This can repeat dozens of times in a few minutes. Each search costs tokens. Each follow-up search costs more. There's no natural stopping point unless you build one in.
2. Retry Loops
An API call fails. The agent retries immediately. Fails again. Retries again. If there's no backoff logic and no retry limit, this can generate hundreds of calls in seconds — especially if the failure is rate-limit-induced to begin with (which makes the retrying actively worse).
3. Stuck Agents
The agent is waiting on something — a tool response, a file read, a network call — and it's not coming. The agent keeps pinging. Keeps checking. Keeps generating tokens. You don't see it because you're asleep.
All three of these are preventable with the same set of rules.
The Exact Rate Limits Prompt Block
This is the full configuration block. Add it to your system prompt as written:
RATE LIMITS:
- 5 seconds minimum between API calls
- 10 seconds between web searches
- Max 5 searches per batch, then 2-minute break
- Batch similar work (one request for 10 leads, not 10 requests)
- If you hit 429 error: STOP, wait 5 minutes, retry
DAILY BUDGET: $5 (warning at 75%)
MONTHLY BUDGET: $200 (warning at 75%)
That's the whole thing. It lives in your system prompt — not in config, not in code. The intelligence is in the prompt.
Let's break down each rule:
5 seconds minimum between API calls — Prevents rapid-fire bursts. If your agent is making 12 calls in 12 seconds, something has gone wrong. This forces a natural pacing that also keeps you well under Anthropic's rate limits.
10 seconds between web searches — Web searches are expensive per token AND they generate long responses that your agent then processes. Doubling the cooldown on searches specifically caps the most expensive pattern.
Max 5 searches per batch, then 2-minute break — This is the spiral-breaker. After 5 searches, mandatory stop. This gives the agent a moment to consolidate what it has instead of chasing more data indefinitely.
Batch similar work — See the dedicated section below. This is its own category of savings.
429 error: STOP, wait 5 minutes, retry — When you hit a rate limit error, the worst response is to immediately retry. The right response is to back off fully. 5 minutes is long enough to clear most rate windows without losing significant time.
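The prompt block relies on the agent following instructions. To make the pacing rules concrete, here is a minimal Python sketch of the same logic as it might look in a wrapper around your own tool calls. The `Pacer` class and its parameter names are hypothetical, not part of OpenClaw; the numbers are the ones from the prompt block.

```python
import time


class Pacer:
    """Hypothetical helper: minimum gaps between calls, plus a break after each search batch."""

    def __init__(self, api_gap=5.0, search_gap=10.0, batch_size=5,
                 batch_break=120.0, clock=time.monotonic, sleep=time.sleep):
        self.api_gap = api_gap            # seconds between any two API calls
        self.search_gap = search_gap      # seconds between two web searches
        self.batch_size = batch_size      # searches allowed before a break
        self.batch_break = batch_break    # length of that break, in seconds
        self._clock = clock
        self._sleep = sleep
        self._last_call = None
        self._last_search = None
        self._searches_in_batch = 0

    def wait_for_slot(self, is_search=False):
        """Block until the next call is allowed, record it, return seconds waited."""
        now = self._clock()
        ready_at = now
        if self._last_call is not None:
            ready_at = max(ready_at, self._last_call + self.api_gap)
        if is_search and self._last_search is not None:
            gap = self.search_gap
            if self._searches_in_batch >= self.batch_size:
                # Batch is full: take the long break and start a new batch.
                gap = self.batch_break
                self._searches_in_batch = 0
            ready_at = max(ready_at, self._last_search + gap)
        waited = max(0.0, ready_at - now)
        if waited > 0:
            self._sleep(waited)
        stamp = self._clock()
        self._last_call = stamp
        if is_search:
            self._last_search = stamp
            self._searches_in_batch += 1
        return waited
```

In this sketch you would call `wait_for_slot()` before each API call and `wait_for_slot(is_search=True)` before each web search. The injectable `clock` and `sleep` arguments are only there so the pacing can be tested without real delays.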
Daily and Monthly Budget Configuration
The prompt block above sets both a daily and monthly budget. These aren't hard API limits — they're instructions to your agent about what to do as it approaches thresholds.
Daily budget: $5
This is your operational day-to-day cap. At $5/day with optimized model routing and session management, you have significant headroom for real work. The daily limit protects you from the overnight runaway scenario.
Think of it as a per-24-hour fuse. Even if something goes catastrophically wrong at 2am, the damage should stop at around $5, provided the agent honors the instruction.
Monthly budget: $200
This is your ceiling for the entire month. At $200/month, you could be running multiple agents, heavy workflows, and regular automation without issue, unless something goes off the rails. The monthly budget is the backstop for sustained problems that the daily limit doesn't catch. For example, a $4.90/day problem never exceeds the $5 daily cap, but over a 31-day month it adds up to about $152, just past the $150 mark where the 75% monthly warning fires.
When to adjust these numbers:
- You're just getting started → use $2/day, $50/month until you understand your usage patterns
- You're running multiple agents in production → $10/day, $300/month
- You're doing heavy batch processing on a specific day → temporarily raise the daily limit for that session only, then lower it back
Warning at 75%: The Early Alert Strategy
Both budgets trigger a warning at 75%, not at 100%.
The reason is simple: by the time you hit 100%, it's too late. You've already spent the money. The warning at 75% gives you two things:
- Enough runway to investigate. If you hit 75% on a Tuesday and it's only noon, something unusual is happening. You have time to check what's running and stop it before you blow the limit.
- Time to make deliberate decisions. Maybe you do need to spend more today for legitimate work. Fine. You can make that call consciously at 75% instead of discovering you're over budget at 105%.
The agent's job at 75% is to surface the alert to you (via your configured channel — Telegram, Slack, etc.) and slow down non-critical operations. It's not a hard stop; it's a yellow light.
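As a rough illustration of the threshold logic, here is a small Python sketch of a budget with a 75% warning level. The `Budget` class is hypothetical, and it assumes you can supply the dollar cost of each call yourself (for example, from token usage and your model's pricing); nothing here is an OpenClaw API.

```python
from dataclasses import dataclass


@dataclass
class Budget:
    """Hypothetical spend tracker with an early-warning threshold."""
    limit_usd: float
    warn_fraction: float = 0.75   # warn at 75% of the limit
    spent_usd: float = 0.0

    def record(self, cost_usd: float) -> str:
        """Add a cost and return 'ok', 'warn', or 'over'."""
        self.spent_usd += cost_usd
        if self.spent_usd >= self.limit_usd:
            return "over"
        if self.spent_usd >= self.limit_usd * self.warn_fraction:
            return "warn"
        return "ok"


# The two budgets from the prompt block.
daily = Budget(limit_usd=5.00)      # warns at $3.75
monthly = Budget(limit_usd=200.00)  # warns at $150.00
```

A "warn" result corresponds to the yellow light described above: send the alert and slow down non-critical work. What to do on "over" is a separate decision this sketch leaves to the caller.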
What Happens When a 429 Error Hits
A 429 means you've hit a rate limit at the API level — too many requests in too short a window. The rule is explicit:
STOP. Wait 5 minutes. Then retry.
Do not retry in 1 second. Do not retry in 30 seconds. Wait the full 5 minutes.
Why 5 minutes? Anthropic's rate limits are measured per minute (requests and tokens), so a 5-minute wait gives the window ample time to reset, with margin to spare.
Retrying immediately after a 429 is the exact behavior that turns a rate limit blip into a full-scale spiral. Each retry hits the rate limit again and counts as another failed call. You can end up with dozens of failed 429 calls, each one adding overhead to the agent's loop, before the window resets.
The 5-minute wait is not just about compliance. It's about breaking the feedback loop.
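For readers who also wrap their API calls in code, the same rule can be sketched in Python. Everything named here is an assumption for illustration: `RateLimited` stands in for whatever error your client raises on a 429, and `max_retries` is an addition that the prompt block does not specify, included so the loop cannot run forever.

```python
import time


class RateLimited(Exception):
    """Hypothetical error raised by a client wrapper on an HTTP 429 response."""


def call_with_cooldown(request, cooldown=300.0, max_retries=3, sleep=time.sleep):
    """Run request(); on RateLimited, wait the full cooldown before retrying."""
    for attempt in range(max_retries + 1):
        try:
            return request()
        except RateLimited:
            if attempt == max_retries:
                raise  # give up instead of retrying indefinitely
            sleep(cooldown)  # STOP, wait 5 minutes, then retry
```

The fixed 300-second wait mirrors the prompt rule. The retry cap is the part that breaks the feedback loop if the failure never clears.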
Batching Similar Work: 10 Leads in 1 Call
This is the highest-leverage rate limit optimization, and it's not about limits at all — it's about efficiency.
Bad pattern:
Get info on lead 1 → API call
Get info on lead 2 → API call
Get info on lead 3 → API call
...
Get info on lead 10 → API call
10 API calls. 10 sets of context overhead. 10 chances to hit rate limits.
Good pattern:
Get info on leads 1-10 → 1 API call
1 API call. Context overhead once. One chance to hit a rate limit instead of ten.
The rule in the prompt block says it plainly: "one request for 10 leads, not 10 requests."
This applies to:
- Research tasks (batch your questions, not one per call)
- Email drafts (give the agent all 10 contacts at once)
- Data processing (send the full dataset, not row by row)
- Analysis (one comprehensive request beats five follow-ups)
The token cost of one combined call is almost always less than the combined token cost of multiple separate calls — because you're not repeatedly paying for context setup, system prompt, and response framing overhead.
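To see why, here is a back-of-the-envelope Python sketch. The token figures are made-up illustrative numbers, not measurements: assume each call carries a fixed overhead (system prompt, context setup, response framing) plus a per-lead cost. The helper names are hypothetical.

```python
def chunked(items, size=10):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def total_tokens(n_items, per_call_overhead, per_item_tokens, batch_size):
    """Rough total: fixed overhead per call plus a cost per item."""
    n_calls = -(-n_items // batch_size)  # ceiling division
    return n_calls * per_call_overhead + n_items * per_item_tokens


# Illustrative assumption: 2,000 tokens of overhead per call, 300 tokens per lead.
one_by_one = total_tokens(10, 2000, 300, batch_size=1)   # 10 calls
batched = total_tokens(10, 2000, 300, batch_size=10)     # 1 call
```

Under those assumed numbers, ten separate calls come to 23,000 tokens and one batched call comes to 5,000. The per-lead cost is the same either way; the difference is the overhead being paid ten times instead of once. `chunked` shows one way to split a longer lead list into groups of 10 before sending each group as a single request.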
Key Takeaways
- One stuck loop can hit your monthly budget in hours. Rate limits are the circuit breaker.
- Add the RATE LIMITS prompt block to your system prompt — it's the only configuration you need.
- 5s between calls, 10s between searches, max 5 searches then 2-minute break are your core pacing rules.
- $5 daily / $200 monthly is a solid starting point. Adjust based on actual usage patterns.
- Warning at 75% gives you time to investigate before you reach the limit.
- 429 error = STOP and wait 5 full minutes. Never retry immediately.
- Batch similar work. 10 leads in 1 call is almost always cheaper than 10 separate calls.
Rate limits aren't a constraint on what your agent can do. They're what keep it doing things indefinitely without surprise bills.