# Reduce API Costs — Save Money on AI Usage
Slash your OpenClaw API bill by 50-80%. Learn model selection, caching, prompt optimization, and smart fallback strategies.
## ⚠️ The Problem

- "$50+ bill last month — not sure why it's so expensive"
- "Just chatting casually but spending $5/day on API"

## 🔍 Why This Happens

1. **Using expensive models for simple tasks:**
   - Using Claude Opus for everything, including "hello"
   - No fallback models configured
2. **Context windows exploding:**
   - Long conversations hitting 200k tokens
   - Paying for the same context repeatedly
3. **No visibility into spend:**
   - Don't know which conversations cost the most
   - No way to set spending limits

## ✅ The Fix
## Quick Wins (Immediate Savings)
### 1. Switch to Sonnet for Most Tasks
Claude Sonnet is 5x cheaper than Opus and handles 90% of tasks equally well. Set it as your default:
```yaml
# ~/.openclaw/config.yaml
models:
  primary: anthropic/claude-sonnet-4-20250514
  fallbacks:
    - anthropic/claude-3-5-haiku-20241022
    - openai/gpt-4o-mini
```

Cost comparison per 1M tokens:

- Opus: $15 input / $75 output
- Sonnet: $3 input / $15 output
- Haiku: $0.25 input / $1.25 output
Use Opus only when you explicitly need it (complex reasoning, long documents).
### 2. Enable Prompt Caching
Anthropic's prompt caching stores your system prompt and conversation context and bills cached tokens at roughly 10% of the normal input price:
```yaml
models:
  anthropic:
    cacheControlTtl: 300  # Cache for 5 minutes
```

This alone can reduce costs by 50-80% for conversational use.
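To see roughly where that range comes from, compare a conversation that re-sends its system prompt and history at full price with one that reads them from cache at 10% of the input rate. The token counts below are assumptions chosen for the arithmetic, not measurements:

```python
# Back-of-envelope caching savings (assumed token counts, input cost only).
SONNET_INPUT_PER_MTOK = 3.00   # $ per 1M input tokens
CACHED_READ_FACTOR = 0.10      # cached tokens billed at ~10% of the input rate

prefix_tokens = 20_000         # system prompt + history re-sent on every turn
new_tokens_per_turn = 500      # fresh user input each turn
turns = 40

def input_cost(cached: bool) -> float:
    prefix = prefix_tokens * (CACHED_READ_FACTOR if cached else 1.0)
    return (prefix + new_tokens_per_turn) * turns * SONNET_INPUT_PER_MTOK / 1_000_000

print(f"uncached: ${input_cost(False):.2f}")  # $2.46 for the 40 turns
print(f"cached:   ${input_cost(True):.2f}")   # $0.30, about 88% less
```

Real-world savings land below this input-only figure once output tokens and the one-time cache-write surcharge are counted, which is why 50-80% is the practical range.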
### 3. Configure Aggressive Context Pruning
Don't pay for old messages you don't need:
```yaml
contextPruning:
  mode: sliding
  maxMessages: 20   # Keep only last 20 messages
  maxTokens: 50000  # Cap at 50k tokens
```

For most conversations, 20 messages of context is plenty.
### 4. Use Haiku for Heartbeats & Cron
Automated tasks don't need the smartest model:
```yaml
agents:
  main:
    heartbeat:
      model: anthropic/claude-3-5-haiku-20241022
```

Haiku is 60x cheaper than Opus — perfect for scheduled checks.
## Advanced Strategies
### 5. Set Up Model Routing
Route different tasks to appropriate models:
```yaml
# Use Haiku for simple queries, Sonnet for code, Opus for analysis
models:
  primary: anthropic/claude-sonnet-4-20250514
  routing:
    simple: anthropic/claude-3-5-haiku-20241022
    code: anthropic/claude-sonnet-4-20250514
    analysis: anthropic/claude-opus-4-20250514
```

### 6. Use Local Models for Drafts
Run Ollama locally for first drafts, only use paid APIs for final output:
```yaml
models:
  local: ollama/qwen2.5:7b
  primary: anthropic/claude-sonnet-4-20250514
```

Local models cost $0. Use them for brainstorming, then polish with Claude.
### 7. Truncate Tool Outputs
Large file reads and web scrapes bloat context. Limit tool output size:
```yaml
tools:
  read:
    maxChars: 10000  # Cap file reads at 10k chars
  web:
    maxChars: 5000   # Cap web fetches at 5k chars
```

### 8. Monitor with Session Status
Check your token usage regularly:
```bash
# In chat
/status

# CLI
openclaw status --usage
```

This shows tokens used and estimated cost per session.
### 9. Set Budget Alerts
Configure alerts in your provider's dashboard:

- Anthropic Console: console.anthropic.com → Usage → Set alerts
- OpenAI: platform.openai.com → Settings → Limits
Set alerts at 50% and 80% of your monthly budget.
### 10. Use Batch Processing for Bulk Work
If you're processing many items (emails, documents), batch them instead of sending them one by one. Anthropic offers a Batch API with a 50% discount.
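For example, 50 email summaries can go out as a single batch instead of 50 separate calls. The sketch below uses the Anthropic Python SDK's Message Batches endpoint; the prompt, model choice, and `custom_id` scheme are illustrative, and the exact interface may differ between SDK versions:

```python
import anthropic

client = anthropic.Anthropic()  # expects ANTHROPIC_API_KEY in the environment

emails = ["First email body...", "Second email body..."]  # illustrative inputs

# Submit every item as one batch; batched requests are billed at a 50% discount.
batch = client.messages.batches.create(
    requests=[
        {
            "custom_id": f"email-{i}",
            "params": {
                "model": "claude-3-5-haiku-20241022",  # cheap model for bulk work
                "max_tokens": 300,
                "messages": [
                    {"role": "user", "content": f"Summarize this email:\n\n{email}"}
                ],
            },
        }
        for i, email in enumerate(emails)
    ]
)

# Batches complete asynchronously (within 24 hours); poll for results later.
print(batch.id, batch.processing_status)
```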
## Cost Estimation Cheat Sheet
Typical monthly costs by usage pattern:
| Usage | Model | Est. Monthly Cost |
|-------|-------|-------------------|
| Light (10 msgs/day) | Haiku | $1-3 |
| Medium (50 msgs/day) | Sonnet | $5-15 |
| Heavy (200+ msgs/day) | Sonnet + caching | $15-40 |
| Power user | Opus + Sonnet mix | $30-80 |
With caching enabled, expect 50-80% reduction from these estimates.
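The table assumes fairly small per-message contexts, so it's worth sanity-checking your own pattern. Here is a quick estimate for the "Medium" row, using assumed (not measured) per-message token counts that you should replace with numbers from `/status`:

```python
# Rough monthly estimate for the "Medium" row (per-message numbers are assumptions).
SONNET = {"input": 3.00, "output": 15.00}   # $ per 1M tokens

msgs_per_day = 50
input_tokens_per_msg = 1_500    # system prompt + pruned history + user message
output_tokens_per_msg = 300
days = 30

monthly_input = msgs_per_day * input_tokens_per_msg * days     # 2.25M tokens
monthly_output = msgs_per_day * output_tokens_per_msg * days   # 0.45M tokens

cost = (monthly_input * SONNET["input"] + monthly_output * SONNET["output"]) / 1_000_000
print(f"~${cost:.2f}/month before caching")   # ~$13.50, near the top of the $5-15 range
```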
## Example: Optimized Config
```yaml
# Optimized for cost
models:
  primary: anthropic/claude-sonnet-4-20250514
  fallbacks:
    - anthropic/claude-3-5-haiku-20241022
    - minimax/MiniMax-M2.1
  anthropic:
    cacheControlTtl: 300

contextPruning:
  mode: sliding
  maxMessages: 25
  maxTokens: 60000

agents:
  main:
    heartbeat:
      model: anthropic/claude-3-5-haiku-20241022

tools:
  read:
    maxChars: 15000
  web:
    maxChars: 8000
```

This config uses Sonnet by default and Haiku for automated tasks, enables aggressive caching, and limits context size. Expected savings: 60-80% compared to a naive Opus-only config.
## 📋 Quick Commands
| Command | Description |
|---|---|
| /status | Check current session token usage and cost |
| openclaw status --usage | View usage statistics from CLI |
| openclaw config set models.primary anthropic/claude-sonnet-4-20250514 | Switch default model to Sonnet |
| openclaw config set models.anthropic.cacheControlTtl 300 | Enable 5-minute prompt caching |
| openclaw config set contextPruning.maxMessages 20 | Limit context to last 20 messages |