
Run OpenClaw Heartbeats Free with Ollama: Stop Paying for 1,440 API Calls Per Day

2026-03-17 · 9 min read




If your heartbeat runs every minute, that's 1,440 paid API calls per day — just to check if you're alive.

That's not a typo. 60 minutes × 24 hours = 1,440 heartbeat pings per day. Each one hits the Anthropic API. Each one burns tokens on a task so simple it barely deserves the name "task" — essentially, "are you there? respond with OK."

At Haiku pricing ($0.00025/1K input tokens), the per-call cost is tiny. But at 1,440 calls/day with a minimal prompt + response, it adds up to $5–15/month purely on heartbeats. Not on actual work. Just on existence checks.

Worse: these calls count against your API rate limits. If you're running busy automation sessions, your heartbeats are competing with real work for quota.

The fix is to stop using the Anthropic API for heartbeats entirely. Ollama runs a local LLM on your machine — no API calls, no tokens billed, no rate limit impact. The heartbeat cost goes from $5–15/month to exactly $0.

Here's how to set it up.


What Heartbeats Are and Why They're Expensive by Default

OpenClaw heartbeats are periodic health checks. They fire on a schedule to confirm your agent is running, responsive, and able to process requests. They're also used to trigger lightweight monitoring tasks — checking for blockers, progress updates, or time-sensitive items that need attention.

By default, heartbeats use your configured primary model — whatever model OpenClaw uses for everything else. If that's Sonnet, your heartbeats are Sonnet calls. If you've already set Haiku as default (article 2), your heartbeats are Haiku calls. Either way, they're paid API calls.

The math:

  • 1-minute interval: 1,440 calls/day × 30 days = 43,200 calls/month
  • 5-minute interval: 288 calls/day × 30 days = 8,640 calls/month
  • 1-hour interval: 24 calls/day × 30 days = 720 calls/month
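
A few lines of Python reproduce the call counts above, along with the monthly cost figures used later in this article. The per-call cost is a rough assumption (a few hundred tokens per ping at Haiku-class pricing), not an official number:

```python
# Heartbeat volume and rough monthly cost by interval.
# COST_PER_CALL is an assumption: a few hundred tokens per ping at
# Haiku-class pricing works out to roughly $0.00035 per call.
COST_PER_CALL = 0.00035

for label, minutes in [("1-minute", 1), ("5-minute", 5), ("1-hour", 60)]:
    calls_per_day = (24 * 60) // minutes
    calls_per_month = calls_per_day * 30
    monthly_cost = calls_per_month * COST_PER_CALL
    print(f"{label}: {calls_per_day} calls/day, "
          f"{calls_per_month} calls/month, ~${monthly_cost:.2f}/month")
```

Running this gives the 43,200 / 8,640 / 720 calls-per-month figures above, and monthly costs in line with the tables later in the article.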

At Haiku pricing, even the 1-hour interval accumulates. At 1-minute intervals it's a meaningful ongoing cost — again, just for the agent to confirm it's still running.

The heartbeat payload is simple. A prompt like "Check: Any blockers, opportunities, or progress updates needed?" plus a short response is maybe 200–500 tokens total. That doesn't justify the API overhead.


Installing Ollama

Ollama runs open-source LLMs locally. No API key, no cloud, no per-token billing. You download a model once and run it as many times as you want at zero marginal cost.

Install on macOS or Linux:

# macOS / Linux
curl -fsSL https://ollama.ai/install.sh | sh

Or download the macOS app directly from ollama.ai — it installs as a menu bar app that keeps the server running automatically.

Pull the heartbeat model:

# Pull llama3.2:3b (2GB download, one-time)
ollama pull llama3.2:3b

That's the full install. Ollama starts a local API server at http://localhost:11434, which serves both Ollama's native endpoints and an OpenAI-compatible API; this is what OpenClaw's Ollama integration talks to.


Why llama3.2:3b Specifically

There's a 1b version (llama3.2:1b) that's smaller and faster — about 1.3GB vs 2GB. You'll see it mentioned in some guides. For heartbeat-only use cases, 1b works fine.

But the source guide recommends llama3.2:3b for a specific reason: it handles complex context better than 1b for production use.

If your heartbeat prompt is simple ("respond with OK"), 1b is sufficient. But heartbeat prompts often include some agent context — recent activity, current project status, things to watch for. When that context gets more detailed, 1b can produce lower-quality responses or miss the point of the check. 3b is more reliable with richer prompts.

The practical guidance:

  • Starting out with simple heartbeats → llama3.2:1b works, 1.3GB download
  • Heartbeats with agent context or structured responses → llama3.2:3b, 2GB download
  • When in doubt → use 3b. The extra 700MB is worth the reliability

Both run fast on modern hardware. On an M-series Mac, llama3.2:3b responds in under a second for simple prompts.


The Full Heartbeat Config

Update your ~/.openclaw/openclaw.json to add the heartbeat section. Here's the full config including the model routing setup from article 2:

{
  "agents": {
    "defaults": {
      "model": {
        "primary": "anthropic/claude-haiku-4-5"
      },
      "models": {
        "anthropic/claude-sonnet-4-5": {
          "alias": "sonnet"
        },
        "anthropic/claude-haiku-4-5": {
          "alias": "haiku"
        }
      }
    }
  },
  "heartbeat": {
    "every": "1h",
    "model": "ollama/llama3.2:3b",
    "session": "main",
    "target": "slack",
    "prompt": "Check: Any blockers, opportunities, or progress updates needed?"
  }
}

What each field does:

| Field | Value | What it does |
| --- | --- | --- |
| every | "1h" | Fires every hour (adjust to your preference: "5m", "30m", "1h", "6h") |
| model | "ollama/llama3.2:3b" | Routes heartbeat to local Ollama instead of paid API |
| session | "main" | Which agent session handles the heartbeat |
| target | "slack" | Where heartbeat responses go (slack, telegram, or omit to suppress output) |
| prompt | "Check: Any..." | The actual question asked on each heartbeat |

The critical field is "model": "ollama/llama3.2:3b". This is what routes the heartbeat to your local machine instead of Anthropic's API. Everything else is configuration for frequency and routing.
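
If you already have an openclaw.json from the earlier articles, the heartbeat section can be merged in programmatically rather than edited by hand. This is a sketch, not an official OpenClaw tool; the file path and field names are the ones used in this article:

```python
import json

# Heartbeat section from this article; "model" is the field that routes
# heartbeats to local Ollama instead of the paid API.
HEARTBEAT = {
    "every": "1h",
    "model": "ollama/llama3.2:3b",
    "session": "main",
    "target": "slack",
    "prompt": "Check: Any blockers, opportunities, or progress updates needed?",
}

def route_heartbeat_to_ollama(config: dict) -> dict:
    """Return a copy of an openclaw.json config with its heartbeat section replaced."""
    merged = dict(config)
    merged["heartbeat"] = HEARTBEAT
    return merged

# To apply to the real file described in this article:
# from pathlib import Path
# path = Path.home() / ".openclaw" / "openclaw.json"
# path.write_text(json.dumps(route_heartbeat_to_ollama(json.loads(path.read_text())), indent=2))
```

The rest of your config (model aliases, agent defaults) passes through untouched; only the heartbeat key is added or overwritten.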

Adjust the interval based on what you need:

"every": "5m"   // High-frequency monitoring (still free with Ollama)
"every": "1h"   // Standard health check
"every": "6h"   // Low-priority background agent

With a paid API, the interval directly affected cost. With Ollama, you can run every minute at zero cost if you need that level of monitoring. Set it based on how quickly you want to detect issues, not based on cost.


Verifying Ollama Is Running Correctly

After installing and configuring, run these checks:

# Start the Ollama server (if not already running as a service)
ollama serve

# In a separate terminal — test the model responds
ollama run llama3.2:3b "respond with OK"
# Should output something like "OK" within 1-2 seconds

# Verify the API endpoint is live
curl http://localhost:11434/api/generate \
  -d '{"model": "llama3.2:3b", "prompt": "respond with OK", "stream": false}'
# Should return JSON whose "response" field contains something like "OK"

If ollama serve hangs or shows errors:

  • Check if another Ollama instance is already running (ps aux | grep ollama)
  • On macOS with the menu bar app, Ollama starts automatically — no need to run ollama serve manually
  • Check port 11434 isn't blocked (lsof -i :11434)

After configuring heartbeats in openclaw.json:

openclaw shell
session_status
# Look for: Heartbeat: Ollama/local (not API)

If the heartbeat still shows API instead of Ollama/local, verify:

  1. The JSON syntax in openclaw.json is valid (no missing commas or brackets)
  2. Ollama is actually running (ollama serve or the menu bar app is active)
  3. The model is pulled (ollama list — should show llama3.2:3b)
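
The first check (valid JSON) and the model routing can be scripted. This is a hedged sketch; the config path and heartbeat schema are as described in this article:

```python
import json

def heartbeat_model(raw_config: str) -> str:
    """Parse openclaw.json text; return the heartbeat model, or '' if unset.

    json.loads raises JSONDecodeError on missing commas or brackets
    (checklist item 1 above).
    """
    config = json.loads(raw_config)
    return config.get("heartbeat", {}).get("model", "")

# Usage against the real config path from this article:
# from pathlib import Path
# model = heartbeat_model((Path.home() / ".openclaw" / "openclaw.json").read_text())
# print("routed to Ollama" if model.startswith("ollama/") else "still on the paid API")
```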

The Monthly Cost: $5–15 → $0

Here's what you're actually saving, broken down by heartbeat interval:

At 1-minute intervals (1,440 calls/day):

| | Paid API (Haiku) | Ollama |
| --- | --- | --- |
| Daily calls | 1,440 | 1,440 |
| Token cost/day | ~$0.50 | $0 |
| Monthly | ~$15 | $0 |
| Annual | ~$180 | $0 |

At 5-minute intervals (288 calls/day):

| | Paid API (Haiku) | Ollama |
| --- | --- | --- |
| Daily calls | 288 | 288 |
| Token cost/day | ~$0.10 | $0 |
| Monthly | ~$3 | $0 |

At 1-hour intervals (24 calls/day):

| | Paid API (Haiku) | Ollama |
| --- | --- | --- |
| Daily calls | 24 | 24 |
| Token cost/day | ~$0.008 | $0 |
| Monthly | ~$0.25 | $0 |

The absolute dollar savings look small at hourly intervals. What matters is the principle: you're eliminating a recurring cost that provides zero value relative to a local alternative, and you're freeing up API quota for actual work.

At 1-minute or 5-minute intervals, the savings are meaningful ($3–15/month) and the Ollama switch is an obvious move.


When You Might Still Want a Paid Model for Heartbeats

Most of the time, Ollama is the right call. But there are scenarios where you'd stick with a paid API:

The agent runs on a remote server without GPU/CPU capacity for Ollama. Ollama runs fine on most modern hardware, but if your agent is on a tiny VPS or a device with very limited resources, the local LLM might be too slow or fail to run at all. In this case, Haiku at 1-hour intervals is still cheap.

Your heartbeat prompt is doing real reasoning, not just health checks. If you've configured your heartbeat to summarize complex project status, generate reports, or make non-trivial decisions, a smarter model might produce better output. llama3.2:3b is capable but not on par with Sonnet for complex reasoning. If the heartbeat output is something you act on, consider whether Haiku is the right tradeoff.

You need the heartbeat output to be highly reliable in production. For most personal agent setups, occasional quirky Ollama output is fine — you review it manually. For production systems where heartbeat responses trigger automated workflows, you might want the consistency of a hosted model.

For everything else — the vast majority of OpenClaw setups — Ollama handles heartbeats perfectly and costs nothing.


Before vs After

| | ❌ Before | ✓ After |
| --- | --- | --- |
| Heartbeat model | Paid API (Haiku or Sonnet) | Ollama local LLM (free) |
| Daily API calls | 1,440 (at 1-min interval) | 0 |
| Monthly cost | $5–15 | $0 |
| Rate limit impact | Yes — competes with real work | None |
| Flexibility on interval | Limited by cost | Run as frequently as needed |
| Dependency | Anthropic API uptime | Local machine uptime |

The only downside to Ollama: if your machine is off, heartbeats don't run. For an always-on desktop or server, this isn't a concern. For a laptop you close overnight, factor in that heartbeats will pause when the machine sleeps.


Key Takeaways

  • 1,440 API calls/day at 1-minute intervals: This is what default heartbeat behavior looks like. It adds up to $5–15/month just for existence checks.
  • One install, zero ongoing cost: curl -fsSL https://ollama.ai/install.sh | sh + ollama pull llama3.2:3b. That's it. No API keys, no billing, no rate limits.
  • llama3.2:3b is the right size: 2GB, fast, handles production context better than 1b. Don't over-optimize to 1b and get inconsistent responses.
  • The config change is one line: "model": "ollama/llama3.2:3b" in the heartbeat section of openclaw.json.
  • Verify with three commands: ollama serve, ollama run llama3.2:3b "respond with OK", then session_status in openclaw shell.
  • Free interval flexibility: With Ollama, you can run heartbeats every minute if you want without any cost concern. Set the interval based on operational needs, not API economics.

This is the third of five optimizations. Combined with session initialization (article 1) and model routing (article 2), you've already addressed three of the four major cost drivers. Add rate limits and prompt caching, and the full system takes you from $1,500+/month to under $50/month — no complex infrastructure, just smart configuration.
