Complete OpenClaw Token Optimization Guide

1. Session Initialization

Your agent loads 50KB of history on every message. This wastes 2-3M tokens per session and costs $4/day. If you're using third-party messaging or interfaces that don't have built-in session clearing, this problem compounds fast.

The Solution

Add this session initialization rule to your agent's system prompt:

On every session start:
1. Load ONLY these files:
   - SOUL.md
   - USER.md
   - IDENTITY.md
   - memory/YYYY-MM-DD.md (if exists)

2. DO NOT auto-load:
   - MEMORY.md
   - Session history
   - Prior messages
   - Previous tool outputs

3. When user asks about prior context:
   - Use memory_search() on demand
   - Pull only the relevant snippet
   - Don't load the whole file

4. Update memory/YYYY-MM-DD.md at session end

❌ Before

• 50KB context on startup
• 2-3M tokens wasted/session
• $0.40 per session
• History bloat over time

✓ After

• 8KB context on startup
• Only loads what's needed
• $0.05 per session
• Clean daily memory files

💡 Pro Tip

This saves 80% on context overhead and works with any interface — no built-in session clearing needed.

2. Intelligent Model Routing

Out of the box, OpenClaw defaults to Sonnet for everything. While Sonnet is excellent, it's overkill for routine tasks like checking file status, running commands, or monitoring. Haiku handles these perfectly at 1/10th the cost.

Update Your Config

// ~/.openclaw/openclaw.json
{
  "agents": {
    "defaults": {
      "model": {
        "primary": "anthropic/claude-haiku-4-5"
      },
      "models": {
        "anthropic/claude-sonnet-4-5": { "alias": "sonnet" },
        "anthropic/claude-haiku-4-5": { "alias": "haiku" }
      }
    }
  }
}

Add Routing Rules

MODEL SELECTION RULE:

Default: Always use Haiku

Switch to Sonnet ONLY when:
- Architecture decisions
- Production code review
- Security analysis
- Complex debugging/reasoning
- Strategic multi-project decisions

When in doubt: Try Haiku first

❌ Before

• Sonnet for everything
• $0.003 per 1K tokens
• Overkill for simple tasks
• $50-70/month

✓ After

• Haiku by default
• $0.00025 per 1K tokens
• Right model for the job
• $5-10/month

3. Heartbeat to Local LLM

Your OpenClaw agent sends heartbeat pings to check on tasks and collect updates. If these run on Claude API, they cost money even for simple status checks. Move them to a free local LLM and save instantly.

📊 Impact

If you run 10 heartbeats/day × 30 days = 300 API calls. At $0.003 per heartbeat = $0.90/month saves. But if you heartbeat every 5 minutes that's 8,640/month = $26/month saved just on heartbeats.

Supported Local LLMs

Ollama

4. Rate Limits & Budget Controls

Without guardrails, an automated agent can burn $100+ in minutes if something goes wrong. Set hard limits on daily/monthly spend and request rates.

Add to Your Config

// ~/.openclaw/openclaw.json
{
  "rateLimit": {
    "requestsPerMinute": 10,
    "tokensPerDay": 100000,
    "spendingLimitDaily": 5,
    "spendingLimitMonthly": 100
  },
  "fallback": {
    "enabled": true,
    "model": "anthropic/claude-haiku-4-5"
  }
}

⚠️ Critical

Always set spending limits on production agents. A single runaway loop can cost hundreds of dollars in minutes.

5. Prompt Caching

If you send large context blocks repeatedly (like system prompts, instruction sets, or documentation), use prompt caching to reuse them at 90% discount.

✨ Example Savings

A 10KB system prompt × 100 requests/day = 1M tokens/day × 25 days = 25M tokens/month.

• Without caching: 25M × $0.003 = $75/month
• With caching: 25M × $0.0003 = $7.50/month
• Saves $67.50/month

Ready to Optimize?

These five techniques combined can reduce your costs by 97%. Start with session initialization and model routing — they have the biggest impact and take the least time to implement.

Calculate Your Savings Explore Alternatives View Source