Advanced Claude API Integration
Route complex reasoning tasks to Claude 3.5 Sonnet when local models reach their limit, while maintaining a "Hybrid-First" workflow.
The 'Hybrid-First' Philosophy
OpenClaw is designed to be model-agnostic. While running 100% locally on Ollama is ideal for privacy and cost, certain tasks, like massive code refactors, complex logical puzzles, or 200K+ token document analysis, require the frontier intelligence of Claude 3.5. Our integration focuses on 'Smart Routing': using local models for intent classification and simple automation, while transparently escalating complex payloads to the Anthropic cloud.
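The routing decision above can be sketched as a simple heuristic. This is an illustrative sketch, not OpenClaw's actual implementation: the function names, intent labels, and token threshold are all assumptions chosen for clarity.

```python
# Hypothetical sketch of "Smart Routing". Thresholds, intent names, and
# the token estimator are illustrative assumptions, not OpenClaw's API.

ESCALATION_TOKEN_THRESHOLD = 8_000  # rough point where local models struggle
COMPLEX_INTENTS = {"refactor", "logic_puzzle", "long_document_analysis"}

def estimate_tokens(text: str) -> int:
    """Cheap heuristic: roughly 4 characters per token for English text."""
    return len(text) // 4

def route(prompt: str, intent: str) -> str:
    """Return the backend to use: 'ollama' for simple, small tasks,
    'anthropic' when the payload or intent demands frontier reasoning."""
    if intent in COMPLEX_INTENTS:
        return "anthropic"
    if estimate_tokens(prompt) > ESCALATION_TOKEN_THRESHOLD:
        return "anthropic"
    return "ollama"
```

In practice the intent label would come from a fast local classifier, so the Anthropic API is only touched when the local model has already decided the task is out of its depth.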
When to Use Claude vs. Ollama
Frontier Model Selection
| Model | Context | Cost / 1M tok | Optimized Usage |
|---|---|---|---|
| claude-3-5-sonnet-20241022 | 200K | $3 in / $15 out | The gold standard for coding and reasoning |
| claude-3-5-haiku-20241022 | 200K | $0.80 / $4 | Ultra-fast, lowest-cost Claude option |
| claude-3-opus-20240229 | 200K | $15 / $75 | Highest capability, maximum nuance |
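The table's pricing makes per-request costs easy to estimate. The snippet below is a sketch using the figures above; it is not an OpenClaw utility, and it ignores prompt-caching discounts.

```python
# Per-million-token pricing (USD) taken from the table above.
PRICING = {
    "claude-3-5-sonnet-20241022": (3.00, 15.00),
    "claude-3-5-haiku-20241022": (0.80, 4.00),
    "claude-3-opus-20240229": (15.00, 75.00),
}

def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimated request cost in USD, ignoring prompt-caching discounts."""
    in_price, out_price = PRICING[model]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000
```

For example, a 100K-token input with a 4K-token response on Sonnet comes to $0.36, while the same request on Opus would cost five times as much.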
Pro Feature: Prompt Caching
For repetitive tasks (like asking questions about the same large PDF), OpenClaw automatically enables Anthropic's Prompt Caching. This reduces your API costs by up to 90% and cuts latency by 50% for sequential queries on the same context.
config.yaml Configuration
Pro Tip: Set max_tokens to 8192 for Sonnet to enable the extended output window.
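A configuration along these lines would wire the pieces together. This is a hypothetical sketch: the key names and structure are illustrative assumptions, not OpenClaw's documented schema, so check the project's reference config before copying it.

```yaml
# Hypothetical config.yaml sketch; keys are illustrative, not OpenClaw's
# documented schema.
providers:
  anthropic:
    api_key_env: ANTHROPIC_API_KEY       # read the key from the environment
    model: claude-3-5-sonnet-20241022
    max_tokens: 8192                     # extended output window for Sonnet
    prompt_caching: true                 # reuse large contexts across queries
  ollama:
    host: http://localhost:11434
    model: llama3.1

routing:
  default: ollama                        # stay local unless escalation triggers
  escalate_to: anthropic
  escalation_token_threshold: 8000
```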