🤖 AI Models · Recommended · v1.0+ · 100% local · free forever

$ cat ollama-integration.md

openclaw.useModel('ollama', { local: true })

/** OpenClaw + Ollama: the ultimate local AI stack. Your data never leaves your machine. */

// Why Ollama is the default recommendation

🔒

100% privacy and data sovereignty

// No API calls, no token tracking, and no data leaving your machine. Well suited to sensitive logs or personal documents.

💸

Free forever

// Download 100+ open-source models at zero cost. No rate limits, no subscriptions, no surprise bills.

⚡

Low-latency inference

// Mac M-series: 20-45 tok/s. Fast enough for real-time Telegram/WhatsApp chat and automation workflows.

hardware_requirements.md

💻 Minimum hardware requirements

8GB RAM / VRAM

Runs 3B - 7B models (llama3.2:3b, qwen2.5:7b). Suitable for basic chat.

16GB RAM / VRAM

Runs 8B - 14B models (llama3.1:8b, gemma2:9b). The best baseline for general-purpose use.

32GB+ RAM / VRAM

Runs 32B+ models (qwen2.5:32b). Suitable for complex coding and logical reasoning tasks.
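
// The tiers above roughly track the size of the model weights: parameter count times bits per weight, divided by 8. A minimal Python sketch of that arithmetic, assuming 4-bit quantization and a flat 20% overhead for context and runtime buffers; both numbers and the function names are illustrative assumptions, not measured values.

```python
def estimate_model_memory_gb(params_billion: float,
                             bits_per_weight: float = 4.0,
                             overhead: float = 1.2) -> float:
    """Rough memory footprint in GB (assumed formula, not a measurement)."""
    # 1e9 parameters * bits per weight / 8 bits per byte = gigabytes of weights
    weights_gb = params_billion * bits_per_weight / 8
    # Flat multiplier standing in for KV cache and runtime buffers (assumption)
    return round(weights_gb * overhead, 1)


def fits_in_memory(params_billion: float, available_gb: float) -> bool:
    """True if the estimated footprint fits in the given RAM/VRAM budget."""
    return estimate_model_memory_gb(params_billion) <= available_gb
```

// Under these assumptions an 8B model needs about 4.8 GB and fits the 8GB tier only with little headroom, which is why 16GB is the more comfortable baseline.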

step_01_install_ollama.sh

Step 1: Install Ollama and pull a model

# macOS

$ brew install ollama

# Linux (one-liner)

$ curl -fsSL https://ollama.ai/install.sh | sh

# Windows: download installer from ollama.ai

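# Start the server if it is not already running as a background service

$ ollama serve # starts on http://localhost:11434

# Pull your first model (recommended for OpenClaw); run this in a second terminal

$ ollama pull llama3.1:8b

# pulling manifest... done ✓ 4.7 GB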
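
// Before moving on, it can help to confirm that the server answers and the model was pulled. A minimal Python sketch using Ollama's `/api/tags` endpoint, which lists local models; the helper names `fetch_tags` and `model_is_pulled` are hypothetical.

```python
import json
from urllib.request import urlopen


def fetch_tags(base_url: str = "http://localhost:11434") -> dict:
    """Return the parsed JSON from Ollama's /api/tags (list of local models)."""
    with urlopen(f"{base_url}/api/tags", timeout=5) as resp:
        return json.load(resp)


def model_is_pulled(tags: dict, model: str) -> bool:
    """Check whether `model` appears by name in an /api/tags response."""
    return any(m.get("name") == model for m in tags.get("models", []))


# Usage (requires a running Ollama server):
#   model_is_pulled(fetch_tags(), "llama3.1:8b")
```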

model_comparison.md

📊 OpenClaw recommended model matrix

// Tested on a Mac Mini M4 (16GB unified memory) and a Hetzner CPX21 (Ubuntu)

Model	Size	Speed	Best For
llama3.2:3b	2.0 GB	~45 tok/s	Fast replies, chat
llama3.1:8b (recommended)	4.7 GB	~28 tok/s	General purpose ⭐
llama3.1:70b	40.0 GB	~8 tok/s	Complex reasoning
gemma2:9b	5.4 GB	~25 tok/s	Code + structured output
mistral:7b	4.1 GB	~30 tok/s	EU data sovereignty
qwen2.5:14b	8.9 GB	~18 tok/s	Best for Chinese text
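
// The Speed column translates directly into waiting time: reply length in tokens divided by tokens per second. A small Python sketch of that arithmetic using a few rows from the table; it ignores prompt processing and model load time, so treat the results as a lower bound.

```python
# Approximate generation speeds copied from the table above (tokens per second)
SPEED_TOK_PER_S = {
    "llama3.2:3b": 45,
    "llama3.1:8b": 28,
    "llama3.1:70b": 8,
    "qwen2.5:14b": 18,
}


def generation_seconds(model: str, reply_tokens: int) -> float:
    """Seconds needed to generate `reply_tokens` at the listed speed."""
    return round(reply_tokens / SPEED_TOK_PER_S[model], 1)
```

// For a 200-token reply this gives about 7 seconds on llama3.1:8b and 25 seconds on llama3.1:70b, which is the practical difference between a chat that feels live and one that does not.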

step_02_config.yaml

Step 2: Connect OpenClaw to Ollama

# openclaw/config.yaml

ai:
  provider: "ollama"
  base_url: "http://localhost:11434"
  model: "llama3.1:8b"
  context_window: 8192

$ openclaw start

# ✓ Connected to Ollama at localhost:11434

# ✓ Model: llama3.1:8b (loaded, 4.7GB)

# ✓ OpenClaw ready.
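
// To see what these settings amount to on the wire, here is a minimal Python sketch that turns the config values into a request for Ollama's `/api/generate` endpoint. How OpenClaw builds its requests internally is not documented here; mapping `context_window` to Ollama's `num_ctx` option is an assumption, and `build_generate_request` is a hypothetical helper.

```python
import json
from urllib.request import Request

# Values mirrored from openclaw/config.yaml above
CONFIG = {
    "base_url": "http://localhost:11434",
    "model": "llama3.1:8b",
    "context_window": 8192,
}


def build_generate_request(config: dict, prompt: str) -> Request:
    """Build (but do not send) a POST request for Ollama's /api/generate."""
    body = {
        "model": config["model"],
        "prompt": prompt,
        "stream": False,  # ask for one JSON object instead of a stream
        # Assumption: context_window corresponds to Ollama's num_ctx option
        "options": {"num_ctx": config["context_window"]},
    }
    return Request(
        url=f"{config['base_url']}/api/generate",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

// Passing the result to `urllib.request.urlopen` sends it; the generated text is in the `response` field of the returned JSON.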

performance_tips.md

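⚡ Advanced performance tuning

Keep the Ollama model resident in memory

$ OLLAMA_KEEP_ALIVE=24h ollama serve

// Prevents the model from being unloaded between requests, eliminating the 5-10 second cold-start delay.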

Increase GPU layers (NVIDIA/AMD)

$ OLLAMA_GPU_LAYERS=33 ollama serve

// Offloads more Transformer layers to the GPU, which can speed up inference substantially.
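
// Keep-alive and GPU offload can also be set per request: Ollama's HTTP API accepts a `keep_alive` field and a `num_gpu` option (the number of layers to offload). A minimal Python sketch of such a request body; `tuning_payload` is a hypothetical helper, and whether OpenClaw forwards these fields is an assumption.

```python
from typing import Optional


def tuning_payload(model: str, prompt: str,
                   keep_alive: str = "24h",
                   gpu_layers: Optional[int] = None) -> dict:
    """Build an /api/generate body with per-request tuning fields."""
    payload = {
        "model": model,
        "prompt": prompt,
        "stream": False,
        "keep_alive": keep_alive,  # how long the model stays loaded afterwards
    }
    if gpu_layers is not None:
        # num_gpu is Ollama's option for the number of layers sent to the GPU
        payload["options"] = {"num_gpu": gpu_layers}
    return payload
```

// Leaving `gpu_layers` unset keeps Ollama's automatic layer placement, which is usually the safer starting point.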

Parallel requests for multiple users

$ OLLAMA_NUM_PARALLEL=4 ollama serve

// Allows 4 concurrent generations. Set this if you run your OpenClaw bot in group chats.
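
// On the client side, matching that limit means sending at most 4 requests at a time. A minimal Python sketch with a thread pool; `send` stands for any function that posts one prompt to Ollama and returns the reply, and is a placeholder rather than part of OpenClaw.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List


def generate_concurrently(prompts: List[str],
                          send: Callable[[str], str],
                          max_parallel: int = 4) -> List[str]:
    """Run `send` over all prompts with at most `max_parallel` in flight.

    Results come back in the same order as `prompts`.
    """
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        return list(pool.map(send, prompts))
```

// Keeping `max_parallel` equal to `OLLAMA_NUM_PARALLEL` avoids requests piling up in the server's queue.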

troubleshoot.log

🔧 Common problems and fixes

Q: Error: Connection refused (localhost:11434)

A: Ollama is not running. Start it with `ollama serve`, or make sure the background service is active.
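
// A quick way to tell "Ollama is down" apart from other failures is to test the TCP port directly. A minimal Python sketch; `ollama_port_open` is a hypothetical helper.

```python
import socket


def ollama_port_open(host: str = "localhost", port: int = 11434,
                     timeout: float = 1.0) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts
        return False
```

// If this returns False, the server is not listening; if it returns True but OpenClaw still fails, check `base_url` in the config instead.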

Q: OpenClaw replies are extremely slow

A: The model may have spilled into CPU swap memory. Choose a smaller or more heavily quantized model, or add system RAM/VRAM.

🚀 Next steps

→ Claude API

// Use cloud AI when local isn't enough

→ Local AI Setup Guide

// Full hardware + OS setup walkthrough

❓ FAQ

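Q1. Which models are best?

Llama 3 (8B) for general use, Mistral 7B for speed, CodeLlama for coding tasks. All of them run in 16GB of memory.

Q2. How much memory is needed?

7B models need at least 8GB, 16GB is recommended for 13B models, and 70B models need 32GB+. A GPU is optional but helps.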
