💬 Messaging · Voice · v2026.3+
$ cat voice-call-integration.md

openclaw.connect('voice-call')

/** "Hey OpenClaw, check my server status" - via a real phone call, anywhere in the world */

✨ Key Features

Real Phone Calls

Call a phone number and talk to OpenClaw. It uses Whisper for speech-to-text and ElevenLabs text-to-speech (TTS) for natural-sounding responses.

Hands-Free Operation

Control your entire OpenClaw setup while driving, cooking, or walking. No screen required.

Emergency Alerts

OpenClaw can call you when something critical happens: a server going down, a security alert, or a smart home emergency.

Multi-Language Voice

Speak in English, Chinese, Japanese, Spanish, or 50+ other languages. OpenClaw responds in the same language.

Twilio / SIP Integration

Use Twilio for cloud phone numbers or connect your own SIP trunk for full telephony control.
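To make the outbound emergency call concrete, here is a minimal sketch of how such a call could be requested through Twilio's public Calls REST endpoint. How OpenClaw actually places the call is not described on this page, so treat this as an illustration under assumptions: `build_alert_call` is a hypothetical helper, the phone numbers and credentials are placeholders, and the request is only built, never sent.

```python
import base64
import urllib.parse
import urllib.request
from xml.sax.saxutils import escape


def build_alert_call(account_sid: str, auth_token: str,
                     from_number: str, to_number: str,
                     message: str) -> urllib.request.Request:
    """Build (but do not send) a Twilio REST request that places an alert call.

    Hypothetical helper for illustration; not part of OpenClaw.
    """
    url = f"https://api.twilio.com/2010-04-01/Accounts/{account_sid}/Calls.json"
    # Inline TwiML: Twilio reads the message aloud once the call is answered.
    twiml = f"<Response><Say>{escape(message)}</Say></Response>"
    body = urllib.parse.urlencode(
        {"To": to_number, "From": from_number, "Twiml": twiml}
    ).encode("ascii")
    # Twilio uses HTTP Basic auth with the account SID and auth token.
    credentials = base64.b64encode(
        f"{account_sid}:{auth_token}".encode("utf-8")
    ).decode("ascii")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Authorization": f"Basic {credentials}"},
    )


# Placeholder values; sending would be urllib.request.urlopen(request).
request = build_alert_call(
    "ACxxx", "secret", "+15550100", "+15550101", "Server down & unreachable"
)
```

Building the request separately from sending it keeps the sketch safe to run and makes the payload easy to inspect before any real call is placed.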

how_it_works.md

📞 How It Works

1. You call your OpenClaw phone number
   A dedicated Twilio number or SIP endpoint
2. Whisper transcribes your speech in real time
   Supports streaming for low latency (~500 ms)
3. OpenClaw processes your request
   Same capabilities as text: skills, memory, browser, etc.
4. TTS speaks the response back to you
   ElevenLabs for natural voice, or edge TTS for speed
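The four steps above amount to one conversational turn: caller audio in, spoken reply out. The sketch below shows that shape with stand-in stages. `VoicePipeline` and its field names are hypothetical, and the stub lambdas replace Whisper, the agent, and the TTS provider so the example runs without any telephony or model dependencies.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class VoicePipeline:
    """One conversational turn. All three stages are illustrative stand-ins."""
    transcribe: Callable[[bytes], str]   # step 2: speech-to-text (e.g. Whisper)
    handle: Callable[[str], str]         # step 3: the agent processes the request
    synthesize: Callable[[str], bytes]   # step 4: text-to-speech (e.g. ElevenLabs)

    def turn(self, caller_audio: bytes) -> bytes:
        """Run the stages in order and return the audio to play to the caller."""
        text = self.transcribe(caller_audio)
        reply = self.handle(text)
        return self.synthesize(reply)


# Stub stages: "audio" is just UTF-8 text here, so the flow is easy to follow.
pipeline = VoicePipeline(
    transcribe=lambda audio: audio.decode("utf-8"),
    handle=lambda text: f"You said: {text}",
    synthesize=lambda reply: reply.encode("utf-8"),
)
reply_audio = pipeline.turn(b"check my server status")
```

Keeping each stage behind a plain callable is one way to swap providers (for example a faster TTS engine) without touching the turn logic.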
config.yaml

⚙️ Configuration

# openclaw/config.yaml
integrations:
  voice-call:
    enabled: true
    provider: "twilio"
    account_sid: "$TWILIO_SID"
    auth_token: "$TWILIO_TOKEN"
    phone_number: "+1-555-CLAW"
    stt_model: "whisper-large-v3"
    tts_provider: "elevenlabs"
    tts_voice: "nova"
    emergency_outbound: true
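The `$TWILIO_SID` and `$TWILIO_TOKEN` values suggest that credentials come from environment variables rather than being written into the file. The exact substitution rules are not documented here, so the following is only a sketch under that assumption: `resolve_secrets` is a hypothetical helper, and the dictionary mirrors part of the `voice-call` section after YAML parsing.

```python
import os
from typing import Any, Mapping

# Part of the voice-call section as it might look after YAML parsing.
voice_call = {
    "enabled": True,
    "provider": "twilio",
    "account_sid": "$TWILIO_SID",
    "auth_token": "$TWILIO_TOKEN",
}


def resolve_secrets(section: Mapping[str, Any],
                    env: Mapping[str, str] = os.environ) -> dict:
    """Replace "$NAME" string values with the matching environment variable.

    Assumed behavior for illustration; fails loudly if a variable is missing.
    """
    resolved = {}
    for key, value in section.items():
        if isinstance(value, str) and value.startswith("$"):
            name = value[1:]
            if name not in env:
                raise KeyError(f"environment variable {name} is not set (needed for {key})")
            resolved[key] = env[name]
        else:
            # Non-placeholder values pass through unchanged.
            resolved[key] = value
    return resolved


# Example with an explicit mapping instead of the real environment.
resolved = resolve_secrets(voice_call, {"TWILIO_SID": "ACxxx", "TWILIO_TOKEN": "secret"})
```

Failing on a missing variable, instead of passing the literal `$TWILIO_SID` through, surfaces a misconfigured deployment before the first call is attempted.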
example_call.log

💬 Example Call

You (phone)
Hey OpenClaw, what's the status of my Hetzner server?
OpenClaw
Your Hetzner CPX41 server is running. CPU at 23%, memory at 68%. The SSL certificate for clawdbot.space expires in 14 days. Want me to renew it?
You
Yes, renew it
OpenClaw
Done. Certificate for clawdbot.space renewed via Certbot. Valid until June 2026. Anything else?

❓ FAQ

Q1. Which voice providers are supported?

Twilio (primary), Vonage, and generic SIP/VoIP. Twilio is recommended for the easiest setup.

Q2. Does it sound natural?

Yes. Voice synthesis uses ElevenLabs or OpenAI TTS, and speech-to-text uses Whisper. Latency is typically under 2 seconds.