💬 Messaging | Voice | v2026.3+
$ cat voice-call-integration.md
openclaw.connect('voice-call')
/** "Hey OpenClaw, check my server status", via a real phone call, anywhere in the world */
✨ Key Features
Real Phone Calls
Call a phone number and talk to OpenClaw. It uses Whisper for speech-to-text and a TTS engine such as ElevenLabs for natural-sounding responses.
Hands-Free Operation
Control your entire OpenClaw setup while driving, cooking, or walking. No screen required.
Emergency Alerts
OpenClaw can call you when something critical happens: a server going down, a security alert, or a smart home emergency.
Multi-Language Voice
Speak in English, Chinese, Japanese, Spanish, or 50+ other languages. OpenClaw responds in the same language.
Twilio / SIP Integration
Use Twilio for cloud phone numbers or connect your own SIP trunk for full telephony control.
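The page does not say how OpenClaw places an emergency call. As a rough sketch, assuming the Twilio provider, an outbound alert could be spoken with a small TwiML document. The helper name `build_alert_twiml` and the sample message are hypothetical; `<Response>` and `<Say>` are standard TwiML verbs.

```python
import xml.etree.ElementTree as ET


def build_alert_twiml(message: str, language: str = "en-US") -> str:
    """Return a TwiML document that speaks `message` once the call is answered.

    Hypothetical helper: not part of OpenClaw's documented API.
    """
    response = ET.Element("Response")
    say = ET.SubElement(response, "Say", {"language": language})
    say.text = message  # ElementTree escapes &, < and > when serializing
    return ET.tostring(response, encoding="unicode")


print(build_alert_twiml("Server web-1 is down & not responding."))
```

With the `twilio` Python package, a string like this can be passed as the `twiml` argument of `client.calls.create(...)`, together with the `to` and `from_` numbers.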
how_it_works.md
How It Works
1. You call your OpenClaw phone number
   A dedicated Twilio number or SIP endpoint.
2. Whisper transcribes your speech in real time
   Supports streaming for low latency (~500 ms).
3. OpenClaw processes your request
   Same capabilities as text: skills, memory, browser, etc.
4. TTS speaks the response back to you
   ElevenLabs for natural voice, or edge TTS for speed.
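Steps 2 to 4 form a three-stage pipeline: audio in, text through the assistant, audio out. The sketch below illustrates that flow only. `VoiceTurn` and its fields are hypothetical names, and the stand-in lambdas replace the real Whisper, OpenClaw, and TTS calls so that it runs without any services.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class VoiceTurn:
    """One caller utterance handled end to end (hypothetical structure)."""

    transcribe: Callable[[bytes], str]  # step 2: speech-to-text, e.g. Whisper
    handle: Callable[[str], str]        # step 3: the same pipeline text requests use
    synthesize: Callable[[str], bytes]  # step 4: text-to-speech

    def run(self, caller_audio: bytes) -> bytes:
        text = self.transcribe(caller_audio)
        reply = self.handle(text)
        return self.synthesize(reply)


# Stand-in stages: bytes are treated as already-transcribed text.
turn = VoiceTurn(
    transcribe=lambda audio: audio.decode("utf-8"),
    handle=lambda text: f"You said: {text}",
    synthesize=lambda reply: reply.encode("utf-8"),
)

print(turn.run(b"check my server status"))
```

Keeping each stage behind a plain callable makes it possible to swap the TTS engine (ElevenLabs for quality, edge TTS for speed) without touching the other steps.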
config.yaml
⚙️ Configuration
# openclaw/config.yaml
integrations:
  voice-call:
    enabled: true
    provider: "twilio"
    account_sid: "$TWILIO_SID"
    auth_token: "$TWILIO_TOKEN"
    phone_number: "+1-555-CLAW"
    stt_model: "whisper-large-v3"
    tts_provider: "elevenlabs"
    tts_voice: "nova"
    emergency_outbound: true
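The `$TWILIO_SID` and `$TWILIO_TOKEN` placeholders suggest the credentials come from environment variables, although the page does not describe how OpenClaw resolves them. The sketch below shows one way to expand and validate such placeholders using only the standard library. `resolve_secrets` and the demo values are hypothetical.

```python
import os

# Mirrors the credential fields of the voice-call block above.
voice_call = {
    "provider": "twilio",
    "account_sid": "$TWILIO_SID",
    "auth_token": "$TWILIO_TOKEN",
}


def resolve_secrets(settings: dict[str, str]) -> dict[str, str]:
    """Expand $VAR placeholders and fail loudly if a variable is unset."""
    resolved = {}
    for key, value in settings.items():
        expanded = os.path.expandvars(value)
        # expandvars leaves unknown variables untouched, so an unchanged
        # placeholder means the environment variable is missing.
        if value.startswith("$") and expanded == value:
            raise KeyError(f"{key}: environment variable {value} is not set")
        resolved[key] = expanded
    return resolved


if __name__ == "__main__":
    # Demo values only; real credentials belong in the environment.
    os.environ.setdefault("TWILIO_SID", "ACxxxxxxxxxxxxxxxx")
    os.environ.setdefault("TWILIO_TOKEN", "example-token")
    print(resolve_secrets(voice_call)["provider"])
```

Failing at startup when a variable is missing is usually preferable to sending a literal `$TWILIO_TOKEN` string to the telephony provider.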
example_call.log
💬 Example Call
You (phone)
Hey OpenClaw, what's the status of my Hetzner server?
OpenClaw
Your Hetzner CPX41 server is running. CPU at 23%, memory at 68%. The SSL certificate for clawdbot.space expires in 14 days. Want me to renew it?
You
Yes, renew it
OpenClaw
Done. Certificate for clawdbot.space renewed via Certbot. Valid until June 2026. Anything else?
❓ FAQ
Q1. Which voice providers are supported?
Twilio (primary), Vonage, and generic SIP/VoIP. Twilio is recommended as the easiest to set up.
Q2. Does it sound natural?
Yes. It uses ElevenLabs or OpenAI TTS for voice synthesis and Whisper for speech-to-text. Latency is typically under 2 seconds.