Voice & Text-to-Speech Setup
Configure TTS providers like ElevenLabs, fix MEDIA: path output issues, and set up hands-free voice-only workflows for mobile or car use.
The Problem
Users encounter issues setting up Text-to-Speech (TTS) with providers like ElevenLabs. The most common problem is the agent outputting MEDIA: /path/to/audio.ogg as plain text instead of attaching the actual voice note. Additionally, users want to configure hands-free voice-only workflows for scenarios like driving where touching the phone isn't safe.
Why This Happens
The MEDIA: path issue occurs when the TTS tool generates audio successfully, but the channel adapter (Telegram, Discord, etc.) doesn't recognize or process the MEDIA: directive to attach the file. This can happen due to:
- Model not following TTS tool instructions - The model outputs the MEDIA: path literally instead of letting the system handle attachment
- TTS auto mode not triggering - When messages.tts.auto is set to always but the model still outputs text
- Channel capability mismatch - The channel may not support voice note attachments in the expected format
- Provider configuration issues - Invalid voiceId, modelId, or API credentials
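To make "processing the MEDIA: directive" concrete, here is a minimal sketch of how a reply could be split into visible text and audio paths. This is an illustration only, not OpenClaw's actual adapter code: the function name `split_media` and the regular expression are hypothetical, and the sketch assumes the path contains no spaces.

```python
import re

# Hypothetical pattern: a "MEDIA:" marker followed by a path that runs
# up to the next whitespace character.
MEDIA_PATTERN = re.compile(r"MEDIA:\s*(\S+)")


def split_media(reply: str) -> tuple[str, list[str]]:
    """Separate MEDIA: paths from the visible text of a reply."""
    paths = MEDIA_PATTERN.findall(reply)
    text = MEDIA_PATTERN.sub("", reply)
    # Collapse the whitespace left behind by the removed directive.
    text = " ".join(text.split())
    return text, paths
```

If a step like this is skipped or fails, the directive stays in the text and the user sees the raw path, which matches the symptom described above.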
For hands-free voice workflows, the challenge is configuring bidirectional voice: Speech-to-Text (STT) for input and TTS for output, without requiring screen interaction.
The Fix
Fix MEDIA: Path Appearing as Text
When you see output like:
    Got it. MEDIA: /Users/you/.openclaw/cache/tts/abc123.ogg

instead of receiving an actual voice note, try these fixes:
1. Verify TTS Configuration
Check your current TTS settings:
    openclaw config get messages.tts

You should see something like:

    {
      "provider": "elevenlabs",
      "auto": "always",
      "elevenlabs": {
        "voiceId": "FF7KdobWPaiR0vkcALHF",
        "modelId": "eleven_v3"
      }
    }

2. Ensure Provider API Key is Set
ElevenLabs requires a valid API key:
    openclaw config set messages.tts.elevenlabs.apiKey "your-elevenlabs-api-key"

Get your API key from: https://elevenlabs.io/app/settings/api-keys
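Before restarting, the settings from steps 1 and 2 can be sanity-checked. The sketch below takes the messages.tts object as a Python dict and reports missing fields. The helper name `check_tts_config` is hypothetical, and it assumes the API key appears under the provider's block in the same object; the real CLI may mask or omit secrets in its output.

```python
def check_tts_config(tts: dict) -> list[str]:
    """Return human-readable problems found in a messages.tts dict."""
    problems = []
    provider = tts.get("provider")
    if not provider:
        problems.append("provider is not set")
        return problems
    # Provider-specific settings live under a key named after the provider.
    settings = tts.get(provider) or {}
    for key in ("voiceId", "modelId", "apiKey"):
        # Assumption: apiKey is visible here; a masked value still counts as set.
        if not settings.get(key):
            problems.append(f"{provider}.{key} is missing or empty")
    return problems
```

An empty list means the fields named in this article are present; it does not prove that the voice ID or the key is valid on the ElevenLabs side.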
3. Restart the Gateway
After any TTS configuration change, restart:
    openclaw gateway restart

4. Check Channel Voice Capability
Not all channels support voice notes the same way. Verify your channel:
- Telegram: Full voice note support via asVoice: true
- Discord: Voice attachments work as audio files
- Slack/iMessage: May render as file attachments
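The list above can be captured as a small lookup, which is handy when deciding what to expect from a given channel during testing. The names `VOICE_DELIVERY` and `expected_delivery` are hypothetical, and the values only restate the behavior described in this article.

```python
# Delivery behavior per channel, as described in the list above.
# Slack and iMessage are listed as "may render as file attachments".
VOICE_DELIVERY = {
    "telegram": "voice note",
    "discord": "audio file",
    "slack": "file attachment",
    "imessage": "file attachment",
}


def expected_delivery(channel: str) -> str:
    """Look up how a channel is expected to present TTS audio."""
    return VOICE_DELIVERY.get(channel.strip().lower(), "unknown")
```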
5. Test TTS Directly
Ask the agent explicitly to speak:
    Say "hello world" as a voice note

Or use the /tts command if available:

    /tts Hello, this is a test

6. Check for Model Instruction Issues
Some models may not properly handle TTS tool calls. If using a non-Anthropic model (like Kimi K2.5), ensure it understands the TTS tool schema. You may need to add explicit instructions in your IDENTITY.md:
    ## Voice Output
    When TTS is enabled, use the tts tool to convert responses to speech.
    Do not output MEDIA: paths as text - the system handles audio attachment automatically.

Hands-Free Voice-Only Setup (Car/Bluetooth)
For safe driving use with Bluetooth, configure a full voice loop:
1. Enable Always-On TTS
Make the agent always respond with voice:
    openclaw config set messages.tts.auto "always"
    openclaw config set messages.tts.provider "elevenlabs"

2. Configure Speech-to-Text (STT)
Enable STT so you can speak instead of typing:
    openclaw config set messages.stt.provider "elevenlabs"
    openclaw config set messages.stt.auto true

Alternative STT providers:
- whisper - OpenAI Whisper (local or API)
- deepgram - Fast and accurate
- google - Google Cloud Speech
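The TTS and STT settings from steps 1 and 2 can be kept in one place and turned into the corresponding command lines. The sketch below only prints or returns the commands shown in this article; it does not run them. The names `HANDS_FREE_SETTINGS` and `build_commands` are hypothetical.

```python
import shlex

# The four settings used in steps 1 and 2 of this section.
HANDS_FREE_SETTINGS = {
    "messages.tts.auto": "always",
    "messages.tts.provider": "elevenlabs",
    "messages.stt.provider": "elevenlabs",
    "messages.stt.auto": "true",
}


def build_commands(settings: dict) -> list[str]:
    """Render one `openclaw config set` command line per setting."""
    commands = [
        shlex.join(["openclaw", "config", "set", key, value])
        for key, value in settings.items()
    ]
    # The article recommends a gateway restart after TTS config changes.
    commands.append("openclaw gateway restart")
    return commands
```

Calling `build_commands(HANDS_FREE_SETTINGS)` yields five lines that can be reviewed and pasted into a terminal. To try a different STT provider from the list above, change the `messages.stt.provider` value.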
3. Telegram Voice Message Workflow
For Telegram, the safest hands-free flow is:
- Send voice messages - Telegram records your speech as a voice note
- OpenClaw transcribes - STT converts your voice to text
- Agent processes - Responds to your request
- TTS converts - Response becomes a voice note
- Bluetooth plays - You hear the response through car speakers
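One turn of this loop can be sketched as three pluggable stages. This is a conceptual outline, not OpenClaw's internals: `handle_voice_message` and its parameters are hypothetical, and the actual STT, agent, and TTS calls are passed in as plain functions.

```python
from typing import Callable


def handle_voice_message(
    audio_in: bytes,
    transcribe: Callable[[bytes], str],
    respond: Callable[[str], str],
    synthesize: Callable[[str], bytes],
) -> bytes:
    """Run one turn of the voice loop: STT, then the agent, then TTS."""
    text_in = transcribe(audio_in)   # STT converts the voice note to text
    text_out = respond(text_in)      # the agent produces a text reply
    return synthesize(text_out)      # TTS turns the reply into audio
```

Keeping the stages separate makes it easier to see which one failed: an empty transcript points at STT, while a reply that arrives as text points at the TTS or attachment step.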
4. iOS Shortcuts Integration (Advanced)
Create an iOS Shortcut for truly hands-free activation:
- Create a Shortcut that sends a message to your Telegram bot
- Use "Hey Siri" to trigger the shortcut
- Dictate your message
- The agent responds via voice note
Example Shortcut actions:
- Get input from Spoken Text
- Send message to Telegram chat
- Wait for response
- Play audio
5. Safety Configuration
For driving, add guardrails in your IDENTITY.md:
    ## Driving Mode
    Keep responses brief and clear for audio playback.
    No visual elements (tables, code blocks) - voice-friendly only.
    Confirm critical actions verbally before executing.
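The driving-mode guardrails can also be checked mechanically when reviewing sample replies. The sketch below flags replies that would read poorly as audio. The function name `voice_unfriendly_reasons` and the 60-word default are hypothetical choices, not values from OpenClaw.

```python
import re

# Three backticks, built programmatically so this example stays easy to embed.
FENCE = "`" * 3


def voice_unfriendly_reasons(reply: str, max_words: int = 60) -> list[str]:
    """List reasons a reply would read poorly as audio (empty means OK)."""
    reasons = []
    if FENCE in reply:
        reasons.append("contains a code block")
    # A markdown table row starts and ends with a pipe character.
    if re.search(r"^\s*\|.*\|\s*$", reply, flags=re.MULTILINE):
        reasons.append("contains a table")
    if len(reply.split()) > max_words:
        reasons.append(f"longer than {max_words} words")
    return reasons
```

A check like this cannot enforce the instructions in IDENTITY.md, but it helps confirm that the model is following them before you rely on the setup while driving.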
Quick Commands
| Command | Description |
|---|---|
| openclaw config get messages.tts | View current TTS configuration |
| openclaw config set messages.tts.provider "elevenlabs" | Set ElevenLabs as TTS provider |
| openclaw config set messages.tts.auto "always" | Enable automatic TTS for all responses |
| openclaw config set messages.tts.elevenlabs.apiKey "YOUR_KEY" | Set ElevenLabs API key |
| openclaw config set messages.tts.elevenlabs.voiceId "VOICE_ID" | Set specific ElevenLabs voice |
| openclaw config set messages.stt.provider "elevenlabs" | Set STT provider for voice input |
| openclaw config set messages.stt.auto true | Enable automatic speech-to-text |
| openclaw gateway restart | Restart gateway after config changes |