Symptom
After upgrading to OpenClaw v2026.4.9, TTS (text-to-speech) voice note functionality is completely broken:
- TTS Outbound (Voice Note Replies):
  - Assistant calls the `tts` tool with `channel: "telegram"`
  - Tool returns `Generated audio reply`
  - Expected: Voice note delivered to user via Telegram
  - Actual: Only text response received; no voice note delivered
- STT Inbound (Voice Note Transcription):
  - `pdf` tool fails with: `Expected PDF but got audio/ogg`
  - `image` tool fails with: `Unsupported media type: audio`
  - No transcription occurs
- Gateway Logs Show No Delivery Attempts:
  - No TTS-related events logged after tool call
  - No `sendVoice` API calls to Telegram
  - Earlier logs show the last `sendVoice` failures:
    - `telegram sendVoice failed: Network request for 'sendVoice' failed!`
    - `telegram final reply failed: HttpError: Network request for 'sendVoice' failed!`
- Working Functionality:
  - Text messages via Telegram work correctly
  - ElevenLabs API generates audio successfully (credentials valid)
  - Network connectivity to Telegram API is stable
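One way to check the last two points independently of the gateway is to call the Telegram Bot API's `sendVoice` method directly with a generated audio file. The sketch below only builds the multipart request using Python's standard library. The function name `build_send_voice_request` and its arguments are illustrative assumptions and are not part of OpenClaw.

```python
import mimetypes
import urllib.request
import uuid
from pathlib import Path


def build_send_voice_request(token: str, chat_id: str, audio_path: str) -> urllib.request.Request:
    """Build (but do not send) a multipart/form-data POST to Telegram's sendVoice method.

    Hypothetical helper for manual diagnosis; not part of OpenClaw.
    """
    path = Path(audio_path)
    boundary = uuid.uuid4().hex
    # Guess the MIME type from the file extension, with a generic fallback.
    content_type = mimetypes.guess_type(path.name)[0] or "application/octet-stream"
    body = b"\r\n".join([
        f"--{boundary}".encode(),
        b'Content-Disposition: form-data; name="chat_id"',
        b"",
        str(chat_id).encode(),
        f"--{boundary}".encode(),
        f'Content-Disposition: form-data; name="voice"; filename="{path.name}"'.encode(),
        f"Content-Type: {content_type}".encode(),
        b"",
        path.read_bytes(),
        f"--{boundary}--".encode(),
        b"",
    ])
    return urllib.request.Request(
        f"https://api.telegram.org/bot{token}/sendVoice",
        data=body,
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen(...)` performs the call. If Telegram accepts the file this way, the audio and the network path are fine and the fault lies in the gateway's handoff.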
Root Cause Analysis
Based on the evidence provided, this appears to be a regression in the channel delivery layer introduced in v2026.4.9. The reasoning:
- Audio Generation Works: The `tts` tool successfully generates audio files via the ElevenLabs API
- Tool Reports Success: The tool returns `Generated audio reply`
- Delivery Handoff Broken: The audio file is never passed to the gateway's Telegram channel
- No API Call: No `sendVoice` request is made to Telegram's API
- Root Cause Location: Regression in how the `tts` tool result is handled by the channel delivery layer
Additional Factor - Multi-Session Handling:
- The bug report notes that pre-v2026.4.9, audio corruption ("!!!!!!!!!!!!!" strings) appeared when using both Telegram AND webchat simultaneously
- This suggests session handling issues when multiple channels are active simultaneously
- The v2026.4.9 update appears to have worsened this from “corrupted delivery” to “no delivery”
Affected Components:
- `tools` module: TTS tool execution
- `gateway/channels/telegram`: Audio delivery handoff
- Session management: Multi-channel session handling
Solution
Note: This is an open bug report. The following solutions represent workarounds until the regression is fixed in a future release.
Workaround 1: Use Text Responses Instead of TTS
Until the regression is fixed, rely on text messaging:
- Disable automatic TTS voice note replies
- Use standard text responses for assistant replies
Workaround 2: Manual File Attachment (Untested)
As a potential alternative:
- Use the `message` tool with the media path parameter
- Manually specify the path to the generated audio file
- This has not been tested but may work as a temporary solution
Workaround 3: Disable Multi-Channel Sessions
If using both Telegram and webchat simultaneously:
- Use only one channel at a time
- This may temporarily reduce session handling conflicts
- Not a complete fix but may improve stability
Required Fix (For Developers)
The development team needs to investigate:
- Revert or patch the changes to `gateway/channels/telegram` that affected the `sendVoice` handoff
- Review session handling code for multi-channel scenarios
- Add regression tests for the TTS tool → channel delivery flow
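A regression test for the tool-to-channel handoff could start from something like the sketch below. `TtsResult`, `FakeTelegramChannel`, and `deliver_tts_result` are hypothetical stand-ins invented for illustration. This report does not show OpenClaw's actual types or delivery function, so the real test would substitute those.

```python
from dataclasses import dataclass, field


@dataclass
class TtsResult:
    """Hypothetical shape of a TTS tool result: status text, audio path, target channel."""
    message: str
    audio_path: str
    channel: str


@dataclass
class FakeTelegramChannel:
    """Test double that records voice sends instead of calling Telegram."""
    sent_voice: list = field(default_factory=list)

    def send_voice(self, chat_id: str, audio_path: str) -> None:
        self.sent_voice.append((chat_id, audio_path))


def deliver_tts_result(result: TtsResult, channels: dict, chat_id: str) -> bool:
    """Hypothetical handoff: route a TTS result to the channel it names."""
    channel = channels.get(result.channel)
    if channel is None:
        return False
    channel.send_voice(chat_id, result.audio_path)
    return True


def test_tts_result_reaches_telegram() -> None:
    telegram = FakeTelegramChannel()
    result = TtsResult("Generated audio reply", "/tmp/reply.ogg", "telegram")
    delivered = deliver_tts_result(result, {"telegram": telegram}, chat_id="42")
    # The regression described in this report would surface as an empty sent_voice list.
    assert delivered
    assert telegram.sent_voice == [("42", "/tmp/reply.ogg")]
```

The key assertion is on the recorded `send_voice` call rather than on the tool's return text, since in this bug the tool reports success while nothing is delivered.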
Prevention
To prevent similar regressions in future releases:
- Add TTS Delivery Regression Tests:
  - Create automated tests that verify `tts` tool output is properly delivered via each channel
  - Include tests for `sendVoice` API call verification
  - Test with multiple concurrent sessions
- Session Handling Review:
  - Audit session management when multiple channels are active
  - Ensure audio file handles are properly passed between components
  - Add integration tests for multi-channel scenarios
- Pre-Release Testing Checklist:
  - Verify TTS works on all channels (Telegram, webchat, etc.)
  - Test voice note delivery with active multi-session connections
  - Verify `sendVoice` API calls are made for each channel
- Logging Improvements:
  - Add explicit logging when audio files are handed off to channel delivery
  - Log `sendVoice` attempts with file path verification
  - This would help diagnose future delivery failures
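The logging improvement could look roughly like the following sketch, which logs every handoff and verifies that the audio file exists and is non-empty before delivery. The function name `log_voice_handoff` and the logger name are assumptions chosen to mirror the `gateway/channels/telegram` component seen in the logs. They are not OpenClaw's actual code.

```python
import logging
import os

# Hypothetical logger name, mirroring the component name in the gateway logs.
logger = logging.getLogger("gateway.channels.telegram")


def log_voice_handoff(audio_path: str, chat_id: str) -> bool:
    """Log an audio handoff to channel delivery; return True if the file looks usable."""
    exists = os.path.isfile(audio_path)
    size = os.path.getsize(audio_path) if exists else 0
    if exists and size > 0:
        logger.info(
            "sendVoice handoff: chat_id=%s path=%s size=%d", chat_id, audio_path, size
        )
        return True
    # A missing or empty file is logged as an error so a silent drop becomes visible.
    logger.error(
        "sendVoice handoff skipped: chat_id=%s path=%s exists=%s size=%d",
        chat_id, audio_path, exists, size,
    )
    return False
```

With a log line on both the success and failure paths, a case like the one reported here (tool success, no delivery, no log entry) would leave a trace at the point where the handoff is dropped.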
Additional Information
Environment Details:
- OpenClaw version: 2026.4.9 (updated 2026-04-09)
- OS: Ubuntu (VM in VirtualBox on Intel Mac Mini)
- TTS Provider: ElevenLabs (`eleven_multilingual_v2`)
- Channel: Telegram
Error Log Evidence:
- `2026-04-10T10:11:59.515Z error gateway/channels/telegram telegram sendVoice failed: Network request for 'sendVoice' failed!`
- `2026-04-10T10:11:59.521Z error gateway/channels/telegram telegram final reply failed: HttpError: Network request for 'sendVoice' failed!`
- `2026-04-10T10:39:45.453Z error [tools] image failed: Unsupported media type: audio`
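To check whether any `sendVoice` activity was logged after a given TTS tool call, the gateway log can be filtered by timestamp. The sketch below assumes the line layout shown in the evidence above (`<ISO timestamp> <level> <source> <message>`). The function name `send_voice_events` is illustrative, and the layout of other gateway log lines is not confirmed by this report.

```python
import re
from datetime import datetime, timezone

# Assumed layout, inferred from the log excerpts in this report.
LOG_LINE = re.compile(r"^(?P<ts>\S+)\s+(?P<level>\w+)\s+(?P<source>\S+)\s+(?P<msg>.*)$")


def send_voice_events(lines, since):
    """Return (timestamp, level, message) for sendVoice log lines at or after `since`."""
    events = []
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if not match or "sendVoice" not in match["msg"]:
            continue
        try:
            # Normalise the trailing "Z" so older Python versions can parse it.
            ts = datetime.fromisoformat(match["ts"].replace("Z", "+00:00"))
        except ValueError:
            continue  # Skip lines whose first field is not a timestamp.
        if ts >= since:
            events.append((ts, match["level"], match["msg"]))
    return events


sample = [
    "2026-04-10T10:11:59.515Z error gateway/channels/telegram telegram sendVoice failed: Network request for 'sendVoice' failed!",
    "2026-04-10T10:11:59.521Z error gateway/channels/telegram telegram final reply failed: HttpError: Network request for 'sendVoice' failed!",
    "2026-04-10T10:39:45.453Z error [tools] image failed: Unsupported media type: audio",
]
tool_call_time = datetime(2026, 4, 10, 10, 30, tzinfo=timezone.utc)
recent = send_voice_events(sample, tool_call_time)
```

An empty `recent` list for a window that begins at the time of a `tts` tool call matches the symptom described in this report: the tool reports success but no delivery attempt is logged.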
Behavior Change Summary:
| Version | Behavior | Symptom |
|---|---|---|
| Pre-v2026.4.9 | Corrupted audio delivery | “!!!!!!” in webchat voice notes |
| v2026.4.9 | No audio delivery | Voice notes never sent |