Symptom

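After upgrading to OpenClaw v2026.4.9, voice note handling is broken in both directions: TTS (text-to-speech) replies are never delivered, and inbound voice notes are not transcribed by STT (speech-to-text):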

  1. TTS Outbound (Voice Note Replies):

    • Assistant calls tts tool with channel: "telegram"
    • Tool returns "Generated audio reply"
    • Expected: Voice note delivered to user via Telegram
    • Actual: Only text response received; no voice note delivered
  2. STT Inbound (Voice Note Transcription):

    • pdf tool fails with: Expected PDF but got audio/ogg
    • image tool fails with: Unsupported media type: audio
    • No transcription occurs
  3. Gateway Logs Show No Delivery Attempts:

    • No TTS-related events logged after tool call
    • No sendVoice API calls to Telegram
    • The most recent sendVoice entries in the log are earlier failures:
        telegram sendVoice failed: Network request for 'sendVoice' failed!
        telegram final reply failed: HttpError: Network request for 'sendVoice' failed!
  4. Working Functionality:

    • Text messages via Telegram work correctly
    • ElevenLabs API generates audio successfully (credentials valid)
    • Network connectivity to Telegram API is stable

Root Cause Analysis

Based on the evidence provided, this is a regression in the channel delivery layer introduced in v2026.4.9:

  1. Audio Generation Works: The tts tool successfully generates audio files via ElevenLabs API
  2. Tool Reports Success: Tool returns "Generated audio reply"
  3. Delivery Handoff Broken: The audio file is never passed to the gateway’s Telegram channel
  4. No API Call: No sendVoice request is made to Telegram’s API
  5. Root Cause Location: Regression in how the tts tool result is handled by the channel delivery layer

Additional Factor - Multi-Session Handling:

  • The bug report notes that before v2026.4.9, corrupted audio (strings of "!!!!!!!!!!!!!") appeared when Telegram and webchat were used at the same time
  • This suggests a session handling problem when multiple channels are active
  • The v2026.4.9 update appears to have worsened this from “corrupted delivery” to “no delivery”

Affected Components:

  • tools module - TTS tool execution
  • gateway/channels/telegram - Audio delivery handoff
  • Session management - Multi-channel session handling

Solution

Note: This is an open bug report. The options below are workarounds until the regression is fixed in a future release.

Workaround 1: Use Text Responses Instead of TTS

Until the regression is fixed, rely on text messaging:

  • Disable automatic TTS voice note replies
  • Use standard text responses for assistant replies

Workaround 2: Manual File Attachment (Untested)

As a potential alternative:

  1. Use the message tool with a media path parameter
  2. Manually specify the path to the generated audio file
  3. This has not been tested but may work as a temporary solution
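
Before attaching a file by hand, it may be worth confirming that the generated audio is usable at all. The following is a minimal Python sketch, assuming the file is expected to be an Ogg/Opus container (the inbound errors above report audio/ogg; the format the tts tool actually writes is not confirmed by the report). The function names are hypothetical and not part of OpenClaw.

```python
from pathlib import Path

OGG_MAGIC = b"OggS"       # capture pattern that starts every Ogg page
OPUS_MAGIC = b"OpusHead"  # identification header of an Opus stream


def classify_header(data: bytes) -> str:
    """Classify the first bytes of an audio file (assumed Ogg/Opus)."""
    if not data:
        return "empty"
    if not data.startswith(OGG_MAGIC):
        return "not-ogg"
    if OPUS_MAGIC not in data:
        return "ogg-not-opus"
    return "ok"


def check_voice_file(path: str) -> str:
    """Return "missing" or the classification of the file's first 64 bytes."""
    file = Path(path)
    if not file.is_file():
        return "missing"
    with file.open("rb") as handle:
        return classify_header(handle.read(64))
```

A result other than "ok" does not prove the file is unusable (an MP3 output would report "not-ogg"), but "missing" or "empty" would point at the generation step rather than at delivery.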

Workaround 3: Disable Multi-Channel Sessions

If using both Telegram and webchat simultaneously:

  1. Use only one channel at a time
  2. This may temporarily reduce session handling conflicts
  3. Not a complete fix but may improve stability

Required Fix (For Developers)

The development team needs to:

  1. Revert or patch the changes to gateway/channels/telegram that affected sendVoice handoff
  2. Review session handling code for multi-channel scenarios
  3. Add regression tests for TTS tool → channel delivery flow
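
The third item can be sketched as a test against a fake channel. This is an illustrative model in Python, not OpenClaw's code: ToolResult, FakeTelegramChannel and deliver_tool_result are hypothetical names standing in for the tts tool result, the Telegram channel and the delivery layer.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class ToolResult:
    """Hypothetical shape of a tts tool result."""
    text: str
    audio_path: Optional[str] = None
    channel: str = "telegram"


@dataclass
class FakeTelegramChannel:
    """Test double that records calls instead of contacting Telegram."""
    calls: List[Tuple[str, str]] = field(default_factory=list)

    def send_voice(self, path: str) -> None:
        self.calls.append(("sendVoice", path))

    def send_text(self, text: str) -> None:
        self.calls.append(("sendMessage", text))


def deliver_tool_result(result: ToolResult,
                        channels: Dict[str, FakeTelegramChannel]) -> None:
    """Hypothetical delivery layer: a result with audio must reach send_voice."""
    channel = channels[result.channel]
    if result.audio_path is not None:
        channel.send_voice(result.audio_path)
    else:
        channel.send_text(result.text)


def test_tts_result_triggers_send_voice() -> None:
    telegram = FakeTelegramChannel()
    result = ToolResult("Generated audio reply", "/tmp/reply.ogg", "telegram")
    deliver_tool_result(result, {"telegram": telegram})
    # The regression described here would leave this list without a sendVoice entry.
    assert telegram.calls == [("sendVoice", "/tmp/reply.ogg")]


test_tts_result_triggers_send_voice()
```

The point of the fake is that the assertion fails when the handoff is skipped, even though the tool itself still reports success.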

Prevention

To prevent similar regressions in future releases:

  1. Add TTS Delivery Regression Tests:

    • Create automated tests that verify tts tool output is properly delivered via each channel
    • Include tests for sendVoice API call verification
    • Test with multiple concurrent sessions
  2. Session Handling Review:

    • Audit session management when multiple channels are active
    • Ensure audio file handles are properly passed between components
    • Add integration tests for multi-channel scenarios
  3. Pre-Release Testing Checklist:

    • Verify TTS works on all channels (Telegram, webchat, etc.)
    • Test voice note delivery with active multi-session connections
    • Verify sendVoice API calls are made for each channel
  4. Logging Improvements:

    • Add explicit logging when audio files are handed off to channel delivery
    • Log sendVoice attempts with file path verification
    • This would help diagnose future delivery failures
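
The logging item could look roughly like the following Python sketch. The logger name and the log_handoff function are assumptions made for illustration; the report does not describe OpenClaw's logging API.

```python
import logging
import os

# Hypothetical logger name, modeled on the component label seen in the gateway log.
log = logging.getLogger("gateway.channels.telegram")


def log_handoff(audio_path: str, channel: str) -> bool:
    """Log the handoff of an audio file to a channel and verify the file.

    Returns True only if the file exists and is non-empty.
    """
    exists = os.path.isfile(audio_path)
    size = os.path.getsize(audio_path) if exists else 0
    usable = exists and size > 0
    level = logging.INFO if usable else logging.ERROR
    log.log(level, "tts handoff channel=%s path=%s exists=%s bytes=%d",
            channel, audio_path, exists, size)
    return usable
```

With a line like this emitted at every handoff, a log that shows the tts tool call but no matching handoff entry would place the failure before the channel layer.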

Additional Information

Environment Details:

  • OpenClaw version: 2026.4.9 (updated 2026-04-09)
  • OS: Ubuntu (VM in VirtualBox on Intel Mac Mini)
  • TTS Provider: ElevenLabs (eleven_multilingual_v2)
  • Channel: Telegram

Error Log Evidence:

  2026-04-10T10:11:59.515Z error gateway/channels/telegram telegram sendVoice failed: Network request for 'sendVoice' failed!
  2026-04-10T10:11:59.521Z error gateway/channels/telegram telegram final reply failed: HttpError: Network request for 'sendVoice' failed!
  2026-04-10T10:39:45.453Z error [tools] image failed: Unsupported media type: audio
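
To check a gateway log for delivery attempts after a given time, entries like these can be filtered mechanically. The following is a minimal Python sketch, assuming every line follows the "timestamp level component message" layout seen in this excerpt; parse_line and send_voice_entries are hypothetical helper names.

```python
import re
from typing import Iterable, List, NamedTuple, Optional

# Assumed layout, inferred from the excerpt: "<ISO timestamp>Z <level> <component> <message>"
LINE_RE = re.compile(r"^(\S+Z)\s+(\w+)\s+(\S+)\s+(.*)$")


class LogEntry(NamedTuple):
    timestamp: str
    level: str
    component: str
    message: str


def parse_line(line: str) -> Optional[LogEntry]:
    """Split one log line into its fields, or return None if it does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    timestamp, level, component, message = match.groups()
    return LogEntry(timestamp, level, component.strip("[]"), message)


def send_voice_entries(lines: Iterable[str], since: str = "") -> List[LogEntry]:
    """Return entries that mention sendVoice with a timestamp at or after `since`.

    Timestamps in this fixed ISO format compare correctly as plain strings.
    """
    parsed = (parse_line(line) for line in lines)
    return [e for e in parsed
            if e is not None and "sendVoice" in e.message and e.timestamp >= since]


SAMPLE_LOG = [
    "2026-04-10T10:11:59.515Z error gateway/channels/telegram telegram sendVoice failed: Network request for 'sendVoice' failed!",
    "2026-04-10T10:11:59.521Z error gateway/channels/telegram telegram final reply failed: HttpError: Network request for 'sendVoice' failed!",
    "2026-04-10T10:39:45.453Z error [tools] image failed: Unsupported media type: audio",
]
```

For example, send_voice_entries(SAMPLE_LOG, since="2026-04-10T10:12:00") returns an empty list, which matches the observation that no sendVoice attempts were logged after the earlier failures.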

Behavior Change Summary:

  Version         Behavior                   Symptom
  Pre-v2026.4.9   Corrupted audio delivery   “!!!!!!” in webchat voice notes
  v2026.4.9       No audio delivery          Voice notes never sent
