April 26, 2026

Telegram Forum: Message Sent to Wrong Topic After Anthropic 529 Retry

When the Anthropic API returns 529 (overloaded) and OpenClaw retries the request, the reply is sent without the correct message_thread_id, so it never appears in the forum topic it was meant for.

🔍 Symptoms

Primary Symptom

After an Anthropic API 529 retry cycle, the Telegram reply message is sent successfully (Telegram API returns ok and a valid message_id), but the message does not appear in the expected forum topic.

Log Evidence

2026-03-03T11:19:05.208Z [agent/embedded] embedded run agent end: runId=561c9fa1 isError=true error=The AI service is temporarily overloaded.
2026-03-03T11:19:05.685Z [agent/embedded] embedded run agent end: runId=81dab484 isError=true error=The AI service is temporarily overloaded.
2026-03-03T11:24:34.955Z [telegram] sendMessage ok chat=-1003885638534 message=13832

Diagnostic Symptoms

No "thread not found" error: Telegram did not reject the thread ID
No message_thread_id in logs: the debug output omits the thread parameter, obscuring diagnosis
Gap of about 5.5 minutes (11:19:05 to 11:24:34) between the last 529 error and sendMessage, which indicates a retry with backoff
Missing session entries: no session records for topic:562 on the day of the incident
Stale-socket restart occurred 7 minutes after sendMessage, by which point the message was already lost
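
The gap and the missing thread parameter can both be read straight from the log lines. The following is a minimal sketch that assumes the log format shown under Log Evidence (an ISO timestamp, then a space, then the message); the helper names parseLine, overloadToSendGapMs and sendLineHasThread are hypothetical and not part of OpenClaw.

```javascript
// Log lines copied from the "Log Evidence" section.
const logLines = [
  "2026-03-03T11:19:05.208Z [agent/embedded] embedded run agent end: runId=561c9fa1 isError=true error=The AI service is temporarily overloaded.",
  "2026-03-03T11:19:05.685Z [agent/embedded] embedded run agent end: runId=81dab484 isError=true error=The AI service is temporarily overloaded.",
  "2026-03-03T11:24:34.955Z [telegram] sendMessage ok chat=-1003885638534 message=13832",
];

// Hypothetical helper: split a line into its timestamp (ms since epoch) and the rest.
function parseLine(line) {
  const spaceIndex = line.indexOf(" ");
  return {
    time: Date.parse(line.slice(0, spaceIndex)),
    text: line.slice(spaceIndex + 1),
  };
}

// Milliseconds between the last overload error and the first sendMessage after it.
function overloadToSendGapMs(lines) {
  const entries = lines.map(parseLine);
  const overloads = entries.filter((e) => e.text.includes("temporarily overloaded"));
  if (overloads.length === 0) return null;
  const lastOverload = overloads[overloads.length - 1];
  const send = entries.find(
    (e) => e.text.includes("sendMessage ok") && e.time >= lastOverload.time
  );
  return send ? send.time - lastOverload.time : null;
}

// True if any sendMessage line carries a thread=<value> field.
function sendLineHasThread(lines) {
  return lines.some((l) => l.includes("sendMessage ok") && /thread=\S+/.test(l));
}

console.log(overloadToSendGapMs(logLines)); // 329270 (about 5.5 minutes)
console.log(sendLineHasThread(logLines)); // false
```

A gap this long combined with a send line that has no thread field is the signature to look for when scanning other incidents.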

User-Facing Behavior

The original message in topic 562 receives no response
The response message ID exists in Telegram's database (confirmed by API response)
The message is invisible in both the target topic AND the General topic
The message appears to be "sent" but is effectively orphaned

🧠 Root Cause

Primary Failure: Thread Context Loss During Retry

The root cause is a context propagation failure in the retry pipeline. When Anthropic returns HTTP 529, the following sequence occurs:

Message received — OpenClaw receives a Telegram update containing message.chat.id, message.message_thread_id: 562, and conversation context
API call initiated — OpenClaw calls Anthropic's messages API with the conversation context
529 error received — Anthropic returns HTTP 529: The AI service is temporarily overloaded
Retry triggered — OpenClaw's retry mechanism (with backoff) re-attempts the API call
Context corruption — During the retry cycle, the message_thread_id from the original Telegram update is not carried forward to the sendMessage call

Architectural Issue: Session State vs. Inline Context

OpenClaw uses a session-based architecture where conversation context is stored in a session store. The critical bug occurs when:

// Simplified flow showing the failure point
async function handleUpdate(update) {
  const chatId = update.message.chat.id;
  const threadId = update.message.message_thread_id; // 562 - captured here

  // On first attempt, session is created/loaded (keyed by chat, not by topic)
  const session = await sessionStore.get(chatId);
  session.threadId = threadId;
  await sessionStore.set(chatId, session);

  // ... API call made, 529 received, retried with backoff ...
  const response = await callAIWithRetry(update.message.text, session.context);

  // After the retry, session state may be stale or overwritten
  const retrySession = await sessionStore.get(chatId);
  // retrySession.threadId could be undefined, null, or another topic's ID

  // sendMessage called with whatever the session holds now
  await telegram.sendMessage({
    chat_id: chatId,
    text: response,
    message_thread_id: retrySession.threadId // BUG: may be undefined
  });
}

Contributing Factors

Retry delay creates race condition — The 5-minute backoff between the 529 and the retry allows session state to be cleared, corrupted, or overwritten

No thread_id in sendMessage logs — The debug statement omits message_thread_id, preventing early detection:

// Current (broken) log format
console.log(`sendMessage ok chat=${chatId} message=${messageId}`);
// Missing: message_thread_id=${threadId || 'undefined'}

Session store TTL/expiry — If sessions expire during the retry window, thread context is lost
Concurrent message handling — If another message arrives in a different topic during the retry, session state can be overwritten
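
The overwrite described in the last factor can be reproduced with an in-memory store. This is a sketch under stated assumptions: the Map stands in for the real session store, the key formats are invented for illustration, and topic 777 is a hypothetical second topic. Whether OpenClaw keys sessions by chat or by chat and topic is not established here; the point is only to show how a chat-only key lets one topic's thread ID replace another's.

```javascript
// In-memory stand-in for the session store (hypothetical; not OpenClaw's real store).
const sessions = new Map();

// Chat-only key: every topic in the same forum shares one session entry.
const chatKey = (chatId) => `session:${chatId}`;
// Chat+topic key: each topic gets its own entry ("main" when there is no thread ID).
const topicKey = (chatId, threadId) => `session:${chatId}:topic:${threadId ?? "main"}`;

// Store the thread ID of an incoming message under the chosen key scheme.
function recordIncoming(keyFn, message) {
  const key = keyFn(message.chat.id, message.message_thread_id);
  sessions.set(key, { threadId: message.message_thread_id });
  return key;
}

// Look up the thread ID that a session-dependent reply would use.
function threadForReply(keyFn, message) {
  const key = keyFn(message.chat.id, message.message_thread_id);
  return sessions.get(key)?.threadId;
}

const chat = { id: -1003885638534 };
const first = { chat, message_thread_id: 562, text: "question in topic 562" };
const second = { chat, message_thread_id: 777, text: "message in another topic" };

// Chat-only keys: the second message arrives while the first is still retrying.
recordIncoming(chatKey, first);
recordIncoming(chatKey, second);
const staleThread = threadForReply(chatKey, first); // 777: the wrong topic

// Chat+topic keys: the two messages no longer share state.
sessions.clear();
recordIncoming(topicKey, first);
recordIncoming(topicKey, second);
const isolatedThread = threadForReply(topicKey, first); // 562

console.log({ staleThread, isolatedThread });
```

Scoping keys by topic narrows the race but does not remove the dependency on session state, which is why the fix below reads the thread ID from the incoming message instead.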

Why No Error Is Raised

Telegram accepts a sendMessage call that has no message_thread_id: the parameter is optional, and in a forum group a message sent without it is routed to the "General" topic instead of being rejected. How that message is then displayed reportedly varies by client and Telegram version, and some clients may hide it entirely when the original context was a different thread, which matches the orphaned message observed here.
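
Because Telegram will not raise an error, the application has to. One option is a check that runs just before every send and compares the outgoing payload with the incoming message. The sketch below is an assumption about how such a guard could look: checkThreadContext is a hypothetical helper, while is_topic_message and message_thread_id are the fields Telegram sets on messages posted inside a forum topic.

```javascript
// Hypothetical pre-send check; not part of OpenClaw or the Telegram Bot API.
function checkThreadContext(incomingMessage, payload) {
  // is_topic_message is set by Telegram on messages sent inside a forum topic.
  const expected = incomingMessage.is_topic_message
    ? incomingMessage.message_thread_id
    : undefined;

  if (expected !== undefined && payload.message_thread_id !== expected) {
    return {
      ok: false,
      reason: `expected message_thread_id=${expected}, got ${payload.message_thread_id}`,
    };
  }
  return { ok: true };
}

const incoming = {
  chat: { id: -1003885638534 },
  message_thread_id: 562,
  is_topic_message: true,
};
const badPayload = { chat_id: incoming.chat.id, text: "reply" };
const goodPayload = { ...badPayload, message_thread_id: 562 };

console.log(checkThreadContext(incoming, badPayload)); // ok: false, with a reason
console.log(checkThreadContext(incoming, goodPayload)); // ok: true
```

Logging or throwing on a failed check turns a silent misroute into a visible error at the moment it happens.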

🛠️ Step-by-Step Fix

Step 1: Ensure Thread ID is Passed to sendMessage

Modify the Telegram adapter so that the caller always passes message_thread_id explicitly in options, and log the thread with every send so that a missing value is visible immediately:

// BEFORE (broken implementation)
async sendMessage(chatId, text, options = {}) {
  const payload = {
    chat_id: chatId,
    text: text,
    // message_thread_id not included - defaults to 0/undefined
    ...options
  };
  
  const result = await this.telegram.sendMessage(payload);
  console.log(`sendMessage ok chat=${chatId} message=${result.message_id}`);
  return result;
}

// AFTER (fixed implementation)
async sendMessage(chatId, text, options = {}) {
  // message_thread_id MUST be passed explicitly in options.
  // The adapter does not guess it - the caller is responsible.
  const payload = {
    chat_id: chatId,
    text: text,
    parse_mode: 'Markdown',
    ...options
  };

  // Send first; the log line below needs result.message_id
  const result = await this.telegram.sendMessage(payload);

  // Enhanced logging with thread_id
  console.log(`sendMessage ok chat=${chatId} thread=${payload.message_thread_id ?? 'main'} message=${result.message_id}`);
  return result;
}

Step 2: Preserve Thread ID Through Retry Cycles

Ensure the thread_id from the incoming message is carried through to the sendMessage call, regardless of session state:

// BEFORE (session-dependent)
async handleMessage(ctx, messageText) {
  const session = await this.getSession(ctx.chat.id);
  const response = await this.callAIWithRetry(messageText, session.context);
  
  // Thread ID from session - may be stale after retry
  await this.telegram.sendMessage(ctx.chat.id, response, {
    message_thread_id: session.threadId
  });
}

// AFTER (incoming message context preserved)
async handleMessage(ctx, messageText) {
  // Capture thread_id from the ACTUAL incoming message, not session
  const originalThreadId = ctx.message.message_thread_id;
  
  const session = await this.getSession(ctx.chat.id);
  const response = await this.callAIWithRetry(messageText, session.context);
  
  // Always use the original message's thread_id
  await this.telegram.sendMessage(ctx.chat.id, response, {
    message_thread_id: originalThreadId
  });
}
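
The effect of capturing the thread ID up front can be checked without Telegram or Anthropic. The following sketch replaces both with stubs: makeFlakyAI and makeRecordingTelegram are hypothetical test doubles, and the short baseDelayMs is chosen only so the example finishes quickly. It is an illustration of the pattern, not OpenClaw's handler.

```javascript
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Stub AI client: fails with a 529-style error `failures` times, then succeeds.
function makeFlakyAI(failures) {
  let calls = 0;
  return async function callAI(text) {
    calls += 1;
    if (calls <= failures) {
      const error = new Error("overloaded");
      error.status = 529;
      throw error;
    }
    return `echo: ${text}`;
  };
}

// Stub Telegram adapter that records every payload it is asked to send.
function makeRecordingTelegram() {
  const sent = [];
  return {
    sent,
    async sendMessage(chatId, text, options = {}) {
      sent.push({ chat_id: chatId, text, ...options });
      return { message_id: sent.length };
    },
  };
}

async function handleMessage(ctx, callAI, telegram, { maxAttempts = 3, baseDelayMs = 10 } = {}) {
  // Captured once from the incoming message; nothing below reassigns it.
  const originalThreadId = ctx.message.message_thread_id;

  let response;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      response = await callAI(ctx.message.text);
      break;
    } catch (error) {
      // Only 529 is retried, and only while attempts remain.
      if (error.status !== 529 || attempt === maxAttempts) throw error;
      await sleep(baseDelayMs * attempt);
    }
  }

  return telegram.sendMessage(ctx.chat.id, response, {
    message_thread_id: originalThreadId,
  });
}

// Usage: two overload errors, then a reply that still lands in topic 562.
const demoCtx = {
  chat: { id: -1003885638534 },
  message: { text: "hello", message_thread_id: 562 },
};
const demoTelegram = makeRecordingTelegram();
handleMessage(demoCtx, makeFlakyAI(2), demoTelegram).then(() => {
  console.log(demoTelegram.sent);
});
```

Because originalThreadId is a const local to one invocation, no session expiry or concurrent update can change it between the first attempt and the send.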

Step 3: Add Thread ID to All Send Operations

Ensure all Telegram send methods include thread_id when operating in a forum context:

// Helper to build send options with thread context
function buildSendOptions(originalMessage, overrides = {}) {
  const options = { ...overrides };
  
  // Always include thread_id if original message had one
  if (originalMessage.message_thread_id) {
    options.message_thread_id = originalMessage.message_thread_id;
  }
  
  return options;
}

// Usage
const sendOptions = buildSendOptions(ctx.message);
await this.telegram.sendMessage(ctx.chat.id, text, sendOptions);
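await this.telegram.sendPhoto(ctx.chat.id, photo, sendOptions);
// Edit methods such as editMessageReplyMarkup take no message_thread_id;
// they address an existing message by chat_id and message_id.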

Step 4: Improve Retry Logging

Log the thread_id at each retry attempt to aid debugging:

async callAIWithRetry(message, context, threadId) {
  const maxRetries = 3;
  const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));
  let lastError;

  for (let attempt = 1; attempt <= maxRetries; attempt++) {
    console.log(`[retry] attempt=${attempt} thread=${threadId} maxRetries=${maxRetries}`);

    try {
      return await this.anthropic.messages.create({
        model: 'claude-3-5-sonnet-20241022',
        max_tokens: 1024,
        messages: [{ role: 'user', content: message }]
      });
    } catch (error) {
      lastError = error;

      if (error.status !== 529) {
        throw error; // Non-retryable error
      }

      console.log(`[retry] received 529 (overloaded) thread=${threadId}`);
      if (attempt < maxRetries) {
        // Exponential backoff, capped at 30 seconds
        const backoffMs = Math.min(1000 * Math.pow(2, attempt), 30000);
        console.log(`[retry] backing off for ${backoffMs}ms thread=${threadId}`);
        await sleep(backoffMs);
      }
    }
  }

  throw lastError; // Every attempt returned 529
}
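
To see what delays the backoff formula in this listing produces, the schedule can be computed on its own. This is a small worked example using the same expression, Math.min(1000 * 2^attempt, 30000); the function names backoffMs and backoffSchedule are hypothetical.

```javascript
// Delay after a failed attempt, using the formula from the listing above.
function backoffMs(attempt, baseMs = 1000, capMs = 30000) {
  return Math.min(baseMs * 2 ** attempt, capMs);
}

// Delays after attempts 1..count, in milliseconds.
function backoffSchedule(count) {
  return Array.from({ length: count }, (_, i) => backoffMs(i + 1));
}

const schedule = backoffSchedule(5);
const totalMs = schedule.reduce((sum, ms) => sum + ms, 0);

console.log(schedule); // [ 2000, 4000, 8000, 16000, 30000 ]
console.log(totalMs); // 60000
```

Even five consecutive delays add up to only one minute, which is well short of the gap seen in the incident logs. That suggests the delay observed there came from a different or differently configured retry layer, so it is worth logging the thread ID in every layer that can retry.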

Step 5: Session State Locking (Advanced)

Prevent session state corruption during long retry cycles:

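// Retry session updates when the store reports a write conflict.
// Assumes sessionStore.set() throws on a conflicting write (for example
// after a version check); without that, this loop provides no locking.
async updateSession(chatId, updater, threadId) {
  const maxAttempts = 3;
  const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const session = await this.sessionStore.get(chatId);
    const updated = updater(session);

    // Prefer the thread_id of the message being handled;
    // fall back to the stored value only when none was passed
    updated.threadId = threadId ?? session.threadId;

    try {
      await this.sessionStore.set(chatId, updated);
      return updated;
    } catch (conflictError) {
      if (attempt === maxAttempts) throw conflictError;
      await sleep(50 * attempt); // Brief backoff
    }
  }
}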
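
Optimistic locking only works if the store can detect that someone else wrote in between. One common way to get that is a version number checked at write time (compare-and-set). The sketch below shows the idea with an in-memory store; VersionedStore and updateWithRetry are hypothetical names, and a real deployment would rely on the equivalent primitive of its own session backend.

```javascript
// In-memory store that tracks a version number per key (illustrative only).
class VersionedStore {
  constructor() {
    this.entries = new Map();
  }

  // Returns a copy of the value plus the version it was read at.
  get(key) {
    const entry = this.entries.get(key);
    return entry
      ? { version: entry.version, value: { ...entry.value } }
      : { version: 0, value: {} };
  }

  // Writes only if nobody has written since `expectedVersion` was read.
  compareAndSet(key, expectedVersion, value) {
    const currentVersion = this.entries.get(key)?.version ?? 0;
    if (currentVersion !== expectedVersion) return false;
    this.entries.set(key, { version: currentVersion + 1, value: { ...value } });
    return true;
  }
}

// Re-read and re-apply the update when a conflicting write is detected.
function updateWithRetry(store, key, updater, maxAttempts = 3) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const { version, value } = store.get(key);
    if (store.compareAndSet(key, version, updater(value))) return true;
  }
  return false;
}

const store = new VersionedStore();
const readA = store.get("session:-1003885638534");
const readB = store.get("session:-1003885638534");

// Both handlers read version 0; only the first write is accepted.
const firstWrite = store.compareAndSet("session:-1003885638534", readA.version, { threadId: 562 });
const secondWrite = store.compareAndSet("session:-1003885638534", readB.version, { threadId: 777 });

console.log({ firstWrite, secondWrite }); // { firstWrite: true, secondWrite: false }
```

The rejected writer has to re-read the session and decide again, which is what updateWithRetry does; a blind set() would have silently replaced thread 562 with 777.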

🧪 Verification

Step 1: Reproduce the 529 Scenario

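Post a forum-topic update to the webhook while the Anthropic call is forced to return 529 (for example through a mock or a proxy that injects the error), so that the retry path runs: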

# Using curl to simulate the Telegram update webhook
curl -X POST http://localhost:3000/webhook/telegram \
  -H "Content-Type: application/json" \
  -d '{
    "update_id": 123456789,
    "message": {
      "message_id": 100,
      "chat": { "id": -1003885638534, "type": "supergroup" },
      "message_thread_id": 562,
      "is_topic_message": true,
      "text": "Test message for 529 retry scenario"
    }
  }'

Step 2: Verify sendMessage Log Output

After applying the fix, confirm the log includes thread:

# Expected log output AFTER fix
2026-03-03T11:24:34.955Z [telegram] sendMessage ok chat=-1003885638534 thread=562 message=13832

# Should NOT see (before fix):
2026-03-03T11:24:34.955Z [telegram] sendMessage ok chat=-1003885638534 message=13832

Step 3: Verify Message Appears in Correct Topic

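# The Bot API has no method for fetching a message by ID, so inspect the
# Message object that sendMessage itself returns (log it, or send a test
# message to the topic directly):
curl "https://api.telegram.org/bot${BOT_TOKEN}/sendMessage" \
  -d chat_id=-1003885638534 \
  -d message_thread_id=562 \
  -d text="thread placement check"

# Expected response includes:
{
  "ok": true,
  "result": {
    "message_id": 13832,  // <-- Will differ for each send
    "chat": { "id": -1003885638534, "type": "supergroup" },
    "message_thread_id": 562,  // <-- Must match original
    "is_topic_message": true,
    "text": "..."
  }
}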

Step 4: Verify Session Contains Thread ID

# Check session store for correct thread_id
# (depends on session store implementation)

# If using Redis:
redis-cli GET "session:-1003885638534"
# Should contain: {"threadId": 562, "..."}

# If using file-based:
cat sessions/-1003885638534.json
# Should contain: {"threadId": 562, "..."}

Step 5: Unit Test for Thread Context Preservation

describe('Telegram forum thread context', () => {
  it('should preserve message_thread_id through 529 retry', async () => {
    const ctx = createMockContext({
      chatId: -1003885638534,
      messageId: 100,
      threadId: 562,
      text: 'Test message'
    });
    
    // Mock Anthropic to return 529 twice, then success
    aiClient.messages.create
      .mockRejectedValueOnce({ status: 529, message: 'overloaded' })
      .mockRejectedValueOnce({ status: 529, message: 'overloaded' })
      .mockResolvedValueOnce({ content: [{ type: 'text', text: 'Response' }] });
    
    await handler.handleUpdate(ctx);
    
    // Verify sendMessage was called with correct thread_id
    expect(telegramAdapter.sendMessage).toHaveBeenCalledWith(
      -1003885638534,
      expect.any(String),
      expect.objectContaining({ message_thread_id: 562 })
    );
  }, 15000); // Retries back off for 2 s + 4 s, longer than Jest's default 5 s timeout
});

Step 6: Integration Test with Telegram Test Environment

# Use Telegram's test environment or a private bot
# Send message in a forum topic, trigger 529 error, verify reply location

# 1. Set BOT_TOKEN to test bot
export BOT_TOKEN="test_bot_token"

# 2. Run openclaw with logging
OPENCLAW_LOG_LEVEL=debug npm start

# 3. Monitor for:
# - sendMessage logs with thread=562
# - Message appears in correct topic
# - No "lost" messages

⚠️ Common Pitfalls

Environment-Specific Traps

Docker container restart clears session state

If OpenClaw runs in Docker and the container restarts during a long retry cycle, session state (including thread_id) is lost. Ensure session store is externalized (Redis) rather than in-memory.

# Docker Compose configuration - externalize session storage
services:
  openclaw:
    image: openclaw:latest
    environment:
      - SESSION_STORE=redis
      - REDIS_URL=redis://redis:6379
  redis:
    image: redis:7-alpine
    volumes:
      - redis-data:/data
volumes:
  redis-data:

macOS file descriptor limits

When using file-based sessions on macOS, the default ulimit can cause session write failures during high load:
```
# Check current limit
ulimit -n
# Increase if below 1024
ulimit -n 65535
```

Windows path separators in session keys

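Session file paths can break on Windows when the session key contains characters that are not valid in file names, such as the colon in a composite chat:topic key. The leading hyphen of a supergroup chat ID is a legal file-name character, and encodeURIComponent leaves it unchanged:

// Use encodeURIComponent for session keys in file paths
const path = require('node:path');
const sessionPath = path.join(
  sessionDir,
  `${encodeURIComponent(String(chatId))}.json`
);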

Configuration Pitfalls

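Forgetting to configure group access in BotFather

Forum topics have no dedicated BotFather switch, but group access and privacy mode still decide which topic messages the bot receives:

# Relevant BotFather commands:
# /setprivacy -> Disable (the bot then receives all group messages)
# /setjoingroups -> Enable
# Creating or editing topics additionally requires the can_manage_topics admin right in the group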

Mismatched session TTL vs retry backoff

If session TTL is shorter than the retry backoff period, thread context expires:

# Example: 5-minute TTL but 5-minute backoff = guaranteed context loss
SESSION_TTL=300000  # 5 minutes in ms
MAX_RETRY_BACKOFF=300000  # Should be less than TTL
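
A startup check can catch this mismatch before it costs a message. The sketch below adds up a worst-case retry window (every attempt times out, every backoff hits its computed value) and compares it with the session TTL. All names and the example numbers are assumptions for illustration, including the per-attempt timeout and the safety factor of 2.

```javascript
// Worst case: every attempt runs to its timeout, with a backoff between attempts.
function worstCaseRetryWindowMs({ maxRetries, baseMs, capMs, perAttemptTimeoutMs }) {
  let total = 0;
  for (let attempt = 1; attempt <= maxRetries; attempt++) {
    total += perAttemptTimeoutMs;
    if (attempt < maxRetries) {
      total += Math.min(baseMs * 2 ** attempt, capMs);
    }
  }
  return total;
}

// The TTL should exceed the worst-case window by a margin (hypothetical factor).
function ttlCoversRetries(sessionTtlMs, retryConfig, safetyFactor = 2) {
  return sessionTtlMs >= worstCaseRetryWindowMs(retryConfig) * safetyFactor;
}

const retryConfig = {
  maxRetries: 3,
  baseMs: 1000,
  capMs: 300000,
  perAttemptTimeoutMs: 60000,
};

// 60000 + 2000 + 60000 + 4000 + 60000 = 186000 ms
console.log(worstCaseRetryWindowMs(retryConfig)); // 186000
console.log(ttlCoversRetries(300000, retryConfig)); // false: 300000 < 372000
console.log(ttlCoversRetries(600000, retryConfig)); // true
```

Failing fast at startup when the check returns false is cheaper than discovering the mismatch from a lost reply.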

Using reply_to_message_id without message_thread_id

Even with reply_to_message_id set correctly, omitting message_thread_id can leave a forum reply outside its topic:

// BROKEN: reply without thread context
{
  chat_id: -1003885638534,
  text: "Reply text",
  reply_to_message_id: 100
  // Missing: message_thread_id: 562
}

// CORRECT: include both
{
  chat_id: -1003885638534,
  text: "Reply text",
  reply_to_message_id: 100,
  message_thread_id: 562
}

Code-Level Pitfalls

Storing thread_id as string vs number

The Telegram API accepts both, but mixing types inside the application causes issues such as failed strict comparisons (562 !== "562") and missed Map lookups:

// Telegram API is flexible but some clients expect integer
const threadId = parseInt(message.message_thread_id, 10);

// Or, alternatively, keep it as a string everywhere
const threadKey = String(message.message_thread_id);
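
A single normalization function, applied where the update is first read, keeps the type consistent everywhere else. This is a minimal sketch; normalizeThreadId is a hypothetical helper, and treating anything that is not a positive integer as "no thread" is an assumption about what the application wants.

```javascript
// Returns a positive integer thread ID, or undefined when there is none.
function normalizeThreadId(raw) {
  if (raw === undefined || raw === null || raw === "") return undefined;
  const id = Number(raw);
  return Number.isInteger(id) && id > 0 ? id : undefined;
}

console.log(normalizeThreadId(562)); // 562
console.log(normalizeThreadId("562")); // 562
console.log(normalizeThreadId(undefined)); // undefined
console.log(normalizeThreadId("abc")); // undefined
```

Unlike a bare parseInt, this never yields NaN, so a missing thread ID cannot slip into a payload or a session key as the string "NaN".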

Overwriting session in concurrent handlers

If multiple messages arrive simultaneously for the same chat, session writes can race:

// PROBLEMATIC: Read-modify-write without atomicity
const session = await getSession(chatId);  // Read
session.threadId = threadId;               // Modify
await saveSession(chatId, session);        // Write - another request may overwrite

// FIXED: Use atomic operations or locking
await updateSessionAtomic(chatId, (s) => {
  s.threadId = threadId;
  return s;
});

Async/await race conditions in retry handlers

The callback/promise chain can lose context when the thread ID lives in shared, mutable state:

// PROBLEMATIC: threadId is shared by every invocation
let threadId;
function handleMessage(ctx) {
  threadId = ctx.message.message_thread_id;
  retry(3, () => ai.call()).then(response => {
    // By the time the retry resolves, a newer message
    // may have overwritten threadId
    sendMessage(ctx.chat.id, response, { message_thread_id: threadId });
  });
}

// FIXED: Capture context in closure
function handleMessage(ctx) {
  const threadId = ctx.message.message_thread_id; // Capture immediately
  retry(3, () => ai.call()).then(response => {
    sendMessage(ctx.chat.id, response, { message_thread_id: threadId });
  });
}

🔗 Related Errors

HTTP 529: The AI service is temporarily overloaded
Anthropic's overload error (distinct from the 429 rate-limit error), which triggers the retry sequence. The 529 error is the initiating event for the bug.
stale-socket
Gateway health monitor restart that occurred 7 minutes after the lost message. Not directly related but indicates underlying connection instability that may exacerbate retry issues.
thread not found (not seen in this case)
Telegram API error when message_thread_id refers to a non-existent topic. The absence of this error confirms Telegram received a valid thread ID (or no thread ID at all).
Session expiration during long-running operations
Related to a GitHub issue in which the session TTL is shorter than the operation duration, causing context loss. The root cause is similar to the thread_id loss.
Webhook delivery failure with missing thread context
Related issue where Telegram webhook updates arrive without message_thread_id for forum messages, causing routing failures.
context.lengthExceeded Anthropic error
When conversation context grows too large during retries, Anthropic returns this error. Can compound thread context issues if error handling loses state.
Race condition in concurrent Telegram updates
When multiple updates arrive simultaneously for the same chat, session state can be overwritten, losing thread_id. Same architectural vulnerability as this bug.
Message sent to wrong chat after bot token refresh
Related session/context loss scenario where bot configuration changes mid-operation cause messages to route incorrectly.