Slack Channel Dual-Session Processing with replyToMode='all' Causes 4x Token Waste
Top-level messages in Slack channels with replyToMode='all' trigger dual LLM processing via both thread-scoped and parent channel sessions, causing context pollution, unbounded cost growth, and cross-sender conversation bleeding.
Symptoms
Primary Manifestations
When a Slack channel is configured with replyToMode: "all" and requireMention: false, every top-level message triggers two independent LLM processing paths:
```
[2026-04-08 14:32:01] Janice → #fox-email: "What's the status on the Fox deployment?"
[2026-04-08 14:32:02] Agent Thread Session (3da7b8e2-...) → Thread reply to Janice's message
[2026-04-08 14:32:02] Agent Parent Session (7f7593bb-...) → Independent LLM call for same message
```
CLI Diagnostic Output
To observe dual-session routing in production:
```bash
# List active sessions showing the parent channel session
cat /path/to/openclaw/data/sessions/sessions.json | jq '.[] | select(.key | contains("c0anbm0sjkf"))'
```

Output shows both session types:

```json
{
  "key": "agent:main:slack:channel:c0anbm0sjkf",
  "sessionId": "7f7593bb-…",
  "lastActivity": 1775710479
}
{
  "key": "agent:main:slack:channel:c0anbm0sjkf:thread:1775660310.007009",
  "sessionId": "3da7b8e2-…",
  "lastActivity": 1775660310
}
```
Token Cost Anomaly
The cost breakdown for the same 10 messages shows a 4.7x multiplier (combined cost of $23.39 against $4.93 for the thread sessions alone):
| Metric | Thread Sessions | Parent Session | Overhead |
|---|---|---|---|
| LLM Turns | 94 | 82 | 1.12x overlap |
| Total Cost | $4.93 | $18.46 | 3.75x waste |
| Message Count | 10 | 10 | 1:1 ratio |
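The 4.7x figure can be reproduced from the table: the thread sessions represent the spend that should have occurred, and the parent session is overhead on top of it. A minimal sketch of that arithmetic, using only the dollar amounts reported above (the helper name is illustrative, not part of OpenClaw):

```javascript
// Cost multiplier = (expected cost + overhead cost) / expected cost.
// The helper name is hypothetical; the figures come from the table above.
function costMultiplier(threadCostUsd, parentCostUsd) {
  return (threadCostUsd + parentCostUsd) / threadCostUsd;
}

const threadCostUsd = 4.93;  // thread sessions (expected spend)
const parentCostUsd = 18.46; // parent session (overhead)

const multiplier = costMultiplier(threadCostUsd, parentCostUsd);
console.log(multiplier.toFixed(2)); // 4.74
```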
Context Pollution Evidence
The parent session 7f7593bb-...-topic-1775660310.007009.jsonl accumulates messages from 4 different senders in a single context:
```
[Janice] "What's the status on the Fox deployment?" → ts: 1775660310.007009
[Renée] "Can you check the email logs?" → ts: 1775679025.746549
[Kevin] "Pipeline failed again, need help" → ts: 1775689714.554029
[Edward] "Thanks for the quick response earlier" → ts: 1775699200.009569
[All above] + agent responses + subsequent turns...
```
Each sender’s conversation threads are merged into one unbounded context.
Root Cause
Architectural Failure Point
The bug resides in extensions/slack/src/routing.ts at the resolveSlackRoutingContext() function (line ~920). The core issue is that canonical thread ID resolution discards the autoThreadId for room-style channels, causing routing to fall back to the parent channel session.
Code Flow Analysis
Step 1: Thread Context Resolution (correct)
```javascript
// resolveSlackThreadContext() → returns correct messageThreadId
function resolveSlackThreadContext(params) {
  const messageTs = params.message.ts ?? params.message.event_ts; // "1775660310.007009"

  return {
    messageThreadId: params.replyToMode === "all" && !isThreadReply
      ? messageTs // → "1775660310.007009" - CORRECT for reply delivery
      : void 0,
    // ...
  };
}
```
Step 2: Canonical Thread ID Resolution (BUG)
```javascript
// resolveSlackRoutingContext() → incorrectly discards autoThreadId
const autoThreadId = !isThreadReply && replyToMode === "all" && threadContext.messageTs
  ? threadContext.messageTs // → "1775660310.007009"
  : void 0;

const canonicalThreadId = isRoomish
  ? (isThreadReply && threadTs ? threadTs : void 0) // ← BUG: always void 0 for top-level
  : (isThreadReply ? threadTs : autoThreadId);      // DM path would use autoThreadId correctly
```

For rooms (isRoomish = true):
- Top-level message: isThreadReply = false, threadTs = undefined → canonicalThreadId = void 0
- Thread reply: isThreadReply = true, threadTs = "parent-ts" → canonicalThreadId = "parent-ts" (correct)
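The two room cases, plus the DM path for contrast, can be checked in isolation. The sketch below wraps the Step 2 expression in a hypothetical helper (`buggyCanonicalThreadId` is not a function in the codebase) and feeds it the inputs described above:

```javascript
// The roomish/DM expression from Step 2, wrapped in a hypothetical helper.
function buggyCanonicalThreadId({ isRoomish, isThreadReply, threadTs, autoThreadId }) {
  return isRoomish
    ? (isThreadReply && threadTs ? threadTs : undefined)
    : (isThreadReply ? threadTs : autoThreadId);
}

const ts = "1775660310.007009";

// Room, top-level message: autoThreadId is available but ignored.
const roomTopLevel = buggyCanonicalThreadId({
  isRoomish: true, isThreadReply: false, threadTs: undefined, autoThreadId: ts,
});

// Room, reply inside an existing thread: resolves to the parent timestamp.
const roomThreadReply = buggyCanonicalThreadId({
  isRoomish: true, isThreadReply: true, threadTs: "parent-ts", autoThreadId: undefined,
});

// DM, top-level message: the same kind of input keeps its autoThreadId.
const dmTopLevel = buggyCanonicalThreadId({
  isRoomish: false, isThreadReply: false, threadTs: undefined, autoThreadId: ts,
});

console.log(roomTopLevel);    // undefined
console.log(roomThreadReply); // parent-ts
console.log(dmTopLevel);      // 1775660310.007009
```

Only the room top-level case loses its thread ID, which is what the next step turns into a parent-session fallback.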
Step 3: Session Key Generation (collapses to parent)
```javascript
// resolveThreadSessionKeys() → falls back to parent session when threadId is empty
function resolveThreadSessionKeys(params) {
  const threadId = (params.threadId ?? "").trim();

  if (!threadId) return {
    sessionKey: params.baseSessionKey, // ← Parent channel session
    parentSessionKey: void 0
  };
  return {
    sessionKey: `${params.baseSessionKey}:thread:${threadId}`, // uses the trimmed thread ID
    parentSessionKey: params.parentSessionKey
  };
}
```
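To see why an empty thread ID merges unrelated conversations, the sketch below runs a simplified copy of the key logic over the four top-level timestamps from the Symptoms section. It is an illustration under the assumption that the base key is the channel key shown earlier; the real function takes more parameters.

```javascript
// Simplified copy of the session key logic shown above (illustrative only).
function resolveThreadSessionKeys({ baseSessionKey, threadId }) {
  const trimmed = (threadId ?? "").trim();
  if (!trimmed) return { sessionKey: baseSessionKey };
  return { sessionKey: `${baseSessionKey}:thread:${trimmed}` };
}

const baseSessionKey = "agent:main:slack:channel:c0anbm0sjkf";
const topLevelTimestamps = [
  "1775660310.007009",
  "1775679025.746549",
  "1775689714.554029",
  "1775699200.009569",
];

// Buggy routing: the thread ID is undefined for every top-level message.
const buggyKeys = new Set(
  topLevelTimestamps.map(
    () => resolveThreadSessionKeys({ baseSessionKey, threadId: undefined }).sessionKey
  )
);

// Isolated routing: each top-level message keeps its own timestamp as thread ID.
const isolatedKeys = new Set(
  topLevelTimestamps.map(
    (ts) => resolveThreadSessionKeys({ baseSessionKey, threadId: ts }).sessionKey
  )
);

console.log(buggyKeys.size);    // 1
console.log(isolatedKeys.size); // 4
```

Four senders collapsing onto one key is the context pollution described in the Symptoms section.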
Dual Processing Trigger
With canonicalThreadId = void 0:
- Thread session key is never generated (sessionKey = parent base)
- Parent channel session is used for routing ("agent:main:slack:channel:c0anbm0sjkf")
- Both routing paths execute because:
  - The reply delivery uses threadContext.messageThreadId (correct: "1775660310.007009")
  - But the session routing uses canonicalThreadId (incorrect: void 0)
Data Flow Diagram
```
Top-level message arrives
        │
        ▼
resolveSlackThreadContext()
  └── messageThreadId = "1775660310.007009"
        │   (used for reply delivery)
        ▼
resolveSlackRoutingContext()
  └── canonicalThreadId = void 0
        │   (BUG: discards autoThreadId for rooms)
        ▼
resolveThreadSessionKeys()
  └── sessionKey = "agent:main:slack:channel:c0anbm0sjkf" (parent session)
        │
        ▼
Session routing → Parent channel session (accumulates all messages)
Reply delivery  → Thread under original message (correct)
        │
        ▼
TWO INDEPENDENT LLM CALLS
```
Step-by-Step Fix
Prerequisites
- OpenClaw v4.2 or later source code
- Access to extensions/slack/src/routing.ts
- Node.js 18+ for build verification
Fix Implementation
File: extensions/slack/src/routing.ts
Function: resolveSlackRoutingContext()
Lines: ~918-935
Before (buggy code):
```javascript
const autoThreadId = !isThreadReply && replyToMode === "all" && threadContext.messageTs
  ? threadContext.messageTs
  : void 0;

const canonicalThreadId = isRoomish
  ? (isThreadReply && threadTs ? threadTs : void 0) // ← BUG: ignores autoThreadId
  : (isThreadReply ? threadTs : autoThreadId);
```
After (fixed code):
```javascript
const autoThreadId = !isThreadReply && replyToMode === "all" && threadContext.messageTs
  ? threadContext.messageTs
  : void 0;

const canonicalThreadId = isRoomish
  ? (isThreadReply ? threadTs : autoThreadId) // ← FIX: use autoThreadId for top-level
  : (isThreadReply ? threadTs : autoThreadId);
```
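Key change: In the roomish branch, drop the && threadTs guard and replace the void 0 fallback with autoThreadId, so that a top-level message (isThreadReply = false) resolves to its own thread ID instead of falling back to the parent session.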
Verification Patch
For additional safety, add defensive checks that normalize a null thread ID to undefined:

```javascript
const canonicalThreadId = isRoomish
  ? (isThreadReply ? (threadTs ?? void 0) : (autoThreadId ?? void 0)) // ← Explicit undefined fallback
  : (isThreadReply ? (threadTs ?? autoThreadId) : (autoThreadId ?? void 0));
```
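The defensive variant is intended to behave like the fixed expression for ordinary inputs and differ only when a thread ID arrives as null. A small sketch comparing the two, with both expressions wrapped in hypothetical helpers and a null threadTs used as an assumed edge case:

```javascript
// Hypothetical wrappers around the fixed and defensive expressions.
function fixedCanonicalThreadId({ isRoomish, isThreadReply, threadTs, autoThreadId }) {
  return isRoomish
    ? (isThreadReply ? threadTs : autoThreadId)
    : (isThreadReply ? threadTs : autoThreadId);
}

function defensiveCanonicalThreadId({ isRoomish, isThreadReply, threadTs, autoThreadId }) {
  return isRoomish
    ? (isThreadReply ? (threadTs ?? undefined) : (autoThreadId ?? undefined))
    : (isThreadReply ? (threadTs ?? autoThreadId) : (autoThreadId ?? undefined));
}

// Room, top-level message: both variants return the auto thread ID.
const topLevel = {
  isRoomish: true, isThreadReply: false, threadTs: undefined, autoThreadId: "1775660310.007009",
};
console.log(fixedCanonicalThreadId(topLevel) === defensiveCanonicalThreadId(topLevel)); // true

// Room, thread reply whose threadTs arrives as null (assumed edge case).
const nullReply = {
  isRoomish: true, isThreadReply: true, threadTs: null, autoThreadId: undefined,
};
console.log(fixedCanonicalThreadId(nullReply));     // null
console.log(defensiveCanonicalThreadId(nullReply)); // undefined
```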
Build and Deploy
```bash
# 1. Navigate to extension directory
cd extensions/slack

# 2. Apply the fix to routing.ts (manual edit or use sed)
sed -i 's/(isThreadReply && threadTs ? threadTs : void 0)/(isThreadReply ? threadTs : autoThreadId)/g' src/routing.ts

# 3. Verify the change (-A 2 also prints the two branch lines that follow)
grep -n -A 2 "canonicalThreadId = isRoomish" src/routing.ts

# 4. Rebuild the extension
npm run build

# 5. Restart OpenClaw service
sudo systemctl restart openclaw
```
Alternative Runtime Configuration Fix
If source modification is not immediately possible, configure replyToMode at the channel level rather than globally:
BEFORE (global replyToMode causes bug):

```json
{
  "channels": {
    "slack": {
      "replyToMode": "all",
      "channels": {
        "C0ANBM0SJKF": {
          "allow": true,
          "requireMention": false
        }
      }
    }
  }
}
```

AFTER (channel-level override can mitigate, but may change behavior):

```json
{
  "channels": {
    "slack": {
      "replyToMode": "thread",
      "channels": {
        "C0ANBM0SJKF": {
          "allow": true,
          "requireMention": false,
          "replyToMode": "thread"
        }
      }
    }
  }
}
```
Note: Channel-level replyToMode: "thread" changes the reply behavior from per-message threads to per-channel single thread. Evaluate if this meets your use case before applying.
Verification
Test Procedure
1. Check session key generation:
```bash
# Restart OpenClaw with a fresh session store
sudo systemctl stop openclaw
rm -f /path/to/openclaw/data/sessions/sessions.json
sudo systemctl start openclaw

# Send a test message from User A in the configured channel
# Observe: the agent responds in a thread (not in the channel)

# Check session keys after the test
cat /path/to/openclaw/data/sessions/sessions.json | jq '.[] | select(.key | contains("c0anbm0sjkf"))'
```
Expected output after fix:
```json
{
  "key": "agent:main:slack:channel:c0anbm0sjkf:thread:1775660310.007009",
  "sessionId": "3da7b8e2-…",
  "lastActivity": 1775660310
}
```
Key indicator: There should be NO entry for "agent:main:slack:channel:c0anbm0sjkf" (parent session) in the sessions list when only top-level messages exist.
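The same check can be scripted. The sketch below assumes each session entry has the `key` field shown in the jq output; the helper name is hypothetical, and the entries are passed in as an already-parsed array.

```javascript
// Returns entries whose key is exactly the parent channel session key,
// i.e. a key ending in ":channel:<id>" with no ":thread:<ts>" suffix.
// The entry shape mirrors the jq output above; the helper is hypothetical.
function findParentChannelSessions(entries, channelId) {
  const parentKeySuffix = `:channel:${channelId.toLowerCase()}`;
  return entries.filter((entry) => entry.key.endsWith(parentKeySuffix));
}

const afterFix = [
  { key: "agent:main:slack:channel:c0anbm0sjkf:thread:1775660310.007009" },
];
const beforeFix = [
  { key: "agent:main:slack:channel:c0anbm0sjkf" },
  { key: "agent:main:slack:channel:c0anbm0sjkf:thread:1775660310.007009" },
];

console.log(findParentChannelSessions(afterFix, "C0ANBM0SJKF").length);  // 0
console.log(findParentChannelSessions(beforeFix, "C0ANBM0SJKF").length); // 1
```

In practice the entries would come from parsing sessions.json. If the store turns out to be an object keyed by session key rather than an array, its values would need to be extracted first.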
2. Verify no dual LLM calls:
```bash
# Send 5 top-level messages from different users over 10 minutes
# Monitor LLM call logs (-c keeps each record on one line so sort/uniq work per record)
grep "LLM call completed" /path/to/openclaw/logs/openclaw.log |
  jq -c '{timestamp, model, inputTokens, outputTokens}' |
  sort | uniq -c | sort -rn
```
Expected output after fix: Token counts should match thread session totals (~$4.93 for 10 messages), not the inflated parent session total ($18.46).
3. Confirm context isolation:
```bash
# Check that each thread session contains only its own messages
for session in /path/to/openclaw/data/sessions/*c0anbm0sjkf*.jsonl; do
  sender_count=$(grep -o '"sender":"[^"]*"' "$session" | sort -u | wc -l)
  message_count=$(wc -l < "$session")
  echo "$(basename "$session"): $message_count lines, $sender_count unique senders"
done
```
Expected output after fix:
```
7f7593bb-…-topic-1775660310.007009.jsonl: 18 lines, 1 unique senders
805dc0f3-…-topic-1775679025.746549.jsonl: 12 lines, 1 unique senders
df14ba11-…-topic-1775689714.554029.jsonl: 25 lines, 1 unique senders
```
Each session should show exactly 1 unique sender (context isolation).
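For transcripts where a regex over raw text is too fragile, the sender count can be derived by parsing each line. This is a sketch that assumes every line is a JSON object with an optional string field named `sender`, as implied by the grep pattern; the function name is illustrative.

```javascript
// Counts distinct senders in the text of one JSONL session transcript.
// Assumes each line is a JSON object with an optional "sender" string field;
// lines that fail to parse are skipped rather than aborting the check.
function countUniqueSenders(jsonlText) {
  const senders = new Set();
  for (const line of jsonlText.split("\n")) {
    if (!line.trim()) continue;
    try {
      const record = JSON.parse(line);
      if (typeof record.sender === "string") senders.add(record.sender);
    } catch {
      // Malformed line: ignore it.
    }
  }
  return senders.size;
}

const pollutedTranscript = [
  '{"sender":"Janice","text":"What\'s the status on the Fox deployment?"}',
  '{"sender":"Kevin","text":"Pipeline failed again, need help"}',
  '{"sender":"Janice","text":"Any update?"}',
].join("\n");

const isolatedTranscript = [
  '{"sender":"Janice","text":"What\'s the status on the Fox deployment?"}',
  '{"sender":"Janice","text":"Any update?"}',
].join("\n");

console.log(countUniqueSenders(pollutedTranscript)); // 2
console.log(countUniqueSenders(isolatedTranscript)); // 1
```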
4. End-to-end functionality test:
```bash
# Test thread reply delivery
# Have User A reply to an existing thread (not top-level)
# Agent should respond in that existing thread, not create a new one

# Verify thread_ts is preserved
grep "1775660310.007009" /path/to/openclaw/data/sessions/thread.jsonl |
  jq '.thread_ts // .parent_ts // .incomingThreadTs' | head -5
```
Expected: Replies within existing threads maintain thread_ts and do not trigger new session creation.
Exit Criteria
| Metric | Before Fix | After Fix |
|---|---|---|
| Sessions per top-level message | 2 (parent + thread) | 1 (thread only) |
| Unique senders in parent session | 4+ (all channel users) | 0 (no parent session created) |
| Token cost per 10 messages | ~$23.39 | ~$4.93 |
| Cost multiplier | 4.7x | 1.0x |
Common Pitfalls
Environment-Specific Traps
1. Docker container rebuild without source update
If running OpenClaw via Docker and applying the source fix, ensure the container is rebuilt:
```bash
# INCORRECT: Just restarting the container
docker restart openclaw

# CORRECT: Rebuild with source changes
docker build -t openclaw:fixed .
docker rm -f openclaw  # remove the old container so the name can be reused
docker run -d --name openclaw -v "$(pwd)/data:/app/data" openclaw:fixed
```
2. Session store persistence
The sessions.json file may contain stale entries from before the fix. Clear the session store after deploying the fix:
```bash
rm /path/to/openclaw/data/sessions/*.jsonl
rm /path/to/openclaw/data/sessions/sessions.json
sudo systemctl restart openclaw
```
3. Multi-instance deployments
In horizontally scaled deployments, each instance may maintain separate session stores. Ensure the fix is deployed to all instances and session stores are cleared cluster-wide:
```bash
# For Kubernetes
kubectl delete pod -l app=openclaw
# Sessions are typically stored in a PVC and may need manual cleanup

# For Docker Swarm
docker service update --force openclaw
```
Configuration Edge Cases
1. Mixed replyToMode configurations
If some channels use replyToMode: "all" and others use replyToMode: "thread", verify the routing logic correctly handles both:
```javascript
// The fix correctly handles both modes:
const canonicalThreadId = isRoomish
  ? (isThreadReply ? threadTs : autoThreadId) // autoThreadId only set when replyToMode="all"
  : (isThreadReply ? threadTs : autoThreadId);
```
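A quick way to confirm that channels on another mode are unaffected is to evaluate the fixed expression for a top-level room message under each mode. The helper below is hypothetical and inlines the autoThreadId condition quoted earlier in this document:

```javascript
// Hypothetical helper: fixed canonicalThreadId for a top-level message
// in a room (isRoomish = true), parameterized by replyToMode.
function canonicalThreadIdForTopLevel(replyToMode, messageTs) {
  const isThreadReply = false;
  const threadTs = undefined;
  const autoThreadId = !isThreadReply && replyToMode === "all" && messageTs
    ? messageTs
    : undefined;
  return isThreadReply ? threadTs : autoThreadId;
}

console.log(canonicalThreadIdForTopLevel("all", "1775660310.007009"));    // 1775660310.007009
console.log(canonicalThreadIdForTopLevel("thread", "1775660310.007009")); // undefined
```

Under "thread" the result is still undefined, so session routing for those channels resolves exactly as it did before the fix.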
2. Channels with requireMention: true
When requireMention: true is set, the parent session routing may still occur for non-mentioned messages (which are ignored). This is correct behavior: the bug only manifests for messages that trigger LLM processing.
3. DM and Group DM paths
The isRoomish = false path already used autoThreadId correctly. The fix does not affect DM behavior, but verify DMs still work as expected after applying the change.
Behavioral Changes to Communicate
After applying the fix, users may notice:
| Change | Before | After | User Impact |
|---|---|---|---|
| Context window | All messages in parent session | Per-thread isolated | Improved privacy, lower costs |
| Thread naming | Parent session | Thread-scoped session | Different session IDs in admin UI |
| Cost | 4.7x multiplier | 1.0x multiplier | Significant cost reduction |
Notify stakeholders if budget tracking depends on per-session metrics, as session identifiers will change after the fix.
Related Errors
Related Routing Issues
- ERR_SLACK_SESSION_EXPLOSION: Session count grows exponentially when multiple channels use `replyToMode: "all"`. Each top-level message creates N sessions (parent + N thread variants) in high-traffic channels.
- ERR_SLACK_CONTEXT_OVERFLOW: Parent channel session context hits token limits after ~200-500 messages in active channels, causing LLM failures or truncated responses.
- ERR_LLM_TIMEOUT_DUAL: Dual LLM calls for the same message can cause timeout issues when one call is slow and the second completes first, resulting in race conditions in reply ordering.
- ERR_THREAD_PARENT_MISMATCH: Thread replies occasionally land in the wrong threads when the parent session context pollutes the thread identification logic.
Historical Context
- Issue #442: "Slack DM with replyToMode='all' causes session key collision". Similar root cause in the DM path; fixed in v4.1.3, but the roomish path was not addressed.
- Issue #387: "Token usage 10x higher than expected on high-traffic Slack channels". Early symptom report, misdiagnosed as model hallucination.
- Issue #512: "Context pollution between Slack threads". User report of cross-thread conversation bleeding; root cause identified in this issue.
Related Configuration Parameters
| Parameter | Default | Affects |
|---|---|---|
| replyToMode | "thread" | Thread vs all vs parent routing |
| requireMention | true | Whether mention is required to trigger processing |
| threadInheritParent | false | Whether thread sessions inherit parent context |
| maxSessionContext | 4096 tokens | Session context window size |
Recommended Monitoring Alerts
```yaml
# Prometheus alert rules for session anomalies
- alert: SlackDualSessionDetected
  expr: |
    sum by (channel_id) (
      rate(openclaw_slack_llm_calls_total{type="thread"}[5m])
    )
    /
    sum by (channel_id) (
      rate(openclaw_slack_messages_processed_total[5m])
    ) > 1.5
  annotations:
    summary: "Dual LLM calls detected for Slack messages"
    description: "Channel {{ $labels.channel_id }} is triggering >1.5 LLM calls per message"

- alert: SlackParentSessionGrowth
  expr: |
    openclaw_slack_session_messages{session_type="parent"}
    /
    openclaw_slack_session_messages{session_type="thread"} > 10
  annotations:
    summary: "Parent session accumulating excessive messages"
    description: "Parent session has 10x more messages than thread sessions in {{ $labels.channel_id }}"
```