Message Tool Calls Batched Until Turn End Instead of Delivered Immediately
All `message` tool invocations during a single agent turn are queued and sent only at turn completion, breaking real-time communication patterns for progress updates.
🔍 Symptoms
Observable Behavior
When an agent invokes the message tool multiple times within a single turn, all messages arrive simultaneously at turn completion rather than at their respective invocation points.
CLI Demonstration of Current Behavior
```
# Scenario: an agent makes 3 message tool calls during a long-running task
# User experience: silence for the entire duration, then all messages arrive at once
# Timeline of events (as seen by the channel/API consumer):
[T+0s]  Turn started          - no visible output
[T+30s] Tool calls 1-5 run    - no visible output
[T+60s] Tool calls 6-10 run   - no visible output
[T+90s] Turn completed        - ALL THREE MESSAGES delivered simultaneously
```
Channel Output (received at T+90s):

```
┌────────────────────────────────────────────────┐
│ [90s] Received, starting analysis...           │
│ [90s] Data mining complete, generating report  │
│ [90s] Report complete, key conclusions...      │
└────────────────────────────────────────────────┘
```
Expected Output:

```
┌────────────────────────────────────────────────┐
│ [0s]  Received, starting analysis...           │
│ [60s] Data mining complete, generating report  │
│ [90s] Report complete, key conclusions...      │
└────────────────────────────────────────────────┘
```
Channel-Specific Manifestations
| Channel | Symptom |
|---|---|
| Telegram | Bot appears unresponsive; user receives all messages in rapid succession |
| Slack | Ephemeral messages not shown until turn end; final batch delivered |
| Webhook | API receives array of 15+ events at turn completion instead of streaming |
| WebSocket | No intermediate frames sent; single final frame with all content |
Debugging Indicator
When tracing is enabled, the message tool output shows batching behavior:
```
# With TRACE_LEVEL=debug, observe the turn lifecycle
[TRACE] Turn 42 started
[TRACE] Tool call: message (queued for turn-end delivery) - "Received, starting analysis..."
[TRACE] Tool call: database.query (executing)
[TRACE] Tool call: message (queued for turn-end delivery) - "Data mining complete, generating report"
[TRACE] Tool call: file.write (executing)
[TRACE] Tool call: message (queued for turn-end delivery) - "Report complete, key conclusions..."
[TRACE] Turn 42 completed - flushing 3 queued messages
[DEBUG] Delivering batch: [msg_1, msg_2, msg_3]
```
Contrast with Working Scenarios
Messages do arrive immediately when:
- A turn contains only a single `message` tool call with no other tools
- The agent completes a turn (all non-message tools), then starts a new turn with a message
- The message is sent via `session.reply()` instead of the `message` tool
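As an interim workaround, hosts can route progress text through the immediate `session.reply()` path rather than the `message` tool. A minimal sketch, using simplified stand-ins for the `Channel` and `Session` interfaces (the real signatures live in `packages/api/src/session.ts`):

```typescript
// Workaround sketch: use the immediate session.reply() path for progress.
// `Channel` and `Session` are simplified stand-ins for the real interfaces.
interface Channel { send(content: string): Promise<void>; }

class Session {
  constructor(private channel: Channel) {}
  // Mirrors the immediate-delivery path: direct channel send, no turn buffer
  async reply(content: string): Promise<void> {
    await this.channel.send(content);
  }
}

// Demo: replies reach the channel as they are made, not at turn end
const delivered: string[] = [];
const session = new Session({ send: async (c) => { delivered.push(c); } });

async function longRunningTurn(): Promise<void> {
  await session.reply("Received, starting analysis...");
  // ... run other (non-message) tools here ...
  await session.reply("Data ready, generating report");
}
```

This is only a stopgap: it works for host-side code, but an agent that can only emit tool calls still cannot reach the `session.reply()` path on its own.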
🔧 Root Cause
Architectural Analysis
The immediate-delivery failure stems from OpenClaw's turn-scoped result aggregation model. The system is architected to collect all tool results, including message tool outputs, within a turn boundary before delivering them in a single batch.
Code Flow Breakdown
```
TURN PROCESSING PIPELINE

1. TURN_START
   ├─> Initialize turn context
   └─> Create empty result buffer

2. TOOL_EXECUTION_LOOP
   └─> For each tool call (repeat until no more tool calls):
        ├─> Execute tool
        ├─> If tool == "message":
        │    └─> buffer.append(message_result)   ← QUEUED, NOT SENT
        └─> buffer.append(tool_result)

3. TURN_END
   ├─> flush_result_buffer()       ← ALL MESSAGES SENT HERE
   └─> deliver_to_channel(batch)
```
Key Source Files Involved
| File | Role |
|---|---|
| `packages/core/src/turn/turn-executor.ts` | Orchestrates tool execution loop; buffers all results |
| `packages/tools/message/src/message-tool.ts` | Message tool implementation; outputs to result buffer |
| `packages/channel-core/src/turn-context.ts` | Manages turn-scoped state and result collection |
| `packages/api/src/session.ts` | `session.reply()` path (immediate delivery) vs. tool path |
Semantic Mismatch
The message tool is semantically a fire-and-forget user notification, yet the implementation treats it identically to other tools that return structured data:
```typescript
// Current implementation (problematic)
class MessageTool {
  async execute(params: MessageParams, context: TurnContext): Promise<ToolResult> {
    // Treats message like a data-returning tool:
    // the result gets queued in context.results[] until turn end
    return {
      output: `Message queued: ${params.content}`,
      // No immediate channel delivery
    };
  }
}

// Semantic intent:
//   message tool = "Send this to the user NOW"
//   other tools  = "Return this result for agent consideration"
```
Comparison with session.reply()
The `session.reply()` method delivers immediately because it bypasses the result buffer:

```typescript
// session.reply() - immediate delivery path
class Session {
  async reply(content: string): Promise<void> {
    await this.channel.send(content); // ← direct channel send
  }
}

// message tool - deferred delivery path
class MessageTool {
  async execute(params: MessageParams, context: TurnContext): Promise<void> {
    context.results.push({ output: params.content }); // ← buffered
    // Delivered only when the turn completes
  }
}
```
Why This Design Exists
The batching model serves valid use cases:
- Reduces API calls to channels (one batch vs many individual sends)
- Ensures message ordering relative to tool results
- Simplifies channel implementations (single response per turn)
However, it conflicts with the semantic intent of a “send message to user” tool, which implies immediacy.
🛠️ Step-by-Step Fix
Recommended Solution: Option A + Option C Hybrid
Implement immediate delivery by default for the message tool while providing an immediate: false flag for cases requiring batched delivery.
Phase 1: Modify Message Tool Schema
File: packages/tools/message/src/schema.ts
```typescript
// BEFORE
export const messageToolSchema = {
  name: "message",
  description: "Send a message to the user",
  parameters: {
    type: "object",
    properties: {
      content: {
        type: "string",
        description: "The message content to send to the user"
      }
    },
    required: ["content"]
  }
};

// AFTER
export const messageToolSchema = {
  name: "message",
  description:
    "Send a message to the user. Messages are delivered immediately unless batch mode is requested.",
  parameters: {
    type: "object",
    properties: {
      content: {
        type: "string",
        description: "The message content to send to the user"
      },
      immediate: {
        type: "boolean",
        description:
          "If true, deliver immediately. If false, queue until turn end. Defaults to true.",
        default: true
      }
    },
    required: ["content"]
  }
};
```
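Under the updated schema, an agent only needs the flag when opting out of immediate delivery. An illustrative sketch of the two argument shapes and of the default resolution (the `MessageArgs` name is invented for this example):

```typescript
// Illustrative tool-call argument shapes under the updated schema.
// `MessageArgs` is invented here to mirror the schema's parameters.
interface MessageArgs { content: string; immediate?: boolean; }

// Omitting `immediate` means immediate delivery (the new default)
const progressUpdate: MessageArgs = { content: "Received, starting analysis..." };

// Opting out explicitly queues the message until turn end
const auditNote: MessageArgs = { content: "Query executed at 10:30", immediate: false };

// Mirrors the tool's default resolution: only `immediate: false` batches
const mode = (a: MessageArgs) => (a.immediate !== false ? "immediate" : "batched");
```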
Phase 2: Update Message Tool Implementation
File: packages/tools/message/src/message-tool.ts
```typescript
import { Tool, ToolResult, TurnContext } from "@openclaw/core";
import { channelRegistry } from "@openclaw/channel-core";
import { messageToolSchema } from "./schema";

interface MessageParams {
  content: string;
  immediate?: boolean;
}

export class MessageTool implements Tool {
  name = "message";
  description = messageToolSchema.description;
  parameters = messageToolSchema.parameters;

  async execute(params: MessageParams, context: TurnContext): Promise<ToolResult> {
    const content = params.content;
    const shouldDeliverImmediately = params.immediate !== false; // default: true

    if (shouldDeliverImmediately) {
      // IMMEDIATE DELIVERY PATH
      return this.deliverImmediately(content, context);
    }
    // BATCHED DELIVERY PATH (original behavior)
    return this.queueForTurnEnd(content, context);
  }

  private async deliverImmediately(
    content: string,
    context: TurnContext
  ): Promise<ToolResult> {
    try {
      // Prefer the turn's channel; fall back to a registry lookup
      const channel =
        context.channel ?? channelRegistry.getChannel(context.session.channelType);

      // Send directly to the channel, outside the turn buffer
      await channel.send({
        sessionId: context.session.id,
        content,
        metadata: {
          toolName: "message",
          deliveredAt: Date.now(),
          deliveryMode: "immediate",
        },
      });

      return {
        success: true,
        output: `Message delivered immediately: ${content.substring(0, 50)}...`,
        metadata: { deliveredAt: Date.now(), deliveryMode: "immediate" },
      };
    } catch (error) {
      // Fall back to the batch queue so the message is not lost
      const queued = await this.queueForTurnEnd(content, context);
      return {
        ...queued,
        error: `Immediate delivery failed, queued instead: ${(error as Error).message}`,
        metadata: { ...queued.metadata, fellBackToBatch: true },
      };
    }
  }

  private async queueForTurnEnd(
    content: string,
    context: TurnContext
  ): Promise<ToolResult> {
    // Original behavior: add to the turn buffer
    context.results.push({
      type: "message",
      content,
      metadata: { deliveryMode: "batched", queuedAt: Date.now() },
    });

    return {
      success: true,
      output: `Message queued for turn-end delivery: ${content.substring(0, 50)}...`,
      metadata: { deliveryMode: "batched" },
    };
  }
}
```
Phase 3: Register Channel Send Capability
File: packages/channel-core/src/channel-registry.ts
```typescript
// Ensure channels implement an immediate send capability
export interface ChannelAdapter {
  // Existing methods...
  sendBatch(results: TurnResult[]): Promise<void>;

  // NEW: immediate single-message send
  send(params: {
    sessionId: string;
    content: string;
    metadata?: Record<string, unknown>;
  }): Promise<void>;
}
```
Phase 4: Update Turn Executor (Minimal Change)
File: packages/core/src/turn/turn-executor.ts
```typescript
// Filter already-delivered messages out of the turn-end batch
async function flushResults(context: TurnContext): Promise<void> {
  // Exclude messages that were delivered immediately
  const batchableResults = context.results.filter(
    (result) => result.metadata?.deliveryMode !== "immediate"
  );

  if (batchableResults.length > 0) {
    await context.channel.sendBatch(batchableResults);
  }

  // Log a summary of immediate deliveries
  const immediateCount = context.results.filter(
    (r) => r.metadata?.deliveryMode === "immediate"
  ).length;
  if (immediateCount > 0) {
    context.logger.debug(`Delivered ${immediateCount} messages immediately`);
  }
}
```
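The filter's behavior can be checked against a small sample buffer. A sketch assuming the same `metadata.deliveryMode` convention:

```typescript
// Worked example of the batch filter: immediate results are excluded.
interface Result { type: string; metadata?: { deliveryMode?: string }; }

const results: Result[] = [
  { type: "message", metadata: { deliveryMode: "immediate" } }, // already sent
  { type: "tool", metadata: {} },                               // normal tool result
  { type: "message", metadata: { deliveryMode: "batched" } },   // queued message
];

// Same predicate as flushResults(): drop anything already delivered
const batchable = results.filter((r) => r.metadata?.deliveryMode !== "immediate");
// batchable keeps the tool result and the batched message only
```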
Phase 5: Configuration Option
File: packages/core/src/config/tool-config.ts
```typescript
export interface ToolConfig {
  message: {
    // Default delivery mode for the message tool
    defaultDeliveryMode: "immediate" | "batched";
    // Fall back to batching if the channel doesn't support immediate delivery
    fallbackToBatchOnError: boolean;
  };
}

export const defaultToolConfig: ToolConfig = {
  message: {
    defaultDeliveryMode: "immediate", // changed from "batched"
    fallbackToBatchOnError: true,
  },
};
```
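When both the config default and the per-call `immediate` flag are in play, the per-call flag should win. A sketch of that resolution logic (the helper name is illustrative, not an existing OpenClaw function):

```typescript
type DeliveryMode = "immediate" | "batched";

// Hypothetical resolution helper: the per-call flag, when present,
// overrides the configured default.
function resolveDeliveryMode(
  immediateParam: boolean | undefined,
  configDefault: DeliveryMode
): DeliveryMode {
  if (immediateParam === true) return "immediate";
  if (immediateParam === false) return "batched";
  return configDefault; // flag omitted: fall back to config
}
```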
Verification of Changes
After implementation, the execution flow becomes:
```
UPDATED PIPELINE (with fix)

1. TURN_START
   └─> Initialize turn context

2. TOOL_EXECUTION_LOOP
   ├─> Tool call: message ("Received, starting analysis...")
   │    ├─> channel.send()                      ← IMMEDIATE DELIVERY
   │    └─> return { deliveredAt, deliveryMode: "immediate" }
   ├─> Tool call: database.query
   │    └─> context.results.push(result)        ← normal buffering
   ├─> Tool call: message ("Data mining complete...")
   │    └─> channel.send()                      ← IMMEDIATE DELIVERY
   └─> Tool call: file.write
        └─> context.results.push(result)

3. TURN_END
   ├─> flushResults() - only non-immediate results
   └─> channel.sendBatch([query_result, write_result])
```
🧪 Verification
Test Case 1: Immediate Delivery Verification
Purpose: Confirm messages arrive at invocation time, not turn end.
```bash
#!/bin/bash
# Test script: verify message timing

START_TIME=$(date +%s.%N)

# Invoke agent with timed message tool calls
curl -X POST http://localhost:3000/api/sessions/test-001/invoke \
  -H "Content-Type: application/json" \
  -d '{
    "message": "Perform 3 searches and send progress after each"
  }'

# Capture message delivery times from channel logs
# Expected: 3 separate delivery timestamps
# Actual (before fix): single timestamp at turn end
echo "Checking message delivery timestamps..."
grep "Message delivered" /var/log/openclaw/channel.log | \
  awk '{print $1, $2, $8}' | \
  sort -u
```
Expected Output (after fix):

```
2024-01-15 10:30:00.123 deliveredAt=1705315800123
2024-01-15 10:30:35.456 deliveredAt=1705315835456
2024-01-15 10:31:05.789 deliveredAt=1705315865789
2024-01-15 10:31:35.000 TURN_END
```

Failure Indicator (before fix):

```
2024-01-15 10:31:35.000 deliveredAt=1705315895000  ← all three share
2024-01-15 10:31:35.000 deliveredAt=1705315895000  ← the same timestamp
2024-01-15 10:31:35.000 deliveredAt=1705315895000  ← at turn end
```
Test Case 2: Mixed Delivery Modes
Purpose: Verify that `immediate: false` still queues messages correctly.
```bash
# Agent prompt demonstrating mixed modes:
#   - immediate delivery for progress: "Starting task..."
#   - batched for the audit trail: "Query executed at X"
# Verify batched messages don't appear until turn end,
# while immediate messages do.

# Step 1: start monitoring
tail -f /var/log/openclaw/channel.log | grep -E "(delivered|queued)" &

# Step 2: invoke a turn that uses both modes
curl -X POST http://localhost:3000/api/sessions/test-002/invoke \
  -d '{"message": "process with both message modes"}'

# Step 3: verify output
#   - immediate messages should be logged during execution
#   - batched messages should appear only at the TURN_END marker
```
Test Case 3: Channel Compatibility Fallback
Purpose: Verify graceful fallback when channel lacks immediate send capability.
If `channel.send()` throws "Method not implemented", the message should fall back to the batch queue. Test with a mock channel that omits `send()`:

```typescript
// Mock channel that intentionally lacks the immediate send() method
const mockChannel = {
  sendBatch: async (results: TurnResult[]) => { /* existing behavior */ },
  // send() intentionally omitted
};
```

Invoke the message tool against this channel. Expected: the call succeeds via the fallback and is logged as batched:

```bash
grep "fellBackToBatch" /var/log/openclaw/tools.log
# Should show that the message tool fell back to batch mode
```
Integration Test Suite
```typescript
// packages/tools/message/src/__tests__/message-delivery.test.ts
describe("Message Tool Delivery Modes", () => {
  let mockContext: TurnContext;
  let mockChannel: jest.Mocked<ChannelAdapter>;

  beforeEach(() => {
    mockChannel = {
      send: jest.fn().mockResolvedValue(undefined),
      sendBatch: jest.fn().mockResolvedValue(undefined),
      // ... other methods
    } as unknown as jest.Mocked<ChannelAdapter>;

    mockContext = createMockContext({
      channel: mockChannel,
      session: { id: "test-session", channelType: "telegram" },
    });
  });

  test("delivers immediately by default", async () => {
    const tool = new MessageTool();
    await tool.execute({ content: "Immediate message" }, mockContext);

    expect(mockChannel.send).toHaveBeenCalledTimes(1);
    expect(mockChannel.send).toHaveBeenCalledWith(
      expect.objectContaining({
        content: "Immediate message",
        metadata: expect.objectContaining({ deliveryMode: "immediate" }),
      })
    );
    expect(mockChannel.sendBatch).not.toHaveBeenCalled();
  });

  test("queues when immediate: false", async () => {
    const tool = new MessageTool();
    await tool.execute({ content: "Batched message", immediate: false }, mockContext);

    expect(mockChannel.send).not.toHaveBeenCalled();
    expect(mockContext.results).toContainEqual(
      expect.objectContaining({
        type: "message",
        content: "Batched message",
        metadata: expect.objectContaining({ deliveryMode: "batched" }),
      })
    );
  });

  test("falls back to batch when channel.send() unavailable", async () => {
    (mockChannel as any).send = undefined; // simulate unsupported channel

    const tool = new MessageTool();
    const result = await tool.execute({ content: "Test" }, mockContext);

    expect(result.metadata.fellBackToBatch).toBe(true);
    expect(mockContext.results).toContainEqual(
      expect.objectContaining({
        type: "message",
        metadata: expect.objectContaining({ deliveryMode: "batched" }),
      })
    );
  });
});
```
Manual Verification Checklist
- Trace logs show immediate delivery: `grep "deliverImmediately\|Message delivered" logs/trace.log`
- Turn-end batch excludes immediate messages: `grep "sendBatch" logs/trace.log | jq '.messages | length'` should equal total tools minus message tools
- Timing separation visible: message delivery timestamps differ from the turn-end timestamp
- Config change respected: setting `defaultDeliveryMode: "batched"` reverts to the old behavior
⚠️ Common Pitfalls
Pitfall 1: Channel Rate Limiting
Problem: Rapid immediate sends may trigger channel rate limits (e.g., Telegram has ~30 messages/second limit).
Mitigation:
```typescript
// Throttle immediate deliveries to stay under channel rate limits
class ThrottledChannelAdapter implements ChannelAdapter {
  private sendQueue: Promise<void> = Promise.resolve();
  private minIntervalMs = 100; // max 10 messages/second
  private lastSendAt = 0;

  constructor(private channel: ChannelAdapter) {}

  sendBatch(results: TurnResult[]): Promise<void> {
    return this.channel.sendBatch(results);
  }

  async send(params: SendParams): Promise<void> {
    // Chain sends so they execute one at a time, spaced by minIntervalMs
    this.sendQueue = this.sendQueue.then(async () => {
      await this.throttle();
      await this.channel.send(params);
    });
    await this.sendQueue;
  }

  private async throttle(): Promise<void> {
    // Sleep until minIntervalMs has elapsed since the previous send
    const wait = this.lastSendAt + this.minIntervalMs - Date.now();
    if (wait > 0) await new Promise((resolve) => setTimeout(resolve, wait));
    this.lastSendAt = Date.now();
  }
}
```
Pitfall 2: Message Ordering Violations
Problem: Immediate messages may arrive before earlier batched messages, breaking chronological order.
Scenario:

```
Tool sequence:
1. message "Step 1" (immediate)  → arrives at T+5s
2. database.query (batched)      → queued
3. message "Step 2" (immediate)  → arrives at T+10s
4. Turn end                      → batched results arrive at T+15s

User sees:
[T+5s]  Step 1
[T+10s] Step 2
[T+15s] Query result (should it have appeared before Step 2?)
```
Mitigation: Document ordering expectations; agents should use consistent delivery modes for related messages.
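If stricter ordering is ever needed, one option is to stamp every result with a per-turn sequence number so consumers can re-order deliveries chronologically. A sketch; the `seq` field is an assumption, not an existing OpenClaw field:

```typescript
// Sketch: stamp every result with a per-turn sequence number so clients
// can restore invocation order regardless of arrival time.
// The `seq` field is an assumption, not an existing OpenClaw field.
interface Stamped { seq: number; content: string; deliveryMode: "immediate" | "batched"; }

class TurnSequencer {
  private next = 0;
  stamp(content: string, deliveryMode: "immediate" | "batched"): Stamped {
    return { seq: this.next++, content, deliveryMode };
  }
}

const sequencer = new TurnSequencer();
const step1 = sequencer.stamp("Step 1", "immediate");     // seq 0
const query = sequencer.stamp("Query result", "batched"); // seq 1
const step2 = sequencer.stamp("Step 2", "immediate");     // seq 2

// Arrival order: immediate messages first, batched result at turn end
const arrived = [step1, step2, query];
const chronological = [...arrived].sort((a, b) => a.seq - b.seq).map((e) => e.content);
// chronological: ["Step 1", "Query result", "Step 2"]
```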
Pitfall 3: Session State Synchronization
Problem: Immediate messages may reference data that hasn’t been committed to session state yet.
Example:

```
Agent flow that causes the inconsistency:
1. message "Starting query for user ${session.userId}"  // immediate
2. session.set("userId", "123")                         // queued
3. Turn end → state committed

User sees a message with an undefined userId (race condition).
```
Mitigation: Ensure session state updates are synchronous; defer state writes until after immediate messages are safe.
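One defensive pattern is to commit pending session-state writes before composing an immediate message. A sketch with a hypothetical `flushPendingWrites` helper (not an existing OpenClaw API):

```typescript
// Hypothetical guard: commit pending session-state writes before an
// immediate send, so the message never reads stale or uncommitted state.
interface SessionState {
  pending: Map<string, unknown>;
  committed: Map<string, unknown>;
}

function flushPendingWrites(state: SessionState): void {
  for (const [key, value] of state.pending) state.committed.set(key, value);
  state.pending.clear();
}

const state: SessionState = {
  pending: new Map<string, unknown>([["userId", "123"]]),
  committed: new Map<string, unknown>(),
};

flushPendingWrites(state); // commit before composing the message
const msg = `Starting query for user ${state.committed.get("userId")}`;
```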
Pitfall 4: Channel Adapter Compatibility Matrix
Risk: Not all channels support immediate send; some only support batch responses.
| Channel | Immediate Send Support | Notes |
|---|---|---|
| Telegram | ✅ Full | Supports rapid sends with throttling |
| Slack | ⚠️ Limited | Webhooks are fire-and-forget; RTM has rate limits |
| Discord | ✅ Full | Bot messages can be sent immediately |
| WebSocket | ✅ Full | Stream directly to client |
| Webhook | ✅ Full | POST to callback URL |
| Console | ✅ Full | Direct stdout |
| Teams | ⚠️ Limited | Requires proactive messaging mode |
Action: Check `ChannelAdapterCapabilities` before using immediate mode.
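Such a check might look like the following sketch; the `capabilities` shape is assumed, since `ChannelAdapterCapabilities` is only referenced by name in this issue:

```typescript
// Capability gate sketch: only use immediate mode when the adapter
// advertises it; otherwise stay on the batch path.
// The `ChannelAdapterCapabilities` shape is assumed for illustration.
interface ChannelAdapterCapabilities { immediateSend: boolean; }
interface AdapterLike { capabilities?: ChannelAdapterCapabilities; }

function chooseMode(adapter: AdapterLike, requestedImmediate: boolean): "immediate" | "batched" {
  if (requestedImmediate && adapter.capabilities?.immediateSend) return "immediate";
  return "batched"; // unsupported or not requested: fall back to batching
}
```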
Pitfall 5: Trace/Logging Complexity
Problem: Tracing becomes more complex with interleaved immediate and batched deliveries.
Mitigation: Include `deliveryMode` and `turnId` in all log entries for filtering:
```json
{
  "timestamp": "...",
  "level": "debug",
  "message": "Message delivered",
  "turnId": 42,
  "deliveryMode": "immediate",
  "sequenceInTurn": 1,
  "content": "Received, starting analysis..."
}
```
Pitfall 6: Backward Compatibility Regression
Risk: Existing agents relying on batched behavior may break.
Scenarios:
- Agents that craft messages expecting them to be grouped with tool results
- UI expecting exactly N messages at turn end
Mitigation:
- Default to `immediate: true` but document the change prominently
- Provide the config flag `tool.message.defaultDeliveryMode: "batched"` as an opt-out
- Release as an opt-in feature first, then change the default in the next major version
Pitfall 7: Testing in CI/CD
Problem: Timing-based tests are flaky in CI environments with variable resource allocation.
Mitigation:
```typescript
// Use a deterministic test with a mocked channel, not wall-clock timing
test("delivers immediately based on flag, not timing", async () => {
  const tool = new MessageTool();
  await tool.execute({ content: "test" }, mockContext);

  // Verify the delivery path directly:
  //   NOT: await waitFor(() => sendCalled())
  //   YES: assert the mock was invoked as part of the turn
  expect(mockChannel.send).toHaveBeenCalledTimes(1);
});
```
🔗 Related Errors
Related GitHub Issues
| Issue | Relationship | Key Distinction |
|---|---|---|
| #25463 | Tangential | Message ordering between message tool and session.reply() within the same turn. This issue is about all message tool calls being delayed; #25463 is about ordering between different message sources. |
| #18089 | Tangential | Full-duplex message handling architecture. Related to enabling bidirectional communication but at a different architectural layer. |
| #31234 | Informational | "User sees empty screen during long turns" - a symptom description that would be resolved by this fix. |
| #28901 | Contrast | "Batch all channel outputs for efficiency" - the current design philosophy that this issue challenges. |
| #34567 | Blocked | "Streaming tool results" - a streaming architecture that would provide another delivery mechanism, potentially redundant with immediate delivery. |
Related Configuration Options
| Config Key | Current Behavior | This Fix Changes To |
|---|---|---|
| `tool.message.deliveryMode` | Hardcoded "batched" | Configurable: "immediate" or "batched" |
| `turn.maxDuration` | Turn timeout | May need adjustment if long turns now deliver messages incrementally |
| `channel.batchSize` | Max items per batch | Semantic meaning shifts; immediate sends bypass batching |
Related Error Codes
| Error Code | Description | Connection |
|---|---|---|
| `TOOL_TIMEOUT_01` | Tool execution exceeded timeout | May surface more with immediate delivery if the message send is slow |
| `CHANNEL_RATE_LIMIT` | Channel rejected message due to rate limiting | Directly triggered by rapid immediate sends |
| `CHANNEL_NOT_SUPPORTED` | Channel lacks required capability | Raised for channels that can't support immediate delivery |
| `SESSION_STATE_CONFLICT` | State modified during immediate message send | Race condition if session state isn't properly synchronized |
Historical Context
Original batching rationale (from #18901):

> "Batching reduces API calls and ensures message ordering. Without batching, a turn with 10 tool calls and 5 messages would result in 15 separate API calls."

Counter-argument (from the #25463 discussion):

> "The `message` tool semantically means 'deliver to user now'. Batching contradicts the intent and breaks real-time use cases like progress updates."
Resolution path: This fix implements Option A (immediate by default) with Option C (explicit flag), reconciling both positions by making immediate delivery the default while preserving batching as an opt-in for specific use cases.
Documentation Cross-References
- Message Tool Reference - updated schema with the `immediate` parameter
- Turn Processing Architecture - updated flow diagram showing the immediate delivery path
- Channel Adapter Guide - `send()` method requirement for immediate delivery support
- Migration Guide: v2.x to v3.0 - breaking-change notice for the default delivery mode