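Discord Channel Leaks Internal Payloads: EXTERNAL_UNTRUSTED_CONTENT Wrappers in User Messages
Internal wrapper markers and malformed attachment-extraction text are being forwarded to Discord channels instead of being sanitized before transport.
🔍 Symptoms
Observed user-facing errors
When users interact with the assistant through Discord, they see messages containing raw internal content that should never reach the presentation layer. The leaked content appears in two distinct patterns:
Pattern 1: Wrapper syntax leak
Messages containing raw serialized markers appear directly in the Discord chat: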
<<<EXTERNAL_UNTRUSTED_CONTENT id="msg_abc123">>>
Source: External
UNTRUSTED Discord message body
<<<END_EXTERNAL_UNTRUSTED_CONTENT id="msg_abc123">>>
Pattern 2: Malformed attachment payload spam
Large blocks of meaningless text dominated by repeated technical terms:
attach attachment attachment hookup toggle compiler
attachment hookup toggle compiler attach attachment
UNTRUSTED Discord message body Source External Source External
attach attachment attachment hookup toggle compiler
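Technical manifestations
| Component | Manifestation |
|---|---|
| Discord transport layer | Raw wrapper tags appear in outbound message payloads |
| Attachment handler | Corrupted extraction results are forwarded to the channel |
| Async tool completion | Queued completion text contains internal markers |
| Sanitization layer | Boundary enforcement between context and rendering fails |
Trigger conditions
The issue occurs after any of the following:
- The assistant processes a message that contains attachments
- An async tool completion delivers its result to a Discord channel
- External content is processed through the EXTERNAL_UNTRUSTED_CONTENT wrapper system
- A multi-turn conversation involves file or image attachments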
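🧠 Root Cause Analysis
Architectural failure point
The leak indicates a failed sanitization boundary in the message pipeline, between internal processing and the Discord transport. The OpenClaw framework uses the EXTERNAL_UNTRUSTED_CONTENT wrapper to isolate untrusted user content during agent processing. The wrapper should be:
- Consumed internally during context assembly
- Never serialized to the outbound transport layer
- Stripped before any message reaches the rendering pipeline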
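The intended lifecycle can be sketched as a wrap/consume pair. This is only an illustration under stated assumptions, not OpenClaw's actual implementation: `wrapUntrusted` and `consumeWrapped` are hypothetical names, and the marker layout is inferred from the leaked sample in the Symptoms section.

```typescript
// Hypothetical helpers; the marker layout is assumed from the leaked sample.
interface WrappedBlock {
  id: string;
  body: string;
}

// Wrap untrusted text before it enters the agent context.
function wrapUntrusted(id: string, body: string): string {
  return [
    `<<<EXTERNAL_UNTRUSTED_CONTENT id="${id}">>>`,
    "Source: External",
    body,
    `<<<END_EXTERNAL_UNTRUSTED_CONTENT id="${id}">>>`,
  ].join("\n");
}

// Consume wrapped blocks during context assembly: extract each block and
// return the remaining text with no wrapper syntax left in it.
function consumeWrapped(context: string): { blocks: WrappedBlock[]; remainder: string } {
  // \1 requires the closing marker to carry the same id as the opening one.
  const pattern =
    /<<<EXTERNAL_UNTRUSTED_CONTENT id="([^"]*)">>>\r?\nSource: External\r?\n([\s\S]*?)\r?\n<<<END_EXTERNAL_UNTRUSTED_CONTENT id="\1">>>/g;
  const blocks: WrappedBlock[] = [];
  const remainder = context.replace(pattern, (_match: string, id: string, body: string) => {
    blocks.push({ id, body });
    return "";
  });
  return { blocks, remainder: remainder.trim() };
}
```

In this sketch the wrapper exists only between `wrapUntrusted` and `consumeWrapped`; anything that still contains `<<<` after consumption has crossed the boundary it was supposed to stay behind.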
Failure sequence
┌─────────────────────────────────────────────────────────────────┐
│ MESSAGE FLOW (FAILING) │
├─────────────────────────────────────────────────────────────────┤
│ │
│ Discord Message Received │
│ │ │
│ ▼ │
│ ┌─────────────────┐ │
│ │ Content Wrapper │ ← EXTERNAL_UNTRUSTED_CONTENT added │
│ │ Injection │ to isolate untrusted input │
│ └────────┬────────┘ │
│ │ │
│ ▼ │
│ ┌─────────────────┐ │
│ │ Agent Runtime │ ← Wrapper consumed in context │
│ │ Processing │ (intended behavior) │
│ └────────┬────────┘ │
│ │ │
│ ▼ │
│ ┌─────────────────┐ │
│ │ Discord Transport│ ← SANITIZATION FAILURE │
│ │ Renderer │ Wrapper not stripped before posting │
│ └────────┬────────┘ │
│ │ │
│ ▼ │
│ ┌─────────────────┐ │
│ │ RAW WRAPPER + │ ← User sees: │
│ │ Payload │ <<<EXTERNAL_UNTRUSTED_CONTENT>>> │
│ │ Forwarded │ UNTRUSTED Discord message body │
│ └─────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────┘
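Code path analysis
The defect is in the Discord transport adapter that builds response messages. The expected code path is shown first, followed by the defective one: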
// CORRECT FLOW (Expected)
function buildDiscordMessage(agentResponse) {
const sanitized = sanitize(agentResponse.raw); // strips all internal markers
const message = createDiscordEmbed(sanitized);
return message;
}
// ACTUAL FLOW (Defective)
function buildDiscordMessage(agentResponse) {
// Sanitization missing or ineffective
const message = createDiscordEmbed(agentResponse.raw);
// Raw EXTERNAL_UNTRUSTED_CONTENT markers included
return message;
}
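Attachment payload corruption
The "garbage text" pattern originates in attachment text extraction, where:
- Binary or malformed attachment data is processed
- Extraction produces corrupted Unicode code point sequences
- Those sequences are repeated during multi-attachment processing
- The corrupted payload bypasses content filtering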
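One way to catch the first two steps early is to measure how much of the extracted text consists of characters that rarely occur in real text. The sketch below is an assumption-laden illustration: `corruptionRatio`, `looksLikeBinaryResidue`, and the 10% threshold are hypothetical and are not part of the attachment handler described here.

```typescript
// Fraction of UTF-16 code units that are either the replacement character
// (U+FFFD) or a C0 control character other than tab, LF, or CR.
function corruptionRatio(text: string): number {
  if (text.length === 0) return 0;
  let bad = 0;
  for (let i = 0; i < text.length; i++) {
    const code = text.charCodeAt(i);
    const isControl = code < 0x20 && code !== 0x09 && code !== 0x0a && code !== 0x0d;
    if (isControl || code === 0xfffd) bad++;
  }
  return bad / text.length;
}

// The default threshold is an assumed starting point, not a tuned value.
function looksLikeBinaryResidue(text: string, threshold: number = 0.1): boolean {
  return corruptionRatio(text) > threshold;
}
```

A check like this complements a repetition check: it flags decoding failures, while repetition checks flag well-formed but meaningless token loops.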
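Subsystem responsibilities
| Subsystem | Expected behavior | Actual behavior |
|---|---|---|
| DiscordTransport | Strips internal wrappers before posting | Forwards raw content |
| ContentSanitizer | Removes EXTERNAL_* markers | Filter is disabled or bypassed |
| AttachmentHandler | Produces clean extracted text | Passes corrupted payloads through |
| AsyncCompletionRouter | Delivers clean completions | Includes debug markers |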
🛠️ Step-by-Step Fix
Phase 1: Disable wrapper propagation in the Discord transport
File: src/transports/discord/index.ts (or the equivalent transport module)
Before (defective):
async function handleAssistantMessage(message: ProcessedMessage): Promise<void> {
const discordMessage = {
content: message.content,
embeds: message.embeds
};
await this.client.sendMessage(discordMessage);
}
After (fixed):
async function handleAssistantMessage(message: ProcessedMessage): Promise<void> {
const sanitizedContent = this.sanitizeForDiscord(message.content);
const discordMessage = {
content: sanitizedContent,
embeds: message.embeds
};
await this.client.sendMessage(discordMessage);
}
private sanitizeForDiscord(content: string): string {
// Remove all internal wrapper markers
const patterns = [
/<<<EXTERNAL_UNTRUSTED_CONTENT[^>]*>>>/gi,
/<<<END_EXTERNAL_UNTRUSTED_CONTENT[^>]*>>>/gi,
/<<<INTERNAL_[A-Z_]+>>>/gi,
/Source:\s*(External|Internal)/gi
];
let sanitized = content;
for (const pattern of patterns) {
sanitized = sanitized.replace(pattern, '');
}
return sanitized.trim();
}
Phase 2: Harden attachment extraction sanitization
File: src/handlers/attachment-extractor.ts
Before (defective):
function extractTextFromAttachment(attachment: Attachment): string {
const raw = processAttachmentBinary(attachment);
return raw.text || '';
}
After (fixed):
function extractTextFromAttachment(attachment: Attachment): string {
const raw = processAttachmentBinary(attachment);
let text = raw.text || '';
// Discard malformed extractions (repeated tokens indicate corruption)
if (isMalformedExtraction(text)) {
console.warn(`[Sanitizer] Discarding malformed attachment extraction for ${attachment.id}`);
return '';
}
// Strip any internal markers that slipped through
text = stripInternalMarkers(text);
// Limit length to prevent spam
const MAX_LENGTH = 4000;
if (text.length > MAX_LENGTH) {
text = text.substring(0, MAX_LENGTH) + '\n[Attachment content truncated]';
}
return text;
}
function isMalformedExtraction(text: string): boolean {
// Detect repeated token patterns indicating extraction failure
const tokens = text.toLowerCase().split(/\s+/);
const uniqueRatio = new Set(tokens).size / tokens.length;
// If <20% unique tokens, extraction is likely corrupted
return uniqueRatio < 0.2 && tokens.length > 50;
}
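Because `isMalformedExtraction` only fires above 50 tokens, a short burst like the Pattern 2 sample in the Symptoms section (fewer than 50 tokens) would pass it. A complementary check for short inputs could look at how many adjacent word pairs repeat. The following is a sketch only: `repeatedBigramShare` and any threshold applied to it are hypothetical, and ordinary prose can also repeat short phrases, so the cutoff would need tuning.

```typescript
// Share of adjacent token pairs (bigrams) that have already occurred earlier
// in the same text. 0 means no pair repeats; values near 1 mean a tight loop.
function repeatedBigramShare(text: string): number {
  const tokens = text.toLowerCase().split(/\s+/).filter((t) => t.length > 0);
  if (tokens.length < 2) return 0;
  const counts: Record<string, number> = {};
  let repeated = 0;
  for (let i = 0; i < tokens.length - 1; i++) {
    const key = tokens[i] + " " + tokens[i + 1];
    const next = (counts[key] || 0) + 1;
    counts[key] = next;
    if (next > 1) repeated++; // this pair was seen before
  }
  return repeated / (tokens.length - 1);
}
```

Combining this with the unique-token ratio covers both long and short corrupted payloads without lowering the 50-token guard that protects short legitimate messages.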
Phase 3: Fix async tool completion routing
File: src/routing/async-completion-router.ts
Before (defective):
async function forwardCompletion(result: ToolResult): Promise<void> {
const message = buildChannelMessage(result);
await this.transport.post(message);
}
After (fixed):
async function forwardCompletion(result: ToolResult): Promise<void> {
// Ensure clean payload before routing
const cleanPayload = this.sanitizer.sanitize(result.payload);
if (cleanPayload.isDirty) {
console.error('[Router] Sanitizer detected dirty payload in async completion');
// Log for debugging, but still deliver cleaned content
}
const message = buildChannelMessage({
...result,
payload: cleanPayload.content
});
await this.transport.post(message);
}
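The fixed router reads `cleanPayload.isDirty` and `cleanPayload.content`, so the sanitizer it calls has to return both. A minimal sketch of a function with that return shape follows; `SanitizeResult`, `sanitizePayload`, and the marker pattern are assumptions for illustration, not the actual ContentSanitizer interface.

```typescript
// Assumed result shape, inferred from how the router uses the return value.
interface SanitizeResult {
  content: string;
  isDirty: boolean;
}

// Matches <<<NAME>>> and <<<NAME attr="...">>> style markers.
const MARKER_PATTERN = /<<<[A-Z_]+[^>]*>>>/g;

function sanitizePayload(payload: string): SanitizeResult {
  const content = payload.replace(MARKER_PATTERN, "").trim();
  // Dirty means a marker was removed, not merely that whitespace was trimmed.
  return { content, isDirty: content !== payload.trim() };
}
```

Returning the flag alongside the cleaned text lets the router log the leak for debugging while still delivering the cleaned content, which matches the behavior in the fixed `forwardCompletion`.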
Phase 4: Add a transport-layer guard
File: src/transports/discord/client.ts
Add a final sanitization gate before any Discord API call:
async sendMessage(message: DiscordMessage): Promise<API.Message> {
// Final safety net - ensure no internal content escapes
const finalContent = this.stripInternalMarkers(message.content);
if (finalContent !== message.content) {
logger.warn('[DiscordTransport] Stripped internal markers before send');
}
// Hard block if wrapper syntax detected (indicates serious leak)
if (this.containsWrapperSyntax(finalContent)) {
logger.error('[DiscordTransport] CRITICAL: Wrapper syntax detected at send time');
throw new Error('SANITIZATION_FAILURE: Internal content detected in outbound message');
}
return this.api.createMessage(this.channelId, {
content: finalContent,
embeds: message.embeds
});
}
private containsWrapperSyntax(text: string): boolean {
// Allow attributes such as id="..." between the marker name and >>>
return /<<<[A-Z_]+[^>]*>>>/.test(text);
}
🧪 Verification
Test case 1: Wrapper marker stripping
Run the sanitization function against known internal content:
const { sanitizeForDiscord } = require('./src/transports/discord/sanitizer');
const testCases = [
{
input: '<<<EXTERNAL_UNTRUSTED_CONTENT id="msg_abc123">>>UNTRUSTED Discord message body<<<END_EXTERNAL_UNTRUSTED_CONTENT id="msg_abc123">>>',
expected: 'UNTRUSTED Discord message body'
},
{
input: 'Source: External\nUser message\nSource: Internal',
expected: 'User message'
},
{
input: '<<<EXTERNAL_UNTRUSTED_CONTENT id="msg_abc123">>>\nValid response\n<<<END_EXTERNAL_UNTRUSTED_CONTENT id="msg_abc123">>>',
expected: 'Valid response'
}
];
let passed = 0;
for (const { input, expected } of testCases) {
const result = sanitizeForDiscord(input);
if (result === expected) {
console.log('✅ PASS:', JSON.stringify(result));
passed++;
} else {
console.log('❌ FAIL:', JSON.stringify({ input, expected, got: result }));
}
}
console.log(`\nResults: ${passed}/${testCases.length} tests passed`);
process.exit(passed === testCases.length ? 0 : 1);
Expected output:
✅ PASS: "UNTRUSTED Discord message body"
✅ PASS: "User message"
✅ PASS: "Valid response"
Results: 3/3 tests passed
Test case 2: End-to-end Discord transport test
// Integration test - requires mock Discord client
const { DiscordTransport } = require('./src/transports/discord');
const mockClient = {
  messages: [],
  async sendMessage(msg) {
    this.messages.push(msg);
    return { id: 'test-' + Date.now() };
  }
};
const transport = new DiscordTransport(mockClient);
// Simulate message with internal markers
const dirtyMessage = {
  content: '<<<EXTERNAL_UNTRUSTED_CONTENT id="msg_abc123">>>Corrupted payload<<<END_EXTERNAL_UNTRUSTED_CONTENT id="msg_abc123">>>',
  embeds: []
};
// CommonJS modules do not support top-level await, so run the test inside an async function
async function main() {
  try {
    await transport.handleAssistantMessage(dirtyMessage);
    const sent = mockClient.messages[0];
    if (sent.content.includes('<<<')) {
      console.log('❌ FAIL: Wrapper syntax leaked to Discord');
      console.log('Sent content:', sent.content);
      process.exit(1);
    }
    console.log('✅ PASS: Message sanitized before Discord send');
    console.log('Final content:', sent.content);
  } catch (e) {
    if (e.message.includes('SANITIZATION_FAILURE')) {
      console.log('✅ PASS: Hard block triggered on dirty content');
    } else {
      throw e;
    }
  }
}
main().catch((e) => {
  console.error(e);
  process.exit(1);
});
Test case 3: Malformed attachment detection
const { isMalformedExtraction } = require('./src/handlers/attachment-extractor');
// Corrupted payload (high repetition)
const corrupted = Array(200).fill('attach attachment hookup toggle compiler').join(' ');
console.log('Corrupted detection:', isMalformedExtraction(corrupted)); // Should be true
// Valid text
const valid = 'User uploaded a document containing meeting notes from Tuesday.';
console.log('Valid detection:', isMalformedExtraction(valid)); // Should be false
Expected output:
Corrupted detection: true
Valid detection: false
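Verification checklist
After applying the fix, confirm that:
- No <<<EXTERNAL_UNTRUSTED_CONTENT strings appear in Discord message history
- No <<<END_EXTERNAL_UNTRUSTED_CONTENT strings appear in Discord message history
- Source: External / Source: Internal does not appear in user-visible messages
- Extracted attachment text contains no repetitive token patterns (unique-token ratio below 20%)
- Unit tests for the sanitizeForDiscord function pass
- Integration tests for the Discord transport pass
- The hard block throws an error if wrapper syntax is detected at send time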
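The first three checklist items can be automated once message history is available as data. The sketch below assumes the history has already been exported into an array of `{ id, content }` objects; how it is fetched is out of scope, and `HistoryMessage`, `LEAK_PATTERNS`, and `findLeaks` are hypothetical names.

```typescript
// Assumed shape of an exported history entry.
interface HistoryMessage {
  id: string;
  content: string;
}

// No g flag: these patterns are used with test(), which must stay stateless.
const LEAK_PATTERNS: RegExp[] = [
  /<<<(?:END_)?EXTERNAL_UNTRUSTED_CONTENT/,
  /^Source:\s*(?:External|Internal)\s*$/m,
];

// Return the ids of messages that still contain leaked internal content.
function findLeaks(history: HistoryMessage[]): string[] {
  return history
    .filter((m) => LEAK_PATTERNS.some((p) => p.test(m.content)))
    .map((m) => m.id);
}
```

An empty result from `findLeaks` over the post-fix history corresponds to the first three items being satisfied; any returned id points at a message worth inspecting by hand.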
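⚠️ Common Pitfalls
Environment-specific pitfalls
Docker container isolation
If you run OpenClaw in Docker, make sure the sanitization module is not overridden by a volume mount that reverts it to a defective version: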
# Wrong - local source overrides container
docker run -v $(pwd)/src:/app/src openclaw:latest
# Correct - use container's fixed source
docker run openclaw:latest
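Windows line endings
Content with \r\n line endings can trip up patterns that match a literal \n or anchor on the end of a line. The marker pattern itself is not affected, because [^>]* also matches \r and \n; excluding line breaks from the character class only keeps a match from spanning multiple lines:
// LOOSE: [^>]* can run across line breaks until the next >>>
const looseMarker = /<<<EXTERNAL_UNTRUSTED_CONTENT[^>]*>>>/g;
// STRICTER: the marker must sit on one line, for both \r\n and \n input
const singleLineMarker = /<<<EXTERNAL_UNTRUSTED_CONTENT[^>\r\n]*>>>/gi;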
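Node.js version incompatibility
The unique-ratio calculation relies on Set, an ES2015 feature that every currently supported Node.js release provides. If the code must also run on a runtime without Set, fall back to an array-based count:
// Feature detection fallback
const uniqueCount = typeof Set !== 'undefined'
? new Set(tokens).size
: tokens.filter((t, i) => tokens.indexOf(t) === i).length;
const uniqueRatio = uniqueCount / tokens.length;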
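Configuration pitfalls
Sanitization disabled by an environment variable
Some deployments disable sanitization for debugging, which causes this leak: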
# .env file - ensure sanitization is NOT disabled
SANITIZATION_ENABLED=true
# SANITIZATION_ENABLED=false ← REMOVE OR SET TO TRUE
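To make an accidental opt-out harder, the flag can be read so that only an explicit "off" value disables sanitization. This is a sketch under assumptions: `isSanitizationEnabled` is a hypothetical helper, and which strings the framework actually treats as false is not stated in the source.

```typescript
type Env = Record<string, string | undefined>;

// Sanitization stays on unless the variable is explicitly set to an
// "off" value; unset, empty, or unrecognized values keep it enabled.
function isSanitizationEnabled(env: Env): boolean {
  const raw = (env["SANITIZATION_ENABLED"] || "").trim().toLowerCase();
  return !(raw === "false" || raw === "0" || raw === "off");
}
```

In a Node.js process this would typically be called as `isSanitizationEnabled(process.env)` at startup, with a warning logged whenever it returns false.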
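Transport configuration does not inherit the base sanitizer
If you use a custom Discord transport implementation, make sure it extends the base transport so that ContentSanitizer is applied: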
// WRONG: Custom transport bypasses sanitization
class DiscordTransportCustom {
async send(msg) { /* direct send without sanitization */ }
}
// CORRECT: Inherit sanitization
class DiscordTransportCustom extends BaseTransport {
async send(msg) {
return super.send(this.sanitizer.sanitize(msg));
}
}
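Runtime edge cases
Unicode normalization attacks
Malicious content may use Unicode lookalike characters to evade pattern matching. NFKC normalization folds compatibility forms such as fullwidth characters, but it does not map cross-script homoglyphs (for example Cyrillic 'Е' to Latin 'E'), so those need a separate confusables mapping:
// Attempted bypass: Cyrillic 'Е' (U+0415) instead of Latin 'E'
const malicious = '<<<ЕXTERNAL_UNTRUSTED_CONTENT id="1">>>'; // Different chars
// Defensive: normalize before pattern matching. This folds fullwidth and other
// compatibility forms; Cyrillic lookalikes pass through unchanged.
const normalized = content.normalize('NFKC');
const sanitized = stripInternalMarkers(normalized);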
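NFKC on its own does not turn a Cyrillic lookalike into its Latin counterpart, so a matcher that wants to catch such variants needs an extra folding step. The sketch below illustrates the idea with a deliberately tiny, hand-written table; `HOMOGLYPHS` and `foldForMatching` are hypothetical, and a production table would be derived from the Unicode confusables data.

```typescript
// Assumed, incomplete mapping of Cyrillic capitals to Latin lookalikes.
const HOMOGLYPHS: Record<string, string> = {
  "\u0410": "A", // Cyrillic А
  "\u0415": "E", // Cyrillic Е
  "\u041e": "O", // Cyrillic О
  "\u0421": "C", // Cyrillic С
  "\u0422": "T", // Cyrillic Т
};

// Produce a folded copy of the text for detection purposes only.
function foldForMatching(text: string): string {
  const nfkc = text.normalize("NFKC"); // folds fullwidth forms such as U+FF1C
  let out = "";
  for (let i = 0; i < nfkc.length; i++) {
    const ch = nfkc.charAt(i);
    const mapped = HOMOGLYPHS[ch];
    out += mapped !== undefined ? mapped : ch;
  }
  return out;
}
```

The folded string is best used only to decide whether a message contains marker-like syntax. Sending the folded text itself would rewrite legitimate Cyrillic content.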
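Race conditions in concurrent message sanitization
When several async tool completions fire at the same time, sanitization must not depend on shared mutable state. JavaScript strings are immutable, so chained replace calls are safe; the realistic hazard is a shared regex with the g flag, whose lastIndex persists between test() calls:
// WRONG: a shared regex with the g flag keeps lastIndex between test() calls,
// so a later call can start mid-string and miss a marker
const SHARED_MARKER = /<<<[A-Z_]+[^>]*>>>/g;
function hasMarker(content) {
  return SHARED_MARKER.test(content);
}
// CORRECT: stateless operations; replace() resets lastIndex for g-flag patterns
function sanitize(content) {
  return content
    .replace(pattern1, '')
    .replace(pattern2, '');
}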
Empty sanitization result
If sanitization strips all content, make sure the message is not sent (to avoid posting empty spam):
const sanitized = stripInternalMarkers(raw);
if (!sanitized.trim()) {
logger.warn('[Discord] Sanitization produced empty message, discarding');
return; // Do not post to Discord
}
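🔗 Related Errors
Directly related issues
| Error/Issue | Description | Relationship |
|---|---|---|
| EXTERNAL_UNTRUSTED_CONTENT wrapper leak | Raw internal markers visible to users | Primary issue, same symptoms |
| Attachment text extraction corruption | Garbage or malformed text from attachments | Same root cause: missing sanitization boundary |
| Async tool completion spam | Duplicated or corrupted completions in the channel | Shared transport rendering defect |
| Discord rate limit errors | Can occur if the leak causes a message spam loop | Secondary symptom of spam content |
| Message queue backup | Occurs if the transport repeatedly fails on dirty content | Downstream consequence of unsanitized input |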
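Historically related issues
| Issue ID | Title | Relevance |
|---|---|---|
| GH-XXX | Sanitizer not applied to async completion payloads | Direct predecessor: the fix did not propagate to all paths |
| GH-YYY | Discord transport bypasses content filtering in development mode | Environment-specific variant of the boundary failure |
| GH-ZZZ | Attachment extraction returns binary garbage | Same corruption mechanism, different subsystem |
| GH-AAA | Internal wrapper syntax appears in logs | Indicates the wrapper is spreading through the codebase |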
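Error code reference
| Code | Meaning | Relevance to fix |
|---|---|---|
| DISCORD_TRANSPORT_001 | Message exceeds the 2000-character limit | Sanitization should truncate rather than fail |
| DISCORD_TRANSPORT_002 | Sanitization failure on an outbound message | Hard block indicates a serious leak |
| CONTENT_SANITIZE_001 | Pattern matching failed on input | Regex gap allows bypass |
| ATTACHMENT_EXTRACT_001 | Binary extraction produced non-text output | Discard the corrupted payload; do not forward it |
| ASYNC_COMPLETION_001 | Dirty payload detected in the queue | Pre-delivery sanitization is missing |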
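For DISCORD_TRANSPORT_001, "truncate rather than fail" could look like the sketch below. It is illustrative only: `truncateForDiscord` and the notice text are hypothetical, the limit is taken from the table above, and `length` counts UTF-16 code units, which may not match how Discord counts characters.

```typescript
const DISCORD_CONTENT_LIMIT = 2000; // limit cited in the error code table
const TRUNCATION_NOTICE = "\n[Message truncated]"; // hypothetical wording

// Shorten content so that content plus notice fits within the limit.
// Assumes the limit is larger than the notice itself.
function truncateForDiscord(content: string, limit: number = DISCORD_CONTENT_LIMIT): string {
  if (content.length <= limit) return content;
  let keep = Math.max(0, limit - TRUNCATION_NOTICE.length);
  // Avoid cutting a surrogate pair in half at the truncation point.
  if (keep > 0) {
    const last = content.charCodeAt(keep - 1);
    if (last >= 0xd800 && last <= 0xdbff) keep--;
  }
  return content.slice(0, keep) + TRUNCATION_NOTICE;
}
```

Running this after marker stripping, rather than before, keeps the limit check from being skewed by wrapper text that is about to be removed anyway.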
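Related configuration parameters
| Parameter | Location | Default | Security impact |
|---|---|---|---|
| SANITIZATION_ENABLED | Environment variable | true | If false, all sanitization is bypassed |
| DISCORD_STRICT_MODE | Config | false | If true, enables the hard block when a wrapper is detected |
| ATTACHMENT_MAX_EXTRACT_CHARS | Config | 4000 | Prevents spam from oversized extractions |
| ASYNC_COMPLETION_SANITIZE | Config | true | Must stay enabled to protect the async path |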