diff --git a/API.md b/API.md index 65d9cb6..a045b6c 100644 --- a/API.md +++ b/API.md @@ -37,7 +37,7 @@ - OpenAI / Claude / Gemini 三套协议已统一挂在同一 `chi` 路由树上,由 `internal/server/router.go` 负责装配。 - 适配器层职责收敛为:**请求归一化 → DeepSeek 调用 → 协议形态渲染**,减少历史版本中“同能力多处实现”的分叉。 -- Tool Calling 的解析策略在 Go 与 Node Runtime 间保持一致:推荐模型输出 DSML 外壳 `<|DSML|tool_calls>` → `<|DSML|invoke name="...">` → `<|DSML|parameter name="...">`;兼容层也接受 DSML wrapper 别名 ``、`<|tool_calls>`、`<|tool_calls>`、常见 DSML 分隔符漏写形态(如 `<|DSML tool_calls>`),以及旧式 canonical XML `` → `` → ``,内部仍以 XML 解析语义为准,并在流式场景执行防泄漏筛分。 +- Tool Calling 的解析策略在 Go 与 Node Runtime 间保持一致:推荐模型输出 DSML 外壳 `<|DSML|tool_calls>` → `<|DSML|invoke name="...">` → `<|DSML|parameter name="...">`;兼容层也接受 DSML wrapper 别名 ``、`<|tool_calls>`、`<|tool_calls>`、常见 DSML 分隔符漏写形态(如 `<|DSML tool_calls>`)、`DSML` 与工具标签名黏连的常见 typo(如 ``),以及旧式 canonical XML `` → `` → ``。实现上采用窄容错结构扫描:只有 `tool_calls` wrapper 或可修复的缺失 opening wrapper 会进入工具路径,裸 `` 不计为已支持语法;流式场景继续执行防泄漏筛分。若参数体本身是合法 JSON 字面量(如 `123`、`true`、`null`、数组或对象),会按结构化值输出,不再一律当作字符串;若 CDATA 偶发漏闭合,则会在最终 parse / flush 恢复阶段做窄修复,尽量保住已完整包裹的外层工具调用。 - `Admin API` 将配置与运行时策略分开:`/admin/config*` 管静态配置,`/admin/settings*` 管运行时行为。 --- @@ -344,7 +344,7 @@ data: [DONE] 补充说明: - **非代码块上下文**下,工具负载即使与普通文本混合,也会按特征识别并产出可执行 tool call(前后普通文本仍可透传)。 -- 解析器当前把 DSML 外壳(`<|DSML|tool_calls>` / `<|DSML|invoke name="...">` / `<|DSML|parameter name="...">`)、DSML wrapper 别名(``、`<|tool_calls>`、`<|tool_calls>`)、常见 DSML 分隔符漏写形态(如 `<|DSML tool_calls>` / `<|DSML invoke>` / `<|DSML parameter>`)和旧式 canonical XML 工具块(`` / `` / ``)作为可执行调用解析;DSML 会先归一化回 XML,内部仍以 XML 解析语义为准。旧式 ``、``、``、``、``、`tool_use`、antml 风格与纯 JSON `tool_calls` 片段默认都会按普通文本处理。 +- 解析器当前把 DSML 外壳(`<|DSML|tool_calls>` / `<|DSML|invoke name="...">` / `<|DSML|parameter name="...">`)、DSML wrapper 别名(``、`<|tool_calls>`、`<|tool_calls>`)、常见 DSML 分隔符漏写形态(如 `<|DSML tool_calls>` / `<|DSML invoke>` / `<|DSML parameter>`)、`DSML` 与工具标签名黏连的常见 typo(如 `` / `` / ``)和旧式 canonical XML 工具块(`` / `` / ``)作为可执行调用解析;DSML 会先归一化回 
XML,内部仍以 XML 解析语义为准。旧式 ``、``、``、``、``、`tool_use`、antml 风格与纯 JSON `tool_calls` 片段默认都会按普通文本处理。 - 当最终可见正文为空但思维链里包含可执行工具调用时,Chat / Responses 会在收尾阶段补发标准 OpenAI `tool_calls` / `function_call` 输出;如果客户端未开启 thinking / reasoning,该思维链只用于检测,不会作为可见正文或 `reasoning_content` 暴露。 - Markdown fenced code block(例如 ```json ... ```)中的 `tool_calls` 仅视为示例文本,不会被执行。 diff --git a/docs/toolcall-semantics.md b/docs/toolcall-semantics.md index 3466ce0..5529a4b 100644 --- a/docs/toolcall-semantics.md +++ b/docs/toolcall-semantics.md @@ -39,8 +39,9 @@ 兼容修复: - 如果模型漏掉 opening wrapper,但后面仍输出了一个或多个 invoke 并以 closing wrapper 收尾,Go 解析链路会在解析前补回缺失的 opening wrapper。 -- 如果模型把 DSML 标签里的分隔符 `|` 写漏成空格(例如 `<|DSML tool_calls>` / `<|DSML invoke>` / `<|DSML parameter>`,或无 leading pipe 的 `` 形态),Go / Node 会在固定工具标签名范围内归一化;相似但非工具标签名(如 `tool_calls_extra`)仍按普通文本处理。 +- 如果模型把 DSML 标签里的分隔符 `|` 写漏成空格(例如 `<|DSML tool_calls>` / `<|DSML invoke>` / `<|DSML parameter>`,或无 leading pipe 的 `` 形态),或把 `DSML` 与工具标签名直接黏连(例如 `` / `` / ``),Go / Node 会在固定工具标签名范围内归一化;相似但非工具标签名(如 `tool_calls_extra`)仍按普通文本处理。 - 这是一个针对常见模型失误的窄修复,不改变推荐输出格式;prompt 仍要求模型直接输出完整 DSML 外壳。 +- 裸 `` / `` 不会被当成“已支持的工具语法”;只有 `tool_calls` wrapper 或可修复的缺失 opening wrapper 才会进入工具调用路径。 ## 2) 非兼容内容 @@ -52,20 +53,23 @@ 在流式链路中(Go / Node 一致): -- DSML `<|DSML|tool_calls>` wrapper、兼容变体(``、`<|tool_calls>`、`<|tool_calls>`)、窄容错空格分隔形态(如 `<|DSML tool_calls>`)和 canonical `` wrapper 都会进入结构化捕获 +- DSML `<|DSML|tool_calls>` wrapper、兼容变体(``、`<|tool_calls>`、`<|tool_calls>`)、窄容错空格分隔形态(如 `<|DSML tool_calls>`)、黏连形态(如 ``)和 canonical `` wrapper 都会进入结构化捕获 - 如果流里直接从 invoke 开始,但后面补上了 closing wrapper,Go 流式筛分也会按缺失 opening wrapper 的修复路径尝试恢复 - 已识别成功的工具调用不会再次回流到普通文本 - 不符合新格式的块不会执行,并继续按原样文本透传 - fenced code block(反引号 `` ``` `` 和波浪线 `~~~`)中的 XML 示例始终按普通文本处理 - 支持嵌套围栏(如 4 反引号嵌套 3 反引号)和 CDATA 内围栏保护 +- 如果模型把 `` 或 Markdown inline code 里的 `<|DSML|tool_calls>`)而后面紧跟真正工具调用时,sieve 会跳过不可解析的 mention 候选并继续匹配后续真实工具块,不会因 mention 导致工具调用丢失,也不会截断 mention 后的正文 +另外,`` 的值如果本身是合法 JSON 字面量,也会按结构化值解析,而不是一律保留为字符串。例如 
`123`、`true`、`null`、`[1,2]`、`{"a":1}` 都会还原成对应的 number / boolean / null / array / object。 + ## 4) 输出结构 `ParseToolCallsDetailed` / `parseToolCallsDetailed` 返回: - `calls`:解析出的工具调用列表(`name` + `input`) -- `sawToolCallSyntax`:检测到 DSML / canonical wrapper,或命中“缺失 opening wrapper 但可修复”的形态时会为 `true` +- `sawToolCallSyntax`:检测到 DSML / canonical wrapper,或命中“缺失 opening wrapper 但可修复”的形态时会为 `true`;裸 `invoke` 不计入该标记 - `rejectedByPolicy`:当前固定为 `false` - `rejectedToolNames`:当前固定为空数组 @@ -88,7 +92,7 @@ node --test tests/node/stream-tool-sieve.test.js - DSML `<|DSML|tool_calls>` wrapper 正常解析 - legacy canonical `` wrapper 正常解析 -- 别名变体(``、`<|tool_calls>`、`<|tool_calls>`)和 DSML 空格分隔 typo(如 `<|DSML tool_calls>`)正常解析 +- 别名变体(``、`<|tool_calls>`、`<|tool_calls>`)、DSML 空格分隔 typo(如 `<|DSML tool_calls>`)和黏连 typo(如 ``)正常解析 - 混搭标签(DSML wrapper + canonical inner)归一化后正常解析 - 波浪线围栏 `~~~` 内的示例不执行 - 嵌套围栏(4 反引号嵌套 3 反引号)内的示例不执行 diff --git a/internal/js/helpers/stream-tool-sieve/parse.js b/internal/js/helpers/stream-tool-sieve/parse.js index d81661f..82f8f94 100644 --- a/internal/js/helpers/stream-tool-sieve/parse.js +++ b/internal/js/helpers/stream-tool-sieve/parse.js @@ -6,10 +6,10 @@ const { const { parseMarkupToolCalls, stripFencedCodeBlocks, + containsToolCallWrapperSyntaxOutsideIgnored, + sanitizeLooseCDATA, } = require('./parse_payload'); -const TOOL_MARKUP_PREFIXES = [' lower.includes(prefix)); + const styles = containsToolCallWrapperSyntaxOutsideIgnored(text); + return styles.dsml || styles.canonical; } function shouldSkipToolCallParsingForCodeFenceExample(text) { diff --git a/internal/js/helpers/stream-tool-sieve/parse_payload.js b/internal/js/helpers/stream-tool-sieve/parse_payload.js index 40d3e08..185ed4d 100644 --- a/internal/js/helpers/stream-tool-sieve/parse_payload.js +++ b/internal/js/helpers/stream-tool-sieve/parse_payload.js @@ -3,6 +3,7 @@ const TOOL_CALL_MARKUP_KV_PATTERN = /<(?:[a-z0-9_:-]+:)?([a-z0-9_.-]+)\b[^>]*>([\s\S]*?)<\/(?:[a-z0-9_:-]+:)?\1>/gi; const CDATA_PATTERN = /^$/i; 
const XML_ATTR_PATTERN = /\b([a-z0-9_:-]+)\s*=\s*("([^"]*)"|'([^']*)')/gi; +const TOOL_MARKUP_NAMES = ['tool_calls', 'invoke', 'parameter']; const { toStringSafe, @@ -138,13 +139,10 @@ function normalizeDSMLToolCallMarkup(text) { if (!raw) { return { text: '', ok: true }; } - const styles = toolMarkupStylesOutsideIgnored(raw); + const styles = containsToolMarkupSyntaxOutsideIgnored(raw); if (!styles.dsml) { return { text: raw, ok: true }; } - // Always normalize DSML aliases to canonical form, even when canonical - // tags coexist. Models frequently mix DSML wrapper tags with canonical - // inner tags (e.g., <|tool_calls>). return { text: replaceDSMLToolMarkupOutsideIgnored(raw), ok: true, @@ -152,65 +150,21 @@ function normalizeDSMLToolCallMarkup(text) { } function containsDSMLToolMarkup(text) { - return toolMarkupStylesOutsideIgnored(text).dsml; + return containsToolMarkupSyntaxOutsideIgnored(text).dsml; } function containsCanonicalToolMarkup(text) { - return toolMarkupStylesOutsideIgnored(text).canonical; + return containsToolMarkupSyntaxOutsideIgnored(text).canonical; } -const DSML_TOOL_MARKUP_ALIASES = [ - { from: '<|dsml|tool_calls', to: '', to: '' }, - { from: '<|dsml|invoke', to: '', to: '' }, - { from: '<|dsml|parameter', to: '', to: '' }, - { from: '<|dsml tool_calls', to: '', to: '' }, - { from: '<|dsml invoke', to: '', to: '' }, - { from: '<|dsml parameter', to: '', to: '' }, - { from: '', to: '' }, - { from: '', to: '' }, - { from: '', to: '' }, - { from: '', to: '' }, - { from: '', to: '' }, - { from: '', to: '' }, - { from: '<|tool_calls', to: '', to: '' }, - { from: '<|invoke', to: '', to: '' }, - { from: '<|parameter', to: '', to: '' }, - { from: '<|tool_calls', to: '', to: '' }, - { from: '<|invoke', to: '', to: '' }, - { from: '<|parameter', to: '', to: '' }, -]; - -const CANONICAL_TOOL_MARKUP_PREFIXES = [ - '', - '', - '', -]; - -function toolMarkupStylesOutsideIgnored(text) { - const lower = toStringSafe(text).toLowerCase(); +function 
containsToolCallWrapperSyntaxOutsideIgnored(text) { + const raw = toStringSafe(text); const styles = { dsml: false, canonical: false }; - for (let i = 0; i < lower.length;) { + if (!raw) { + return styles; + } + const lower = raw.toLowerCase(); + for (let i = 0; i < raw.length;) { const skipped = skipXmlIgnoredSection(lower, i); if (skipped.blocked) { return styles; @@ -219,15 +173,55 @@ function toolMarkupStylesOutsideIgnored(text) { i = skipped.next; continue; } - if (CANONICAL_TOOL_MARKUP_PREFIXES.some(prefix => lower.startsWith(prefix, i))) { - styles.canonical = true; + const tag = scanToolMarkupTagAt(raw, i); + if (tag) { + if (tag.name !== 'tool_calls') { + i = tag.end + 1; + continue; + } + if (tag.dsmlLike) { + styles.dsml = true; + } else { + styles.canonical = true; + } + if (styles.dsml && styles.canonical) { + return styles; + } + i = tag.end + 1; + continue; } - if (DSML_TOOL_MARKUP_ALIASES.some(alias => lower.startsWith(alias.from, i))) { - styles.dsml = true; - } - if (styles.dsml && styles.canonical) { + i += 1; + } + return styles; +} +function containsToolMarkupSyntaxOutsideIgnored(text) { + const raw = toStringSafe(text); + const styles = { dsml: false, canonical: false }; + if (!raw) { + return styles; + } + for (let i = 0; i < raw.length;) { + const skipped = skipXmlIgnoredSection(raw.toLowerCase(), i); + if (skipped.blocked) { return styles; } + if (skipped.advanced) { + i = skipped.next; + continue; + } + const tag = scanToolMarkupTagAt(raw, i); + if (tag) { + if (tag.dsmlLike) { + styles.dsml = true; + } else { + styles.canonical = true; + } + if (styles.dsml && styles.canonical) { + return styles; + } + i = tag.end + 1; + continue; + } i += 1; } return styles; @@ -235,6 +229,9 @@ function toolMarkupStylesOutsideIgnored(text) { function replaceDSMLToolMarkupOutsideIgnored(text) { const raw = toStringSafe(text); + if (!raw) { + return ''; + } const lower = raw.toLowerCase(); let out = ''; for (let i = 0; i < raw.length;) { @@ -248,10 +245,14 
@@ function replaceDSMLToolMarkupOutsideIgnored(text) { i = skipped.next; continue; } - const alias = DSML_TOOL_MARKUP_ALIASES.find(item => lower.startsWith(item.from, i)); - if (alias) { - out += alias.to; - i += alias.from.length; + const tag = scanToolMarkupTagAt(raw, i); + if (tag) { + if (tag.dsmlLike) { + out += `<${tag.closing ? '/' : ''}${tag.name}${raw.slice(tag.nameEnd, tag.end + 1)}`; + } else { + out += raw.slice(tag.start, tag.end + 1); + } + i = tag.end + 1; continue; } out += raw[i]; @@ -417,6 +418,150 @@ function skipXmlIgnoredSection(lower, i) { return { advanced: false, blocked: false, next: i }; } +function scanToolMarkupTagAt(text, start) { + const raw = toStringSafe(text); + if (!raw || start < 0 || start >= raw.length || raw[start] !== '<') { + return null; + } + const lower = raw.toLowerCase(); + let i = start + 1; + const closing = raw[i] === '/'; + if (closing) { + i += 1; + } + let dsmlLike = false; + if (i < raw.length && isToolMarkupPipe(raw[i])) { + dsmlLike = true; + i += 1; + } + if (lower.startsWith('dsml', i)) { + dsmlLike = true; + i += 'dsml'.length; + while (i < raw.length && isToolMarkupSeparator(raw[i])) { + i += 1; + } + } + const { name, len } = matchToolMarkupName(lower, i); + if (!name) { + return null; + } + const nameEnd = i + len; + if (!hasXmlTagBoundary(raw, nameEnd)) { + return null; + } + const end = findXmlTagEnd(raw, nameEnd); + if (end < 0) { + return null; + } + return { + start, + end, + nameStart: i, + nameEnd, + name, + closing, + selfClosing: raw.slice(start, end + 1).trim().endsWith('/>'), + dsmlLike, + canonical: !dsmlLike, + }; +} + +function findToolMarkupTagOutsideIgnored(text, from) { + const raw = toStringSafe(text); + const lower = raw.toLowerCase(); + for (let i = Math.max(0, from || 0); i < raw.length;) { + const skipped = skipXmlIgnoredSection(lower, i); + if (skipped.blocked) { + return null; + } + if (skipped.advanced) { + i = skipped.next; + continue; + } + const tag = scanToolMarkupTagAt(raw, 
i); + if (tag) { + return tag; + } + i += 1; + } + return null; +} + +function findMatchingToolMarkupClose(text, openTag) { + const raw = toStringSafe(text); + if (!raw || !openTag || !openTag.name || openTag.closing) { + return null; + } + let depth = 1; + for (let pos = openTag.end + 1; pos < raw.length;) { + const tag = findToolMarkupTagOutsideIgnored(raw, pos); + if (!tag) { + return null; + } + if (tag.name !== openTag.name) { + pos = tag.end + 1; + continue; + } + if (tag.closing) { + depth -= 1; + if (depth === 0) { + return tag; + } + } else if (!tag.selfClosing) { + depth += 1; + } + pos = tag.end + 1; + } + return null; +} + +function findPartialToolMarkupStart(text) { + const raw = toStringSafe(text); + const lastLT = raw.lastIndexOf('<'); + if (lastLT < 0) { + return -1; + } + const tail = raw.slice(lastLT); + if (tail.includes('>')) { + return -1; + } + const lowerTail = tail.toLowerCase(); + const candidates = [ + '= 0) { + const end = endRel + closeMarker.length; + out += raw.slice(start, end); + pos = end; + continue; + } + + changed = true; + out += raw.slice(contentStart); + pos = raw.length; + } + + return changed ? out : raw; +} + function parseTagAttributes(raw) { const source = toStringSafe(raw); const out = {}; @@ -631,4 +830,10 @@ module.exports = { stripFencedCodeBlocks, parseMarkupToolCalls, normalizeDSMLToolCallMarkup, + containsToolMarkupSyntaxOutsideIgnored, + containsToolCallWrapperSyntaxOutsideIgnored, + findToolMarkupTagOutsideIgnored, + findMatchingToolMarkupClose, + findPartialToolMarkupStart, + sanitizeLooseCDATA, }; diff --git a/internal/js/helpers/stream-tool-sieve/sieve-xml.js b/internal/js/helpers/stream-tool-sieve/sieve-xml.js index ef69ac1..463e4db 100644 --- a/internal/js/helpers/stream-tool-sieve/sieve-xml.js +++ b/internal/js/helpers/stream-tool-sieve/sieve-xml.js @@ -1,71 +1,53 @@ 'use strict'; const { parseToolCalls } = require('./parse'); - -// XML wrapper tag pair used by the streaming sieve. 
-const XML_TOOL_TAG_PAIRS = [ - { open: '<|dsml|tool_calls', close: '' }, - { open: '<|dsml tool_calls', close: '' }, - { open: '' }, - { open: '' }, - { open: '<|tool_calls', close: '' }, - { open: '<|tool_calls', close: '' }, - { open: '' }, -]; - -const XML_TOOL_OPENING_TAGS = [ - ...XML_TOOL_TAG_PAIRS.map(p => p.open), - '<|dsml|invoke', '<|dsml invoke', ' 0) { - const trimmedFence = trimWrappingJSONFence(prefixPart, suffixPart); - if (!best || openIdx < best.start) { - best = { - start: openIdx, - prefix: trimmedFence.prefix, - calls: parsed, - suffix: trimmedFence.suffix, - }; - } - break; - } - if (!rejected || openIdx < rejected.start) { - rejected = { - start: openIdx, - prefix: prefixPart + xmlBlock, - suffix: suffixPart, + // Scan every recognized wrapper occurrence. Prose can mention a wrapper tag + // before the actual tool block, including the same variant as the real block. + for (let searchFrom = 0; searchFrom < captured.length;) { + const openTag = findFirstToolTag(captured, searchFrom, 'tool_calls', false); + if (!openTag) { + break; + } + const closeTag = findMatchingToolMarkupClose(captured, openTag); + if (!closeTag) { + anyOpenFound = true; + searchFrom = openTag.end + 1; + continue; + } + const xmlBlock = captured.slice(openTag.start, closeTag.end + 1); + const prefixPart = captured.slice(0, openTag.start); + const suffixPart = captured.slice(closeTag.end + 1); + const parsed = parseToolCalls(xmlBlock, toolNames); + if (Array.isArray(parsed) && parsed.length > 0) { + const trimmedFence = trimWrappingJSONFence(prefixPart, suffixPart); + if (!best || openTag.start < best.start) { + best = { + start: openTag.start, + prefix: trimmedFence.prefix, + calls: parsed, + suffix: trimmedFence.suffix, }; } - searchFrom = openIdx + pair.open.length; + break; } + if (!rejected || openTag.start < rejected.start) { + rejected = { + start: openTag.start, + prefix: prefixPart + xmlBlock, + suffix: suffixPart, + }; + } + searchFrom = openTag.end + 1; } if 
(best) { return { ready: true, prefix: best.prefix, calls: best.calls, suffix: best.suffix }; @@ -78,17 +60,15 @@ function consumeXMLToolCapture(captured, toolNames, trimWrappingJSONFence) { // If this block failed to become a tool call, pass it through as text. return { ready: true, prefix: rejected.prefix, calls: [], suffix: rejected.suffix }; } - if (!containsAnyToolCallWrapper(lower)) { - const found = firstInvokeIndex(lower); - if (found.index >= 0) { - const closeTag = found.dsml ? '' : ''; - const openWrapper = found.dsml ? '<|DSML|tool_calls>' : ''; - const closeIdx = findXMLCloseOutsideCDATA(captured, closeTag, found.index); - if (closeIdx > found.index) { - const closeEnd = closeIdx + closeTag.length; - const xmlBlock = openWrapper + captured.slice(found.index, closeIdx) + closeTag; - let prefixPart = captured.slice(0, found.index); - let suffixPart = captured.slice(closeEnd); + const invokeTag = findFirstToolTag(captured, 0, 'invoke', false); + if (invokeTag) { + const wrapperOpen = findFirstToolTag(captured, 0, 'tool_calls', false); + if (!wrapperOpen || wrapperOpen.start > invokeTag.start) { + const closeTag = findFirstToolTag(captured, invokeTag.start + 1, 'tool_calls', true); + if (closeTag && closeTag.start > invokeTag.start) { + const xmlBlock = '' + captured.slice(invokeTag.start, closeTag.end + 1); + const prefixPart = captured.slice(0, invokeTag.start); + const suffixPart = captured.slice(closeTag.end + 1); const parsed = parseToolCalls(xmlBlock, toolNames); if (Array.isArray(parsed) && parsed.length > 0) { const trimmedFence = trimWrappingJSONFence(prefixPart, suffixPart); @@ -99,194 +79,43 @@ function consumeXMLToolCapture(captured, toolNames, trimWrappingJSONFence) { suffix: trimmedFence.suffix, }; } - return { ready: true, prefix: prefixPart + captured.slice(found.index, closeEnd), calls: [], suffix: suffixPart }; + return { ready: true, prefix: prefixPart + captured.slice(invokeTag.start, closeTag.end + 1), calls: [], suffix: suffixPart }; 
} } } return { ready: false, prefix: '', calls: [], suffix: '' }; } -function findMatchingXMLToolWrapperClose(s, openTag, closeTag, openIdx) { - const text = typeof s === 'string' ? s : ''; - const openTarget = String(openTag || '').toLowerCase(); - const closeTarget = String(closeTag || '').toLowerCase(); - if (!text || !openTarget || !closeTarget || openIdx < 0) { - return -1; - } - const lower = text.toLowerCase(); - let depth = 1; - for (let i = openIdx + openTarget.length; i < text.length;) { - if (lower.startsWith('', i + ''.length; - continue; - } - if (lower.startsWith('', i + ''.length; - continue; - } - if (lower.startsWith(closeTarget, i)) { - depth -= 1; - if (depth === 0) { - return i; - } - i += closeTarget.length; - continue; - } - if (lower.startsWith(openTarget, i) && hasXMLToolTagBoundary(text, i + openTarget.length)) { - depth += 1; - i += openTarget.length; - continue; - } - i += 1; - } - return -1; -} - -function findXMLOpenOutsideCDATA(s, openTag, start) { - const text = typeof s === 'string' ? 
s : ''; - const target = String(openTag || '').toLowerCase(); - if (!text || !target) { - return -1; - } - const lower = text.toLowerCase(); - for (let i = Math.max(0, start || 0); i < text.length;) { - if (lower.startsWith('', i + ''.length; - continue; - } - if (lower.startsWith('', i + ''.length; - continue; - } - if (lower.startsWith(target, i) && hasXMLToolTagBoundary(text, i + target.length)) { - return i; - } - i += 1; - } - return -1; -} - -function hasXMLToolTagBoundary(text, idx) { - if (idx >= text.length) { - return true; - } - return [' ', '\t', '\n', '\r', '>', '/'].includes(text[idx]); -} - function hasOpenXMLToolTag(captured) { - for (const pair of XML_TOOL_TAG_PAIRS) { - const openIdx = findXMLOpenOutsideCDATA(captured, pair.open, 0); - if (openIdx >= 0) { - if (findMatchingXMLToolWrapperClose(captured, pair.open, pair.close, openIdx) < 0) { - return true; - } + for (let pos = 0; pos < captured.length;) { + const tag = findFirstToolTag(captured, pos, 'tool_calls', false); + if (!tag) { + return false; } + if (!findMatchingToolMarkupClose(captured, tag)) { + return true; + } + pos = tag.end + 1; } return false; } -function containsAnyToolCallWrapper(lower) { - return lower.includes('= 0 && (dsmlIdx < 0 || idx < dsmlIdx)) { - dsmlIdx = idx; +function findFirstToolTag(text, from, name, closing) { + for (let pos = Math.max(0, from || 0); pos < text.length;) { + const tag = findToolMarkupTagOutsideIgnored(text, pos); + if (!tag) { + return null; } - } - if (xmlIdx < 0) { - return { index: dsmlIdx, dsml: dsmlIdx >= 0 }; - } - if (dsmlIdx < 0) { - return { index: xmlIdx, dsml: false }; - } - if (dsmlIdx < xmlIdx) { - return { index: dsmlIdx, dsml: true }; - } - return { index: xmlIdx, dsml: false }; -} - -function findPartialXMLToolTagStart(s) { - const lastLT = s.lastIndexOf('<'); - if (lastLT < 0) { - return -1; - } - const tail = s.slice(lastLT); - if (tail.includes('>')) { - return -1; - } - const lowerTail = tail.toLowerCase(); - for (const tag of 
XML_TOOL_OPENING_TAGS) { - const tagWithLT = tag.startsWith('<') ? tag : '<' + tag; - if (tagWithLT.startsWith(lowerTail)) { - return lastLT; + if (tag.name === name && tag.closing === closing) { + return tag; } + pos = tag.end + 1; } - return -1; -} - -function findXMLCloseOutsideCDATA(s, closeTag, start) { - const text = typeof s === 'string' ? s : ''; - const target = String(closeTag || '').toLowerCase(); - if (!text || !target) { - return -1; - } - const lower = text.toLowerCase(); - for (let i = Math.max(0, start || 0); i < text.length;) { - if (lower.startsWith('', i + ''.length; - continue; - } - if (lower.startsWith('', i + ''.length; - continue; - } - if (lower.startsWith(target, i)) { - return i; - } - i += 1; - } - return -1; + return null; } module.exports = { consumeXMLToolCapture, hasOpenXMLToolTag, - findPartialXMLToolTagStart, + findPartialXMLToolTagStart: findPartialToolMarkupStart, }; diff --git a/internal/js/helpers/stream-tool-sieve/sieve.js b/internal/js/helpers/stream-tool-sieve/sieve.js index 8a31888..a90a662 100644 --- a/internal/js/helpers/stream-tool-sieve/sieve.js +++ b/internal/js/helpers/stream-tool-sieve/sieve.js @@ -6,8 +6,9 @@ const { } = require('./state'); const { trimWrappingJSONFence } = require('./jsonscan'); const { - XML_TOOL_SEGMENT_TAGS, -} = require('./tool-keywords'); + findToolMarkupTagOutsideIgnored, + sanitizeLooseCDATA, +} = require('./parse_payload'); const { consumeXMLToolCapture: consumeXMLToolCaptureImpl, hasOpenXMLToolTag, @@ -117,8 +118,27 @@ function flushToolSieve(state, toolNames) { } } else if (state.capture) { const content = state.capture; - noteText(state, content); - events.push({ type: 'text', text: content }); + const recovered = sanitizeLooseCDATA(content); + if (recovered !== content) { + const recoveredResult = consumeXMLToolCaptureImpl(recovered, toolNames, trimWrappingJSONFence); + if (recoveredResult.ready && Array.isArray(recoveredResult.calls) && recoveredResult.calls.length > 0) { + if 
(recoveredResult.prefix) { + noteText(state, recoveredResult.prefix); + events.push({ type: 'text', text: recoveredResult.prefix }); + } + events.push({ type: 'tool_calls', calls: recoveredResult.calls }); + if (recoveredResult.suffix) { + noteText(state, recoveredResult.suffix); + events.push({ type: 'text', text: recoveredResult.suffix }); + } + } else { + noteText(state, content); + events.push({ type: 'text', text: content }); + } + } else { + noteText(state, content); + events.push({ type: 'text', text: content }); + } } state.capture = ''; state.capturing = false; @@ -155,26 +175,16 @@ function findToolSegmentStart(state, s) { if (!s) { return -1; } - const lower = s.toLowerCase(); let offset = 0; while (true) { - // Only check XML tool tags. - let bestIdx = -1; - let matchedTag = ''; - for (const tag of XML_TOOL_SEGMENT_TAGS) { - const idx = lower.indexOf(tag, offset); - if (idx >= 0 && (bestIdx < 0 || idx < bestIdx)) { - bestIdx = idx; - matchedTag = tag; - } - } - if (bestIdx < 0) { + const tag = findToolMarkupTagOutsideIgnored(s, offset); + if (!tag) { return -1; } - if (!insideCodeFenceWithState(state, s.slice(0, bestIdx))) { - return bestIdx; + if (!insideCodeFenceWithState(state, s.slice(0, tag.start))) { + return tag.start; } - offset = bestIdx + matchedTag.length; + offset = tag.end + 1; } } diff --git a/internal/js/helpers/stream-tool-sieve/tool-keywords.js b/internal/js/helpers/stream-tool-sieve/tool-keywords.js index 0aaaccb..382e5a2 100644 --- a/internal/js/helpers/stream-tool-sieve/tool-keywords.js +++ b/internal/js/helpers/stream-tool-sieve/tool-keywords.js @@ -3,10 +3,14 @@ const XML_TOOL_SEGMENT_TAGS = [ '<|dsml|tool_calls>', '<|dsml|tool_calls\n', '<|dsml|tool_calls ', '<|dsml|invoke ', '<|dsml|invoke\n', '<|dsml|invoke\t', '<|dsml|invoke\r', + '<|dsmltool_calls>', '<|dsmltool_calls\n', '<|dsmltool_calls ', + '<|dsmlinvoke ', '<|dsmlinvoke\n', '<|dsmlinvoke\t', '<|dsmlinvoke\r', '<|dsml tool_calls>', '<|dsml tool_calls\n', '<|dsml tool_calls 
', '<|dsml invoke ', '<|dsml invoke\n', '<|dsml invoke\t', '<|dsml invoke\r', '', '', '', '', '<|tool_calls\n', '<|tool_calls ', @@ -19,8 +23,10 @@ const XML_TOOL_SEGMENT_TAGS = [ const XML_TOOL_OPENING_TAGS = [ '<|dsml|tool_calls', + '<|dsmltool_calls', '<|dsml tool_calls', '', + '', '', '', + '', '', '', '', diff --git a/internal/toolcall/regression_test.go b/internal/toolcall/regression_test.go index 7615fa3..fc88db0 100644 --- a/internal/toolcall/regression_test.go +++ b/internal/toolcall/regression_test.go @@ -12,9 +12,9 @@ func TestRegression_RobustXMLAndCDATA(t *testing.T) { expected []ParsedToolCall }{ { - name: "Standard JSON parameters (Regression)", + name: "Standard JSON scalar parameters (Regression)", text: `1`, - expected: []ParsedToolCall{{Name: "foo", Input: map[string]any{"a": "1"}}}, + expected: []ParsedToolCall{{Name: "foo", Input: map[string]any{"a": float64(1)}}}, }, { name: "XML tags parameters (Regression)", diff --git a/internal/toolcall/toolcalls_dsml.go b/internal/toolcall/toolcalls_dsml.go index 4801a78..c93e04c 100644 --- a/internal/toolcall/toolcalls_dsml.go +++ b/internal/toolcall/toolcalls_dsml.go @@ -6,96 +6,17 @@ func normalizeDSMLToolCallMarkup(text string) (string, bool) { if text == "" { return "", true } - hasAliasLikeMarkup, _ := toolMarkupStylesOutsideIgnored(text) + hasAliasLikeMarkup, _ := ContainsToolMarkupSyntaxOutsideIgnored(text) if !hasAliasLikeMarkup { return text, true } - // Always normalize DSML aliases to canonical form, even when canonical - // tags coexist. Models frequently mix DSML wrapper tags with canonical - // inner tags (e.g., <|tool_calls>). 
- return replaceDSMLToolMarkupOutsideIgnored(text), true + return rewriteDSMLToolMarkupOutsideIgnored(text), true } -var dsmlToolMarkupAliases = []struct { - from string - to string -}{ - {"<|dsml|tool_calls", "", ""}, - {"<|dsml|invoke", "", ""}, - {"<|dsml|parameter", "", ""}, - {"<|dsml tool_calls", "", ""}, - {"<|dsml invoke", "", ""}, - {"<|dsml parameter", "", ""}, - {"", ""}, - {"", ""}, - {"", ""}, - {"", ""}, - {"", ""}, - {"", ""}, - {"<|tool_calls", "", ""}, - {"<|invoke", "", ""}, - {"<|parameter", "", ""}, - {"<|tool_calls", "", ""}, - {"<|invoke", "", ""}, - {"<|parameter", "", ""}, -} - -var canonicalToolMarkupPrefixes = []string{ - "", - "", - "", -} - -func toolMarkupStylesOutsideIgnored(text string) (hasDSML, hasCanonical bool) { - lower := strings.ToLower(text) - for i := 0; i < len(text); { - next, advanced, blocked := skipXMLIgnoredSection(lower, i) - if blocked { - return hasDSML, hasCanonical - } - if advanced { - i = next - continue - } - if hasPrefixAt(lower, i, canonicalToolMarkupPrefixes) { - hasCanonical = true - } - for _, alias := range dsmlToolMarkupAliases { - if strings.HasPrefix(lower[i:], alias.from) { - hasDSML = true - break - } - } - if hasDSML && hasCanonical { - return true, true - } - i++ +func rewriteDSMLToolMarkupOutsideIgnored(text string) string { + if text == "" { + return "" } - return hasDSML, hasCanonical -} - -func replaceDSMLToolMarkupOutsideIgnored(text string) string { lower := strings.ToLower(text) var b strings.Builder b.Grow(len(text)) @@ -110,29 +31,24 @@ func replaceDSMLToolMarkupOutsideIgnored(text string) string { i = next continue } - replaced := false - for _, alias := range dsmlToolMarkupAliases { - if strings.HasPrefix(lower[i:], alias.from) { - b.WriteString(alias.to) - i += len(alias.from) - replaced = true - break - } - } - if replaced { + tag, ok := scanToolMarkupTagAt(text, i) + if !ok { + b.WriteByte(text[i]) + i++ continue } - b.WriteByte(text[i]) - i++ + if tag.DSMLLike { + b.WriteByte('<') + 
if tag.Closing { + b.WriteByte('/') + } + b.WriteString(tag.Name) + b.WriteString(text[tag.NameEnd : tag.End+1]) + i = tag.End + 1 + continue + } + b.WriteString(text[tag.Start : tag.End+1]) + i = tag.End + 1 } return b.String() } - -func hasPrefixAt(text string, idx int, prefixes []string) bool { - for _, prefix := range prefixes { - if strings.HasPrefix(text[idx:], prefix) { - return true - } - } - return false -} diff --git a/internal/toolcall/toolcalls_markup.go b/internal/toolcall/toolcalls_markup.go index b01ba21..f9f2b4f 100644 --- a/internal/toolcall/toolcalls_markup.go +++ b/internal/toolcall/toolcalls_markup.go @@ -111,5 +111,72 @@ func extractStandaloneCDATA(inner string) (string, bool) { if cdataMatches := cdataPattern.FindStringSubmatch(trimmed); len(cdataMatches) >= 2 { return cdataMatches[1], true } + if strings.HasPrefix(strings.ToLower(trimmed), "" + + var b strings.Builder + b.Grow(len(text)) + changed := false + pos := 0 + for pos < len(text) { + startRel := strings.Index(lower[pos:], openMarker) + if startRel < 0 { + b.WriteString(text[pos:]) + break + } + start := pos + startRel + contentStart := start + len(openMarker) + b.WriteString(text[pos:start]) + + if endRel := strings.Index(lower[contentStart:], closeMarker); endRel >= 0 { + end := contentStart + endRel + len(closeMarker) + b.WriteString(text[start:end]) + pos = end + continue + } + + changed = true + b.WriteString(text[contentStart:]) + pos = len(text) + } + + if !changed { + return text + } + return b.String() +} diff --git a/internal/toolcall/toolcalls_parse.go b/internal/toolcall/toolcalls_parse.go index ff0b87a..f5f9d39 100644 --- a/internal/toolcall/toolcalls_parse.go +++ b/internal/toolcall/toolcalls_parse.go @@ -65,6 +65,12 @@ func parseToolCallsDetailedXMLOnly(text string) ToolCallParseResult { return result } parsed := parseXMLToolCalls(normalized) + if len(parsed) == 0 && strings.Contains(strings.ToLower(normalized), " 0 { - if len(parsed) == 1 { - if rawValue, ok := 
parsed["_raw"].(string); ok { - return rawValue + decoded := html.UnescapeString(extractRawTagValue(trimmed)) + if strings.Contains(decoded, "<") && strings.Contains(decoded, ">") { + if parsed := parseStructuredToolCallInput(decoded); len(parsed) > 0 { + if len(parsed) == 1 { + if rawValue, ok := parsed["_raw"].(string); ok { + return rawValue + } } + return parsed } + } + if parsed, ok := parseJSONLiteralValue(decoded); ok { return parsed } - return html.UnescapeString(extractRawTagValue(trimmed)) + return decoded } diff --git a/internal/toolcall/toolcalls_scan.go b/internal/toolcall/toolcalls_scan.go new file mode 100644 index 0000000..099f73b --- /dev/null +++ b/internal/toolcall/toolcalls_scan.go @@ -0,0 +1,219 @@ +package toolcall + +import "strings" + +var toolMarkupNames = []string{"tool_calls", "invoke", "parameter"} + +type ToolMarkupTag struct { + Start int + End int + NameStart int + NameEnd int + Name string + Closing bool + SelfClosing bool + DSMLLike bool + Canonical bool +} + +func ContainsToolMarkupSyntaxOutsideIgnored(text string) (hasDSML, hasCanonical bool) { + lower := strings.ToLower(text) + for i := 0; i < len(text); { + next, advanced, blocked := skipXMLIgnoredSection(lower, i) + if blocked { + return hasDSML, hasCanonical + } + if advanced { + i = next + continue + } + if tag, ok := scanToolMarkupTagAt(text, i); ok { + if tag.DSMLLike { + hasDSML = true + } else { + hasCanonical = true + } + if hasDSML && hasCanonical { + return true, true + } + i = tag.End + 1 + continue + } + i++ + } + return hasDSML, hasCanonical +} + +func ContainsToolCallWrapperSyntaxOutsideIgnored(text string) (hasDSML, hasCanonical bool) { + lower := strings.ToLower(text) + for i := 0; i < len(text); { + next, advanced, blocked := skipXMLIgnoredSection(lower, i) + if blocked { + return hasDSML, hasCanonical + } + if advanced { + i = next + continue + } + if tag, ok := scanToolMarkupTagAt(text, i); ok { + if tag.Name != "tool_calls" { + i = tag.End + 1 + continue + } 
+ if tag.DSMLLike { + hasDSML = true + } else { + hasCanonical = true + } + if hasDSML && hasCanonical { + return true, true + } + i = tag.End + 1 + continue + } + i++ + } + return hasDSML, hasCanonical +} + +func FindToolMarkupTagOutsideIgnored(text string, start int) (ToolMarkupTag, bool) { + lower := strings.ToLower(text) + for i := maxInt(start, 0); i < len(text); { + next, advanced, blocked := skipXMLIgnoredSection(lower, i) + if blocked { + return ToolMarkupTag{}, false + } + if advanced { + i = next + continue + } + if tag, ok := scanToolMarkupTagAt(text, i); ok { + return tag, true + } + i++ + } + return ToolMarkupTag{}, false +} + +func FindMatchingToolMarkupClose(text string, open ToolMarkupTag) (ToolMarkupTag, bool) { + if text == "" || open.Name == "" || open.Closing { + return ToolMarkupTag{}, false + } + depth := 1 + for pos := open.End + 1; pos < len(text); { + tag, ok := FindToolMarkupTagOutsideIgnored(text, pos) + if !ok { + return ToolMarkupTag{}, false + } + if tag.Name != open.Name { + pos = tag.End + 1 + continue + } + if tag.Closing { + depth-- + if depth == 0 { + return tag, true + } + } else if !tag.SelfClosing { + depth++ + } + pos = tag.End + 1 + } + return ToolMarkupTag{}, false +} + +func scanToolMarkupTagAt(text string, start int) (ToolMarkupTag, bool) { + if start < 0 || start >= len(text) || text[start] != '<' { + return ToolMarkupTag{}, false + } + lower := strings.ToLower(text) + i := start + 1 + closing := false + if i < len(text) && text[i] == '/' { + closing = true + i++ + } + dsmlLike := false + if next, ok := consumeToolMarkupPipe(text, i); ok { + dsmlLike = true + i = next + } + if strings.HasPrefix(lower[i:], "dsml") { + dsmlLike = true + i += len("dsml") + for next, ok := consumeToolMarkupSeparator(text, i); ok; next, ok = consumeToolMarkupSeparator(text, i) { + i = next + } + } + name, nameLen := matchToolMarkupName(lower, i) + if nameLen == 0 { + return ToolMarkupTag{}, false + } + nameEnd := i + nameLen + if 
!hasToolMarkupBoundary(text, nameEnd) { + return ToolMarkupTag{}, false + } + end := findXMLTagEnd(text, nameEnd) + if end < 0 { + return ToolMarkupTag{}, false + } + trimmed := strings.TrimSpace(text[start : end+1]) + return ToolMarkupTag{ + Start: start, + End: end, + NameStart: i, + NameEnd: nameEnd, + Name: name, + Closing: closing, + SelfClosing: strings.HasSuffix(trimmed, "/>"), + DSMLLike: dsmlLike, + Canonical: !dsmlLike, + }, true +} + +func matchToolMarkupName(lower string, start int) (string, int) { + for _, name := range toolMarkupNames { + if strings.HasPrefix(lower[start:], name) { + return name, len(name) + } + } + return "", 0 +} + +func consumeToolMarkupPipe(text string, idx int) (int, bool) { + if idx >= len(text) { + return idx, false + } + if text[idx] == '|' { + return idx + 1, true + } + if strings.HasPrefix(text[idx:], "｜") { + return idx + len("｜"), true + } + return idx, false +} + +func consumeToolMarkupSeparator(text string, idx int) (int, bool) { + if idx >= len(text) { + return idx, false + } + if text[idx] == ' ' || text[idx] == '\t' || text[idx] == '\r' || text[idx] == '\n' { + return idx + 1, true + } + if next, ok := consumeToolMarkupPipe(text, idx); ok { + return next, true + } + return idx, false +} + +func hasToolMarkupBoundary(text string, idx int) bool { + if idx >= len(text) { + return true + } + switch text[idx] { + case ' ', '\t', '\n', '\r', '>', '/': + return true + default: + return false + } +} diff --git a/internal/toolcall/toolcalls_test.go b/internal/toolcall/toolcalls_test.go index 091d9ec..b48f88c 100644 --- a/internal/toolcall/toolcalls_test.go +++ b/internal/toolcall/toolcalls_test.go @@ -53,6 +53,18 @@ func TestParseToolCallsSupportsDSMLShellWithCanonicalExampleInCDATA(t *testing.T } } +func TestParseToolCallsTreatsUnclosedCDATAAsText(t *testing.T) { + text := `` + res := ParseToolCallsDetailed(text, []string{"Write"}) + if len(res.Calls) != 1 { + t.Fatalf("expected unclosed CDATA to still parse via outer
wrapper, got %#v", res.Calls) + } + got, _ := res.Calls[0].Input["content"].(string) + if got != "hello world" { + t.Fatalf("expected recovered CDATA payload, got %q", got) + } +} + func TestParseToolCallsNormalizesMixedDSMLAndCanonicalToolTags(t *testing.T) { // Models commonly mix DSML wrapper tags with canonical inner tags. // These should be normalized and parsed, not rejected. @@ -130,6 +142,23 @@ func TestParseToolCallsSupportsInvokeParameters(t *testing.T) { } } +func TestParseToolCallsSupportsJSONScalarParameters(t *testing.T) { + text := `123true` + calls := ParseToolCalls(text, []string{"configure"}) + if len(calls) != 1 { + t.Fatalf("expected 1 call, got %#v", calls) + } + if got, ok := calls[0].Input["count"].(float64); !ok || got != 123 { + t.Fatalf("expected numeric count, got %#v", calls[0].Input["count"]) + } + if got, ok := calls[0].Input["max_tokens"].(float64); !ok || got != 256 { + t.Fatalf("expected numeric max_tokens, got %#v", calls[0].Input["max_tokens"]) + } + if got, ok := calls[0].Input["enabled"].(bool); !ok || !got { + t.Fatalf("expected boolean enabled, got %#v", calls[0].Input["enabled"]) + } +} + func TestParseToolCallsPreservesRawMalformedParams(t *testing.T) { text := `cd /root && git status` calls := ParseToolCalls(text, []string{"execute_command"}) @@ -478,6 +507,49 @@ func TestParseToolCallsDoesNotAcceptDSMLSpaceLookalikeTagName(t *testing.T) { } } +func TestParseToolCallsToleratesDSMLCollapsedTagNames(t *testing.T) { + todos := `[x] 检查 toolcalls_format.go 格式化逻辑 +[x] 检查 toolcalls_parse.go 解析逻辑 +[x] 检查 toolcalls_xml.go 和 toolcalls_dsml.go +[x] 检查 toolcalls_markup.go 和 toolcalls_json_repair.go +[x] 检查 prompt/tool_calls.go 注入逻辑 +[x] 检查 toolstream 流式解析 +[x] 查看测试文件确认预期行为 +[x] 给出调查结论` + text := strings.Join([]string{ + "[]", + "", + "", + "", + "", + "", + }, "\n") + calls := ParseToolCalls(text, []string{"update_todo_list"}) + if len(calls) != 1 { + t.Fatalf("expected one call from collapsed DSML tags, got %#v", calls) + } + if 
calls[0].Name != "update_todo_list" { + t.Fatalf("expected update_todo_list call, got %#v", calls[0]) + } + if got, _ := calls[0].Input["todos"].(string); got != todos { + t.Fatalf("expected todos to round-trip, got %q", got) + } +} + +func TestParseToolCallsDoesNotAcceptDSMLCollapsedLookalikeTagName(t *testing.T) { + text := strings.Join([]string{ + "", + "", + "x", + "", + "", + }, "\n") + calls := ParseToolCalls(text, []string{"update_todo_list"}) + if len(calls) != 0 { + t.Fatalf("expected no calls from collapsed lookalike tag, got %#v", calls) + } +} + func TestParseToolCallsSkipsProseMentionOfSameWrapperVariant(t *testing.T) { text := strings.Join([]string{ "Summary: support canonical and DSML <|DSML|tool_calls> wrappers.", diff --git a/internal/toolstream/complex_edge_test.go b/internal/toolstream/complex_edge_test.go index c1c6488..759a80f 100644 --- a/internal/toolstream/complex_edge_test.go +++ b/internal/toolstream/complex_edge_test.go @@ -615,3 +615,68 @@ func TestSieve_DSMLSpaceLookalikeTagNameStaysText(t *testing.T) { t.Fatalf("相似标签名应作为正文透传, got %q", text.String()) } } + +func TestSieve_DSMLCollapsedTagNamesWithPrefixText(t *testing.T) { + var state State + todos := `[x] 检查 toolcalls_format.go 格式化逻辑 +[x] 检查 toolcalls_parse.go 解析逻辑 +[x] 检查 toolcalls_xml.go 和 toolcalls_dsml.go +[x] 检查 toolcalls_markup.go 和 toolcalls_json_repair.go +[x] 检查 prompt/tool_calls.go 注入逻辑 +[x] 检查 toolstream 流式解析 +[x] 查看测试文件确认预期行为 +[x] 给出调查结论` + chunks := []string{ + "[]\n", + "\n", + "\n", + "\n", + "\n", + "", + } + var events []Event + for _, c := range chunks { + events = append(events, ProcessChunk(&state, c, []string{"update_todo_list"})...) + } + events = append(events, Flush(&state, []string{"update_todo_list"})...) 
+ + var text strings.Builder + var gotTodos string + callCount := 0 + for _, e := range events { + text.WriteString(e.Content) + for _, call := range e.ToolCalls { + callCount++ + gotTodos, _ = call.Input["todos"].(string) + } + } + if callCount != 1 { + t.Fatalf("应解析出 1 个工具调用,got %d, text=%q", callCount, text.String()) + } + if gotTodos != todos { + t.Fatalf("todos 应完整保留,got %q", gotTodos) + } + if text.String() != "[]\n" { + t.Fatalf("前置正文应完整保留且不泄漏工具块, got %q", text.String()) + } +} + +func TestSieve_DSMLCollapsedLookalikeTagNameStaysText(t *testing.T) { + var state State + input := "x" + events := ProcessChunk(&state, input, []string{"update_todo_list"}) + events = append(events, Flush(&state, []string{"update_todo_list"})...) + + var text strings.Builder + callCount := 0 + for _, e := range events { + text.WriteString(e.Content) + callCount += len(e.ToolCalls) + } + if callCount != 0 { + t.Fatalf("相似 collapsed 标签名不应触发工具调用,got %d", callCount) + } + if text.String() != input { + t.Fatalf("相似 collapsed 标签名应作为正文透传, got %q", text.String()) + } +} diff --git a/internal/toolstream/tool_sieve_core.go b/internal/toolstream/tool_sieve_core.go index 3f77b8e..a228c13 100644 --- a/internal/toolstream/tool_sieve_core.go +++ b/internal/toolstream/tool_sieve_core.go @@ -114,10 +114,30 @@ func Flush(state *State, toolNames []string) []Event { } else { content := state.capture.String() if content != "" { - // If capture never resolved into a real tool call, release the - // buffered text instead of swallowing it. 
- state.noteText(content) - events = append(events, Event{Content: content}) + recovered := toolcall.SanitizeLooseCDATA(content) + if recovered != content { + if prefix, calls, suffix, recoveredReady := consumeXMLToolCapture(recovered, toolNames); recoveredReady && len(calls) > 0 { + if prefix != "" { + state.noteText(prefix) + events = append(events, Event{Content: prefix}) + } + events = append(events, Event{ToolCalls: calls}) + if suffix != "" { + state.noteText(suffix) + events = append(events, Event{Content: suffix}) + } + } else { + // If capture never resolved into a real tool call, release + // the buffered text instead of swallowing it. + state.noteText(content) + events = append(events, Event{Content: content}) + } + } else { + // If capture never resolved into a real tool call, release the + // buffered text instead of swallowing it. + state.noteText(content) + events = append(events, Event{Content: content}) + } } } state.capture.Reset() diff --git a/internal/toolstream/tool_sieve_xml.go b/internal/toolstream/tool_sieve_xml.go index 06fc469..9a6789e 100644 --- a/internal/toolstream/tool_sieve_xml.go +++ b/internal/toolstream/tool_sieve_xml.go @@ -7,7 +7,6 @@ import ( // consumeXMLToolCapture tries to extract complete XML tool call blocks from captured text. func consumeXMLToolCapture(captured string, toolNames []string) (prefix string, calls []toolcall.ParsedToolCall, suffix string, ready bool) { - lower := strings.ToLower(captured) anyOpenFound := false type candidate struct { start int @@ -23,41 +22,40 @@ func consumeXMLToolCapture(captured string, toolNames []string) (prefix string, var best *candidate var rejected *rejectedBlock - // Scan every wrapper occurrence. Prose can mention a wrapper tag before the - // actual tool block, including the same variant as the real block. 
- for _, pair := range xmlToolCallTagPairs { - searchFrom := 0 - for searchFrom < len(lower) { - openIdx := findXMLOpenOutsideCDATA(captured, pair.open, searchFrom) - if openIdx < 0 { - break - } - // Find the matching closing tag outside CDATA. Long write-file tool - // calls often contain XML examples in CDATA, including . - closeIdx := findMatchingXMLToolWrapperClose(captured, pair.open, pair.close, openIdx) - if closeIdx < 0 { - anyOpenFound = true - searchFrom = openIdx + len(pair.open) - continue - } - closeEnd := closeIdx + len(pair.close) - - xmlBlock := captured[openIdx:closeEnd] - prefixPart := captured[:openIdx] - suffixPart := captured[closeEnd:] - parsed := toolcall.ParseToolCalls(xmlBlock, toolNames) - if len(parsed) > 0 { - prefixPart, suffixPart = trimWrappingJSONFence(prefixPart, suffixPart) - if best == nil || openIdx < best.start { - best = &candidate{start: openIdx, prefix: prefixPart, calls: parsed, suffix: suffixPart} - } - break - } - if rejected == nil || openIdx < rejected.start { - rejected = &rejectedBlock{start: openIdx, prefix: prefixPart + xmlBlock, suffix: suffixPart} - } - searchFrom = openIdx + len(pair.open) + // Scan every recognized tool tag occurrence. Prose can mention a wrapper + // tag before the actual tool block, including the same variant as the real + // block. We only accept complete tool_calls wrappers that parse cleanly. 
+ for searchFrom := 0; searchFrom < len(captured); { + tag, ok := toolcall.FindToolMarkupTagOutsideIgnored(captured, searchFrom) + if !ok { + break } + if tag.Closing || tag.Name != "tool_calls" { + searchFrom = tag.End + 1 + continue + } + closeTag, ok := toolcall.FindMatchingToolMarkupClose(captured, tag) + if !ok { + anyOpenFound = true + searchFrom = tag.End + 1 + continue + } + + xmlBlock := captured[tag.Start : closeTag.End+1] + prefixPart := captured[:tag.Start] + suffixPart := captured[closeTag.End+1:] + parsed := toolcall.ParseToolCalls(xmlBlock, toolNames) + if len(parsed) > 0 { + prefixPart, suffixPart = trimWrappingJSONFence(prefixPart, suffixPart) + if best == nil || tag.Start < best.start { + best = &candidate{start: tag.Start, prefix: prefixPart, calls: parsed, suffix: suffixPart} + } + break + } + if rejected == nil || tag.Start < rejected.start { + rejected = &rejectedBlock{start: tag.Start, prefix: prefixPart + xmlBlock, suffix: suffixPart} + } + searchFrom = tag.End + 1 } if best != nil { return best.prefix, best.calls, best.suffix, true @@ -71,26 +69,19 @@ func consumeXMLToolCapture(captured string, toolNames []string) (prefix string, // If this block failed to become a tool call, pass it through as text. 
return rejected.prefix, nil, rejected.suffix, true } - if !containsAnyToolCallWrapper(lower) { - invokeIdx, dsml := firstInvokeIndex(lower) - closeTag := "" - openWrapper := "" - if dsml { - closeTag = "" - openWrapper = "<|DSML|tool_calls>" - } - closeIdx := findXMLCloseOutsideCDATA(captured, closeTag, invokeIdx) - if invokeIdx >= 0 && closeIdx > invokeIdx { - closeEnd := closeIdx + len(closeTag) - xmlBlock := openWrapper + captured[invokeIdx:closeIdx] + closeTag - prefixPart := captured[:invokeIdx] - suffixPart := captured[closeEnd:] - parsed := toolcall.ParseToolCalls(xmlBlock, toolNames) - if len(parsed) > 0 { - prefixPart, suffixPart = trimWrappingJSONFence(prefixPart, suffixPart) - return prefixPart, parsed, suffixPart, true + if invokeTag, ok := findFirstToolMarkupTagByName(captured, 0, "invoke"); ok { + if wrapperOpen, ok := findFirstToolMarkupTagByName(captured, 0, "tool_calls"); !ok || wrapperOpen.Start > invokeTag.Start { + if closeTag, ok := findFirstToolMarkupTagByNameFrom(captured, invokeTag.Start+1, "tool_calls", true); ok && closeTag.Start > invokeTag.Start { + xmlBlock := "" + captured[invokeTag.Start:closeTag.End+1] + prefixPart := captured[:invokeTag.Start] + suffixPart := captured[closeTag.End+1:] + parsed := toolcall.ParseToolCalls(xmlBlock, toolNames) + if len(parsed) > 0 { + prefixPart, suffixPart = trimWrappingJSONFence(prefixPart, suffixPart) + return prefixPart, parsed, suffixPart, true + } + return prefixPart + captured[invokeTag.Start:closeTag.End+1], nil, suffixPart, true } - return prefixPart + captured[invokeIdx:closeEnd], nil, suffixPart, true } } return "", nil, "", false @@ -99,46 +90,35 @@ func consumeXMLToolCapture(captured string, toolNames []string) (prefix string, // hasOpenXMLToolTag returns true if captured text contains an XML tool opening tag // whose SPECIFIC closing tag has not appeared yet. 
func hasOpenXMLToolTag(captured string) bool { - for _, pair := range xmlToolCallTagPairs { - openIdx := findXMLOpenOutsideCDATA(captured, pair.open, 0) - if openIdx >= 0 { - if findMatchingXMLToolWrapperClose(captured, pair.open, pair.close, openIdx) < 0 { - return true - } + for searchFrom := 0; searchFrom < len(captured); { + tag, ok := toolcall.FindToolMarkupTagOutsideIgnored(captured, searchFrom) + if !ok { + return false } + if tag.Closing || tag.Name != "tool_calls" { + searchFrom = tag.End + 1 + continue + } + if _, ok := toolcall.FindMatchingToolMarkupClose(captured, tag); !ok { + return true + } + searchFrom = tag.End + 1 } return false } func shouldKeepBareInvokeCapture(captured string) bool { - lower := strings.ToLower(captured) - invokeIdx, dsml := firstInvokeIndex(lower) - if invokeIdx < 0 || containsAnyToolCallWrapper(lower) { + invokeTag, ok := findFirstToolMarkupTagByName(captured, 0, "invoke") + if !ok { return false } - invokeOpenLen := len(" invokeIdx { + if closeTag, ok := findFirstToolMarkupTagByNameFrom(captured, invokeTag.Start+1, "tool_calls", true); ok && closeTag.Start > invokeTag.Start { return true } - - startEnd := findXMLTagEnd(captured, invokeIdx+invokeOpenLen) + startEnd := invokeTag.End if startEnd < 0 { return true } @@ -148,84 +128,16 @@ func shouldKeepBareInvokeCapture(captured string) bool { return true } - invokeCloseIdx := findAnyXMLCloseOutsideCDATA(captured, possibleInvokeCloseTags(dsml), startEnd+1) - if invokeCloseIdx >= 0 { - afterClose := captured[invokeCloseIdx:] - for _, closeTag := range possibleInvokeCloseTags(dsml) { - if strings.HasPrefix(strings.ToLower(afterClose), closeTag) { - afterClose = afterClose[len(closeTag):] - break - } - } - return strings.TrimSpace(afterClose) == "" + if invokeCloseTag, ok := findFirstToolMarkupTagByNameFrom(captured, startEnd+1, "invoke", true); ok { + return strings.TrimSpace(captured[invokeCloseTag.End+1:]) == "" } trimmedLower := strings.ToLower(trimmedBody) - return 
strings.HasPrefix(trimmedLower, parameterOpen) || + return strings.HasPrefix(trimmedLower, ""} - } - return []string{"", "", "", "", "", ""} -} - -func possibleInvokeCloseTags(dsml bool) []string { - if !dsml { - return []string{""} - } - return []string{"", "", "", "", "", ""} -} - -func findAnyXMLCloseOutsideCDATA(s string, closeTags []string, start int) int { - best := -1 - for _, closeTag := range closeTags { - idx := findXMLCloseOutsideCDATA(s, closeTag, start) - if idx >= 0 && (best < 0 || idx < best) { - best = idx - } - } - return best -} - -func firstInvokeIndex(lower string) (int, bool) { - xmlIdx := strings.Index(lower, "= 0 && (dsmlIdx < 0 || idx < dsmlIdx) { - dsmlIdx = idx - } - } - switch { - case xmlIdx < 0: - return dsmlIdx, dsmlIdx >= 0 - case dsmlIdx < 0: - return xmlIdx, false - case dsmlIdx < xmlIdx: - return dsmlIdx, true - default: - return xmlIdx, false - } -} - -// findPartialXMLToolTagStart checks if the string ends with a partial canonical -// XML wrapper tag (e.g., "") - if end < 0 { - return -1 - } - i += len("") - case strings.HasPrefix(lower[i:], "") - if end < 0 { - return -1 - } - i += len("") - case strings.HasPrefix(lower[i:], closeTarget): - depth-- - if depth == 0 { - return i - } - i += len(closeTarget) - case strings.HasPrefix(lower[i:], openTarget) && hasXMLToolTagBoundary(s, i+len(openTarget)): - depth++ - i += len(openTarget) - default: - i++ - } - } - return -1 +func findFirstToolMarkupTagByName(s string, start int, name string) (toolcall.ToolMarkupTag, bool) { + return findFirstToolMarkupTagByNameFrom(s, start, name, false) } -func findXMLOpenOutsideCDATA(s, openTag string, start int) int { - if s == "" || openTag == "" { - return -1 - } - if start < 0 { - start = 0 - } - lower := strings.ToLower(s) - target := strings.ToLower(openTag) - for i := start; i < len(s); { - switch { - case strings.HasPrefix(lower[i:], "") - if end < 0 { - return -1 - } - i += len("") - case strings.HasPrefix(lower[i:], "") - if end < 0 { - 
return -1 - } - i += len("") - case strings.HasPrefix(lower[i:], target) && hasXMLToolTagBoundary(s, i+len(target)): - return i - default: - i++ +func findFirstToolMarkupTagByNameFrom(s string, start int, name string, closing bool) (toolcall.ToolMarkupTag, bool) { + for pos := maxInt(start, 0); pos < len(s); { + tag, ok := toolcall.FindToolMarkupTagOutsideIgnored(s, pos) + if !ok { + return toolcall.ToolMarkupTag{}, false } + if tag.Name == name && tag.Closing == closing { + return tag, true + } + pos = tag.End + 1 } - return -1 + return toolcall.ToolMarkupTag{}, false } -func findXMLCloseOutsideCDATA(s, closeTag string, start int) int { - if s == "" || closeTag == "" { - return -1 +func maxInt(a, b int) int { + if a > b { + return a } - if start < 0 { - start = 0 - } - lower := strings.ToLower(s) - target := strings.ToLower(closeTag) - for i := start; i < len(s); { - switch { - case strings.HasPrefix(lower[i:], "") - if end < 0 { - return -1 - } - i += len("") - case strings.HasPrefix(lower[i:], "") - if end < 0 { - return -1 - } - i += len("") - case strings.HasPrefix(lower[i:], target): - return i - default: - i++ - } - } - return -1 -} - -func hasXMLToolTagBoundary(text string, idx int) bool { - if idx >= len(text) { - return true - } - switch text[idx] { - case ' ', '\t', '\n', '\r', '>', '/': - return true - default: - return false - } -} - -func findXMLTagEnd(s string, start int) int { - quote := byte(0) - for i := start; i < len(s); i++ { - ch := s[i] - if quote != 0 { - if ch == quote { - quote = 0 - } - continue - } - if ch == '"' || ch == '\'' { - quote = ch - continue - } - if ch == '>' { - return i - } - } - return -1 + return b } diff --git a/internal/toolstream/tool_sieve_xml_tags.go b/internal/toolstream/tool_sieve_xml_tags.go index 6a9a19c..d4179bd 100644 --- a/internal/toolstream/tool_sieve_xml_tags.go +++ b/internal/toolstream/tool_sieve_xml_tags.go @@ -5,28 +5,7 @@ import "regexp" // --- XML tool call support for the streaming sieve --- 
//nolint:unused // kept as explicit tag inventory for future XML sieve refinements. -var xmlToolCallClosingTags = []string{"", "", "", "", "", "", ""} -var xmlToolCallOpeningTags = []string{ - ""}, - {"<|dsml tool_calls", ""}, - {""}, - {""}, - {"<|tool_calls", ""}, - {"<|tool_calls", ""}, - {""}, -} +var xmlToolCallClosingTags = []string{"", "", "", "", "", "", "", "", ""} // xmlToolCallBlockPattern matches a complete canonical XML tool call block. // @@ -37,10 +16,14 @@ var xmlToolCallBlockPattern = regexp.MustCompile(`(?is)((?:", "<|dsml|tool_calls\n", "<|dsml|tool_calls ", "<|dsml|invoke ", "<|dsml|invoke\n", "<|dsml|invoke\t", "<|dsml|invoke\r", + "<|dsmltool_calls>", "<|dsmltool_calls\n", "<|dsmltool_calls ", + "<|dsmlinvoke ", "<|dsmlinvoke\n", "<|dsmlinvoke\t", "<|dsmlinvoke\r", "<|dsml tool_calls>", "<|dsml tool_calls\n", "<|dsml tool_calls ", "<|dsml invoke ", "<|dsml invoke\n", "<|dsml invoke\t", "<|dsml invoke\r", "", "", "", "", "<|tool_calls\n", "<|tool_calls ", diff --git a/internal/toolstream/tool_sieve_xml_test.go b/internal/toolstream/tool_sieve_xml_test.go index 0b9a7bb..efcf56d 100644 --- a/internal/toolstream/tool_sieve_xml_test.go +++ b/internal/toolstream/tool_sieve_xml_test.go @@ -174,6 +174,41 @@ func TestProcessToolSieveKeepsCDATAEmbeddedToolClosingBuffered(t *testing.T) { } } +func TestProcessToolSieveFallsBackWhenCDATANeverCloses(t *testing.T) { + var state State + chunks := []string{ + "\n \n \n \n", + } + var events []Event + for _, c := range chunks { + events = append(events, ProcessChunk(&state, c, []string{"Write"})...) + } + events = append(events, Flush(&state, []string{"Write"})...) 
+ + var textContent strings.Builder + toolCalls := 0 + for _, evt := range events { + if evt.Content != "" { + textContent.WriteString(evt.Content) + } + toolCalls += len(evt.ToolCalls) + if len(evt.ToolCalls) > 0 { + if got, _ := evt.ToolCalls[0].Input["content"].(string); got != "hello world" { + t.Fatalf("expected recovered CDATA payload, got %q", got) + } + } + } + + if toolCalls != 1 { + t.Fatalf("expected unclosed CDATA payload to still parse, got %d tool calls events=%#v", toolCalls, events) + } + if textContent.Len() != 0 { + t.Fatalf("expected no leaked text, got %q", textContent.String()) + } +} + func TestProcessToolSieveXMLWithLeadingText(t *testing.T) { var state State // Model outputs some prose then an XML tool call. diff --git a/tests/node/stream-tool-sieve.test.js b/tests/node/stream-tool-sieve.test.js index dabaae2..1938984 100644 --- a/tests/node/stream-tool-sieve.test.js +++ b/tests/node/stream-tool-sieve.test.js @@ -71,6 +71,30 @@ test('parseToolCalls ignores DSML space lookalike tag names', () => { assert.equal(calls.length, 0); }); +test('parseToolCalls tolerates collapsed DSML tag names', () => { + const todos = [ + '[x] 检查 toolcalls_format.go 格式化逻辑', + '[x] 检查 toolcalls_parse.go 解析逻辑', + '[x] 检查 toolcalls_xml.go 和 toolcalls_dsml.go', + '[x] 检查 toolcalls_markup.go 和 toolcalls_json_repair.go', + '[x] 检查 prompt/tool_calls.go 注入逻辑', + '[x] 检查 toolstream 流式解析', + '[x] 查看测试文件确认预期行为', + '[x] 给出调查结论', + ].join('\n'); + const payload = ``; + const calls = parseToolCalls(payload, ['update_todo_list']); + assert.equal(calls.length, 1); + assert.equal(calls[0].name, 'update_todo_list'); + assert.equal(calls[0].input.todos, todos); +}); + +test('parseToolCalls ignores collapsed DSML lookalike tag names', () => { + const payload = 'x'; + const calls = parseToolCalls(payload, ['update_todo_list']); + assert.equal(calls.length, 0); +}); + test('parseToolCalls keeps canonical XML examples inside DSML CDATA', () => { const content = 'x'; const payload = 
`<|DSML|tool_calls><|DSML|invoke name="write_file"><|DSML|parameter name="path">notes.md<|DSML|parameter name="content">`; @@ -80,6 +104,24 @@ test('parseToolCalls keeps canonical XML examples inside DSML CDATA', () => { assert.deepEqual(calls[0].input, { path: 'notes.md', content }); }); +test('parseToolCalls recovers when CDATA never closes inside a valid wrapper', () => { + const payload = ''; + const calls = parseToolCalls(payload, ['Write']); + assert.equal(calls.length, 1); + assert.equal(calls[0].name, 'Write'); + assert.equal(calls[0].input.content, 'hello world'); +}); + +test('parseToolCalls supports JSON scalar parameters', () => { + const payload = '123true'; + const calls = parseToolCalls(payload, ['configure']); + assert.equal(calls.length, 1); + assert.equal(calls[0].name, 'configure'); + assert.equal(calls[0].input.count, 123); + assert.equal(calls[0].input.max_tokens, 256); + assert.equal(calls[0].input.enabled, true); +}); + test('parseToolCalls normalizes mixed DSML and XML tool tags', () => { // Models commonly mix DSML wrapper tags with canonical inner tags. 
const payload = '<|DSML|tool_calls><|DSML|parameter name="path">README.MD'; @@ -147,6 +189,41 @@ test('sieve keeps DSML space lookalike tag names as text', () => { assert.equal(collectText(events), input); }); +test('sieve emits tool_calls for collapsed DSML tag names and preserves prefix text', () => { + const todos = [ + '[x] 检查 toolcalls_format.go 格式化逻辑', + '[x] 检查 toolcalls_parse.go 解析逻辑', + '[x] 检查 toolcalls_xml.go 和 toolcalls_dsml.go', + '[x] 检查 toolcalls_markup.go 和 toolcalls_json_repair.go', + '[x] 检查 prompt/tool_calls.go 注入逻辑', + '[x] 检查 toolstream 流式解析', + '[x] 查看测试文件确认预期行为', + '[x] 给出调查结论', + ].join('\n'); + const events = runSieve([ + '[]\n', + '\n', + '\n', + `\n`, + '\n', + '', + ], ['update_todo_list']); + const text = collectText(events); + const finalCalls = events.filter((evt) => evt.type === 'tool_calls').flatMap((evt) => evt.calls || []); + assert.equal(finalCalls.length, 1); + assert.equal(finalCalls[0].name, 'update_todo_list'); + assert.equal(finalCalls[0].input.todos, todos); + assert.equal(text, '[]\n'); +}); + +test('sieve keeps collapsed DSML lookalike tag names as text', () => { + const input = 'x'; + const events = runSieve([input], ['update_todo_list']); + const finalCalls = events.filter((evt) => evt.type === 'tool_calls').flatMap((evt) => evt.calls || []); + assert.equal(finalCalls.length, 0); + assert.equal(collectText(events), input); +}); + test('sieve preserves review body with alias mentions before real DSML tool calls', () => { const events = runSieve([ "Done reviewing the diff. 
Here's my analysis before we commit:\n\n", @@ -277,6 +354,23 @@ test('sieve keeps long XML tool calls buffered until the closing tag arrives', ( assert.equal(finalCalls[0].input.content, longContent); }); +test('sieve recovers when CDATA never closes inside a valid wrapper', () => { + const events = runSieve( + [ + '\n \n \n \n', + ], + ['Write'], + ); + const leakedText = collectText(events); + const finalCalls = events.filter((evt) => evt.type === 'tool_calls').flatMap((evt) => evt.calls || []); + assert.equal(finalCalls.length, 1); + assert.equal(finalCalls[0].name, 'Write'); + assert.equal(finalCalls[0].input.content, 'hello world'); + assert.equal(leakedText, ''); +}); + test('sieve keeps CDATA tool examples buffered until the outer closing tag arrives', () => { const content = [ '# DS2API 4.0 更新内容',