feat(toolcall): prioritize XML for model output and parsing

2026-05-05 08:55:28 +08:00 · 2026-03-29 10:53:38 +08:00
parent 6e8f3185d5
commit 958f4e39b5
7 changed files with 90 additions and 54 deletions
--- a/docs/toolcall-semantics.md
+++ b/docs/toolcall-semantics.md
@@ -1,41 +1,68 @@
-# Tool call parsing semantics (Go canonical spec)
+# Tool call parsing semantics（Go/Node 统一语义）

-This document defines the cross-runtime contract for `ParseToolCallsDetailed` / `parseToolCallsDetailed`.
+本文档描述当前代码中 `ParseToolCallsDetailed` / `parseToolCallsDetailed` 的**实际行为**，用于对齐 Go 与 Node Runtime。

-## Output contract
+## 1) 输出结构（当前实现）

- `calls`: accepted tool calls with normalized tool names.
- `sawToolCallSyntax`: true when tool-call-like syntax is detected (`tool_calls`, `<tool_call>`, `<function_call>`, `<invoke>`) or a valid call is parsed.
- `rejectedByPolicy`: true when parser extracted call syntax but all calls are rejected by allow-list policy.
- `rejectedToolNames`: de-duplicated rejected tool names in first-seen order.
+- `calls`：解析得到的工具调用列表（`name` + `input`）。
+- `sawToolCallSyntax`：检测到工具调用语法特征时为 `true`（例如 `tool_calls`、`<tool_call>`、`<function_call>`、`<invoke>`、`function.name:`）。
+- `rejectedByPolicy`：当前实现固定为 `false`（预留字段，尚未启用 allow-list 拒绝）。
+- `rejectedToolNames`：当前实现固定为空数组（预留字段）。

-## Parse pipeline
+> 说明：`filterToolCallsDetailed` 当前仅做结构清洗，不做工具名策略拒绝。

-1. Strip fenced code blocks for non-standalone parsing.
-2. Build candidates from:
-   - full text,
-   - fenced JSON snippets,
-   - extracted JSON objects around `tool_calls`,
-   - first `{` to last `}` object slice.
-3. Parse each candidate in order:
-   - JSON payload parser (`tool_calls`, list, single call object),
-   - XML/Markup parser (`<tool_call>`, `<function_call>`, `<invoke>`; supports attributes + nested fields),
-   - Text KV fallback parser (`function.name: <name>` ... `function.arguments: {json}`).
-4. Stop at first candidate that yields at least one call.
+## 2) 解析管线

-## Name normalization policy
+1. **示例保护**：若判定为 fenced code block 示例上下文，则跳过执行型解析。
+2. **候选片段构建**：从完整文本中构建候选（原文、围绕 `tool_calls` 的 JSON 片段、首尾大括号切片等）。
+3. **按序尝试解析（命中即停）**：
+   - XML 解析（`<tool_call>` / `<function_call>` / `<invoke>` / `tool_use` / `antml:function_call` 等）；
+   - JSON 解析（`{"tool_calls": [...]}`、列表、单对象）；
+   - Markup 解析；
+   - Text-KV 回退（如 `function.name:` + `function.arguments:`）。
+4. **兜底**：候选全部失败后，再对全文做 XML / Text-KV 回退。

-When matching parsed names against configured tools:
+## 3) XML 能力边界（当前）

-1. exact match,
-2. case-insensitive match,
-3. namespace tail match (`a.b.c` => `c`),
-4. loose alnum match (remove non `[a-z0-9]`, compare).
+当前已支持输入端的“多 XML/标记风格”解析，包括但不限于：

-## Standalone mode
+- `<tool_call><tool_name>...</tool_name><parameters>...</parameters></tool_call>`
+- `<function_call>tool</function_call><function parameter name="x">...</function parameter>`
+- `<invoke name="tool"><parameter name="x">...</parameter></invoke>`
+- `antml:function_call` / `antml:argument` / `antml:parameters`
+- `tool_use` 家族标签

-Standalone mode (`ParseStandaloneToolCallsDetailed`) parses the whole input directly (no candidate slicing), while still applying:
+但**输出端仍统一转换为 OpenAI 兼容 JSON 事件/对象**（`message.tool_calls`、`delta.tool_calls`、`response.function_call_arguments.*`）。

- example-context guard,
- JSON then markup fallback,
- the same allow-list normalization policy.
+## 4) 关于“是否可以封装成 XML 再喂给模型”
+
+结论：**可以做，而且建议把 XML 作为模型侧第一优先格式**，同时保持“输入兼容层 + 输出按客户端协议渲染”。
+
+推荐架构：
+
+1. **Prompt 约束层**（默认开启）：强约束模型优先输出规范 XML tool block（例如 `<tool_calls><tool_call>...</tool_call></tool_calls>`）。
+2. **解析兼容层**（已具备基础）：继续在 parser 中同时接受 JSON/XML/ANTML/invoke/text-kv。
+3. **协议归一层**（必须）：无论模型输出什么格式，统一落到内部 `ParsedToolCall`。
+4. **对外渲染层**：根据客户端请求协议渲染（OpenAI/Claude/Gemini 各自格式）。
+
+这样可以同时获得：
+
+- 减少模型端 JSON 转义/引号错误；
+- 不破坏现有 SDK/客户端生态；
+- 逐步灰度（按模型、按租户、按请求开关）。
+
+## 5) 落地建议（低风险迭代）
+
+- 新增配置项：`toolcall.prefer_xml_output`（默认 `false`）。
+- 对 `true` 场景在系统提示词里加入 XML 模板；保留 JSON 模板作为回退。
+- 增加观测指标：
+  - `toolcall_parse_source`（json/xml/markup/textkv）；
+  - `toolcall_parse_success_rate`；
+  - `toolcall_malformed_rate`；
+  - `toolcall_repair_rate`。
+- 先在 `responses` 链路灰度，再扩展 `chat.completions`。
+
+## 6) 兼容性提醒
+
+- 上游模型若输出混合文本 + XML，仍可能出现“半结构化”噪声，需要依赖现有 sieve 增量消费策略。
+- XML 不等于安全：仍需做 tool 名、参数 schema、执行权限的服务端校验。