From 7475defeca5a7642940aa7fe93f6364cd6f75fc8 Mon Sep 17 00:00:00 2001
From: CJACK
Date: Sun, 26 Apr 2026 04:26:51 +0800
Subject: [PATCH] fix: align tool call protocol and thinking controls

---
 API.en.md | 4 +-
 API.md | 4 +-
 README.MD | 4 +-
 README.en.md | 4 +-
 docs/ARCHITECTURE.en.md | 2 +-
 docs/ARCHITECTURE.md | 2 +-
 docs/prompt-compatibility.md | 24 ++-
 docs/toolcall-semantics.md | 101 +++++----
 internal/adapter/claude/handler_messages.go | 22 ++
 .../adapter/claude/handler_stream_test.go | 6 +-
 internal/adapter/claude/handler_util_test.go | 6 +-
 internal/adapter/claude/proxy_vercel_test.go | 20 ++
 .../adapter/openai/handler_toolcall_test.go | 4 +-
 internal/adapter/openai/history_split.go | 9 +-
 internal/adapter/openai/history_split_test.go | 6 +-
 .../adapter/openai/message_normalize_test.go | 12 +-
 internal/adapter/openai/prompt_build_test.go | 6 +-
 .../adapter/openai/responses_stream_test.go | 4 +-
 .../adapter/openai/standard_request_test.go | 16 ++
 internal/adapter/openai/tool_sieve_xml.go | 36 +---
 .../adapter/openai/tool_sieve_xml_test.go | 146 ++++++-------
 .../js/helpers/stream-tool-sieve/parse.js | 2 +-
 .../stream-tool-sieve/parse_payload.js | 76 +++++--
 .../js/helpers/stream-tool-sieve/sieve-xml.js | 9 +-
 .../stream-tool-sieve/tool-keywords.js | 6 +-
 internal/prompt/tool_calls.go | 111 ++++++++--
 internal/prompt/tool_calls_test.go | 6 +-
 internal/sse/consumer_edge_test.go | 15 ++
 internal/sse/parser.go | 21 ++
 internal/toolcall/regression_test.go | 18 +-
 internal/toolcall/tool_prompt.go | 197 +++++++++---------
 internal/toolcall/tool_prompt_test.go | 8 +-
 internal/toolcall/toolcalls_parse.go | 7 +-
 internal/toolcall/toolcalls_parse_markup.go | 104 ++++++---
 internal/toolcall/toolcalls_test.go | 63 +++---
 internal/util/thinking.go | 63 ++++--
 internal/util/thinking_test.go | 13 +-
 .../toolcalls_canonical_nested_param.json | 14 ++
 .../toolcalls_canonical_tool_call.json | 13 ++
 .../expected/toolcalls_function_call_tag.json | 6 -
 .../expected/toolcalls_invoke_attr.json | 6 -
 .../expected/toolcalls_xml_tool_call.json | 6 -
 ...olcalls_xml_tool_name_parameters_json.json | 6 -
 .../toolcalls/canonical_nested_param.json | 6 +
 .../toolcalls/canonical_tool_call.json | 6 +
 .../fixtures/toolcalls/function_call_tag.json | 6 -
 .../fixtures/toolcalls/invoke_attr.json | 6 -
 .../fixtures/toolcalls/xml_tool_call.json | 6 -
 .../xml_tool_name_parameters_json.json | 6 -
 tests/node/stream-tool-sieve.test.js | 42 ++--
 .../meta.json | 2 +-
 51 files changed, 799 insertions(+), 489 deletions(-)
 create mode 100644 tests/compat/expected/toolcalls_canonical_nested_param.json
 create mode 100644 tests/compat/expected/toolcalls_canonical_tool_call.json
 delete mode 100644 tests/compat/expected/toolcalls_function_call_tag.json
 delete mode 100644 tests/compat/expected/toolcalls_invoke_attr.json
 delete mode 100644 tests/compat/expected/toolcalls_xml_tool_call.json
 delete mode 100644 tests/compat/expected/toolcalls_xml_tool_name_parameters_json.json
 create mode 100644 tests/compat/fixtures/toolcalls/canonical_nested_param.json
 create mode 100644 tests/compat/fixtures/toolcalls/canonical_tool_call.json
 delete mode 100644 tests/compat/fixtures/toolcalls/function_call_tag.json
 delete mode 100644 tests/compat/fixtures/toolcalls/invoke_attr.json
 delete mode 100644 tests/compat/fixtures/toolcalls/xml_tool_call.json
 delete mode 100644 tests/compat/fixtures/toolcalls/xml_tool_name_parameters_json.json

diff --git a/API.en.md b/API.en.md
index 15b491c..3e0173f 100644
--- a/API.en.md
+++ b/API.en.md
@@ -37,7 +37,7 @@ Docs: [Overview](README.en.md) / [Architecture](docs/ARCHITECTURE.en.md) / [Depl
 - OpenAI / Claude / Gemini protocols are now mounted on one shared `chi` router tree assembled in `internal/server/router.go`.
 - Adapter responsibilities are streamlined to: **request normalization → DeepSeek invocation → protocol-shaped rendering**, reducing legacy split-logic paths.
-- Tool-calling semantics are aligned between Go and Node runtime: parsing is now centered on XML/Markup-family tool syntax (`` / `` / `` / `tool_use` / antml variants), plus stream-time anti-leak filtering. +- Tool-calling semantics are aligned between Go and Node runtime: the only executable model-output syntax is the canonical XML tool block `` → `` → ``, plus stream-time anti-leak filtering. - `Admin API` separates static config from runtime policy: `/admin/config*` for configuration state, `/admin/settings*` for runtime behavior. --- @@ -330,7 +330,7 @@ When `tools` is present, DS2API performs anti-leak handling: Additional notes: -- The parser currently follows XML/Markup-family tool payloads (``, ``, ``, `tool_use`, antml variants). Standalone JSON `tool_calls` payloads are not treated as executable tool calls by default. +- The parser currently treats only canonical XML tool blocks (`` / `` / ``) as executable tool calls. Legacy ``, ``, ``, ``, ``, `tool_use`, antml variants, and standalone JSON `tool_calls` payloads are treated as plain text. - `tool_calls` shown inside fenced markdown code blocks (for example, ```json ... ```) are treated as examples, not executable calls. 
--- diff --git a/API.md b/API.md index e309eff..d09eb27 100644 --- a/API.md +++ b/API.md @@ -37,7 +37,7 @@ - OpenAI / Claude / Gemini 三套协议已统一挂在同一 `chi` 路由树上,由 `internal/server/router.go` 负责装配。 - 适配器层职责收敛为:**请求归一化 → DeepSeek 调用 → 协议形态渲染**,减少历史版本中“同能力多处实现”的分叉。 -- Tool Calling 的解析策略在 Go 与 Node Runtime 间保持一致:当前以 XML/Markup 家族解析为主(含 `` / `` / `` / `tool_use` / antml 变体),并在流式场景执行防泄漏筛分。 +- Tool Calling 的解析策略在 Go 与 Node Runtime 间保持一致:当前唯一可执行的模型输出语法是 canonical XML 工具块 `` → `` → ``,并在流式场景执行防泄漏筛分。 - `Admin API` 将配置与运行时策略分开:`/admin/config*` 管静态配置,`/admin/settings*` 管运行时行为。 --- @@ -333,7 +333,7 @@ data: [DONE] 补充说明: - **非代码块上下文**下,工具负载即使与普通文本混合,也会按特征识别并产出可执行 tool call(前后普通文本仍可透传)。 -- 解析器当前走 XML/Markup 家族(包含 ``、``、``、`tool_use`、antml 风格);纯 JSON `tool_calls` 片段默认不会直接作为可执行调用解析。 +- 解析器当前只把 canonical XML 工具块(`` / `` / ``)作为可执行调用解析;旧式 ``、``、``、``、``、`tool_use`、antml 风格与纯 JSON `tool_calls` 片段默认都会按普通文本处理。 - Markdown fenced code block(例如 ```json ... ```)中的 `tool_calls` 仅视为示例文本,不会被执行。 --- diff --git a/README.MD b/README.MD index 2e78110..e7ba7d4 100644 --- a/README.MD +++ b/README.MD @@ -142,7 +142,7 @@ flowchart LR - `ANTHROPIC_BASE_URL` 推荐直接指向 DS2API 根地址(例如 `http://127.0.0.1:5001`),Claude Code 会请求 `/v1/messages?beta=true`。 - `ANTHROPIC_API_KEY` 需要与 `config.json` 中 `keys` 一致;建议同时保留常规 key 与 `sk-ant-*` 形态 key,兼容不同客户端校验习惯。 - 若系统设置了代理,建议对 DS2API 地址配置 `NO_PROXY=127.0.0.1,localhost,<你的主机IP>`,避免本地回环请求被代理拦截。 -- 如遇“工具调用输出成文本、未执行”问题,请优先检查模型输出是否为受支持的 XML/Markup 工具块(例如 `` / `` / `` / `tool_use`),而不是纯 JSON `tool_calls` 片段。 +- 如遇“工具调用输出成文本、未执行”问题,请优先检查模型输出是否为当前唯一受支持的 XML 工具块:`...`,而不是旧式 `` / `` / `` / ``、``、`tool_use` 或纯 JSON `tool_calls` 片段。 ### Gemini 接口 @@ -409,7 +409,7 @@ Gemini 路由还可以使用 `x-goog-api-key`,或在没有认证头时使用 ` 当请求中带 `tools` 时,DS2API 会做防泄漏处理与结构化转译: 1. 只在**非代码块上下文**启用执行型 toolcall 识别(代码块示例默认不触发) -2. 解析层当前以 XML/Markup 家族为准(`` / `` / `` / `tool_use` / antml 变体);纯 JSON `tool_calls` 片段默认不作为可执行调用解析 +2. 
解析层当前只把 canonical XML 工具块视为可执行调用:`` → `` → ``;旧式 `` / `` / `` / ``、``、`tool_use` / antml 变体与纯 JSON `tool_calls` 片段都会按普通文本处理 3. `responses` 流式严格使用官方 item 生命周期事件(`response.output_item.*`、`response.content_part.*`、`response.function_call_arguments.*`) 4. `responses` 支持并执行 `tool_choice`(`auto`/`none`/`required`/强制函数);`required` 违规时非流式返回 `422`,流式返回 `response.failed` 5. 客户端请求哪种协议,就按该协议返回工具调用(OpenAI/Claude/Gemini 各自原生结构);模型侧优先约束输出规范 XML,再由兼容层转译 diff --git a/README.en.md b/README.en.md index f2e77cb..68f3017 100644 --- a/README.en.md +++ b/README.en.md @@ -138,7 +138,7 @@ Besides the current primary aliases above, `/anthropic/v1/models` also returns C - Set `ANTHROPIC_BASE_URL` to the DS2API root URL (for example `http://127.0.0.1:5001`). Claude Code sends requests to `/v1/messages?beta=true`. - `ANTHROPIC_API_KEY` must match an entry in `keys` from `config.json`. Keeping both a regular key and an `sk-ant-*` style key improves client compatibility. - If your environment has proxy variables, set `NO_PROXY=127.0.0.1,localhost,` for DS2API to avoid proxy interception of local traffic. -- If tool calls are rendered as plain text and not executed, first verify the model output uses supported XML/Markup tool blocks (`` / `` / `` / `tool_use`) rather than standalone JSON `tool_calls`. +- If tool calls are rendered as plain text and not executed, first verify the model output uses the only supported XML block: `...`, not legacy `` / `` / `` / ``, ``, `tool_use`, or standalone JSON `tool_calls`. ### Gemini Endpoint @@ -383,7 +383,7 @@ Queue limit = DS2API_ACCOUNT_MAX_QUEUE (default = recommended concurrency) When `tools` is present in the request, DS2API performs anti-leak handling: 1. Toolcall feature matching is enabled only in **non-code-block context** (fenced examples are ignored) -2. The parser currently targets XML/Markup-family tool syntax (`` / `` / `` / `tool_use` / antml variants); standalone JSON `tool_calls` payloads are not treated as executable calls by default +2. 
The parser now treats only the canonical XML wrapper as executable tool-calling syntax: `` → `` → ``; legacy `` / `` / `` / ``, ``, `tool_use`, antml variants, and standalone JSON `tool_calls` payloads are treated as plain text 3. `responses` streaming strictly uses official item lifecycle events (`response.output_item.*`, `response.content_part.*`, `response.function_call_arguments.*`) 4. `responses` supports and enforces `tool_choice` (`auto`/`none`/`required`/forced function); `required` violations return `422` for non-stream and `response.failed` for stream 5. The output protocol follows the client request (OpenAI / Claude / Gemini native shapes); model-side prompting can prefer XML, and the compatibility layer handles the protocol-specific translation diff --git a/docs/ARCHITECTURE.en.md b/docs/ARCHITECTURE.en.md index 81bb928..651098f 100644 --- a/docs/ARCHITECTURE.en.md +++ b/docs/ARCHITECTURE.en.md @@ -116,7 +116,7 @@ flowchart LR - `internal/translatorcliproxy`: structure translation between Claude/Gemini and OpenAI. - `internal/deepseek`: upstream request/session/PoW/SSE handling. - `internal/stream` + `internal/sse`: stream parsing and incremental assembly. -- `internal/toolcall`: XML/Markup-family tool-call parsing + anti-leak sieve (`` / `` / `` / `tool_use` / antml variants). +- `internal/toolcall`: canonical XML tool-call parsing + anti-leak sieve (the only executable format is `` / `` / ``). - `internal/admin`: config/accounts/vercel sync/version/dev-capture endpoints. - `internal/config`: config loading/validation + runtime settings hot-reload. - `internal/account`: managed account pool, inflight slots, waiting queue. 
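
The documentation hunks above describe a single executable shape: a wrapper, one element per call carrying the tool name in a `name` attribute, and one element per argument. A minimal sketch of that rule follows. It assumes the tag names are `tool_calls`, `invoke`, and `parameter`, which is what the test names in this patch (`tool_calls_tag`, `invoke_parameter_wrapper`) suggest; the angle-bracket tags themselves did not survive in this copy of the patch. `parseCanonicalToolCalls` and `toolCall` are hypothetical names for illustration, not the functions in `internal/toolcall`, and nested parameter values are not handled.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"strings"
)

// toolCall is a hypothetical, simplified result type for this sketch.
type toolCall struct {
	Name   string
	Params map[string]string
}

// The structs below mirror the assumed canonical shape:
// tool_calls -> invoke[name] -> parameter[name].
type xmlParameter struct {
	Name  string `xml:"name,attr"`
	Value string `xml:",chardata"`
}

type xmlInvoke struct {
	Name       string         `xml:"name,attr"`
	Parameters []xmlParameter `xml:"parameter"`
}

type xmlToolCalls struct {
	XMLName xml.Name    `xml:"tool_calls"`
	Invokes []xmlInvoke `xml:"invoke"`
}

// parseCanonicalToolCalls accepts only a block whose root element is
// tool_calls. Any other root makes xml.Unmarshal return an error, which
// a caller could treat as "plain text, do not execute".
func parseCanonicalToolCalls(block string) ([]toolCall, error) {
	var root xmlToolCalls
	if err := xml.Unmarshal([]byte(strings.TrimSpace(block)), &root); err != nil {
		return nil, err
	}
	calls := make([]toolCall, 0, len(root.Invokes))
	for _, inv := range root.Invokes {
		name := strings.TrimSpace(inv.Name)
		if name == "" {
			continue // an invoke without a name attribute is not executable
		}
		params := make(map[string]string, len(inv.Parameters))
		for _, p := range inv.Parameters {
			params[p.Name] = strings.TrimSpace(p.Value)
		}
		calls = append(calls, toolCall{Name: name, Params: params})
	}
	return calls, nil
}

func main() {
	canonical := `<tool_calls><invoke name="read_file"><parameter name="path">README.MD</parameter></invoke></tool_calls>`
	calls, err := parseCanonicalToolCalls(canonical)
	fmt.Println(len(calls), err)

	// A differently named wrapper is rejected rather than repaired.
	_, err = parseCanonicalToolCalls(`<tools><tool_call>read_file</tool_call></tools>`)
	fmt.Println(err != nil)
}
```

Rejecting a non-matching root instead of trying to repair it matches the stated intent that non-canonical output stays plain text; the executor should still validate tool names and arguments on its side.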
diff --git a/docs/ARCHITECTURE.md b/docs/ARCHITECTURE.md index b439127..757ae6b 100644 --- a/docs/ARCHITECTURE.md +++ b/docs/ARCHITECTURE.md @@ -116,7 +116,7 @@ flowchart LR - `internal/translatorcliproxy`:Claude/Gemini 与 OpenAI 结构互转。 - `internal/deepseek`:上游请求、会话、PoW、SSE 消费。 - `internal/stream` + `internal/sse`:流式解析与增量处理。 -- `internal/toolcall`:以 XML/Markup 家族为核心的工具调用解析与防泄漏筛分(`` / `` / `` / `tool_use` / antml 变体)。 +- `internal/toolcall`:canonical XML 工具调用解析与防泄漏筛分(唯一可执行格式:`` / `` / ``)。 - `internal/admin`:配置管理、账号管理、Vercel 同步、版本检查、开发抓包。 - `internal/config`:配置加载、校验、运行时 settings 热更新。 - `internal/account`:托管账号池、并发槽位、等待队列。 diff --git a/docs/prompt-compatibility.md b/docs/prompt-compatibility.md index 1b03ca0..c9baaa6 100644 --- a/docs/prompt-compatibility.md +++ b/docs/prompt-compatibility.md @@ -96,6 +96,7 @@ DS2API 当前的核心思路,不是把客户端传来的 `messages`、`tools` - `prompt` 才是对话上下文主载体。 - `ref_file_ids` 只承载文件引用,不承载普通文本消息。 - `tools` 不会作为“原生工具 schema”直接下发给下游,而是被改写进 `prompt`。 +- 客户端显式传入的 thinking / reasoning 开关会被归一到下游 `thinking_enabled`;关闭时即使上游返回 `response/thinking_content`,兼容层也不会把它当作可见正文输出。 ## 5. prompt 是怎么拼出来的 @@ -178,16 +179,15 @@ assistant 的 reasoning 会变成一个显式标签块: assistant 历史 `tool_calls` 不会保留成 OpenAI 原生 JSON,而会转成 prompt 可见的 XML: ```xml - - - read_file - - - - - + + + + + ``` +这也是当前项目里唯一受支持的 canonical tool-calling 形态;其他形态都会作为普通文本保留,不会作为可执行调用语法。 + 这件事很重要,因为它决定了: - 历史工具调用在 prompt 中是“可见文本历史” @@ -242,15 +242,17 @@ OpenAI 文件相关实现: 1. 旧历史消息被切出去。 2. 旧历史会被重新序列化成一个文本文件。 -3. 文件名固定是 `IGNORE`。 -4. 该文件上传后,其 `file_id` 会排在 `ref_file_ids` 最前面。 -5. live prompt 只保留: +3. 真正上传的文件名固定是 `HISTORY.txt`。 +4. 文件内容内部会使用 `IGNORE` 这层包装名来闭合 DeepSeek 官网原生文件标记。 +5. 该文件上传后,其 `file_id` 会排在 `ref_file_ids` 最前面。 +6. live prompt 只保留: - system / developer - 最新 user turn 起的上下文 历史文件内容不是普通自由文本,而是用同一套角色标记再次序列化出的 transcript: ```text +[uploaded filename]: HISTORY.txt [file content end] <|begin▁of▁sentence|><|User|>...<|Assistant|>...<|Tool|>... 
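
The `prompt-compatibility.md` hunk above states that an explicit client thinking / reasoning switch is normalized to the downstream `thinking_enabled` flag. The sketch below illustrates one way such a resolver could look, covering only the two cases this patch's tests exercise: `thinking.type` set to `enabled` or `disabled`, and `reasoning.effort` set to `none`. `resolveThinkingOverride` is a hypothetical stand-in; the real `util.ResolveThinkingOverride` is not shown in this part of the patch and may accept more inputs.

```go
package main

import (
	"fmt"
	"strings"
)

// resolveThinkingOverride is a hypothetical illustration. It reports
// (enabled, ok), where ok is false when the request carries no explicit
// switch that this sketch recognizes.
func resolveThinkingOverride(req map[string]any) (enabled bool, ok bool) {
	// Claude-style switch: {"thinking": {"type": "enabled" | "disabled"}}
	if t, isMap := req["thinking"].(map[string]any); isMap {
		if typ, isStr := t["type"].(string); isStr {
			switch strings.ToLower(strings.TrimSpace(typ)) {
			case "enabled":
				return true, true
			case "disabled":
				return false, true
			}
		}
	}
	// OpenAI Responses-style switch: {"reasoning": {"effort": "none"}}.
	// Other effort values are left undecided in this sketch.
	if r, isMap := req["reasoning"].(map[string]any); isMap {
		if effort, isStr := r["effort"].(string); isStr {
			if strings.ToLower(strings.TrimSpace(effort)) == "none" {
				return false, true
			}
		}
	}
	return false, false
}

func main() {
	fmt.Println(resolveThinkingOverride(map[string]any{
		"thinking": map[string]any{"type": "disabled"},
	}))
	fmt.Println(resolveThinkingOverride(map[string]any{
		"reasoning": map[string]any{"effort": "none"},
	}))
	fmt.Println(resolveThinkingOverride(map[string]any{"model": "gpt-4o"}))
}
```

Returning a separate `ok` flag keeps "no explicit switch" distinct from "explicitly disabled", so a caller can fall back to the model default only in the first case.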
diff --git a/docs/toolcall-semantics.md b/docs/toolcall-semantics.md index 889e3ca..5d0b619 100644 --- a/docs/toolcall-semantics.md +++ b/docs/toolcall-semantics.md @@ -1,74 +1,67 @@ # Tool call parsing semantics(Go/Node 统一语义) -本文档描述当前代码中工具调用解析链路的**实际行为**(以 `internal/toolcall` 与 `internal/js/helpers/stream-tool-sieve` 为准)。 +本文档描述当前代码中的**实际行为**,以 `internal/toolcall` 与 `internal/js/helpers/stream-tool-sieve` 为准。 文档导航:[总览](../README.MD) / [架构说明](./ARCHITECTURE.md) / [测试指南](./TESTING.md) -## 1) 当前输出结构 +## 1) 当前唯一可执行格式 + +当前版本只把下面这类 canonical XML 视为可执行工具调用: + +```xml + + + + + +``` + +约束: + +- 必须有 `...` wrapper +- 每个调用必须在 `...` 内 +- 工具名必须放在 `invoke` 的 `name` 属性 +- 参数必须使用 `...` + +## 2) 非 canonical 内容 + +任何不满足上述 canonical XML 形态的内容,都会保留为普通文本,不会执行。 + +## 3) 流式与防泄漏行为 + +在流式链路中(Go / Node 一致): + +- 只有从 ` 当前 `filterToolCallsDetailed` 仅做结构清洗,不做 allow-list 工具名硬拒绝。 +## 5) 落地建议 -## 2) 解析范围(重点) +1. Prompt 里只示范 canonical XML 语法。 +2. 上游客户端需要直接输出 canonical XML;DS2API 不会把其他形态改写成工具调用。 +3. 不要依赖 parser 做安全控制;执行器侧仍应做工具名和参数校验。 -当前版本的可执行解析以 **XML/Markup 家族**为主: - -- `...` -- `...` -- `...`(含自闭合) -- `...` -- antml 变体(如 `antml:function_call` / `antml:argument`) - -并支持在这些标记块内部解析: - -- JSON 参数字符串 -- 标签参数(`...`) -- key/value 风格子标签 - -## 3) 不应再假设的行为 - -以下说法在当前实现中已不成立: - -1. “纯 JSON `tool_calls` 片段会被直接当作可执行工具调用解析”。 -2. “存在 `toolcall.mode` / `toolcall.early_emit_confidence` 等可配置开关可以改变解析策略”。 - -当前策略在代码中固定为: - -- 特征匹配开启(feature-match on) -- 高置信度早发开启(early emit on) -- policy 拒绝字段保留但未启用 - -## 4) 流式与防泄漏语义 - -在流式链路中(OpenAI / Claude / Gemini 统一内核): - -- 工具调用片段会被优先提取为结构化增量输出; -- 已识别的工具调用原始片段不会作为普通文本再次回流; -- fenced code block 中的示例内容按文本处理,不作为可执行工具调用。 - -## 5) 落地建议(按当前实现) - -1. Prompt 里优先约束模型输出 XML/Markup 工具块。 -2. 执行器侧继续做工具名白名单与参数 schema 校验(不要依赖 parser 代替安全策略)。 -3. 
需要兼容历史“纯 JSON tool_calls”模型输出时,请在上游模板层把输出规范化为 XML/Markup 风格再进入 DS2API。 - -## 6) 回归验证建议 +## 6) 回归验证 可直接运行: ```bash -go test -v -run 'TestParseToolCalls|TestRepair' ./internal/toolcall/ +go test -v -run 'TestParseToolCalls|TestProcessToolSieve' ./internal/toolcall ./internal/adapter/openai node --test tests/node/stream-tool-sieve.test.js ``` 重点覆盖: -- `` / `` / `` / `tool_use` / antml 变体 -- 参数 JSON 修复与解析 -- 流式增量下的工具调用提取与文本防泄漏 +- canonical `` wrapper 正常解析 +- 非 canonical 内容按普通文本透传 +- 代码块示例不执行 diff --git a/internal/adapter/claude/handler_messages.go b/internal/adapter/claude/handler_messages.go index 6ae23ab..d4f099e 100644 --- a/internal/adapter/claude/handler_messages.go +++ b/internal/adapter/claude/handler_messages.go @@ -52,6 +52,7 @@ func (h *Handler) proxyViaOpenAI(w http.ResponseWriter, r *http.Request, store C } } translatedReq := translatorcliproxy.ToOpenAI(sdktranslator.FormatClaude, translateModel, raw, stream) + translatedReq = applyExplicitThinkingOverrideToOpenAIRequest(translatedReq, req) isVercelPrepare := strings.TrimSpace(r.URL.Query().Get("__stream_prepare")) == "1" isVercelRelease := strings.TrimSpace(r.URL.Query().Get("__stream_release")) == "1" @@ -123,6 +124,27 @@ func (h *Handler) proxyViaOpenAI(w http.ResponseWriter, r *http.Request, store C return true } +func applyExplicitThinkingOverrideToOpenAIRequest(translated []byte, original map[string]any) []byte { + enabled, ok := util.ResolveThinkingOverride(original) + if !ok { + return translated + } + req := map[string]any{} + if err := json.Unmarshal(translated, &req); err != nil { + return translated + } + typ := "disabled" + if enabled { + typ = "enabled" + } + req["thinking"] = map[string]any{"type": typ} + out, err := json.Marshal(req) + if err != nil { + return translated + } + return out +} + func (h *Handler) handleClaudeStreamRealtime(w http.ResponseWriter, r *http.Request, resp *http.Response, model string, messages []any, thinkingEnabled, searchEnabled bool, toolNames []string) { defer 
func() { _ = resp.Body.Close() }() if resp.StatusCode != http.StatusOK { diff --git a/internal/adapter/claude/handler_stream_test.go b/internal/adapter/claude/handler_stream_test.go index aabc2f3..354ed89 100644 --- a/internal/adapter/claude/handler_stream_test.go +++ b/internal/adapter/claude/handler_stream_test.go @@ -251,14 +251,14 @@ func TestHandleClaudeStreamRealtimeToolSafetyAcrossStructuredFormats(t *testing. payload string wantToolUse bool }{ - {name: "canonical_tools_wrapper", payload: `Bashpwd`, wantToolUse: true}, + {name: "invoke_parameter_wrapper", payload: `pwd`, wantToolUse: true}, {name: "legacy_single_tool_root", payload: `Bashpwd`, wantToolUse: false}, {name: "legacy_tool_call_json", payload: `{"tool":"Bash","params":{"command":"pwd"}}`, wantToolUse: false}, {name: "legacy_nested_tool_tag_style", payload: `pwd`, wantToolUse: false}, {name: "legacy_function_tag_style", payload: `Bashpwd`, wantToolUse: false}, {name: "legacy_antml_argument_style", payload: `pwd`, wantToolUse: false}, {name: "legacy_antml_function_attr_parameters", payload: `{"command":"pwd"}`, wantToolUse: false}, - {name: "legacy_invoke_parameter_style", payload: `pwd`, wantToolUse: false}, + {name: "legacy_function_calls_wrapper", payload: `pwd`, wantToolUse: false}, } for _, tc := range tests { @@ -291,7 +291,7 @@ func TestHandleClaudeStreamRealtimeToolSafetyAcrossStructuredFormats(t *testing. 
func TestHandleClaudeStreamRealtimeDetectsToolUseWithLeadingProse(t *testing.T) { h := &Handler{} - payload := "I'll call a tool now.\\nwrite_file{\\\"path\\\":\\\"/tmp/a.txt\\\",\\\"content\\\":\\\"abc\\\"}" + payload := "I'll call a tool now.\\n/tmp/a.txtabc" resp := makeClaudeSSEHTTPResponse( `data: {"p":"response/content","v":"`+payload+`"}`, `data: [DONE]`, diff --git a/internal/adapter/claude/handler_util_test.go b/internal/adapter/claude/handler_util_test.go index 7efa8dd..68f68ca 100644 --- a/internal/adapter/claude/handler_util_test.go +++ b/internal/adapter/claude/handler_util_test.go @@ -93,10 +93,10 @@ func TestNormalizeClaudeMessagesToolUseToAssistantToolCalls(t *testing.T) { t.Fatalf("expected call id preserved, got %#v", call) } content, _ := m["content"].(string) - if !containsStr(content, "") || !containsStr(content, "search_web") { + if !containsStr(content, "") || !containsStr(content, ``) { t.Fatalf("expected assistant content to include XML tool call history, got %q", content) } - if !containsStr(content, "\n \n ") { + if !containsStr(content, ``) { t.Fatalf("expected assistant content to include serialized parameters, got %q", content) } } @@ -292,7 +292,7 @@ func TestBuildClaudeToolPromptSingleTool(t *testing.T) { if !containsStr(prompt, "Search the web") { t.Fatalf("expected description in prompt") } - if !containsStr(prompt, "") { + if !containsStr(prompt, "") { t.Fatalf("expected XML tool_calls format in prompt") } if !containsStr(prompt, "TOOL CALL FORMAT") { diff --git a/internal/adapter/claude/proxy_vercel_test.go b/internal/adapter/claude/proxy_vercel_test.go index 67e62de..750e092 100644 --- a/internal/adapter/claude/proxy_vercel_test.go +++ b/internal/adapter/claude/proxy_vercel_test.go @@ -106,6 +106,26 @@ func TestClaudeProxyViaOpenAIUsesGlobalAliasMapping(t *testing.T) { } } +func TestClaudeProxyViaOpenAIPreservesThinkingOverride(t *testing.T) { + openAI := &openAIProxyCaptureStub{} + h := &Handler{ + Store: 
claudeProxyStoreStub{aliases: map[string]string{"claude-sonnet-4-6": "deepseek-v4-flash"}}, + OpenAI: openAI, + } + req := httptest.NewRequest(http.MethodPost, "/anthropic/v1/messages", strings.NewReader(`{"model":"claude-sonnet-4-6","messages":[{"role":"user","content":"hi"}],"thinking":{"type":"disabled"},"stream":false}`)) + rec := httptest.NewRecorder() + + h.Messages(rec, req) + + if rec.Code != http.StatusOK { + t.Fatalf("unexpected status: %d body=%s", rec.Code, rec.Body.String()) + } + thinking, _ := openAI.seenReq["thinking"].(map[string]any) + if thinking["type"] != "disabled" { + t.Fatalf("expected translated OpenAI request to preserve disabled thinking, got %#v", openAI.seenReq) + } +} + func TestClaudeProxyTranslatesInlineImageToOpenAIDataURL(t *testing.T) { openAI := &openAIProxyCaptureStub{} h := &Handler{OpenAI: openAI} diff --git a/internal/adapter/openai/handler_toolcall_test.go b/internal/adapter/openai/handler_toolcall_test.go index e2015fa..4810caa 100644 --- a/internal/adapter/openai/handler_toolcall_test.go +++ b/internal/adapter/openai/handler_toolcall_test.go @@ -217,8 +217,8 @@ func TestHandleStreamIncompleteCapturedToolJSONFlushesAsTextOnFinalize(t *testin func TestHandleStreamEmitsDistinctToolCallIDsAcrossSeparateToolBlocks(t *testing.T) { h := &Handler{} resp := makeSSEHTTPResponse( - `data: {"p":"response/content","v":"前置文本\n\n \n read_file\n {\"path\":\"README.MD\"}\n \n"}`, - `data: {"p":"response/content","v":"中间文本\n\n \n search\n {\"q\":\"golang\"}\n \n"}`, + `data: {"p":"response/content","v":"前置文本\n\n \n README.MD\n \n"}`, + `data: {"p":"response/content","v":"中间文本\n\n \n golang\n \n"}`, `data: [DONE]`, ) rec := httptest.NewRecorder() diff --git a/internal/adapter/openai/history_split.go b/internal/adapter/openai/history_split.go index e40ff1e..8589e63 100644 --- a/internal/adapter/openai/history_split.go +++ b/internal/adapter/openai/history_split.go @@ -12,9 +12,10 @@ import ( ) const ( - historySplitFilename = "IGNORE" - 
historySplitContentType = "text/plain; charset=utf-8" - historySplitPurpose = "assistants" + historySplitFilename = "HISTORY.txt" + historySplitInjectedFilename = "IGNORE" + historySplitContentType = "text/plain; charset=utf-8" + historySplitPurpose = "assistants" ) func (h *Handler) applyHistorySplit(ctx context.Context, a *auth.RequestAuth, stdReq util.StandardRequest) (util.StandardRequest, error) { @@ -114,7 +115,7 @@ func buildOpenAIHistoryTranscript(messages []any) string { if transcript == "" { return "" } - return fmt.Sprintf("[file content end]\n\n%s\n\n[file name]: %s\n[file content begin]\n", transcript, historySplitFilename) + return fmt.Sprintf("[file content end]\n\n%s\n\n[file name]: %s\n[file content begin]\n", transcript, historySplitInjectedFilename) } func prependUniqueRefFileID(existing []string, fileID string) []string { diff --git a/internal/adapter/openai/history_split_test.go b/internal/adapter/openai/history_split_test.go index 554346b..5703d75 100644 --- a/internal/adapter/openai/history_split_test.go +++ b/internal/adapter/openai/history_split_test.go @@ -76,7 +76,7 @@ func TestBuildOpenAIHistoryTranscriptUsesInjectedFileWrapper(t *testing.T) { if !strings.Contains(transcript, "[reasoning_content]") || !strings.Contains(transcript, "hidden reasoning") { t.Fatalf("expected reasoning block preserved, got %q", transcript) } - if !strings.Contains(transcript, "") { + if !strings.Contains(transcript, "") { t.Fatalf("expected tool calls preserved, got %q", transcript) } if !strings.HasSuffix(transcript, "\n[file name]: IGNORE\n[file content begin]\n") { @@ -180,7 +180,7 @@ func TestApplyHistorySplitCarriesHistoryText(t *testing.T) { } } -func TestChatCompletionsHistorySplitUploadsIgnoreFileAndKeepsLatestPrompt(t *testing.T) { +func TestChatCompletionsHistorySplitUploadsHistoryFileAndKeepsLatestPrompt(t *testing.T) { ds := &inlineUploadDSStub{} h := &Handler{ Store: mockOpenAIConfig{ @@ -210,7 +210,7 @@ func 
TestChatCompletionsHistorySplitUploadsIgnoreFileAndKeepsLatestPrompt(t *tes t.Fatalf("expected 1 upload call, got %d", len(ds.uploadCalls)) } upload := ds.uploadCalls[0] - if upload.Filename != "IGNORE" { + if upload.Filename != "HISTORY.txt" { t.Fatalf("unexpected upload filename: %q", upload.Filename) } if upload.Purpose != "assistants" { diff --git a/internal/adapter/openai/message_normalize_test.go b/internal/adapter/openai/message_normalize_test.go index b7a4bb6..1354a9e 100644 --- a/internal/adapter/openai/message_normalize_test.go +++ b/internal/adapter/openai/message_normalize_test.go @@ -38,10 +38,10 @@ func TestNormalizeOpenAIMessagesForPrompt_AssistantToolCallsAndToolResult(t *tes t.Fatalf("expected 4 normalized messages with assistant tool history preserved, got %d", len(normalized)) } assistantContent, _ := normalized[2]["content"].(string) - if !strings.Contains(assistantContent, "") { + if !strings.Contains(assistantContent, "") { t.Fatalf("assistant tool history should be preserved in XML form, got %q", assistantContent) } - if !strings.Contains(assistantContent, "get_weather") { + if !strings.Contains(assistantContent, ``) { t.Fatalf("expected tool name in preserved history, got %q", assistantContent) } if !strings.Contains(normalized[3]["content"].(string), `"temp":18`) { @@ -49,7 +49,7 @@ func TestNormalizeOpenAIMessagesForPrompt_AssistantToolCallsAndToolResult(t *tes } prompt := util.MessagesPrepare(normalized) - if !strings.Contains(prompt, "") { + if !strings.Contains(prompt, "") { t.Fatalf("expected preserved assistant tool history in prompt: %q", prompt) } } @@ -177,10 +177,10 @@ func TestNormalizeOpenAIMessagesForPrompt_AssistantMultipleToolCallsRemainSepara t.Fatalf("expected assistant tool_call-only message preserved, got %#v", normalized) } content, _ := normalized[0]["content"].(string) - if strings.Count(content, "") != 2 { + if strings.Count(content, "search_web") || !strings.Contains(content, "eval_javascript") { + if 
!strings.Contains(content, ``) || !strings.Contains(content, ``) { t.Fatalf("expected both tool names in preserved history, got %q", content) } } @@ -258,7 +258,7 @@ func TestNormalizeOpenAIMessagesForPrompt_AssistantNilContentDoesNotInjectNullLi if strings.Contains(content, "null") { t.Fatalf("expected no null literal injection, got %q", content) } - if !strings.Contains(content, "") { + if !strings.Contains(content, "") { t.Fatalf("expected assistant tool history in normalized content, got %q", content) } } diff --git a/internal/adapter/openai/prompt_build_test.go b/internal/adapter/openai/prompt_build_test.go index 989399c..cf345e8 100644 --- a/internal/adapter/openai/prompt_build_test.go +++ b/internal/adapter/openai/prompt_build_test.go @@ -47,10 +47,10 @@ func TestBuildOpenAIFinalPrompt_HandlerPathIncludesToolRoundtripSemantics(t *tes if !strings.Contains(finalPrompt, `"condition":"sunny"`) { t.Fatalf("handler finalPrompt should preserve tool output content: %q", finalPrompt) } - if !strings.Contains(finalPrompt, "") { + if !strings.Contains(finalPrompt, "") { t.Fatalf("handler finalPrompt should preserve assistant tool history: %q", finalPrompt) } - if !strings.Contains(finalPrompt, "get_weather") { + if !strings.Contains(finalPrompt, ``) { t.Fatalf("handler finalPrompt should include tool name history: %q", finalPrompt) } } @@ -74,7 +74,7 @@ func TestBuildOpenAIFinalPrompt_VercelPreparePathKeepsFinalAnswerInstruction(t * } finalPrompt, _ := buildOpenAIFinalPrompt(messages, tools, "", false) - if !strings.Contains(finalPrompt, "Remember: The ONLY valid way to use tools is the ... XML block at the end of your response.") { + if !strings.Contains(finalPrompt, "Remember: The ONLY valid way to use tools is the ... 
XML block at the end of your response.") { t.Fatalf("vercel prepare finalPrompt missing final tool-call anchor instruction: %q", finalPrompt) } if !strings.Contains(finalPrompt, "TOOL CALL FORMAT") { diff --git a/internal/adapter/openai/responses_stream_test.go b/internal/adapter/openai/responses_stream_test.go index 2fa2184..44284e3 100644 --- a/internal/adapter/openai/responses_stream_test.go +++ b/internal/adapter/openai/responses_stream_test.go @@ -122,8 +122,8 @@ func TestHandleResponsesStreamEmitsDistinctToolCallIDsAcrossSeparateToolBlocks(t return "data: " + string(b) + "\n" } - streamBody := sseLine("前置文本\n\n \n read_file\n {\"path\":\"README.MD\"}\n \n") + - sseLine("中间文本\n\n \n search\n {\"q\":\"golang\"}\n \n") + + streamBody := sseLine("前置文本\n\n \n README.MD\n \n") + + sseLine("中间文本\n\n \n golang\n \n") + "data: [DONE]\n" resp := &http.Response{ StatusCode: http.StatusOK, diff --git a/internal/adapter/openai/standard_request_test.go b/internal/adapter/openai/standard_request_test.go index 83c67c5..a242953 100644 --- a/internal/adapter/openai/standard_request_test.go +++ b/internal/adapter/openai/standard_request_test.go @@ -136,6 +136,22 @@ func TestNormalizeOpenAIResponsesRequestThinkingExtraBodyFallback(t *testing.T) } } +func TestNormalizeOpenAIResponsesRequestReasoningDisablesThinking(t *testing.T) { + store := newEmptyStoreForNormalizeTest(t) + req := map[string]any{ + "model": "gpt-4o", + "input": "ping", + "reasoning": map[string]any{"effort": "none"}, + } + n, err := normalizeOpenAIResponsesRequest(store, req, "") + if err != nil { + t.Fatalf("normalize failed: %v", err) + } + if n.Thinking { + t.Fatalf("expected reasoning.effort=none to disable thinking") + } +} + func TestNormalizeOpenAIResponsesRequestToolChoiceRequired(t *testing.T) { store := newEmptyStoreForNormalizeTest(t) req := map[string]any{ diff --git a/internal/adapter/openai/tool_sieve_xml.go b/internal/adapter/openai/tool_sieve_xml.go index d00c86b..3a8ee4c 100644 --- 
a/internal/adapter/openai/tool_sieve_xml.go +++ b/internal/adapter/openai/tool_sieve_xml.go @@ -9,42 +9,27 @@ import ( // --- XML tool call support for the streaming sieve --- //nolint:unused // kept as explicit tag inventory for future XML sieve refinements. -var xmlToolCallClosingTags = []string{"", "", - // Agent-style XML tags (Roo Code, Cline, etc.) - "", "", "", ""} -var xmlToolCallOpeningTags = []string{""} +var xmlToolCallOpeningTags = []string{""}, - {""}, - // Agent-style: these are XML "tool call" patterns from coding agents. - // They get captured → parsed. If parsing fails, the raw XML is preserved - // so the caller can still see the original text. - {""}, - {""}, - {""}, + {""}, } -// xmlToolCallBlockPattern matches a complete XML tool call block (wrapper or standalone). +// xmlToolCallBlockPattern matches a complete canonical XML tool call block. // //nolint:unused // reserved for future fast-path XML block detection. -var xmlToolCallBlockPattern = regexp.MustCompile(`(?is)(]*>\s*(?:.*?)\s*|]*>(?:.*?)|(?:.*?)|(?:.*?)|(?:.*?))`) +var xmlToolCallBlockPattern = regexp.MustCompile(`(?is)(]*>\s*(?:.*?)\s*)`) // xmlToolTagsToDetect is the set of XML tag prefixes used by findToolSegmentStart. -var xmlToolTagsToDetect = []string{"", "", "", "", ""} +var xmlToolTagsToDetect = []string{"", "\n", - " \n", - " read_file\n", - ` {"path":"README.MD"}` + "\n", - " \n", - "", + "\n", + ` ` + "\n", + ` README.MD` + "\n", + " \n", + "", } var events []toolStreamEvent for _, c := range chunks { @@ -31,7 +30,7 @@ func TestProcessToolSieveInterceptsXMLToolCallWithoutLeak(t *testing.T) { toolCalls += len(evt.ToolCalls) } - if strings.Contains(textContent, "\n \n " + toolName + "\n \n \n \n \n \n \n", + "]]>\n \n", } var events []toolStreamEvent @@ -90,8 +89,8 @@ func TestProcessToolSieveXMLWithLeadingText(t *testing.T) { // Model outputs some prose then an XML tool call. 
chunks := []string{ "Let me check the file.\n", - "\n \n read_file\n", - ` {"path":"go.mod"}` + "\n \n", + "\n \n", + ` go.mod` + "\n \n", } var events []toolStreamEvent for _, c := range chunks { @@ -113,7 +112,7 @@ func TestProcessToolSieveXMLWithLeadingText(t *testing.T) { t.Fatalf("expected leading text to be emitted, got %q", textContent) } // The XML itself should NOT leak. - if strings.Contains(textContent, "plain xmlread_file{"path":"README.MD"}` + chunk := `plain xmlREADME.MD` events := processToolSieveChunk(&state, chunk, []string{"read_file"}) events = append(events, flushToolSieve(&state, []string{"read_file"})...) @@ -156,8 +155,8 @@ func TestProcessToolSieveNonToolXMLKeepsSuffixForToolParsing(t *testing.T) { if !strings.Contains(textContent.String(), `plain xml`) { t.Fatalf("expected leading non-tool XML to be preserved, got %q", textContent.String()) } - if strings.Contains(textContent.String(), ``) { - t.Fatalf("expected canonical tool XML to be intercepted, got %q", textContent.String()) + if strings.Contains(textContent.String(), `{"path":"README.md"}` + chunk := `{"path":"README.md"}` events := processToolSieveChunk(&state, chunk, []string{"read_file"}) events = append(events, flushToolSieve(&state, []string{"read_file"})...) 
@@ -189,17 +188,17 @@ func TestProcessToolSievePassesThroughFencedXMLToolCallExamples(t *testing.T) { var state toolStreamSieveState input := strings.Join([]string{ "Before first example.\n```", - "xml\nread_file{\"path\":\"README.md\"}\n```\n", + "xml\nREADME.md\n```\n", "Between examples.\n```xml\n", - "search{\"q\":\"golang\"}\n", + "golang\n", "```\nAfter examples.", }, "") chunks := []string{ "Before first example.\n```", - "xml\nread_file{\"path\":\"README.md\"}\n```\n", + "xml\nREADME.md\n```\n", "Between examples.\n```xml\n", - "search{\"q\":\"golang\"}\n", + "golang\n", "```\nAfter examples.", } @@ -230,13 +229,13 @@ func TestProcessToolSieveKeepsPartialXMLTagInsideFencedExample(t *testing.T) { var state toolStreamSieveState input := strings.Join([]string{ "Example:\n```xml\nread_file{\"path\":\"README.md\"}\n```\n", + "lls>README.md\n```\n", "Done.", }, "") chunks := []string{ "Example:\n```xml\nread_file{\"path\":\"README.md\"}\n```\n", + "lls>README.md\n```\n", "Done.", } @@ -266,15 +265,15 @@ func TestProcessToolSieveKeepsPartialXMLTagInsideFencedExample(t *testing.T) { func TestProcessToolSievePartialXMLTagHeldBack(t *testing.T) { var state toolStreamSieveState // Chunk ends with a partial XML tool tag. 
- events := processToolSieveChunk(&state, "Hello \n", 10}, - {"tool_call_tag", "prefix \n", 7}, - {"xml_inside_code_fence", "```xml\nread_file\n```", -1}, + {"tool_calls_tag", "some text \n", 10}, + {"bare_tool_call_text", "prefix \n", -1}, + {"xml_inside_code_fence", "```xml\n\n```", -1}, {"no_xml", "just plain text", -1}, {"gemini_json_no_detect", `some text {"functionCall":{"name":"search"}}`, -1}, } @@ -310,10 +309,10 @@ func TestFindPartialXMLToolTagStart(t *testing.T) { input string want int }{ - {"partial_tools", "Hello done", -1}, + {"complete_tag", "Text done", -1}, {"no_lt", "plain text", -1}, {"closed_lt", "a < b > c", -1}, } @@ -328,10 +327,10 @@ func TestFindPartialXMLToolTagStart(t *testing.T) { } func TestHasOpenXMLToolTag(t *testing.T) { - if !hasOpenXMLToolTag("\n\nfoo") { + if !hasOpenXMLToolTag("\n") { t.Fatal("should detect open XML tool tag without closing tag") } - if hasOpenXMLToolTag("\n\nfoo\n") { + if hasOpenXMLToolTag("\n\n") { t.Fatal("should return false when closing tag is present") } if hasOpenXMLToolTag("plain text without any XML") { @@ -340,44 +339,29 @@ func TestHasOpenXMLToolTag(t *testing.T) { } // Test the EXACT scenario the user reports: token-by-token streaming where -// tag arrives in small pieces. +// tag arrives in small pieces. func TestProcessToolSieveTokenByTokenXMLNoLeak(t *testing.T) { var state toolStreamSieveState // Simulate DeepSeek model generating tokens one at a time. 
chunks := []string{ "<", "tool", - "s", + "_ca", + "lls", ">\n", - " <", - "tool", - "_call", - ">\n", - " <", - "tool", - "_name", - ">", + " ` + "\n", + " `, + "README.MD", + "\n", + " \n", "\n", - " <", - "param", - ">", - `{"path"`, - `: "README.MD"`, - `}`, - "\n", - " \n", - "", } var events []toolStreamEvent @@ -395,10 +379,10 @@ func TestProcessToolSieveTokenByTokenXMLNoLeak(t *testing.T) { toolCalls += len(evt.ToolCalls) } - if strings.Contains(textContent, "") { + if strings.Contains(textContent, "tool_calls>") { t.Fatalf("closing tag fragment leaked to text: %q", textContent) } if strings.Contains(textContent, "read_file") { @@ -414,9 +398,8 @@ func TestFlushToolSieveIncompleteXMLFallsBackToText(t *testing.T) { var state toolStreamSieveState // XML block starts but stream ends before completion. chunks := []string{ - "\n", - " \n", - " read_file\n", + "\n", + " \n", } var events []toolStreamEvent for _, c := range chunks { @@ -437,19 +420,19 @@ func TestFlushToolSieveIncompleteXMLFallsBackToText(t *testing.T) { } } -// Test that the opening tag "\n " is NOT emitted as text content. +// Test that the opening tag "\n " is NOT emitted as text content. func TestOpeningXMLTagNotLeakedAsContent(t *testing.T) { var state toolStreamSieveState // First chunk is the opening tag - should be held, not emitted. - evts1 := processToolSieveChunk(&state, "\n ", []string{"read_file"}) + evts1 := processToolSieveChunk(&state, "\n ", []string{"read_file"}) for _, evt := range evts1 { - if strings.Contains(evt.Content, "") { + if strings.Contains(evt.Content, "") { t.Fatalf("opening tag leaked on first chunk: %q", evt.Content) } } // Remaining content arrives. - evts2 := processToolSieveChunk(&state, "\n read_file\n {\"path\":\"README.MD\"}\n \n", []string{"read_file"}) + evts2 := processToolSieveChunk(&state, "\n README.MD\n \n", []string{"read_file"}) evts2 = append(evts2, flushToolSieve(&state, []string{"read_file"})...) 
var textContent string @@ -462,7 +445,7 @@ func TestOpeningXMLTagNotLeakedAsContent(t *testing.T) { toolCalls += len(evt.ToolCalls) } - if strings.Contains(textContent, "README.md` + events := processToolSieveChunk(&state, chunk, []string{"read_file"}) + events = append(events, flushToolSieve(&state, []string{"read_file"})...) + + var textContent strings.Builder + toolCalls := 0 + for _, evt := range events { + textContent.WriteString(evt.Content) + toolCalls += len(evt.ToolCalls) + } + + if toolCalls != 0 { + t.Fatalf("expected bare invoke to remain text, got %d events=%#v", toolCalls, events) + } + if textContent.String() != chunk { + t.Fatalf("expected bare invoke to pass through unchanged, got %q", textContent.String()) + } +} diff --git a/internal/js/helpers/stream-tool-sieve/parse.js b/internal/js/helpers/stream-tool-sieve/parse.js index b7993c9..0e7d552 100644 --- a/internal/js/helpers/stream-tool-sieve/parse.js +++ b/internal/js/helpers/stream-tool-sieve/parse.js @@ -8,7 +8,7 @@ const { stripFencedCodeBlocks, } = require('./parse_payload'); -const TOOL_MARKUP_PREFIXES = [']*>([\s\S]*?)<\/tools>/gi; -const TOOL_CALL_MARKUP_BLOCK_PATTERN = /<(?:[a-z0-9_:-]+:)?tool_call\b[^>]*>([\s\S]*?)<\/(?:[a-z0-9_:-]+:)?tool_call>/gi; -const TOOL_CALL_CANONICAL_BODY_PATTERN = /^\s*<(?:[a-z0-9_:-]+:)?tool_name\b[^>]*>([\s\S]*?)<\/(?:[a-z0-9_:-]+:)?tool_name>\s*<(?:[a-z0-9_:-]+:)?param\b[^>]*>([\s\S]*?)<\/(?:[a-z0-9_:-]+:)?param>\s*$/i; +const TOOLS_WRAPPER_PATTERN = /]*>([\s\S]*?)<\/tool_calls>/gi; +const TOOL_CALL_MARKUP_BLOCK_PATTERN = /<(?:[a-z0-9_:-]+:)?invoke\b([^>]*)>([\s\S]*?)<\/(?:[a-z0-9_:-]+:)?invoke>/gi; +const PARAMETER_BLOCK_PATTERN = /<(?:[a-z0-9_:-]+:)?parameter\b([^>]*)>([\s\S]*?)<\/(?:[a-z0-9_:-]+:)?parameter>/gi; const TOOL_CALL_MARKUP_KV_PATTERN = /<(?:[a-z0-9_:-]+:)?([a-z0-9_.-]+)\b[^>]*>([\s\S]*?)<\/(?:[a-z0-9_:-]+:)?\1>/gi; const CDATA_PATTERN = /^$/i; +const XML_ATTR_PATTERN = /\b([a-z0-9_:-]+)\s*=\s*("([^"]*)"|'([^']*)')/gi; const { toStringSafe, @@ 
-27,7 +28,7 @@ function parseMarkupToolCalls(text) { for (const wrapper of raw.matchAll(TOOLS_WRAPPER_PATTERN)) { const body = toStringSafe(wrapper[1]); for (const block of body.matchAll(TOOL_CALL_MARKUP_BLOCK_PATTERN)) { - const parsed = parseMarkupSingleToolCall(toStringSafe(block[1]).trim()); + const parsed = parseMarkupSingleToolCall(block); if (parsed) { out.push(parsed); } @@ -36,33 +37,43 @@ function parseMarkupToolCalls(text) { return out; } -function parseMarkupSingleToolCall(inner) { - // Try inline JSON parse for the inner content. +function parseMarkupSingleToolCall(block) { + const attrs = parseTagAttributes(block[1]); + const name = toStringSafe(attrs.name).trim(); + if (!name) { + return null; + } + const inner = toStringSafe(block[2]).trim(); + if (inner) { try { const decoded = JSON.parse(inner); - if (decoded && typeof decoded === 'object' && !Array.isArray(decoded) && decoded.name) { + if (decoded && typeof decoded === 'object' && !Array.isArray(decoded)) { return { - name: toStringSafe(decoded.name), - input: decoded.input && typeof decoded.input === 'object' && !Array.isArray(decoded.input) ? decoded.input : {}, + name, + input: decoded.input && typeof decoded.input === 'object' && !Array.isArray(decoded.input) + ? decoded.input + : decoded.parameters && typeof decoded.parameters === 'object' && !Array.isArray(decoded.parameters) + ? decoded.parameters + : {}, }; } } catch (_err) { // Not JSON, continue with markup parsing. 
} } - - const match = inner.match(TOOL_CALL_CANONICAL_BODY_PATTERN); - if (!match || match.length < 3) { + const input = {}; + for (const match of inner.matchAll(PARAMETER_BLOCK_PATTERN)) { + const parameterAttrs = parseTagAttributes(match[1]); + const paramName = toStringSafe(parameterAttrs.name).trim(); + if (!paramName) { + continue; + } + appendMarkupValue(input, paramName, parseMarkupValue(match[2])); + } + if (Object.keys(input).length === 0 && inner.trim() !== '') { return null; } - - const name = extractRawTagValue(match[1]).trim(); - if (!name) { - return null; - } - - const input = parseMarkupInput(match[2]); return { name, input }; } @@ -124,11 +135,14 @@ function parseMarkupValue(raw) { } } - try { - return JSON.parse(s); - } catch (_err) { - return s; + if (s.startsWith('{') || s.startsWith('[')) { + try { + return JSON.parse(s); + } catch (_err) { + return s; + } } + return s; } function extractRawTagValue(inner) { @@ -158,6 +172,22 @@ function unescapeHtml(safe) { .replace(/'/g, "'"); } +function parseTagAttributes(raw) { + const source = toStringSafe(raw); + const out = {}; + if (!source) { + return out; + } + for (const match of source.matchAll(XML_ATTR_PATTERN)) { + const key = toStringSafe(match[1]).trim().toLowerCase(); + if (!key) { + continue; + } + out[key] = match[3] || match[4] || ''; + } + return out; +} + function parseToolCallInput(v) { if (v == null) { return {}; diff --git a/internal/js/helpers/stream-tool-sieve/sieve-xml.js b/internal/js/helpers/stream-tool-sieve/sieve-xml.js index 6eb5280..cc8ee43 100644 --- a/internal/js/helpers/stream-tool-sieve/sieve-xml.js +++ b/internal/js/helpers/stream-tool-sieve/sieve-xml.js @@ -1,17 +1,16 @@ 'use strict'; const { parseToolCalls } = require('./parse'); -// Tag pairs ordered longest-first: wrapper tags checked before inner tags. +// XML wrapper tag pair used by the streaming sieve. 
const XML_TOOL_TAG_PAIRS = [ - { open: '' }, - { open: '' }, + { open: '' }, ]; const XML_TOOL_OPENING_TAGS = XML_TOOL_TAG_PAIRS.map(p => p.open); function consumeXMLToolCapture(captured, toolNames, trimWrappingJSONFence) { const lower = captured.toLowerCase(); - // Find the FIRST matching open/close pair, preferring wrapper tags. + // Find the FIRST matching open/close pair for the canonical wrapper. for (const pair of XML_TOOL_TAG_PAIRS) { const openIdx = lower.indexOf(pair.open); if (openIdx < 0) { @@ -21,7 +20,7 @@ function consumeXMLToolCapture(captured, toolNames, trimWrappingJSONFence) { const closeIdx = lower.lastIndexOf(pair.close); if (closeIdx < openIdx) { // Opening tag present but specific closing tag hasn't arrived. - // Return not-ready — do NOT fall through to inner pairs. + // Return not-ready so buffering continues until the wrapper closes. return { ready: false, prefix: '', calls: [], suffix: '' }; } const closeEnd = closeIdx + pair.close.length; diff --git a/internal/js/helpers/stream-tool-sieve/tool-keywords.js b/internal/js/helpers/stream-tool-sieve/tool-keywords.js index 473600b..93efd5d 100644 --- a/internal/js/helpers/stream-tool-sieve/tool-keywords.js +++ b/internal/js/helpers/stream-tool-sieve/tool-keywords.js @@ -1,15 +1,15 @@ 'use strict'; const XML_TOOL_SEGMENT_TAGS = [ - '', '', '', '', '', + '', ]; module.exports = { diff --git a/internal/prompt/tool_calls.go b/internal/prompt/tool_calls.go index b24234d..d38e9fa 100644 --- a/internal/prompt/tool_calls.go +++ b/internal/prompt/tool_calls.go @@ -16,8 +16,8 @@ var promptXMLTextEscaper = strings.NewReplacer( var promptXMLNamePattern = regexp.MustCompile(`^[A-Za-z_][A-Za-z0-9_.:-]*$`) -// FormatToolCallsForPrompt renders a tool_calls slice into the canonical -// prompt-visible history block used across adapters. +// FormatToolCallsForPrompt renders a tool_calls slice into the prompt-visible +// invoke/parameter history block used across adapters. 
func FormatToolCallsForPrompt(raw any) string { calls, ok := raw.([]any) if !ok || len(calls) == 0 { @@ -38,7 +38,7 @@ func FormatToolCallsForPrompt(raw any) string { if len(blocks) == 0 { return "" } - return "\n" + strings.Join(blocks, "\n") + "\n" + return "\n" + strings.Join(blocks, "\n") + "\n" } // StringifyToolCallArguments normalizes tool arguments into a compact string @@ -93,28 +93,99 @@ func formatToolCallForPrompt(call map[string]any) string { } parameters := formatToolCallParametersForPrompt(argsRaw) + if parameters == "" { + return ` ` + } - return " \n" + - " " + escapeXMLText(name) + "\n" + + return " \n" + parameters + "\n" + - " " + " " } func formatToolCallParametersForPrompt(raw any) string { value := normalizePromptToolCallValue(raw) - body, ok := renderPromptToolXMLBody(value, " ") - if ok { - if strings.TrimSpace(body) == "" { - return " " - } - return " \n" + body + "\n " + body, ok := renderPromptToolParameters(value, " ") + if ok && strings.TrimSpace(body) != "" { + return body } fallback := StringifyToolCallArguments(raw) if strings.TrimSpace(fallback) == "" { - fallback = "{}" + return "" + } + return " " + renderPromptXMLText(fallback) + "" +} + +func renderPromptToolParameters(value any, indent string) (string, bool) { + switch v := value.(type) { + case nil: + return "", true + case map[string]any: + if len(v) == 0 { + return "", true + } + keys := make([]string, 0, len(v)) + for k := range v { + keys = append(keys, k) + } + sort.Strings(keys) + lines := make([]string, 0, len(keys)) + for _, key := range keys { + rendered, ok := renderPromptParameterNode(key, v[key], indent) + if !ok { + return "", false + } + lines = append(lines, rendered) + } + return strings.Join(lines, "\n"), true + case []any: + lines := make([]string, 0, len(v)) + for _, item := range v { + rendered, ok := renderPromptParameterNode("item", item, indent) + if !ok { + return "", false + } + lines = append(lines, rendered) + } + return strings.Join(lines, "\n"), 
true + case string: + return indent + `` + renderPromptXMLText(v) + ``, true + default: + return indent + `` + renderPromptXMLText(fmt.Sprint(v)) + ``, true + } +} + +func renderPromptParameterNode(name string, value any, indent string) (string, bool) { + trimmedName := strings.TrimSpace(name) + if trimmedName == "" { + return "", false + } + switch v := value.(type) { + case nil: + return indent + ``, true + case map[string]any: + body, ok := renderPromptToolXMLBody(v, indent+" ") + if !ok { + return "", false + } + if strings.TrimSpace(body) == "" { + return indent + ``, true + } + return indent + `\n" + body + "\n" + indent + ``, true + case []any: + body, ok := renderPromptToolXMLArray(v, indent+" ") + if !ok { + return "", false + } + if strings.TrimSpace(body) == "" { + return indent + ``, true + } + return indent + `\n" + body + "\n" + indent + ``, true + case string: + return indent + `` + renderPromptXMLText(v) + ``, true + default: + return indent + `` + renderPromptXMLText(fmt.Sprint(v)) + ``, true } - return " " + renderPromptXMLText(fallback) + "" } func normalizePromptToolCallValue(raw any) any { @@ -246,6 +317,18 @@ func isValidPromptXMLName(name string) bool { return promptXMLNamePattern.MatchString(strings.TrimSpace(name)) } +func escapeXMLAttribute(text string) string { + if text == "" { + return "" + } + return strings.NewReplacer( + "&", "&amp;", + `"`, "&quot;", + "<", "&lt;", + ">", "&gt;", + ).Replace(text) +} + func normalizeToolArgumentString(raw string) string { trimmed := strings.TrimSpace(raw) if trimmed == "" { diff --git a/internal/prompt/tool_calls_test.go b/internal/prompt/tool_calls_test.go index 451120b..b26658c 100644 --- a/internal/prompt/tool_calls_test.go +++ b/internal/prompt/tool_calls_test.go @@ -22,7 +22,7 @@ func TestFormatToolCallsForPromptXML(t *testing.T) { if got == "" { t.Fatal("expected non-empty formatted tool calls") } - if got != "\n \n search_web\n \n \n \n \n" { + if got != "\n \n \n \n" { t.Fatalf("unexpected formatted tool 
call XML: %q", got) } } @@ -34,7 +34,7 @@ func TestFormatToolCallsForPromptEscapesXMLEntities(t *testing.T) { "arguments": `{"q":"a < b && c > d"}`, }, }) - want := "\n \n search<&>\n \n d]]>\n \n \n" + want := "\n \n d]]>\n \n" if got != want { t.Fatalf("unexpected escaped tool call XML: %q", got) } @@ -50,7 +50,7 @@ func TestFormatToolCallsForPromptUsesCDATAForMultilineContent(t *testing.T) { }, }, }) - want := "\n \n write_file\n \n \n \n \n \n" + want := "\n \n \n \n \n" if got != want { t.Fatalf("unexpected multiline cdata tool call XML: %q", got) } diff --git a/internal/sse/consumer_edge_test.go b/internal/sse/consumer_edge_test.go index 99679c5..4654ef8 100644 --- a/internal/sse/consumer_edge_test.go +++ b/internal/sse/consumer_edge_test.go @@ -56,6 +56,21 @@ func TestCollectStreamThinkingAndText(t *testing.T) { } } +func TestCollectStreamDropsThinkingWhenDisabled(t *testing.T) { + resp := makeHTTPResponse( + "data: {\"p\":\"response/thinking_content\",\"v\":\"Thinking...\"}\n" + + "data: {\"p\":\"response/content\",\"v\":\"Answer\"}\n" + + "data: [DONE]\n", + ) + result := CollectStream(resp, false, true) + if result.Thinking != "" { + t.Fatalf("expected disabled thinking to be dropped, got %q", result.Thinking) + } + if result.Text != "Answer" { + t.Fatalf("expected only visible answer, got %q", result.Text) + } +} + func TestCollectStreamOnlyThinking(t *testing.T) { resp := makeHTTPResponse( "data: {\"p\":\"response/thinking_content\",\"v\":\"Only thinking\"}\n" + diff --git a/internal/sse/parser.go b/internal/sse/parser.go index 34813be..a5ed223 100644 --- a/internal/sse/parser.go +++ b/internal/sse/parser.go @@ -99,6 +99,10 @@ func ParseSSEChunkForContent(chunk map[string]any, thinkingEnabled bool, current if transitioned { newType = "text" } + if !thinkingEnabled { + parts = dropThinkingParts(parts) + newType = "text" + } return parts, false, newType } @@ -172,6 +176,9 @@ func updateTypeFromNestedResponse(path string, v any, newType *string) { func 
resolvePartType(path string, thinkingEnabled bool, newType string) string { switch { case path == "response/thinking_content": + if !thinkingEnabled { + return "thinking" + } if newType == "text" { return "text" } @@ -187,6 +194,20 @@ func resolvePartType(path string, thinkingEnabled bool, newType string) string { } } +func dropThinkingParts(parts []ContentPart) []ContentPart { + if len(parts) == 0 { + return parts + } + out := parts[:0] + for _, p := range parts { + if p.Type == "thinking" { + continue + } + out = append(out, p) + } + return out +} + func appendChunkValueContent(v any, partType string, newType *string, parts *[]ContentPart, path string) bool { switch val := v.(type) { case string: diff --git a/internal/toolcall/regression_test.go b/internal/toolcall/regression_test.go index 0e58952..7615fa3 100644 --- a/internal/toolcall/regression_test.go +++ b/internal/toolcall/regression_test.go @@ -13,18 +13,18 @@ func TestRegression_RobustXMLAndCDATA(t *testing.T) { }{ { name: "Standard JSON parameters (Regression)", - text: `foo{"a": 1}`, - expected: []ParsedToolCall{{Name: "foo", Input: map[string]any{"a": float64(1)}}}, + text: `1`, + expected: []ParsedToolCall{{Name: "foo", Input: map[string]any{"a": "1"}}}, }, { name: "XML tags parameters (Regression)", - text: `foohello`, + text: `hello`, expected: []ParsedToolCall{{Name: "foo", Input: map[string]any{"arg1": "hello"}}}, }, { name: "CDATA parameters (New Feature)", - text: `write_file and & symbols]]>`, + text: ` and & symbols]]>`, expected: []ParsedToolCall{{ Name: "write_file", Input: map[string]any{"content": "line 1\nline 2 with and & symbols"}, @@ -32,9 +32,9 @@ line 2 with and & symbols]]>`, }, { name: "Nested XML with repeated parameters (New Feature)", - text: `write_filescript.shscript.shfirstsecond`, +]]>firstsecond`, expected: []ParsedToolCall{{ Name: "write_file", Input: map[string]any{ @@ -46,7 +46,7 @@ echo "hello" }, { name: "Dirty XML with unescaped symbols (Robustness Improvement)", - 
text: `bashecho "hello" > out.txt && cat out.txt`, + text: `echo "hello" > out.txt && cat out.txt`, expected: []ParsedToolCall{{ Name: "bash", Input: map[string]any{"command": "echo \"hello\" > out.txt && cat out.txt"}, @@ -54,7 +54,7 @@ echo "hello" }, { name: "Mixed JSON inside CDATA (New Hybrid Case)", - text: `foo`, + text: ``, expected: []ParsedToolCall{{ Name: "foo", Input: map[string]any{"json_param": "works"}, diff --git a/internal/toolcall/tool_prompt.go b/internal/toolcall/tool_prompt.go index 7990ef8..6b1ce0e 100644 --- a/internal/toolcall/tool_prompt.go +++ b/internal/toolcall/tool_prompt.go @@ -36,93 +36,139 @@ func BuildToolCallInstructions(toolNames []string) string { return `TOOL CALL FORMAT — FOLLOW EXACTLY: - - - TOOL_NAME_HERE - - - - - + + + + + RULES: -1) Use the XML wrapper format only. -2) Put one or more entries under a single root. -3) Use for the tool name and for the argument container. +1) Use the XML wrapper format only. +2) Put one or more entries under a single root. +3) Put the tool name in the invoke name attribute: . 4) All string values must use , even short ones. This includes code, scripts, file contents, prompts, paths, names, and queries. -5) Objects use nested XML elements. Arrays may repeat the same tag or use children. -6) Numbers, booleans, and null stay plain text. -7) Use only the parameter names in the tool schema. Do not invent fields. -8) Do NOT wrap XML in markdown fences. Do NOT output explanations, role markers, or internal monologue. +5) Every top-level argument must be a ... node. +6) Objects use nested XML elements inside the parameter body. Arrays may repeat children. +7) Numbers, booleans, and null stay plain text. +8) Use only the parameter names in the tool schema. Do not invent fields. +9) Do NOT wrap XML in markdown fences. Do NOT output explanations, role markers, or internal monologue. 
PARAMETER SHAPES: -- string => -- object => nested XML elements -- array => repeated tags or children -- number/bool/null => plain text +- string => +- object => ... +- array => ...... +- number/bool/null => plain_text 【WRONG — Do NOT do these】: Wrong 1 — mixed text after XML: - ... I hope this helps. -Wrong 2 — JSON payload inside : - ` + ex1 + `{"path":"x"} -Wrong 3 — Markdown code fences: + ... I hope this helps. +Wrong 2 — Markdown code fences: ` + "```xml" + ` - ... + ... ` + "```" + ` -Remember: The ONLY valid way to use tools is the ... XML block at the end of your response. +Remember: The ONLY valid way to use tools is the ... XML block at the end of your response. 【CORRECT EXAMPLES】: Example A — Single tool: - - - ` + ex1 + ` - ` + ex1Params + ` - - + + +` + indentPromptParameters(ex1Params, " ") + ` + + Example B — Two tools in parallel: - - - ` + ex1 + ` - ` + ex1Params + ` - - - ` + ex2 + ` - ` + ex2Params + ` - - + + +` + indentPromptParameters(ex1Params, " ") + ` + + +` + indentPromptParameters(ex2Params, " ") + ` + + + ` + promptCDATA("/abs/path/to/another-file.txt") + ` + + Example C — Tool with nested XML parameters: - - - ` + ex3 + ` - ` + ex3Params + ` - - - + + +` + indentPromptParameters(ex3Params, " ") + ` + + + Example D — Tool with long script using CDATA (RELIABLE FOR CODE/SCRIPTS): - - - ` + ex2 + ` - - ` + promptCDATA("script.sh") + ` - + + ` + promptCDATA("script.sh") + ` + - - - +]]> + + ` } +func indentPromptParameters(body, indent string) string { + if strings.TrimSpace(body) == "" { + return indent + `` + } + lines := strings.Split(body, "\n") + for i, line := range lines { + if strings.TrimSpace(line) == "" { + lines[i] = line + continue + } + lines[i] = indent + line + } + return strings.Join(lines, "\n") +} + +func wrapParameter(name, inner string) string { + return `` + inner + `` +} + +func exampleReadParams(name string) string { + switch strings.TrimSpace(name) { + case "Read": + return wrapParameter("file_path", 
promptCDATA("README.md")) + case "Glob": + return wrapParameter("pattern", promptCDATA("**/*.go")) + "\n" + wrapParameter("path", promptCDATA(".")) + default: + return wrapParameter("path", promptCDATA("src/main.go")) + } +} + +func exampleWriteOrExecParams(name string) string { + switch strings.TrimSpace(name) { + case "Bash", "execute_command": + return wrapParameter("command", promptCDATA("pwd")) + case "exec_command": + return wrapParameter("cmd", promptCDATA("pwd")) + case "Edit": + return wrapParameter("file_path", promptCDATA("README.md")) + "\n" + wrapParameter("old_string", promptCDATA("foo")) + "\n" + wrapParameter("new_string", promptCDATA("bar")) + case "MultiEdit": + return wrapParameter("file_path", promptCDATA("README.md")) + "\n" + `` + promptCDATA("foo") + `` + promptCDATA("bar") + `` + default: + return wrapParameter("path", promptCDATA("output.txt")) + "\n" + wrapParameter("content", promptCDATA("Hello world")) + } +} + +func exampleInteractiveParams(name string) string { + switch strings.TrimSpace(name) { + case "Task": + return wrapParameter("description", promptCDATA("Investigate flaky tests")) + "\n" + wrapParameter("prompt", promptCDATA("Run targeted tests and summarize failures")) + default: + return wrapParameter("question", promptCDATA("Which approach do you prefer?")) + "\n" + `` + promptCDATA("Option A") + `` + promptCDATA("Option B") + `` + } +} + func matchAny(name string, candidates ...string) bool { for _, c := range candidates { if name == c { @@ -132,41 +178,6 @@ func matchAny(name string, candidates ...string) bool { return false } -func exampleReadParams(name string) string { - switch strings.TrimSpace(name) { - case "Read": - return `` + promptCDATA("README.md") + `` - case "Glob": - return `` + promptCDATA("**/*.go") + `` + promptCDATA(".") + `` - default: - return `` + promptCDATA("src/main.go") + `` - } -} - -func exampleWriteOrExecParams(name string) string { - switch strings.TrimSpace(name) { - case "Bash", 
"execute_command": - return `` + promptCDATA("pwd") + `` - case "exec_command": - return `` + promptCDATA("pwd") + `` - case "Edit": - return `` + promptCDATA("README.md") + `` + promptCDATA("foo") + `` + promptCDATA("bar") + `` - case "MultiEdit": - return `` + promptCDATA("README.md") + `` + promptCDATA("foo") + `` + promptCDATA("bar") + `` - default: - return `` + promptCDATA("output.txt") + `` + promptCDATA("Hello world") + `` - } -} - -func exampleInteractiveParams(name string) string { - switch strings.TrimSpace(name) { - case "Task": - return `` + promptCDATA("Investigate flaky tests") + `` + promptCDATA("Run targeted tests and summarize failures") + `` - default: - return `` + promptCDATA("Which approach do you prefer?") + `` + promptCDATA("Option A") + `` + promptCDATA("Option B") + `` - } -} - func promptCDATA(text string) string { if text == "" { return "" diff --git a/internal/toolcall/tool_prompt_test.go b/internal/toolcall/tool_prompt_test.go index 85cdeb4..c063bca 100644 --- a/internal/toolcall/tool_prompt_test.go +++ b/internal/toolcall/tool_prompt_test.go @@ -7,20 +7,20 @@ import ( func TestBuildToolCallInstructions_ExecCommandUsesCmdExample(t *testing.T) { out := BuildToolCallInstructions([]string{"exec_command"}) - if !strings.Contains(out, `exec_command`) { + if !strings.Contains(out, ``) { t.Fatalf("expected exec_command in examples, got: %s", out) } - if !strings.Contains(out, ``) { + if !strings.Contains(out, ``) { t.Fatalf("expected cmd parameter example for exec_command, got: %s", out) } } func TestBuildToolCallInstructions_ExecuteCommandUsesCommandExample(t *testing.T) { out := BuildToolCallInstructions([]string{"execute_command"}) - if !strings.Contains(out, `execute_command`) { + if !strings.Contains(out, ``) { t.Fatalf("expected execute_command in examples, got: %s", out) } - if !strings.Contains(out, ``) { + if !strings.Contains(out, ``) { t.Fatalf("expected command parameter example for execute_command, got: %s", out) } } diff --git 
a/internal/toolcall/toolcalls_parse.go b/internal/toolcall/toolcalls_parse.go index 70c1529..16743ac 100644 --- a/internal/toolcall/toolcalls_parse.go +++ b/internal/toolcall/toolcalls_parse.go @@ -74,12 +74,7 @@ func filterToolCallsDetailed(parsed []ParsedToolCall) ([]ParsedToolCall, []strin func looksLikeToolCallSyntax(text string) bool { lower := strings.ToLower(text) - return strings.Contains(lower, "]*>\s*(.*?)\s*`) -var xmlToolCallPattern = regexp.MustCompile(`(?is)]*>\s*(.*?)\s*`) -var xmlCanonicalToolCallBodyPattern = regexp.MustCompile(`(?is)^\s*<(?:[a-z0-9_:-]+:)?tool_name\b[^>]*>(.*?)\s*<(?:[a-z0-9_:-]+:)?param\b[^>]*>(.*?)\s*$`) +var xmlToolCallsWrapperPattern = regexp.MustCompile(`(?is)]*>\s*(.*?)\s*`) +var xmlInvokePattern = regexp.MustCompile(`(?is)]*)>\s*(.*?)\s*`) +var xmlParameterPattern = regexp.MustCompile(`(?is)]*)>\s*(.*?)\s*`) +var xmlAttrPattern = regexp.MustCompile(`(?is)\b([a-z0-9_:-]+)\s*=\s*("([^"]*)"|'([^']*)')`) func parseXMLToolCalls(text string) []ParsedToolCall { - wrappers := xmlToolsWrapperPattern.FindAllStringSubmatch(text, -1) + wrappers := xmlToolCallsWrapperPattern.FindAllStringSubmatch(text, -1) if len(wrappers) == 0 { return nil } @@ -21,7 +22,7 @@ func parseXMLToolCalls(text string) []ParsedToolCall { if len(wrapper) < 2 { continue } - for _, block := range xmlToolCallPattern.FindAllString(wrapper[1], -1) { + for _, block := range xmlInvokePattern.FindAllStringSubmatch(wrapper[1], -1) { call, ok := parseSingleXMLToolCall(block) if !ok { continue @@ -35,37 +36,90 @@ func parseXMLToolCalls(text string) []ParsedToolCall { return out } -func parseSingleXMLToolCall(block string) (ParsedToolCall, bool) { - inner := strings.TrimSpace(block) - inner = strings.TrimPrefix(inner, "") - inner = strings.TrimSuffix(inner, "") - inner = strings.TrimSpace(inner) +func parseSingleXMLToolCall(block []string) (ParsedToolCall, bool) { + if len(block) < 3 { + return ParsedToolCall{}, false + } + attrs := parseXMLTagAttributes(block[1]) + name 
:= strings.TrimSpace(html.UnescapeString(attrs["name"])) + if name == "" { + return ParsedToolCall{}, false + } + + inner := strings.TrimSpace(block[2]) if strings.HasPrefix(inner, "{") { var payload map[string]any if err := json.Unmarshal([]byte(inner), &payload); err == nil { - name := strings.TrimSpace(asString(payload["name"])) - if name != "" { - input := map[string]any{} - if params, ok := payload["input"].(map[string]any); ok { + input := map[string]any{} + if params, ok := payload["input"].(map[string]any); ok { + input = params + } + if len(input) == 0 { + if params, ok := payload["parameters"].(map[string]any); ok { input = params } - return ParsedToolCall{Name: name, Input: input}, true } + return ParsedToolCall{Name: name, Input: input}, true } } - m := xmlCanonicalToolCallBodyPattern.FindStringSubmatch(inner) - if len(m) < 3 { - return ParsedToolCall{}, false + input := map[string]any{} + for _, paramMatch := range xmlParameterPattern.FindAllStringSubmatch(inner, -1) { + if len(paramMatch) < 3 { + continue + } + paramAttrs := parseXMLTagAttributes(paramMatch[1]) + paramName := strings.TrimSpace(html.UnescapeString(paramAttrs["name"])) + if paramName == "" { + continue + } + value := parseInvokeParameterValue(paramMatch[2]) + appendMarkupValue(input, paramName, value) } - name := strings.TrimSpace(html.UnescapeString(extractRawTagValue(m[1]))) - if strings.TrimSpace(name) == "" { - return ParsedToolCall{}, false + + if len(input) == 0 { + if strings.TrimSpace(inner) != "" { + return ParsedToolCall{}, false + } + return ParsedToolCall{Name: name, Input: map[string]any{}}, true } - return ParsedToolCall{Name: name, Input: parseStructuredToolCallInput(m[2])}, true + return ParsedToolCall{Name: name, Input: input}, true } -func asString(v any) string { - s, _ := v.(string) - return s +func parseXMLTagAttributes(raw string) map[string]string { + if strings.TrimSpace(raw) == "" { + return map[string]string{} + } + out := map[string]string{} + for _, m := 
range xmlAttrPattern.FindAllStringSubmatch(raw, -1) { + if len(m) < 5 { + continue + } + key := strings.ToLower(strings.TrimSpace(m[1])) + if key == "" { + continue + } + value := m[3] + if value == "" { + value = m[4] + } + out[key] = value + } + return out +} + +func parseInvokeParameterValue(raw string) any { + trimmed := strings.TrimSpace(raw) + if trimmed == "" { + return "" + } + if parsed := parseStructuredToolCallInput(trimmed); len(parsed) > 0 { + if len(parsed) == 1 { + if rawValue, ok := parsed["_raw"].(string); ok { + return rawValue + } + } + return parsed + } + return html.UnescapeString(extractRawTagValue(trimmed)) } diff --git a/internal/toolcall/toolcalls_test.go b/internal/toolcall/toolcalls_test.go index 25ee32c..13d0bef 100644 --- a/internal/toolcall/toolcalls_test.go +++ b/internal/toolcall/toolcalls_test.go @@ -16,8 +16,8 @@ func TestFormatOpenAIToolCalls(t *testing.T) { } } -func TestParseToolCallsSupportsToolsWrapper(t *testing.T) { - text := `Bashpwdshow cwd` +func TestParseToolCallsSupportsToolCallsWrapper(t *testing.T) { + text := `pwdshow cwd` calls := ParseToolCalls(text, []string{"bash"}) if len(calls) != 1 { t.Fatalf("expected 1 call, got %#v", calls) @@ -31,9 +31,9 @@ func TestParseToolCallsSupportsToolsWrapper(t *testing.T) { } func TestParseToolCallsSupportsStandaloneToolWithMultilineCDATAAndRepeatedXMLTags(t *testing.T) { - text := `write_filescript.shscript.shfirstsecond` +]]>firstsecond` calls := ParseToolCalls(text, []string{"write_file"}) if len(calls) != 1 { t.Fatalf("expected 1 call, got %#v", calls) @@ -54,8 +54,8 @@ echo "hello" } } -func TestParseToolCallsSupportsCanonicalParamsJSON(t *testing.T) { - text := `get_weather{"city":"beijing","unit":"c"}` +func TestParseToolCallsSupportsInvokeParameters(t *testing.T) { + text := `beijingc` calls := ParseToolCalls(text, []string{"get_weather"}) if len(calls) != 1 { t.Fatalf("expected 1 call, got %#v", calls) @@ -69,7 +69,7 @@ func TestParseToolCallsSupportsCanonicalParamsJSON(t 
*testing.T) { } func TestParseToolCallsPreservesRawMalformedParams(t *testing.T) { - text := `execute_commandcd /root && git status` + text := `cd /root && git status` calls := ParseToolCalls(text, []string{"execute_command"}) if len(calls) != 1 { t.Fatalf("expected 1 call, got %#v", calls) @@ -77,9 +77,9 @@ func TestParseToolCallsPreservesRawMalformedParams(t *testing.T) { if calls[0].Name != "execute_command" { t.Fatalf("expected tool name execute_command, got %q", calls[0].Name) } - raw, ok := calls[0].Input["_raw"].(string) + raw, ok := calls[0].Input["command"].(string) if !ok { - t.Fatalf("expected raw argument tracking, got %#v", calls[0].Input) + t.Fatalf("expected raw command tracking, got %#v", calls[0].Input) } if raw != "cd /root && git status" { t.Fatalf("expected raw arguments to be preserved, got %q", raw) @@ -87,7 +87,7 @@ func TestParseToolCallsPreservesRawMalformedParams(t *testing.T) { } func TestParseToolCallsSupportsParamsJSONWithAmpersandCommand(t *testing.T) { - text := `execute_command{"command":"sshpass -p 'xxx' ssh -o StrictHostKeyChecking=no -p 1111 root@111.111.111.111 'cd /root && git clone https://github.com/ericc-ch/copilot-api.git'","cwd":null,"timeout":null}` + text := `sshpass -p 'xxx' ssh -o StrictHostKeyChecking=no -p 1111 root@111.111.111.111 'cd /root && git clone https://github.com/ericc-ch/copilot-api.git'` calls := ParseToolCalls(text, []string{"execute_command"}) if len(calls) != 1 { t.Fatalf("expected 1 call, got %#v", calls) @@ -102,7 +102,7 @@ func TestParseToolCallsSupportsParamsJSONWithAmpersandCommand(t *testing.T) { } func TestParseToolCallsDoesNotTreatParamsNameTagAsToolName(t *testing.T) { - text := `execute_commandfile.txtpwd` + text := `file.txtpwd` calls := ParseToolCalls(text, []string{"execute_command"}) if len(calls) != 1 { t.Fatalf("expected 1 call, got %#v", calls) @@ -115,8 +115,8 @@ func TestParseToolCallsDoesNotTreatParamsNameTagAsToolName(t *testing.T) { } } -func 
TestParseToolCallsDetailedMarksToolsSyntax(t *testing.T) { - text := `Bashpwd` +func TestParseToolCallsDetailedMarksToolCallsSyntax(t *testing.T) { + text := `pwd` res := ParseToolCallsDetailed(text, []string{"bash"}) if !res.SawToolCallSyntax { t.Fatalf("expected SawToolCallSyntax=true, got %#v", res) @@ -127,7 +127,7 @@ func TestParseToolCallsDetailedMarksToolsSyntax(t *testing.T) { } func TestParseToolCallsSupportsInlineJSONToolObject(t *testing.T) { - text := `{"name":"Bash","input":{"command":"pwd","description":"show cwd"}}` + text := `{"input":{"command":"pwd","description":"show cwd"}}` calls := ParseToolCalls(text, []string{"bash"}) if len(calls) != 1 { t.Fatalf("expected 1 call, got %#v", calls) @@ -141,7 +141,7 @@ func TestParseToolCallsSupportsInlineJSONToolObject(t *testing.T) { } func TestParseToolCallsDoesNotAcceptMismatchedMarkupTags(t *testing.T) { - text := `read_file{"path":"README.md"}` + text := `README.md` calls := ParseToolCalls(text, []string{"read_file"}) if len(calls) != 0 { t.Fatalf("expected mismatched tags to be rejected, got %#v", calls) @@ -149,26 +149,37 @@ func TestParseToolCallsDoesNotAcceptMismatchedMarkupTags(t *testing.T) { } func TestParseToolCallsDoesNotTreatNameInsideParamsAsToolName(t *testing.T) { - text := `data_onlyREADME.md` + text := `README.md` calls := ParseToolCalls(text, []string{"read_file"}) if len(calls) != 0 { t.Fatalf("expected no tool call when name appears only under params, got %#v", calls) } } -func TestParseToolCallsRejectsLegacyToolCallsRoot(t *testing.T) { - text := `read_file{"path":"README.md"}` +func TestParseToolCallsRejectsLegacyToolsWrapper(t *testing.T) { + text := `read_file{"path":"README.md"}` calls := ParseToolCalls(text, []string{"read_file"}) if len(calls) != 0 { - t.Fatalf("expected legacy tool_calls root to be rejected, got %#v", calls) + t.Fatalf("expected legacy tools wrapper to be rejected, got %#v", calls) } } -func TestParseToolCallsRejectsLegacyParametersTag(t *testing.T) { - text := 
`read_file{"path":"README.md"}` +func TestParseToolCallsRejectsBareInvokeWithoutToolCallsWrapper(t *testing.T) { + text := `README.md` + res := ParseToolCallsDetailed(text, []string{"read_file"}) + if len(res.Calls) != 0 { + t.Fatalf("expected bare invoke to be rejected, got %#v", res.Calls) + } + if res.SawToolCallSyntax { + t.Fatalf("expected bare invoke to no longer count as supported syntax, got %#v", res) + } +} + +func TestParseToolCallsRejectsLegacyCanonicalBody(t *testing.T) { + text := `read_file{"path":"README.md"}` calls := ParseToolCalls(text, []string{"read_file"}) if len(calls) != 0 { - t.Fatalf("expected legacy parameters tag to be rejected, got %#v", calls) + t.Fatalf("expected legacy canonical body to be rejected, got %#v", calls) } } @@ -310,7 +321,7 @@ func TestRepairLooseJSONWithNestedObjects(t *testing.T) { } func TestParseToolCallsUnescapesHTMLEntityArguments(t *testing.T) { - text := `Bash{"command":"echo a > out.txt"}` + text := `echo a > out.txt` calls := ParseToolCalls(text, []string{"bash"}) if len(calls) != 1 { t.Fatalf("expected one call, got %#v", calls) @@ -322,7 +333,7 @@ func TestParseToolCallsUnescapesHTMLEntityArguments(t *testing.T) { } func TestParseToolCallsIgnoresXMLInsideFencedCodeBlock(t *testing.T) { - text := "Here is an example:\n```xml\nread_file{\"path\":\"README.md\"}\n```\nDo not execute it." + text := "Here is an example:\n```xml\nREADME.md\n```\nDo not execute it." 
res := ParseToolCallsDetailed(text, []string{"read_file"}) if len(res.Calls) != 0 { t.Fatalf("expected no parsed calls for fenced example, got %#v", res.Calls) @@ -330,7 +341,7 @@ func TestParseToolCallsIgnoresXMLInsideFencedCodeBlock(t *testing.T) { } func TestParseToolCallsParsesOnlyNonFencedXMLToolCall(t *testing.T) { - text := "```xml\nread_file{\"path\":\"README.md\"}\n```\nsearch{\"q\":\"golang\"}" + text := "```xml\nREADME.md\n```\ngolang" res := ParseToolCallsDetailed(text, []string{"read_file", "search"}) if len(res.Calls) != 1 { t.Fatalf("expected exactly one parsed call outside fence, got %#v", res.Calls) @@ -341,7 +352,7 @@ func TestParseToolCallsParsesOnlyNonFencedXMLToolCall(t *testing.T) { } func TestParseToolCallsParsesAfterFourBacktickFence(t *testing.T) { - text := "````markdown\n```xml\nread_file{\"path\":\"README.md\"}\n```\n````\nsearch{\"q\":\"outside\"}" + text := "````markdown\n```xml\nREADME.md\n```\n````\noutside" res := ParseToolCallsDetailed(text, []string{"read_file", "search"}) if len(res.Calls) != 1 { t.Fatalf("expected exactly one parsed call outside four-backtick fence, got %#v", res.Calls) diff --git a/internal/util/thinking.go b/internal/util/thinking.go index ad9b184..6fa101c 100644 --- a/internal/util/thinking.go +++ b/internal/util/thinking.go @@ -3,27 +3,48 @@ package util import "strings" func ResolveThinkingEnabled(req map[string]any, defaultEnabled bool) bool { - if enabled, ok := parseThinkingSetting(req["thinking"]); ok { - return enabled - } - if extraBody, ok := req["extra_body"].(map[string]any); ok { - if enabled, ok := parseThinkingSetting(extraBody["thinking"]); ok { - return enabled - } - } - if enabled, ok := parseReasoningEffort(req["reasoning_effort"]); ok { + if enabled, ok := ResolveThinkingOverride(req); ok { return enabled } return defaultEnabled } +func ResolveThinkingOverride(req map[string]any) (bool, bool) { + if req == nil { + return false, false + } + if enabled, ok := 
parseThinkingSetting(req["thinking"]); ok { + return enabled, true + } + if enabled, ok := parseReasoningSetting(req["reasoning"]); ok { + return enabled, true + } + if extraBody, ok := req["extra_body"].(map[string]any); ok { + if enabled, ok := parseThinkingSetting(extraBody["thinking"]); ok { + return enabled, true + } + if enabled, ok := parseReasoningSetting(extraBody["reasoning"]); ok { + return enabled, true + } + if enabled, ok := parseReasoningEffort(extraBody["reasoning_effort"]); ok { + return enabled, true + } + } + if enabled, ok := parseReasoningEffort(req["reasoning_effort"]); ok { + return enabled, true + } + return false, false +} + func parseThinkingSetting(raw any) (bool, bool) { switch v := raw.(type) { + case bool: + return v, true case string: switch strings.ToLower(strings.TrimSpace(v)) { - case "enabled": + case "enabled", "enable", "on", "true": return true, true - case "disabled": + case "disabled", "disable", "off", "false", "none": return false, true default: return false, false @@ -36,10 +57,28 @@ func parseThinkingSetting(raw any) (bool, bool) { return false, false } +func parseReasoningSetting(raw any) (bool, bool) { + switch v := raw.(type) { + case bool: + return v, true + case string: + return parseReasoningEffort(v) + case map[string]any: + for _, key := range []string{"effort", "type", "enabled"} { + if enabled, ok := parseReasoningSetting(v[key]); ok { + return enabled, true + } + } + } + return false, false +} + func parseReasoningEffort(raw any) (bool, bool) { switch strings.ToLower(strings.TrimSpace(toString(raw))) { - case "low", "medium", "high", "xhigh": + case "minimal", "low", "medium", "high", "xhigh": return true, true + case "none", "disabled", "disable", "off", "false": + return false, true default: return false, false } diff --git a/internal/util/thinking_test.go b/internal/util/thinking_test.go index 7e81cda..003fb5b 100644 --- a/internal/util/thinking_test.go +++ b/internal/util/thinking_test.go @@ -27,13 +27,24 
@@ func TestResolveThinkingEnabledUsesExtraBodyFallback(t *testing.T) { } func TestResolveThinkingEnabledMapsReasoningEffortToEnabled(t *testing.T) { - for _, effort := range []string{"low", "medium", "high", "xhigh"} { + for _, effort := range []string{"minimal", "low", "medium", "high", "xhigh"} { if got := ResolveThinkingEnabled(map[string]any{"reasoning_effort": effort}, false); !got { t.Fatalf("expected reasoning_effort=%s to enable thinking", effort) } } } +func TestResolveThinkingEnabledMapsReasoningObject(t *testing.T) { + req := map[string]any{"reasoning": map[string]any{"effort": "none"}} + if got := ResolveThinkingEnabled(req, true); got { + t.Fatalf("expected reasoning.effort=none to disable thinking") + } + req = map[string]any{"reasoning": map[string]any{"effort": "medium"}} + if got := ResolveThinkingEnabled(req, false); !got { + t.Fatalf("expected reasoning.effort=medium to enable thinking") + } +} + func TestResolveThinkingEnabledDefaultsWhenUnset(t *testing.T) { if !ResolveThinkingEnabled(nil, true) { t.Fatal("expected default thinking=true when unset") diff --git a/tests/compat/expected/toolcalls_canonical_nested_param.json b/tests/compat/expected/toolcalls_canonical_nested_param.json new file mode 100644 index 0000000..8eabce0 --- /dev/null +++ b/tests/compat/expected/toolcalls_canonical_nested_param.json @@ -0,0 +1,14 @@ +{ + "calls": [ + { + "name": "get_weather", + "input": { + "city": "beijing", + "unit": "c" + } + } + ], + "sawToolCallSyntax": true, + "rejectedByPolicy": false, + "rejectedToolNames": [] +} diff --git a/tests/compat/expected/toolcalls_canonical_tool_call.json b/tests/compat/expected/toolcalls_canonical_tool_call.json new file mode 100644 index 0000000..124de59 --- /dev/null +++ b/tests/compat/expected/toolcalls_canonical_tool_call.json @@ -0,0 +1,13 @@ +{ + "calls": [ + { + "name": "read_file", + "input": { + "path": "README.MD" + } + } + ], + "sawToolCallSyntax": true, + "rejectedByPolicy": false, + "rejectedToolNames": [] 
+} diff --git a/tests/compat/expected/toolcalls_function_call_tag.json b/tests/compat/expected/toolcalls_function_call_tag.json deleted file mode 100644 index 4643a9b..0000000 --- a/tests/compat/expected/toolcalls_function_call_tag.json +++ /dev/null @@ -1,6 +0,0 @@ -{ - "calls": [], - "sawToolCallSyntax": false, - "rejectedByPolicy": false, - "rejectedToolNames": [] -} diff --git a/tests/compat/expected/toolcalls_invoke_attr.json b/tests/compat/expected/toolcalls_invoke_attr.json deleted file mode 100644 index 4643a9b..0000000 --- a/tests/compat/expected/toolcalls_invoke_attr.json +++ /dev/null @@ -1,6 +0,0 @@ -{ - "calls": [], - "sawToolCallSyntax": false, - "rejectedByPolicy": false, - "rejectedToolNames": [] -} diff --git a/tests/compat/expected/toolcalls_xml_tool_call.json b/tests/compat/expected/toolcalls_xml_tool_call.json deleted file mode 100644 index 4643a9b..0000000 --- a/tests/compat/expected/toolcalls_xml_tool_call.json +++ /dev/null @@ -1,6 +0,0 @@ -{ - "calls": [], - "sawToolCallSyntax": false, - "rejectedByPolicy": false, - "rejectedToolNames": [] -} diff --git a/tests/compat/expected/toolcalls_xml_tool_name_parameters_json.json b/tests/compat/expected/toolcalls_xml_tool_name_parameters_json.json deleted file mode 100644 index 4643a9b..0000000 --- a/tests/compat/expected/toolcalls_xml_tool_name_parameters_json.json +++ /dev/null @@ -1,6 +0,0 @@ -{ - "calls": [], - "sawToolCallSyntax": false, - "rejectedByPolicy": false, - "rejectedToolNames": [] -} diff --git a/tests/compat/fixtures/toolcalls/canonical_nested_param.json b/tests/compat/fixtures/toolcalls/canonical_nested_param.json new file mode 100644 index 0000000..5dd0f9b --- /dev/null +++ b/tests/compat/fixtures/toolcalls/canonical_nested_param.json @@ -0,0 +1,6 @@ +{ + "text": "", + "tool_names": [ + "get_weather" + ] +} diff --git a/tests/compat/fixtures/toolcalls/canonical_tool_call.json b/tests/compat/fixtures/toolcalls/canonical_tool_call.json new file mode 100644 index 0000000..6d80e9b --- 
/dev/null +++ b/tests/compat/fixtures/toolcalls/canonical_tool_call.json @@ -0,0 +1,6 @@ +{ + "text": "README.MD", + "tool_names": [ + "read_file" + ] +} diff --git a/tests/compat/fixtures/toolcalls/function_call_tag.json b/tests/compat/fixtures/toolcalls/function_call_tag.json deleted file mode 100644 index a345ed2..0000000 --- a/tests/compat/fixtures/toolcalls/function_call_tag.json +++ /dev/null @@ -1,6 +0,0 @@ -{ - "text": "read_file{\"path\":\"README.MD\"}", - "tool_names": [ - "read_file" - ] -} \ No newline at end of file diff --git a/tests/compat/fixtures/toolcalls/invoke_attr.json b/tests/compat/fixtures/toolcalls/invoke_attr.json deleted file mode 100644 index 70c77fc..0000000 --- a/tests/compat/fixtures/toolcalls/invoke_attr.json +++ /dev/null @@ -1,6 +0,0 @@ -{ - "text": "{\"path\":\"README.MD\"}", - "tool_names": [ - "read_file" - ] -} \ No newline at end of file diff --git a/tests/compat/fixtures/toolcalls/xml_tool_call.json b/tests/compat/fixtures/toolcalls/xml_tool_call.json deleted file mode 100644 index b4ba281..0000000 --- a/tests/compat/fixtures/toolcalls/xml_tool_call.json +++ /dev/null @@ -1,6 +0,0 @@ -{ - "text": "read_file{\"path\":\"README.MD\"}", - "tool_names": [ - "read_file" - ] -} \ No newline at end of file diff --git a/tests/compat/fixtures/toolcalls/xml_tool_name_parameters_json.json b/tests/compat/fixtures/toolcalls/xml_tool_name_parameters_json.json deleted file mode 100644 index 22843dc..0000000 --- a/tests/compat/fixtures/toolcalls/xml_tool_name_parameters_json.json +++ /dev/null @@ -1,6 +0,0 @@ -{ - "text": "get_weather{\"city\":\"beijing\",\"unit\":\"c\"}", - "tool_names": [ - "get_weather" - ] -} diff --git a/tests/node/stream-tool-sieve.test.js b/tests/node/stream-tool-sieve.test.js index 80b4bd9..1e5012a 100644 --- a/tests/node/stream-tool-sieve.test.js +++ b/tests/node/stream-tool-sieve.test.js @@ -42,7 +42,7 @@ test('extractToolNames keeps only declared tool names (Go parity)', () => { }); test('parseToolCalls parses XML 
markup tool call', () => { - const payload = 'read_file{"path":"README.MD"}'; + const payload = 'README.MD'; const calls = parseToolCalls(payload, ['read_file']); assert.equal(calls.length, 1); assert.equal(calls[0].name, 'read_file'); @@ -61,7 +61,7 @@ test('parseToolCalls ignores tool_call payloads that exist only inside fenced co const text = [ 'I will call a tool now.', '```xml', - 'read_file{"path":"README.md"}', + 'README.md', '```', ].join('\n'); const calls = parseToolCalls(text, ['read_file']); @@ -69,7 +69,7 @@ test('parseToolCalls ignores tool_call payloads that exist only inside fenced co }); test('parseToolCalls keeps unknown schema names when toolNames is provided', () => { - const payload = 'not_in_schema{"q":"go"}'; + const payload = 'go'; const calls = parseToolCalls(payload, ['search']); assert.equal(calls.length, 1); assert.equal(calls[0].name, 'not_in_schema'); @@ -77,7 +77,7 @@ test('parseToolCalls keeps unknown schema names when toolNames is provided', () test('sieve emits tool_calls for XML tool call payload', () => { const events = runSieve( - ['read_file{"path":"README.MD"}'], + ['README.MD'], ['read_file'], ); const finalCalls = events.filter((evt) => evt.type === 'tool_calls').flatMap((evt) => evt.calls || []); @@ -88,8 +88,8 @@ test('sieve emits tool_calls for XML tool call payload', () => { test('sieve emits tool_calls when XML tag spans multiple chunks', () => { const events = runSieve( [ - 'read_file', - '{"path":"README.MD"}', + '', + 'README.MD', ], ['read_file'], ); @@ -103,10 +103,10 @@ test('sieve keeps long XML tool calls buffered until the closing tag arrives', ( const splitAt = longContent.length / 2; const events = runSieve( [ - '\n \n write_to_file\n \n \n \n \n \n \n', + ']]>\n \n', ], ['write_to_file'], ); @@ -147,7 +147,16 @@ test('sieve keeps embedded invalid tool-like json as normal text to avoid stream }); test('sieve passes malformed executable-looking XML through as text', () => { - const chunk = 
'{"path":"README.MD"}'; + const chunk = '{"path":"README.MD"}'; + const events = runSieve([chunk], ['read_file']); + const leakedText = collectText(events); + const hasToolCalls = events.some((evt) => evt.type === 'tool_calls' && evt.calls?.length > 0); + assert.equal(hasToolCalls, false); + assert.equal(leakedText, chunk); +}); + +test('sieve keeps bare tool_call XML as plain text without wrapper', () => { + const chunk = 'README.MD'; const events = runSieve([chunk], ['read_file']); const leakedText = collectText(events); const hasToolCalls = events.some((evt) => evt.type === 'tool_calls' && evt.calls?.length > 0); @@ -159,14 +168,13 @@ test('sieve flushes incomplete captured XML tool blocks by falling back to raw t const events = runSieve( [ '前置正文G。', - '\n', - ' \n', - ' read_file\n', + '\n', + ' \n', ], ['read_file'], ); const leakedText = collectText(events); - const expected = ['前置正文G。', '\n', ' \n', ' read_file\n'].join(''); + const expected = ['前置正文G。', '\n', ' \n'].join(''); const hasToolCalls = events.some((evt) => evt.type === 'tool_calls' && evt.calls?.length > 0); assert.equal(hasToolCalls, false); assert.equal(leakedText, expected); @@ -176,7 +184,7 @@ test('sieve captures XML wrapper tags with attributes without leaking wrapper te const events = runSieve( [ '前置正文H。', - 'read_file{"path":"README.MD"}', + 'README.MD', '后置正文I。', ], ['read_file'], @@ -186,8 +194,8 @@ test('sieve captures XML wrapper tags with attributes without leaking wrapper te assert.equal(hasToolCall, true); assert.equal(leakedText.includes('前置正文H。'), true); assert.equal(leakedText.includes('后置正文I。'), true); - assert.equal(leakedText.includes(''), false); - assert.equal(leakedText.includes(''), false); + assert.equal(leakedText.includes(''), false); + assert.equal(leakedText.includes(''), false); }); test('sieve keeps plain text intact in tool mode when no tool call appears', () => { @@ -270,7 +278,7 @@ test('formatOpenAIStreamToolCalls reuses ids with the same idStore', () => { }); 
test('parseToolCalls rejects mismatched markup tags', () => { - const payload = 'read_file{"path":"README.md"}'; + const payload = 'README.md'; const calls = parseToolCalls(payload, ['read_file']); assert.equal(calls.length, 0); }); diff --git a/tests/raw_stream_samples/continue-thinking-snapshot-replay-20260405/meta.json b/tests/raw_stream_samples/continue-thinking-snapshot-replay-20260405/meta.json index e878836..02d9cd4 100644 --- a/tests/raw_stream_samples/continue-thinking-snapshot-replay-20260405/meta.json +++ b/tests/raw_stream_samples/continue-thinking-snapshot-replay-20260405/meta.json @@ -5,7 +5,7 @@ "request": { "chat_session_id": "0a3c904d-5761-4cf0-ae51-9b41c1c78f1e", "parent_message_id": null, - "prompt": "<|System|>\n**Memories**\nThese are memories stored via the memory_tool that you can reference in future conversations.\n[]\n\n\n**Recent Chats**\nThese are some of the user's recent conversations. You can use them to understand user preferences:\n[\n {\n \"title\": \"\",\n \"last_chat\": \"2026年4月6日\"\n },\n {\n \"title\": \"\",\n \"last_chat\": \"2026年4月6日\"\n },\n {\n \"title\": \"江青判刑原因\",\n \"last_chat\": \"2026年4月5日\"\n },\n {\n \"title\": \"GitHub個人檔案\",\n \"last_chat\": \"2026年4月4日\"\n },\n {\n \"title\": \"DS2API架構圖\",\n \"last_chat\": \"2026年4月4日\"\n },\n {\n \"title\": \"Markdown範例\",\n \"last_chat\": \"2026年4月4日\"\n },\n {\n \"title\": \"廣州天氣概況\",\n \"last_chat\": \"2026年4月4日\"\n },\n {\n \"title\": \"Xbox手把SVG\",\n \"last_chat\": \"2026年4月4日\"\n },\n {\n \"title\": \"清除记忆\",\n \"last_chat\": \"2026年4月4日\"\n },\n {\n \"title\": \"SVG與安卓XML示例\",\n \"last_chat\": \"2026年4月4日\"\n }\n]\n\n\n\n\n\n\n\n\n\n\nYou have access to these tools:\n\nTool: memory_tool\nDescription: The memory tool stores long-term information across conversations.\nUse `action` to control the operation: `create` (add), `edit` (update), `delete` (remove).\n- No relevant record: `create` + `content`\n- Existing relevant record: `edit` + `id` + `content`\n- 
Outdated/irrelevant record: `delete` + `id`\nMemories will automatically appear in the tag in later conversations.\nDo not store sensitive information (e.g., ethnicity, religion, sexual orientation, political views, sex life, criminal records).\nYou may store: preferred name, preferences, plans, work-related notes, chat style preferences, first chat time, etc.\nDo not show memory content directly in the conversation unless the user explicitly asks.\nToday is 2026年4月6日.\nSimilar memories should be merged; prefer updating existing records.\n\nExamples:\n{\"action\":\"create\",\"content\":\"User prefers brief replies and is more active on weekends.\"}\n{\"action\":\"edit\",\"id\":12,\"content\":\"User’s preferred name updated to “A-Xing”, prefers Chinese replies.\"}\n{\"action\":\"delete\",\"id\":7}\nParameters: {\"properties\":{\"action\":{\"description\":\"Operation to perform: create, edit, or delete\",\"enum\":[\"create\",\"edit\",\"delete\"],\"type\":\"string\"},\"content\":{\"description\":\"The content of the memory record (required for create/edit)\",\"type\":\"string\"},\"id\":{\"description\":\"The id of the memory record (required for edit/delete)\",\"type\":\"integer\"}},\"required\":[\"action\"],\"type\":\"object\"}\n\nTool: search_web\nDescription: Search the web for up-to-date or specific information.\nUse this when the user asks for the latest news, current facts, or needs verification.\nGenerate focused keywords and run multiple searches if needed.\nToday is 2026年4月6日.\n\nResponse format:\n- items[].id (short id), title, url, text\n\nCitations:\n- After using results, add `[citation,domain](id)` after the sentence.\n- Multiple citations are allowed.\n- If no results are cited, omit citations.\n\nExample:\nThe capital of France is Paris. [citation,example.com](abc123)\nThe population is about 2.1 million. 
[citation,example.com](abc123) [citation,example2.com](def456)\nParameters: {\"properties\":{\"query\":{\"description\":\"search keyword\",\"type\":\"string\"},\"topic\":{\"description\":\"search topic (one of `general`, `news`, `finance`)\",\"enum\":[\"general\",\"news\",\"finance\"],\"type\":\"string\"}},\"required\":[\"query\"],\"type\":\"object\"}\n\nTool: scrape_web\nDescription: Scrape a URL for detailed page content.\nUse this when the user requests content from a specific page or when search snippets are insufficient.\nAvoid using it for common questions unless the user asks.\nParameters: {\"properties\":{\"url\":{\"description\":\"url to scrape\",\"type\":\"string\"}},\"required\":[\"url\"],\"type\":\"object\"}\n\nTool: eval_javascript\nDescription: Execute JavaScript code using QuickJS engine (ES2020). The result is the value of the last expression in the code. For calculations with decimals, use toFixed() to control precision. Console output (log/info/warn/error) is captured and returned in 'logs' field. No DOM or Node.js APIs available. Example: '1 + 2' returns 3; 'const x = 5; x * 2' returns 10.\nParameters: {\"properties\":{\"code\":{\"description\":\"The JavaScript code to execute\",\"type\":\"string\"}},\"required\":[\"code\"],\"type\":\"object\"}\n\nTool: get_time_info\nDescription: Get the current local date and time info from the device. Returns year/month/day, weekday, ISO date/time strings, timezone, and timestamp.\nParameters: {\"properties\":{},\"type\":\"object\"}\n\nTool: clipboard_tool\nDescription: Read or write plain text from the device clipboard. Use action: read or write. For write, provide text. 
Do NOT write to the clipboard unless the user has explicitly requested it.\nParameters: {\"properties\":{\"action\":{\"description\":\"Operation to perform: read or write\",\"enum\":[\"read\",\"write\"],\"type\":\"string\"},\"text\":{\"description\":\"Text to write to the clipboard (required for write)\",\"type\":\"string\"}},\"required\":[\"action\"],\"type\":\"object\"}\n\nTool: text_to_speech\nDescription: Speak text aloud to the user using the device's text-to-speech engine. Use this when the user asks you to read something aloud, or when audio output is appropriate. The tool returns immediately; audio plays in the background on the device. Provide natural, readable text without markdown formatting.\nParameters: {\"properties\":{\"text\":{\"description\":\"The text to speak aloud\",\"type\":\"string\"}},\"required\":[\"text\"],\"type\":\"object\"}\n\nTool: ask_user\nDescription: Ask the user one or more questions when you need clarification, additional information, or confirmation. Each question can optionally provide a list of suggested options for the user to choose from. The user may select an option or provide their own free-text answer for each question. 
The answers will be returned as a JSON object mapping question IDs to the user's responses.\nParameters: {\"properties\":{\"questions\":{\"description\":\"List of questions to ask the user\",\"items\":{\"properties\":{\"id\":{\"description\":\"Unique identifier for this question\",\"type\":\"string\"},\"options\":{\"description\":\"Optional list of suggested options for the user to choose from\",\"items\":{\"type\":\"string\"},\"type\":\"array\"},\"question\":{\"description\":\"The question text to display to the user\",\"type\":\"string\"},\"selection_type\":{\"description\":\"Answer type: text (free text input, default), single (select exactly one option), multi (select one or more options)\",\"enum\":[\"text\",\"single\",\"multi\"],\"type\":\"string\"}},\"required\":[\"id\",\"question\"],\"type\":\"object\"},\"type\":\"array\"}},\"required\":[\"questions\"],\"type\":\"object\"}\n\nTOOL CALL FORMAT — FOLLOW EXACTLY:\n\nWhen calling tools, emit ONLY raw XML at the very end of your response. No text before, no text after, no markdown fences.\n\n\n \n TOOL_NAME_HERE\n {\"key\":\"value\"}\n \n\n\nRULES:\n1) Output ONLY the XML above when calling tools. Do NOT mix tool XML with regular text.\n2) MUST contain a strict JSON object. All JSON keys and strings use double quotes.\n3) Multiple tools → multiple blocks inside ONE root.\n4) Do NOT wrap the XML in markdown code fences (no triple backticks).\n5) After receiving a tool result, use it directly. Only call another tool if the result is insufficient.\n6) Parameters MUST use the exact field names from the selected tool schema.\n7) CRITICAL: Do NOT invent or add any extra fields (such as \"_raw\", \"_xml\"). Use ONLY the fields strictly defined in the schema. Extra fields will cause execution failure.\n\n❌ WRONG — Do NOT do these:\nWrong 1 — mixed text and XML:\n I'll read the file for you. 
...\nWrong 2 — describing tool calls in text:\n [调用 Bash] {\"command\": \"ls\"}\nWrong 3 — missing wrapper:\n read_file{}\nWrong 4 — extra/invented fields:\n {\"_raw\": \"...\", \"command\": \"ls\"}\n\n\n✅ CORRECT EXAMPLES:\n\nExample A — Single tool:\n\n \n read_file\n {\"path\":\"src/main.go\"}\n \n\n\nExample B — Two tools in parallel:\n\n \n read_file\n {\"path\":\"src/main.go\"}\n \n \n write_to_file\n {\"path\":\"output.txt\",\"content\":\"Hello world\"}\n \n\n\nExample C — Tool with complex nested JSON parameters:\n\n \n ask_followup_question\n {\"question\":\"Which approach do you prefer?\",\"follow_up\":[{\"text\":\"Option A\"},{\"text\":\"Option B\"}]}\n \n\n\nRemember: Output ONLY the ... XML block when calling tools.<|end▁of▁instructions|>\n\n<|User|>\n<|User|>\n在一个类似2022×2022的花园的每个方格中,最初都有一个高度为0的树,园丁和伐木工交替进行以下游戏,园丁首先开始:园丁选择花园中的一个方格,该方格上的每棵树以及周围至多八个方格中的所有树都会增长一单位,伐木工随后选择板上的四个不同方格,这些方格上正高的树都会减少一单位,称一棵树为雄伟的,如果其高度至少为10的六次方.确定园丁能够确保板上最终有K棵雄伟的树,无论伐木工如何操作,求最大的K<|end▁of▁sentence|><|end▁of▁sentence|>", + "prompt": "<|System|>\n**Memories**\nThese are memories stored via the memory_tool that you can reference in future conversations.\n[]\n\n\n**Recent Chats**\nThese are some of the user's recent conversations. 
You can use them to understand user preferences:\n[\n {\n \"title\": \"\",\n \"last_chat\": \"2026年4月6日\"\n },\n {\n \"title\": \"\",\n \"last_chat\": \"2026年4月6日\"\n },\n {\n \"title\": \"江青判刑原因\",\n \"last_chat\": \"2026年4月5日\"\n },\n {\n \"title\": \"GitHub個人檔案\",\n \"last_chat\": \"2026年4月4日\"\n },\n {\n \"title\": \"DS2API架構圖\",\n \"last_chat\": \"2026年4月4日\"\n },\n {\n \"title\": \"Markdown範例\",\n \"last_chat\": \"2026年4月4日\"\n },\n {\n \"title\": \"廣州天氣概況\",\n \"last_chat\": \"2026年4月4日\"\n },\n {\n \"title\": \"Xbox手把SVG\",\n \"last_chat\": \"2026年4月4日\"\n },\n {\n \"title\": \"清除记忆\",\n \"last_chat\": \"2026年4月4日\"\n },\n {\n \"title\": \"SVG與安卓XML示例\",\n \"last_chat\": \"2026年4月4日\"\n }\n]\n\n\n\n\n\n\n\n\n\n\nYou have access to these tools:\n\nTool: memory_tool\nDescription: The memory tool stores long-term information across conversations.\nUse `action` to control the operation: `create` (add), `edit` (update), `delete` (remove).\n- No relevant record: `create` + `content`\n- Existing relevant record: `edit` + `id` + `content`\n- Outdated/irrelevant record: `delete` + `id`\nMemories will automatically appear in the tag in later conversations.\nDo not store sensitive information (e.g., ethnicity, religion, sexual orientation, political views, sex life, criminal records).\nYou may store: preferred name, preferences, plans, work-related notes, chat style preferences, first chat time, etc.\nDo not show memory content directly in the conversation unless the user explicitly asks.\nToday is 2026年4月6日.\nSimilar memories should be merged; prefer updating existing records.\n\nExamples:\n{\"action\":\"create\",\"content\":\"User prefers brief replies and is more active on weekends.\"}\n{\"action\":\"edit\",\"id\":12,\"content\":\"User’s preferred name updated to “A-Xing”, prefers Chinese replies.\"}\n{\"action\":\"delete\",\"id\":7}\nParameters: {\"properties\":{\"action\":{\"description\":\"Operation to perform: create, edit, or 
delete\",\"enum\":[\"create\",\"edit\",\"delete\"],\"type\":\"string\"},\"content\":{\"description\":\"The content of the memory record (required for create/edit)\",\"type\":\"string\"},\"id\":{\"description\":\"The id of the memory record (required for edit/delete)\",\"type\":\"integer\"}},\"required\":[\"action\"],\"type\":\"object\"}\n\nTool: search_web\nDescription: Search the web for up-to-date or specific information.\nUse this when the user asks for the latest news, current facts, or needs verification.\nGenerate focused keywords and run multiple searches if needed.\nToday is 2026年4月6日.\n\nResponse format:\n- items[].id (short id), title, url, text\n\nCitations:\n- After using results, add `[citation,domain](id)` after the sentence.\n- Multiple citations are allowed.\n- If no results are cited, omit citations.\n\nExample:\nThe capital of France is Paris. [citation,example.com](abc123)\nThe population is about 2.1 million. [citation,example.com](abc123) [citation,example2.com](def456)\nParameters: {\"properties\":{\"query\":{\"description\":\"search keyword\",\"type\":\"string\"},\"topic\":{\"description\":\"search topic (one of `general`, `news`, `finance`)\",\"enum\":[\"general\",\"news\",\"finance\"],\"type\":\"string\"}},\"required\":[\"query\"],\"type\":\"object\"}\n\nTool: scrape_web\nDescription: Scrape a URL for detailed page content.\nUse this when the user requests content from a specific page or when search snippets are insufficient.\nAvoid using it for common questions unless the user asks.\nParameters: {\"properties\":{\"url\":{\"description\":\"url to scrape\",\"type\":\"string\"}},\"required\":[\"url\"],\"type\":\"object\"}\n\nTool: eval_javascript\nDescription: Execute JavaScript code using QuickJS engine (ES2020). The result is the value of the last expression in the code. For calculations with decimals, use toFixed() to control precision. Console output (log/info/warn/error) is captured and returned in 'logs' field. 
No DOM or Node.js APIs available. Example: '1 + 2' returns 3; 'const x = 5; x * 2' returns 10.\nParameters: {\"properties\":{\"code\":{\"description\":\"The JavaScript code to execute\",\"type\":\"string\"}},\"required\":[\"code\"],\"type\":\"object\"}\n\nTool: get_time_info\nDescription: Get the current local date and time info from the device. Returns year/month/day, weekday, ISO date/time strings, timezone, and timestamp.\nParameters: {\"properties\":{},\"type\":\"object\"}\n\nTool: clipboard_tool\nDescription: Read or write plain text from the device clipboard. Use action: read or write. For write, provide text. Do NOT write to the clipboard unless the user has explicitly requested it.\nParameters: {\"properties\":{\"action\":{\"description\":\"Operation to perform: read or write\",\"enum\":[\"read\",\"write\"],\"type\":\"string\"},\"text\":{\"description\":\"Text to write to the clipboard (required for write)\",\"type\":\"string\"}},\"required\":[\"action\"],\"type\":\"object\"}\n\nTool: text_to_speech\nDescription: Speak text aloud to the user using the device's text-to-speech engine. Use this when the user asks you to read something aloud, or when audio output is appropriate. The tool returns immediately; audio plays in the background on the device. Provide natural, readable text without markdown formatting.\nParameters: {\"properties\":{\"text\":{\"description\":\"The text to speak aloud\",\"type\":\"string\"}},\"required\":[\"text\"],\"type\":\"object\"}\n\nTool: ask_user\nDescription: Ask the user one or more questions when you need clarification, additional information, or confirmation. Each question can optionally provide a list of suggested options for the user to choose from. The user may select an option or provide their own free-text answer for each question. 
The answers will be returned as a JSON object mapping question IDs to the user's responses.\nParameters: {\"properties\":{\"questions\":{\"description\":\"List of questions to ask the user\",\"items\":{\"properties\":{\"id\":{\"description\":\"Unique identifier for this question\",\"type\":\"string\"},\"options\":{\"description\":\"Optional list of suggested options for the user to choose from\",\"items\":{\"type\":\"string\"},\"type\":\"array\"},\"question\":{\"description\":\"The question text to display to the user\",\"type\":\"string\"},\"selection_type\":{\"description\":\"Answer type: text (free text input, default), single (select exactly one option), multi (select one or more options)\",\"enum\":[\"text\",\"single\",\"multi\"],\"type\":\"string\"}},\"required\":[\"id\",\"question\"],\"type\":\"object\"},\"type\":\"array\"}},\"required\":[\"questions\"],\"type\":\"object\"}\n\nTOOL CALL FORMAT — FOLLOW EXACTLY:\n\n\n \n \n \n\n\nRULES:\n1) Use the XML wrapper format only.\n2) Put one or more entries under a single root.\n3) Use for the tool name and for each argument.\n4) All string values should use when they may contain code, markup, JSON, paths, prompts, or other special characters.\n5) Objects use nested XML inside a ; arrays may repeat children.\n6) Numbers, booleans, and null stay plain text.\n7) Use only the parameter names in the tool schema. Do not invent fields.\n8) Do NOT wrap XML in markdown fences. Do NOT output explanations, role markers, or internal monologue.\n\nPARAMETER SHAPES:\n- string => \n- object => ...\n- array => ...\n- number/bool/null => plain text\n\n【WRONG — Do NOT do these】:\n\nWrong 1 — mixed text after XML:\n ... I hope this helps.\nWrong 2 — old canonical tags or raw payloads:\n read_file{\"path\":\"x\"}\nWrong 3 — Markdown code fences:\n ```xml\n ...\n ```\n\nRemember: The ONLY valid way to use tools is the ... 
XML block at the end of your response.\n\n【CORRECT EXAMPLES】:\n\nExample A — Single tool:\n\n \n \n \n\n\nExample B — Two tools in parallel:\n\n \n \n \n \n \n \n \n\n\nExample C — Tool with nested XML parameters:\n\n \n \n \n \n\n<|end▁of▁instructions|>\n\n<|User|>\n<|User|>\n在一个类似2022×2022的花园的每个方格中,最初都有一个高度为0的树,园丁和伐木工交替进行以下游戏,园丁首先开始:园丁选择花园中的一个方格,该方格上的每棵树以及周围至多八个方格中的所有树都会增长一单位,伐木工随后选择板上的四个不同方格,这些方格上正高的树都会减少一单位,称一棵树为雄伟的,如果其高度至少为10的六次方.确定园丁能够确保板上最终有K棵雄伟的树,无论伐木工如何操作,求最大的K<|end▁of▁sentence|><|end▁of▁sentence|>", "ref_file_ids": [], "search_enabled": false, "thinking_enabled": true