mirror of
https://github.com/CJackHwang/ds2api.git
synced 2026-05-10 03:07:41 +08:00
fix: preserve partial-update fields for current_input_file and thinking_injection, expand DSML space-separator aliases
- Guard current_input_file.enabled / thinking_injection.{enabled,prompt} with hasNestedSettingsKey so partial updates don't overwrite omitted fields
- Expand DSML alias support to tolerate space-separated tags (e.g. <|dsml invoke>) alongside pipe-separated forms
- Sync Go sieve, Node sieve, toolcall parser, and tests for all new DSML variants
- Update API.md and toolcall-semantics.md with expanded alias coverage
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
API.md
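@@ -37,7 +37,7 @@
 
 - The OpenAI / Claude / Gemini protocol sets are all mounted on a single `chi` router tree, assembled by `internal/server/router.go`.
 - The adapter layer's responsibilities are narrowed to **request normalization → DeepSeek call → protocol-shape rendering**, reducing the divergence in earlier versions where the same capability was implemented in several places.
-- The Tool Calling parsing strategy is kept consistent between the Go and Node runtimes: models are advised to output the DSML shell `<|DSML|tool_calls>` → `<|DSML|invoke name="...">` → `<|DSML|parameter name="...">`; the compatibility layer also accepts the DSML wrapper aliases `<dsml|tool_calls>`, `<|tool_calls>`, `<|tool_calls>` and legacy canonical XML `<tool_calls>` → `<invoke name="...">` → `<parameter name="...">`. Internally, XML parsing semantics remain authoritative, and leak-prevention sieving runs in streaming scenarios.
+- The Tool Calling parsing strategy is kept consistent between the Go and Node runtimes: models are advised to output the DSML shell `<|DSML|tool_calls>` → `<|DSML|invoke name="...">` → `<|DSML|parameter name="...">`; the compatibility layer also accepts the DSML wrapper aliases `<dsml|tool_calls>`, `<|tool_calls>`, `<|tool_calls>`, common forms with a dropped DSML separator (such as `<|DSML tool_calls>`), and legacy canonical XML `<tool_calls>` → `<invoke name="...">` → `<parameter name="...">`. Internally, XML parsing semantics remain authoritative, and leak-prevention sieving runs in streaming scenarios.
 - `Admin API` separates configuration from runtime policy: `/admin/config*` manages static configuration, and `/admin/settings*` manages runtime behavior.
 
 ---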
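@@ -344,7 +344,7 @@ data: [DONE]
 Additional notes:
 
 - In a **non-code-block context**, a tool payload is recognized by its features and yields an executable tool call even when it is mixed with ordinary text (the surrounding ordinary text is still passed through).
-- The parser currently parses the DSML shell (`<|DSML|tool_calls>` / `<|DSML|invoke name="...">` / `<|DSML|parameter name="...">`), the DSML wrapper aliases (`<dsml|tool_calls>`, `<|tool_calls>`, `<|tool_calls>`), and legacy canonical XML tool blocks (`<tool_calls>` / `<invoke name="...">` / `<parameter name="...">`) as executable calls; DSML is first normalized back to XML, and XML parsing semantics remain authoritative internally. Legacy `<tools>`, `<tool_call>`, `<tool_name>`, `<param>`, `<function_call>`, `tool_use`, antml-style, and plain JSON `tool_calls` fragments are all treated as ordinary text by default.
+- The parser currently parses the DSML shell (`<|DSML|tool_calls>` / `<|DSML|invoke name="...">` / `<|DSML|parameter name="...">`), the DSML wrapper aliases (`<dsml|tool_calls>`, `<|tool_calls>`, `<|tool_calls>`), common forms with a dropped DSML separator (such as `<|DSML tool_calls>` / `<|DSML invoke>` / `<|DSML parameter>`), and legacy canonical XML tool blocks (`<tool_calls>` / `<invoke name="...">` / `<parameter name="...">`) as executable calls; DSML is first normalized back to XML, and XML parsing semantics remain authoritative internally. Legacy `<tools>`, `<tool_call>`, `<tool_name>`, `<param>`, `<function_call>`, `tool_use`, antml-style, and plain JSON `tool_calls` fragments are all treated as ordinary text by default.
 - When the final visible body is empty but the chain of thought contains an executable tool call, Chat / Responses emit the standard OpenAI `tool_calls` / `function_call` output during finalization; if the client has not enabled thinking / reasoning, that chain of thought is used only for detection and is not exposed as visible body text or `reasoning_content`.
 - `tool_calls` inside a Markdown fenced code block (for example ```json ... ```) are treated only as example text and are not executed.
 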
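@@ -39,6 +39,7 @@
 Compatibility fixes:
 
 - If the model omits the opening wrapper but still outputs one or more invokes and ends with a closing wrapper, the Go parsing path restores the missing opening wrapper before parsing.
+- If the model writes a space in place of the `|` separator in a DSML tag (for example `<|DSML tool_calls>` / `<|DSML invoke>` / `<|DSML parameter>`, or the `<DSML tool_calls>` form with no leading pipe), Go / Node normalize it, but only for the fixed set of tool tag names; similar names that are not tool tags (such as `tool_calls_extra`) are still treated as ordinary text.
 - This is a narrow fix for a common model mistake and does not change the recommended output format; the prompt still requires the model to output the complete DSML shell directly.
 
 ## 2) Non-compatible content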
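The space-separator tolerance described in the compatibility fixes can be illustrated with a small Go sketch. It is not the project's implementation: the commit uses explicit alias tables and CDATA-aware helpers, whereas this sketch uses a single regular expression and ignores CDATA. The names `dsmlToolTag` and `normalizeDSML` are hypothetical. The point it shows is the word boundary after the tag name, which is what keeps a lookalike such as `tool_calls_extra` as plain text.

```go
package main

import (
	"fmt"
	"regexp"
)

// dsmlToolTag matches the start of an opening or closing DSML-style tag whose
// name is exactly one of the three tool tag names. The leading "|" is optional
// and the separator after "dsml" may be "|" or a space. The trailing \b rejects
// lookalike names such as tool_calls_extra. (Hypothetical helper, CDATA is ignored.)
var dsmlToolTag = regexp.MustCompile(`(?i)<(/?)\|?dsml[| ](tool_calls|invoke|parameter)\b`)

// normalizeDSML rewrites matched DSML tag prefixes to their canonical XML form.
func normalizeDSML(s string) string {
	return dsmlToolTag.ReplaceAllString(s, "<${1}${2}")
}

func main() {
	typo := `<|DSML tool_calls><|DSML invoke name="Read"></|DSML invoke></|DSML tool_calls>`
	lookalike := `<|DSML tool_calls_extra></|DSML tool_calls_extra>`
	fmt.Println(normalizeDSML(typo))
	fmt.Println(normalizeDSML(lookalike) == lookalike)
}
```

A table of literal aliases, as in the commit, is easier to keep in sync between Go and Node; the regex form is only a compact way to see which inputs are meant to be accepted.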
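@@ -51,7 +52,7 @@
 
 In the streaming path (identical for Go / Node):
 
-- The DSML `<|DSML|tool_calls>` wrapper and its compatible variants (`<dsml|tool_calls>`, `<|tool_calls>`, `<|tool_calls>`) and the canonical `<tool_calls>` wrapper all enter structured capture
+- The DSML `<|DSML|tool_calls>` wrapper, the compatible variants (`<dsml|tool_calls>`, `<|tool_calls>`, `<|tool_calls>`), the narrowly tolerated space-separated forms (such as `<|DSML tool_calls>`), and the canonical `<tool_calls>` wrapper all enter structured capture
 - If the stream starts directly with an invoke but a closing wrapper follows later, the Go streaming sieve also attempts recovery via the missing-opening-wrapper repair path
 - Tool calls that have been recognized successfully do not flow back into ordinary text
 - Blocks that do not match the new format are not executed and continue to pass through as verbatim text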
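One ingredient of leak-free streaming is not emitting a chunk tail that might still grow into a wrapper tag. The Go sketch below shows only that hold-back step, under the assumption that the sieve buffers such tails; the source does not spell out how the real sieve does this. `openTags` is an illustrative subset of the prefixes in this commit, and `splitHoldback` is a hypothetical name. Switching into structured capture once a full tag has arrived is out of scope here.

```go
package main

import (
	"fmt"
	"strings"
)

// openTags lists lowercase wrapper prefixes taken from this commit. The list is
// illustrative, not exhaustive.
var openTags = []string{
	"<|dsml|tool_calls", "<|dsml tool_calls",
	"<dsml|tool_calls", "<dsml tool_calls",
	"<tool_calls",
}

// splitHoldback returns the part of buf that is safe to emit as text and the
// tail that should be held back because it is a case-insensitive prefix of a
// wrapper tag and could be completed by the next chunk.
func splitHoldback(buf string) (emit, hold string) {
	for i := 0; i < len(buf); i++ {
		if buf[i] != '<' {
			continue
		}
		tail := buf[i:]
		for _, tag := range openTags {
			if len(tail) <= len(tag) && strings.EqualFold(tag[:len(tail)], tail) {
				return buf[:i], tail
			}
		}
	}
	return buf, ""
}

func main() {
	emit, hold := splitHoldback("Reading the file.\n<|DSML to")
	fmt.Printf("emit=%q hold=%q\n", emit, hold)
}
```

Text such as `a < b` passes through unchanged because no tag starts with `< b`, so ordinary prose containing `<` is not delayed.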
@@ -87,7 +88,7 @@ node --test tests/node/stream-tool-sieve.test.js
 
 - The DSML `<|DSML|tool_calls>` wrapper parses correctly
 - The legacy canonical `<tool_calls>` wrapper parses correctly
-- Alias variants (`<dsml|tool_calls>`, `<|tool_calls>`, `<|tool_calls>`) parse correctly
+- Alias variants (`<dsml|tool_calls>`, `<|tool_calls>`, `<|tool_calls>`) and the DSML space-separator typo (such as `<|DSML tool_calls>`) parse correctly
 - Mixed tags (DSML wrapper + canonical inner) parse correctly after normalization
 - Examples inside a tilde fence `~~~` are not executed
 - Examples inside nested fences (3 backticks nested inside 4 backticks) are not executed
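The fence cases in this checklist (tilde fences, and a 3-backtick fence nested inside a 4-backtick fence) follow from one rule: a fence closes only on a run of the same character that is at least as long as the opening run. The Go sketch below applies that rule line by line. It is a simplified assumption of how examples can be excluded before parsing, not the project's `stripFencedCodeBlocks`; `fenceRun` and `stripFencedBlocks` are hypothetical names, and indentation limits are not enforced.

```go
package main

import (
	"fmt"
	"strings"
)

// fenceRun returns the fence character and run length when s starts with a run
// of backticks or tildes, and (0, 0) otherwise.
func fenceRun(s string) (byte, int) {
	if s == "" || (s[0] != '`' && s[0] != '~') {
		return 0, 0
	}
	n := 0
	for n < len(s) && s[n] == s[0] {
		n++
	}
	return s[0], n
}

// stripFencedBlocks drops fence lines and everything between them. An
// unterminated fence swallows the rest of the text.
func stripFencedBlocks(text string) string {
	var out []string
	var fenceChar byte
	fenceLen := 0 // 0 means "not inside a fence"
	for _, line := range strings.Split(text, "\n") {
		trimmed := strings.TrimLeft(line, " ")
		ch, n := fenceRun(trimmed)
		if fenceLen == 0 {
			if n >= 3 {
				fenceChar, fenceLen = ch, n // opening fence
				continue
			}
			out = append(out, line)
			continue
		}
		// Inside a fence: close only on the same character, a run at least as
		// long as the opener, and nothing but whitespace after it.
		if ch == fenceChar && n >= fenceLen && strings.TrimSpace(trimmed[n:]) == "" {
			fenceLen = 0
		}
	}
	return strings.Join(out, "\n")
}

func main() {
	doc := "before\n````md\n```json\n<tool_calls>\n```\n````\nafter"
	fmt.Printf("%q\n", stripFencedBlocks(doc))
}
```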
@@ -244,6 +244,52 @@ func TestUpdateSettingsCurrentInputFile(t *testing.T) {
|
||||
}
|
||||
}
|
||||
|
||||
func TestUpdateSettingsCurrentInputFilePartialUpdatePreservesEnabled(t *testing.T) {
|
||||
h := newAdminTestHandler(t, `{"keys":["k1"],"current_input_file":{"enabled":false,"min_chars":777}}`)
|
||||
payload := map[string]any{
|
||||
"current_input_file": map[string]any{
|
||||
"min_chars": 5000,
|
||||
},
|
||||
}
|
||||
b, _ := json.Marshal(payload)
|
||||
req := httptest.NewRequest(http.MethodPut, "/admin/settings", bytes.NewReader(b))
|
||||
rec := httptest.NewRecorder()
|
||||
h.updateSettings(rec, req)
|
||||
if rec.Code != http.StatusOK {
|
||||
t.Fatalf("expected 200, got %d body=%s", rec.Code, rec.Body.String())
|
||||
}
|
||||
snap := h.Store.Snapshot()
|
||||
if snap.CurrentInputFile.Enabled == nil || *snap.CurrentInputFile.Enabled {
|
||||
t.Fatalf("expected current_input_file.enabled to remain false, got %#v", snap.CurrentInputFile.Enabled)
|
||||
}
|
||||
if snap.CurrentInputFile.MinChars != 5000 {
|
||||
t.Fatalf("expected current_input_file.min_chars=5000, got %#v", snap.CurrentInputFile)
|
||||
}
|
||||
}
|
||||
|
||||
func TestUpdateSettingsCurrentInputFilePartialUpdatePreservesMinChars(t *testing.T) {
|
||||
h := newAdminTestHandler(t, `{"keys":["k1"],"current_input_file":{"enabled":false,"min_chars":777}}`)
|
||||
payload := map[string]any{
|
||||
"current_input_file": map[string]any{
|
||||
"enabled": true,
|
||||
},
|
||||
}
|
||||
b, _ := json.Marshal(payload)
|
||||
req := httptest.NewRequest(http.MethodPut, "/admin/settings", bytes.NewReader(b))
|
||||
rec := httptest.NewRecorder()
|
||||
h.updateSettings(rec, req)
|
||||
if rec.Code != http.StatusOK {
|
||||
t.Fatalf("expected 200, got %d body=%s", rec.Code, rec.Body.String())
|
||||
}
|
||||
snap := h.Store.Snapshot()
|
||||
if snap.CurrentInputFile.Enabled == nil || !*snap.CurrentInputFile.Enabled {
|
||||
t.Fatalf("expected current_input_file.enabled=true, got %#v", snap.CurrentInputFile.Enabled)
|
||||
}
|
||||
if snap.CurrentInputFile.MinChars != 777 {
|
||||
t.Fatalf("expected current_input_file.min_chars to remain 777, got %#v", snap.CurrentInputFile)
|
||||
}
|
||||
}
|
||||
|
||||
func TestUpdateSettingsRejectsTwoSplitModesEnabled(t *testing.T) {
|
||||
h := newAdminTestHandler(t, `{"keys":["k1"]}`)
|
||||
payload := map[string]any{
|
||||
@@ -292,6 +338,52 @@ func TestUpdateSettingsThinkingInjection(t *testing.T) {
|
||||
}
|
||||
}
|
||||
|
||||
func TestUpdateSettingsThinkingInjectionPartialPromptPreservesEnabled(t *testing.T) {
|
||||
h := newAdminTestHandler(t, `{"keys":["k1"],"thinking_injection":{"enabled":false,"prompt":"original prompt"}}`)
|
||||
payload := map[string]any{
|
||||
"thinking_injection": map[string]any{
|
||||
"prompt": " updated prompt ",
|
||||
},
|
||||
}
|
||||
b, _ := json.Marshal(payload)
|
||||
req := httptest.NewRequest(http.MethodPut, "/admin/settings", bytes.NewReader(b))
|
||||
rec := httptest.NewRecorder()
|
||||
h.updateSettings(rec, req)
|
||||
if rec.Code != http.StatusOK {
|
||||
t.Fatalf("expected 200, got %d body=%s", rec.Code, rec.Body.String())
|
||||
}
|
||||
snap := h.Store.Snapshot()
|
||||
if snap.ThinkingInjection.Enabled == nil || *snap.ThinkingInjection.Enabled {
|
||||
t.Fatalf("expected thinking_injection.enabled to remain false, got %#v", snap.ThinkingInjection.Enabled)
|
||||
}
|
||||
if got := h.Store.ThinkingInjectionPrompt(); got != "updated prompt" {
|
||||
t.Fatalf("expected updated prompt, got %q", got)
|
||||
}
|
||||
}
|
||||
|
||||
func TestUpdateSettingsThinkingInjectionPartialEnabledPreservesPrompt(t *testing.T) {
|
||||
h := newAdminTestHandler(t, `{"keys":["k1"],"thinking_injection":{"enabled":false,"prompt":"original prompt"}}`)
|
||||
payload := map[string]any{
|
||||
"thinking_injection": map[string]any{
|
||||
"enabled": true,
|
||||
},
|
||||
}
|
||||
b, _ := json.Marshal(payload)
|
||||
req := httptest.NewRequest(http.MethodPut, "/admin/settings", bytes.NewReader(b))
|
||||
rec := httptest.NewRecorder()
|
||||
h.updateSettings(rec, req)
|
||||
if rec.Code != http.StatusOK {
|
||||
t.Fatalf("expected 200, got %d body=%s", rec.Code, rec.Body.String())
|
||||
}
|
||||
snap := h.Store.Snapshot()
|
||||
if snap.ThinkingInjection.Enabled == nil || !*snap.ThinkingInjection.Enabled {
|
||||
t.Fatalf("expected thinking_injection.enabled=true, got %#v", snap.ThinkingInjection.Enabled)
|
||||
}
|
||||
if got := h.Store.ThinkingInjectionPrompt(); got != "original prompt" {
|
||||
t.Fatalf("expected original prompt to be preserved, got %q", got)
|
||||
}
|
||||
}
|
||||
|
||||
func TestUpdateSettingsAutoDeleteMode(t *testing.T) {
|
||||
h := newAdminTestHandler(t, `{"keys":["k1"],"auto_delete":{"sessions":true}}`)
|
||||
|
||||
|
||||
@@ -28,6 +28,10 @@ func (h *Handler) updateSettings(w http.ResponseWriter, r *http.Request) {
 			return
 		}
 	}
+	currentInputEnabledSet := hasNestedSettingsKey(req, "current_input_file", "enabled")
+	currentInputMinCharsSet := hasNestedSettingsKey(req, "current_input_file", "min_chars")
+	thinkingInjectionEnabledSet := hasNestedSettingsKey(req, "thinking_injection", "enabled")
+	thinkingInjectionPromptSet := hasNestedSettingsKey(req, "thinking_injection", "prompt")
 
 	if err := h.Store.Update(func(c *config.Config) error {
 		if adminCfg != nil {
@@ -80,16 +84,24 @@ func (h *Handler) updateSettings(w http.ResponseWriter, r *http.Request) {
 			}
 		}
 		if currentInputCfg != nil {
-			c.CurrentInputFile.Enabled = currentInputCfg.Enabled
-			if currentInputCfg.Enabled != nil && *currentInputCfg.Enabled {
+			if currentInputEnabledSet {
+				c.CurrentInputFile.Enabled = currentInputCfg.Enabled
+			}
+			if currentInputEnabledSet && currentInputCfg.Enabled != nil && *currentInputCfg.Enabled {
 				disabled := false
 				c.HistorySplit.Enabled = &disabled
 			}
-			c.CurrentInputFile.MinChars = currentInputCfg.MinChars
+			if currentInputMinCharsSet {
+				c.CurrentInputFile.MinChars = currentInputCfg.MinChars
+			}
 		}
 		if thinkingInjCfg != nil {
-			c.ThinkingInjection.Enabled = thinkingInjCfg.Enabled
-			c.ThinkingInjection.Prompt = thinkingInjCfg.Prompt
+			if thinkingInjectionEnabledSet {
+				c.ThinkingInjection.Enabled = thinkingInjCfg.Enabled
+			}
+			if thinkingInjectionPromptSet {
+				c.ThinkingInjection.Prompt = thinkingInjCfg.Prompt
+			}
 		}
 		if aliasMap != nil {
 			c.ModelAliases = aliasMap
@@ -144,3 +156,12 @@ func (h *Handler) updateSettingsPassword(w http.ResponseWriter, r *http.Request)
 		"jwt_valid_after_unix": now,
 	})
 }
+
+func hasNestedSettingsKey(req map[string]any, section, key string) bool {
+	raw, ok := req[section].(map[string]any)
+	if !ok {
+		return false
+	}
+	_, exists := raw[key]
+	return exists
+}
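The reason a key-presence check is needed can be seen in a small standalone Go sketch. After decoding into a typed struct, an omitted `min_chars` and an explicit `0` look identical, so the handler also inspects the raw JSON object. Everything here other than the idea of `hasNestedSettingsKey` is an assumption made for illustration: the `settings` type, `applyPartial`, and the way the request is decoded twice are hypothetical and do not mirror the real handler or store.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// settings is a hypothetical stand-in for the stored current_input_file section.
type settings struct {
	Enabled  *bool
	MinChars int
}

// hasNestedKey reports whether body[section] is a JSON object containing key.
func hasNestedKey(body map[string]any, section, key string) bool {
	raw, ok := body[section].(map[string]any)
	if !ok {
		return false
	}
	_, exists := raw[key]
	return exists
}

// applyPartial merges a raw JSON request into cur and touches only the fields
// that are actually present in the request body.
func applyPartial(cur settings, rawJSON []byte) (settings, error) {
	var body map[string]any
	if err := json.Unmarshal(rawJSON, &body); err != nil {
		return cur, err
	}
	var typed struct {
		CurrentInputFile struct {
			Enabled  *bool `json:"enabled"`
			MinChars int   `json:"min_chars"`
		} `json:"current_input_file"`
	}
	if err := json.Unmarshal(rawJSON, &typed); err != nil {
		return cur, err
	}
	if hasNestedKey(body, "current_input_file", "enabled") {
		cur.Enabled = typed.CurrentInputFile.Enabled
	}
	if hasNestedKey(body, "current_input_file", "min_chars") {
		cur.MinChars = typed.CurrentInputFile.MinChars
	}
	return cur, nil
}

func main() {
	off := false
	cur := settings{Enabled: &off, MinChars: 777}
	next, err := applyPartial(cur, []byte(`{"current_input_file":{"min_chars":5000}}`))
	if err != nil {
		panic(err)
	}
	fmt.Println(*next.Enabled, next.MinChars) // enabled stays false, min_chars becomes 5000
}
```

Without the two guards, the first request would reset `Enabled` to nil, which is the overwrite this commit's tests are written to catch.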
@@ -8,7 +8,7 @@ const {
   stripFencedCodeBlocks,
 } = require('./parse_payload');
 
-const TOOL_MARKUP_PREFIXES = ['<tool_calls', '<|dsml|tool_calls', '<dsml|tool_calls', '<|tool_calls', '<|tool_calls'];
+const TOOL_MARKUP_PREFIXES = ['<tool_calls', '<|dsml|tool_calls', '<|dsml tool_calls', '<dsml|tool_calls', '<dsml tool_calls', '<|tool_calls', '<|tool_calls'];
 
 function extractToolNames(tools) {
   if (!Array.isArray(tools) || tools.length === 0) {
@@ -166,6 +166,18 @@ const DSML_TOOL_MARKUP_ALIASES = [
|
||||
{ from: '</|dsml|invoke>', to: '</invoke>' },
|
||||
{ from: '<|dsml|parameter', to: '<parameter' },
|
||||
{ from: '</|dsml|parameter>', to: '</parameter>' },
|
||||
{ from: '<|dsml tool_calls', to: '<tool_calls' },
|
||||
{ from: '</|dsml tool_calls>', to: '</tool_calls>' },
|
||||
{ from: '<|dsml invoke', to: '<invoke' },
|
||||
{ from: '</|dsml invoke>', to: '</invoke>' },
|
||||
{ from: '<|dsml parameter', to: '<parameter' },
|
||||
{ from: '</|dsml parameter>', to: '</parameter>' },
|
||||
{ from: '<dsml tool_calls', to: '<tool_calls' },
|
||||
{ from: '</dsml tool_calls>', to: '</tool_calls>' },
|
||||
{ from: '<dsml invoke', to: '<invoke' },
|
||||
{ from: '</dsml invoke>', to: '</invoke>' },
|
||||
{ from: '<dsml parameter', to: '<parameter' },
|
||||
{ from: '</dsml parameter>', to: '</parameter>' },
|
||||
{ from: '<dsml|tool_calls', to: '<tool_calls' },
|
||||
{ from: '</dsml|tool_calls>', to: '</tool_calls>' },
|
||||
{ from: '<dsml|invoke', to: '<invoke' },
|
||||
|
||||
@@ -4,7 +4,9 @@ const { parseToolCalls } = require('./parse');
|
||||
// XML wrapper tag pair used by the streaming sieve.
|
||||
const XML_TOOL_TAG_PAIRS = [
|
||||
{ open: '<|dsml|tool_calls', close: '</|dsml|tool_calls>' },
|
||||
{ open: '<|dsml tool_calls', close: '</|dsml tool_calls>' },
|
||||
{ open: '<dsml|tool_calls', close: '</dsml|tool_calls>' },
|
||||
{ open: '<dsml tool_calls', close: '</dsml tool_calls>' },
|
||||
{ open: '<|tool_calls', close: '</|tool_calls>' },
|
||||
{ open: '<|tool_calls', close: '</|tool_calls>' },
|
||||
{ open: '<tool_calls', close: '</tool_calls>' },
|
||||
@@ -12,7 +14,7 @@ const XML_TOOL_TAG_PAIRS = [
|
||||
|
||||
const XML_TOOL_OPENING_TAGS = [
|
||||
...XML_TOOL_TAG_PAIRS.map(p => p.open),
|
||||
'<|dsml|invoke', '<dsml|invoke', '<|invoke', '<|invoke', '<invoke',
|
||||
'<|dsml|invoke', '<|dsml invoke', '<dsml|invoke', '<dsml invoke', '<|invoke', '<|invoke', '<invoke',
|
||||
];
|
||||
|
||||
function consumeXMLToolCapture(captured, toolNames, trimWrappingJSONFence) {
|
||||
@@ -188,11 +190,10 @@ function hasXMLToolTagBoundary(text, idx) {
 }
 
 function hasOpenXMLToolTag(captured) {
-  const lower = captured.toLowerCase();
   for (const pair of XML_TOOL_TAG_PAIRS) {
-    const openIdx = lower.indexOf(pair.open);
+    const openIdx = findXMLOpenOutsideCDATA(captured, pair.open, 0);
     if (openIdx >= 0) {
-      if (findXMLCloseOutsideCDATA(captured, pair.close, openIdx + pair.open.length) < 0) {
+      if (findMatchingXMLToolWrapperClose(captured, pair.open, pair.close, openIdx) < 0) {
         return true;
       }
     }
@@ -203,7 +204,9 @@ function hasOpenXMLToolTag(captured) {
|
||||
function containsAnyToolCallWrapper(lower) {
|
||||
return lower.includes('<tool_calls') ||
|
||||
lower.includes('<|dsml|tool_calls') ||
|
||||
lower.includes('<|dsml tool_calls') ||
|
||||
lower.includes('<dsml|tool_calls') ||
|
||||
lower.includes('<dsml tool_calls') ||
|
||||
lower.includes('<|tool_calls') ||
|
||||
lower.includes('<|tool_calls');
|
||||
}
|
||||
@@ -211,7 +214,7 @@ function containsAnyToolCallWrapper(lower) {
|
||||
function firstInvokeIndex(lower) {
|
||||
const xmlIdx = lower.indexOf('<invoke');
|
||||
// Check all DSML-like invoke prefixes.
|
||||
const dsmlPrefixes = ['<|dsml|invoke', '<dsml|invoke', '<|invoke', '<|invoke'];
|
||||
const dsmlPrefixes = ['<|dsml|invoke', '<|dsml invoke', '<dsml|invoke', '<dsml invoke', '<|invoke', '<|invoke'];
|
||||
let dsmlIdx = -1;
|
||||
for (const prefix of dsmlPrefixes) {
|
||||
const idx = lower.indexOf(prefix);
|
||||
|
||||
@@ -3,8 +3,12 @@
|
||||
const XML_TOOL_SEGMENT_TAGS = [
|
||||
'<|dsml|tool_calls>', '<|dsml|tool_calls\n', '<|dsml|tool_calls ',
|
||||
'<|dsml|invoke ', '<|dsml|invoke\n', '<|dsml|invoke\t', '<|dsml|invoke\r',
|
||||
'<|dsml tool_calls>', '<|dsml tool_calls\n', '<|dsml tool_calls ',
|
||||
'<|dsml invoke ', '<|dsml invoke\n', '<|dsml invoke\t', '<|dsml invoke\r',
|
||||
'<dsml|tool_calls>', '<dsml|tool_calls\n', '<dsml|tool_calls ',
|
||||
'<dsml|invoke ', '<dsml|invoke\n', '<dsml|invoke\t', '<dsml|invoke\r',
|
||||
'<dsml tool_calls>', '<dsml tool_calls\n', '<dsml tool_calls ',
|
||||
'<dsml invoke ', '<dsml invoke\n', '<dsml invoke\t', '<dsml invoke\r',
|
||||
'<|tool_calls>', '<|tool_calls\n', '<|tool_calls ',
|
||||
'<|invoke ', '<|invoke\n', '<|invoke\t', '<|invoke\r',
|
||||
'<|tool_calls>', '<|tool_calls\n', '<|tool_calls ',
|
||||
@@ -15,7 +19,9 @@ const XML_TOOL_SEGMENT_TAGS = [
|
||||
|
||||
const XML_TOOL_OPENING_TAGS = [
|
||||
'<|dsml|tool_calls',
|
||||
'<|dsml tool_calls',
|
||||
'<dsml|tool_calls',
|
||||
'<dsml tool_calls',
|
||||
'<|tool_calls',
|
||||
'<|tool_calls',
|
||||
'<tool_calls',
|
||||
@@ -23,7 +29,9 @@ const XML_TOOL_OPENING_TAGS = [
|
||||
|
||||
const XML_TOOL_CLOSING_TAGS = [
|
||||
'</|dsml|tool_calls>',
|
||||
'</|dsml tool_calls>',
|
||||
'</dsml|tool_calls>',
|
||||
'</dsml tool_calls>',
|
||||
'</|tool_calls>',
|
||||
'</|tool_calls>',
|
||||
'</tool_calls>',
|
||||
|
||||
@@ -26,6 +26,18 @@ var dsmlToolMarkupAliases = []struct {
|
||||
{"</|dsml|invoke>", "</invoke>"},
|
||||
{"<|dsml|parameter", "<parameter"},
|
||||
{"</|dsml|parameter>", "</parameter>"},
|
||||
{"<|dsml tool_calls", "<tool_calls"},
|
||||
{"</|dsml tool_calls>", "</tool_calls>"},
|
||||
{"<|dsml invoke", "<invoke"},
|
||||
{"</|dsml invoke>", "</invoke>"},
|
||||
{"<|dsml parameter", "<parameter"},
|
||||
{"</|dsml parameter>", "</parameter>"},
|
||||
{"<dsml tool_calls", "<tool_calls"},
|
||||
{"</dsml tool_calls>", "</tool_calls>"},
|
||||
{"<dsml invoke", "<invoke"},
|
||||
{"</dsml invoke>", "</invoke>"},
|
||||
{"<dsml parameter", "<parameter"},
|
||||
{"</dsml parameter>", "</parameter>"},
|
||||
{"<dsml|tool_calls", "<tool_calls"},
|
||||
{"</dsml|tool_calls>", "</tool_calls>"},
|
||||
{"<dsml|invoke", "<invoke"},
|
||||
|
||||
@@ -94,7 +94,9 @@ func filterToolCallsDetailed(parsed []ParsedToolCall) ([]ParsedToolCall, []strin
|
||||
func looksLikeToolCallSyntax(text string) bool {
|
||||
lower := strings.ToLower(text)
|
||||
return strings.Contains(lower, "<|dsml|tool_calls") ||
|
||||
strings.Contains(lower, "<|dsml tool_calls") ||
|
||||
strings.Contains(lower, "<dsml|tool_calls") ||
|
||||
strings.Contains(lower, "<dsml tool_calls") ||
|
||||
strings.Contains(lower, "<|tool_calls") ||
|
||||
strings.Contains(lower, "<|tool_calls") ||
|
||||
strings.Contains(lower, "<tool_calls")
|
||||
|
||||
@@ -444,6 +444,40 @@ func TestParseToolCallsParsesAfterFourBacktickFence(t *testing.T) {
|
||||
}
|
||||
}
|
||||
|
||||
func TestParseToolCallsToleratesDSMLSpaceSeparatorTypo(t *testing.T) {
|
||||
text := strings.Join([]string{
|
||||
"<|DSML tool_calls>",
|
||||
"<|DSML invoke name=\"Read\">",
|
||||
"<|DSML parameter name=\"file_path\"><![CDATA[/tmp/input.txt]]></|DSML parameter>",
|
||||
"</|DSML invoke>",
|
||||
"</|DSML tool_calls>",
|
||||
}, "\n")
|
||||
calls := ParseToolCalls(text, []string{"Read"})
|
||||
if len(calls) != 1 {
|
||||
t.Fatalf("expected one call from DSML space-separator typo, got %#v", calls)
|
||||
}
|
||||
if calls[0].Name != "Read" {
|
||||
t.Fatalf("expected Read call, got %#v", calls[0])
|
||||
}
|
||||
if got, _ := calls[0].Input["file_path"].(string); got != "/tmp/input.txt" {
|
||||
t.Fatalf("expected file_path to parse, got %q", got)
|
||||
}
|
||||
}
|
||||
|
||||
func TestParseToolCallsDoesNotAcceptDSMLSpaceLookalikeTagName(t *testing.T) {
|
||||
text := strings.Join([]string{
|
||||
"<|DSML tool_calls_extra>",
|
||||
"<|DSML invoke name=\"Read\">",
|
||||
"<|DSML parameter name=\"file_path\">/tmp/input.txt</|DSML parameter>",
|
||||
"</|DSML invoke>",
|
||||
"</|DSML tool_calls_extra>",
|
||||
}, "\n")
|
||||
calls := ParseToolCalls(text, []string{"Read"})
|
||||
if len(calls) != 0 {
|
||||
t.Fatalf("expected no calls from lookalike tag, got %#v", calls)
|
||||
}
|
||||
}
|
||||
|
||||
func TestParseToolCallsSkipsProseMentionOfSameWrapperVariant(t *testing.T) {
|
||||
text := strings.Join([]string{
|
||||
"Summary: support canonical <tool_calls> and DSML <|DSML|tool_calls> wrappers.",
|
||||
|
||||
@@ -554,3 +554,64 @@ func TestSieve_ChineseReviewSamplePreservesInlineDSMLMention(t *testing.T) {
|
||||
t.Fatalf("真实工具块不应泄漏到正文, got %q", text.String())
|
||||
}
|
||||
}
|
||||
|
||||
func TestSieve_ToleratesDSMLSpaceSeparatorTypo(t *testing.T) {
|
||||
var state State
|
||||
chunks := []string{
|
||||
"准备读取文件。\n",
|
||||
"<|DSML tool_calls>\n",
|
||||
"<|DSML invoke name=\"Read\">\n",
|
||||
"<|DSML parameter name=\"file_path\"><![CDATA[/tmp/input.txt]]></|DSML parameter>\n",
|
||||
"</|DSML invoke>\n",
|
||||
"</|DSML tool_calls>",
|
||||
}
|
||||
var events []Event
|
||||
for _, c := range chunks {
|
||||
events = append(events, ProcessChunk(&state, c, []string{"Read"})...)
|
||||
}
|
||||
events = append(events, Flush(&state, []string{"Read"})...)
|
||||
|
||||
var text strings.Builder
|
||||
var filePath string
|
||||
callCount := 0
|
||||
for _, e := range events {
|
||||
text.WriteString(e.Content)
|
||||
for _, call := range e.ToolCalls {
|
||||
callCount++
|
||||
filePath, _ = call.Input["file_path"].(string)
|
||||
}
|
||||
}
|
||||
|
||||
if callCount != 1 {
|
||||
t.Fatalf("应解析出 1 个工具调用,got %d, text=%q", callCount, text.String())
|
||||
}
|
||||
if filePath != "/tmp/input.txt" {
|
||||
t.Fatalf("应解析出 file_path,got %q", filePath)
|
||||
}
|
||||
if !strings.Contains(text.String(), "准备读取文件") {
|
||||
t.Fatalf("前置正文应保留, got %q", text.String())
|
||||
}
|
||||
if strings.Contains(text.String(), "<|DSML invoke") {
|
||||
t.Fatalf("真实工具块不应泄漏到正文, got %q", text.String())
|
||||
}
|
||||
}
|
||||
|
||||
func TestSieve_DSMLSpaceLookalikeTagNameStaysText(t *testing.T) {
|
||||
var state State
|
||||
input := "<|DSML tool_calls_extra><|DSML invoke name=\"Read\"><|DSML parameter name=\"file_path\">/tmp/input.txt</|DSML parameter></|DSML invoke></|DSML tool_calls_extra>"
|
||||
events := ProcessChunk(&state, input, []string{"Read"})
|
||||
events = append(events, Flush(&state, []string{"Read"})...)
|
||||
|
||||
var text strings.Builder
|
||||
callCount := 0
|
||||
for _, e := range events {
|
||||
text.WriteString(e.Content)
|
||||
callCount += len(e.ToolCalls)
|
||||
}
|
||||
if callCount != 0 {
|
||||
t.Fatalf("相似标签名不应触发工具调用,got %d", callCount)
|
||||
}
|
||||
if text.String() != input {
|
||||
t.Fatalf("相似标签名应作为正文透传, got %q", text.String())
|
||||
}
|
||||
}
|
||||
|
||||
@@ -99,11 +99,10 @@ func consumeXMLToolCapture(captured string, toolNames []string) (prefix string,
 // hasOpenXMLToolTag returns true if captured text contains an XML tool opening tag
 // whose SPECIFIC closing tag has not appeared yet.
 func hasOpenXMLToolTag(captured string) bool {
-	lower := strings.ToLower(captured)
 	for _, pair := range xmlToolCallTagPairs {
-		openIdx := strings.Index(lower, pair.open)
+		openIdx := findXMLOpenOutsideCDATA(captured, pair.open, 0)
 		if openIdx >= 0 {
-			if findXMLCloseOutsideCDATA(captured, pair.close, openIdx+len(pair.open)) < 0 {
+			if findMatchingXMLToolWrapperClose(captured, pair.open, pair.close, openIdx) < 0 {
 				return true
 			}
 		}
@@ -117,17 +116,25 @@ func shouldKeepBareInvokeCapture(captured string) bool {
 	if invokeIdx < 0 || containsAnyToolCallWrapper(lower) {
 		return false
 	}
-	wrapperClose := "</tool_calls>"
 	invokeOpenLen := len("<invoke")
-	invokeClose := "</invoke>"
 	parameterOpen := "<parameter"
 	if dsml {
-		wrapperClose = "</|dsml|tool_calls>"
 		invokeOpenLen = len("<|dsml|invoke")
-		invokeClose = "</|dsml|invoke>"
 		parameterOpen = "<|dsml|parameter"
 	}
-	if findXMLCloseOutsideCDATA(captured, wrapperClose, invokeIdx) > invokeIdx {
+	if dsml && strings.HasPrefix(lower[invokeIdx:], "<|dsml invoke") {
+		invokeOpenLen = len("<|dsml invoke")
+		parameterOpen = "<|dsml parameter"
+	}
+	if dsml && strings.HasPrefix(lower[invokeIdx:], "<dsml|invoke") {
+		invokeOpenLen = len("<dsml|invoke")
+		parameterOpen = "<dsml|parameter"
+	}
+	if dsml && strings.HasPrefix(lower[invokeIdx:], "<dsml invoke") {
+		invokeOpenLen = len("<dsml invoke")
+		parameterOpen = "<dsml parameter"
+	}
+	if findAnyXMLCloseOutsideCDATA(captured, possibleWrapperCloseTags(dsml), invokeIdx) > invokeIdx {
 		return true
 	}
 
@@ -141,9 +148,15 @@ func shouldKeepBareInvokeCapture(captured string) bool {
 		return true
 	}
 
-	invokeCloseIdx := findXMLCloseOutsideCDATA(captured, invokeClose, startEnd+1)
+	invokeCloseIdx := findAnyXMLCloseOutsideCDATA(captured, possibleInvokeCloseTags(dsml), startEnd+1)
 	if invokeCloseIdx >= 0 {
-		afterClose := captured[invokeCloseIdx+len(invokeClose):]
+		afterClose := captured[invokeCloseIdx:]
+		for _, closeTag := range possibleInvokeCloseTags(dsml) {
+			if strings.HasPrefix(strings.ToLower(afterClose), closeTag) {
+				afterClose = afterClose[len(closeTag):]
+				break
+			}
+		}
 		return strings.TrimSpace(afterClose) == ""
 	}
 
@@ -156,15 +169,42 @@ func shouldKeepBareInvokeCapture(captured string) bool {
 func containsAnyToolCallWrapper(lower string) bool {
 	return strings.Contains(lower, "<tool_calls") ||
 		strings.Contains(lower, "<|dsml|tool_calls") ||
+		strings.Contains(lower, "<|dsml tool_calls") ||
 		strings.Contains(lower, "<dsml|tool_calls") ||
+		strings.Contains(lower, "<dsml tool_calls") ||
 		strings.Contains(lower, "<|tool_calls") ||
 		strings.Contains(lower, "<|tool_calls")
 }
 
+func possibleWrapperCloseTags(dsml bool) []string {
+	if !dsml {
+		return []string{"</tool_calls>"}
+	}
+	return []string{"</|dsml|tool_calls>", "</|dsml tool_calls>", "</dsml|tool_calls>", "</dsml tool_calls>", "</|tool_calls>", "</|tool_calls>"}
+}
+
+func possibleInvokeCloseTags(dsml bool) []string {
+	if !dsml {
+		return []string{"</invoke>"}
+	}
+	return []string{"</|dsml|invoke>", "</|dsml invoke>", "</dsml|invoke>", "</dsml invoke>", "</|invoke>", "</|invoke>"}
+}
+
+func findAnyXMLCloseOutsideCDATA(s string, closeTags []string, start int) int {
+	best := -1
+	for _, closeTag := range closeTags {
+		idx := findXMLCloseOutsideCDATA(s, closeTag, start)
+		if idx >= 0 && (best < 0 || idx < best) {
+			best = idx
+		}
+	}
+	return best
+}
+
 func firstInvokeIndex(lower string) (int, bool) {
 	xmlIdx := strings.Index(lower, "<invoke")
 	// Check all DSML-like invoke prefixes.
-	dsmlPrefixes := []string{"<|dsml|invoke", "<dsml|invoke", "<|invoke", "<|invoke"}
+	dsmlPrefixes := []string{"<|dsml|invoke", "<|dsml invoke", "<dsml|invoke", "<dsml invoke", "<|invoke", "<|invoke"}
 	dsmlIdx := -1
 	for _, prefix := range dsmlPrefixes {
 		idx := strings.Index(lower, prefix)
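`findAnyXMLCloseOutsideCDATA` picks the earliest match among several closing-tag spellings and delegates the per-tag search to `findXMLCloseOutsideCDATA`, which is not shown in this diff. The Go sketch below pairs the same "earliest of many" loop with a simplified, hypothetical stand-in for that helper (`findCloseOutsideCDATA`), assuming it skips `<![CDATA[ ... ]]>` sections and matches tags case-insensitively; the real helper may behave differently. It also assumes `0 <= start <= len(s)`.

```go
package main

import (
	"fmt"
	"strings"
)

// hasPrefixFold reports whether s starts with prefix, ignoring case.
func hasPrefixFold(s, prefix string) bool {
	return len(s) >= len(prefix) && strings.EqualFold(s[:len(prefix)], prefix)
}

// findCloseOutsideCDATA is a simplified stand-in: it returns the index of the
// first closeTag at or after start that is not inside a CDATA section, or -1.
func findCloseOutsideCDATA(s, closeTag string, start int) int {
	for i := start; i < len(s); {
		if hasPrefixFold(s[i:], "<![CDATA[") {
			end := strings.Index(s[i:], "]]>")
			if end < 0 {
				return -1 // unterminated CDATA hides everything after it
			}
			i += end + len("]]>")
			continue
		}
		if hasPrefixFold(s[i:], closeTag) {
			return i
		}
		i++
	}
	return -1
}

// findAnyClose returns the smallest index at which any of closeTags occurs
// outside CDATA, or -1 when none of them occurs.
func findAnyClose(s string, closeTags []string, start int) int {
	best := -1
	for _, tag := range closeTags {
		if idx := findCloseOutsideCDATA(s, tag, start); idx >= 0 && (best < 0 || idx < best) {
			best = idx
		}
	}
	return best
}

func main() {
	inner := `<|DSML parameter name="x"><![CDATA[</|DSML invoke>]]></|DSML parameter>`
	s := inner + "</dsml invoke>"
	tags := []string{"</|dsml|invoke>", "</|dsml invoke>", "</dsml invoke>"}
	// The close tag inside CDATA is skipped, so the match is the trailing one.
	fmt.Println(findAnyClose(s, tags, 0) == len(inner))
}
```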
@@ -5,11 +5,13 @@ import "regexp"
|
||||
// --- XML tool call support for the streaming sieve ---
|
||||
|
||||
//nolint:unused // kept as explicit tag inventory for future XML sieve refinements.
|
||||
var xmlToolCallClosingTags = []string{"</tool_calls>", "</|dsml|tool_calls>", "</dsml|tool_calls>", "</|tool_calls>", "</|tool_calls>"}
|
||||
var xmlToolCallClosingTags = []string{"</tool_calls>", "</|dsml|tool_calls>", "</|dsml tool_calls>", "</dsml|tool_calls>", "</dsml tool_calls>", "</|tool_calls>", "</|tool_calls>"}
|
||||
var xmlToolCallOpeningTags = []string{
|
||||
"<tool_calls", "<invoke",
|
||||
"<|dsml|tool_calls", "<|dsml|invoke",
|
||||
"<|dsml tool_calls", "<|dsml invoke",
|
||||
"<dsml|tool_calls", "<dsml|invoke",
|
||||
"<dsml tool_calls", "<dsml invoke",
|
||||
"<|tool_calls", "<|invoke",
|
||||
"<|tool_calls", "<|invoke",
|
||||
}
|
||||
@@ -18,7 +20,9 @@ var xmlToolCallOpeningTags = []string{
|
||||
// Order matters: longer/wrapper tags must be checked first.
|
||||
var xmlToolCallTagPairs = []struct{ open, close string }{
|
||||
{"<|dsml|tool_calls", "</|dsml|tool_calls>"},
|
||||
{"<|dsml tool_calls", "</|dsml tool_calls>"},
|
||||
{"<dsml|tool_calls", "</dsml|tool_calls>"},
|
||||
{"<dsml tool_calls", "</dsml tool_calls>"},
|
||||
{"<|tool_calls", "</|tool_calls>"},
|
||||
{"<|tool_calls", "</|tool_calls>"},
|
||||
{"<tool_calls", "</tool_calls>"},
|
||||
@@ -33,8 +37,12 @@ var xmlToolCallBlockPattern = regexp.MustCompile(`(?is)((?:<tool_calls\b|<\|dsml
|
||||
var xmlToolTagsToDetect = []string{
|
||||
"<|dsml|tool_calls>", "<|dsml|tool_calls\n", "<|dsml|tool_calls ",
|
||||
"<|dsml|invoke ", "<|dsml|invoke\n", "<|dsml|invoke\t", "<|dsml|invoke\r",
|
||||
"<|dsml tool_calls>", "<|dsml tool_calls\n", "<|dsml tool_calls ",
|
||||
"<|dsml invoke ", "<|dsml invoke\n", "<|dsml invoke\t", "<|dsml invoke\r",
|
||||
"<dsml|tool_calls>", "<dsml|tool_calls\n", "<dsml|tool_calls ",
|
||||
"<dsml|invoke ", "<dsml|invoke\n", "<dsml|invoke\t", "<dsml|invoke\r",
|
||||
"<dsml tool_calls>", "<dsml tool_calls\n", "<dsml tool_calls ",
|
||||
"<dsml invoke ", "<dsml invoke\n", "<dsml invoke\t", "<dsml invoke\r",
|
||||
"<|tool_calls>", "<|tool_calls\n", "<|tool_calls ",
|
||||
"<|invoke ", "<|invoke\n", "<|invoke\t", "<|invoke\r",
|
||||
"<|tool_calls>", "<|tool_calls\n", "<|tool_calls ",
|
||||
|
||||
@@ -57,6 +57,20 @@ test('parseToolCalls parses DSML shell as XML-compatible tool call', () => {
|
||||
assert.deepEqual(calls[0].input, { path: 'README.MD' });
|
||||
});
|
||||
|
||||
test('parseToolCalls tolerates DSML space-separator typo', () => {
|
||||
const payload = '<|DSML tool_calls><|DSML invoke name="Read"><|DSML parameter name="file_path"><![CDATA[/tmp/input.txt]]></|DSML parameter></|DSML invoke></|DSML tool_calls>';
|
||||
const calls = parseToolCalls(payload, ['Read']);
|
||||
assert.equal(calls.length, 1);
|
||||
assert.equal(calls[0].name, 'Read');
|
||||
assert.deepEqual(calls[0].input, { file_path: '/tmp/input.txt' });
|
||||
});
|
||||
|
||||
test('parseToolCalls ignores DSML space lookalike tag names', () => {
|
||||
const payload = '<|DSML tool_calls_extra><|DSML invoke name="Read"><|DSML parameter name="file_path">/tmp/input.txt</|DSML parameter></|DSML invoke></|DSML tool_calls_extra>';
|
||||
const calls = parseToolCalls(payload, ['Read']);
|
||||
assert.equal(calls.length, 0);
|
||||
});
|
||||
|
||||
test('parseToolCalls keeps canonical XML examples inside DSML CDATA', () => {
|
||||
const content = '<tool_calls><invoke name="demo"><parameter name="value">x</parameter></invoke></tool_calls>';
|
||||
const payload = `<|DSML|tool_calls><|DSML|invoke name="write_file"><|DSML|parameter name="path">notes.md</|DSML|parameter><|DSML|parameter name="content"><![CDATA[${content}]]></|DSML|parameter></|DSML|invoke></|DSML|tool_calls>`;
|
||||
@@ -107,6 +121,32 @@ test('sieve emits tool_calls after prose mentions same wrapper variant', () => {
|
||||
assert.equal(collectText(events).includes('Summary:'), true);
|
||||
});
|
||||
|
||||
test('sieve emits tool_calls for DSML space-separator typo', () => {
|
||||
const events = runSieve([
|
||||
'准备读取文件。\n',
|
||||
'<|DSML tool_calls>\n',
|
||||
'<|DSML invoke name="Read">\n',
|
||||
'<|DSML parameter name="file_path"><![CDATA[/tmp/input.txt]]></|DSML parameter>\n',
|
||||
'</|DSML invoke>\n',
|
||||
'</|DSML tool_calls>',
|
||||
], ['Read']);
|
||||
const text = collectText(events);
|
||||
const finalCalls = events.filter((evt) => evt.type === 'tool_calls').flatMap((evt) => evt.calls || []);
|
||||
assert.equal(finalCalls.length, 1);
|
||||
assert.equal(finalCalls[0].name, 'Read');
|
||||
assert.equal(finalCalls[0].input.file_path, '/tmp/input.txt');
|
||||
assert.equal(text.includes('准备读取文件'), true);
|
||||
assert.equal(text.includes('<|DSML invoke'), false);
|
||||
});
|
||||
|
||||
test('sieve keeps DSML space lookalike tag names as text', () => {
|
||||
const input = '<|DSML tool_calls_extra><|DSML invoke name="Read"><|DSML parameter name="file_path">/tmp/input.txt</|DSML parameter></|DSML invoke></|DSML tool_calls_extra>';
|
||||
const events = runSieve([input], ['Read']);
|
||||
const finalCalls = events.filter((evt) => evt.type === 'tool_calls').flatMap((evt) => evt.calls || []);
|
||||
assert.equal(finalCalls.length, 0);
|
||||
assert.equal(collectText(events), input);
|
||||
});
|
||||
|
||||
test('sieve preserves review body with alias mentions before real DSML tool calls', () => {
|
||||
const events = runSieve([
|
||||
"Done reviewing the diff. Here's my analysis before we commit:\n\n",
|
||||
|
||||