Compare commits

...

5 Commits

Author SHA1 Message Date
CJACK.
445c95a4f2 Merge pull request #379 from CJackHwang/dev
Merge pull request #377 from CJackHwang/codex/run-all-tests-and-fix-failures

Fix failing current-input token accounting test
2026-05-01 16:12:17 +08:00
CJACK
0a6ef8e3f2 fix: remove bufio.Scanner 2MiB line limit for SSE; support quasi_status direct patch
Replace bufio.Scanner with bufio.NewReaderSize + ReadBytes('\n') across all
SSE read paths to preserve long single-line data (e.g. write_file content).
Add quasi_status and auto_continue handling as direct path-based patches in
both Go continue observer and Node vercel_stream_impl, mirroring existing
batch-patch logic. Add 2MiB+ line throughput tests at every SSE layer.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-01 15:45:17 +08:00
CJACK
fd0ec29991 refactor: generalize DSML tag parsing to tolerate model noise; split tiktoken by build tags
Replace hardcoded DSML typo variant lists in Go/Node tool call parsers with
generalized prefix consumption that tolerates repeated leading <, repeated DSML
prefix noise, and trailing pipe terminators. Split tiktoken-dependent token
counting into a build-tagged file for non-cgo platform compatibility. Add /data
directory to Dockerfile for bind-mount permissions.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-01 15:17:11 +08:00
CJACK
2671298439 fix: coalesce small stream deltas to prevent character swallowing; add read-tool cache guard
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-01 13:53:27 +08:00
CJACK.
94c1acace5 Merge pull request #369 from CJackHwang/dev
Merge pull request #368 from CJackHwang/codex/fix-review-issues-for-pr-#364

Restore thinking fallback for tool-call detection and drop history.txt wrapper tags
2026-04-29 23:42:50 +08:00
39 changed files with 1143 additions and 369 deletions

1
.gitignore vendored
View File

@@ -29,6 +29,7 @@ yarn.lock
pnpm-lock.yaml
# Build artifacts
dist/
*.tsbuildinfo
.cache/
.parcel-cache/

View File

@@ -29,7 +29,7 @@ WORKDIR /app
RUN apt-get update \
&& apt-get install -y --no-install-recommends ca-certificates \
&& groupadd -r ds2api && useradd -r -g ds2api -d /app -s /sbin/nologin ds2api \
&& mkdir -p /app/data && chown -R ds2api:ds2api /app \
&& mkdir -p /app/data /data && chown -R ds2api:ds2api /app /data \
&& rm -rf /var/lib/apt/lists/*
COPY --from=busybox-tools /bin/busybox /usr/local/bin/busybox
EXPOSE 5001

View File

@@ -247,6 +247,7 @@ docker-compose logs -f
默认 `docker-compose.yml` 会把宿主机 `6011` 映射到容器内的 `5001`。如果你希望直接对外暴露 `5001`,请设置 `DS2API_HOST_PORT=5001`(或者手动调整 `ports` 配置)。
同时默认把 `./config.json` 挂载到容器 `/data/config.json`,并设置 `DS2API_CONFIG_PATH=/data/config.json`,用于避免 `/app` 只读导致运行时 token 持久化失败。
镜像会预创建 `/data` 并授权给非 root 的 `ds2api` 用户;如果使用单文件 bind mount请确保宿主机 `config.json` 对容器用户可读写,例如 `chmod 644 config.json`。
更新镜像:`docker-compose up -d --build`

View File

@@ -131,6 +131,7 @@ docker-compose logs -f
The default `docker-compose.yml` directly uses `ghcr.io/cjackhwang/ds2api:latest` and maps host port `6011` to container port `5001`. If you want `5001` exposed directly, set `DS2API_HOST_PORT=5001` (or adjust the `ports` mapping).
The compose template also defaults to `DS2API_CONFIG_PATH=/data/config.json` with `./config.json:/data/config.json` mounted, so deployments avoid read-only `/app` persistence issues by default.
The image pre-creates `/data` and grants it to the non-root `ds2api` user. If you bind-mount a single host file, make sure `config.json` is readable/writable by the container user, for example with `chmod 644 config.json`; otherwise Linux UID/GID mismatches can still cause `open /data/config.json: permission denied`.
Compatibility note: when `DS2API_CONFIG_PATH` is unset and runtime base dir is `/app`, newer versions prefer `/data/config.json`; if that file is missing but legacy `/app/config.json` exists, DS2API automatically falls back to the legacy path to avoid post-upgrade config loss.
If you want a pinned version instead of `latest`, you can also pull a specific tag directly:

View File

@@ -131,6 +131,7 @@ docker-compose logs -f
默认 `docker-compose.yml` 直接使用 `ghcr.io/cjackhwang/ds2api:latest`,并把宿主机 `6011` 映射到容器内的 `5001`。如果你希望直接对外暴露 `5001`,请设置 `DS2API_HOST_PORT=5001`(或者手动调整 `ports` 配置)。
Compose 模板还会默认设置 `DS2API_CONFIG_PATH=/data/config.json` 并挂载 `./config.json:/data/config.json`,优先避免 `/app` 只读带来的配置持久化问题。
镜像内会预创建 `/data` 并授权给非 root 的 `ds2api` 用户;如果你使用 bind mount 单文件,请确保宿主机 `config.json` 至少可被容器用户读取/写入,例如 `chmod 644 config.json`,否则 Linux UID/GID 不一致时仍可能出现 `open /data/config.json: permission denied`
兼容说明:若未设置 `DS2API_CONFIG_PATH` 且运行目录是 `/app`,新版本会优先使用 `/data/config.json`;当该文件不存在但检测到历史 `/app/config.json` 时,会自动回退读取旧路径,避免升级后“配置丢失”。
如需固定版本,也可以直接拉取指定 tag

View File

@@ -156,10 +156,12 @@ OpenAI Chat / Responses 在标准化后、current input file 之前,会默认
工具调用正例现在优先示范官方 DSML 风格:`<|DSML|tool_calls>``<|DSML|invoke name="...">``<|DSML|parameter name="...">`
兼容层仍接受旧式纯 `<tool_calls>` wrapper但提示词会优先要求模型输出官方 DSML 标签,并强调不能只输出 closing wrapper 而漏掉 opening tag。需要注意这是“兼容 DSML 外壳,内部仍以 XML 解析语义为准”,不是原生 DSML 全链路实现DSML 标签会在解析入口归一化回现有 XML 标签后继续走同一套 parser。
数组参数使用 `<item>...</item>` 子节点表示;当某个参数体只包含 item 子节点时Go / Node 解析器会把它还原成数组,避免 `questions` / `options` 这类 schema 中要求 array 的参数被误解析成 `{ "item": ... }` 对象。若模型把完整结构化 XML fragment 误包进 CDATA兼容层会在保护 `content` / `command` 等原文字段的前提下,尝试把非原文字段中的 CDATA XML fragment 还原成 object / array。不过如果 CDATA 只是单个平面的 XML/HTML 标签,例如 `<b>urgent</b>` 这种行内标记,兼容层会保留原始字符串,不会强行升成 object / array只有明显表示结构的 CDATA 片段,例如多兄弟节点、嵌套子节点或 `item` 列表,才会触发结构化恢复。
Go 侧读取 DeepSeek SSE 时不再依赖 `bufio.Scanner` 的固定 2MiB 单行上限;当写文件类工具把很长的 `content` 放在单个 `data:` 行里返回时,非流式收集、流式解析和 auto-continue 透传都会保留完整行,再进入同一套工具解析与序列化流程。
在 assistant 最终回包阶段,如果某个 tool 参数在声明 schema 中明确是 `string`,兼容层会在把解析后的 `tool_calls` / `function_call` 重新序列化成 OpenAI / Responses / Claude 可见参数前,递归把该路径上的 number / bool / object / array 统一转成字符串;其中 object / array 会压成紧凑 JSON 字符串。这个保护只对 schema 明确声明为 string 的路径生效,不会改写本来就是 `number` / `boolean` / `object` / `array` 的参数。这样可以兼容 DeepSeek 输出了结构化片段、但上游客户端工具 schema 又严格要求字符串参数的场景(例如 `content``prompt``path``taskId` 等)。
工具 schema 的权威来源始终是**当前请求实际携带的 schema**,而不是同名工具在其他 runtimeClaude Code / OpenCode / Codex 等)里的默认印象。兼容层现在会同时兼容 OpenAI 风格 `function.parameters`、直接工具对象上的 `parameters` / `input_schema`、以及 camelCase 的 `inputSchema` / `schema`,并在最终输出阶段按这份请求内 schema 决定是保留 array/object还是仅对明确声明为 `string` 的路径做字符串化。该规则同样适用于 Claude 的流式收尾和 Vercel Node 流式 tool-call formatter避免不同 runtime 因 schema shape 差异而出现同名工具参数类型漂移。
正例中的工具名只会来自当前请求实际声明的工具;如果当前请求没有足够的已知工具形态,就省略对应的单工具、多工具或嵌套示例,避免把不可用工具名写进 prompt。
对执行类工具,脚本内容必须进入执行参数本身:`Bash` / `execute_command` 使用 `command``exec_command` 使用 `cmd`;不要把脚本示范成 `path` / `content` 文件写入参数。
如果当前请求声明了 `Read` / `read_file` 这类读取工具,兼容层会额外注入一条 read-tool cache guard当读取结果只表示“文件未变更 / 已在历史中 / 请引用先前上下文 / 没有正文内容”时模型必须把它视为内容不可用不能反复调用同一个无正文读取应改为请求完整正文读取能力或向用户说明需要重新提供文件内容。这个约束只缓解客户端缓存返回空内容导致的死循环DS2API 不会也无法凭空恢复客户端本地文件正文。
OpenAI 路径实现:
[internal/promptcompat/tool_prompt.go](../internal/promptcompat/tool_prompt.go)

View File

@@ -26,7 +26,7 @@
</tool_calls>
```
这不是原生 DSML 全链路实现。DSML 只作为 prompt 外壳和解析入口别名;进入 parser 前会归一化成 `<tool_calls>` / `<invoke>` / `<parameter>`,内部仍以现有 XML 解析语义为准。
这不是原生 DSML 全链路实现。DSML 主要用于让模型有意识地输出协议标识,隔离普通 XML 语义;进入 parser 前会按固定本地标签名归一化成 `<tool_calls>` / `<invoke>` / `<parameter>`,内部仍以现有 XML 解析语义为准。
约束:
@@ -39,7 +39,8 @@
兼容修复:
- 如果模型漏掉 opening wrapper但后面仍输出了一个或多个 invoke 并以 closing wrapper 收尾Go 解析链路会在解析前补回缺失的 opening wrapper。
- 如果模型把 DSML 标签里的分隔符 `|` 写漏成空格(例如 `<|DSML tool_calls>` / `<|DSML invoke>` / `<|DSML parameter>`,或无 leading pipe 的 `<DSML tool_calls>` 形态),或把 `DSML` 与工具标签名直接黏连(例如 `<DSMLtool_calls>` / `<DSMLinvoke>` / `<DSMLparameter>`),或把最前面的 pipe 误写成全宽竖线(例如 `<DSML|tool_calls>` / `<DSML|invoke>` / `<DSML|parameter>`Go / Node 会在固定工具标签名范围内归一化;相似但非工具标签名(如 `tool_calls_extra`)仍按普通文本处理。
- Go / Node 解析层不再枚举每一种 DSML typo。它会把工具标签名前的 `DSML`、管道符 `|` / ``、空白、重复 leading `<` 视为可容忍的协议噪声,然后只匹配固定本地标签名 `tool_calls` / `invoke` / `parameter`。例如 `<DSML|tool_calls>``<<|DSML|tool_calls>``<|DSML tool_calls>``<DSMLtool_calls>``<<DSML|DSML|tool_calls>` 都会归一化;相似但非固定标签名(如 `tool_calls_extra`)仍按普通文本处理。
- 如果模型在固定工具标签名后多输出一个尾部管道符,例如 `<|DSML|tool_calls|` / `<|DSML|invoke|` / `<|DSML|parameter|`,兼容层会把这个尾部 `|` 当作异常标签终止符并补齐缺失的 `>`;如果后面已经有 `>`,也会消费这个多余 `|` 后再归一化。
- 这是一个针对常见模型失误的窄修复不改变推荐输出格式prompt 仍要求模型直接输出完整 DSML 外壳。
-`<invoke ...>` / `<parameter ...>` 不会被当成“已支持的工具语法”;只有 `tool_calls` wrapper 或可修复的缺失 opening wrapper 才会进入工具调用路径。
@@ -53,7 +54,7 @@
在流式链路中Go / Node 一致):
- DSML `<|DSML|tool_calls>` wrapper、兼容变体(`<dsml|tool_calls>``<tool_calls>``<|tool_calls>``<DSML|tool_calls>`)、窄容错空格分隔形态(如 `<|DSML tool_calls>`)、黏连形态(如 `<DSMLtool_calls>`)和 canonical `<tool_calls>` wrapper 都会进入结构化捕获
- DSML `<|DSML|tool_calls>` wrapper、基于固定本地标签名的 DSML 噪声容错形态、尾部管道符形态(如 `<|DSML|tool_calls|`)和 canonical `<tool_calls>` wrapper 都会进入结构化捕获
- 如果流里直接从 invoke 开始,但后面补上了 closing wrapperGo 流式筛分也会按缺失 opening wrapper 的修复路径尝试恢复
- 已识别成功的工具调用不会再次回流到普通文本
- 不符合新格式的块不会执行,并继续按原样文本透传
@@ -61,6 +62,7 @@
- 支持嵌套围栏(如 4 反引号嵌套 3 反引号)和 CDATA 内围栏保护
- 如果模型把 `<![CDATA[` 打开后却没有闭合,流式扫描阶段仍会保守地继续缓冲,不会误把 CDATA 里的示例 XML 当成真实工具调用;在最终 parse / flush 恢复阶段,会对这类 loose CDATA 做窄修复,尽量保住外层已完整包裹的真实工具调用
- 当文本中 mention 了某种标签名(如 `<dsml|tool_calls>` 或 Markdown inline code 里的 `<|DSML|tool_calls>`而后面紧跟真正工具调用时sieve 会跳过不可解析的 mention 候选并继续匹配后续真实工具块,不会因 mention 导致工具调用丢失,也不会截断 mention 后的正文
- Go 侧 SSE 读取不再使用 `bufio.Scanner` 的固定 token 上限;单个 `data:` 行中包含很长的写文件参数时,非流式收集、流式解析与 auto-continue 透传都应保留完整行,再交给 tool parser 处理
另外,`<parameter>` 的值如果本身是合法 JSON 字面量,也会按结构化值解析,而不是一律保留为字符串。例如 `123``true``null``[1,2]``{"a":1}` 都会还原成对应的 number / boolean / null / array / object。
结构化 XML 参数也会还原为 JSON 结构:如果参数体只包含一个或多个 `<item>...</item>` 子节点,会输出数组;嵌套对象里的 item-only 字段也同样按数组处理。例如 `<parameter name="questions"><item><question>...</question></item></parameter>` 会输出 `{"questions":[{"question":"..."}]}`,而不是 `{"questions":{"item":...}}`
@@ -94,7 +96,7 @@ node --test tests/node/stream-tool-sieve.test.js
- DSML `<|DSML|tool_calls>` wrapper 正常解析
- legacy canonical `<tool_calls>` wrapper 正常解析
- 别名变体(`<dsml|tool_calls>``<tool_calls>``<|tool_calls>`、DSML 空格分隔 typo`<|DSML tool_calls>`)和黏连 typo`<DSMLtool_calls>`)正常解析
- 固定本地标签名的 DSML 噪声容错形态(如 `<DSML|tool_calls>``<<|DSML|tool_calls>``<|DSML tool_calls>``<DSMLtool_calls>``<<DSML|DSML|tool_calls>`)正常解析
- 混搭标签DSML wrapper + canonical inner归一化后正常解析
- 波浪线围栏 `~~~` 内的示例不执行
- 嵌套围栏4 反引号嵌套 3 反引号)内的示例不执行

View File

@@ -133,33 +133,51 @@ func pumpAutoContinue(ctx context.Context, pw *io.PipeWriter, initial io.ReadClo
// sentinels are consumed (not forwarded) so that the downstream only sees
// one final [DONE] at the very end.
func streamBodyWithContinueState(ctx context.Context, pw *io.PipeWriter, body io.Reader, state *continueState) (bool, error) {
scanner := bufio.NewScanner(body)
scanner.Buffer(make([]byte, 0, 64*1024), 2*1024*1024)
reader := bufio.NewReaderSize(body, 64*1024)
hadDone := false
for scanner.Scan() {
for {
select {
case <-ctx.Done():
return hadDone, ctx.Err()
default:
}
line := append([]byte{}, scanner.Bytes()...)
trimmed := strings.TrimSpace(string(line))
if trimmed == "" {
continue
}
if strings.HasPrefix(trimmed, "data:") {
data := strings.TrimSpace(strings.TrimPrefix(trimmed, "data:"))
if data == "[DONE]" {
hadDone = true
continue
line, err := reader.ReadBytes('\n')
if len(line) == 0 && err != nil {
if err == io.EOF {
return hadDone, nil
}
state.observe(data)
return hadDone, err
}
if _, err := io.Copy(pw, bytes.NewReader(append(line, '\n'))); err != nil {
trimmed := strings.TrimSpace(string(line))
if trimmed != "" {
if strings.HasPrefix(trimmed, "data:") {
data := strings.TrimSpace(strings.TrimPrefix(trimmed, "data:"))
if data == "[DONE]" {
hadDone = true
if err != nil && err != io.EOF {
return hadDone, err
}
if err == io.EOF {
return hadDone, nil
}
continue
}
state.observe(data)
}
if !strings.HasSuffix(string(line), "\n") {
line = append(line, '\n')
}
if _, copyErr := io.Copy(pw, bytes.NewReader(line)); copyErr != nil {
return hadDone, copyErr
}
}
if err != nil {
if err == io.EOF {
return hadDone, nil
}
return hadDone, err
}
}
return hadDone, scanner.Err()
}
// observe extracts continue-relevant signals from an SSE JSON chunk.
@@ -175,34 +193,48 @@ func (s *continueState) observe(data string) {
if id := intFrom(chunk["response_message_id"]); id > 0 {
s.responseMessageID = id
}
// Path-based status: {"p": "response/status", "v": "FINISHED"}
if p, _ := chunk["p"].(string); p == "response/status" {
s.setStatus(asString(chunk["v"]))
}
s.observeDirectPatch(asString(chunk["p"]), chunk["v"])
if p, _ := chunk["p"].(string); p == "response" {
s.observeBatchPatches("response", chunk["v"])
} else {
s.observeBatchPatches("", chunk["v"])
}
// Nested v.response
v, _ := chunk["v"].(map[string]any)
if response, _ := v["response"].(map[string]any); response != nil {
if id := intFrom(response["message_id"]); id > 0 {
s.responseMessageID = id
}
s.setStatus(asString(response["status"]))
if autoContinue, ok := response["auto_continue"].(bool); ok && autoContinue {
if v, _ := chunk["v"].(map[string]any); v != nil {
s.observeResponseObject(v["response"])
}
if message, _ := chunk["message"].(map[string]any); message != nil {
s.observeResponseObject(message["response"])
}
}
func (s *continueState) observeDirectPatch(path string, value any) {
if s == nil {
return
}
switch strings.Trim(strings.TrimSpace(path), "/") {
case "response/status", "status", "response/quasi_status", "quasi_status":
s.setStatus(asString(value))
case "response/auto_continue", "auto_continue":
if v, ok := value.(bool); ok && v {
s.lastStatus = "AUTO_CONTINUE"
}
}
// Nested message.response
if message, _ := chunk["message"].(map[string]any); message != nil {
if response, _ := message["response"].(map[string]any); response != nil {
if id := intFrom(response["message_id"]); id > 0 {
s.responseMessageID = id
}
s.setStatus(asString(response["status"]))
}
}
func (s *continueState) observeResponseObject(raw any) {
if s == nil {
return
}
response, _ := raw.(map[string]any)
if response == nil {
return
}
if id := intFrom(response["message_id"]); id > 0 {
s.responseMessageID = id
}
s.setStatus(asString(response["status"]))
if autoContinue, ok := response["auto_continue"].(bool); ok && autoContinue {
s.lastStatus = "AUTO_CONTINUE"
}
}
@@ -230,6 +262,10 @@ func (s *continueState) observeBatchPatches(parentPath string, raw any) {
switch strings.Trim(strings.TrimSpace(fullPath), "/") {
case "response/status", "status", "response/quasi_status", "quasi_status":
s.setStatus(asString(m["v"]))
case "response/auto_continue", "auto_continue":
if v, ok := m["v"].(bool); ok && v {
s.lastStatus = "AUTO_CONTINUE"
}
}
}
}

View File

@@ -150,6 +150,62 @@ func TestAutoContinueDoesNotTriggerOnPlainWIPWithoutExplicitContinuationSignal(t
}
}
func TestAutoContinuePassesThroughLongSingleSSELine(t *testing.T) {
payload := strings.Repeat("x", 2*1024*1024+4096)
initialBody := `data: {"p":"response/content","v":"` + payload + `"}` + "\n" +
`data: [DONE]` + "\n"
body := newAutoContinueBody(context.Background(), io.NopCloser(strings.NewReader(initialBody)), "session-123", 8, func(context.Context, string, int) (*http.Response, error) {
return nil, errors.New("continue should not have been called")
})
defer func() { _ = body.Close() }()
out, err := io.ReadAll(body)
if err != nil {
t.Fatalf("read body failed: %v", err)
}
if !bytes.Contains(out, []byte(payload)) {
t.Fatalf("expected long SSE payload to pass through, got len=%d want payload len=%d", len(out), len(payload))
}
if !bytes.Contains(out, []byte(`data: [DONE]`)) {
t.Fatalf("expected final DONE sentinel in body, got len=%d", len(out))
}
}
func TestAutoContinueTriggersOnDirectQuasiStatusIncomplete(t *testing.T) {
initialBody := strings.Join([]string{
`data: {"response_message_id":321,"p":"response/content","v":"<tool_calls><invoke name=\"write_file\"><parameter name=\"content\"><![CDATA[part-one"}`,
`data: {"p":"response/quasi_status","v":"INCOMPLETE"}`,
`data: [DONE]`,
}, "\n") + "\n"
var continueCalls atomic.Int32
body := newAutoContinueBody(context.Background(), io.NopCloser(strings.NewReader(initialBody)), "session-123", 8, func(context.Context, string, int) (*http.Response, error) {
continueCalls.Add(1)
return &http.Response{
StatusCode: http.StatusOK,
Header: make(http.Header),
Body: io.NopCloser(strings.NewReader(
`data: {"response_message_id":322,"p":"response/content","v":"-part-two]]></parameter></invoke></tool_calls>"}` + "\n" +
`data: {"p":"response/status","v":"FINISHED"}` + "\n" +
`data: [DONE]` + "\n",
)),
}, nil
})
defer func() { _ = body.Close() }()
out, err := io.ReadAll(body)
if err != nil {
t.Fatalf("read body failed: %v", err)
}
if continueCalls.Load() != 1 {
t.Fatalf("expected exactly one continue call, got %d", continueCalls.Load())
}
if !bytes.Contains(out, []byte("part-one")) || !bytes.Contains(out, []byte("-part-two")) {
t.Fatalf("expected continued tool content in body, got=%s", string(out))
}
}
func TestAutoContinueTriggersOnResponseBatchQuasiStatusIncomplete(t *testing.T) {
initialBody := strings.Join([]string{
`data: {"response_message_id":321,"v":{"response":{"message_id":321,"status":"WIP","auto_continue":false}}}`,

View File

@@ -2,7 +2,7 @@
"client": {
"name": "DeepSeek",
"platform": "android",
"version": "2.0.3",
"version": "2.0.4",
"android_api_level": "35",
"locale": "zh_CN"
},
@@ -24,4 +24,4 @@
"skip_exact_paths": [
"response/search_status"
]
}
}

View File

@@ -2,20 +2,24 @@ package protocol
import (
"bufio"
"io"
"net/http"
)
func ScanSSELines(resp *http.Response, onLine func([]byte) bool) error {
scanner := bufio.NewScanner(resp.Body)
buf := make([]byte, 0, 64*1024)
scanner.Buffer(buf, 2*1024*1024)
for scanner.Scan() {
if !onLine(scanner.Bytes()) {
break
reader := bufio.NewReaderSize(resp.Body, 64*1024)
for {
line, err := reader.ReadBytes('\n')
if len(line) > 0 {
if !onLine(line) {
return nil
}
}
if err != nil {
if err == io.EOF {
return nil
}
return err
}
}
if err := scanner.Err(); err != nil {
return err
}
return nil
}

View File

@@ -0,0 +1,26 @@
package protocol
import (
"io"
"net/http"
"strings"
"testing"
)
func TestScanSSELinesHandlesLongSingleLine(t *testing.T) {
payload := strings.Repeat("x", 2*1024*1024+4096)
body := "data: {\"p\":\"response/content\",\"v\":\"" + payload + "\"}\n"
resp := &http.Response{Body: io.NopCloser(strings.NewReader(body))}
var got string
err := ScanSSELines(resp, func(line []byte) bool {
got = string(line)
return true
})
if err != nil {
t.Fatalf("ScanSSELines returned error: %v", err)
}
if !strings.Contains(got, payload) {
t.Fatalf("long SSE line was not preserved: got len=%d want payload len=%d", len(got), len(payload))
}
}

View File

@@ -53,6 +53,32 @@ type chatStreamRuntime struct {
finalErrorCode string
}
type chatDeltaBatch struct {
runtime *chatStreamRuntime
field string
text strings.Builder
}
func (b *chatDeltaBatch) append(field, text string) {
if text == "" {
return
}
if b.field != "" && b.field != field {
b.flush()
}
b.field = field
b.text.WriteString(text)
}
func (b *chatDeltaBatch) flush() {
if b.field == "" || b.text.Len() == 0 {
return
}
b.runtime.sendDelta(map[string]any{b.field: b.text.String()})
b.field = ""
b.text.Reset()
}
func newChatStreamRuntime(
w http.ResponseWriter,
rc *http.ResponseController,
@@ -164,42 +190,22 @@ func (s *chatStreamRuntime) finalize(finishReason string, deferEmptyOutput bool)
detected := detectAssistantToolCalls(s.rawText.String(), finalText, s.rawThinking.String(), finalToolDetectionThinking, s.toolNames)
if len(detected.Calls) > 0 && !s.toolCallsDoneEmitted {
finishReason = "tool_calls"
delta := map[string]any{
s.sendDelta(map[string]any{
"tool_calls": formatFinalStreamToolCallsWithStableIDs(detected.Calls, s.streamToolCallIDs, s.toolsRaw),
}
if !s.firstChunkSent {
delta["role"] = "assistant"
s.firstChunkSent = true
}
s.sendChunk(openaifmt.BuildChatStreamChunk(
s.completionID,
s.created,
s.model,
[]map[string]any{openaifmt.BuildChatStreamDeltaChoice(0, delta)},
nil,
))
})
s.toolCallsEmitted = true
s.toolCallsDoneEmitted = true
} else if s.bufferToolContent {
batch := chatDeltaBatch{runtime: s}
for _, evt := range toolstream.Flush(&s.toolSieve, s.toolNames) {
if len(evt.ToolCalls) > 0 {
batch.flush()
finishReason = "tool_calls"
s.toolCallsEmitted = true
s.toolCallsDoneEmitted = true
tcDelta := map[string]any{
s.sendDelta(map[string]any{
"tool_calls": formatFinalStreamToolCallsWithStableIDs(evt.ToolCalls, s.streamToolCallIDs, s.toolsRaw),
}
if !s.firstChunkSent {
tcDelta["role"] = "assistant"
s.firstChunkSent = true
}
s.sendChunk(openaifmt.BuildChatStreamChunk(
s.completionID,
s.created,
s.model,
[]map[string]any{openaifmt.BuildChatStreamDeltaChoice(0, tcDelta)},
nil,
))
})
s.resetStreamToolCallState()
}
if evt.Content == "" {
@@ -209,21 +215,9 @@ func (s *chatStreamRuntime) finalize(finishReason string, deferEmptyOutput bool)
if cleaned == "" || (s.searchEnabled && sse.IsCitation(cleaned)) {
continue
}
delta := map[string]any{
"content": cleaned,
}
if !s.firstChunkSent {
delta["role"] = "assistant"
s.firstChunkSent = true
}
s.sendChunk(openaifmt.BuildChatStreamChunk(
s.completionID,
s.created,
s.model,
[]map[string]any{openaifmt.BuildChatStreamDeltaChoice(0, delta)},
nil,
))
batch.append("content", cleaned)
}
batch.flush()
}
if len(detected.Calls) > 0 || s.toolCallsEmitted {
@@ -275,6 +269,7 @@ func (s *chatStreamRuntime) onParsed(parsed sse.LineResult) streamengine.ParsedD
}
contentSeen := false
batch := chatDeltaBatch{runtime: s}
for _, p := range parsed.ToolDetectionThinkingParts {
trimmed := sse.TrimContinuationOverlap(s.toolDetectionThinking.String(), p.Text)
if trimmed != "" {
@@ -298,7 +293,7 @@ func (s *chatStreamRuntime) onParsed(parsed sse.LineResult) streamengine.ParsedD
continue
}
s.thinking.WriteString(trimmed)
s.sendDelta(map[string]any{"reasoning_content": trimmed})
batch.append("reasoning_content", trimmed)
}
} else {
rawTrimmed := sse.TrimContinuationOverlap(s.rawText.String(), p.Text)
@@ -319,7 +314,7 @@ func (s *chatStreamRuntime) onParsed(parsed sse.LineResult) streamengine.ParsedD
if trimmed == "" {
continue
}
s.sendDelta(map[string]any{"content": trimmed})
batch.append("content", trimmed)
} else {
events := toolstream.ProcessChunk(&s.toolSieve, rawTrimmed, s.toolNames)
for _, evt := range events {
@@ -335,6 +330,7 @@ func (s *chatStreamRuntime) onParsed(parsed sse.LineResult) streamengine.ParsedD
if len(formatted) == 0 {
continue
}
batch.flush()
tcDelta := map[string]any{
"tool_calls": formatted,
}
@@ -343,6 +339,7 @@ func (s *chatStreamRuntime) onParsed(parsed sse.LineResult) streamengine.ParsedD
continue
}
if len(evt.ToolCalls) > 0 {
batch.flush()
s.toolCallsEmitted = true
s.toolCallsDoneEmitted = true
tcDelta := map[string]any{
@@ -357,14 +354,12 @@ func (s *chatStreamRuntime) onParsed(parsed sse.LineResult) streamengine.ParsedD
if cleaned == "" || (s.searchEnabled && sse.IsCitation(cleaned)) {
continue
}
contentDelta := map[string]any{
"content": cleaned,
}
s.sendDelta(contentDelta)
batch.append("content", cleaned)
}
}
}
}
}
batch.flush()
return streamengine.ParsedDecision{ContentSeen: contentSeen}
}

View File

@@ -309,6 +309,49 @@ func TestHandleStreamEmitsSingleChoiceFramesForMultipleParsedParts(t *testing.T)
}
}
func TestHandleStreamCoalescesSmallContentDeltas(t *testing.T) {
h := &Handler{}
lines := make([]string, 0, 101)
for i := 0; i < 100; i++ {
b, _ := json.Marshal(map[string]any{
"p": "response/content",
"v": "字",
})
lines = append(lines, "data: "+string(b))
}
lines = append(lines, "data: [DONE]")
resp := makeSSEHTTPResponse(lines...)
rec := httptest.NewRecorder()
req := httptest.NewRequest(http.MethodPost, "/v1/chat/completions", nil)
h.handleStream(rec, req, resp, "cid-coalesce", "deepseek-v4-flash", "prompt", 0, false, false, nil, nil, nil)
frames, done := parseSSEDataFrames(t, rec.Body.String())
if !done {
t.Fatalf("expected [DONE], body=%s", rec.Body.String())
}
var content strings.Builder
contentDeltaFrames := 0
for _, frame := range frames {
choices, _ := frame["choices"].([]any)
if len(choices) != 1 {
t.Fatalf("expected exactly one choice per stream frame, got %d frame=%#v body=%s", len(choices), frame, rec.Body.String())
}
choice, _ := choices[0].(map[string]any)
delta, _ := choice["delta"].(map[string]any)
if c, ok := delta["content"].(string); ok {
contentDeltaFrames++
content.WriteString(c)
}
}
if got, want := content.String(), strings.Repeat("字", 100); got != want {
t.Fatalf("coalesced stream content mismatch: got %q want %q body=%s", got, want, rec.Body.String())
}
if contentDeltaFrames >= 100 {
t.Fatalf("expected coalescing to reduce 100 tiny content frames, got %d body=%s", contentDeltaFrames, rec.Body.String())
}
}
func TestHandleStreamIncompleteCapturedToolJSONFlushesAsTextOnFinalize(t *testing.T) {
h := &Handler{}
resp := makeSSEHTTPResponse(

View File

@@ -0,0 +1,39 @@
package responses
import (
"strings"
openaifmt "ds2api/internal/format/openai"
)
type responsesDeltaBatch struct {
runtime *responsesStreamRuntime
kind string
text strings.Builder
}
func (b *responsesDeltaBatch) append(kind, text string) {
if text == "" {
return
}
if b.kind != "" && b.kind != kind {
b.flush()
}
b.kind = kind
b.text.WriteString(text)
}
func (b *responsesDeltaBatch) flush() {
if b.kind == "" || b.text.Len() == 0 {
return
}
text := b.text.String()
switch b.kind {
case "reasoning":
b.runtime.sendEvent("response.reasoning.delta", openaifmt.BuildResponsesReasoningDeltaPayload(b.runtime.responseID, text))
case "text":
b.runtime.emitTextDelta(text)
}
b.kind = ""
b.text.Reset()
}

View File

@@ -222,6 +222,7 @@ func (s *responsesStreamRuntime) onParsed(parsed sse.LineResult) streamengine.Pa
}
contentSeen := false
batch := responsesDeltaBatch{runtime: s}
for _, p := range parsed.ToolDetectionThinkingParts {
trimmed := sse.TrimContinuationOverlap(s.toolDetectionThinking.String(), p.Text)
if trimmed != "" {
@@ -247,7 +248,7 @@ func (s *responsesStreamRuntime) onParsed(parsed sse.LineResult) streamengine.Pa
continue
}
s.thinking.WriteString(trimmed)
s.sendEvent("response.reasoning.delta", openaifmt.BuildResponsesReasoningDeltaPayload(s.responseID, trimmed))
batch.append("reasoning", trimmed)
continue
}
@@ -269,11 +270,13 @@ func (s *responsesStreamRuntime) onParsed(parsed sse.LineResult) streamengine.Pa
if trimmed == "" {
continue
}
s.emitTextDelta(trimmed)
batch.append("text", trimmed)
continue
}
batch.flush()
s.processToolStreamEvents(toolstream.ProcessChunk(&s.sieve, rawTrimmed, s.toolNames), true, true)
}
batch.flush()
return streamengine.ParsedDecision{ContentSeen: contentSeen}
}

View File

@@ -109,6 +109,48 @@ func TestHandleResponsesStreamOutputTextDeltaCarriesItemIndexes(t *testing.T) {
}
}
func TestHandleResponsesStreamCoalescesSmallOutputTextDeltas(t *testing.T) {
h := &Handler{}
req := httptest.NewRequest(http.MethodPost, "/v1/responses", nil)
rec := httptest.NewRecorder()
var streamBody strings.Builder
for i := 0; i < 100; i++ {
b, _ := json.Marshal(map[string]any{
"p": "response/content",
"v": "字",
})
streamBody.WriteString("data: ")
streamBody.WriteString(string(b))
streamBody.WriteString("\n")
}
streamBody.WriteString("data: [DONE]\n")
resp := &http.Response{
StatusCode: http.StatusOK,
Body: io.NopCloser(strings.NewReader(streamBody.String())),
}
h.handleResponsesStream(rec, req, resp, "owner-a", "resp_coalesce", "deepseek-v4-flash", "prompt", 0, false, false, nil, nil, promptcompat.DefaultToolChoicePolicy(), "")
payloads := extractSSEEventPayloads(rec.Body.String(), "response.output_text.delta")
if len(payloads) == 0 {
t.Fatalf("expected response.output_text.delta payloads, body=%s", rec.Body.String())
}
var content strings.Builder
for _, payload := range payloads {
content.WriteString(asString(payload["delta"]))
}
if got, want := content.String(), strings.Repeat("字", 100); got != want {
t.Fatalf("coalesced response content mismatch: got %q want %q body=%s", got, want, rec.Body.String())
}
if len(payloads) >= 100 {
t.Fatalf("expected coalescing to reduce 100 tiny text deltas, got %d body=%s", len(payloads), rec.Body.String())
}
if !strings.Contains(rec.Body.String(), "event: response.completed") {
t.Fatalf("expected completed event, body=%s", rec.Body.String())
}
}
func TestHandleResponsesStreamEmitsDistinctToolCallIDsAcrossSeparateToolBlocks(t *testing.T) {
h := &Handler{}
req := httptest.NewRequest(http.MethodPost, "/v1/responses", nil)

View File

@@ -1,5 +1,8 @@
'use strict';
const MIN_DELTA_FLUSH_CHARS = 160;
const MAX_DELTA_FLUSH_WAIT_MS = 80;
function createChatCompletionEmitter({ res, sessionID, created, model, isClosed }) {
let firstChunkSent = false;
@@ -34,6 +37,62 @@ function createChatCompletionEmitter({ res, sessionID, created, model, isClosed
};
}
function createDeltaCoalescer({ sendDeltaFrame, minFlushChars = MIN_DELTA_FLUSH_CHARS, maxFlushWaitMS = MAX_DELTA_FLUSH_WAIT_MS }) {
let pendingField = '';
let pendingText = '';
let flushTimer = null;
const clearFlushTimer = () => {
if (flushTimer) {
clearTimeout(flushTimer);
flushTimer = null;
}
};
const flush = () => {
clearFlushTimer();
if (!pendingField || !pendingText) {
return;
}
const delta = { [pendingField]: pendingText };
pendingField = '';
pendingText = '';
sendDeltaFrame(delta);
};
const scheduleFlush = () => {
if (flushTimer || maxFlushWaitMS <= 0) {
return;
}
flushTimer = setTimeout(flush, maxFlushWaitMS);
if (typeof flushTimer.unref === 'function') {
flushTimer.unref();
}
};
const append = (field, text) => {
if (!field || !text) {
return;
}
if (pendingField && pendingField !== field) {
flush();
}
pendingField = field;
pendingText += text;
if ([...pendingText].length >= minFlushChars) {
flush();
return;
}
scheduleFlush();
};
return {
append,
flush,
};
}
module.exports = {
createChatCompletionEmitter,
createDeltaCoalescer,
};

View File

@@ -20,7 +20,7 @@ const {
boolDefaultTrue,
resetStreamToolCallState,
} = require('./toolcall_policy');
const { createChatCompletionEmitter } = require('./stream_emitter');
const { createChatCompletionEmitter, createDeltaCoalescer } = require('./stream_emitter');
const {
asString,
isAbortError,
@@ -191,6 +191,7 @@ async function handleVercelStream(req, res, rawBody, payload) {
model,
isClosed: () => clientClosed,
});
const deltaCoalescer = createDeltaCoalescer({ sendDeltaFrame });
const finish = async (reason, options = {}) => {
if (ended) {
@@ -201,6 +202,7 @@ async function handleVercelStream(req, res, rawBody, payload) {
await releaseLease();
return true;
}
deltaCoalescer.flush();
const detected = parseStandaloneToolCalls(outputText, toolNames);
if (detected.length > 0 && !toolCallsDoneEmitted) {
toolCallsEmitted = true;
@@ -210,6 +212,7 @@ async function handleVercelStream(req, res, rawBody, payload) {
const tailEvents = flushToolSieve(toolSieveState, toolNames);
for (const evt of tailEvents) {
if (evt.type === 'tool_calls' && Array.isArray(evt.calls) && evt.calls.length > 0) {
deltaCoalescer.flush();
toolCallsEmitted = true;
toolCallsDoneEmitted = true;
sendDeltaFrame({ tool_calls: formatOpenAIStreamToolCalls(evt.calls, streamToolCallIDs, payload.tools) });
@@ -217,9 +220,10 @@ async function handleVercelStream(req, res, rawBody, payload) {
continue;
}
if (evt.text) {
sendDeltaFrame({ content: evt.text });
deltaCoalescer.append('content', evt.text);
}
}
deltaCoalescer.flush();
}
if (detected.length > 0 || toolCallsEmitted) {
reason = 'tool_calls';
@@ -327,7 +331,7 @@ async function handleVercelStream(req, res, rawBody, payload) {
continue;
}
thinkingText += trimmed;
sendDeltaFrame({ reasoning_content: trimmed });
deltaCoalescer.append('reasoning_content', trimmed);
}
} else {
const trimmed = trimContinuationOverlap(outputText, p.text);
@@ -339,7 +343,7 @@ async function handleVercelStream(req, res, rawBody, payload) {
}
outputText += trimmed;
if (!toolSieveEnabled) {
sendDeltaFrame({ content: trimmed });
deltaCoalescer.append('content', trimmed);
continue;
}
const events = processToolSieveChunk(toolSieveState, trimmed, toolNames);
@@ -352,19 +356,21 @@ async function handleVercelStream(req, res, rawBody, payload) {
const formatted = formatIncrementalToolCallDeltas(filtered, streamToolCallIDs);
if (formatted.length > 0) {
toolCallsEmitted = true;
sendDeltaFrame({ tool_calls: formatted });
deltaCoalescer.flush();
sendDeltaFrame({ tool_calls: formatted });
}
continue;
}
if (evt.type === 'tool_calls') {
toolCallsEmitted = true;
toolCallsDoneEmitted = true;
deltaCoalescer.flush();
sendDeltaFrame({ tool_calls: formatOpenAIStreamToolCalls(evt.calls, streamToolCallIDs, payload.tools) });
resetStreamToolCallState(streamToolCallIDs, streamToolNames);
continue;
}
if (evt.text) {
sendDeltaFrame({ content: evt.text });
deltaCoalescer.append('content', evt.text);
}
}
}
@@ -510,32 +516,51 @@ function observeContinueState(state, chunk) {
if (topID > 0) {
state.responseMessageID = topID;
}
if (chunk.p === 'response/status') {
setContinueStatus(state, asString(chunk.v));
}
observeContinueDirectPatch(state, chunk.p, chunk.v);
if (chunk.p === 'response') {
observeContinueBatchPatches(state, 'response', chunk.v);
} else {
observeContinueBatchPatches(state, '', chunk.v);
}
const response = chunk.v && typeof chunk.v === 'object' ? chunk.v.response : null;
if (response && typeof response === 'object') {
const id = numberValue(response.message_id);
if (id > 0) {
state.responseMessageID = id;
}
setContinueStatus(state, asString(response.status));
if (response.auto_continue === true) {
state.lastStatus = 'AUTO_CONTINUE';
}
}
observeContinueResponseObject(state, response);
const messageResponse = chunk.message && typeof chunk.message === 'object' && chunk.message.response;
if (messageResponse && typeof messageResponse === 'object') {
const id = numberValue(messageResponse.message_id);
if (id > 0) {
state.responseMessageID = id;
}
setContinueStatus(state, asString(messageResponse.status));
observeContinueResponseObject(state, messageResponse);
}
function observeContinueDirectPatch(state, path, value) {
if (!state) {
return;
}
switch (asString(path).trim().replace(/^\/+|\/+$/g, '')) {
case 'response/status':
case 'status':
case 'response/quasi_status':
case 'quasi_status':
setContinueStatus(state, asString(value));
break;
case 'response/auto_continue':
case 'auto_continue':
if (value === true) {
state.lastStatus = 'AUTO_CONTINUE';
}
break;
default:
break;
}
}
function observeContinueResponseObject(state, response) {
if (!state || !response || typeof response !== 'object') {
return;
}
const id = numberValue(response.message_id);
if (id > 0) {
state.responseMessageID = id;
}
setContinueStatus(state, asString(response.status));
if (response.auto_continue === true) {
state.lastStatus = 'AUTO_CONTINUE';
}
}
@@ -563,6 +588,12 @@ function observeContinueBatchPatches(state, parentPath, raw) {
case 'quasi_status':
setContinueStatus(state, asString(patch.v));
break;
case 'response/auto_continue':
case 'auto_continue':
if (patch.v === true) {
state.lastStatus = 'AUTO_CONTINUE';
}
break;
default:
break;
}

View File

@@ -248,6 +248,9 @@ function replaceDSMLToolMarkupOutsideIgnored(text) {
if (tag) {
if (tag.dsmlLike) {
out += `<${tag.closing ? '/' : ''}${tag.name}${raw.slice(tag.nameEnd, tag.end + 1)}`;
if (raw[tag.end] !== '>') {
out += '>';
}
} else {
out += raw.slice(tag.start, tag.end + 1);
}
@@ -424,31 +427,42 @@ function scanToolMarkupTagAt(text, start) {
}
const lower = raw.toLowerCase();
let i = start + 1;
while (i < raw.length && raw[i] === '<') {
i += 1;
}
const closing = raw[i] === '/';
if (closing) {
i += 1;
}
let dsmlLike = false;
if (i < raw.length && isToolMarkupPipe(raw[i])) {
dsmlLike = true;
i += 1;
}
if (lower.startsWith('dsml', i)) {
dsmlLike = true;
i += 'dsml'.length;
while (i < raw.length && isToolMarkupSeparator(raw[i])) {
i += 1;
}
}
const prefix = consumeToolMarkupNamePrefix(raw, lower, i);
i = prefix.next;
const dsmlLike = prefix.dsmlLike;
const { name, len } = matchToolMarkupName(lower, i);
if (!name) {
return null;
}
const nameEnd = i + len;
const originalNameEnd = i + len;
let nameEnd = originalNameEnd;
while (nameEnd < raw.length && isToolMarkupPipe(raw[nameEnd])) {
nameEnd += 1;
}
const hasTrailingPipe = nameEnd > originalNameEnd;
if (!hasXmlTagBoundary(raw, nameEnd)) {
return null;
}
const end = findXmlTagEnd(raw, nameEnd);
let end = findXmlTagEnd(raw, nameEnd);
if (end < 0) {
if (!hasTrailingPipe) {
return null;
}
end = nameEnd - 1;
}
if (hasTrailingPipe) {
const nextLT = raw.indexOf('<', nameEnd);
if (nextLT >= 0 && end >= nextLT) {
end = nameEnd - 1;
}
}
if (end < 0) {
return null;
}
@@ -520,37 +534,94 @@ function findPartialToolMarkupStart(text) {
if (lastLT < 0) {
return -1;
}
const tail = raw.slice(lastLT);
const start = includeDuplicateLeadingLessThan(raw, lastLT);
const tail = raw.slice(start);
if (tail.includes('>')) {
return -1;
}
const lowerTail = tail.toLowerCase();
const candidates = [
'<tool_calls', '<invoke', '<parameter',
'<|tool_calls', '<|invoke', '<|parameter',
'<tool_calls', '<invoke', '<parameter',
'<|dsml|tool_calls', '<|dsml|invoke', '<|dsml|parameter',
'<dsml|tool_calls', '<dsml|invoke', '<dsml|parameter',
'<dsmltool_calls', '<dsmlinvoke', '<dsmlparameter',
'<dsml tool_calls', '<dsml invoke', '<dsml parameter',
'<dsml|tool_calls', '<dsml|invoke', '<dsml|parameter',
'<|dsmltool_calls', '<|dsmlinvoke', '<|dsmlparameter',
'<|dsml tool_calls', '<|dsml invoke', '<|dsml parameter',
];
for (const candidate of candidates) {
if (candidate.startsWith(lowerTail)) {
return lastLT;
}
return isPartialToolMarkupTagPrefix(tail) ? start : -1;
}
function includeDuplicateLeadingLessThan(text, idx) {
let out = idx;
while (out > 0 && text[out - 1] === '<') {
out -= 1;
}
return -1;
return out;
}
function isToolMarkupPipe(ch) {
return ch === '|' || ch === '';
}
function isToolMarkupSeparator(ch) {
return ch === ' ' || ch === '\t' || ch === '\r' || ch === '\n' || isToolMarkupPipe(ch);
function isPartialToolMarkupTagPrefix(text) {
const raw = toStringSafe(text);
if (!raw || raw[0] !== '<' || raw.includes('>')) {
return false;
}
const lower = raw.toLowerCase();
let i = 1;
while (i < raw.length && raw[i] === '<') {
i += 1;
}
if (i >= raw.length) {
return true;
}
if (raw[i] === '/') {
i += 1;
}
while (i <= raw.length) {
if (i === raw.length) {
return true;
}
if (hasToolMarkupNamePrefix(lower.slice(i))) {
return true;
}
if ('dsml'.startsWith(lower.slice(i))) {
return true;
}
const next = consumeToolMarkupNamePrefixOnce(raw, lower, i);
if (!next.ok) {
return false;
}
i = next.next;
}
return false;
}
function consumeToolMarkupNamePrefix(raw, lower, idx) {
let next = idx;
let dsmlLike = false;
while (true) {
const consumed = consumeToolMarkupNamePrefixOnce(raw, lower, next);
if (!consumed.ok) {
return { next, dsmlLike };
}
next = consumed.next;
dsmlLike = true;
}
}
function consumeToolMarkupNamePrefixOnce(raw, lower, idx) {
if (idx < raw.length && isToolMarkupPipe(raw[idx])) {
return { next: idx + 1, ok: true };
}
if (idx < raw.length && [' ', '\t', '\r', '\n'].includes(raw[idx])) {
return { next: idx + 1, ok: true };
}
if (lower.startsWith('dsml', idx)) {
return { next: idx + 'dsml'.length, ok: true };
}
return { next: idx, ok: false };
}
function hasToolMarkupNamePrefix(lowerTail) {
for (const name of TOOL_MARKUP_NAMES) {
if (lowerTail.startsWith(name) || name.startsWith(lowerTail)) {
return true;
}
}
return false;
}
function matchToolMarkupName(lower, start) {

View File

@@ -1,55 +0,0 @@
'use strict';
const XML_TOOL_SEGMENT_TAGS = [
'<|dsml|tool_calls>', '<|dsml|tool_calls\n', '<|dsml|tool_calls ',
'<dsml|tool_calls>', '<dsml|tool_calls\n', '<dsml|tool_calls ',
'<|dsml|invoke ', '<|dsml|invoke\n', '<|dsml|invoke\t', '<|dsml|invoke\r',
'<|dsmltool_calls>', '<|dsmltool_calls\n', '<|dsmltool_calls ',
'<|dsmlinvoke ', '<|dsmlinvoke\n', '<|dsmlinvoke\t', '<|dsmlinvoke\r',
'<|dsml tool_calls>', '<|dsml tool_calls\n', '<|dsml tool_calls ',
'<|dsml invoke ', '<|dsml invoke\n', '<|dsml invoke\t', '<|dsml invoke\r',
'<dsml|tool_calls>', '<dsml|tool_calls\n', '<dsml|tool_calls ',
'<dsml|invoke ', '<dsml|invoke\n', '<dsml|invoke\t', '<dsml|invoke\r',
'<dsmltool_calls>', '<dsmltool_calls\n', '<dsmltool_calls ',
'<dsmlinvoke ', '<dsmlinvoke\n', '<dsmlinvoke\t', '<dsmlinvoke\r',
'<dsml tool_calls>', '<dsml tool_calls\n', '<dsml tool_calls ',
'<dsml invoke ', '<dsml invoke\n', '<dsml invoke\t', '<dsml invoke\r',
'<tool_calls>', '<tool_calls\n', '<tool_calls ',
'<invoke ', '<invoke\n', '<invoke\t', '<invoke\r',
'<|tool_calls>', '<|tool_calls\n', '<|tool_calls ',
'<|invoke ', '<|invoke\n', '<|invoke\t', '<|invoke\r',
'<tool_calls>', '<tool_calls\n', '<tool_calls ',
'<invoke ', '<invoke\n', '<invoke\t', '<invoke\r',
];
const XML_TOOL_OPENING_TAGS = [
'<|dsml|tool_calls',
'<dsml|tool_calls',
'<|dsmltool_calls',
'<|dsml tool_calls',
'<dsml|tool_calls',
'<dsmltool_calls',
'<dsml tool_calls',
'<tool_calls',
'<|tool_calls',
'<tool_calls',
];
const XML_TOOL_CLOSING_TAGS = [
'</|dsml|tool_calls>',
'</dsml|tool_calls>',
'</|dsmltool_calls>',
'</|dsml tool_calls>',
'</dsml|tool_calls>',
'</dsmltool_calls>',
'</dsml tool_calls>',
'</tool_calls>',
'</|tool_calls>',
'</tool_calls>',
];
module.exports = {
XML_TOOL_SEGMENT_TAGS,
XML_TOOL_OPENING_TAGS,
XML_TOOL_CLOSING_TAGS,
};

View File

@@ -88,6 +88,58 @@ func TestBuildOpenAIFinalPrompt_VercelPreparePathKeepsFinalAnswerInstruction(t *
}
}
func TestBuildOpenAIFinalPromptReadLikeToolIncludesCacheGuard(t *testing.T) {
messages := []any{
map[string]any{"role": "user", "content": "请读取文件"},
}
tools := []any{
map[string]any{
"type": "function",
"function": map[string]any{
"name": "read_file",
"description": "Read a file",
"parameters": map[string]any{
"type": "object",
},
},
},
}
finalPrompt, _ := buildOpenAIFinalPrompt(messages, tools, "", false)
if !strings.Contains(finalPrompt, "Read-tool cache guard") {
t.Fatalf("read-like tool prompt missing cache guard: %q", finalPrompt)
}
if !strings.Contains(finalPrompt, "provides no file body") {
t.Fatalf("read-like tool prompt missing no-body handling: %q", finalPrompt)
}
if !strings.Contains(finalPrompt, "Do not repeatedly call the same read request") {
t.Fatalf("read-like tool prompt missing loop guard: %q", finalPrompt)
}
}
func TestBuildOpenAIFinalPromptNonReadToolOmitsCacheGuard(t *testing.T) {
messages := []any{
map[string]any{"role": "user", "content": "搜索一下"},
}
tools := []any{
map[string]any{
"type": "function",
"function": map[string]any{
"name": "search",
"description": "Search docs",
"parameters": map[string]any{
"type": "object",
},
},
},
}
finalPrompt, _ := buildOpenAIFinalPrompt(messages, tools, "", false)
if strings.Contains(finalPrompt, "Read-tool cache guard") {
t.Fatalf("non-read tool prompt should not include read cache guard: %q", finalPrompt)
}
}
func TestBuildOpenAIFinalPromptWithThinkingKeepsPromptUnchanged(t *testing.T) {
messages := []any{
map[string]any{"role": "user", "content": "继续回答上一个问题"},

View File

@@ -4,6 +4,7 @@ import (
"encoding/json"
"fmt"
"strings"
"unicode"
"ds2api/internal/toolcall"
)
@@ -46,6 +47,9 @@ func injectToolPrompt(messages []map[string]any, tools []any, policy ToolChoiceP
return messages, names
}
toolPrompt := "You have access to these tools:\n\n" + strings.Join(toolSchemas, "\n\n") + "\n\n" + toolcall.BuildToolCallInstructions(names)
if hasReadLikeTool(names) {
toolPrompt += "\n\nRead-tool cache guard: If a Read/read_file-style tool result says the file is unchanged, already available in history, should be referenced from previous context, or otherwise provides no file body, treat that result as missing content. Do not repeatedly call the same read request for that missing body. Request a full-content read if the tool supports it, or tell the user that the file contents need to be provided again."
}
if policy.Mode == ToolChoiceRequired {
toolPrompt += "\n7) For this response, you MUST call at least one tool from the allowed list."
}
@@ -64,3 +68,23 @@ func injectToolPrompt(messages []map[string]any, tools []any, policy ToolChoiceP
messages = append([]map[string]any{{"role": "system", "content": toolPrompt}}, messages...)
return messages, names
}
func hasReadLikeTool(names []string) bool {
for _, name := range names {
switch normalizeToolNameForGuard(name) {
case "read", "readfile":
return true
}
}
return false
}
func normalizeToolNameForGuard(name string) string {
var b strings.Builder
for _, r := range strings.ToLower(strings.TrimSpace(name)) {
if unicode.IsLetter(r) || unicode.IsDigit(r) {
b.WriteRune(r)
}
}
return b.String()
}

View File

@@ -41,6 +41,15 @@ func TestCollectStreamTextOnly(t *testing.T) {
}
}
func TestCollectStreamHandlesLongSingleSSELine(t *testing.T) {
payload := strings.Repeat("x", 2*1024*1024+4096)
resp := makeHTTPResponse(makeLargeContentSSEBody(t, payload))
result := CollectStream(resp, false, true)
if result.Text != payload {
t.Fatalf("long SSE line payload mismatch: got len=%d want len=%d", len(result.Text), len(payload))
}
}
func TestCollectStreamThinkingAndText(t *testing.T) {
resp := makeHTTPResponse(
"data: {\"p\":\"response/thinking_content\",\"v\":\"Thinking...\"}\n" +

View File

@@ -9,8 +9,7 @@ import (
const (
parsedLineBufferSize = 128
scannerBufferSize = 64 * 1024
maxScannerLineSize = 2 * 1024 * 1024
lineReaderBufferSize = 64 * 1024
minFlushChars = 160
maxFlushWait = 80 * time.Millisecond
)
@@ -29,8 +28,8 @@ func StartParsedLinePump(ctx context.Context, body io.Reader, thinkingEnabled bo
eof bool
}
lineCh := make(chan scanItem, 1)
stopScanner := make(chan struct{})
defer close(stopScanner)
stopReader := make(chan struct{})
defer close(stopReader)
go func() {
sendScanItem := func(item scanItem) bool {
select {
@@ -38,20 +37,28 @@ func StartParsedLinePump(ctx context.Context, body io.Reader, thinkingEnabled bo
return true
case <-ctx.Done():
return false
case <-stopScanner:
case <-stopReader:
return false
}
}
defer close(lineCh)
scanner := bufio.NewScanner(body)
scanner.Buffer(make([]byte, 0, scannerBufferSize), maxScannerLineSize)
for scanner.Scan() {
line := append([]byte{}, scanner.Bytes()...)
if !sendScanItem(scanItem{line: line}) {
reader := bufio.NewReaderSize(body, lineReaderBufferSize)
for {
line, err := reader.ReadBytes('\n')
if len(line) > 0 {
line = append([]byte{}, line...)
if !sendScanItem(scanItem{line: line}) {
return
}
}
if err != nil {
if err == io.EOF {
err = nil
}
_ = sendScanItem(scanItem{err: err, eof: true})
return
}
}
_ = sendScanItem(scanItem{err: scanner.Err(), eof: true})
}()
ticker := time.NewTicker(maxFlushWait)

View File

@@ -2,10 +2,23 @@ package sse
import (
"context"
"encoding/json"
"strings"
"testing"
)
func makeLargeContentSSEBody(t *testing.T, payload string) string {
t.Helper()
line, err := json.Marshal(map[string]any{
"p": "response/content",
"v": payload,
})
if err != nil {
t.Fatalf("marshal SSE line failed: %v", err)
}
return "data: " + string(line) + "\n" + "data: [DONE]\n"
}
func TestStartParsedLinePumpParsesAndStops(t *testing.T) {
body := strings.NewReader("data: {\"p\":\"response/content\",\"v\":\"hi\"}\n\ndata: [DONE]\n")
results, done := StartParsedLinePump(context.Background(), body, false, "text")
@@ -28,3 +41,28 @@ func TestStartParsedLinePumpParsesAndStops(t *testing.T) {
t.Fatalf("expected last line to stop stream, got parsed=%v stop=%v", last.Parsed, last.Stop)
}
}
func TestStartParsedLinePumpHandlesLongSingleSSELine(t *testing.T) {
payload := strings.Repeat("x", 2*1024*1024+4096)
results, done := StartParsedLinePump(context.Background(), strings.NewReader(makeLargeContentSSEBody(t, payload)), false, "text")
var got strings.Builder
var sawDone bool
for r := range results {
for _, p := range r.Parts {
got.WriteString(p.Text)
}
if r.Stop {
sawDone = true
}
}
if err := <-done; err != nil {
t.Fatalf("unexpected long-line read error: %v", err)
}
if got.String() != payload {
t.Fatalf("long SSE line payload mismatch: got len=%d want len=%d", got.Len(), len(payload))
}
if !sawDone {
t.Fatal("expected DONE after long SSE line")
}
}

View File

@@ -44,6 +44,9 @@ func rewriteDSMLToolMarkupOutsideIgnored(text string) string {
}
b.WriteString(tag.Name)
b.WriteString(text[tag.NameEnd : tag.End+1])
if text[tag.End] != '>' {
b.WriteByte('>')
}
i = tag.End + 1
continue
}

View File

@@ -128,34 +128,39 @@ func scanToolMarkupTagAt(text string, start int) (ToolMarkupTag, bool) {
}
lower := strings.ToLower(text)
i := start + 1
for i < len(text) && text[i] == '<' {
i++
}
closing := false
if i < len(text) && text[i] == '/' {
closing = true
i++
}
dsmlLike := false
if next, ok := consumeToolMarkupPipe(text, i); ok {
dsmlLike = true
i = next
}
if strings.HasPrefix(lower[i:], "dsml") {
dsmlLike = true
i += len("dsml")
for next, ok := consumeToolMarkupSeparator(text, i); ok; next, ok = consumeToolMarkupSeparator(text, i) {
i = next
}
}
i, dsmlLike := consumeToolMarkupNamePrefix(lower, text, i)
name, nameLen := matchToolMarkupName(lower, i)
if nameLen == 0 {
return ToolMarkupTag{}, false
}
nameEnd := i + nameLen
nameEndBeforePipes := nameEnd
for next, ok := consumeToolMarkupPipe(text, nameEnd); ok; next, ok = consumeToolMarkupPipe(text, nameEnd) {
nameEnd = next
}
hasTrailingPipe := nameEnd > nameEndBeforePipes
if !hasToolMarkupBoundary(text, nameEnd) {
return ToolMarkupTag{}, false
}
end := findXMLTagEnd(text, nameEnd)
if end < 0 {
return ToolMarkupTag{}, false
if !hasTrailingPipe {
return ToolMarkupTag{}, false
}
end = nameEnd - 1
}
if hasTrailingPipe {
if nextLT := strings.IndexByte(text[nameEnd:], '<'); nextLT >= 0 && end >= nameEnd+nextLT {
end = nameEnd - 1
}
}
trimmed := strings.TrimSpace(text[start : end+1])
return ToolMarkupTag{
@@ -171,6 +176,74 @@ func scanToolMarkupTagAt(text string, start int) (ToolMarkupTag, bool) {
}, true
}
func IsPartialToolMarkupTagPrefix(text string) bool {
if text == "" || text[0] != '<' || strings.Contains(text, ">") {
return false
}
lower := strings.ToLower(text)
i := 1
for i < len(text) && text[i] == '<' {
i++
}
if i >= len(text) {
return true
}
if text[i] == '/' {
i++
}
for i <= len(text) {
if i == len(text) {
return true
}
if hasToolMarkupNamePrefix(lower[i:]) {
return true
}
if strings.HasPrefix("dsml", lower[i:]) {
return true
}
next, ok := consumeToolMarkupNamePrefixOnce(lower, text, i)
if !ok {
return false
}
i = next
}
return false
}
func consumeToolMarkupNamePrefix(lower, text string, idx int) (int, bool) {
dsmlLike := false
for {
next, ok := consumeToolMarkupNamePrefixOnce(lower, text, idx)
if !ok {
return idx, dsmlLike
}
idx = next
dsmlLike = true
}
}
func consumeToolMarkupNamePrefixOnce(lower, text string, idx int) (int, bool) {
if next, ok := consumeToolMarkupPipe(text, idx); ok {
return next, true
}
if idx < len(text) && (text[idx] == ' ' || text[idx] == '\t' || text[idx] == '\r' || text[idx] == '\n') {
return idx + 1, true
}
if strings.HasPrefix(lower[idx:], "dsml") {
return idx + len("dsml"), true
}
return idx, false
}
func hasToolMarkupNamePrefix(lowerTail string) bool {
for _, name := range toolMarkupNames {
if strings.HasPrefix(lowerTail, name) || strings.HasPrefix(name, lowerTail) {
return true
}
}
return false
}
func matchToolMarkupName(lower string, start int) (string, int) {
for _, name := range toolMarkupNames {
if strings.HasPrefix(lower[start:], name) {
@@ -193,19 +266,6 @@ func consumeToolMarkupPipe(text string, idx int) (int, bool) {
return idx, false
}
func consumeToolMarkupSeparator(text string, idx int) (int, bool) {
if idx >= len(text) {
return idx, false
}
if text[idx] == ' ' || text[idx] == '\t' || text[idx] == '\r' || text[idx] == '\n' {
return idx + 1, true
}
if next, ok := consumeToolMarkupPipe(text, idx); ok {
return next, true
}
return idx, false
}
func hasToolMarkupBoundary(text string, idx int) bool {
if idx >= len(text) {
return true

View File

@@ -41,6 +41,52 @@ func TestParseToolCallsSupportsDSMLShell(t *testing.T) {
}
}
func TestParseToolCallsToleratesDSMLTrailingPipeTagTerminator(t *testing.T) {
text := strings.Join([]string{
`<|DSML|tool_calls| `,
` <|DSML|invoke name="terminal">`,
` <|DSML|parameter name="command"><![CDATA[find "/home" -type d]]></|DSML|parameter>`,
` <|DSML|parameter name="timeout"><![CDATA[10]]></|DSML|parameter>`,
` </|DSML|invoke>`,
`</|DSML|tool_calls>`,
}, "\n")
calls := ParseToolCalls(text, []string{"terminal"})
if len(calls) != 1 {
t.Fatalf("expected one trailing-pipe DSML call, got %#v", calls)
}
if calls[0].Name != "terminal" {
t.Fatalf("expected terminal tool, got %#v", calls[0])
}
if calls[0].Input["command"] != `find "/home" -type d` {
t.Fatalf("expected command argument, got %#v", calls[0].Input)
}
if calls[0].Input["timeout"] != float64(10) {
t.Fatalf("expected numeric timeout, got %#v", calls[0].Input)
}
}
func TestParseToolCallsToleratesExtraLeadingLessThanBeforeDSML(t *testing.T) {
text := `<<|DSML|tool_calls><<|DSML|invoke name="Bash"><<|DSML|parameter name="command"><![CDATA[pwd]]></|DSML|parameter></|DSML|invoke></|DSML|tool_calls>`
calls := ParseToolCalls(text, []string{"Bash"})
if len(calls) != 1 {
t.Fatalf("expected one extra-leading-less-than DSML call, got %#v", calls)
}
if calls[0].Name != "Bash" || calls[0].Input["command"] != "pwd" {
t.Fatalf("unexpected extra-leading-less-than DSML parse result: %#v", calls[0])
}
}
func TestParseToolCallsToleratesRepeatedDSMLPrefixNoise(t *testing.T) {
text := `<<DSML|DSML|tool_calls><<DSML|DSML|invoke name="Bash"><<DSML|DSML|parameter name="command"><![CDATA[git status]]></DSML|DSML|parameter></DSML|DSML|invoke></DSML|DSML|tool_calls>`
calls := ParseToolCalls(text, []string{"Bash"})
if len(calls) != 1 {
t.Fatalf("expected one repeated-prefix DSML call, got %#v", calls)
}
if calls[0].Name != "Bash" || calls[0].Input["command"] != "git status" {
t.Fatalf("unexpected repeated-prefix DSML parse result: %#v", calls[0])
}
}
func TestParseToolCallsSupportsDSMLShellWithCanonicalExampleInCDATA(t *testing.T) {
content := `<tool_calls><invoke name="demo"><parameter name="value">x</parameter></invoke></tool_calls>`
text := `<|DSML|tool_calls><|DSML|invoke name="Write"><|DSML|parameter name="file_path">notes.md</|DSML|parameter><|DSML|parameter name="content"><![CDATA[` + content + `]]></|DSML|parameter></|DSML|invoke></|DSML|tool_calls>`

View File

@@ -1,10 +1,6 @@
package toolstream
import (
"strings"
"ds2api/internal/toolcall"
)
import "ds2api/internal/toolcall"
func ProcessChunk(state *State, chunk string, toolNames []string) []Event {
if state == nil {
@@ -174,31 +170,27 @@ func findToolSegmentStart(state *State, s string) int {
if s == "" {
return -1
}
lower := strings.ToLower(s)
offset := 0
for {
bestKeyIdx := -1
matchedTag := ""
for _, tag := range xmlToolTagsToDetect {
idx := strings.Index(lower[offset:], tag)
if idx >= 0 {
idx += offset
if bestKeyIdx < 0 || idx < bestKeyIdx {
bestKeyIdx = idx
matchedTag = tag
}
}
}
if bestKeyIdx < 0 {
tag, ok := toolcall.FindToolMarkupTagOutsideIgnored(s, offset)
if !ok {
return -1
}
if !insideCodeFenceWithState(state, s[:bestKeyIdx]) {
return bestKeyIdx
start := includeDuplicateLeadingLessThan(s, tag.Start)
if !insideCodeFenceWithState(state, s[:start]) {
return start
}
offset = bestKeyIdx + len(matchedTag)
offset = tag.End + 1
}
}
func includeDuplicateLeadingLessThan(s string, idx int) int {
for idx > 0 && s[idx-1] == '<' {
idx--
}
return idx
}
func consumeToolCapture(state *State, toolNames []string) (prefix string, calls []toolcall.ParsedToolCall, suffix string, ready bool) {
captured := state.capture.String()
if captured == "" {

View File

@@ -153,27 +153,14 @@ func findPartialXMLToolTagStart(s string) int {
if lastLT < 0 {
return -1
}
tail := s[lastLT:]
start := includeDuplicateLeadingLessThan(s, lastLT)
tail := s[start:]
// If there's a '>' in the tail, the tag is closed — not partial.
if strings.Contains(tail, ">") {
return -1
}
lowerTail := strings.ToLower(tail)
for _, tag := range []string{
"<tool_calls", "<invoke", "<parameter",
"<|tool_calls", "<|invoke", "<|parameter",
"<tool_calls", "<invoke", "<parameter",
"<|dsml|tool_calls", "<|dsml|invoke", "<|dsml|parameter",
"<dsml|tool_calls", "<dsml|invoke", "<dsml|parameter",
"<dsmltool_calls", "<dsmlinvoke", "<dsmlparameter",
"<dsml tool_calls", "<dsml invoke", "<dsml parameter",
"<dsml|tool_calls", "<dsml|invoke", "<dsml|parameter",
"<|dsmltool_calls", "<|dsmlinvoke", "<|dsmlparameter",
"<|dsml tool_calls", "<|dsml invoke", "<|dsml parameter",
} {
if strings.HasPrefix(tag, lowerTail) {
return lastLT
}
if toolcall.IsPartialToolMarkupTagPrefix(tail) {
return start
}
return -1
}

View File

@@ -1,35 +0,0 @@
package toolstream
import "regexp"
// --- XML tool call support for the streaming sieve ---
//nolint:unused // kept as explicit tag inventory for future XML sieve refinements.
var xmlToolCallClosingTags = []string{"</tool_calls>", "</|dsml|tool_calls>", "</|dsmltool_calls>", "</|dsml tool_calls>", "</dsml|tool_calls>", "</dsmltool_calls>", "</dsml tool_calls>", "</tool_calls>", "</|tool_calls>"}
// xmlToolCallBlockPattern matches a complete canonical XML tool call block.
//
//nolint:unused // reserved for future fast-path XML block detection.
var xmlToolCallBlockPattern = regexp.MustCompile(`(?is)((?:<tool_calls\b|<\|dsml\|tool_calls\b)[^>]*>\s*(?:.*?)\s*(?:</tool_calls>|</\|dsml\|tool_calls>))`)
// xmlToolTagsToDetect is the set of XML tag prefixes used by findToolSegmentStart.
var xmlToolTagsToDetect = []string{
"<|dsml|tool_calls>", "<|dsml|tool_calls\n", "<|dsml|tool_calls ",
"<dsml|tool_calls>", "<dsml|tool_calls\n", "<dsml|tool_calls ",
"<|dsml|invoke ", "<|dsml|invoke\n", "<|dsml|invoke\t", "<|dsml|invoke\r",
"<|dsmltool_calls>", "<|dsmltool_calls\n", "<|dsmltool_calls ",
"<|dsmlinvoke ", "<|dsmlinvoke\n", "<|dsmlinvoke\t", "<|dsmlinvoke\r",
"<|dsml tool_calls>", "<|dsml tool_calls\n", "<|dsml tool_calls ",
"<|dsml invoke ", "<|dsml invoke\n", "<|dsml invoke\t", "<|dsml invoke\r",
"<dsml|tool_calls>", "<dsml|tool_calls\n", "<dsml|tool_calls ",
"<dsml|invoke ", "<dsml|invoke\n", "<dsml|invoke\t", "<dsml|invoke\r",
"<dsmltool_calls>", "<dsmltool_calls\n", "<dsmltool_calls ",
"<dsmlinvoke ", "<dsmlinvoke\n", "<dsmlinvoke\t", "<dsmlinvoke\r",
"<dsml tool_calls>", "<dsml tool_calls\n", "<dsml tool_calls ",
"<dsml invoke ", "<dsml invoke\n", "<dsml invoke\t", "<dsml invoke\r",
"<tool_calls>", "<tool_calls\n", "<tool_calls ",
"<invoke ", "<invoke\n", "<invoke\t", "<invoke\r",
"<|tool_calls>", "<|tool_calls\n", "<|tool_calls ",
"<|invoke ", "<|invoke\n", "<|invoke\t", "<|invoke\r",
"<tool_calls>", "<tool_calls\n", "<tool_calls ", "<invoke ", "<invoke\n", "<invoke\t", "<invoke\r",
}

View File

@@ -72,6 +72,97 @@ func TestProcessToolSieveInterceptsDSMLToolCallWithoutLeak(t *testing.T) {
}
}
func TestProcessToolSieveInterceptsDSMLTrailingPipeToolCallWithoutLeak(t *testing.T) {
var state State
chunks := []string{
"<|DSML|tool_calls| \n",
` <|DSML|invoke name="terminal">` + "\n",
` <|DSML|parameter name="command"><![CDATA[find "/home" -type d]]></|DSML|parameter>` + "\n",
` <|DSML|parameter name="timeout"><![CDATA[10]]></|DSML|parameter>` + "\n",
" </|DSML|invoke>\n",
"</|DSML|tool_calls>",
}
var events []Event
for _, c := range chunks {
events = append(events, ProcessChunk(&state, c, []string{"terminal"})...)
}
events = append(events, Flush(&state, []string{"terminal"})...)
var textContent strings.Builder
var calls []any
for _, evt := range events {
textContent.WriteString(evt.Content)
for _, call := range evt.ToolCalls {
calls = append(calls, call)
}
}
if text := textContent.String(); strings.Contains(strings.ToLower(text), "dsml") || strings.Contains(text, "terminal") {
t.Fatalf("trailing-pipe DSML tool call leaked to text: %q events=%#v", text, events)
}
if len(calls) != 1 {
t.Fatalf("expected one trailing-pipe DSML tool call, got %d events=%#v", len(calls), events)
}
}
func TestProcessToolSieveInterceptsExtraLeadingLessThanDSMLToolCallWithoutLeak(t *testing.T) {
var state State
chunks := []string{
"<<|DSML|tool_calls>\n",
` <<|DSML|invoke name="Bash">` + "\n",
` <<|DSML|parameter name="command"><![CDATA[pwd]]></|DSML|parameter>` + "\n",
" </|DSML|invoke>\n",
"</|DSML|tool_calls>",
}
var events []Event
for _, c := range chunks {
events = append(events, ProcessChunk(&state, c, []string{"Bash"})...)
}
events = append(events, Flush(&state, []string{"Bash"})...)
var textContent strings.Builder
toolCalls := 0
for _, evt := range events {
textContent.WriteString(evt.Content)
toolCalls += len(evt.ToolCalls)
}
if text := textContent.String(); strings.Contains(text, "<") || strings.Contains(text, "Bash") {
t.Fatalf("extra-leading-less-than DSML tool call leaked to text: %q events=%#v", text, events)
}
if toolCalls != 1 {
t.Fatalf("expected one extra-leading-less-than DSML tool call, got %d events=%#v", toolCalls, events)
}
}
func TestProcessToolSieveInterceptsRepeatedDSMLPrefixNoiseWithoutLeak(t *testing.T) {
var state State
chunks := []string{
"<<DSML|DSML|tool",
"_calls>\n",
` <<DSML|DSML|invoke name="Bash">` + "\n",
` <<DSML|DSML|parameter name="command"><![CDATA[git status]]></DSML|DSML|parameter>` + "\n",
" </DSML|DSML|invoke>\n",
"</DSML|DSML|tool_calls>",
}
var events []Event
for _, c := range chunks {
events = append(events, ProcessChunk(&state, c, []string{"Bash"})...)
}
events = append(events, Flush(&state, []string{"Bash"})...)
var textContent strings.Builder
toolCalls := 0
for _, evt := range events {
textContent.WriteString(evt.Content)
toolCalls += len(evt.ToolCalls)
}
if text := textContent.String(); strings.Contains(strings.ToLower(text), "dsml") || strings.Contains(text, "Bash") {
t.Fatalf("repeated-prefix DSML tool call leaked to text: %q events=%#v", text, events)
}
if toolCalls != 1 {
t.Fatalf("expected one repeated-prefix DSML tool call, got %d events=%#v", toolCalls, events)
}
}
func TestProcessToolSieveHandlesLongXMLToolCall(t *testing.T) {
var state State
const toolName = "write_to_file"
@@ -442,6 +533,8 @@ func TestFindToolSegmentStartDetectsXMLToolCalls(t *testing.T) {
want int
}{
{"tool_calls_tag", "some text <tool_calls>\n", 10},
{"dsml_trailing_pipe_tag", "some text <|DSML|tool_calls| \n", 10},
{"dsml_extra_leading_less_than", "some text <<|DSML|tool_calls>\n", 10},
{"invoke_tag_missing_wrapper", "some text <invoke name=\"read_file\">\n", 10},
{"bare_tool_call_text", "prefix <tool_call>\n", -1},
{"xml_inside_code_fence", "```xml\n<tool_calls><invoke name=\"read_file\"></invoke></tool_calls>\n```", -1},
@@ -465,6 +558,8 @@ func TestFindPartialXMLToolTagStart(t *testing.T) {
want int
}{
{"partial_tool_calls", "Hello <tool_ca", 6},
{"partial_dsml_trailing_pipe", "Hello <|DSML|tool_calls|", 6},
{"partial_dsml_extra_leading_less_than", "Hello <<|DSML|tool_calls", 6},
{"partial_invoke", "Hello <inv", 6},
{"bare_tool_call_not_held", "Hello <tool_name", -1},
{"partial_lt_only", "Text <", 5},

View File

@@ -1,11 +1,5 @@
package util
import (
"strings"
tiktoken "github.com/hupe1980/go-tiktoken"
)
const (
defaultTokenizerModel = "gpt-4o"
claudeTokenizerModel = "claude"
@@ -33,41 +27,6 @@ func CountOutputTokens(text, model string) int {
return base
}
func countWithTokenizer(text, model string) int {
text = strings.TrimSpace(text)
if text == "" {
return 0
}
encoding, err := tiktoken.NewEncodingForModel(tokenizerModelForCount(model))
if err != nil {
return 0
}
ids, _, err := encoding.Encode(text, nil, nil)
if err != nil {
return 0
}
return len(ids)
}
func tokenizerModelForCount(model string) string {
model = strings.ToLower(strings.TrimSpace(model))
if model == "" {
return defaultTokenizerModel
}
switch {
case strings.HasPrefix(model, "claude"):
return claudeTokenizerModel
case strings.HasPrefix(model, "gpt-4"), strings.HasPrefix(model, "gpt-5"), strings.HasPrefix(model, "o1"), strings.HasPrefix(model, "o3"), strings.HasPrefix(model, "o4"):
return defaultTokenizerModel
case strings.HasPrefix(model, "deepseek-v4"):
return defaultTokenizerModel
case strings.HasPrefix(model, "deepseek"):
return defaultTokenizerModel
default:
return defaultTokenizerModel
}
}
func conservativePromptPadding(base int) int {
padding := base / 50
if padding < 4 {

View File

@@ -0,0 +1,7 @@
//go:build 386 || arm || mips || mipsle || wasm
package util
func countWithTokenizer(_, _ string) int {
return 0
}

View File

@@ -0,0 +1,44 @@
//go:build !386 && !arm && !mips && !mipsle && !wasm
package util
import (
"strings"
tiktoken "github.com/hupe1980/go-tiktoken"
)
func countWithTokenizer(text, model string) int {
text = strings.TrimSpace(text)
if text == "" {
return 0
}
encoding, err := tiktoken.NewEncodingForModel(tokenizerModelForCount(model))
if err != nil {
return 0
}
ids, _, err := encoding.Encode(text, nil, nil)
if err != nil {
return 0
}
return len(ids)
}
func tokenizerModelForCount(model string) string {
model = strings.ToLower(strings.TrimSpace(model))
if model == "" {
return defaultTokenizerModel
}
switch {
case strings.HasPrefix(model, "claude"):
return claudeTokenizerModel
case strings.HasPrefix(model, "gpt-4"), strings.HasPrefix(model, "gpt-5"), strings.HasPrefix(model, "o1"), strings.HasPrefix(model, "o3"), strings.HasPrefix(model, "o4"):
return defaultTokenizerModel
case strings.HasPrefix(model, "deepseek-v4"):
return defaultTokenizerModel
case strings.HasPrefix(model, "deepseek"):
return defaultTokenizerModel
default:
return defaultTokenizerModel
}
}

View File

@@ -20,4 +20,3 @@ internal/js/helpers/stream-tool-sieve/sieve-xml.js
internal/js/helpers/stream-tool-sieve/jsonscan.js
internal/js/helpers/stream-tool-sieve/parse.js
internal/js/helpers/stream-tool-sieve/format.js
internal/js/helpers/stream-tool-sieve/tool-keywords.js

View File

@@ -210,6 +210,37 @@ test('vercel stream retries empty output once and keeps one terminal frame', asy
assert.match(completionBodies[1].prompt, /Previous reply had no visible output\. Please regenerate the visible final answer or tool call now\.$/);
});
test('vercel stream coalesces many small content deltas while keeping one choice', async () => {
const lines = Array.from({ length: 100 }, () => `data: ${JSON.stringify({ p: 'response/content', v: '字' })}\n\n`);
lines.push('data: [DONE]\n\n');
const { frames } = await runMockVercelStream(lines);
const parsed = frames.filter((frame) => frame !== '[DONE]').map((frame) => JSON.parse(frame));
const contentFrames = parsed.filter((item) => item.choices?.[0]?.delta?.content);
const content = contentFrames.map((item) => item.choices[0].delta.content).join('');
assert.equal(content, '字'.repeat(100));
assert.ok(contentFrames.length < 100, `expected fewer than 100 content frames, got ${contentFrames.length}`);
for (const item of parsed) {
assert.equal(item.choices.length, 1);
}
});
test('vercel stream flushes reasoning before content and before stop', async () => {
const { frames } = await runMockVercelStream([
`data: ${JSON.stringify({ p: 'response/fragments', o: 'APPEND', v: [
{ type: 'THINK', content: '思考' },
{ type: 'THINK', content: '过程' },
{ type: 'RESPONSE', content: '回答' },
] })}\n\n`,
'data: [DONE]\n\n',
], { thinking_enabled: true });
const parsed = frames.filter((frame) => frame !== '[DONE]').map((frame) => JSON.parse(frame));
const reasoning = parsed.map((item) => item.choices?.[0]?.delta?.reasoning_content || '').join('');
const content = parsed.map((item) => item.choices?.[0]?.delta?.content || '').join('');
assert.equal(reasoning, '思考过程');
assert.equal(content, '回答');
assert.equal(parsed.at(-1).choices[0].finish_reason, 'stop');
});
test('vercel stream exhausts DeepSeek continue before synthetic retry', async () => {
const { frames, fetchURLs, fetchBodies } = await runMockVercelStreamSequence([
[
@@ -227,6 +258,28 @@ test('vercel stream exhausts DeepSeek continue before synthetic retry', async ()
assert.equal(fetchBodies.some((body) => String(body.prompt || '').includes('Previous reply had no visible output')), false);
});
test('vercel stream continues direct quasi_status incomplete before final tool call', async () => {
const { frames, fetchURLs } = await runMockVercelStreamSequence([
[
'data: {"response_message_id":7,"p":"response/content","v":"<tool_calls><invoke name=\\"write_file\\"><parameter name=\\"content\\"><![CDATA[part-one"}\n\n',
'data: {"p":"response/quasi_status","v":"INCOMPLETE"}\n\n',
'data: [DONE]\n\n',
],
[
'data: {"response_message_id":8,"p":"response/content","v":"-part-two]]></parameter></invoke></tool_calls>"}\n\n',
'data: {"p":"response/status","v":"FINISHED"}\n\n',
'data: [DONE]\n\n',
],
], { tool_names: ['write_file'] });
const parsed = frames.filter((frame) => frame !== '[DONE]').map((frame) => JSON.parse(frame));
const toolDelta = parsed.find((item) => item.choices?.[0]?.delta?.tool_calls);
assert.equal(fetchURLs.filter((url) => url === 'https://chat.deepseek.com/api/v0/chat/continue').length, 1);
assert.ok(toolDelta);
const args = JSON.parse(toolDelta.choices[0].delta.tool_calls[0].function.arguments);
assert.equal(args.content, 'part-one-part-two');
assert.equal(parsed.at(-1).choices[0].finish_reason, 'tool_calls');
});
test('vercel stream usage completion_tokens does not double-count visible output', async () => {

View File

@@ -57,6 +57,49 @@ test('parseToolCalls parses DSML shell as XML-compatible tool call', () => {
assert.deepEqual(calls[0].input, { path: 'README.MD' });
});
test('parseToolCalls tolerates DSML trailing pipe tag terminator', () => {
const payload = [
'<|DSML|tool_calls| ',
' <|DSML|invoke name="terminal">',
' <|DSML|parameter name="command"><![CDATA[find "/home" -type d]]></|DSML|parameter>',
' <|DSML|parameter name="timeout"><![CDATA[10]]></|DSML|parameter>',
' </|DSML|invoke>',
'</|DSML|tool_calls>',
].join('\n');
const calls = parseToolCalls(payload, ['terminal']);
assert.equal(calls.length, 1);
assert.equal(calls[0].name, 'terminal');
assert.deepEqual(calls[0].input, { command: 'find "/home" -type d', timeout: 10 });
});
test('parseToolCalls tolerates extra leading less-than before DSML tags', () => {
const payload = [
'<<|DSML|tool_calls>',
' <<|DSML|invoke name="Bash">',
' <<|DSML|parameter name="command"><![CDATA[pwd]]></|DSML|parameter>',
' </|DSML|invoke>',
'</|DSML|tool_calls>',
].join('\n');
const calls = parseToolCalls(payload, ['Bash']);
assert.equal(calls.length, 1);
assert.equal(calls[0].name, 'Bash');
assert.deepEqual(calls[0].input, { command: 'pwd' });
});
test('parseToolCalls tolerates repeated DSML prefix noise', () => {
const payload = [
'<<DSML|DSML|tool_calls>',
' <<DSML|DSML|invoke name="Bash">',
' <<DSML|DSML|parameter name="command"><![CDATA[git status]]></DSML|DSML|parameter>',
' </DSML|DSML|invoke>',
'</DSML|DSML|tool_calls>',
].join('\n');
const calls = parseToolCalls(payload, ['Bash']);
assert.equal(calls.length, 1);
assert.equal(calls[0].name, 'Bash');
assert.deepEqual(calls[0].input, { command: 'git status' });
});
test('parseToolCalls tolerates DSML space-separator typo', () => {
const payload = '<|DSML tool_calls><|DSML invoke name="Read"><|DSML parameter name="file_path"><![CDATA[/tmp/input.txt]]></|DSML parameter></|DSML invoke></|DSML tool_calls>';
const calls = parseToolCalls(payload, ['Read']);
@@ -285,6 +328,39 @@ test('sieve emits tool_calls for DSML space-separator typo', () => {
assert.equal(text.includes('<|DSML invoke'), false);
});
test('sieve emits tool_calls for DSML trailing pipe tag terminator', () => {
const events = runSieve([
'<|DSML|tool_calls| \n',
'<|DSML|invoke name="terminal">\n',
'<|DSML|parameter name="command"><![CDATA[find "/home" -type d]]></|DSML|parameter>\n',
'<|DSML|parameter name="timeout"><![CDATA[10]]></|DSML|parameter>\n',
'</|DSML|invoke>\n',
'</|DSML|tool_calls>',
], ['terminal']);
const finalCalls = events.filter((evt) => evt.type === 'tool_calls').flatMap((evt) => evt.calls || []);
const text = collectText(events);
assert.equal(finalCalls.length, 1);
assert.equal(finalCalls[0].name, 'terminal');
assert.deepEqual(finalCalls[0].input, { command: 'find "/home" -type d', timeout: 10 });
assert.equal(text.toLowerCase().includes('dsml'), false);
});
test('sieve emits tool_calls for extra leading less-than DSML tags without leaking prefix', () => {
const events = runSieve([
'<<|DSML|tool_calls>\n',
'<<|DSML|invoke name="Bash">\n',
'<<|DSML|parameter name="command"><![CDATA[pwd]]></|DSML|parameter>\n',
'</|DSML|invoke>\n',
'</|DSML|tool_calls>',
], ['Bash']);
const finalCalls = events.filter((evt) => evt.type === 'tool_calls').flatMap((evt) => evt.calls || []);
const text = collectText(events);
assert.equal(finalCalls.length, 1);
assert.equal(finalCalls[0].name, 'Bash');
assert.deepEqual(finalCalls[0].input, { command: 'pwd' });
assert.equal(text.includes('<'), false);
});
test('sieve keeps DSML space lookalike tag names as text', () => {
const input = '<|DSML tool_calls_extra><|DSML invoke name="Read"><|DSML parameter name="file_path">/tmp/input.txt</|DSML parameter></|DSML invoke></|DSML tool_calls_extra>';
const events = runSieve([input], ['Read']);