Compare commits

26 Commits
v4.3.0 ... dev

Author SHA1 Message Date
CJACK
03b2acfc9f docs: add DS2API project value note and link it from docs index
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-03 07:33:04 +08:00
CJACK
a299c7d1c4 refactor: remove thinking content from empty output validation logic to enforce stricter completion requirements 2026-05-03 06:59:20 +08:00
CJACK
51d3578465 fix: ensure CDATA parsing correctly tracks line offsets to preserve compact tool call content 2026-05-03 06:49:22 +08:00
CJACK
072ec57acd fix: improve CDATA parsing resilience by ignoring structural markers inside markdown fences within tool calls 2026-05-03 06:40:29 +08:00
CJACK
545ab0802f feat: extend DSML tag prefix to also recognize underscore-connected variants
Support `<dsml_tool_calls>`, `<dsml_invoke>`, `<dsml_parameter>` in
addition to the existing pipe, space, hyphen, and collapsed forms.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-03 05:39:49 +08:00
CJACK
a7522b4188 fix: retry thinking-only empty outputs, centralize reference marker stripping
- ValidateTurn no longer errors on thinking-only responses, deferring to
  ShouldRetryEmptyOutput which now also covers thinking-only outputs.
- Empty output retry uses multi-turn follow-up with a regeneration prompt
  suffix and parent_message_id in the same DeepSeek session.
- Centralize StripReferenceMarkersEnabled into textclean package to
  eliminate duplicated hardcoded booleans across 4 protocol handlers.
- Log a deprecation warning when the legacy "compat" config key is used.
- Document thinking-only retry and reference marker stripping in API.md.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-03 05:02:26 +08:00
CJACK
1286b02247 refactor: remove legacy compatibility configuration and UI components 2026-05-03 04:14:19 +08:00
CJACK
2f7cb473fc feat: support hyphenated DSML tag variants in tool-call parsing
Add compatibility for <dsml-tool-calls>/<dsml-invoke>/<dsml-parameter>
tag forms alongside the canonical pipe-prefixed DSML shell. Hyphenated
forms only activate when a DSML prefix is detected, preventing false
matches on bare XML lookalikes. Go and Node parsers aligned, with tests
covering here-doc CDATA, streaming sieve, and negative lookalike cases.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-03 03:09:10 +08:00
CJACK
ad80a57efa docs: add missing directory entries and package descriptions to architecture docs
Fill gaps identified in architecture audit: add artifacts/ and static/ to
directory tree, and document 7 auxiliary internal/ packages (textclean,
claudeconv, compat, rawsample, devcapture, util, version) in Section 3.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-03 02:51:19 +08:00
CJACK
5f110e6910 refactor: remove legacy history split configuration and integrate current input file handling into the completion runtime pipeline. 2026-05-03 01:50:50 +08:00
CJACK
7c0bc9ec0f feat: implement support for thinking blocks in Gemini API and enable thinking by default for supported models 2026-05-03 01:00:06 +08:00
CJACK
a901250de7 refactor: replace bufio.Scanner with bufio.Reader for SSE stream parsing and track emitted text to prevent redundant output blocks 2026-05-02 23:50:35 +08:00
CJACK
dc5bffdf89 refactor: centralize assistant turn semantics and stream accumulation into new assistantturn and completionruntime packages 2026-05-02 23:28:43 +08:00
CJACK
eccd8c957b fix: prevent continuation replay overlap by trimming redundant text from thinking and response streams 2026-05-02 21:34:36 +08:00
CJACK.
b1d0ee07c0 Merge pull request #406 from wyv202011y/perf/streaming-speed-optimization
perf(streaming): optimize TTFT and reduce buffering latency
2026-05-02 21:19:04 +08:00
CJACK
0156f6b45b Merge origin/dev into PR 406 2026-05-02 21:17:02 +08:00
CJACK
a9f46f5b25 chore: bump version to 4.3.1 2026-05-02 21:04:12 +08:00
CJACK
e7d6807c7c feat: emit empty completion chunk along with keep-alive heartbeat in chat stream 2026-05-02 20:54:10 +08:00
d407ccb773 perf(streaming): optimize TTFT and reduce buffering latency
Core changes:
- stream.go: New accumulation buffer architecture with scanner goroutine
  + select loop, MinChars=16, MaxWait=10ms, first-flush-immediate
- dedupe.go: Add TrimContinuationOverlapFromBuilder to avoid string copies
- claude/stream_runtime_core.go: Integrate toolstream for incremental text
- claude/stream_runtime_finalize.go: toolstream flush support
- stream_emitter.js: Reduce DeltaCoalescer thresholds (160->16 chars, 80->20ms)
- empty_retry: Add thinking-aware empty output detection
- Fix reasoning_content leak and finish_reason=null in edge cases
- Fix tail content truncation when max_tokens exceeded

Tests: sync test expectations with upstream for thinking content
2026-05-02 20:28:30 +08:00
CJACK
c8f7b6b371 refactor streaming accumulation and chat history UI 2026-05-02 20:15:38 +08:00
CJACK.
20d71f528a Merge pull request #404 from NgoQuocViet2001/ai/openai-file-retrieve
feat(openai): retrieve uploaded file metadata
2026-05-02 15:42:40 +08:00
NgoQuocViet2001
36d0239dc6 feat(openai): retrieve uploaded file metadata 2026-05-02 14:33:42 +07:00
CJACK.
e620752e2b Merge pull request #403 from VanceHud/main
Fixed the issue where deploying with Zeabur would fail
2026-05-02 14:10:18 +08:00
VanceHud
44cb27872c Merge branch 'CJackHwang:main' into main 2026-05-02 12:19:09 +08:00
VanceHud
603801c542 Merge branch 'CJackHwang:main' into main 2026-05-01 17:18:00 +08:00
VanceHud
febd3ec83a Document Zeabur manual deployment 2026-05-01 14:29:49 +08:00
115 changed files with 4838 additions and 2315 deletions

@@ -111,6 +111,7 @@ Gemini-compatible clients can also send `x-goog-api-key`, `?key=`, or `?api_key=
| GET | `/v1/responses/{response_id}` | Business | Query stored response (in-memory TTL) |
| POST | `/v1/embeddings` | Business | OpenAI Embeddings API |
| POST | `/v1/files` | Business | OpenAI Files upload (multipart/form-data) |
| GET | `/v1/files/{file_id}` | Business | Retrieve uploaded file status |
| GET | `/anthropic/v1/models` | None | Claude model list |
| POST | `/anthropic/v1/messages` | Business | Claude messages |
| POST | `/anthropic/v1/messages/count_tokens` | Business | Claude token counting |
@@ -167,7 +168,7 @@ Gemini-compatible clients can also send `x-goog-api-key`, `?key=`, or `?api_key=
| PUT | `/admin/chat-history/settings` | Admin | Update conversation history retention limit |
| GET | `/admin/version` | Admin | Check current version and latest Release |
OpenAI `/v1/*` paths are canonical. For clients configured with the bare DS2API service URL, the same OpenAI handlers are also exposed through root shortcuts: `/models`, `/models/{id}`, `/chat/completions`, `/responses`, `/responses/{response_id}`, `/embeddings`, and `/files`.
OpenAI `/v1/*` paths are canonical. For clients configured with the bare DS2API service URL, the same OpenAI handlers are also exposed through root shortcuts: `/models`, `/models/{id}`, `/chat/completions`, `/responses`, `/responses/{response_id}`, `/embeddings`, `/files`, and `/files/{file_id}`.
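The shortcut rule above can be sketched as a small path-mapping helper. This is a minimal illustration, not the project's router: the function name `canonicalOpenAIPath` and the prefix-matching approach are assumptions, and only the shortcut paths listed in the text are taken from the source.

```go
package main

import (
	"fmt"
	"strings"
)

// shortcutPrefixes lists the root shortcuts named in the documentation text.
var shortcutPrefixes = []string{"/models", "/chat/completions", "/responses", "/embeddings", "/files"}

// canonicalOpenAIPath (hypothetical helper) maps a root shortcut such as
// /files/file-123 to its canonical /v1 form. Paths that are already canonical,
// or that are not shortcuts at all, are returned unchanged.
func canonicalOpenAIPath(path string) string {
	if strings.HasPrefix(path, "/v1/") {
		return path
	}
	for _, p := range shortcutPrefixes {
		if path == p || strings.HasPrefix(path, p+"/") {
			return "/v1" + path
		}
	}
	return path
}

func main() {
	for _, p := range []string{"/files/file-123", "/v1/files/file-123", "/healthz"} {
		fmt.Printf("%s -> %s\n", p, canonicalOpenAIPath(p))
	}
}
```

Matching on a whole path segment (`p` or `p + "/"`) keeps lookalike paths such as `/modelsx` from being rewritten.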
---
@@ -440,6 +441,10 @@ Constraints and behavior:
- Total request size limit is **100 MiB** (over-limit returns `413`).
- Success returns an OpenAI `file` object (`id/object/bytes/filename/purpose/status`, etc.) and includes `account_id` for source-account tracing.
### `GET /v1/files/{file_id}`
Business auth required. Retrieves the current DeepSeek upload status for a file and returns an OpenAI `file` object. Returns `404` when no matching file is found.
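A client-side sketch of the retrieve call might look like the following. It runs against a local stub server so it works offline; the `Authorization: Bearer` header, the `FileObject` field subset, the sample key, and the `processed` status value are assumptions for illustration, not confirmed details of DS2API.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
	"net/url"
)

// FileObject holds a subset of the OpenAI file fields named in the text.
type FileObject struct {
	ID       string `json:"id"`
	Object   string `json:"object"`
	Filename string `json:"filename"`
	Status   string `json:"status"`
}

// ErrFileNotFound is returned when the server answers 404.
var ErrFileNotFound = errors.New("file not found")

// retrieveFile calls GET {baseURL}/v1/files/{fileID}.
// Sending the key as a Bearer token is an assumption.
func retrieveFile(c *http.Client, baseURL, apiKey, fileID string) (*FileObject, error) {
	req, err := http.NewRequest(http.MethodGet, baseURL+"/v1/files/"+url.PathEscape(fileID), nil)
	if err != nil {
		return nil, err
	}
	req.Header.Set("Authorization", "Bearer "+apiKey)
	resp, err := c.Do(req)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	switch resp.StatusCode {
	case http.StatusOK:
		var f FileObject
		if err := json.NewDecoder(resp.Body).Decode(&f); err != nil {
			return nil, err
		}
		return &f, nil
	case http.StatusNotFound:
		return nil, ErrFileNotFound
	default:
		return nil, fmt.Errorf("unexpected status %d", resp.StatusCode)
	}
}

// newStubServer is a local stand-in for DS2API that knows one file.
func newStubServer() *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.URL.Path != "/v1/files/file-123" {
			http.NotFound(w, r)
			return
		}
		w.Header().Set("Content-Type", "application/json")
		fmt.Fprint(w, `{"id":"file-123","object":"file","filename":"notes.txt","status":"processed"}`)
	}))
}

func main() {
	srv := newStubServer()
	defer srv.Close()
	f, err := retrieveFile(srv.Client(), srv.URL, "sk-example", "file-123")
	fmt.Println(f, err)
	_, err = retrieveFile(srv.Client(), srv.URL, "sk-example", "missing")
	fmt.Println(errors.Is(err, ErrFileNotFound))
}
```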
---
## Claude-Compatible API
@@ -550,7 +555,7 @@ data: {"type":"message_stop"}
**Notes**:
- Models whose names contain `opus` / `reasoner` / `slow` stream `thinking_delta`
- Models that support thinking emit `thinking` blocks / `thinking_delta` by default; explicit thinking disablement or `-nothinking` models suppress them
- `signature_delta` is not emitted (DeepSeek does not provide verifiable thinking signatures)
- In `tools` mode, the stream avoids leaking raw tool JSON and does not force `input_json_delta`
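The thinking rule in the notes above reduces to a small decision function. This is a sketch under stated assumptions: `thinkingEnabled` and its `supportsThinking` flag are hypothetical, the `-nothinking` marker is treated as a model-name suffix, and the model names in `main` are only examples.

```go
package main

import (
	"fmt"
	"strings"
)

// thinkingEnabled (hypothetical helper) reports whether thinking blocks should
// be emitted. supportsThinking stands in for whatever capability lookup the
// real service performs.
func thinkingEnabled(model string, supportsThinking, explicitlyDisabled bool) bool {
	// A -nothinking model always suppresses thinking output.
	if strings.HasSuffix(model, "-nothinking") {
		return false
	}
	// An explicit disable in the request also suppresses it.
	if explicitlyDisabled {
		return false
	}
	// Otherwise thinking is on by default for models that support it.
	return supportsThinking
}

func main() {
	fmt.Println(thinkingEnabled("deepseek-v4-pro", true, false))
	fmt.Println(thinkingEnabled("deepseek-v4-pro", true, true))
	fmt.Println(thinkingEnabled("deepseek-v4-pro-nothinking", true, false))
}
```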
@@ -596,6 +601,7 @@ Request body accepts Gemini-style `contents` / `tools`. Model names can use alia
Response uses Gemini-compatible fields, including:
- `candidates[].content.parts[].text`
- `candidates[].content.parts[].thought=true` for thinking output
- `candidates[].content.parts[].functionCall` (when tool call is produced)
- `usageMetadata` (`promptTokenCount` / `candidatesTokenCount` / `totalTokenCount`)
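To make the `thought=true` marker concrete, here is a minimal sketch of how a client could model the listed fields and separate thinking from visible text. The Go type names and the `buildResponse` / `visibleText` helpers are illustrative assumptions; only the JSON field names come from the text above.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// Part mirrors candidates[].content.parts[]; thought=true marks thinking output.
type Part struct {
	Text    string `json:"text,omitempty"`
	Thought bool   `json:"thought,omitempty"`
}

type Content struct {
	Parts []Part `json:"parts"`
}

type Candidate struct {
	Content Content `json:"content"`
}

type Response struct {
	Candidates []Candidate `json:"candidates"`
}

// buildResponse (hypothetical) places thinking before the visible answer.
func buildResponse(thinking, answer string) Response {
	var parts []Part
	if thinking != "" {
		parts = append(parts, Part{Text: thinking, Thought: true})
	}
	if answer != "" {
		parts = append(parts, Part{Text: answer})
	}
	return Response{Candidates: []Candidate{{Content: Content{Parts: parts}}}}
}

// visibleText concatenates only the parts that are not marked as thoughts.
func visibleText(r Response) string {
	var b strings.Builder
	for _, c := range r.Candidates {
		for _, p := range c.Content.Parts {
			if !p.Thought {
				b.WriteString(p.Text)
			}
		}
	}
	return b.String()
}

func main() {
	r := buildResponse("compare both options first", "Option B is cheaper.")
	out, _ := json.MarshalIndent(r, "", "  ")
	fmt.Println(string(out))
	fmt.Println(visibleText(r))
}
```

Clients that ignore the `thought` flag would show the reasoning text as part of the answer, so filtering on it is the safer default.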
@@ -604,6 +610,7 @@ Response uses Gemini-compatible fields, including:
Returns SSE (`text/event-stream`), each chunk as `data: <json>`:
- regular text: incremental text chunks
- thinking: incremental chunks with `parts[].thought=true`
- `tools` mode: buffered and emitted as `functionCall` at finalize phase
- final chunk: includes `finishReason: "STOP"` and `usageMetadata`
- Token counting prefers pass-through from upstream DeepSeek SSE (`accumulated_token_usage` / `token_usage`), and only falls back to local estimation when upstream usage is absent
@@ -726,7 +733,6 @@ Reads runtime settings and status, including:
- `success`
- `admin` (`has_password_hash`, `jwt_expire_hours`, `jwt_valid_after_unix`, `default_password_warning`)
- `runtime` (`account_max_inflight`, `account_max_queue`, `global_max_inflight`, `token_refresh_interval_hours`)
- `compat` (`wide_input_strict_output`, `strip_reference_markers`)
- `responses` / `embeddings`
- `auto_delete` (`mode`: `none` / `single` / `all`; legacy `sessions=true` is still treated as `all`)
- `current_input_file` (`enabled` defaults to `true`, plus `min_chars`)
@@ -740,13 +746,11 @@ Hot-updates runtime settings. Supported fields:
- `admin.jwt_expire_hours`
- `runtime.account_max_inflight` / `runtime.account_max_queue` / `runtime.global_max_inflight` / `runtime.token_refresh_interval_hours`
- `compat.wide_input_strict_output` / `compat.strip_reference_markers`
- `responses.store_ttl_seconds`
- `embeddings.provider`
- `auto_delete.mode`
- `current_input_file.enabled` / `current_input_file.min_chars`
- `model_aliases`
- `history_split` is retained only for legacy config compatibility and no longer affects requests
- `toolcall` policy is fixed and is no longer writable through settings
### `POST /admin/settings/password`
@@ -770,9 +774,9 @@ Imports full config with:
The request can send config directly, or wrapped as `{"config": {...}, "mode":"merge"}`.
Query params `?mode=merge` / `?mode=replace` are also supported.
`replace` mode replaces the full config shape while preserving Vercel sync metadata. `merge` mode merges `keys`, `api_keys`, `accounts`, and `model_aliases`, and overwrites non-empty fields under `admin`, `runtime`, `responses`, and `embeddings`. Manage `compat`, `auto_delete`, and `current_input_file` via `/admin/settings` or the config file; `history_split` remains only for legacy compatibility; legacy `toolcall` fields are ignored.
`replace` mode replaces the full config shape while preserving Vercel sync metadata. `merge` mode merges `keys`, `api_keys`, `accounts`, and `model_aliases`, and overwrites non-empty fields under `admin`, `runtime`, `responses`, and `embeddings`. Manage `auto_delete` and `current_input_file` via `/admin/settings` or the config file; legacy `compat` and `toolcall` fields are ignored.
> Note: `merge` mode does not update `compat`, `auto_delete`, or `current_input_file`.
> Note: `merge` mode does not update `auto_delete` or `current_input_file`.
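The `merge` rules above can be illustrated with a reduced config shape. This is a sketch, not the project's implementation: the `Config` and `Runtime` structs cover only a few fields, and treating "merge" as a de-duplicated union for lists and a key-wise overlay for aliases is an assumption.

```go
package main

import "fmt"

// Runtime is a reduced stand-in for the runtime settings block.
type Runtime struct {
	AccountMaxInflight int
	GlobalMaxInflight  int
}

// Config is a reduced, hypothetical config shape for illustration.
type Config struct {
	Keys           []string
	ModelAliases   map[string]string
	Runtime        Runtime
	AutoDeleteMode string
}

// mergeConfig sketches merge mode: lists and alias maps are merged, non-zero
// runtime fields overwrite, and auto_delete is left untouched.
func mergeConfig(cur, in Config) Config {
	out := cur
	out.Keys = nil
	seen := map[string]bool{}
	for _, list := range [][]string{cur.Keys, in.Keys} {
		for _, k := range list {
			if !seen[k] {
				seen[k] = true
				out.Keys = append(out.Keys, k)
			}
		}
	}
	out.ModelAliases = map[string]string{}
	for k, v := range cur.ModelAliases {
		out.ModelAliases[k] = v
	}
	for k, v := range in.ModelAliases {
		out.ModelAliases[k] = v
	}
	if in.Runtime.AccountMaxInflight != 0 {
		out.Runtime.AccountMaxInflight = in.Runtime.AccountMaxInflight
	}
	if in.Runtime.GlobalMaxInflight != 0 {
		out.Runtime.GlobalMaxInflight = in.Runtime.GlobalMaxInflight
	}
	// AutoDeleteMode keeps cur's value: merge mode does not update it.
	return out
}

func main() {
	cur := Config{
		Keys:           []string{"k1"},
		ModelAliases:   map[string]string{"o3": "deepseek-v4-pro"},
		Runtime:        Runtime{AccountMaxInflight: 2, GlobalMaxInflight: 8},
		AutoDeleteMode: "single",
	}
	in := Config{
		Keys:           []string{"k1", "k2"},
		Runtime:        Runtime{AccountMaxInflight: 4},
		AutoDeleteMode: "all",
	}
	fmt.Printf("%+v\n", mergeConfig(cur, in))
}
```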
### `GET /admin/config/export`

20
API.md
@@ -41,6 +41,8 @@
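- The adapter layer's responsibilities converge on **request normalization → DeepSeek call → protocol-shape rendering**, reducing the forks in earlier versions where the same capability was implemented in several places.
- Tool-calling parsing stays consistent between the Go and Node runtimes. Models are advised to emit the DSML shell `<|DSML|tool_calls>`, `<|DSML|invoke name="...">`, `<|DSML|parameter name="...">`; the compatibility layer also accepts the DSML wrapper aliases `<dsml|tool_calls>`, `<|tool_calls>`, and `<tool_calls>`, common forms with a missing DSML separator (such as `<|DSML tool_calls>`), the common typo where `DSML` is fused to the tool tag name (`<DSMLtool_calls>`), and the legacy canonical XML `<tool_calls>`, `<invoke name="...">`, `<parameter name="...">`. The implementation uses a narrow fault-tolerant structural scan: only a `tool_calls` wrapper, or a repairable missing opening wrapper, enters the tool path, and a bare `<invoke>` does not count as supported syntax; streaming continues to apply anti-leak sieving. If a parameter body is itself a valid JSON literal (such as `123`, `true`, `null`, an array, or an object), it is emitted as a structured value instead of always being treated as a string; if a CDATA section is occasionally left unclosed, a narrow repair runs during the final parse / flush recovery stage to preserve the fully wrapped outer tool call where possible.
- `Admin API` separates configuration from runtime policy: `/admin/config*` manages static configuration, and `/admin/settings*` manages runtime behavior.
- When the upstream returns a thinking-only response (the model produced a reasoning chain but no visible text), non-streaming completions retry automatically once: a multi-turn follow-up appends the prompt suffix `"Previous reply had no visible output. Please regenerate the visible final answer or tool call now."` and sets `parent_message_id` so the model regenerates its output within the same DeepSeek session; at most 1 retry is made.
- Reference-marker stripping (strip reference markers) is currently an always-on runtime behavior that applies uniformly across all protocol adapters.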
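The tag-prefix tolerance described in the tool-calling item above can be sketched as a small normalizer for opening tags. This is an illustration only, not the project's parser: `canonicalToolTag` is a hypothetical name, it ignores closing tags and the wrapper-structure rules, and the hyphen and underscore variants come from the commit messages for this comparison rather than from this paragraph.

```go
package main

import (
	"fmt"
	"strings"
)

// canonicalToolTag (hypothetical helper) maps an opening tag such as
// <|DSML|tool_calls> or <dsml-tool-calls> to its canonical name
// (tool_calls, invoke, or parameter). It reports false for anything else.
func canonicalToolTag(raw string) (string, bool) {
	s := strings.ToLower(strings.TrimSpace(raw))
	if !strings.HasPrefix(s, "<") {
		return "", false
	}
	s = strings.TrimPrefix(s[1:], "|")
	hasDSML := strings.HasPrefix(s, "dsml")
	if hasDSML {
		s = s[len("dsml"):]
		// Accept one separator after the prefix: pipe, space, hyphen,
		// underscore, or none at all (the collapsed form).
		if s != "" && strings.ContainsRune("| -_", rune(s[0])) {
			s = s[1:]
		}
	}
	// Keep only the tag name, dropping attributes and the closing bracket.
	if i := strings.IndexAny(s, " \t\r\n>"); i >= 0 {
		s = s[:i]
	}
	// Hyphenated names are only normalized when a DSML prefix was seen,
	// so bare XML lookalikes such as <tool-calls> do not match.
	if hasDSML {
		s = strings.ReplaceAll(s, "-", "_")
	}
	switch s {
	case "tool_calls", "invoke", "parameter":
		return s, true
	}
	return "", false
}

func main() {
	for _, raw := range []string{
		"<|DSML|tool_calls>", "<dsml|tool_calls>", "<|DSML tool_calls>",
		"<DSMLtool_calls>", "<dsml-tool-calls>", "<tool-calls>", "<div>",
	} {
		name, ok := canonicalToolTag(raw)
		fmt.Printf("%-22s -> %q %v\n", raw, name, ok)
	}
}
```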
---
@@ -111,6 +113,7 @@ Gemini-compatible clients can also use `x-goog-api-key`, `?key=`, or `?api_key=`
| GET | `/v1/responses/{response_id}` | Business | Query a generated response (in-memory TTL) |
| POST | `/v1/embeddings` | Business | OpenAI Embeddings API |
| POST | `/v1/files` | Business | OpenAI Files upload (multipart/form-data) |
| GET | `/v1/files/{file_id}` | Business | Query uploaded file status |
| GET | `/anthropic/v1/models` | None | Claude model list |
| POST | `/anthropic/v1/messages` | Business | Claude messages API |
| POST | `/anthropic/v1/messages/count_tokens` | Business | Claude token counting |
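@@ -167,7 +170,7 @@ Gemini-compatible clients can also use `x-goog-api-key`, `?key=`, or `?api_key=`
| PUT | `/admin/chat-history/settings` | Admin | Update the conversation history retention count |
| GET | `/admin/version` | Admin | Query the current version and latest Release |
OpenAI `/v1/*` remains the canonical path. For clients configured with only the DS2API root URL, the same OpenAI handlers are also exposed through root-path shortcut routes: `/models`, `/models/{id}`, `/chat/completions`, `/responses`, `/responses/{response_id}`, `/embeddings`, and `/files`.
OpenAI `/v1/*` remains the canonical path. For clients configured with only the DS2API root URL, the same OpenAI handlers are also exposed through root-path shortcut routes: `/models`, `/models/{id}`, `/chat/completions`, `/responses`, `/responses/{response_id}`, `/embeddings`, `/files`, and `/files/{file_id}`.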
---
@@ -443,6 +446,10 @@ data: [DONE]
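- The total request body size limit is **100 MiB** (over-limit requests return `413`).
- Success returns an OpenAI `file` object (fields such as `id/object/bytes/filename/purpose/status`) plus `account_id` to help locate the source account.
### `GET /v1/files/{file_id}`
Business auth required. Queries the current status of a DeepSeek uploaded file and returns an OpenAI `file` object; returns `404` when no matching file is found.
---
## Claude-Compatible API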
@@ -556,7 +563,7 @@ data: {"type":"message_stop"}
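**Notes**:
- Default models emit thinking / reasoning deltas according to each surface's existing rules
- Models that support thinking by default emit `thinking` blocks / `thinking_delta`; nothing is emitted when the request explicitly disables thinking or uses a `-nothinking` model
- Models with the `-nothinking` suffix force thinking off; even if the request explicitly passes `thinking` / `reasoning` / `reasoning_effort`, no `thinking_delta` is emitted
- `signature_delta` is not emitted (upstream DeepSeek does not provide verifiable signatures)
- In `tools` scenarios, avoiding leaks of raw tool JSON takes priority, and `input_json_delta` is not forced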
@@ -603,6 +610,7 @@ data: {"type":"message_stop"}
The response uses a Gemini-compatible structure; core fields include:
- `candidates[].content.parts[].text`
- `candidates[].content.parts[].thought=true` (thinking output)
- `candidates[].content.parts[].functionCall` (when a tool call is produced)
- `usageMetadata` (`promptTokenCount` / `candidatesTokenCount` / `totalTokenCount`)
@@ -611,6 +619,7 @@ data: {"type":"message_stop"}
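Returns SSE (`text/event-stream`); each chunk is one `data: <json>` line:
- Regular text: incremental text chunks are returned continuously
- thinking: incremental chunks with `parts[].thought=true` are returned continuously
- `tools` scenarios: output is buffered and emitted as a `functionCall` structure at the end
- Final chunk: includes `finishReason: "STOP"` and `usageMetadata`
- Token counting prefers pass-through from the upstream DeepSeek SSE (`accumulated_token_usage` / `token_usage`) and falls back to local estimation only when upstream usage is missing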
@@ -733,7 +742,6 @@ data: {"type":"message_stop"}
- `success`
- `admin` (`has_password_hash`, `jwt_expire_hours`, `jwt_valid_after_unix`, `default_password_warning`)
- `runtime` (`account_max_inflight`, `account_max_queue`, `global_max_inflight`, `token_refresh_interval_hours`)
- `compat` (`wide_input_strict_output`, `strip_reference_markers`)
- `responses` / `embeddings`
- `auto_delete` (`mode`: `none` / `single` / `all`; legacy `sessions=true` is still treated as `all`)
- `current_input_file` (`enabled` returns `true` by default, plus `min_chars`)
@@ -747,13 +755,11 @@ data: {"type":"message_stop"}
- `admin.jwt_expire_hours`
- `runtime.account_max_inflight` / `runtime.account_max_queue` / `runtime.global_max_inflight` / `runtime.token_refresh_interval_hours`
- `compat.wide_input_strict_output` / `compat.strip_reference_markers`
- `responses.store_ttl_seconds`
- `embeddings.provider`
- `auto_delete.mode`
- `current_input_file.enabled` / `current_input_file.min_chars`
- `model_aliases`
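- `history_split` is retained only as a legacy config compatibility field and no longer affects request handling
- The `toolcall` policy is fixed and is no longer a writable field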
### `POST /admin/settings/password`
@@ -777,9 +783,9 @@ data: {"type":"message_stop"}
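The request can pass the config object directly, or use the wrapped form `{"config": {...}, "mode":"merge"}`.
Passing `?mode=merge` / `?mode=replace` as a query parameter is also supported.
`replace` mode replaces the full config structure (preserving Vercel sync metadata); `merge` mode merges `keys`, `api_keys`, `accounts`, and `model_aliases`, and overwrites non-empty fields under `admin`, `runtime`, `responses`, and `embeddings`. `compat`, `auto_delete`, and `current_input_file` are best managed through `/admin/settings` or the config file; `history_split` is retained only as a legacy config compatibility field; `toolcall`-related fields are ignored.
`replace` mode replaces the full config structure (preserving Vercel sync metadata); `merge` mode merges `keys`, `api_keys`, `accounts`, and `model_aliases`, and overwrites non-empty fields under `admin`, `runtime`, `responses`, and `embeddings`. `auto_delete` and `current_input_file` are best managed through `/admin/settings` or the config file; `compat`- and `toolcall`-related fields are ignored.
> Note: `merge` mode does not update `compat`, `auto_delete`, or `current_input_file`.
> Note: `merge` mode does not update `auto_delete` or `current_input_file`.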
### `GET /admin/config/export`

@@ -76,13 +76,14 @@ flowchart LR
subgraph Runtime["Runtime core capabilities"]
Compat["PromptCompat\n(API -> web plain-text context)"]
Chat["Chat / Responses Runtime\n(unified tool calling and stream semantics)"]
Completion["Completion Runtime\n(Session / PoW / Completion)"]
Turn["AssistantTurn\n(output semantic normalization)"]
Auth["Auth Resolver\n(API key / bearer / x-goog-api-key)"]
Pool["Account Pool + Queue\n(concurrency slots + wait queue)"]
DSClient["DeepSeek Client\n(Session / Auth / Completion / Files)"]
Pow["PoW implementation\n(pure Go)"]
Tool["Tool Sieve\n(Go/Node semantic alignment)"]
History["History Split\n(long history as files)"]
History["Current Input File\n(DS2API_HISTORY.txt)"]
end
end
@@ -94,18 +95,19 @@ flowchart LR
OA --> Compat
CA & GA --> Compat
Compat --> Chat
Compat -.long history.-> History
Vercel -.Go prepare.-> Chat
Compat --> Completion
Completion -.full context.-> History
Completion --> Turn
Vercel -.Go prepare.-> Completion
Vercel -.Node SSE.-> Tool
Chat --> Auth
Chat -.account rotation.-> Pool
Chat -.tool-call parsing.-> Tool
Chat -.PoW computation.-> Pow
Completion --> Auth
Completion -.account rotation.-> Pool
Completion -.tool-call parsing.-> Tool
Completion -.PoW computation.-> Pow
Auth --> DSClient
DSClient --> Upstream
Upstream --> DSClient
Chat --> Client
Turn --> Client
Vercel --> Client
```
@@ -119,7 +121,7 @@ flowchart LR
| Capability | Details |
| --- | --- |
| OpenAI compatible | `GET /v1/models`, `GET /v1/models/{id}`, `POST /v1/chat/completions`, `POST /v1/responses`, `GET /v1/responses/{response_id}`, `POST /v1/embeddings`, `POST /v1/files` |
| OpenAI compatible | `GET /v1/models`, `GET /v1/models/{id}`, `POST /v1/chat/completions`, `POST /v1/responses`, `GET /v1/responses/{response_id}`, `POST /v1/embeddings`, `POST /v1/files`, `GET /v1/files/{file_id}` |
| Claude compatible | `GET /anthropic/v1/models`, `POST /anthropic/v1/messages`, `POST /anthropic/v1/messages/count_tokens` (plus shortcut paths `/v1/messages`, `/messages`) |
| Gemini compatible | `POST /v1beta/models/{model}:generateContent`, `POST /v1beta/models/{model}:streamGenerateContent` (plus `/v1/models/{model}:*` paths) |
| Unified CORS compatibility | `/v1/*`, `/anthropic/*`, `/v1beta/models/*`, and `/admin/*` all go through one CORS policy; on Vercel, the Node Runtime for `/v1/chat/completions` follows the same allow rules, minimizing preflight request-header restrictions for third-party clients |
@@ -131,7 +133,7 @@ flowchart LR
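| WebUI admin console | Single-page app at `/admin` (bilingual Chinese/English, dark mode, supports viewing server-side conversation history) |
| Ops probes | `GET /healthz` (liveness), `GET /readyz` (readiness) |
OpenAI `/v1/*` remains the recommended canonical path; root-path shortcut routes such as `/models`, `/chat/completions`, `/responses`, `/embeddings`, and `/files` are also supported for third-party clients configured with only the DS2API root URL.
OpenAI `/v1/*` remains the recommended canonical path; root-path shortcut routes such as `/models`, `/chat/completions`, `/responses`, `/embeddings`, `/files`, and `/files/{file_id}` are also supported for third-party clients configured with only the DS2API root URL.
## Platform Compatibility Matrix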
@@ -257,6 +259,10 @@ docker-compose logs -f
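2. After deployment, open `/admin` and log in with the `DS2API_ADMIN_KEY` from the Zeabur environment variables / template instructions.
3. Import / edit the config in the admin console (it is written and persisted to `/data/config.json`).
On a first start with an empty Zeabur volume, `/data/config.json` may be absent; DS2API starts with an empty file-mode config and creates the file on the first save in the admin console.
For manual deployment without the template, choose a GitHub repository service in Zeabur, keep Root Directory as `/`, and build with the repo-root `Dockerfile`; add a persistent volume at `/data`, set `PORT=5001`, `DS2API_ADMIN_KEY=your-strong-secret`, and `DS2API_CONFIG_PATH=/data/config.json`, then expose HTTP port `5001`. For more complete steps, see [docs/DEPLOY.md](docs/DEPLOY.md#不使用模板手动部署).
Note: when Zeabur builds directly from the in-repo `Dockerfile`, there is no need to pass `BUILD_VERSION`; the image prefers that build arg and automatically falls back to the repo-root `VERSION` file when it is not provided.
### Option 3: Vercel deployment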
@@ -317,8 +323,7 @@ go run ./cmd/ds2api
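- `model_aliases`: a model alias map shared by OpenAI / Claude / Gemini.
- `runtime`: account concurrency, queueing, and token refresh policy; hot-updatable through Admin Settings.
- `auto_delete.mode`: remote session cleanup policy after a request ends, supporting `none` / `single` / `all`.
- `history_split`: legacy turn-splitting field, deprecated and ignored, kept only for compatibility with old configs.
- `current_input_file`: the only active standalone split policy; enabled by default with a threshold of `0`; when triggered, the full context is merged and uploaded as a `DS2API_HISTORY.txt` context file.
- `current_input_file`: the globally effective context split/upload policy; enabled by default with a threshold of `0`; when triggered, the full context is merged and uploaded as a `DS2API_HISTORY.txt` context file.
- If `current_input_file` is turned off, requests pass through directly without uploading a split context file.
- `thinking_injection`: enabled by default; appends a thinking-enhancement prompt to the end of the latest user message to improve thinking stability for high-intensity reasoning and before tool calls; the built-in default prompt is used when `prompt` is left empty.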

@@ -73,13 +73,14 @@ flowchart LR
subgraph Runtime["Runtime + Core Capabilities"]
Compat["PromptCompat\n(API -> web-chat plain text context)"]
Chat["Chat / Responses Runtime\n(unified tools + stream semantics)"]
Completion["Completion Runtime\n(session / PoW / completion)"]
Turn["AssistantTurn\n(output semantic normalization)"]
Auth["Auth Resolver\n(API key / bearer / x-goog-api-key)"]
Pool["Account Pool + Queue\n(in-flight slots + wait queue)"]
DSClient["DeepSeek Client\n(session / auth / completion / files)"]
Pow["PoW Solver\n(Pure Go)"]
Tool["Tool Sieve\n(Go/Node semantic parity)"]
History["History Split\n(long history as files)"]
History["Current Input File\n(DS2API_HISTORY.txt)"]
end
end
@@ -91,18 +92,19 @@ flowchart LR
OA --> Compat
CA & GA --> Compat
Compat --> Chat
Compat -.long history.-> History
Vercel -.Go prepare.-> Chat
Compat --> Completion
Completion -.full context.-> History
Completion --> Turn
Vercel -.Go prepare.-> Completion
Vercel -.Node SSE.-> Tool
Chat --> Auth
Chat -.account rotation.-> Pool
Chat -.tool-call parsing.-> Tool
Chat -.PoW solving.-> Pow
Completion --> Auth
Completion -.account rotation.-> Pool
Completion -.tool-call parsing.-> Tool
Completion -.PoW solving.-> Pow
Auth --> DSClient
DSClient --> Upstream
Upstream --> DSClient
Chat --> Client
Turn --> Client
Vercel --> Client
```
@@ -116,7 +118,7 @@ For the full module-by-module architecture and directory responsibilities, see [
| Capability | Details |
| --- | --- |
| OpenAI compatible | `GET /v1/models`, `GET /v1/models/{id}`, `POST /v1/chat/completions`, `POST /v1/responses`, `GET /v1/responses/{response_id}`, `POST /v1/embeddings`, `POST /v1/files` |
| OpenAI compatible | `GET /v1/models`, `GET /v1/models/{id}`, `POST /v1/chat/completions`, `POST /v1/responses`, `GET /v1/responses/{response_id}`, `POST /v1/embeddings`, `POST /v1/files`, `GET /v1/files/{file_id}` |
| Claude compatible | `GET /anthropic/v1/models`, `POST /anthropic/v1/messages`, `POST /anthropic/v1/messages/count_tokens` (plus shortcut paths `/v1/messages`, `/messages`) |
| Gemini compatible | `POST /v1beta/models/{model}:generateContent`, `POST /v1beta/models/{model}:streamGenerateContent` (plus `/v1/models/{model}:*` paths) |
| Unified CORS compatibility | `/v1/*`, `/anthropic/*`, `/v1beta/models/*`, and `/admin/*` share one CORS policy; on Vercel, the Node Runtime for `/v1/chat/completions` mirrors the same relaxed preflight behavior for third-party clients |
@@ -128,7 +130,7 @@ For the full module-by-module architecture and directory responsibilities, see [
| WebUI Admin Panel | SPA at `/admin` (bilingual Chinese/English, dark mode, with server-side conversation history) |
| Health Probes | `GET /healthz` (liveness), `GET /readyz` (readiness) |
OpenAI `/v1/*` routes remain canonical, and DS2API also accepts root shortcuts such as `/models`, `/chat/completions`, `/responses`, `/embeddings`, and `/files` for clients configured with the bare service URL.
OpenAI `/v1/*` routes remain canonical, and DS2API also accepts root shortcuts such as `/models`, `/chat/completions`, `/responses`, `/embeddings`, `/files`, and `/files/{file_id}` for clients configured with the bare service URL.
## Platform Compatibility Matrix
@@ -245,6 +247,10 @@ Rebuild after updates: `docker-compose up -d --build`
2. After deployment, open `/admin` and log in with the `DS2API_ADMIN_KEY` shown in the Zeabur env/template instructions.
3. Import / edit config in Admin UI (it will be written and persisted to `/data/config.json`).
Fresh Zeabur volumes can start without `/data/config.json`; DS2API will boot with an empty file-backed config and create the file on the first Admin UI save.
For manual deployment without the template, create a Zeabur GitHub service, keep Root Directory as `/`, build with the repo-root `Dockerfile`, mount a persistent volume at `/data`, set `PORT=5001`, `DS2API_ADMIN_KEY=your-strong-secret`, and `DS2API_CONFIG_PATH=/data/config.json`, then expose HTTP port `5001`. See [docs/DEPLOY.en.md](docs/DEPLOY.en.md#manual-deployment-without-the-template) for the full guide.
Note: when Zeabur builds directly from the repo `Dockerfile`, you do not need to pass `BUILD_VERSION`. The image prefers that build arg when provided, and automatically falls back to the repo-root `VERSION` file when it is absent.
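The fallback order in the note above amounts to a two-step lookup. The sketch below assumes a hypothetical `resolveVersion` helper that receives the build argument and the contents of the `VERSION` file as strings; how the real image wires these together is not shown in the source.

```go
package main

import (
	"fmt"
	"strings"
)

// resolveVersion (hypothetical helper) prefers the BUILD_VERSION build arg and
// falls back to the contents of the repo-root VERSION file when it is empty.
func resolveVersion(buildArg, versionFile string) string {
	if v := strings.TrimSpace(buildArg); v != "" {
		return v
	}
	return strings.TrimSpace(versionFile)
}

func main() {
	// No build arg: the VERSION file content is used.
	fmt.Println(resolveVersion("", "4.4.0\n"))
	// Build arg provided: it takes precedence over the file.
	fmt.Println(resolveVersion("4.4.0-custom", "4.4.0\n"))
}
```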
### Option 3: Vercel
@@ -305,8 +311,7 @@ Common fields:
- `model_aliases`: one shared alias map for OpenAI / Claude / Gemini model names.
- `runtime`: account concurrency, queueing, and token refresh behavior, hot-reloadable via Admin Settings.
- `auto_delete.mode`: remote session cleanup after each request, supporting `none` / `single` / `all`.
- `history_split`: legacy multi-turn history split field, now ignored and kept only for backward-compatible config loading.
- `current_input_file`: the only active split mode; it is enabled by default and uploads the full context as a `DS2API_HISTORY.txt` context file once the character threshold is reached.
- `current_input_file`: the global context split/upload mode; it is enabled by default and uploads the full context as a `DS2API_HISTORY.txt` context file once the character threshold is reached.
- If you turn off `current_input_file`, requests pass through directly without uploading any split context file.
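The `current_input_file` behavior described above can be summarized as a threshold check. This is a sketch under assumptions: the struct and function names are hypothetical, the default threshold of `0` is taken from the Chinese README in this comparison, and counting the threshold in Unicode characters rather than bytes is a guess.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// historyFileName is the context file name given in the text.
const historyFileName = "DS2API_HISTORY.txt"

// CurrentInputFile is a hypothetical mirror of the config block.
type CurrentInputFile struct {
	Enabled  bool
	MinChars int
}

// defaultCurrentInputFile reflects the documented defaults: enabled, threshold 0.
func defaultCurrentInputFile() CurrentInputFile {
	return CurrentInputFile{Enabled: true, MinChars: 0}
}

// shouldUploadContext reports whether the full context would be uploaded as a
// context file instead of being passed through directly.
func shouldUploadContext(cfg CurrentInputFile, fullContext string) bool {
	if !cfg.Enabled {
		return false
	}
	return utf8.RuneCountInString(fullContext) >= cfg.MinChars
}

func main() {
	ctx := "user: hello\nassistant: hi"
	configs := []CurrentInputFile{
		defaultCurrentInputFile(),
		{Enabled: true, MinChars: 1000},
		{Enabled: false},
	}
	for _, cfg := range configs {
		if shouldUploadContext(cfg, ctx) {
			fmt.Printf("%+v: upload full context as %s\n", cfg, historyFileName)
		} else {
			fmt.Printf("%+v: pass request through\n", cfg)
		}
	}
}
```

With the default threshold of `0`, every request meets the condition, which matches the description of the feature as globally active unless it is turned off.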
For the full environment variable list, see [docs/DEPLOY.en.md](docs/DEPLOY.en.md). For auth behavior, see [API.en.md](API.en.md#authentication).

@@ -1 +1 @@
4.3.0
4.4.0

@@ -43,10 +43,6 @@
"gpt-5.3-codex": "deepseek-v4-pro",
"o3": "deepseek-v4-pro"
},
"compat": {
"wide_input_strict_output": true,
"strip_reference_markers": true
},
"responses": {
"store_ttl_seconds": 900
},

@@ -15,6 +15,7 @@ ds2api/
│ └── workflows/ # GitHub Actions workflows
├── api/ # Serverless entrypoints (Vercel Go/Node)
├── app/ # Application-level handler assembly
├── artifacts/ # Debug artifacts (raw-stream-sim, stream-debug, etc.)
├── cmd/ # Executable entrypoints
│ ├── ds2api/ # Main service bootstrap
│ └── ds2api-tests/ # E2E testsuite CLI bootstrap
@@ -25,6 +26,8 @@ ds2api/
│ ├── chathistory/ # Server-side conversation history storage/query
│ ├── claudeconv/ # Claude message conversion helpers
│ ├── compat/ # Compatibility and regression helpers
│ ├── assistantturn/ # Upstream output to canonical assistant turn / stream event semantics
│ ├── completionruntime/ # Shared Go DeepSeek completion startup, non-stream collection, and retry
│ ├── config/ # Config loading/validation/hot reload
│ ├── deepseek/ # DeepSeek upstream client/protocol/transport
│ │ ├── client/ # Login/session/completion/upload/delete calls
@@ -38,13 +41,14 @@ ds2api/
│ │ ├── admin/ # Admin API root assembly and resource packages
│ │ ├── claude/ # Claude HTTP protocol adapter
│ │ ├── gemini/ # Gemini HTTP protocol adapter
│ │ └── openai/ # OpenAI HTTP surface
│ │ ├── chat/ # Chat Completions execution entrypoint
│ │ ├── responses/ # Responses API and response store
│ │ ├── files/ # Files API and inline-file preprocessing
│ │ ├── embeddings/ # Embeddings API
│ │ ├── history/ # OpenAI context file handling
│ │ └── shared/ # OpenAI HTTP errors/models/tool formatting
│ │ ├── openai/ # OpenAI HTTP surface
│ │ │ ├── chat/ # Chat Completions execution entrypoint
│ │ │ ├── responses/ # Responses API and response store
│ │ │ ├── files/ # Files API and inline-file preprocessing
│ │ │ ├── embeddings/ # Embeddings API
│ │ │ ├── history/ # OpenAI context file handling
│ │ │ └── shared/ # OpenAI HTTP errors/models/tool formatting
│ │ └── requestbody/ # HTTP body reading and UTF-8/JSON validation helpers
│ ├── js/ # Node runtime related logic
│ │ ├── chat-stream/ # Node streaming bridge
│ │ ├── helpers/ # JS helper modules
@@ -61,13 +65,14 @@ ds2api/
│ ├── textclean/ # Text cleanup
│ ├── toolcall/ # Tool-call parsing and repair
│ ├── toolstream/ # Go streaming tool-call anti-leak and delta detection
│ ├── translatorcliproxy/ # Cross-protocol translation bridge
│ ├── translatorcliproxy/ # Vercel/fallback/test protocol translation bridge
│ ├── util/ # Shared utility helpers
│ ├── version/ # Version query/compare
│ └── webui/ # WebUI static hosting logic
├── plans/ # Stage plans and manual QA records
├── pow/ # PoW standalone implementation + benchmarks
├── scripts/ # Build/release helper scripts
├── static/ # Build artifacts (admin static resources)
├── tests/ # Test assets and scripts
│ ├── compat/ # Compatibility fixtures + expected outputs
│ │ ├── expected/ # Expected output samples
@@ -76,9 +81,9 @@ ds2api/
│ │ └── toolcalls/ # Tool-call fixtures
│ ├── node/ # Node unit tests
│ ├── raw_stream_samples/ # Upstream raw SSE samples
│ │ ├── content-filter-trigger-20260405-jwt3/ # Content-filter terminal sample
│ │ ├── continue-thinking-snapshot-replay-20260405/ # Continue-thinking sample
│ │ ├── guangzhou-weather-reasoner-search-20260404/ # Search/reference sample
│ │ ├── longtext-deepseek-v4-flash-20260429/ # Flash long-text/file-upload sample
│ │ ├── longtext-deepseek-v4-pro-20260429/ # Pro long-text/file-upload sample
│ │ ├── markdown-format-example-20260405/ # Markdown sample
│ │ └── markdown-format-example-20260405-spacefix/ # Space-fix sample
│ ├── scripts/ # Test entry scripts
@@ -91,6 +96,8 @@ ds2api/
├── features/ # Feature modules
│ ├── account/ # Account management page
│ ├── apiTester/ # API tester page
│ ├── chatHistory/ # Server-side conversation history page
│ ├── proxy/ # Proxy management page
│ ├── settings/ # Settings page
│ └── vercel/ # Vercel sync page
├── layout/ # Layout components
@@ -124,8 +131,11 @@ flowchart LR
subgraph RUNTIME[Shared runtime]
AUTH[internal/auth]
POOL[internal/account queue + concurrency]
CR[internal/completionruntime]
TURN[internal/assistantturn]
STREAM[internal/stream + internal/sse]
TOOL[internal/toolcall + internal/toolstream]
FMT[internal/format/openai + claude]
DS[internal/deepseek/client]
POW[pow + internal/deepseek/protocol]
end
@@ -151,16 +161,24 @@ flowchart LR
PC --> PROMPT
PC -.long history.-> HIST
PC --> AUTH
PC --> CR
NCS -.Go prepare/release.-> CHAT
NCS --> JS
JS --> TOOL
AUTH --> POOL
CHAT --> STREAM
RESP --> STREAM
CHAT --> CR
RESP --> CR
CA --> CR
GA --> CR
CR --> DS
CR --> STREAM
CR --> TURN
STREAM --> TURN
STREAM --> TOOL
POOL --> DS
TURN --> FMT
POOL --> CR
DS --> POW
DS --> U[DeepSeek upstream]
```
@@ -169,9 +187,12 @@ flowchart LR
- `internal/server`: router tree + middlewares (health, protocol routes, Admin/WebUI).
- `internal/httpapi/openai/*`: OpenAI HTTP surface split into chat, responses, files, embeddings, history, and shared packages; chat/responses share the promptcompat, stream, and toolcall semantics.
- `internal/httpapi/{claude,gemini}`: protocol wrappers that normalize into the same prompt compatibility semantics without duplicating upstream execution.
- `internal/httpapi/{claude,gemini}`: protocol adapters that normalize into the same prompt compatibility semantics; normal direct paths must share DeepSeek session/PoW/completion execution through `completionruntime`, while `translatorcliproxy` is reserved for Vercel prepare/release, missing-backend fallback, and regression tests.
- `internal/httpapi/requestbody`: shared HTTP body reading, JSON pre-validation, and UTF-8 error helpers across protocol adapters.
- `internal/promptcompat`: compatibility core for turning OpenAI/Claude/Gemini requests into DeepSeek web-chat plain-text context.
- `internal/translatorcliproxy`: structure translation between Claude/Gemini and OpenAI.
- `internal/assistantturn`: Go output-side canonical semantics, converting DeepSeek SSE collection results and stream finalization state into assistant turns and centralizing thinking, tool call, citation, usage, stop/error behavior.
- `internal/completionruntime`: shared Go completion execution helpers for DeepSeek session/PoW/call startup, non-stream collection, and empty-output retry; streaming paths use it to start upstream requests, continue to use `internal/stream` for real-time consumption, and use `assistantturn` during finalization.
- `internal/translatorcliproxy`: bridge compatibility layer for Claude/Gemini and OpenAI shape translation; it is not the main business protocol conversion center.
- `internal/deepseek/{client,protocol,transport}`: upstream requests, sessions, PoW adaptation, protocol constants, and transport details.
- `internal/js/chat-stream` + `api/chat-stream.js`: Vercel Node streaming bridge; Go prepare/release owns auth, account lease, and completion payload assembly, while Node relays real-time SSE with Go-aligned finalization and tool sieve semantics.
- `internal/stream` + `internal/sse`: Go stream parsing and incremental assembly.
@@ -180,6 +201,13 @@ flowchart LR
- `internal/chathistory`: server-side conversation history persistence, pagination, detail lookup, and retention policy.
- `internal/config`: config loading/validation + runtime settings hot-reload.
- `internal/account`: managed account pool, inflight slots, waiting queue.
- `internal/textclean`: text cleanup helpers, e.g. stripping `[reference: N]` markers.
- `internal/claudeconv`: Claude API request to DeepSeek format conversion.
- `internal/compat`: compatibility regression tests using SSE fixtures to verify output consistency.
- `internal/rawsample`: upstream raw response capture, read/write, and management.
- `internal/devcapture`: developer debug capture, storing HTTP request/response for troubleshooting.
- `internal/util`: cross-package utilities including JSON writing, type conversion, token counting, thinking parsing, etc.
- `internal/version`: version query and comparison, supporting build-time injection and runtime resolution.
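As an illustration of the `internal/textclean` entry above, the following is a minimal sketch of stripping `[reference: N]` markers. The regular expression and the function name `stripReferenceMarkers` are assumptions made for this example, not the package's actual API.

```go
package main

import (
	"fmt"
	"regexp"
)

// referenceMarker matches markers such as "[reference: 3]" or "[reference:12]".
// The exact pattern used by internal/textclean is an assumption here.
var referenceMarker = regexp.MustCompile(`\[reference:\s*\d+\]`)

// stripReferenceMarkers removes every marker and leaves all other text untouched.
func stripReferenceMarkers(s string) string {
	return referenceMarker.ReplaceAllString(s, "")
}

func main() {
	fmt.Println(stripReferenceMarkers("Sunny today[reference: 1], rain tomorrow[reference:2]."))
}
```

Keeping the stripping in one helper matches the stated goal of centralizing the behavior instead of repeating a flag in each protocol handler.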
## 4. WebUI Runtime Relation


@@ -15,6 +15,7 @@ ds2api/
│ └── workflows/ # GitHub Actions workflows
├── api/ # Serverless entry points (Vercel Go/Node)
├── app/ # Application-level handler wiring
├── artifacts/ # Debug artifacts (raw-stream-sim, stream-debug, etc.)
├── cmd/ # Executable entry points
│ ├── ds2api/ # Main service entry point
│ └── ds2api-tests/ # E2E test suite CLI entry point
@@ -25,6 +26,8 @@ ds2api/
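│ ├── chathistory/ # Server-side conversation history storage and lookup
│ ├── claudeconv/ # Claude message format conversion helpers
│ ├── compat/ # Compatibility helpers and regression support
│ ├── assistantturn/ # Semantic layer from upstream output to unified assistant turn / stream events
│ ├── completionruntime/ # Shared DeepSeek completion startup, non-stream collection, and retry for the Go main path
│ ├── config/ # Config loading, validation, hot reload
│ ├── deepseek/ # DeepSeek upstream client/protocol/transport
│ │ ├── client/ # Upstream calls: login, session, completion, upload/delete, etc.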
@@ -38,13 +41,14 @@ ds2api/
│ │ ├── admin/ # Admin API root wiring and resource subpackages
│ │ ├── claude/ # Claude HTTP protocol adapter
│ │ ├── gemini/ # Gemini HTTP protocol adapter
│ │ └── openai/ # OpenAI HTTP surface
│ │ ├── chat/ # Chat Completions execution entry
│ │ ├── responses/ # Responses API and response store
│ │ ├── files/ # Files API and inline file preprocessing
│ │ ├── embeddings/ # Embeddings API
│ │ ├── history/ # OpenAI context file handling
│ │ └── shared/ # Shared OpenAI HTTP errors/models/tool formats
│ │ ├── openai/ # OpenAI HTTP surface
│ │ ├── chat/ # Chat Completions execution entry
│ │ ├── responses/ # Responses API and response store
│ │ ├── files/ # Files API and inline file preprocessing
│ │ ├── embeddings/ # Embeddings API
│ │ ├── history/ # OpenAI context file handling
│ │ └── shared/ # Shared OpenAI HTTP errors/models/tool formats
│ │ └── requestbody/ # HTTP request body reading and UTF-8/JSON validation helpers
│ ├── js/ # Node runtime logic
│ │ ├── chat-stream/ # Node streaming output bridge
│ │ ├── helpers/ # JS helper functions
@@ -61,13 +65,14 @@ ds2api/
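│ ├── textclean/ # Text cleanup
│ ├── toolcall/ # Tool call parsing and repair
│ ├── toolstream/ # Go streaming tool-call leak prevention and incremental detection
│ ├── translatorcliproxy/ # Protocol translation bridge
│ ├── translatorcliproxy/ # Protocol translation bridge for Vercel/fallback/tests
│ ├── util/ # General utilities
│ ├── version/ # Version query/comparison
│ └── webui/ # WebUI static hosting logic
├── plans/ # Stage plans and manual acceptance records
├── pow/ # Standalone PoW implementation and benchmarks
├── scripts/ # Build/release/helper scripts
├── static/ # Build output (admin and other static assets)
├── tests/ # Test resources and scripts
│ ├── compat/ # Compatibility fixtures and expected output
│ │ ├── expected/ # Expected result samples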
@@ -76,9 +81,9 @@ ds2api/
│ │ └── toolcalls/ # toolcall fixtures
│ ├── node/ # Node unit tests
│ ├── raw_stream_samples/ # Upstream raw SSE samples
│ │ ├── content-filter-trigger-20260405-jwt3/ # Content-filter terminal-state sample
│ │ ├── continue-thinking-snapshot-replay-20260405/ # continue sample
│ │ ├── guangzhou-weather-reasoner-search-20260404/ # Search + citation sample
│ │ ├── longtext-deepseek-v4-flash-20260429/ # flash long-text / file-upload sample
│ │ ├── longtext-deepseek-v4-pro-20260429/ # pro long-text / file-upload sample
│ │ ├── markdown-format-example-20260405/ # Markdown sample
│ │ └── markdown-format-example-20260405-spacefix/ # Whitespace-fix sample
│ ├── scripts/ # Test script entry points
@@ -91,6 +96,8 @@ ds2api/
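├── features/ # Feature modules
│ ├── account/ # Account management page
│ ├── apiTester/ # API tester page
│ ├── chatHistory/ # Server-side conversation history page
│ ├── proxy/ # Proxy management page
│ ├── settings/ # Settings page
│ └── vercel/ # Vercel sync page
├── layout/ # Layout components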
@@ -124,8 +131,11 @@ flowchart LR
subgraph RUNTIME[Shared runtime]
AUTH[internal/auth]
POOL[internal/account queue + concurrency]
CR[internal/completionruntime]
TURN[internal/assistantturn]
STREAM[internal/stream + internal/sse]
TOOL[internal/toolcall + internal/toolstream]
FMT[internal/format/openai + claude]
DS[internal/deepseek/client]
POW[pow + internal/deepseek/protocol]
end
@@ -151,16 +161,24 @@ flowchart LR
PC --> PROMPT
PC -.long history.-> HIST
PC --> AUTH
PC --> CR
NCS -.Go prepare/release.-> CHAT
NCS --> JS
JS --> TOOL
AUTH --> POOL
CHAT --> STREAM
RESP --> STREAM
CHAT --> CR
RESP --> CR
CA --> CR
GA --> CR
CR --> DS
CR --> STREAM
CR --> TURN
STREAM --> TURN
STREAM --> TOOL
POOL --> DS
TURN --> FMT
POOL --> CR
DS --> POW
DS --> U[DeepSeek upstream]
```
@@ -169,9 +187,12 @@ flowchart LR
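- `internal/server`: router tree and middleware mounting (health checks, protocol entry points, Admin/WebUI).
- `internal/httpapi/openai/*`: OpenAI HTTP surface, split into chat, responses, files, embeddings, history, and shared; chat/responses share core semantics such as promptcompat, stream, and toolcall.
- `internal/httpapi/{claude,gemini}`: protocol input/output adapters that normalize into the same prompt compatibility semantics without reimplementing upstream call logic.
- `internal/httpapi/{claude,gemini}`: protocol input/output adapters that normalize into the same prompt compatibility semantics; normal direct paths must share DeepSeek session/PoW/completion calls through `completionruntime`, while `translatorcliproxy` is reserved for Vercel prepare/release, missing-backend fallback, and regression tests.
- `internal/httpapi/requestbody`: request body reading, pre-decode JSON validation, and UTF-8 error handling helpers shared across protocols.
- `internal/promptcompat`: compatibility core that turns OpenAI/Claude/Gemini requests into DeepSeek web-chat plain-text context.
- `internal/translatorcliproxy`: structure translation between Claude/Gemini and OpenAI.
- `internal/assistantturn`: unified Go output-side semantic layer that normalizes DeepSeek SSE collection results and stream finalization state into assistant turns, centralizing thinking, tool call, citation, usage, and stop/error semantics.
- `internal/completionruntime`: completion execution helpers shared by Go surfaces, responsible for DeepSeek session/PoW/call startup, non-stream collection, and empty-output retry; streaming paths reuse it to start upstream requests, keep using `internal/stream` for real-time consumption, and hand off to `assistantturn` during finalization.
- `internal/translatorcliproxy`: bridge compatibility layer for structure translation between Claude/Gemini and OpenAI; it is not the main business protocol conversion center.
- `internal/deepseek/{client,protocol,transport}`: upstream requests, sessions, PoW adaptation, protocol constants, and the transport layer.
- `internal/js/chat-stream` + `api/chat-stream.js`: Vercel Node streaming bridge; Go prepare/release manages auth, account lease, and the completion payload, while the Node side relays real-time SSE and keeps Go-aligned terminal-state and tool sieve semantics.
- `internal/stream` + `internal/sse`: Go stream parsing and incremental processing.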
@@ -180,6 +201,13 @@ flowchart LR
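- `internal/chathistory`: server-side conversation history persistence, pagination, single-record detail, and retention policy.
- `internal/config`: config loading, validation, and runtime settings hot reload.
- `internal/account`: managed account pool, concurrency slots, waiting queue.
- `internal/textclean`: text cleanup, removing noise such as `[reference: N]` markers.
- `internal/claudeconv`: protocol conversion from Claude API requests to DeepSeek format.
- `internal/compat`: compatibility regression test suite that uses SSE fixtures to verify output consistency.
- `internal/rawsample`: capture, read/write, and management of upstream raw responses.
- `internal/devcapture`: developer debug capture that stores HTTP requests/responses for troubleshooting.
- `internal/util`: cross-package utilities, including JSON writing, type conversion, token counting, and thinking parsing.
- `internal/version`: version query and comparison, supporting build-time injection and runtime resolution.
## 4. WebUI and Runtime Relation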


@@ -197,7 +197,7 @@ This repo includes a `zeabur.yaml` template for one-click deployment on Zeabur:
Notes:
- **Port**: DS2API listens on `5001` by default; the template sets `PORT=5001`.
- **Persistent config**: the template mounts `/data` and sets `DS2API_CONFIG_PATH=/data/config.json`. After importing config in Admin UI, it will be written and persisted to this path.
- **Persistent config**: the template mounts `/data` and sets `DS2API_CONFIG_PATH=/data/config.json`. On a fresh volume, DS2API starts with an empty file-backed config; after importing config in Admin UI, it will be written and persisted to this path.
- **`open /app/config.json: permission denied`**: this means the instance is trying to persist runtime tokens to a read-only path (commonly `/app` inside the image).
Recommended handling:
1. Set a writable path explicitly: `DS2API_CONFIG_PATH=/data/config.json` (and mount a persistent volume at `/data`);
@@ -206,6 +206,37 @@ Notes:
- **Build version**: Zeabur / regular `docker build` does not require `BUILD_VERSION` by default. The image prefers that build arg when provided, and automatically falls back to the repo-root `VERSION` file when it is absent.
- **First login**: after deployment, open `/admin` and log in with the `DS2API_ADMIN_KEY` shown in the Zeabur env/template instructions (recommended: rotate to a strong secret after first login).
#### Manual Deployment Without The Template
If you do not want to use the `zeabur.yaml` one-click template, deploy directly from the repo root with Zeabur's GitHub integration:
1. Fork this repo, or push the code to your own GitHub repository.
2. In Zeabur Dashboard, create a Project, add a Service, then choose a GitHub/Git repository source.
3. Select the repository and branch. Keep Root Directory as `/`.
4. Use the Dockerfile build path. Zeabur auto-detects the repo-root `Dockerfile`; do not set `ZBPACK_IGNORE_DOCKERFILE=true`. If the UI asks for a Dockerfile name, enter `Dockerfile`.
5. Add a persistent volume in the Service settings and mount it at `/data`.
6. Configure environment variables:
| Variable | Recommended value | Description |
| --- | --- | --- |
| `PORT` | `5001` | Service listen port; keep it aligned with the exposed Zeabur HTTP port. |
| `DS2API_ADMIN_KEY` | Strong random string | Required admin login key. |
| `DS2API_CONFIG_PATH` | `/data/config.json` | Recommended persistent config path. |
| `LOG_LEVEL` | `INFO` | Optional log level. |
| `DS2API_CONFIG_JSON` | Raw JSON or Base64 JSON | Optional config bootstrap from env. |
| `DS2API_ENV_WRITEBACK` | `1` | Optional; enable only when using `DS2API_CONFIG_JSON` and you want the initial config written to `/data/config.json`. |
7. Expose HTTP port `5001`. The health check path can be `/healthz`.
8. After deployment, open `/admin`, log in with `DS2API_ADMIN_KEY`, then import or edit config in the Admin UI. A fresh volume does not need `/data/config.json` up front; the service boots first and creates the file on the first save.
Troubleshooting:
- **Startup log says `open /data/config.json: no such file or directory`**: make sure you deployed a version that includes the fresh-volume bootstrap fix, then redeploy the latest code.
- **`open /app/config.json: permission denied`**: the config path still points at the read-only image directory; mount `/data` and set `DS2API_CONFIG_PATH=/data/config.json`.
- **Config disappears after restart**: check that the `/data` persistent volume is mounted on this service. If you use `DS2API_CONFIG_JSON` but want Admin UI saves persisted, enable `DS2API_ENV_WRITEBACK=1`.
References: Zeabur's official [GitHub/Git integration](https://zeabur.com/docs/en-US/deploy/github), [Dockerfile deployment](https://zeabur.com/docs/en-US/deploy/dockerfile), and [Volumes](https://zeabur.com/docs/data-management/volumes) docs.
---
## 3. Vercel Deployment


@@ -197,7 +197,7 @@ healthcheck:
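Deployment notes:
- **Port**: the service listens on `5001` by default; the template pins `PORT=5001`.
- **Persistent config**: the template mounts the volume `/data` and sets `DS2API_CONFIG_PATH=/data/config.json`; after importing config in the Admin UI, it is written and persisted to this path.
- **Persistent config**: the template mounts the volume `/data` and sets `DS2API_CONFIG_PATH=/data/config.json`; on a first start with an empty volume, the service uses an empty file-backed config, and after importing config in the Admin UI it is written and persisted to this path.
- **`open /app/config.json: permission denied`**: the instance is trying to persist runtime tokens to a read-only path (commonly `/app` inside the image).
Recommended handling:
1. Set a writable path explicitly: `DS2API_CONFIG_PATH=/data/config.json` (and mount a persistent volume at `/data`);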
@@ -206,6 +206,37 @@ healthcheck:
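- **Build version**: Zeabur / a regular `docker build` does not need `BUILD_VERSION` by default; the image prefers that build arg when provided and automatically falls back to the repo-root `VERSION` file when it is absent.
- **First login**: after deployment, open `/admin` and log in with the `DS2API_ADMIN_KEY` from the Zeabur environment variables / template instructions (recommended: change it to a strong secret after the first login).
#### Manual Deployment Without the Template
If you do not want to use the `zeabur.yaml` one-click template, you can build directly from the repo root with Zeabur's GitHub integration:
1. Fork this repo, or push the code to your own GitHub repository.
2. In the Zeabur Dashboard, create a Project, then add a Service and choose a GitHub/Git repository source.
3. Select the repository and branch; keep Root Directory as `/`.
4. Use Dockerfile as the build method. Zeabur auto-detects the repo-root `Dockerfile`; do not set `ZBPACK_IGNORE_DOCKERFILE=true`. If the UI asks for a Dockerfile name, enter `Dockerfile`.
5. Add a persistent volume in the Service settings with the mount directory `/data`.
6. Configure environment variables:
| Variable | Recommended value | Description |
| --- | --- | --- |
| `PORT` | `5001` | Service listen port; must match the HTTP port exposed by Zeabur. |
| `DS2API_ADMIN_KEY` | Strong random string | Admin login key, required. |
| `DS2API_CONFIG_PATH` | `/data/config.json` | Persistent config path; setting it is recommended. |
| `LOG_LEVEL` | `INFO` | Optional log level. |
| `DS2API_CONFIG_JSON` | Raw JSON or Base64 JSON | Optional; bootstraps config from an environment variable. |
| `DS2API_ENV_WRITEBACK` | `1` | Optional; enable only when `DS2API_CONFIG_JSON` is set and you want the config written to `/data/config.json` after the first start. |
7. Expose HTTP port `5001`; the health check path can be `/healthz`.
8. After deployment, open `/admin`, log in with `DS2API_ADMIN_KEY`, then import or edit config in the Admin UI. A fresh volume may start without `/data/config.json`; the service boots first and creates the file on the first save.
Troubleshooting:
- **Startup log shows `open /data/config.json: no such file or directory`**: confirm that you deployed a version that includes the "fresh-volume startup" fix, then redeploy the latest code.
- **`open /app/config.json: permission denied`**: the config path still points at the read-only directory inside the image; set up the persistent volume `/data` and confirm `DS2API_CONFIG_PATH=/data/config.json`.
- **Config is lost on restart after saving in the Admin UI**: check that the `/data` persistent volume is mounted on the current service; if you use `DS2API_CONFIG_JSON` but want Admin UI saves written to disk, enable `DS2API_ENV_WRITEBACK=1`.
References: Zeabur's official docs for [GitHub/Git integration](https://zeabur.com/docs/en-US/deploy/github), [Dockerfile deployment](https://zeabur.com/docs/zh-CN/deploy/dockerfile), and [Volumes](https://zeabur.com/docs/data-management/volumes).
---
## 3. Vercel Deployment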


@@ -68,7 +68,7 @@ gofmt -w <changed-go-files>
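3. Request normalization: `internal/promptcompat` or the protocol conversion packages.
4. Upstream requests: `internal/deepseek/client`.
5. Streaming output: `internal/stream`, `internal/sse`, `internal/toolstream`.
6. Response format: `internal/format/*`, `internal/translatorcliproxy`.
6. Response format: for the main path see `internal/assistantturn` and `internal/format/*`; `internal/translatorcliproxy` is only used for the Vercel/fallback/test bridge.
For conversation history page issues, check first: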


@@ -1,7 +1,7 @@
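# DeepSeek SSE Behavior and Structure Notes (Third-Party Reverse-Engineered Edition)
> Note: this document is compiled from all `upstream.stream.sse` raw stream samples under `tests/raw_stream_samples/` in the current repository. It is a third-party reverse-engineering observation document, not an official protocol.
> The current corpus consists of 4 raw streams, covering search + citation, content-filter terminal state, Markdown output, and whitespace-sensitive output.
> The current corpus consists of 5 raw streams, covering long-text generation, file-upload context, continue resumption, Markdown output, and whitespace-sensitive output.
> Addendum: the end of the document also notes a few behaviors that are "confirmed in the current implementation but not yet fully covered by the corpus", such as the auto-continue state in long-thinking scenarios.
Doc navigation: [Documentation index](./README.md) / [Testing guide](./TESTING.md) / [Sample directory notes](../tests/raw_stream_samples/README.md)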
@@ -12,8 +12,9 @@
| Sample | What to observe |
| --- | --- |
| [guangzhou-weather-reasoner-search-20260404](../tests/raw_stream_samples/guangzhou-weather-reasoner-search-20260404/upstream.stream.sse) | Search + thinking flow, including `reference:N` citation markers and tool fragments |
| [content-filter-trigger-20260405-jwt3](../tests/raw_stream_samples/content-filter-trigger-20260405-jwt3/upstream.stream.sse) | `CONTENT_FILTER` terminal branch, including the refusal template and `ban_regenerate` |
| [longtext-deepseek-v4-flash-20260429](../tests/raw_stream_samples/longtext-deepseek-v4-flash-20260429/upstream.stream.sse) | DeepSeek V4 flash long-text stream, including a completion sample after a current input file upload |
| [longtext-deepseek-v4-pro-20260429](../tests/raw_stream_samples/longtext-deepseek-v4-pro-20260429/upstream.stream.sse) | DeepSeek V4 pro long-text stream, including file-upload context and longer reasoning/content output |
| [continue-thinking-snapshot-replay-20260405](../tests/raw_stream_samples/continue-thinking-snapshot-replay-20260405/upstream.stream.sse) | Multi-round `completion + continue` raw stream, used to verify deduplication of resumed thinking |
| [markdown-format-example-20260405](../tests/raw_stream_samples/markdown-format-example-20260405/upstream.stream.sse) | Early Markdown output sample, used to observe the token-level output shape |
| [markdown-format-example-20260405-spacefix](../tests/raw_stream_samples/markdown-format-example-20260405-spacefix/upstream.stream.sse) | Corrected Markdown output sample, used to verify that whitespace chunks must be preserved |
@@ -194,7 +195,7 @@ close
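## 8. Terminal-State Behavior
The current corpus has two important terminal branches:
The current corpus directly covers normal completion and continue resumption; the current implementation also remains compatible with the `CONTENT_FILTER` terminal state, and that branch continues to be guarded by protocol observation and compatibility fixtures.
### 8.1 Normal Completion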
@@ -208,7 +209,7 @@ close
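### 8.2 Content-Filter Terminal State
`content-filter-trigger-20260405-jwt3` shows another terminal path:
`CONTENT_FILTER` is not among the directory samples of the current raw stream corpus, but the code and compatibility tests still handle it along the following terminal path:
1. First, a stretch of normal body text continues to be output.
2. A hint-type fragment appears, for example `TIP`.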


@@ -16,6 +16,7 @@
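### Topical docs
- [DS2API project value note](./project-value.md)
- [API -> web-chat plain-text compatibility main pipeline](./prompt-compatibility.md)
- [Tool Calling unified semantics](./toolcall-semantics.md)
- [DeepSeek SSE behavior and structure notes (reverse-engineered observations)](./DeepSeekSSE行为结构说明-2026-04-05.md)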
@@ -47,6 +48,7 @@ Recommended reading order:
### Topical docs
- [DS2API project value note](./project-value.md)
- [API -> pure-text web-chat compatibility pipeline](./prompt-compatibility.md)
- [Tool-calling unified semantics](./toolcall-semantics.md)
- [DeepSeek SSE behavior notes (reverse-engineered)](./DeepSeekSSE行为结构说明-2026-04-05.md)

119
docs/project-value.md Normal file

@@ -0,0 +1,119 @@
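# DS2API Project Value Note
Doc navigation: [Overview](../README.MD) / [Documentation index](./README.md) / [API docs](../API.md) / [Compatibility main pipeline](./prompt-compatibility.md) / [Tool Calling semantics](./toolcall-semantics.md)
> This document explains DS2API's project positioning and long-term value.
> It is neither an architecture description nor a feature list; it explains why the project exists from the angle of "how web capabilities can be exposed as a stable API".
## 1. Project Positioning
DS2API is not positioned as "yet another API proxy", nor as a training tool.
It is essentially a web-to-API compatibility layer: it takes the capabilities available on the DeepSeek web chat side and shapes them into request and response forms that OpenAI / Claude / Gemini style clients can connect to.
The core value of this project is:
1. Exposing DeepSeek web chat capabilities as an API.
2. Unifying different client protocols behind one compatibility entry point.
3. Turning web-side sessions, thinking, file references, streaming output, and similar behaviors into results that clients can consume.
4. Providing a stable backend for upper-layer coding tools, automation tools, or external orchestrators.
## 2. Problems It Solves
### 2.1 Turning web capabilities into a connectable API shape
The web side can be chatted with directly, but standard clients need a stable API contract. There is a natural gap between the two:
- Different input formats
- Different output events
- Different streaming semantics
- Different file reference mechanisms
- Different ways of exposing thinking versus body text
Through `promptcompat`, `completionruntime`, `assistantturn`, and the per-protocol renderers, DS2API narrows this gap into a single maintainable main pipeline:
- On the request side, OpenAI / Claude / Gemini messages are normalized into web plain-text context.
- On the upstream side, a session is started with the payload that the DeepSeek web completion requires.
- On the output side, collected DeepSeek SSE results or streaming events are rendered back into each protocol's native shape.
This is the project's primary definition: stably converting web capabilities into an API-consumable shape.
### 2.2 Not just forwarding, but compatibility
Plain forwarding can only send the request out; it cannot handle the differences between protocol semantics. DS2API additionally has to handle:
- Mapping between model aliases and native DeepSeek models
- thinking / reasoning switches and output structure
- search and citation / reference markers
- File uploads, history files, and the current input file
- Upstream empty output, content filter, auto-continue, retries, and usage estimation
None of these can be solved by "just changing the URL". The project's value shows up precisely in these details.
### 2.3 Letting external toolchains attach
When users connect DS2API to coding tools, automation tools, or third-party SDKs, many requests become long-chain tasks:
- Reading files
- Searching context
- Modifying code
- Running commands
- Continuing to correct
- Producing the final result
DS2API does not define these external toolchains itself, but it provides an API foundation stable enough for them to attach to and keep working.
## 3. The Value of Tool Calling
Tool calling is not a precondition for DS2API to exist, but it is an important enhancement.
Even without tool calling, DS2API is still a web-to-API compatibility layer; when a request includes tool capabilities, the project additionally handles model output drift, long parameters, and streaming leak prevention:
- Long scripts keep their original text via CDATA
- File paths and command arguments are less likely to be broken by escaping
- Tool call syntax has unified DSML / canonical XML handling
- Drifted model output can still be loosely matched and self-corrected
- In streaming scenarios, tool blocks are kept from leaking back into ordinary text as far as possible
This lets DS2API serve coding tools and agent-style clients, but the project's main axis remains "exposing web capabilities as an API", rather than treating tool calling as its only selling point.
## 4. The Role of CDATA
CDATA is not the project's value in itself, but it is a very practical part of tool calling and long-text compatibility.
For this project's scenarios, the role of CDATA is direct:
- Protecting long text from being broken by escaping
- Keeping scripts, commands, and code snippets verbatim
- Letting structured parameters and free text coexist more stably
- Making historical content easier to replay and reprocess verbatim
Its purpose is not to make the protocol look more complex, but to make content break less often during rewriting, parsing, and replay.
## 5. What It Is Not
To avoid misunderstanding, the project boundaries need to be stated explicitly:
- Not the official DeepSeek API.
- Not a training platform.
- Not a human annotation system.
- Not a standalone evaluation tool.
- Not a simple reverse proxy.
DS2API is a compatibility layer. Its responsibility is to shape web capabilities into an API experience and, where necessary, apply compatibility handling to tools, history, files, and streaming output.
## 6. Long-Term Value
DS2API's long-term value lies not in any single feature, but in putting several hard problems into one maintainable pipeline:
- Multi-protocol entry points
- DeepSeek web completion adaptation
- Plain-text prompt compatibility
- thinking / search / file reference handling
- Go / Node streaming output alignment
- Tool call parsing and leak prevention
- Admin / WebUI / account pool / concurrency queue
If its value had to be summarized in one sentence:
**DS2API's value is to stably shape DeepSeek web capabilities into an API form that standard clients can keep using.**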


@@ -3,7 +3,7 @@
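Doc navigation: [Overview](../README.MD) / [Architecture](./ARCHITECTURE.md) / [API docs](../API.md) / [Testing guide](./TESTING.md)
> This document is the dedicated description of how DS2API "adapts OpenAI / Claude / Gemini style API requests into DeepSeek web-chat plain-text context".
> This is one of the project's most important compatibility artifacts. Any change to message normalization, tool prompt injection, tool history retention, file references, current input file / legacy history_split, downstream completion payload assembly, or similar behavior must update this document in the same change.
> This is one of the project's most important compatibility artifacts. Any change to message normalization, tool prompt injection, tool history retention, file references, current input file, downstream completion payload assembly, or similar behavior must update this document in the same change.
## 1. Core Conclusions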
@@ -45,9 +45,12 @@ DS2API 当前的核心思路,不是把客户端传来的 `messages`、`tools`
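-> promptcompat unified message normalization
-> tool prompt injection
-> DeepSeek-style prompt assembly
-> file collection / inline upload / current input file (OpenAI path)
-> file collection / inline upload (OpenAI file path)
-> current input file (global completion runtime entry)
-> completion payload
-> downstream web chat endpoint
-> assistantturn output semantics normalization (Go non-stream + stream finalization)
-> per-protocol renderers (OpenAI / Responses / Claude / Gemini)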
```
Corresponding key code entry points:
@@ -72,6 +75,10 @@ DS2API 当前的核心思路,不是把客户端传来的 `messages`、`tools`
[internal/promptcompat/thinking_injection.go](../internal/promptcompat/thinking_injection.go)
- completion payload
[internal/promptcompat/standard_request.go](../internal/promptcompat/standard_request.go)
- Go output-side assistant turn:
[internal/assistantturn/turn.go](../internal/assistantturn/turn.go)
- Go completion runtime:
[internal/completionruntime/nonstream.go](../internal/completionruntime/nonstream.go)
## 4. What the Downstream Actually Receives
@@ -101,8 +108,9 @@ DS2API 当前的核心思路,不是把客户端传来的 `messages`、`tools`
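- The `prompt_tokens` / `input_tokens` / `promptTokenCount` returned to clients are no longer approximated from "the last message" or a rough character estimate; they are tokenizer counts over the **full context prompt**. To avoid a case where the context actually exceeds the limit while the client believes there is still room, the request-side context token count is padded slightly on the conservative side: better slightly high than underestimated.
- The current `/v1/chat/completions` business path still "creates a new remote `chat_session_id` for every request and sends `parent_message_id: null` by default"; DS2API therefore presents as "new session + history stitched into the prompt" by default, rather than reusing DeepSeek's native session tree.
- DeepSeek's remote side does, however, support continuous multi-round conversation within the same `chat_session_id`. On 2026-04-27 a two-round live test was run with the project's existing DeepSeek client without changing business code: under the same `chat_session_id`, round 1 returned `request_message_id=1` / `response_message_id=2` / text `SESSION_TEST_ONE`; round 2 fetched a fresh PoW and sent `parent_message_id=2`, and successfully returned `request_message_id=3` / `response_message_id=4` / text `SESSION_TEST_TWO`. This shows that the "keep chatting in the same remote session" capability exists, that each round must carry the correct parent/message linkage, and that a PoW valid for that round must be fetched again.
- OpenAI Chat / Responses natively go through the unified OpenAI normalization and DeepSeek payload assembly; Claude / Gemini reuse OpenAI prompt/tool semantics as far as possible: Gemini reuses `promptcompat.BuildOpenAIPromptForAdapter` directly, and the Claude messages endpoint is converted into OpenAI chat shape before execution in proxyable scenarios.
- The thinking / reasoning switches passed by the client are normalized to the downstream `thinking_enabled`. Gemini `generationConfig.thinkingConfig.thinkingBudget` is translated into the same thinking switch; when it is off, the compatibility layer does not emit `response/thinking_content` as visible body text even if upstream returns it. If the finally resolved model name carries the `-nothinking` suffix, thinking is unconditionally forced off, taking priority over `thinking` / `reasoning` / `reasoning_effort` in the request body. On the Claude surface, streaming requests that do not explicitly declare `thinking` still default to off per Anthropic semantics; in non-streaming proxy scenarios, however, the compatibility layer internally enables downstream thinking once, to catch the case where "the body is empty and the tool call landed in thinking", and then strips the thinking block, which is invisible to the user, before returning the response.
- OpenAI Chat / Responses natively go through the unified OpenAI normalization and DeepSeek payload assembly; Claude / Gemini reuse OpenAI prompt/tool semantics as far as possible, with Gemini reusing `promptcompat.BuildOpenAIPromptForAdapter` directly. The Go main service adds a `completionruntime` startup layer that uniformly executes DeepSeek session/PoW/call, and the output side adds an `assistantturn` semantic layer: non-streaming OpenAI Chat / Responses / Claude / Gemini first normalize the DeepSeek SSE collection result into one shared assistant turn and then render it into each protocol's native shape; streaming OpenAI Chat / Responses / Claude / Gemini keep each protocol's real-time SSE framing, but the tool fallback, schema normalization, usage, and empty-output / content-filter error semantics at finalization are likewise decided by `assistantturn`. The regular Go main path for Claude / Gemini no longer relies on internal `httptest` forwarding to the OpenAI handler; `translatorcliproxy` is kept only for the Vercel bridge, missing-backend fallback, and regression tests, and is not the main business protocol conversion center.
- The Vercel Node streaming path is not migrated in this round and still uses the existing Node bridge / stream-tool-sieve implementation; any later change to Node streaming semantics must be aligned with the Go canonical output semantics of `assistantturn`.
- The thinking / reasoning switches passed by the client are normalized to the downstream `thinking_enabled`. Gemini `generationConfig.thinkingConfig.thinkingBudget` is translated into the same thinking switch; when it is off, the compatibility layer does not emit `response/thinking_content` as visible body text even if upstream returns it. If the finally resolved model name carries the `-nothinking` suffix, thinking is unconditionally forced off, taking priority over `thinking` / `reasoning` / `reasoning_effort` in the request body. Unless explicitly turned off, each surface enables thinking according to the default capability of the resolved DeepSeek model and exposes it in its protocol's native shape: `reasoning_content` for OpenAI Chat, `response.reasoning.delta` / `reasoning` content for OpenAI Responses, `thinking` block / `thinking_delta` for Claude, and `thought: true` parts for Gemini.
- For non-streaming finalization of OpenAI Chat / Responses, if the final visible body is empty, the compatibility layer first tries to parse standalone DSML / XML tool blocks in the chain of thought as real tool calls. The streaming path performs the same fallback detection at finalization, but does not intercept or rewrite streaming output midway because of chain-of-thought content. Real tool detection is always based on the raw upstream text, not on a version that "has already gone through visible-output cleanup"; so even though the visible layer strips complete leaked DSML / XML `tool_calls` wrappers and suppresses wrapper blocks that are invalid or have all-empty parameters, this does not affect converting real tool calls into structured `tool_calls` / `function_call`. The recovered result is returned as this turn's structured assistant `tool_calls` / `function_call` output rather than being stuffed into `content` text; if the client has not enabled thinking / reasoning, the chain of thought is used only for detection and is not exposed as `reasoning_content` or visible body text. Only when the body is empty and the chain of thought contains no executable tool call does handling continue as an empty-reply error.
- Before empty-reply error handling, OpenAI Chat / Responses perform one internal compensating retry by default: after the first upstream run ends completely, if the final visible body is empty, no tool call was parsed, no tool call has already been streamed to the client, and the termination reason is not `content_filter`, the compatibility layer reuses the same `chat_session_id`, account, token, and tool policy, appends the fixed suffix `Previous reply had no visible output. Please regenerate the visible final answer or tool call now.` to the original completion `prompt`, and resubmits once. The retry follows the DeepSeek multi-round conversation protocol: it extracts `response_message_id` from the first upstream SSE stream and sets `parent_message_id` in the retry payload to that value, so the retry becomes a follow-up round in the same session rather than a disconnected root message; it also fetches a fresh PoW, falling back to the original PoW if that fails. The retry does not re-normalize messages, create a new session, or switch accounts, and does not insert a retry marker for streaming clients; the second attempt's thinking / reasoning is appended directly after the first as normal increments and continues to use overlap trim for deduplication. If the second attempt is still empty, the terminal error code remains the existing `upstream_empty_output`; if any attempt triggers an empty `content_filter`, no compensating retry is made and the `content_filter` error is kept. The JS Vercel runtime likewise sets `parent_message_id`, but reuses the original PoW because it cannot call the PoW API directly.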
@@ -159,8 +167,8 @@ OpenAI Chat / Responses 在标准化后、current input file 之前,会默认
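4. Merge this whole block into the system prompt.
The positive tool-call examples now prefer the official DSML style: `<|DSML|tool_calls>`, `<|DSML|invoke name="...">`, `<|DSML|parameter name="...">`.
The compatibility layer still accepts the legacy plain `<tool_calls>` wrapper, but the prompt asks the model to prefer the official DSML tags and stresses that it must not emit only the closing wrapper while omitting the opening tag. Note that this is "compatibility with the DSML shell, with XML parsing semantics still authoritative internally", not a native end-to-end DSML implementation: DSML tags are normalized back to the existing XML tags at the parser entry and then go through the same parser.
Array parameters are represented with `<item>...</item>` child nodes; when a parameter body contains only item children, the Go / Node parsers restore it as an array, so that parameters such as `questions` / `options` whose schema requires an array are not misparsed as a `{ "item": ... }` object. Beyond that, the parser also recovers some looser list forms, such as a JSON array literal or a comma-separated sequence of JSON items, as long as they are unambiguous; `<item>` remains the preferred form. If the model mistakenly wraps a complete structured XML fragment in CDATA, the compatibility layer, while protecting verbatim fields such as `content` / `command`, tries to restore CDATA XML fragments in non-verbatim fields into object / array. However, if the CDATA is just a single flat XML/HTML tag, such as the inline markup `<b>urgent</b>`, the compatibility layer keeps the original string and does not force it into object / array; only CDATA fragments that clearly express structure, such as multiple sibling nodes, nested children, or an `item` list, trigger structured recovery.
The compatibility layer still accepts the legacy plain `<tool_calls>` wrapper and tolerates several DSML tag variants, including the hyphenated forms `<dsml-tool-calls>` / `<dsml-invoke>` / `<dsml-parameter>`, but the prompt asks the model to prefer the official DSML tags and stresses that it must not emit only the closing wrapper while omitting the opening tag. Note that this is "compatibility with the DSML shell, with XML parsing semantics still authoritative internally", not a native end-to-end DSML implementation: DSML tags are normalized back to the existing XML tags at the parser entry and then go through the same parser.
Array parameters are represented with `<item>...</item>` child nodes; when a parameter body contains only item children, the Go / Node parsers restore it as an array, so that parameters such as `questions` / `options` whose schema requires an array are not misparsed as a `{ "item": ... }` object. Beyond that, the parser also recovers some looser list forms, such as a JSON array literal or a comma-separated sequence of JSON items, as long as they are unambiguous; `<item>` remains the preferred form. If the model mistakenly wraps a complete structured XML fragment in CDATA, the compatibility layer, while protecting verbatim fields such as `content` / `command`, tries to restore CDATA XML fragments in non-verbatim fields into object / array. However, if the CDATA is just a single flat XML/HTML tag, such as the inline markup `<b>urgent</b>`, the compatibility layer keeps the original string and does not force it into object / array; only CDATA fragments that clearly express structure, such as multiple sibling nodes, nested children, or an `item` list, trigger structured recovery. For long-text parameters such as `command` / `content`, Markdown-fenced DSML / XML examples inside CDATA are protected as verbatim text; a `]]></parameter>` or `</tool_calls>` inside the example does not truncate the outer tool call, and the parser keeps waiting for the real parameter / wrapper end tags outside the fence.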
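The "normalize DSML tags back to the existing XML tags at the parser entry" step described above can be illustrated with a small sketch. The regular expression and the function name `normalizeDSMLTags` are assumptions for illustration; the real parser may tolerate more forms (for example trailing-pipe variants) and is not reproduced here.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// dsmlTag matches an opening or closing DSML-prefixed tag name in the pipe,
// space, hyphen, underscore, or collapsed forms, e.g. "<|DSML|invoke",
// "<dsml-tool-calls", "</dsml_parameter". The pattern is an assumption.
var dsmlTag = regexp.MustCompile(`(?i)<(/?)\|?dsml[|\s_-]*(tool[_-]calls|invoke|parameter)`)

// normalizeDSMLTags rewrites DSML-prefixed tag names to the canonical XML
// names (tool_calls, invoke, parameter) and leaves attributes untouched.
func normalizeDSMLTags(s string) string {
	return dsmlTag.ReplaceAllStringFunc(s, func(m string) string {
		sub := dsmlTag.FindStringSubmatch(m)
		name := strings.ToLower(strings.ReplaceAll(sub[2], "-", "_"))
		return "<" + sub[1] + name
	})
}

func main() {
	in := `<dsml-tool-calls><dsml-invoke name="ls"></dsml-invoke></dsml-tool-calls>`
	fmt.Println(normalizeDSMLTags(in))
}
```

Normalizing only the tag name keeps a single XML parser authoritative, which is the design the paragraph above describes.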
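On the Go side, reading DeepSeek SSE no longer depends on `bufio.Scanner`'s fixed 2MiB single-line limit; when a file-writing tool returns a very long `content` in a single `data:` line, non-stream collection, stream parsing, and auto-continue passthrough all keep the complete line before it enters the same tool parsing and serialization flow.
At the assistant's final response stage, if a tool parameter is explicitly declared as `string` in the schema, the compatibility layer, before re-serializing the parsed `tool_calls` / `function_call` into the parameters visible to OpenAI / Responses / Claude, recursively converts number / bool / object / array values on that path into strings; object / array values are compacted into compact JSON strings. This protection applies only to paths the schema explicitly declares as string and does not rewrite parameters that are already `number` / `boolean` / `object` / `array`. It covers the scenario where DeepSeek emits a structured fragment while the upstream client's tool schema strictly requires string parameters (for example `content`, `prompt`, `path`, `taskId`).
The authoritative source for a tool schema is always **the schema actually carried by the current request**, not the default impression of a same-named tool in other runtimes (Claude Code / OpenCode / Codex, etc.). The compatibility layer now accepts OpenAI-style `function.parameters`, `parameters` / `input_schema` directly on the tool object, and camelCase `inputSchema` / `schema`, and at the final output stage uses this in-request schema to decide whether to keep array/object or to stringify only the paths explicitly declared as `string`. The same rule applies to Claude's stream finalization and the Vercel Node streaming tool-call formatter, avoiding type drift of same-named tool parameters across runtimes caused by schema shape differences.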
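A minimal sketch of the schema-driven string coercion described above, assuming a JSON-Schema-like `map[string]any` with `type` and `properties` keys. The function name `coerceToSchema` is hypothetical, and this sketch only recurses through object properties; array `items` schemas are left out for brevity.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// coerceToSchema stringifies a value only where the schema explicitly says
// "string"; object and array values become compact JSON strings. Values on
// paths declared as other types are returned unchanged.
func coerceToSchema(value any, schema map[string]any) any {
	typ, _ := schema["type"].(string)
	switch typ {
	case "string":
		switch v := value.(type) {
		case nil:
			return nil
		case string:
			return v
		case map[string]any, []any:
			encoded, err := json.Marshal(v)
			if err != nil {
				return fmt.Sprint(v)
			}
			return string(encoded)
		default:
			return fmt.Sprint(v)
		}
	case "object":
		obj, ok := value.(map[string]any)
		if !ok {
			return value
		}
		props, _ := schema["properties"].(map[string]any)
		out := make(map[string]any, len(obj))
		for key, item := range obj {
			if sub, ok := props[key].(map[string]any); ok {
				out[key] = coerceToSchema(item, sub)
			} else {
				out[key] = item
			}
		}
		return out
	}
	return value
}

func main() {
	schema := map[string]any{
		"type": "object",
		"properties": map[string]any{
			"content": map[string]any{"type": "string"},
			"count":   map[string]any{"type": "number"},
		},
	}
	args := map[string]any{
		"content": map[string]any{"a": float64(1)},
		"count":   float64(3),
	}
	out, _ := json.Marshal(coerceToSchema(args, schema))
	fmt.Println(string(out))
}
```

Driving the conversion from the request's own schema, rather than from the tool name, is what keeps the same tool from drifting between types across runtimes.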
@@ -260,11 +268,10 @@ OpenAI 的文件上传现在不再是“只传文件本体”的通用路径,
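## 9. Why Multi-Round History Is Not Always Fully Inlined in the Prompt
The compatibility layer now keeps only `current_input_file` as a splitting mechanism; the old `history_split` is deprecated and kept only as a field for compatibility with old configs, and no longer takes part in request handling.
The compatibility layer now keeps only `current_input_file` as a splitting mechanism; the old `history_split` config field has been removed, and when an old config is read the field is ignored and not written back.
- `current_input_file` is enabled by default; it merges the "full context" into the `DS2API_HISTORY.txt` context file. When the plain-text length of the latest user turn reaches `current_input_file.min_chars` (default `0`), the compatibility layer uploads a context file named `DS2API_HISTORY.txt`. The file content first goes through OpenAI message normalization and is then serialized into a `DS2API_HISTORY.txt` style transcript numbered by turn, with a `# DS2API_HISTORY.txt` title and `=== N. ROLE ===` sections; the live prompt then carries a continuation-toned user message that guides the model to continue from the latest state in `DS2API_HISTORY.txt` and answer the latest request directly, so the task is not pulled back to the start.
- `current_input_file` is enabled by default; it takes effect globally at the unified completion runtime entry and merges the "full context" into the `DS2API_HISTORY.txt` context file. When the plain-text length of the latest user turn reaches `current_input_file.min_chars` (default `0`), the runtime uploads a context file named `DS2API_HISTORY.txt`. The file content first goes through each protocol entry's normalization and is then serialized into a `DS2API_HISTORY.txt` style transcript numbered by turn, with a `# DS2API_HISTORY.txt` title and `=== N. ROLE ===` sections; the live prompt then carries a continuation-toned user message that guides the model to continue from the latest state in `DS2API_HISTORY.txt` and answer the latest request directly, so the task is not pulled back to the start.
- If `current_input_file.enabled=false`, the request is passed through directly and no split context file is uploaded.
- The old `history_split.enabled` / `history_split.trigger_after_turns` are read into the config object for compatibility, but do not trigger split uploads and do not affect `current_input_file` being enabled by default.
- Even when `current_input_file` triggers and the live prompt is shortened, the context token count in the response to the client still follows the **full pre-split prompt semantics** rather than the shortened placeholder prompt; otherwise the real context would be significantly undercounted.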
Related implementation:
@@ -273,10 +280,10 @@ OpenAI 的文件上传现在不再是“只传文件本体”的通用路径,
[internal/config/store_accessors.go](../internal/config/store_accessors.go)
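- Current input to file:
[internal/httpapi/openai/history/current_input_file.go](../internal/httpapi/openai/history/current_input_file.go)
- Legacy history split compatibility shell:
[internal/httpapi/openai/history/history_split.go](../internal/httpapi/openai/history/history_split.go)
- Global completion runtime application point:
[internal/completionruntime/nonstream.go](../internal/completionruntime/nonstream.go)
When current input to file is enabled and triggered, the real name of the uploaded file is `DS2API_HISTORY.txt` and its content is the full `messages` context; it is still first serialized with OpenAI message normalization and DeepSeek role markers, then numbered by turn into a `DS2API_HISTORY.txt` style transcript, and file boundary tags are no longer injected.
When current input to file is enabled and triggered, the real name of the uploaded file is `DS2API_HISTORY.txt` and its content is the full `messages` context; it uses OpenAI-compatible message/transcript serialization rules and DeepSeek role markers, is then numbered by turn into a `DS2API_HISTORY.txt` style transcript, and file boundary tags are no longer injected.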
```text
[uploaded filename]: DS2API_HISTORY.txt
@@ -308,7 +315,7 @@ Prior conversation history and tool progress.
- Responses `instructions` are prepended as a system message
- `tools` are injected into the system prompt
- `attachments` / `input_file` / inline files go into `ref_file_ids`
- current input file mainly takes effect on this path; the old `history_split` is kept only as a compatibility field
- current input file takes effect globally at the unified completion runtime entry
### 10.2 Claude Messages
@@ -374,7 +381,7 @@ Prior conversation history and tool progress.
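- tool prompt template or tool_choice constraint changes
- inline file upload / file reference collection rule changes
- changes to current input file trigger conditions, upload format, or `DS2API_HISTORY.txt` transcript structure
- changes to how the old `history_split` compatibility logic is read, ignored, or degraded
- changes to how the old `history_split` field is ignored / cleaned up
- completion payload field semantics changes
- changes to how Claude / Gemini reuse this unified semantics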
@@ -386,7 +393,8 @@ Prior conversation history and tool progress.
- `internal/promptcompat/tool_prompt.go`
- `internal/httpapi/openai/files/file_inline_upload.go`
- `internal/promptcompat/file_refs.go`
- `internal/httpapi/openai/history/history_split.go`
- `internal/httpapi/openai/history/current_input_file.go`
- `internal/completionruntime/nonstream.go`
- `internal/promptcompat/responses_input_normalize.go`
- `internal/httpapi/claude/standard_request.go`
- `internal/httpapi/claude/handler_utils.go`


@@ -54,12 +54,13 @@
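In the streaming path (Go / Node consistent):
- The DSML `<|DSML|tool_calls>` wrapper, DSML noise-tolerant forms based on fixed local tag names, trailing-pipe forms (such as `<|DSML|tool_calls|`), and the canonical `<tool_calls>` wrapper all enter structured capture
- The DSML `<|DSML|tool_calls>` wrapper, hyphenated forms (such as `<dsml-tool-calls>` / `<dsml-invoke>` / `<dsml-parameter>`), DSML noise-tolerant forms based on fixed local tag names, trailing-pipe forms (such as `<|DSML|tool_calls|`), and the canonical `<tool_calls>` wrapper all enter structured capture
- If the stream starts directly from invoke but a closing wrapper follows later, the Go streaming sieve also attempts recovery via the repair path for a missing opening wrapper
- Tool calls that have been successfully recognized do not flow back into ordinary text
- Blocks that do not match the new format are not executed and continue to pass through as verbatim text
- XML examples inside fenced code blocks (backticks `` ``` `` and tildes `~~~`) are always treated as ordinary text
- Nested fences (such as 4 backticks nesting 3 backticks) and fence protection inside CDATA are supported
- For long-text parameters such as `command` / `content`, if CDATA contains a Markdown-fenced DSML / XML example, then even when the example contains fragments that look like outer end tags, such as `]]></parameter>` / `</tool_calls>`, the content continues to be kept as parameter text until the real outer end tag located outside the fence
- If the model opens `<![CDATA[` but never closes it, the streaming scan stage still conservatively keeps buffering and does not mistake example XML inside CDATA for a real tool call; at the final parse / flush recovery stage, a narrow repair is applied to such loose CDATA to preserve, as far as possible, real tool calls that are already fully wrapped on the outside
- When text mentions a tag name (such as `<dsml|tool_calls>` or `<|DSML|tool_calls>` in Markdown inline code) and a real tool call follows immediately, the sieve skips the unparseable mention candidate and keeps matching the later real tool block; the mention neither causes the tool call to be lost nor truncates the body text after the mention
- Go-side SSE reading no longer uses `bufio.Scanner`'s fixed token limit; when a single `data:` line contains very long file-writing parameters, non-stream collection, stream parsing, and auto-continue passthrough should all keep the complete line before handing it to the tool parser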


@@ -0,0 +1,64 @@
package assistantturn
import (
"ds2api/internal/httpapi/openai/shared"
"ds2api/internal/sse"
)
type StreamEventType string
const (
StreamEventTextDelta StreamEventType = "text_delta"
StreamEventThinkingDelta StreamEventType = "thinking_delta"
StreamEventToolCall StreamEventType = "tool_call"
StreamEventDone StreamEventType = "done"
StreamEventError StreamEventType = "error"
StreamEventPing StreamEventType = "ping"
)
type StreamEvent struct {
Type StreamEventType
Text string
Thinking string
ToolCall any
Error *OutputError
Usage *Usage
}
type Accumulator struct {
inner shared.StreamAccumulator
}
type AccumulatorOptions struct {
ThinkingEnabled bool
SearchEnabled bool
StripReferenceMarkers bool
}
func NewAccumulator(opts AccumulatorOptions) *Accumulator {
return &Accumulator{
inner: shared.StreamAccumulator{
ThinkingEnabled: opts.ThinkingEnabled,
SearchEnabled: opts.SearchEnabled,
StripReferenceMarkers: opts.StripReferenceMarkers,
},
}
}
func (a *Accumulator) Apply(parsed sse.LineResult) shared.StreamAccumulatorResult {
if a == nil {
return shared.StreamAccumulatorResult{}
}
return a.inner.Apply(parsed)
}
func (a *Accumulator) Snapshot() (rawText, text, rawThinking, thinking, detectionThinking string) {
if a == nil {
return "", "", "", "", ""
}
return a.inner.RawText.String(),
a.inner.Text.String(),
a.inner.RawThinking.String(),
a.inner.Thinking.String(),
a.inner.ToolDetectionThinking.String()
}

View File

@@ -0,0 +1,285 @@
package assistantturn
import (
"net/http"
"strings"
"ds2api/internal/httpapi/openai/shared"
"ds2api/internal/promptcompat"
"ds2api/internal/sse"
"ds2api/internal/toolcall"
"ds2api/internal/util"
)
type StopReason string
const (
StopReasonStop StopReason = "stop"
StopReasonToolCalls StopReason = "tool_calls"
StopReasonContentFilter StopReason = "content_filter"
StopReasonError StopReason = "error"
)
type Usage struct {
InputTokens int
OutputTokens int
ReasoningTokens int
TotalTokens int
}
type OutputError struct {
Status int
Message string
Code string
}
type Turn struct {
Model string
Prompt string
RawText string
RawThinking string
DetectionThinking string
Text string
Thinking string
ToolCalls []toolcall.ParsedToolCall
ParsedToolCalls toolcall.ToolCallParseResult
CitationLinks map[int]string
ContentFilter bool
ResponseMessageID int
StopReason StopReason
Usage Usage
Error *OutputError
}
type FinalizeOptions struct {
AlreadyEmittedToolCalls bool
}
type FinalOutcome struct {
FinishReason string
Error *OutputError
Usage Usage
HasToolCalls bool
HasVisibleText bool
HasVisibleOutput bool
ShouldFail bool
}
type BuildOptions struct {
Model string
Prompt string
RefFileTokens int
SearchEnabled bool
StripReferenceMarkers bool
ToolNames []string
ToolsRaw any
ToolChoice promptcompat.ToolChoicePolicy
}
type StreamSnapshot struct {
RawText string
VisibleText string
RawThinking string
VisibleThinking string
DetectionThinking string
ContentFilter bool
CitationLinks map[int]string
ResponseMessageID int
AlreadyEmittedCalls bool
AdditionalToolCalls []toolcall.ParsedToolCall
AlreadyEmittedToolRaw bool
}
func BuildTurnFromCollected(result sse.CollectResult, opts BuildOptions) Turn {
thinking := shared.CleanVisibleOutput(result.Thinking, opts.StripReferenceMarkers)
text := shared.CleanVisibleOutput(result.Text, opts.StripReferenceMarkers)
if opts.SearchEnabled {
text = shared.ReplaceCitationMarkersWithLinks(text, result.CitationLinks)
}
parsed := shared.DetectAssistantToolCalls(result.Text, text, result.Thinking, result.ToolDetectionThinking, opts.ToolNames)
calls := toolcall.NormalizeParsedToolCallsForSchemas(parsed.Calls, opts.ToolsRaw)
parsed.Calls = calls
stopReason := StopReasonStop
if result.ContentFilter {
stopReason = StopReasonContentFilter
}
if len(calls) > 0 {
stopReason = StopReasonToolCalls
}
turn := Turn{
Model: opts.Model,
Prompt: opts.Prompt,
RawText: result.Text,
RawThinking: result.Thinking,
DetectionThinking: result.ToolDetectionThinking,
Text: text,
Thinking: thinking,
ToolCalls: calls,
ParsedToolCalls: parsed,
CitationLinks: result.CitationLinks,
ContentFilter: result.ContentFilter,
ResponseMessageID: result.ResponseMessageID,
StopReason: stopReason,
}
turn.Usage = BuildUsage(opts.Model, opts.Prompt, thinking, text, opts.RefFileTokens)
turn.Error = ValidateTurn(turn, opts.ToolChoice)
if turn.Error != nil {
turn.StopReason = StopReasonError
}
return turn
}
func BuildTurnFromStreamSnapshot(snapshot StreamSnapshot, opts BuildOptions) Turn {
thinking := shared.CleanVisibleOutput(snapshot.VisibleThinking, opts.StripReferenceMarkers)
text := shared.CleanVisibleOutput(snapshot.VisibleText, opts.StripReferenceMarkers)
if opts.SearchEnabled {
text = shared.ReplaceCitationMarkersWithLinks(text, snapshot.CitationLinks)
}
parsed := shared.DetectAssistantToolCalls(snapshot.RawText, text, snapshot.RawThinking, snapshot.DetectionThinking, opts.ToolNames)
calls := parsed.Calls
if len(calls) == 0 && len(snapshot.AdditionalToolCalls) > 0 {
calls = snapshot.AdditionalToolCalls
}
calls = toolcall.NormalizeParsedToolCallsForSchemas(calls, opts.ToolsRaw)
parsed.Calls = calls
stopReason := StopReasonStop
if snapshot.ContentFilter {
stopReason = StopReasonContentFilter
}
if len(calls) > 0 || snapshot.AlreadyEmittedCalls || snapshot.AlreadyEmittedToolRaw {
stopReason = StopReasonToolCalls
}
turn := Turn{
Model: opts.Model,
Prompt: opts.Prompt,
RawText: snapshot.RawText,
RawThinking: snapshot.RawThinking,
DetectionThinking: snapshot.DetectionThinking,
Text: text,
Thinking: thinking,
ToolCalls: calls,
ParsedToolCalls: parsed,
CitationLinks: snapshot.CitationLinks,
ContentFilter: snapshot.ContentFilter,
ResponseMessageID: snapshot.ResponseMessageID,
StopReason: stopReason,
}
turn.Usage = BuildUsage(opts.Model, opts.Prompt, thinking, text, opts.RefFileTokens)
if !snapshot.AlreadyEmittedCalls && !snapshot.AlreadyEmittedToolRaw {
turn.Error = ValidateTurn(turn, opts.ToolChoice)
}
if turn.Error != nil && len(calls) == 0 {
turn.StopReason = StopReasonError
}
return turn
}
func BuildUsage(model, prompt, thinking, text string, refFileTokens int) Usage {
inputTokens := util.CountPromptTokens(prompt, model) + refFileTokens
reasoningTokens := util.CountOutputTokens(thinking, model)
outputTokens := reasoningTokens + util.CountOutputTokens(text, model)
return Usage{
InputTokens: inputTokens,
OutputTokens: outputTokens,
ReasoningTokens: reasoningTokens,
TotalTokens: inputTokens + outputTokens,
}
}
func ValidateTurn(turn Turn, policy promptcompat.ToolChoicePolicy) *OutputError {
if policy.IsRequired() && len(turn.ToolCalls) == 0 {
return &OutputError{
Status: http.StatusUnprocessableEntity,
Message: "tool_choice requires at least one valid tool call.",
Code: "tool_choice_violation",
}
}
if len(turn.ToolCalls) > 0 {
return nil
}
if strings.TrimSpace(turn.Text) != "" {
return nil
}
status, message, code := UpstreamEmptyOutputDetail(turn.ContentFilter, turn.Text, turn.Thinking)
return &OutputError{Status: status, Message: message, Code: code}
}
func UpstreamEmptyOutputDetail(contentFilter bool, text, thinking string) (int, string, string) {
_ = text
if contentFilter {
return http.StatusBadRequest, "Upstream content filtered the response and returned no output.", "content_filter"
}
if strings.TrimSpace(thinking) != "" {
return http.StatusTooManyRequests, "Upstream account hit a rate limit and returned reasoning without visible output.", "upstream_empty_output"
}
return http.StatusTooManyRequests, "Upstream account hit a rate limit and returned empty output.", "upstream_empty_output"
}
// ShouldRetryEmptyOutput returns true when the turn produced no visible text
// and has no tool calls or content filter. This includes thinking-only responses,
// where the model returned reasoning but no answer — a retry may yield text.
func ShouldRetryEmptyOutput(turn Turn, attempts, maxAttempts int) bool {
return attempts < maxAttempts &&
!turn.ContentFilter &&
len(turn.ToolCalls) == 0 &&
strings.TrimSpace(turn.Text) == ""
}
func FinalizeTurn(turn Turn, opts FinalizeOptions) FinalOutcome {
hasToolCalls := len(turn.ToolCalls) > 0 || opts.AlreadyEmittedToolCalls
hasVisibleText := strings.TrimSpace(turn.Text) != ""
hasVisibleThinking := strings.TrimSpace(turn.Thinking) != ""
err := turn.Error
if hasToolCalls {
err = nil
}
finishReason := FinishReason(turn)
if hasToolCalls {
finishReason = "tool_calls"
}
return FinalOutcome{
FinishReason: finishReason,
Error: err,
Usage: turn.Usage,
HasToolCalls: hasToolCalls,
HasVisibleText: hasVisibleText,
HasVisibleOutput: hasVisibleText || hasVisibleThinking || hasToolCalls,
ShouldFail: err != nil,
}
}
func OpenAIChatUsage(turn Turn) map[string]any {
return map[string]any{
"prompt_tokens": turn.Usage.InputTokens,
"completion_tokens": turn.Usage.OutputTokens,
"total_tokens": turn.Usage.TotalTokens,
"completion_tokens_details": map[string]any{
"reasoning_tokens": turn.Usage.ReasoningTokens,
},
}
}
func OpenAIResponsesUsage(turn Turn) map[string]any {
return map[string]any{
"input_tokens": turn.Usage.InputTokens,
"output_tokens": turn.Usage.OutputTokens,
"total_tokens": turn.Usage.TotalTokens,
}
}
func FinishReason(turn Turn) string {
switch turn.StopReason {
case StopReasonToolCalls:
return "tool_calls"
case StopReasonContentFilter:
return "content_filter"
default:
return "stop"
}
}
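
The order of checks in `ValidateTurn` and the retry condition in `ShouldRetryEmptyOutput` can be condensed into a standalone sketch. The `miniTurn` type and the `classify` / `shouldRetry` functions below are simplified stand-ins invented for illustration; they do not reproduce status codes or messages and are not part of the package.

```go
package main

import (
	"fmt"
	"strings"
)

// miniTurn is a hypothetical, trimmed-down stand-in for assistantturn.Turn.
type miniTurn struct {
	Text          string
	Thinking      string
	ToolCallCount int
	ContentFilter bool
}

// classify follows the same precedence as ValidateTurn: the tool_choice
// requirement first, then tool calls, then visible text, then the
// empty-output cases. It returns an error code, or "" for a valid turn.
func classify(t miniTurn, toolRequired bool) string {
	if toolRequired && t.ToolCallCount == 0 {
		return "tool_choice_violation"
	}
	if t.ToolCallCount > 0 || strings.TrimSpace(t.Text) != "" {
		return ""
	}
	if t.ContentFilter {
		return "content_filter"
	}
	// Thinking-only output lands here as well: reasoning is not visible text.
	return "upstream_empty_output"
}

// shouldRetry follows ShouldRetryEmptyOutput: retry only while attempts
// remain and the turn has no text, no tool calls, and no content filter.
func shouldRetry(t miniTurn, attempts, maxAttempts int) bool {
	return attempts < maxAttempts &&
		!t.ContentFilter &&
		t.ToolCallCount == 0 &&
		strings.TrimSpace(t.Text) == ""
}

func main() {
	thinkingOnly := miniTurn{Thinking: "reasoning"}
	fmt.Println(classify(thinkingOnly, false), shouldRetry(thinkingOnly, 0, 1))

	filtered := miniTurn{ContentFilter: true}
	fmt.Println(classify(filtered, false), shouldRetry(filtered, 0, 1))

	fmt.Println(classify(miniTurn{Text: "hello"}, true))
}
```

A thinking-only turn is both an error and retryable, while a content-filtered turn is an error that is never retried.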

View File

@@ -0,0 +1,127 @@
package assistantturn
import (
"testing"
"ds2api/internal/promptcompat"
"ds2api/internal/sse"
)
func TestBuildTurnFromCollectedTextCitation(t *testing.T) {
turn := BuildTurnFromCollected(sse.CollectResult{
Text: "See [citation:1]",
CitationLinks: map[int]string{1: "https://example.com"},
}, BuildOptions{Model: "deepseek-v4-flash", Prompt: "prompt", SearchEnabled: true, StripReferenceMarkers: true})
if turn.Text != "See [1](https://example.com)" {
t.Fatalf("text mismatch: %q", turn.Text)
}
if turn.StopReason != StopReasonStop {
t.Fatalf("stop reason mismatch: %q", turn.StopReason)
}
if turn.Error != nil {
t.Fatalf("unexpected error: %#v", turn.Error)
}
}
func TestBuildTurnFromCollectedToolCall(t *testing.T) {
turn := BuildTurnFromCollected(sse.CollectResult{
Text: `<tool_calls><invoke name="Write"><parameter name="content">{"x":1}</parameter></invoke></tool_calls>`,
}, BuildOptions{
ToolNames: []string{"Write"},
ToolsRaw: []any{map[string]any{
"name": "Write",
"input_schema": map[string]any{
"type": "object",
"properties": map[string]any{
"content": map[string]any{"type": "string"},
},
},
}},
})
if len(turn.ToolCalls) != 1 {
t.Fatalf("expected one tool call, got %d", len(turn.ToolCalls))
}
if turn.StopReason != StopReasonToolCalls {
t.Fatalf("stop reason mismatch: %q", turn.StopReason)
}
if _, ok := turn.ToolCalls[0].Input["content"].(string); !ok {
t.Fatalf("expected content coerced to string, got %#v", turn.ToolCalls[0].Input["content"])
}
}
func TestBuildTurnFromCollectedThinkingOnlyIsEmptyOutput(t *testing.T) {
turn := BuildTurnFromCollected(sse.CollectResult{Thinking: "hidden"}, BuildOptions{})
if turn.Error == nil || turn.Error.Code != "upstream_empty_output" {
t.Fatalf("expected empty output error, got %#v", turn.Error)
}
}
func TestBuildTurnFromCollectedToolChoiceRequired(t *testing.T) {
turn := BuildTurnFromCollected(sse.CollectResult{Text: "hello"}, BuildOptions{
ToolChoice: promptcompat.ToolChoicePolicy{Mode: promptcompat.ToolChoiceRequired},
})
if turn.Error == nil || turn.Error.Code != "tool_choice_violation" {
t.Fatalf("expected tool choice violation, got %#v", turn.Error)
}
}
func TestBuildTurnFromStreamSnapshotUsesVisibleTextAndRawToolDetection(t *testing.T) {
turn := BuildTurnFromStreamSnapshot(StreamSnapshot{
RawText: `<tool_calls><invoke name="Write"><parameter name="content">{"x":1}</parameter></invoke></tool_calls>`,
VisibleText: "",
}, BuildOptions{
ToolNames: []string{"Write"},
ToolsRaw: []any{map[string]any{
"name": "Write",
"schema": map[string]any{
"type": "object",
"properties": map[string]any{
"content": map[string]any{"type": "string"},
},
},
}},
})
if len(turn.ToolCalls) != 1 {
t.Fatalf("expected stream snapshot tool call, got %d", len(turn.ToolCalls))
}
if _, ok := turn.ToolCalls[0].Input["content"].(string); !ok {
t.Fatalf("expected stream snapshot schema coercion, got %#v", turn.ToolCalls[0].Input["content"])
}
}
func TestBuildTurnFromStreamSnapshotAlreadyEmittedToolAvoidsEmptyError(t *testing.T) {
turn := BuildTurnFromStreamSnapshot(StreamSnapshot{AlreadyEmittedCalls: true}, BuildOptions{})
if turn.Error != nil {
t.Fatalf("unexpected empty-output error after emitted tool call: %#v", turn.Error)
}
if turn.StopReason != StopReasonToolCalls {
t.Fatalf("stop reason mismatch: %q", turn.StopReason)
}
}
func TestFinalizeTurnStopOutcome(t *testing.T) {
turn := BuildTurnFromCollected(sse.CollectResult{Text: "hello"}, BuildOptions{})
outcome := FinalizeTurn(turn, FinalizeOptions{})
if outcome.ShouldFail {
t.Fatalf("unexpected failure: %#v", outcome.Error)
}
if outcome.FinishReason != "stop" || !outcome.HasVisibleText || !outcome.HasVisibleOutput {
t.Fatalf("unexpected outcome: %#v", outcome)
}
}
func TestFinalizeTurnToolCallsOutcome(t *testing.T) {
turn := BuildTurnFromStreamSnapshot(StreamSnapshot{AlreadyEmittedCalls: true}, BuildOptions{})
outcome := FinalizeTurn(turn, FinalizeOptions{AlreadyEmittedToolCalls: true})
if outcome.ShouldFail || outcome.FinishReason != "tool_calls" || !outcome.HasToolCalls {
t.Fatalf("unexpected tool outcome: %#v", outcome)
}
}
func TestFinalizeTurnContentFilterOutcome(t *testing.T) {
turn := BuildTurnFromCollected(sse.CollectResult{ContentFilter: true}, BuildOptions{})
outcome := FinalizeTurn(turn, FinalizeOptions{})
if !outcome.ShouldFail || outcome.Error == nil || outcome.Error.Code != "content_filter" {
t.Fatalf("expected content filter failure, got %#v", outcome)
}
}

View File

@@ -0,0 +1,193 @@
package completionruntime
import (
"context"
"fmt"
"io"
"net/http"
"strings"
"ds2api/internal/assistantturn"
"ds2api/internal/auth"
"ds2api/internal/config"
dsclient "ds2api/internal/deepseek/client"
"ds2api/internal/httpapi/openai/history"
"ds2api/internal/httpapi/openai/shared"
"ds2api/internal/promptcompat"
"ds2api/internal/sse"
)
type DeepSeekCaller interface {
CreateSession(ctx context.Context, a *auth.RequestAuth, maxAttempts int) (string, error)
GetPow(ctx context.Context, a *auth.RequestAuth, maxAttempts int) (string, error)
UploadFile(ctx context.Context, a *auth.RequestAuth, req dsclient.UploadFileRequest, maxAttempts int) (*dsclient.UploadFileResult, error)
CallCompletion(ctx context.Context, a *auth.RequestAuth, payload map[string]any, powResp string, maxAttempts int) (*http.Response, error)
}
type Options struct {
StripReferenceMarkers bool
MaxAttempts int
RetryEnabled bool
RetryMaxAttempts int
CurrentInputFile history.CurrentInputConfigReader
}
type NonStreamResult struct {
SessionID string
Payload map[string]any
Turn assistantturn.Turn
Attempts int
}
type StartResult struct {
SessionID string
Payload map[string]any
Pow string
Response *http.Response
Request promptcompat.StandardRequest
}
func StartCompletion(ctx context.Context, ds DeepSeekCaller, a *auth.RequestAuth, stdReq promptcompat.StandardRequest, opts Options) (StartResult, *assistantturn.OutputError) {
maxAttempts := opts.MaxAttempts
if maxAttempts <= 0 {
maxAttempts = 3
}
var prepErr *assistantturn.OutputError
stdReq, prepErr = prepareCurrentInputFile(ctx, ds, a, stdReq, opts)
if prepErr != nil {
return StartResult{Request: stdReq}, prepErr
}
sessionID, err := ds.CreateSession(ctx, a, maxAttempts)
if err != nil {
return StartResult{Request: stdReq}, authOutputError(a)
}
pow, err := ds.GetPow(ctx, a, maxAttempts)
if err != nil {
return StartResult{SessionID: sessionID, Request: stdReq}, &assistantturn.OutputError{Status: http.StatusUnauthorized, Message: "Failed to get PoW (invalid token or unknown error).", Code: "error"}
}
payload := stdReq.CompletionPayload(sessionID)
resp, err := ds.CallCompletion(ctx, a, payload, pow, maxAttempts)
if err != nil {
return StartResult{SessionID: sessionID, Payload: payload, Pow: pow, Request: stdReq}, &assistantturn.OutputError{Status: http.StatusInternalServerError, Message: "Failed to get completion.", Code: "error"}
}
return StartResult{SessionID: sessionID, Payload: payload, Pow: pow, Response: resp, Request: stdReq}, nil
}
func prepareCurrentInputFile(ctx context.Context, ds DeepSeekCaller, a *auth.RequestAuth, stdReq promptcompat.StandardRequest, opts Options) (promptcompat.StandardRequest, *assistantturn.OutputError) {
if opts.CurrentInputFile == nil || stdReq.CurrentInputFileApplied {
return stdReq, nil
}
out, err := (history.Service{Store: opts.CurrentInputFile, DS: ds}).ApplyCurrentInputFile(ctx, a, stdReq)
if err != nil {
status, message := history.MapError(err)
return out, &assistantturn.OutputError{Status: status, Message: message, Code: "error"}
}
return out, nil
}
func ExecuteNonStreamWithRetry(ctx context.Context, ds DeepSeekCaller, a *auth.RequestAuth, stdReq promptcompat.StandardRequest, opts Options) (NonStreamResult, *assistantturn.OutputError) {
start, startErr := StartCompletion(ctx, ds, a, stdReq, opts)
if startErr != nil {
return NonStreamResult{SessionID: start.SessionID, Payload: start.Payload}, startErr
}
stdReq = start.Request
maxAttempts := opts.MaxAttempts
if maxAttempts <= 0 {
maxAttempts = 3
}
sessionID := start.SessionID
payload := start.Payload
pow := start.Pow
attempts := 0
currentResp := start.Response
usagePrompt := stdReq.PromptTokenText
accumulatedThinking := ""
accumulatedRawThinking := ""
accumulatedToolDetectionThinking := ""
for {
turn, outErr := collectAttempt(currentResp, stdReq, usagePrompt, opts)
if outErr != nil {
return NonStreamResult{SessionID: sessionID, Payload: payload, Attempts: attempts}, outErr
}
accumulatedThinking += sse.TrimContinuationOverlap(accumulatedThinking, turn.Thinking)
accumulatedRawThinking += sse.TrimContinuationOverlap(accumulatedRawThinking, turn.RawThinking)
accumulatedToolDetectionThinking += sse.TrimContinuationOverlap(accumulatedToolDetectionThinking, turn.DetectionThinking)
turn.Thinking = accumulatedThinking
turn.RawThinking = accumulatedRawThinking
turn.DetectionThinking = accumulatedToolDetectionThinking
turn = assistantturn.BuildTurnFromCollected(sse.CollectResult{
Text: turn.RawText,
Thinking: turn.RawThinking,
ToolDetectionThinking: turn.DetectionThinking,
ContentFilter: turn.ContentFilter,
CitationLinks: turn.CitationLinks,
ResponseMessageID: turn.ResponseMessageID,
}, buildOptions(stdReq, usagePrompt, opts))
retryMax := opts.RetryMaxAttempts
if retryMax <= 0 {
retryMax = shared.EmptyOutputRetryMaxAttempts()
}
if !opts.RetryEnabled || !assistantturn.ShouldRetryEmptyOutput(turn, attempts, retryMax) {
return NonStreamResult{SessionID: sessionID, Payload: payload, Turn: turn, Attempts: attempts}, turn.Error
}
attempts++
config.Logger.Info("[completion_runtime_empty_retry] attempting synthetic retry", "surface", stdReq.Surface, "stream", false, "retry_attempt", attempts, "parent_message_id", turn.ResponseMessageID)
retryPow, powErr := ds.GetPow(ctx, a, maxAttempts)
if powErr != nil {
config.Logger.Warn("[completion_runtime_empty_retry] retry PoW fetch failed, falling back to original PoW", "surface", stdReq.Surface, "retry_attempt", attempts, "error", powErr)
retryPow = pow
}
retryPayload := shared.ClonePayloadForEmptyOutputRetry(payload, turn.ResponseMessageID)
nextResp, err := ds.CallCompletion(ctx, a, retryPayload, retryPow, maxAttempts)
if err != nil {
return NonStreamResult{SessionID: sessionID, Payload: payload, Turn: turn, Attempts: attempts}, &assistantturn.OutputError{Status: http.StatusInternalServerError, Message: "Failed to get completion.", Code: "error"}
}
usagePrompt = shared.UsagePromptWithEmptyOutputRetry(usagePrompt, attempts)
currentResp = nextResp
}
}
func collectAttempt(resp *http.Response, stdReq promptcompat.StandardRequest, usagePrompt string, opts Options) (assistantturn.Turn, *assistantturn.OutputError) {
defer func() {
if err := resp.Body.Close(); err != nil {
config.Logger.Warn("[completion_runtime] response body close failed", "surface", stdReq.Surface, "error", err)
}
}()
if resp.StatusCode != http.StatusOK {
body, _ := io.ReadAll(resp.Body)
message := strings.TrimSpace(string(body))
if message == "" {
message = http.StatusText(resp.StatusCode)
}
return assistantturn.Turn{}, &assistantturn.OutputError{Status: resp.StatusCode, Message: message, Code: "error"}
}
result := sse.CollectStream(resp, stdReq.Thinking, false)
return assistantturn.BuildTurnFromCollected(result, buildOptions(stdReq, usagePrompt, opts)), nil
}
func buildOptions(stdReq promptcompat.StandardRequest, prompt string, opts Options) assistantturn.BuildOptions {
return assistantturn.BuildOptions{
Model: stdReq.ResponseModel,
Prompt: prompt,
RefFileTokens: stdReq.RefFileTokens,
SearchEnabled: stdReq.Search,
StripReferenceMarkers: opts.StripReferenceMarkers,
ToolNames: stdReq.ToolNames,
ToolsRaw: stdReq.ToolsRaw,
ToolChoice: stdReq.ToolChoice,
}
}
func authOutputError(a *auth.RequestAuth) *assistantturn.OutputError {
if a != nil && a.UseConfigToken {
return &assistantturn.OutputError{Status: http.StatusUnauthorized, Message: "Account token is invalid. Please re-login the account in admin.", Code: "error"}
}
return &assistantturn.OutputError{Status: http.StatusUnauthorized, Message: "Invalid token. If this should be a DS2API key, add it to config.keys first.", Code: "error"}
}
func Errorf(status int, format string, args ...any) *assistantturn.OutputError {
return &assistantturn.OutputError{Status: status, Message: fmt.Sprintf(format, args...), Code: "error"}
}
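
The empty-output retry loop in `ExecuteNonStreamWithRetry` can be illustrated in isolation. This is a sketch under assumptions: `reply`, `completer`, `cloneForRetry`, and `runWithEmptyRetry` are hypothetical names, the clone is a shallow map copy, and only the `parent_message_id` field is modeled. Any prompt changes made by `shared.ClonePayloadForEmptyOutputRetry`, PoW refresh, and thinking accumulation are left out.

```go
package main

import "fmt"

// reply is a hypothetical, simplified view of one upstream response.
type reply struct {
	MessageID int
	Text      string
}

// completer stands in for the CallCompletion dependency.
type completer func(payload map[string]any) reply

// cloneForRetry copies the original payload and points it at the empty
// reply, so the follow-up request continues from that message.
func cloneForRetry(payload map[string]any, parentID int) map[string]any {
	out := make(map[string]any, len(payload)+1)
	for k, v := range payload {
		out[k] = v
	}
	out["parent_message_id"] = parentID
	return out
}

// runWithEmptyRetry calls once, then retries up to maxRetries times while
// the reply text is empty. It returns the final reply and the retry count.
func runWithEmptyRetry(call completer, payload map[string]any, maxRetries int) (reply, int) {
	r := call(payload)
	retries := 0
	for r.Text == "" && retries < maxRetries {
		retries++
		r = call(cloneForRetry(payload, r.MessageID))
	}
	return r, retries
}

func main() {
	// The first reply is empty, the second carries text.
	queue := []reply{{MessageID: 77}, {MessageID: 78, Text: "ok"}}
	var seen []map[string]any
	call := func(p map[string]any) reply {
		seen = append(seen, p)
		r := queue[0]
		queue = queue[1:]
		return r
	}
	r, retries := runWithEmptyRetry(call, map[string]any{"prompt": "hi"}, 2)
	fmt.Println(r.Text, retries, seen[1]["parent_message_id"])
}
```

Each retry is cloned from the original payload rather than from the previous retry, which keeps the request body stable and changes only the parent pointer.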

View File

@@ -0,0 +1,174 @@
package completionruntime
import (
"context"
"io"
"net/http"
"strings"
"testing"
"ds2api/internal/auth"
dsclient "ds2api/internal/deepseek/client"
"ds2api/internal/promptcompat"
)
type fakeDeepSeekCaller struct {
responses []*http.Response
payloads []map[string]any
uploads []dsclient.UploadFileRequest
}
type currentInputRuntimeConfig struct{}
func (currentInputRuntimeConfig) CurrentInputFileEnabled() bool { return true }
func (currentInputRuntimeConfig) CurrentInputFileMinChars() int { return 0 }
func (f *fakeDeepSeekCaller) CreateSession(context.Context, *auth.RequestAuth, int) (string, error) {
return "session-1", nil
}
func (f *fakeDeepSeekCaller) GetPow(context.Context, *auth.RequestAuth, int) (string, error) {
return "pow", nil
}
func (f *fakeDeepSeekCaller) UploadFile(_ context.Context, _ *auth.RequestAuth, req dsclient.UploadFileRequest, _ int) (*dsclient.UploadFileResult, error) {
f.uploads = append(f.uploads, req)
return &dsclient.UploadFileResult{ID: "file-runtime-1"}, nil
}
func (f *fakeDeepSeekCaller) CallCompletion(_ context.Context, _ *auth.RequestAuth, payload map[string]any, _ string, _ int) (*http.Response, error) {
f.payloads = append(f.payloads, payload)
if len(f.responses) == 0 {
return sseHTTPResponse(http.StatusOK, `data: {"p":"response/content","v":"fallback"}`), nil
}
resp := f.responses[0]
f.responses = f.responses[1:]
return resp, nil
}
func TestExecuteNonStreamWithRetryBuildsCanonicalTurn(t *testing.T) {
ds := &fakeDeepSeekCaller{responses: []*http.Response{sseHTTPResponse(
http.StatusOK,
`data: {"response_message_id":42,"p":"response/content","v":"<tool_calls><invoke name=\"Write\"><parameter name=\"content\">{\"x\":1}</parameter></invoke></tool_calls>"}`,
)}}
stdReq := promptcompat.StandardRequest{
Surface: "test",
ResponseModel: "deepseek-v4-flash",
PromptTokenText: "prompt",
FinalPrompt: "final prompt",
ToolNames: []string{"Write"},
ToolsRaw: []any{map[string]any{
"name": "Write",
"input_schema": map[string]any{
"type": "object",
"properties": map[string]any{
"content": map[string]any{"type": "string"},
},
},
}},
}
result, outErr := ExecuteNonStreamWithRetry(context.Background(), ds, &auth.RequestAuth{}, stdReq, Options{})
if outErr != nil {
t.Fatalf("unexpected output error: %#v", outErr)
}
if result.SessionID != "session-1" {
t.Fatalf("session mismatch: %q", result.SessionID)
}
if got := result.Turn.ResponseMessageID; got != 42 {
t.Fatalf("response message id mismatch: %d", got)
}
if len(result.Turn.ToolCalls) != 1 {
t.Fatalf("expected one tool call, got %d", len(result.Turn.ToolCalls))
}
if _, ok := result.Turn.ToolCalls[0].Input["content"].(string); !ok {
t.Fatalf("expected schema-normalized string argument, got %#v", result.Turn.ToolCalls[0].Input["content"])
}
if result.Turn.Usage.InputTokens == 0 || result.Turn.Usage.TotalTokens == 0 {
t.Fatalf("expected usage to be populated, got %#v", result.Turn.Usage)
}
}
func TestExecuteNonStreamWithRetryUsesParentMessageForEmptyRetry(t *testing.T) {
ds := &fakeDeepSeekCaller{responses: []*http.Response{
sseHTTPResponse(http.StatusOK, `data: {"response_message_id":77,"p":"response/status","v":"FINISHED"}`),
sseHTTPResponse(http.StatusOK, `data: {"response_message_id":78,"p":"response/content","v":"ok"}`),
}}
stdReq := promptcompat.StandardRequest{
Surface: "test",
ResponseModel: "deepseek-v4-flash",
PromptTokenText: "prompt",
FinalPrompt: "final prompt",
}
result, outErr := ExecuteNonStreamWithRetry(context.Background(), ds, &auth.RequestAuth{}, stdReq, Options{RetryEnabled: true})
if outErr != nil {
t.Fatalf("unexpected output error: %#v", outErr)
}
if result.Attempts != 1 {
t.Fatalf("expected one retry, got %d", result.Attempts)
}
if len(ds.payloads) != 2 {
t.Fatalf("expected two completion calls, got %d", len(ds.payloads))
}
if got := ds.payloads[1]["parent_message_id"]; got != 77 {
t.Fatalf("retry parent_message_id mismatch: %#v", got)
}
if result.Turn.Text != "ok" {
t.Fatalf("retry text mismatch: %q", result.Turn.Text)
}
}
func TestStartCompletionAppliesCurrentInputFileGlobally(t *testing.T) {
ds := &fakeDeepSeekCaller{responses: []*http.Response{sseHTTPResponse(http.StatusOK, `data: {"p":"response/content","v":"ok"}`)}}
stdReq := promptcompat.StandardRequest{
Surface: "test_adapter",
RequestedModel: "deepseek-v4-flash",
ResolvedModel: "deepseek-v4-flash",
ResponseModel: "deepseek-v4-flash",
PromptTokenText: "first user turn",
FinalPrompt: "first user turn",
Messages: []any{
map[string]any{"role": "user", "content": "first user turn"},
},
}
start, outErr := StartCompletion(context.Background(), ds, &auth.RequestAuth{DeepSeekToken: "token"}, stdReq, Options{
CurrentInputFile: currentInputRuntimeConfig{},
})
if outErr != nil {
t.Fatalf("unexpected output error: %#v", outErr)
}
if len(ds.uploads) != 1 {
t.Fatalf("expected current input upload, got %d", len(ds.uploads))
}
if got := ds.uploads[0].Filename; got != "DS2API_HISTORY.txt" {
t.Fatalf("upload filename=%q want DS2API_HISTORY.txt", got)
}
if len(ds.payloads) != 1 {
t.Fatalf("expected one completion payload, got %d", len(ds.payloads))
}
refIDs, _ := ds.payloads[0]["ref_file_ids"].([]any)
if len(refIDs) != 1 || refIDs[0] != "file-runtime-1" {
t.Fatalf("expected uploaded file id in ref_file_ids, got %#v", ds.payloads[0]["ref_file_ids"])
}
prompt, _ := ds.payloads[0]["prompt"].(string)
if !strings.Contains(prompt, "Continue from the latest state in the attached DS2API_HISTORY.txt context.") {
t.Fatalf("expected continuation prompt, got %q", prompt)
}
if !start.Request.CurrentInputFileApplied || !strings.Contains(start.Request.PromptTokenText, "# DS2API_HISTORY.txt") {
t.Fatalf("expected prepared request to carry current input file state, got %#v", start.Request)
}
}
func sseHTTPResponse(status int, lines ...string) *http.Response {
body := strings.Join(lines, "\n")
if !strings.HasSuffix(body, "\n") {
body += "\n"
}
return &http.Response{
StatusCode: status,
Header: make(http.Header),
Body: io.NopCloser(strings.NewReader(body)),
}
}

View File

@@ -35,9 +35,6 @@ func (c Config) MarshalJSON() ([]byte, error) {
if c.Runtime.AccountMaxInflight > 0 || c.Runtime.AccountMaxQueue > 0 || c.Runtime.GlobalMaxInflight > 0 || c.Runtime.TokenRefreshIntervalHours > 0 {
m["runtime"] = c.Runtime
}
-if c.Compat.WideInputStrictOutput != nil || c.Compat.StripReferenceMarkers != nil {
-m["compat"] = c.Compat
-}
if c.Responses.StoreTTLSeconds > 0 {
m["responses"] = c.Responses
}
@@ -45,9 +42,6 @@ func (c Config) MarshalJSON() ([]byte, error) {
m["embeddings"] = c.Embeddings
}
m["auto_delete"] = c.AutoDelete
-if c.HistorySplit.Enabled != nil || c.HistorySplit.TriggerAfterTurns != nil {
-m["history_split"] = c.HistorySplit
-}
if c.CurrentInputFile.Enabled != nil || c.CurrentInputFile.MinChars != 0 {
m["current_input_file"] = c.CurrentInputFile
}
@@ -103,8 +97,9 @@ func (c *Config) UnmarshalJSON(b []byte) error {
return fmt.Errorf("invalid field %q: %w", k, err)
}
case "compat":
-if err := json.Unmarshal(v, &c.Compat); err != nil {
-return fmt.Errorf("invalid field %q: %w", k, err)
+// Removed field ignored instead of persisted.
+if Logger != nil {
+Logger.Warn("config key \"compat\" is deprecated and ignored; remove it from your configuration")
}
case "toolcall":
// Legacy field ignored. Toolcall policy is fixed and no longer configurable.
@@ -121,9 +116,7 @@ func (c *Config) UnmarshalJSON(b []byte) error {
return fmt.Errorf("invalid field %q: %w", k, err)
}
case "history_split":
-if err := json.Unmarshal(v, &c.HistorySplit); err != nil {
-return fmt.Errorf("invalid field %q: %w", k, err)
-}
+// Removed legacy split field is ignored instead of persisted.
case "current_input_file":
if err := json.Unmarshal(v, &c.CurrentInputFile); err != nil {
return fmt.Errorf("invalid field %q: %w", k, err)
@@ -160,17 +153,9 @@ func (c Config) Clone() Config {
ModelAliases: cloneStringMap(c.ModelAliases),
Admin: c.Admin,
Runtime: c.Runtime,
-Compat: CompatConfig{
-WideInputStrictOutput: cloneBoolPtr(c.Compat.WideInputStrictOutput),
-StripReferenceMarkers: cloneBoolPtr(c.Compat.StripReferenceMarkers),
-},
-Responses: c.Responses,
-Embeddings: c.Embeddings,
-AutoDelete: c.AutoDelete,
-HistorySplit: HistorySplitConfig{
-Enabled: cloneBoolPtr(c.HistorySplit.Enabled),
-TriggerAfterTurns: cloneIntPtr(c.HistorySplit.TriggerAfterTurns),
-},
+Responses: c.Responses,
+Embeddings: c.Embeddings,
+AutoDelete: c.AutoDelete,
CurrentInputFile: CurrentInputFileConfig{
Enabled: cloneBoolPtr(c.CurrentInputFile.Enabled),
MinChars: c.CurrentInputFile.MinChars,
@@ -208,14 +193,6 @@ func cloneBoolPtr(in *bool) *bool {
return &v
}
-func cloneIntPtr(in *int) *int {
-if in == nil {
-return nil
-}
-v := *in
-return &v
-}
func parseConfigString(raw string) (Config, error) {
var cfg Config
candidates := []string{raw}

View File

@@ -15,11 +15,9 @@ type Config struct {
ModelAliases map[string]string `json:"model_aliases,omitempty"`
Admin AdminConfig `json:"admin,omitempty"`
Runtime RuntimeConfig `json:"runtime,omitempty"`
-Compat CompatConfig `json:"compat,omitempty"`
Responses ResponsesConfig `json:"responses,omitempty"`
Embeddings EmbeddingsConfig `json:"embeddings,omitempty"`
AutoDelete AutoDeleteConfig `json:"auto_delete"`
-HistorySplit HistorySplitConfig `json:"history_split"`
CurrentInputFile CurrentInputFileConfig `json:"current_input_file,omitempty"`
ThinkingInjection ThinkingInjectionConfig `json:"thinking_injection,omitempty"`
VercelSyncHash string `json:"_vercel_sync_hash,omitempty"`
@@ -142,11 +140,6 @@ func (c *Config) normalizeModelAliases() {
}
}
-type CompatConfig struct {
-WideInputStrictOutput *bool `json:"wide_input_strict_output,omitempty"`
-StripReferenceMarkers *bool `json:"strip_reference_markers,omitempty"`
-}
type AdminConfig struct {
PasswordHash string `json:"password_hash,omitempty"`
JWTExpireHours int `json:"jwt_expire_hours,omitempty"`
@@ -173,11 +166,6 @@ type AutoDeleteConfig struct {
Sessions bool `json:"sessions,omitempty"`
}
-type HistorySplitConfig struct {
-Enabled *bool `json:"enabled,omitempty"`
-TriggerAfterTurns *int `json:"trigger_after_turns,omitempty"`
-}
type CurrentInputFileConfig struct {
Enabled *bool `json:"enabled,omitempty"`
MinChars int `json:"min_chars,omitempty"`

View File

@@ -163,8 +163,6 @@ func TestLowerFunction(t *testing.T) {
// ─── Config.MarshalJSON / UnmarshalJSON roundtrip ────────────────────
func TestConfigJSONRoundtrip(t *testing.T) {
-trueVal := true
-falseVal := false
cfg := Config{
Keys: []string{"key1", "key2"},
Accounts: []Account{{Email: "user@example.com", Password: "pass", Token: "tok"}},
@@ -172,17 +170,9 @@ func TestConfigJSONRoundtrip(t *testing.T) {
AutoDelete: AutoDeleteConfig{
Mode: "single",
},
-HistorySplit: HistorySplitConfig{
-Enabled: &trueVal,
-TriggerAfterTurns: func() *int { v := 2; return &v }(),
-},
Runtime: RuntimeConfig{
TokenRefreshIntervalHours: 12,
},
Compat: CompatConfig{
WideInputStrictOutput: &trueVal,
StripReferenceMarkers: &falseVal,
},
VercelSyncHash: "hash123",
VercelSyncTime: 1234567890,
AdditionalFields: map[string]any{
@@ -215,18 +205,6 @@ func TestConfigJSONRoundtrip(t *testing.T) {
if decoded.AutoDelete.Mode != "single" {
t.Fatalf("unexpected auto delete mode: %#v", decoded.AutoDelete.Mode)
}
if decoded.HistorySplit.Enabled == nil || !*decoded.HistorySplit.Enabled {
t.Fatalf("unexpected history split enabled: %#v", decoded.HistorySplit.Enabled)
}
if decoded.HistorySplit.TriggerAfterTurns == nil || *decoded.HistorySplit.TriggerAfterTurns != 2 {
t.Fatalf("unexpected history split trigger_after_turns: %#v", decoded.HistorySplit.TriggerAfterTurns)
}
if decoded.Compat.WideInputStrictOutput == nil || !*decoded.Compat.WideInputStrictOutput {
t.Fatalf("unexpected compat wide_input_strict_output: %#v", decoded.Compat.WideInputStrictOutput)
}
if decoded.Compat.StripReferenceMarkers == nil || *decoded.Compat.StripReferenceMarkers {
t.Fatalf("unexpected compat strip_reference_markers: %#v", decoded.Compat.StripReferenceMarkers)
}
if decoded.VercelSyncHash != "hash123" {
t.Fatalf("unexpected vercel sync hash: %q", decoded.VercelSyncHash)
}
@@ -290,23 +268,31 @@ func TestConfigUnmarshalJSONIgnoresRemovedLegacyModelMappings(t *testing.T) {
}
}
func TestConfigUnmarshalJSONIgnoresRemovedHistorySplit(t *testing.T) {
raw := `{"keys":["k1"],"history_split":{"enabled":true,"trigger_after_turns":2}}`
var cfg Config
if err := json.Unmarshal([]byte(raw), &cfg); err != nil {
t.Fatalf("unmarshal error: %v", err)
}
if _, ok := cfg.AdditionalFields["history_split"]; ok {
t.Fatalf("expected removed legacy field not to persist in additional fields: %#v", cfg.AdditionalFields)
}
out, err := json.Marshal(cfg)
if err != nil {
t.Fatalf("marshal error: %v", err)
}
if strings.Contains(string(out), "history_split") {
t.Fatalf("expected removed history_split field not to marshal, got %s", out)
}
}
// ─── Config.Clone ────────────────────────────────────────────────────
func TestConfigCloneIsDeepCopy(t *testing.T) {
falseVal := false
trueVal := true
turns := 2
cfg := Config{
Keys: []string{"key1"},
Accounts: []Account{{Email: "user@test.com", Token: "token"}},
ModelAliases: map[string]string{"claude-sonnet-4-6": "deepseek-v4-flash"},
Compat: CompatConfig{
StripReferenceMarkers: &falseVal,
},
HistorySplit: HistorySplitConfig{
Enabled: &trueVal,
TriggerAfterTurns: &turns,
},
Keys: []string{"key1"},
Accounts: []Account{{Email: "user@test.com", Token: "token"}},
ModelAliases: map[string]string{"claude-sonnet-4-6": "deepseek-v4-flash"},
AdditionalFields: map[string]any{"custom": "value"},
}
@@ -316,15 +302,6 @@ func TestConfigCloneIsDeepCopy(t *testing.T) {
cfg.Keys[0] = "modified"
cfg.Accounts[0].Email = "modified@test.com"
cfg.ModelAliases["claude-sonnet-4-6"] = "modified-model"
if cfg.Compat.StripReferenceMarkers != nil {
*cfg.Compat.StripReferenceMarkers = true
}
if cfg.HistorySplit.Enabled != nil {
*cfg.HistorySplit.Enabled = false
}
if cfg.HistorySplit.TriggerAfterTurns != nil {
*cfg.HistorySplit.TriggerAfterTurns = 5
}
// Cloned should not be affected
if cloned.Keys[0] != "key1" {
@@ -336,15 +313,6 @@ func TestConfigCloneIsDeepCopy(t *testing.T) {
if cloned.ModelAliases["claude-sonnet-4-6"] != "deepseek-v4-flash" {
t.Fatalf("clone model aliases was affected: %#v", cloned.ModelAliases)
}
if cloned.Compat.StripReferenceMarkers == nil || *cloned.Compat.StripReferenceMarkers {
t.Fatalf("clone compat was affected: %#v", cloned.Compat.StripReferenceMarkers)
}
if cloned.HistorySplit.Enabled == nil || !*cloned.HistorySplit.Enabled {
t.Fatalf("clone history split enabled was affected: %#v", cloned.HistorySplit.Enabled)
}
if cloned.HistorySplit.TriggerAfterTurns == nil || *cloned.HistorySplit.TriggerAfterTurns != 2 {
t.Fatalf("clone history split trigger was affected: %#v", cloned.HistorySplit.TriggerAfterTurns)
}
}
func TestConfigCloneNilMaps(t *testing.T) {
@@ -483,53 +451,9 @@ func TestStoreFindAccountNotFound(t *testing.T) {
}
}
func TestStoreCompatWideInputStrictOutputDefaultTrue(t *testing.T) {
t.Setenv("DS2API_CONFIG_JSON", `{"keys":["k1"],"accounts":[]}`)
store := LoadStore()
if !store.CompatWideInputStrictOutput() {
t.Fatal("expected default wide_input_strict_output=true when unset")
}
}
func TestStoreCompatWideInputStrictOutputCanDisable(t *testing.T) {
t.Setenv("DS2API_CONFIG_JSON", `{"keys":["k1"],"accounts":[],"compat":{"wide_input_strict_output":false}}`)
store := LoadStore()
if store.CompatWideInputStrictOutput() {
t.Fatal("expected wide_input_strict_output=false when explicitly configured")
}
snap := store.Snapshot()
data, err := snap.MarshalJSON()
if err != nil {
t.Fatalf("marshal failed: %v", err)
}
var out map[string]any
if err := json.Unmarshal(data, &out); err != nil {
t.Fatalf("decode failed: %v", err)
}
rawCompat, ok := out["compat"].(map[string]any)
if !ok {
t.Fatalf("expected compat in marshaled output, got %#v", out)
}
if rawCompat["wide_input_strict_output"] != false {
t.Fatalf("expected explicit false in compat, got %#v", rawCompat)
}
}
func TestStoreCompatStripReferenceMarkersDefaultTrue(t *testing.T) {
t.Setenv("DS2API_CONFIG_JSON", `{"keys":["k1"],"accounts":[]}`)
store := LoadStore()
if !store.CompatStripReferenceMarkers() {
t.Fatal("expected default strip_reference_markers=true when unset")
}
}
func TestStoreCompatStripReferenceMarkersCanDisable(t *testing.T) {
func TestStoreIgnoresRemovedCompatConfig(t *testing.T) {
t.Setenv("DS2API_CONFIG_JSON", `{"keys":["k1"],"accounts":[],"compat":{"strip_reference_markers":false}}`)
store := LoadStore()
if store.CompatStripReferenceMarkers() {
t.Fatal("expected strip_reference_markers=false when explicitly configured")
}
snap := store.Snapshot()
data, err := snap.MarshalJSON()
@@ -540,12 +464,8 @@ func TestStoreCompatStripReferenceMarkersCanDisable(t *testing.T) {
if err := json.Unmarshal(data, &out); err != nil {
t.Fatalf("decode failed: %v", err)
}
rawCompat, ok := out["compat"].(map[string]any)
if !ok {
t.Fatalf("expected compat in marshaled output, got %#v", out)
}
if rawCompat["strip_reference_markers"] != false {
t.Fatalf("expected explicit false in compat, got %#v", rawCompat)
if _, ok := out["compat"]; ok {
t.Fatalf("expected removed compat field not to marshal, got %#v", out)
}
}


@@ -144,6 +144,44 @@ func TestLoadStoreIgnoresLegacyConfigJSONEnv(t *testing.T) {
}
}
func TestExplicitMissingConfigPathBootstrapsEmptyFileBackedStore(t *testing.T) {
path := t.TempDir() + "/config.json"
t.Setenv("DS2API_CONFIG_JSON", "")
t.Setenv("DS2API_CONFIG_PATH", path)
store, err := LoadStoreWithError()
if err != nil {
t.Fatalf("expected missing explicit config path to bootstrap, got: %v", err)
}
if store.IsEnvBacked() {
t.Fatal("expected bootstrap store to be file-backed")
}
if store.ConfigPath() != path {
t.Fatalf("ConfigPath() = %q, want %q", store.ConfigPath(), path)
}
if len(store.Keys()) != 0 || len(store.Accounts()) != 0 {
t.Fatalf("expected empty bootstrap config, got keys=%d accounts=%d", len(store.Keys()), len(store.Accounts()))
}
if _, statErr := os.Stat(path); !errors.Is(statErr, os.ErrNotExist) {
t.Fatalf("expected bootstrap not to create config until first save, stat err=%v", statErr)
}
if err := store.Update(func(c *Config) error {
c.Keys = []string{"first-key"}
return nil
}); err != nil {
t.Fatalf("update should persist bootstrap config: %v", err)
}
content, err := os.ReadFile(path)
if err != nil {
t.Fatalf("expected first update to write config: %v", err)
}
if !strings.Contains(string(content), "first-key") {
t.Fatalf("expected saved config to contain first key, got: %s", content)
}
}
func TestEnvBackedStoreWritebackBootstrapsMissingConfigFile(t *testing.T) {
tmp, err := os.CreateTemp(t.TempDir(), "config-*.json")
if err != nil {


@@ -52,11 +52,12 @@ func loadStore() (*Store, error) {
func loadConfig() (Config, bool, error) {
rawCfg := strings.TrimSpace(os.Getenv("DS2API_CONFIG_JSON"))
path := ConfigPath()
if rawCfg != "" {
cfg, err := parseConfigString(rawCfg)
if err != nil {
if !IsVercel() && envWritebackEnabled() {
if fileCfg, fileErr := loadConfigFromFile(ConfigPath()); fileErr == nil {
if fileCfg, fileErr := loadConfigFromFile(path); fileErr == nil {
return fileCfg, false, nil
}
}
@@ -67,7 +68,7 @@ func loadConfig() (Config, bool, error) {
if IsVercel() || !envWritebackEnabled() {
return cfg, true, err
}
content, fileErr := os.ReadFile(ConfigPath())
content, fileErr := os.ReadFile(path)
if fileErr == nil {
var fileCfg Config
if unmarshalErr := json.Unmarshal(content, &fileCfg); unmarshalErr == nil {
@@ -79,7 +80,7 @@ func loadConfig() (Config, bool, error) {
if validateErr := ValidateConfig(cfg); validateErr != nil {
return cfg, true, validateErr
}
if writeErr := writeConfigFile(ConfigPath(), cfg.Clone()); writeErr == nil {
if writeErr := writeConfigFile(path, cfg.Clone()); writeErr == nil {
return cfg, false, err
} else {
Logger.Warn("[config] env writeback bootstrap failed", "error", writeErr)
@@ -87,7 +88,7 @@ func loadConfig() (Config, bool, error) {
}
return cfg, true, err
}
cfg, err := loadConfigFromFile(ConfigPath())
cfg, err := loadConfigFromFile(path)
if err != nil {
if shouldTryLegacyContainerConfigPath() {
legacyPath := legacyContainerConfigPath()
@@ -100,6 +101,10 @@ func loadConfig() (Config, bool, error) {
// Vercel may start without writable/present config; keep in-memory bootstrap config.
return Config{}, true, nil
}
if shouldBootstrapMissingConfigFile(err) {
Logger.Warn("[config] config file missing; starting with empty file-backed config", "path", path)
return Config{}, false, nil
}
return Config{}, false, err
}
if IsVercel() {
@@ -109,6 +114,10 @@ func loadConfig() (Config, bool, error) {
return cfg, false, nil
}
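// shouldBootstrapMissingConfigFile reports whether a missing config file should
// be treated as an empty file-backed config: the file must not exist and
// DS2API_CONFIG_PATH must be set explicitly.
func shouldBootstrapMissingConfigFile(err error) bool {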
return errors.Is(err, os.ErrNotExist) && strings.TrimSpace(os.Getenv("DS2API_CONFIG_PATH")) != ""
}
func loadConfigFromFile(path string) (Config, error) {
content, err := os.ReadFile(path)
if err != nil {


@@ -21,24 +21,6 @@ func (s *Store) ModelAliases() map[string]string {
return out
}
func (s *Store) CompatWideInputStrictOutput() bool {
s.mu.RLock()
defer s.mu.RUnlock()
if s.cfg.Compat.WideInputStrictOutput == nil {
return true
}
return *s.cfg.Compat.WideInputStrictOutput
}
func (s *Store) CompatStripReferenceMarkers() bool {
s.mu.RLock()
defer s.mu.RUnlock()
if s.cfg.Compat.StripReferenceMarkers == nil {
return true
}
return *s.cfg.Compat.StripReferenceMarkers
}
func (s *Store) ToolcallMode() string {
return "feature_match"
}
@@ -163,14 +145,6 @@ func (s *Store) AutoDeleteSessions() bool {
return s.AutoDeleteMode() != "none"
}
func (s *Store) HistorySplitEnabled() bool {
return false
}
func (s *Store) HistorySplitTriggerAfterTurns() int {
return 1
}
func (s *Store) CurrentInputFileEnabled() bool {
s.mu.RLock()
defer s.mu.RUnlock()


@@ -2,21 +2,6 @@ package config
import "testing"
func TestStoreHistorySplitAccessors(t *testing.T) {
enabled := true
turns := 3
store := &Store{cfg: Config{HistorySplit: HistorySplitConfig{
Enabled: &enabled,
TriggerAfterTurns: &turns,
}}}
if store.HistorySplitEnabled() {
t.Fatal("expected history split to stay disabled")
}
if got := store.HistorySplitTriggerAfterTurns(); got != 1 {
t.Fatalf("history split trigger_after_turns=%d want=1", got)
}
}
func TestStoreCurrentInputFileAccessors(t *testing.T) {
store := &Store{cfg: Config{}}
if !store.CurrentInputFileEnabled() {
@@ -40,12 +25,6 @@ func TestStoreCurrentInputFileAccessors(t *testing.T) {
if got := store.CurrentInputFileMinChars(); got != 12345 {
t.Fatalf("current input file min_chars=%d want=12345", got)
}
historyEnabled := true
store.cfg.HistorySplit.Enabled = &historyEnabled
if !store.CurrentInputFileEnabled() {
t.Fatal("expected history split config to not suppress current input file mode")
}
}
func TestStoreThinkingInjectionAccessors(t *testing.T) {


@@ -22,6 +22,9 @@ const (
var fileReadySleep = time.Sleep
// ErrUploadFileNotFound indicates that DeepSeek returned no matching uploaded file.
var ErrUploadFileNotFound = errors.New("uploaded file not found")
func (c *Client) waitForUploadedFile(ctx context.Context, a *auth.RequestAuth, result *UploadFileResult) error {
if result == nil || strings.TrimSpace(result.ID) == "" {
return nil
@@ -42,7 +45,7 @@ func (c *Client) waitForUploadedFile(ctx context.Context, a *auth.RequestAuth, r
return fmt.Errorf("waiting for file %s to become ready: %w", result.ID, err)
}
fetched, err := c.fetchUploadedFile(pollCtx, a, result.ID)
fetched, err := c.FetchUploadedFile(pollCtx, a, result.ID)
if err == nil && fetched != nil {
mergeUploadFileResults(result, fetched)
if isReadyUploadFileStatus(result.Status) {
@@ -65,7 +68,8 @@ func (c *Client) waitForUploadedFile(ctx context.Context, a *auth.RequestAuth, r
return fmt.Errorf("file %s did not become ready: %w", result.ID, lastErr)
}
func (c *Client) fetchUploadedFile(ctx context.Context, a *auth.RequestAuth, fileID string) (*UploadFileResult, error) {
// FetchUploadedFile returns metadata for an uploaded DeepSeek file by ID.
func (c *Client) FetchUploadedFile(ctx context.Context, a *auth.RequestAuth, fileID string) (*UploadFileResult, error) {
fileID = strings.TrimSpace(fileID)
if fileID == "" {
return nil, errors.New("file id is required")
@@ -92,7 +96,7 @@ func (c *Client) fetchUploadedFile(ctx context.Context, a *auth.RequestAuth, fil
result := extractFetchedUploadFileResult(resp, fileID)
if result == nil || strings.TrimSpace(result.ID) == "" {
return nil, errors.New("fetch files succeeded without matching file data")
return nil, ErrUploadFileNotFound
}
result.Raw = resp
return result, nil


@@ -1,6 +1,7 @@
package claude
import (
"ds2api/internal/assistantturn"
"ds2api/internal/toolcall"
"fmt"
"time"
@@ -9,6 +10,47 @@ import (
"ds2api/internal/util"
)
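// BuildMessageResponseFromTurn builds a Claude Messages API response body from a
// completed assistant turn. Thinking content is included only when
// exposeThinking is true.
func BuildMessageResponseFromTurn(messageID, model string, turn assistantturn.Turn, exposeThinking bool) map[string]any {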
content := make([]map[string]any, 0, 4)
if exposeThinking && turn.Thinking != "" {
content = append(content, map[string]any{"type": "thinking", "thinking": turn.Thinking})
}
stopReason := "end_turn"
if len(turn.ToolCalls) > 0 {
stopReason = "tool_use"
for i, tc := range turn.ToolCalls {
content = append(content, map[string]any{
"type": "tool_use",
"id": fmt.Sprintf("toolu_%d_%d", time.Now().Unix(), i),
"name": tc.Name,
"input": tc.Input,
})
}
} else {
text := turn.Text
if text == "" && exposeThinking {
text = turn.Thinking
}
if text == "" {
text = "抱歉,没有生成有效的响应内容。" // "Sorry, no valid response content was generated."
}
content = append(content, map[string]any{"type": "text", "text": text})
}
return map[string]any{
"id": messageID,
"type": "message",
"role": "assistant",
"model": model,
"content": content,
"stop_reason": stopReason,
"stop_sequence": nil,
"usage": map[string]any{
"input_tokens": turn.Usage.InputTokens,
"output_tokens": turn.Usage.OutputTokens,
},
}
}
func BuildMessageResponse(messageID, model string, normalizedMessages []any, finalThinking, finalText string, toolNames []string) map[string]any {
detected := toolcall.ParseToolCalls(finalText, toolNames)
if len(detected) == 0 && finalText == "" && finalThinking != "" {


@@ -208,9 +208,6 @@ func TestUpdateSettingsCurrentInputFile(t *testing.T) {
if !h.Store.CurrentInputFileEnabled() {
t.Fatal("expected current input file accessor to stay enabled")
}
if h.Store.HistorySplitEnabled() {
t.Fatal("expected history split accessor to stay disabled")
}
}
func TestUpdateSettingsCurrentInputFilePartialUpdatePreservesEnabled(t *testing.T) {


@@ -21,11 +21,10 @@ func boolFrom(v any) bool {
}
}
func parseSettingsUpdateRequest(req map[string]any) (*config.AdminConfig, *config.RuntimeConfig, *config.CompatConfig, *config.ResponsesConfig, *config.EmbeddingsConfig, *config.AutoDeleteConfig, *config.CurrentInputFileConfig, *config.ThinkingInjectionConfig, map[string]string, error) {
func parseSettingsUpdateRequest(req map[string]any) (*config.AdminConfig, *config.RuntimeConfig, *config.ResponsesConfig, *config.EmbeddingsConfig, *config.AutoDeleteConfig, *config.CurrentInputFileConfig, *config.ThinkingInjectionConfig, map[string]string, error) {
var (
adminCfg *config.AdminConfig
runtimeCfg *config.RuntimeConfig
compatCfg *config.CompatConfig
respCfg *config.ResponsesConfig
embCfg *config.EmbeddingsConfig
autoDeleteCfg *config.AutoDeleteConfig
@@ -39,7 +38,7 @@ func parseSettingsUpdateRequest(req map[string]any) (*config.AdminConfig, *confi
if v, exists := raw["jwt_expire_hours"]; exists {
n := intFrom(v)
if err := config.ValidateIntRange("admin.jwt_expire_hours", n, 1, 720, true); err != nil {
return nil, nil, nil, nil, nil, nil, nil, nil, nil, err
return nil, nil, nil, nil, nil, nil, nil, nil, err
}
cfg.JWTExpireHours = n
}
@@ -51,56 +50,43 @@ func parseSettingsUpdateRequest(req map[string]any) (*config.AdminConfig, *confi
if v, exists := raw["account_max_inflight"]; exists {
n := intFrom(v)
if err := config.ValidateIntRange("runtime.account_max_inflight", n, 1, 256, true); err != nil {
return nil, nil, nil, nil, nil, nil, nil, nil, nil, err
return nil, nil, nil, nil, nil, nil, nil, nil, err
}
cfg.AccountMaxInflight = n
}
if v, exists := raw["account_max_queue"]; exists {
n := intFrom(v)
if err := config.ValidateIntRange("runtime.account_max_queue", n, 1, 200000, true); err != nil {
return nil, nil, nil, nil, nil, nil, nil, nil, nil, err
return nil, nil, nil, nil, nil, nil, nil, nil, err
}
cfg.AccountMaxQueue = n
}
if v, exists := raw["global_max_inflight"]; exists {
n := intFrom(v)
if err := config.ValidateIntRange("runtime.global_max_inflight", n, 1, 200000, true); err != nil {
return nil, nil, nil, nil, nil, nil, nil, nil, nil, err
return nil, nil, nil, nil, nil, nil, nil, nil, err
}
cfg.GlobalMaxInflight = n
}
if v, exists := raw["token_refresh_interval_hours"]; exists {
n := intFrom(v)
if err := config.ValidateIntRange("runtime.token_refresh_interval_hours", n, 1, 720, true); err != nil {
return nil, nil, nil, nil, nil, nil, nil, nil, nil, err
return nil, nil, nil, nil, nil, nil, nil, nil, err
}
cfg.TokenRefreshIntervalHours = n
}
if cfg.AccountMaxInflight > 0 && cfg.GlobalMaxInflight > 0 && cfg.GlobalMaxInflight < cfg.AccountMaxInflight {
return nil, nil, nil, nil, nil, nil, nil, nil, nil, fmt.Errorf("runtime.global_max_inflight must be >= runtime.account_max_inflight")
return nil, nil, nil, nil, nil, nil, nil, nil, fmt.Errorf("runtime.global_max_inflight must be >= runtime.account_max_inflight")
}
runtimeCfg = cfg
}
if raw, ok := req["compat"].(map[string]any); ok {
cfg := &config.CompatConfig{}
if v, exists := raw["wide_input_strict_output"]; exists {
b := boolFrom(v)
cfg.WideInputStrictOutput = &b
}
if v, exists := raw["strip_reference_markers"]; exists {
b := boolFrom(v)
cfg.StripReferenceMarkers = &b
}
compatCfg = cfg
}
if raw, ok := req["responses"].(map[string]any); ok {
cfg := &config.ResponsesConfig{}
if v, exists := raw["store_ttl_seconds"]; exists {
n := intFrom(v)
if err := config.ValidateIntRange("responses.store_ttl_seconds", n, 30, 86400, true); err != nil {
return nil, nil, nil, nil, nil, nil, nil, nil, nil, err
return nil, nil, nil, nil, nil, nil, nil, nil, err
}
cfg.StoreTTLSeconds = n
}
@@ -112,7 +98,7 @@ func parseSettingsUpdateRequest(req map[string]any) (*config.AdminConfig, *confi
if v, exists := raw["provider"]; exists {
p := strings.TrimSpace(fmt.Sprintf("%v", v))
if err := config.ValidateTrimmedString("embeddings.provider", p, false); err != nil {
return nil, nil, nil, nil, nil, nil, nil, nil, nil, err
return nil, nil, nil, nil, nil, nil, nil, nil, err
}
cfg.Provider = p
}
@@ -138,7 +124,7 @@ func parseSettingsUpdateRequest(req map[string]any) (*config.AdminConfig, *confi
if v, exists := raw["mode"]; exists {
mode := strings.ToLower(strings.TrimSpace(fmt.Sprintf("%v", v)))
if err := config.ValidateAutoDeleteMode(mode); err != nil {
return nil, nil, nil, nil, nil, nil, nil, nil, nil, err
return nil, nil, nil, nil, nil, nil, nil, nil, err
}
if mode == "" {
mode = "none"
@@ -160,12 +146,12 @@ func parseSettingsUpdateRequest(req map[string]any) (*config.AdminConfig, *confi
if v, exists := raw["min_chars"]; exists {
n := intFrom(v)
if err := config.ValidateIntRange("current_input_file.min_chars", n, 0, 100000000, true); err != nil {
return nil, nil, nil, nil, nil, nil, nil, nil, nil, err
return nil, nil, nil, nil, nil, nil, nil, nil, err
}
cfg.MinChars = n
}
if err := config.ValidateCurrentInputFileConfig(*cfg); err != nil {
return nil, nil, nil, nil, nil, nil, nil, nil, nil, err
return nil, nil, nil, nil, nil, nil, nil, nil, err
}
currentInputCfg = cfg
}
@@ -182,5 +168,5 @@ func parseSettingsUpdateRequest(req map[string]any) (*config.AdminConfig, *confi
thinkingInjCfg = cfg
}
return adminCfg, runtimeCfg, compatCfg, respCfg, embCfg, autoDeleteCfg, currentInputCfg, thinkingInjCfg, aliasMap, nil
return adminCfg, runtimeCfg, respCfg, embCfg, autoDeleteCfg, currentInputCfg, thinkingInjCfg, aliasMap, nil
}


@@ -27,7 +27,6 @@ func (h *Handler) getSettings(w http.ResponseWriter, _ *http.Request) {
"global_max_inflight": h.Store.RuntimeGlobalMaxInflight(recommended),
"token_refresh_interval_hours": h.Store.RuntimeTokenRefreshIntervalHours(),
},
"compat": snap.Compat,
"responses": snap.Responses,
"embeddings": snap.Embeddings,
"auto_delete": snap.AutoDelete,


@@ -17,7 +17,7 @@ func (h *Handler) updateSettings(w http.ResponseWriter, r *http.Request) {
return
}
adminCfg, runtimeCfg, compatCfg, responsesCfg, embeddingsCfg, autoDeleteCfg, currentInputCfg, thinkingInjCfg, aliasMap, err := parseSettingsUpdateRequest(req)
adminCfg, runtimeCfg, responsesCfg, embeddingsCfg, autoDeleteCfg, currentInputCfg, thinkingInjCfg, aliasMap, err := parseSettingsUpdateRequest(req)
if err != nil {
writeJSON(w, http.StatusBadRequest, map[string]any{"detail": err.Error()})
return
@@ -53,14 +53,6 @@ func (h *Handler) updateSettings(w http.ResponseWriter, r *http.Request) {
c.Runtime.TokenRefreshIntervalHours = runtimeCfg.TokenRefreshIntervalHours
}
}
if compatCfg != nil {
if compatCfg.WideInputStrictOutput != nil {
c.Compat.WideInputStrictOutput = compatCfg.WideInputStrictOutput
}
if compatCfg.StripReferenceMarkers != nil {
c.Compat.StripReferenceMarkers = compatCfg.StripReferenceMarkers
}
}
if responsesCfg != nil && responsesCfg.StoreTTLSeconds > 0 {
c.Responses.StoreTTLSeconds = responsesCfg.StoreTTLSeconds
}


@@ -33,13 +33,10 @@ type ConfigStore interface {
RuntimeGlobalMaxInflight(defaultSize int) int
RuntimeTokenRefreshIntervalHours() int
AutoDeleteMode() string
HistorySplitEnabled() bool
HistorySplitTriggerAfterTurns() int
CurrentInputFileEnabled() bool
CurrentInputFileMinChars() int
ThinkingInjectionEnabled() bool
ThinkingInjectionPrompt() string
CompatStripReferenceMarkers() bool
AutoDeleteSessions() bool
}


@@ -0,0 +1,85 @@
package claude
import (
"context"
"io"
"net/http"
"net/http/httptest"
"strings"
"testing"
"ds2api/internal/auth"
dsclient "ds2api/internal/deepseek/client"
)
type claudeCurrentInputAuth struct{}
func (claudeCurrentInputAuth) Determine(*http.Request) (*auth.RequestAuth, error) {
return &auth.RequestAuth{
DeepSeekToken: "direct-token",
CallerID: "caller:test",
TriedAccounts: map[string]bool{},
}, nil
}
func (claudeCurrentInputAuth) Release(*auth.RequestAuth) {}
type claudeCurrentInputDS struct {
uploads []dsclient.UploadFileRequest
payload map[string]any
}
func (d *claudeCurrentInputDS) CreateSession(context.Context, *auth.RequestAuth, int) (string, error) {
return "session-id", nil
}
func (d *claudeCurrentInputDS) GetPow(context.Context, *auth.RequestAuth, int) (string, error) {
return "pow", nil
}
func (d *claudeCurrentInputDS) UploadFile(_ context.Context, _ *auth.RequestAuth, req dsclient.UploadFileRequest, _ int) (*dsclient.UploadFileResult, error) {
d.uploads = append(d.uploads, req)
return &dsclient.UploadFileResult{ID: "file-claude-history"}, nil
}
func (d *claudeCurrentInputDS) CallCompletion(_ context.Context, _ *auth.RequestAuth, payload map[string]any, _ string, _ int) (*http.Response, error) {
d.payload = payload
return &http.Response{
StatusCode: http.StatusOK,
Header: make(http.Header),
Body: io.NopCloser(strings.NewReader("data: {\"p\":\"response/content\",\"v\":\"ok\"}\n")),
}, nil
}
func TestClaudeDirectAppliesCurrentInputFile(t *testing.T) {
ds := &claudeCurrentInputDS{}
h := &Handler{
Store: mockClaudeConfig{aliases: map[string]string{"claude-sonnet-4-6": "deepseek-v4-flash"}},
Auth: claudeCurrentInputAuth{},
DS: ds,
}
reqBody := `{"model":"claude-sonnet-4-6","messages":[{"role":"user","content":"hello from claude"}],"max_tokens":1024}`
req := httptest.NewRequest(http.MethodPost, "/v1/messages", strings.NewReader(reqBody))
req.Header.Set("Content-Type", "application/json")
rec := httptest.NewRecorder()
h.Messages(rec, req)
if rec.Code != http.StatusOK {
t.Fatalf("expected 200, got %d body=%s", rec.Code, rec.Body.String())
}
if len(ds.uploads) != 1 {
t.Fatalf("expected one current input upload, got %d", len(ds.uploads))
}
if ds.uploads[0].Filename != "DS2API_HISTORY.txt" {
t.Fatalf("unexpected upload filename: %q", ds.uploads[0].Filename)
}
refIDs, _ := ds.payload["ref_file_ids"].([]any)
if len(refIDs) != 1 || refIDs[0] != "file-claude-history" {
t.Fatalf("expected uploaded history ref id, got %#v", ds.payload["ref_file_ids"])
}
prompt, _ := ds.payload["prompt"].(string)
if !strings.Contains(prompt, "Continue from the latest state in the attached DS2API_HISTORY.txt context.") {
t.Fatalf("expected continuation prompt, got %q", prompt)
}
}


@@ -17,12 +17,14 @@ type AuthResolver interface {
type DeepSeekCaller interface {
CreateSession(ctx context.Context, a *auth.RequestAuth, maxAttempts int) (string, error)
GetPow(ctx context.Context, a *auth.RequestAuth, maxAttempts int) (string, error)
UploadFile(ctx context.Context, a *auth.RequestAuth, req dsclient.UploadFileRequest, maxAttempts int) (*dsclient.UploadFileResult, error)
CallCompletion(ctx context.Context, a *auth.RequestAuth, payload map[string]any, powResp string, maxAttempts int) (*http.Response, error)
}
type ConfigReader interface {
ModelAliases() map[string]string
CompatStripReferenceMarkers() bool
CurrentInputFileEnabled() bool
CurrentInputFileMinChars() int
}
type OpenAIChatRunner interface {


@@ -7,7 +7,8 @@ type mockClaudeConfig struct {
}
func (m mockClaudeConfig) ModelAliases() map[string]string { return m.aliases }
func (mockClaudeConfig) CompatStripReferenceMarkers() bool { return true }
func (mockClaudeConfig) CurrentInputFileEnabled() bool { return true }
func (mockClaudeConfig) CurrentInputFileMinChars() int { return 0 }
func TestNormalizeClaudeRequestUsesGlobalAliasMapping(t *testing.T) {
req := map[string]any{
@@ -27,11 +28,32 @@ func TestNormalizeClaudeRequestUsesGlobalAliasMapping(t *testing.T) {
if out.Standard.ResolvedModel != "deepseek-v4-pro-search" {
t.Fatalf("resolved model mismatch: got=%q", out.Standard.ResolvedModel)
}
if out.Standard.Thinking || !out.Standard.Search {
if !out.Standard.Thinking || !out.Standard.Search {
t.Fatalf("unexpected flags: thinking=%v search=%v", out.Standard.Thinking, out.Standard.Search)
}
}
func TestNormalizeClaudeRequestDisablesThinkingWhenRequested(t *testing.T) {
req := map[string]any{
"model": "claude-opus-4-6",
"messages": []any{
map[string]any{"role": "user", "content": "hello"},
},
"thinking": map[string]any{"type": "disabled"},
}
out, err := normalizeClaudeRequest(mockClaudeConfig{
aliases: map[string]string{
"claude-opus-4-6": "deepseek-v4-pro",
},
}, req)
if err != nil {
t.Fatalf("normalizeClaudeRequest error: %v", err)
}
if out.Standard.Thinking {
t.Fatalf("expected explicit Claude thinking disable to win")
}
}
func TestNormalizeClaudeRequestEnablesThinkingWhenRequested(t *testing.T) {
req := map[string]any{
"model": "claude-opus-4-6",


@@ -4,13 +4,19 @@ import (
"bytes"
"encoding/json"
"errors"
"fmt"
"io"
"net/http"
"net/http/httptest"
"strings"
"time"
"ds2api/internal/auth"
"ds2api/internal/completionruntime"
"ds2api/internal/config"
claudefmt "ds2api/internal/format/claude"
"ds2api/internal/httpapi/requestbody"
"ds2api/internal/promptcompat"
streamengine "ds2api/internal/stream"
"ds2api/internal/translatorcliproxy"
"ds2api/internal/util"
@@ -22,14 +28,89 @@ func (h *Handler) Messages(w http.ResponseWriter, r *http.Request) {
if strings.TrimSpace(r.Header.Get("anthropic-version")) == "" {
r.Header.Set("anthropic-version", "2023-06-01")
}
if h.OpenAI == nil {
writeClaudeError(w, http.StatusInternalServerError, "OpenAI proxy backend unavailable.")
if isClaudeVercelProxyRequest(r) && h.proxyViaOpenAI(w, r, h.Store) {
return
}
if h.proxyViaOpenAI(w, r, h.Store) {
if h.Auth == nil || h.DS == nil {
if h.OpenAI != nil && h.proxyViaOpenAI(w, r, h.Store) {
return
}
writeClaudeError(w, http.StatusInternalServerError, "Claude runtime backend unavailable.")
return
}
writeClaudeError(w, http.StatusBadGateway, "Failed to proxy Claude request.")
if h.handleClaudeDirect(w, r) {
return
}
writeClaudeError(w, http.StatusBadGateway, "Failed to handle Claude request.")
}
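// isClaudeVercelProxyRequest reports whether the request carries one of the
// Vercel stream query markers (__stream_prepare=1 or __stream_release=1).
func isClaudeVercelProxyRequest(r *http.Request) bool {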
if r == nil || r.URL == nil {
return false
}
return strings.TrimSpace(r.URL.Query().Get("__stream_prepare")) == "1" ||
strings.TrimSpace(r.URL.Query().Get("__stream_release")) == "1"
}
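// handleClaudeDirect serves a Claude Messages request against the DeepSeek
// runtime without going through the OpenAI proxy. It reports whether a
// response was written; every path shown here writes one and returns true.
func (h *Handler) handleClaudeDirect(w http.ResponseWriter, r *http.Request) bool {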
raw, err := io.ReadAll(r.Body)
if err != nil {
if errors.Is(err, requestbody.ErrInvalidUTF8Body) {
writeClaudeError(w, http.StatusBadRequest, "invalid json")
} else {
writeClaudeError(w, http.StatusBadRequest, "invalid body")
}
return true
}
var req map[string]any
if err := json.Unmarshal(raw, &req); err != nil {
writeClaudeError(w, http.StatusBadRequest, "invalid json")
return true
}
norm, err := normalizeClaudeRequest(h.Store, req)
if err != nil {
writeClaudeError(w, http.StatusBadRequest, err.Error())
return true
}
exposeThinking := norm.Standard.Thinking
a, err := h.Auth.Determine(r)
if err != nil {
writeClaudeError(w, http.StatusUnauthorized, err.Error())
return true
}
defer h.Auth.Release(a)
if norm.Standard.Stream {
h.handleClaudeDirectStream(w, r, a, norm.Standard)
return true
}
result, outErr := completionruntime.ExecuteNonStreamWithRetry(r.Context(), h.DS, a, norm.Standard, completionruntime.Options{
StripReferenceMarkers: stripReferenceMarkersEnabled(),
RetryEnabled: true,
CurrentInputFile: h.Store,
})
if outErr != nil {
writeClaudeError(w, outErr.Status, outErr.Message)
return true
}
writeJSON(w, http.StatusOK, claudefmt.BuildMessageResponseFromTurn(
fmt.Sprintf("msg_%d", time.Now().UnixNano()),
norm.Standard.ResponseModel,
result.Turn,
exposeThinking,
))
return true
}
func (h *Handler) handleClaudeDirectStream(w http.ResponseWriter, r *http.Request, a *auth.RequestAuth, stdReq promptcompat.StandardRequest) {
start, outErr := completionruntime.StartCompletion(r.Context(), h.DS, a, stdReq, completionruntime.Options{
CurrentInputFile: h.Store,
})
if outErr != nil {
writeClaudeError(w, outErr.Status, outErr.Message)
return
}
streamReq := start.Request
h.handleClaudeStreamRealtime(w, r, start.Response, streamReq.ResponseModel, streamReq.Messages, streamReq.Thinking, streamReq.Search, streamReq.ToolNames, streamReq.ToolsRaw)
}
func (h *Handler) proxyViaOpenAI(w http.ResponseWriter, r *http.Request, store ConfigReader) bool {
@@ -58,7 +139,7 @@ func (h *Handler) proxyViaOpenAI(w http.ResponseWriter, r *http.Request, store C
}
}
translatedReq := translatorcliproxy.ToOpenAI(sdktranslator.FormatClaude, translateModel, raw, stream)
translatedReq, exposeThinking := applyClaudeThinkingPolicyToOpenAIRequest(translatedReq, req, stream)
translatedReq, exposeThinking := applyClaudeThinkingPolicyToOpenAIRequest(translatedReq, req)
isVercelPrepare := strings.TrimSpace(r.URL.Query().Get("__stream_prepare")) == "1"
isVercelRelease := strings.TrimSpace(r.URL.Query().Get("__stream_release")) == "1"
@@ -133,7 +214,7 @@ func (h *Handler) proxyViaOpenAI(w http.ResponseWriter, r *http.Request, store C
return true
}
func applyClaudeThinkingPolicyToOpenAIRequest(translated []byte, original map[string]any, stream bool) ([]byte, bool) {
func applyClaudeThinkingPolicyToOpenAIRequest(translated []byte, original map[string]any) ([]byte, bool) {
req := map[string]any{}
if err := json.Unmarshal(translated, &req); err != nil {
return translated, false
@@ -143,7 +224,7 @@ func applyClaudeThinkingPolicyToOpenAIRequest(translated []byte, original map[st
if _, translatedHasOverride := util.ResolveThinkingOverride(req); translatedHasOverride {
return translated, false
}
enabled = !stream
enabled = true
}
typ := "disabled"
if enabled {
@@ -152,9 +233,9 @@ func applyClaudeThinkingPolicyToOpenAIRequest(translated []byte, original map[st
req["thinking"] = map[string]any{"type": typ}
out, err := json.Marshal(req)
if err != nil {
return translated, ok && enabled
return translated, enabled
}
return out, ok && enabled
return out, enabled
}
func stripClaudeThinkingBlocks(raw []byte) []byte {
@@ -209,7 +290,7 @@ func (h *Handler) handleClaudeStreamRealtime(w http.ResponseWriter, r *http.Requ
messages,
thinkingEnabled,
searchEnabled,
h.compatStripReferenceMarkers(),
stripReferenceMarkersEnabled(),
toolNames,
toolsRaw,
buildClaudePromptTokenText(messages, thinkingEnabled),


@@ -8,6 +8,7 @@ import (
"ds2api/internal/config"
dsprotocol "ds2api/internal/deepseek/protocol"
"ds2api/internal/textclean"
"ds2api/internal/util"
)
@@ -21,11 +22,8 @@ type Handler struct {
OpenAI OpenAIChatRunner
}
func (h *Handler) compatStripReferenceMarkers() bool {
if h == nil || h.Store == nil {
return true
}
return h.Store.CompatStripReferenceMarkers()
func stripReferenceMarkersEnabled() bool {
return textclean.StripReferenceMarkersEnabled()
}
var (


@@ -28,6 +28,18 @@ func makeClaudeSSEHTTPResponse(lines ...string) *http.Response {
}
}
func makeClaudeContentLine(t *testing.T, text string) string {
t.Helper()
line, err := json.Marshal(map[string]any{
"p": "response/content",
"v": text,
})
if err != nil {
t.Fatalf("marshal content line failed: %v", err)
}
return "data: " + string(line)
}
func parseClaudeFrames(t *testing.T, body string) []claudeFrame {
t.Helper()
chunks := strings.Split(body, "\n\n")
@@ -71,6 +83,17 @@ func findClaudeFrames(frames []claudeFrame, event string) []claudeFrame {
return out
}
func collectClaudeTextDeltas(frames []claudeFrame) string {
var combined strings.Builder
for _, f := range findClaudeFrames(frames, "content_block_delta") {
delta, _ := f.Payload["delta"].(map[string]any)
if delta["type"] == "text_delta" {
combined.WriteString(asString(delta["text"]))
}
}
return combined.String()
}
func TestHandleClaudeStreamRealtimeTextIncrementsWithEventHeaders(t *testing.T) {
h := &Handler{}
resp := makeClaudeSSEHTTPResponse(
@@ -96,8 +119,8 @@ func TestHandleClaudeStreamRealtimeTextIncrementsWithEventHeaders(t *testing.T)
frames := parseClaudeFrames(t, body)
deltas := findClaudeFrames(frames, "content_block_delta")
if len(deltas) < 2 {
t.Fatalf("expected at least 2 text deltas, got=%d body=%s", len(deltas), body)
if len(deltas) < 1 {
t.Fatalf("expected at least 1 text delta, got=%d body=%s", len(deltas), body)
}
combined := strings.Builder{}
for _, f := range deltas {
@@ -111,6 +134,52 @@ func TestHandleClaudeStreamRealtimeTextIncrementsWithEventHeaders(t *testing.T)
}
}
func TestHandleClaudeStreamRealtimeToolBufferedPlainTextDoesNotRepeatFinalText(t *testing.T) {
h := &Handler{}
want := "明白\n\nBash\nIN\npwd\nOUT\nok"
resp := makeClaudeSSEHTTPResponse(
makeClaudeContentLine(t, "明"),
makeClaudeContentLine(t, "白\n\nBash\nIN\npwd\n"),
makeClaudeContentLine(t, "OUT\nok"),
`data: [DONE]`,
)
rec := httptest.NewRecorder()
req := httptest.NewRequest(http.MethodPost, "/anthropic/v1/messages", nil)
h.handleClaudeStreamRealtime(rec, req, resp, "claude-sonnet-4-5", []any{map[string]any{"role": "user", "content": "use tool"}}, false, false, []string{"Bash"}, nil)
frames := parseClaudeFrames(t, rec.Body.String())
if got := collectClaudeTextDeltas(frames); got != want {
t.Fatalf("unexpected combined text: got %q want %q body=%s", got, want, rec.Body.String())
}
}
func TestHandleClaudeStreamRealtimeTrimsContinuationReplay(t *testing.T) {
h := &Handler{}
prefix := strings.Repeat("A", 40)
resp := makeClaudeSSEHTTPResponse(
`data: {"p":"response/content","v":"`+prefix+`"}`,
`data: {"p":"response/content","v":"`+prefix+` tail"}`,
`data: [DONE]`,
)
rec := httptest.NewRecorder()
req := httptest.NewRequest(http.MethodPost, "/anthropic/v1/messages", nil)
h.handleClaudeStreamRealtime(rec, req, resp, "claude-sonnet-4-5", []any{map[string]any{"role": "user", "content": "hi"}}, false, false, nil, nil)
frames := parseClaudeFrames(t, rec.Body.String())
combined := strings.Builder{}
for _, f := range findClaudeFrames(frames, "content_block_delta") {
delta, _ := f.Payload["delta"].(map[string]any)
if delta["type"] == "text_delta" {
combined.WriteString(asString(delta["text"]))
}
}
if got, want := combined.String(), prefix+" tail"; got != want {
t.Fatalf("unexpected combined text: got %q want %q body=%s", got, want, rec.Body.String())
}
}
func TestHandleClaudeStreamRealtimeThinkingDelta(t *testing.T) {
h := &Handler{}
resp := makeClaudeSSEHTTPResponse(


@@ -14,7 +14,8 @@ type claudeProxyStoreStub struct {
func (s claudeProxyStoreStub) ModelAliases() map[string]string { return s.aliases }
func (claudeProxyStoreStub) CompatStripReferenceMarkers() bool { return true }
func (claudeProxyStoreStub) CurrentInputFileEnabled() bool { return true }
func (claudeProxyStoreStub) CurrentInputFileMinChars() int { return 0 }
type openAIProxyStub struct {
status int
@@ -166,7 +167,7 @@ func TestClaudeProxyViaOpenAIEnablesThinkingWhenRequested(t *testing.T) {
}
}
func TestClaudeProxyViaOpenAIKeepsStreamDefaultThinkingDisabled(t *testing.T) {
func TestClaudeProxyViaOpenAIEnablesStreamThinkingByDefault(t *testing.T) {
openAI := &openAIProxyCaptureStub{}
h := &Handler{
Store: claudeProxyStoreStub{aliases: map[string]string{"claude-sonnet-4-6": "deepseek-v4-flash"}},
@@ -178,12 +179,12 @@ func TestClaudeProxyViaOpenAIKeepsStreamDefaultThinkingDisabled(t *testing.T) {
h.Messages(rec, req)
thinking, _ := openAI.seenReq["thinking"].(map[string]any)
if thinking["type"] != "disabled" {
t.Fatalf("expected Claude stream default to keep downstream thinking disabled, got %#v", openAI.seenReq)
if thinking["type"] != "enabled" {
t.Fatalf("expected Claude stream default to enable downstream thinking, got %#v", openAI.seenReq)
}
}
func TestClaudeProxyViaOpenAIStripsThinkingBlocksFromNonStreamResponse(t *testing.T) {
func TestClaudeProxyViaOpenAIExposesThinkingBlocksByDefault(t *testing.T) {
body := `{"id":"chatcmpl_1","object":"chat.completion","created":1,"model":"claude-sonnet-4-5","choices":[{"index":0,"message":{"role":"assistant","content":null,"reasoning_content":"internal reasoning","tool_calls":[{"id":"call_1","type":"function","function":{"name":"search","arguments":"{\"q\":\"x\"}"}}]},"finish_reason":"tool_calls"}],"usage":{"prompt_tokens":1,"completion_tokens":1,"total_tokens":2}}`
h := &Handler{OpenAI: openAIProxyStub{status: 200, body: body}}
req := httptest.NewRequest(http.MethodPost, "/anthropic/v1/messages", strings.NewReader(`{"model":"claude-sonnet-4-5","messages":[{"role":"user","content":"hi"}],"stream":false}`))
@@ -195,14 +196,31 @@ func TestClaudeProxyViaOpenAIStripsThinkingBlocksFromNonStreamResponse(t *testin
t.Fatalf("unexpected status: %d body=%s", rec.Code, rec.Body.String())
}
got := rec.Body.String()
if strings.Contains(got, `"type":"thinking"`) {
t.Fatalf("expected converted Claude response to strip thinking block, got %s", got)
if !strings.Contains(got, `"type":"thinking"`) {
t.Fatalf("expected converted Claude response to expose thinking block, got %s", got)
}
if !strings.Contains(got, `"tool_use"`) {
t.Fatalf("expected converted Claude response to preserve tool_use, got %s", got)
}
}
func TestClaudeProxyViaOpenAIStripsThinkingBlocksWhenDisabled(t *testing.T) {
body := `{"id":"chatcmpl_1","object":"chat.completion","created":1,"model":"claude-sonnet-4-5","choices":[{"index":0,"message":{"role":"assistant","content":"ok","reasoning_content":"internal reasoning"},"finish_reason":"stop"}],"usage":{"prompt_tokens":1,"completion_tokens":1,"total_tokens":2}}`
h := &Handler{OpenAI: openAIProxyStub{status: 200, body: body}}
req := httptest.NewRequest(http.MethodPost, "/anthropic/v1/messages", strings.NewReader(`{"model":"claude-sonnet-4-5","messages":[{"role":"user","content":"hi"}],"thinking":{"type":"disabled"},"stream":false}`))
rec := httptest.NewRecorder()
h.Messages(rec, req)
if rec.Code != http.StatusOK {
t.Fatalf("unexpected status: %d body=%s", rec.Code, rec.Body.String())
}
got := rec.Body.String()
if strings.Contains(got, `"type":"thinking"`) {
t.Fatalf("expected disabled thinking to strip thinking block, got %s", got)
}
}
func TestClaudeProxyTranslatesInlineImageToOpenAIDataURL(t *testing.T) {
openAI := &openAIProxyCaptureStub{}
h := &Handler{OpenAI: openAI}


@@ -32,11 +32,11 @@ func normalizeClaudeRequest(store ConfigReader, req map[string]any) (claudeNorma
dsPayload := convertClaudeToDeepSeek(payload, store)
dsModel, _ := dsPayload["model"].(string)
_, searchEnabled, ok := config.GetModelConfig(dsModel)
defaultThinkingEnabled, searchEnabled, ok := config.GetModelConfig(dsModel)
if !ok {
searchEnabled = false
}
thinkingEnabled := util.ResolveThinkingEnabled(req, false)
thinkingEnabled := util.ResolveThinkingEnabled(req, defaultThinkingEnabled)
if config.IsNoThinkingModel(dsModel) {
thinkingEnabled = false
}
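
The change above makes the model's configured default the fallback for thinking, instead of a hard-coded `false`. A rough sketch of the resulting precedence follows: an explicit request setting wins, then the model default, and a no-thinking model is always forced off. It assumes the request carries a Claude-style `thinking: {"type": ...}` object, as in the tests in this diff. `resolveThinking` and its parameters are hypothetical stand-ins, not the real `util.ResolveThinkingEnabled` or `config` API.

```go
package main

import "fmt"

// resolveThinking is a hypothetical stand-in for the logic in
// normalizeClaudeRequest; the real helpers may inspect more fields.
func resolveThinking(req map[string]any, modelDefault, noThinkingModel bool) bool {
	// Start from the model's configured default.
	enabled := modelDefault
	// An explicit request setting overrides the default.
	if t, ok := req["thinking"].(map[string]any); ok {
		switch t["type"] {
		case "enabled":
			enabled = true
		case "disabled":
			enabled = false
		}
	}
	// Models that cannot think are forced off regardless of the request.
	if noThinkingModel {
		enabled = false
	}
	return enabled
}

func main() {
	off := map[string]any{"thinking": map[string]any{"type": "disabled"}}
	fmt.Println(resolveThinking(map[string]any{}, true, false)) // model default applies
	fmt.Println(resolveThinking(off, true, false))              // explicit request wins
	fmt.Println(resolveThinking(map[string]any{}, true, true))  // no-thinking model forces off
}
```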


@@ -8,6 +8,8 @@ import (
"ds2api/internal/sse"
streamengine "ds2api/internal/stream"
"ds2api/internal/toolcall"
"ds2api/internal/toolstream"
)
type claudeStreamRuntime struct {
@@ -30,11 +32,18 @@ type claudeStreamRuntime struct {
thinking strings.Builder
text strings.Builder
sieve toolstream.State
rawText strings.Builder
rawThinking strings.Builder
toolDetectionThinking strings.Builder
toolCallsDetected bool
nextBlockIndex int
thinkingBlockOpen bool
thinkingBlockIndex int
textBlockOpen bool
textBlockIndex int
textEmitted bool
ended bool
upstreamErr string
}
@@ -84,8 +93,28 @@ func (s *claudeStreamRuntime) onParsed(parsed sse.LineResult) streamengine.Parse
}
contentSeen := false
for _, p := range parsed.ToolDetectionThinkingParts {
trimmed := sse.TrimContinuationOverlapFromBuilder(&s.toolDetectionThinking, p.Text)
if trimmed != "" {
s.toolDetectionThinking.WriteString(trimmed)
}
}
for _, p := range parsed.Parts {
cleanedText := cleanVisibleOutput(p.Text, s.stripReferenceMarkers)
var rawTrimmed string
if p.Type == "thinking" {
rawTrimmed = sse.TrimContinuationOverlapFromBuilder(&s.rawThinking, p.Text)
} else {
rawTrimmed = sse.TrimContinuationOverlapFromBuilder(&s.rawText, p.Text)
}
if rawTrimmed == "" {
continue
}
if p.Type == "thinking" {
s.rawThinking.WriteString(rawTrimmed)
} else {
s.rawText.WriteString(rawTrimmed)
}
cleanedText := cleanVisibleOutput(rawTrimmed, s.stripReferenceMarkers)
if cleanedText == "" {
continue
}
@@ -98,7 +127,7 @@ func (s *claudeStreamRuntime) onParsed(parsed sse.LineResult) streamengine.Parse
if !s.thinkingEnabled {
continue
}
trimmed := sse.TrimContinuationOverlap(s.thinking.String(), cleanedText)
trimmed := sse.TrimContinuationOverlapFromBuilder(&s.thinking, cleanedText)
if trimmed == "" {
continue
}
@@ -128,44 +157,80 @@ func (s *claudeStreamRuntime) onParsed(parsed sse.LineResult) streamengine.Parse
continue
}
trimmed := sse.TrimContinuationOverlap(s.text.String(), cleanedText)
if trimmed == "" {
continue
}
s.text.WriteString(trimmed)
if s.bufferToolContent {
if hasUnclosedCodeFence(s.text.String()) {
continue
s.text.WriteString(cleanedText)
if !s.bufferToolContent {
s.closeThinkingBlock()
if !s.textBlockOpen {
s.textBlockIndex = s.nextBlockIndex
s.nextBlockIndex++
s.send("content_block_start", map[string]any{
"type": "content_block_start",
"index": s.textBlockIndex,
"content_block": map[string]any{
"type": "text",
"text": "",
},
})
s.textBlockOpen = true
}
continue
}
s.closeThinkingBlock()
if !s.textBlockOpen {
s.textBlockIndex = s.nextBlockIndex
s.nextBlockIndex++
s.send("content_block_start", map[string]any{
"type": "content_block_start",
s.send("content_block_delta", map[string]any{
"type": "content_block_delta",
"index": s.textBlockIndex,
"content_block": map[string]any{
"type": "text",
"text": "",
"delta": map[string]any{
"type": "text_delta",
"text": cleanedText,
},
})
s.textBlockOpen = true
s.textEmitted = true
continue
}
events := toolstream.ProcessChunk(&s.sieve, rawTrimmed, s.toolNames)
for _, evt := range events {
if len(evt.ToolCalls) > 0 {
s.closeTextBlock()
s.toolCallsDetected = true
normalized := toolcall.NormalizeParsedToolCallsForSchemas(evt.ToolCalls, s.toolsRaw)
for _, tc := range normalized {
idx := s.nextBlockIndex
s.nextBlockIndex++
s.sendToolUseBlock(idx, tc)
}
continue
}
if evt.Content == "" {
continue
}
cleaned := cleanVisibleOutput(evt.Content, s.stripReferenceMarkers)
if cleaned == "" || (s.searchEnabled && sse.IsCitation(cleaned)) {
continue
}
s.closeThinkingBlock()
if !s.textBlockOpen {
s.textBlockIndex = s.nextBlockIndex
s.nextBlockIndex++
s.send("content_block_start", map[string]any{
"type": "content_block_start",
"index": s.textBlockIndex,
"content_block": map[string]any{
"type": "text",
"text": "",
},
})
s.textBlockOpen = true
}
s.send("content_block_delta", map[string]any{
"type": "content_block_delta",
"index": s.textBlockIndex,
"delta": map[string]any{
"type": "text_delta",
"text": cleaned,
},
})
s.textEmitted = true
}
s.send("content_block_delta", map[string]any{
"type": "content_block_delta",
"index": s.textBlockIndex,
"delta": map[string]any{
"type": "text_delta",
"text": trimmed,
},
})
}
return streamengine.ParsedDecision{ContentSeen: contentSeen}
}
func hasUnclosedCodeFence(text string) bool {
return strings.Count(text, "```")%2 == 1
}


@@ -1,13 +1,15 @@
package claude
import (
"ds2api/internal/assistantturn"
"ds2api/internal/sse"
"ds2api/internal/toolcall"
"ds2api/internal/toolstream"
"encoding/json"
"fmt"
"time"
streamengine "ds2api/internal/stream"
"ds2api/internal/util"
)
func (s *claudeStreamRuntime) closeThinkingBlock() {
@@ -34,6 +36,32 @@ func (s *claudeStreamRuntime) closeTextBlock() {
s.textBlockIndex = -1
}
func (s *claudeStreamRuntime) sendToolUseBlock(idx int, tc toolcall.ParsedToolCall) {
s.send("content_block_start", map[string]any{
"type": "content_block_start",
"index": idx,
"content_block": map[string]any{
"type": "tool_use",
"id": fmt.Sprintf("toolu_%d_%d", time.Now().Unix(), idx),
"name": tc.Name,
"input": map[string]any{},
},
})
inputBytes, _ := json.Marshal(tc.Input)
s.send("content_block_delta", map[string]any{
"type": "content_block_delta",
"index": idx,
"delta": map[string]any{
"type": "input_json_delta",
"partial_json": string(inputBytes),
},
})
s.send("content_block_stop", map[string]any{
"type": "content_block_stop",
"index": idx,
})
}
func (s *claudeStreamRuntime) finalize(stopReason string) {
if s.ended {
return
@@ -41,49 +69,83 @@ func (s *claudeStreamRuntime) finalize(stopReason string) {
s.ended = true
s.closeThinkingBlock()
s.closeTextBlock()
finalThinking := s.thinking.String()
finalText := cleanVisibleOutput(s.text.String(), s.stripReferenceMarkers)
if s.bufferToolContent {
detected := toolcall.ParseStandaloneToolCalls(finalText, s.toolNames)
if len(detected) == 0 && finalText == "" && finalThinking != "" {
detected = toolcall.ParseStandaloneToolCalls(finalThinking, s.toolNames)
}
if len(detected) > 0 {
detected = toolcall.NormalizeParsedToolCallsForSchemas(detected, s.toolsRaw)
stopReason = "tool_use"
for i, tc := range detected {
idx := s.nextBlockIndex + i
s.send("content_block_start", map[string]any{
"type": "content_block_start",
"index": idx,
"content_block": map[string]any{
"type": "tool_use",
"id": fmt.Sprintf("toolu_%d_%d", time.Now().Unix(), idx),
"name": tc.Name,
"input": map[string]any{},
},
})
inputBytes, _ := json.Marshal(tc.Input)
for _, evt := range toolstream.Flush(&s.sieve, s.toolNames) {
if len(evt.ToolCalls) > 0 {
s.closeTextBlock()
s.toolCallsDetected = true
normalized := toolcall.NormalizeParsedToolCallsForSchemas(evt.ToolCalls, s.toolsRaw)
for _, tc := range normalized {
idx := s.nextBlockIndex
s.nextBlockIndex++
s.sendToolUseBlock(idx, tc)
}
continue
}
if evt.Content != "" {
cleaned := cleanVisibleOutput(evt.Content, s.stripReferenceMarkers)
if cleaned == "" || (s.searchEnabled && sse.IsCitation(cleaned)) {
continue
}
if !s.textBlockOpen {
s.textBlockIndex = s.nextBlockIndex
s.nextBlockIndex++
s.send("content_block_start", map[string]any{
"type": "content_block_start",
"index": s.textBlockIndex,
"content_block": map[string]any{
"type": "text",
"text": "",
},
})
s.textBlockOpen = true
}
s.send("content_block_delta", map[string]any{
"type": "content_block_delta",
"index": idx,
"index": s.textBlockIndex,
"delta": map[string]any{
"type": "input_json_delta",
"partial_json": string(inputBytes),
"type": "text_delta",
"text": cleaned,
},
})
s.send("content_block_stop", map[string]any{
"type": "content_block_stop",
"index": idx,
})
s.textEmitted = true
}
s.nextBlockIndex += len(detected)
} else if finalText != "" {
}
}
s.closeTextBlock()
turn := assistantturn.BuildTurnFromStreamSnapshot(assistantturn.StreamSnapshot{
RawText: s.rawText.String(),
VisibleText: s.text.String(),
RawThinking: s.rawThinking.String(),
VisibleThinking: s.thinking.String(),
DetectionThinking: s.toolDetectionThinking.String(),
AlreadyEmittedCalls: s.toolCallsDetected,
AlreadyEmittedToolRaw: s.toolCallsDetected,
}, assistantturn.BuildOptions{
Model: s.model,
Prompt: s.promptTokenText,
SearchEnabled: s.searchEnabled,
StripReferenceMarkers: s.stripReferenceMarkers,
ToolNames: s.toolNames,
ToolsRaw: s.toolsRaw,
})
finalText := turn.Text
outcome := assistantturn.FinalizeTurn(turn, assistantturn.FinalizeOptions{
AlreadyEmittedToolCalls: s.toolCallsDetected,
})
if s.bufferToolContent && !s.toolCallsDetected {
if len(turn.ToolCalls) > 0 {
stopReason = "tool_use"
for _, tc := range turn.ToolCalls {
idx := s.nextBlockIndex
s.nextBlockIndex++
s.sendToolUseBlock(idx, tc)
}
} else if finalText != "" && !s.textEmitted {
idx := s.nextBlockIndex
s.nextBlockIndex++
s.send("content_block_start", map[string]any{
@@ -102,6 +164,7 @@ func (s *claudeStreamRuntime) finalize(stopReason string) {
"text": finalText,
},
})
s.textEmitted = true
s.send("content_block_stop", map[string]any{
"type": "content_block_stop",
"index": idx,
@@ -109,7 +172,10 @@ func (s *claudeStreamRuntime) finalize(stopReason string) {
}
}
outputTokens := util.CountOutputTokens(finalThinking, s.model) + util.CountOutputTokens(finalText, s.model)
if outcome.HasToolCalls {
stopReason = "tool_use"
}
s.send("message_delta", map[string]any{
"type": "message_delta",
"delta": map[string]any{
@@ -117,7 +183,7 @@ func (s *claudeStreamRuntime) finalize(stopReason string) {
"stop_sequence": nil,
},
"usage": map[string]any{
"output_tokens": outputTokens,
"output_tokens": outcome.Usage.OutputTokens,
},
})
s.send("message_stop", map[string]any{"type": "message_stop"})


@@ -23,7 +23,8 @@ type streamStatusClaudeStoreStub struct{}
func (streamStatusClaudeStoreStub) ModelAliases() map[string]string { return nil }
func (streamStatusClaudeStoreStub) CompatStripReferenceMarkers() bool { return true }
func (streamStatusClaudeStoreStub) CurrentInputFileEnabled() bool { return true }
func (streamStatusClaudeStoreStub) CurrentInputFileMinChars() int { return 0 }
func captureClaudeStatusMiddleware(statuses *[]int) func(http.Handler) http.Handler {
return func(next http.Handler) http.Handler {


@@ -33,6 +33,9 @@ func normalizeGeminiRequest(store ConfigReader, routeModel string, req map[strin
toolsRaw := convertGeminiTools(req["tools"])
finalPrompt, toolNames := promptcompat.BuildOpenAIPromptForAdapter(messagesRaw, toolsRaw, "", thinkingEnabled)
if len(toolNames) == 0 && len(toolsRaw) > 0 {
toolNames = []string{"__any_tool__"}
}
passThrough := collectGeminiPassThrough(req)
return promptcompat.StandardRequest{
@@ -42,6 +45,7 @@ func normalizeGeminiRequest(store ConfigReader, routeModel string, req map[strin
ResponseModel: requestedModel,
Messages: messagesRaw,
PromptTokenText: finalPrompt,
ToolsRaw: toolsRaw,
FinalPrompt: finalPrompt,
ToolNames: toolNames,
Stream: stream,


@@ -17,12 +17,14 @@ type AuthResolver interface {
type DeepSeekCaller interface {
CreateSession(ctx context.Context, a *auth.RequestAuth, maxAttempts int) (string, error)
GetPow(ctx context.Context, a *auth.RequestAuth, maxAttempts int) (string, error)
UploadFile(ctx context.Context, a *auth.RequestAuth, req dsclient.UploadFileRequest, maxAttempts int) (*dsclient.UploadFileResult, error)
CallCompletion(ctx context.Context, a *auth.RequestAuth, payload map[string]any, powResp string, maxAttempts int) (*http.Response, error)
}
type ConfigReader interface {
ModelAliases() map[string]string
CompatStripReferenceMarkers() bool
CurrentInputFileEnabled() bool
CurrentInputFileMinChars() int
}
type OpenAIChatRunner interface {


@@ -11,7 +11,11 @@ import (
"github.com/go-chi/chi/v5"
"ds2api/internal/assistantturn"
"ds2api/internal/auth"
"ds2api/internal/completionruntime"
"ds2api/internal/httpapi/requestbody"
"ds2api/internal/promptcompat"
"ds2api/internal/sse"
"ds2api/internal/toolcall"
"ds2api/internal/translatorcliproxy"
@@ -21,14 +25,84 @@ import (
)
func (h *Handler) handleGenerateContent(w http.ResponseWriter, r *http.Request, stream bool) {
if h.OpenAI == nil {
writeGeminiError(w, http.StatusInternalServerError, "OpenAI proxy backend unavailable.")
if isGeminiVercelProxyRequest(r) && h.proxyViaOpenAI(w, r, stream) {
return
}
if h.proxyViaOpenAI(w, r, stream) {
if h.Auth == nil || h.DS == nil {
if h.OpenAI != nil && h.proxyViaOpenAI(w, r, stream) {
return
}
writeGeminiError(w, http.StatusInternalServerError, "Gemini runtime backend unavailable.")
return
}
writeGeminiError(w, http.StatusBadGateway, "Failed to proxy Gemini request.")
if h.handleGeminiDirect(w, r, stream) {
return
}
writeGeminiError(w, http.StatusBadGateway, "Failed to handle Gemini request.")
}
func isGeminiVercelProxyRequest(r *http.Request) bool {
if r == nil || r.URL == nil {
return false
}
return strings.TrimSpace(r.URL.Query().Get("__stream_prepare")) == "1" ||
strings.TrimSpace(r.URL.Query().Get("__stream_release")) == "1"
}
func (h *Handler) handleGeminiDirect(w http.ResponseWriter, r *http.Request, stream bool) bool {
raw, err := io.ReadAll(r.Body)
if err != nil {
if errors.Is(err, requestbody.ErrInvalidUTF8Body) {
writeGeminiError(w, http.StatusBadRequest, "invalid json")
} else {
writeGeminiError(w, http.StatusBadRequest, "invalid body")
}
return true
}
routeModel := strings.TrimSpace(chi.URLParam(r, "model"))
var req map[string]any
if err := json.Unmarshal(raw, &req); err != nil {
writeGeminiError(w, http.StatusBadRequest, "invalid json")
return true
}
stdReq, err := normalizeGeminiRequest(h.Store, routeModel, req, stream)
if err != nil {
writeGeminiError(w, http.StatusBadRequest, err.Error())
return true
}
a, err := h.Auth.Determine(r)
if err != nil {
writeGeminiError(w, http.StatusUnauthorized, err.Error())
return true
}
defer h.Auth.Release(a)
if stream {
h.handleGeminiDirectStream(w, r, a, stdReq)
return true
}
result, outErr := completionruntime.ExecuteNonStreamWithRetry(r.Context(), h.DS, a, stdReq, completionruntime.Options{
StripReferenceMarkers: stripReferenceMarkersEnabled(),
RetryEnabled: true,
CurrentInputFile: h.Store,
})
if outErr != nil {
writeGeminiError(w, outErr.Status, outErr.Message)
return true
}
writeJSON(w, http.StatusOK, buildGeminiGenerateContentResponseFromTurn(result.Turn))
return true
}
func (h *Handler) handleGeminiDirectStream(w http.ResponseWriter, r *http.Request, a *auth.RequestAuth, stdReq promptcompat.StandardRequest) {
start, outErr := completionruntime.StartCompletion(r.Context(), h.DS, a, stdReq, completionruntime.Options{
CurrentInputFile: h.Store,
})
if outErr != nil {
writeGeminiError(w, outErr.Status, outErr.Message)
return
}
streamReq := start.Request
h.handleStreamGenerateContent(w, r, start.Response, streamReq.ResponseModel, streamReq.PromptTokenText, streamReq.Thinking, streamReq.Search, streamReq.ToolNames, streamReq.ToolsRaw)
}
func (h *Handler) proxyViaOpenAI(w http.ResponseWriter, r *http.Request, stream bool) bool {
@@ -220,7 +294,7 @@ func (h *Handler) handleNonStreamGenerateContent(w http.ResponseWriter, resp *ht
}
result := sse.CollectStream(resp, thinkingEnabled, true)
stripReferenceMarkers := h.compatStripReferenceMarkers()
stripReferenceMarkers := stripReferenceMarkersEnabled()
writeJSON(w, http.StatusOK, buildGeminiGenerateContentResponse(
model,
finalPrompt,
@@ -250,6 +324,60 @@ func buildGeminiGenerateContentResponse(model, finalPrompt, finalThinking, final
}
}
func buildGeminiGenerateContentResponseFromTurn(turn assistantturn.Turn) map[string]any {
parts := buildGeminiPartsFromTurn(turn)
return map[string]any{
"candidates": []map[string]any{
{
"index": 0,
"content": map[string]any{
"role": "model",
"parts": parts,
},
"finishReason": "STOP",
},
},
"modelVersion": turn.Model,
"usageMetadata": map[string]any{
"promptTokenCount": turn.Usage.InputTokens,
"candidatesTokenCount": turn.Usage.OutputTokens,
"totalTokenCount": turn.Usage.TotalTokens,
},
}
}
func buildGeminiPartsFromTurn(turn assistantturn.Turn) []map[string]any {
thinkingPart := func() []map[string]any {
if turn.Thinking == "" {
return nil
}
return []map[string]any{{"text": turn.Thinking, "thought": true}}
}
if len(turn.ToolCalls) > 0 {
parts := thinkingPart()
if parts == nil {
parts = make([]map[string]any, 0, len(turn.ToolCalls))
}
for _, tc := range turn.ToolCalls {
parts = append(parts, map[string]any{
"functionCall": map[string]any{
"name": tc.Name,
"args": tc.Input,
},
})
}
return parts
}
parts := thinkingPart()
if turn.Text != "" {
parts = append(parts, map[string]any{"text": turn.Text})
}
if len(parts) == 0 {
parts = append(parts, map[string]any{"text": ""})
}
return parts
}
//nolint:unused // retained for native Gemini non-stream handling path.
func buildGeminiUsage(model, finalPrompt, finalThinking, finalText string) map[string]any {
promptTokens := util.CountPromptTokens(finalPrompt, model)
@@ -268,8 +396,17 @@ func buildGeminiPartsFromFinal(finalText, finalThinking string, toolNames []stri
if len(detected) == 0 && finalThinking != "" {
detected = toolcall.ParseToolCalls(finalThinking, toolNames)
}
thinkingPart := func() []map[string]any {
if finalThinking == "" {
return nil
}
return []map[string]any{{"text": finalThinking, "thought": true}}
}
if len(detected) > 0 {
parts := make([]map[string]any, 0, len(detected))
parts := thinkingPart()
if parts == nil {
parts = make([]map[string]any, 0, len(detected))
}
for _, tc := range detected {
parts = append(parts, map[string]any{
"functionCall": map[string]any{
@@ -281,9 +418,12 @@ func buildGeminiPartsFromFinal(finalText, finalThinking string, toolNames []stri
return parts
}
text := finalText
if text == "" {
text = finalThinking
parts := thinkingPart()
if finalText != "" {
parts = append(parts, map[string]any{"text": finalText})
}
return []map[string]any{{"text": text}}
if len(parts) == 0 {
parts = append(parts, map[string]any{"text": ""})
}
return parts
}
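
As a usage illustration of the part ordering introduced above, here is a small self-contained sketch: a `thought` part comes first, then either function calls or visible text, with an empty text part as a last resort. The `turn` and `toolCall` types are simplified, hypothetical stand-ins for `assistantturn.Turn` and `toolcall.ParsedToolCall`, whose full definitions are not shown in this diff.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// toolCall and turn are hypothetical, reduced versions of the real types.
type toolCall struct {
	Name  string
	Input map[string]any
}

type turn struct {
	Thinking  string
	Text      string
	ToolCalls []toolCall
}

// buildParts mirrors the ordering used by buildGeminiPartsFromTurn.
func buildParts(t turn) []map[string]any {
	var parts []map[string]any
	if t.Thinking != "" {
		parts = append(parts, map[string]any{"text": t.Thinking, "thought": true})
	}
	// Tool calls take precedence over visible text.
	if len(t.ToolCalls) > 0 {
		for _, tc := range t.ToolCalls {
			parts = append(parts, map[string]any{
				"functionCall": map[string]any{"name": tc.Name, "args": tc.Input},
			})
		}
		return parts
	}
	if t.Text != "" {
		parts = append(parts, map[string]any{"text": t.Text})
	}
	// Always return at least one part so the candidate is never empty.
	if len(parts) == 0 {
		parts = append(parts, map[string]any{"text": ""})
	}
	return parts
}

func main() {
	out, err := json.Marshal(buildParts(turn{
		Thinking:  "need the working directory",
		ToolCalls: []toolCall{{Name: "Bash", Input: map[string]any{"command": "pwd"}}},
	}))
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```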


@@ -5,6 +5,7 @@ import (
"github.com/go-chi/chi/v5"
"ds2api/internal/textclean"
"ds2api/internal/util"
)
@@ -18,11 +19,8 @@ type Handler struct {
}
//nolint:unused // used by native Gemini stream/non-stream runtime helpers.
func (h *Handler) compatStripReferenceMarkers() bool {
if h == nil || h.Store == nil {
return true
}
return h.Store.CompatStripReferenceMarkers()
func stripReferenceMarkersEnabled() bool {
return textclean.StripReferenceMarkersEnabled()
}
func RegisterRoutes(r chi.Router, h *Handler) {


@@ -7,13 +7,14 @@ import (
"strings"
"time"
"ds2api/internal/assistantturn"
dsprotocol "ds2api/internal/deepseek/protocol"
"ds2api/internal/sse"
streamengine "ds2api/internal/stream"
)
//nolint:unused // retained for native Gemini stream handling path.
func (h *Handler) handleStreamGenerateContent(w http.ResponseWriter, r *http.Request, resp *http.Response, model, finalPrompt string, thinkingEnabled, searchEnabled bool, toolNames []string) {
func (h *Handler) handleStreamGenerateContent(w http.ResponseWriter, r *http.Request, resp *http.Response, model, finalPrompt string, thinkingEnabled, searchEnabled bool, toolNames []string, toolsRaw any) {
defer func() { _ = resp.Body.Close() }()
if resp.StatusCode != http.StatusOK {
body, _ := io.ReadAll(resp.Body)
@@ -28,7 +29,7 @@ func (h *Handler) handleStreamGenerateContent(w http.ResponseWriter, r *http.Req
rc := http.NewResponseController(w)
_, canFlush := w.(http.Flusher)
runtime := newGeminiStreamRuntime(w, rc, canFlush, model, finalPrompt, thinkingEnabled, searchEnabled, h.compatStripReferenceMarkers(), toolNames)
runtime := newGeminiStreamRuntime(w, rc, canFlush, model, finalPrompt, thinkingEnabled, searchEnabled, stripReferenceMarkersEnabled(), toolNames, toolsRaw)
initialType := "text"
if thinkingEnabled {
@@ -64,9 +65,11 @@ type geminiStreamRuntime struct {
bufferContent bool
stripReferenceMarkers bool
toolNames []string
toolsRaw any
thinking strings.Builder
text strings.Builder
accumulator *assistantturn.Accumulator
contentFilter bool
responseMessageID int
}
//nolint:unused // retained for native Gemini stream handling path.
@@ -80,6 +83,7 @@ func newGeminiStreamRuntime(
searchEnabled bool,
stripReferenceMarkers bool,
toolNames []string,
toolsRaw any,
) *geminiStreamRuntime {
return &geminiStreamRuntime{
w: w,
@@ -92,6 +96,12 @@ func newGeminiStreamRuntime(
bufferContent: len(toolNames) > 0,
stripReferenceMarkers: stripReferenceMarkers,
toolNames: toolNames,
toolsRaw: toolsRaw,
accumulator: assistantturn.NewAccumulator(assistantturn.AccumulatorOptions{
ThinkingEnabled: thinkingEnabled,
SearchEnabled: searchEnabled,
StripReferenceMarkers: stripReferenceMarkers,
}),
}
}
@@ -111,35 +121,39 @@ func (s *geminiStreamRuntime) onParsed(parsed sse.LineResult) streamengine.Parse
if !parsed.Parsed {
return streamengine.ParsedDecision{}
}
if parsed.ResponseMessageID > 0 {
s.responseMessageID = parsed.ResponseMessageID
}
if parsed.ContentFilter || parsed.ErrorMessage != "" || parsed.Stop {
if parsed.ContentFilter {
s.contentFilter = true
}
return streamengine.ParsedDecision{Stop: true}
}
contentSeen := false
for _, p := range parsed.Parts {
cleanedText := cleanVisibleOutput(p.Text, s.stripReferenceMarkers)
if cleanedText == "" {
continue
}
if p.Type != "thinking" && s.searchEnabled && sse.IsCitation(cleanedText) {
continue
}
contentSeen = true
accumulated := s.accumulator.Apply(parsed)
for _, p := range accumulated.Parts {
if p.Type == "thinking" {
if s.thinkingEnabled {
trimmed := sse.TrimContinuationOverlap(s.thinking.String(), cleanedText)
if trimmed == "" {
continue
}
s.thinking.WriteString(trimmed)
if p.VisibleText == "" || s.bufferContent {
continue
}
s.sendChunk(map[string]any{
"candidates": []map[string]any{
{
"index": 0,
"content": map[string]any{
"role": "model",
"parts": []map[string]any{{"text": p.VisibleText, "thought": true}},
},
},
},
"modelVersion": s.model,
})
continue
}
trimmed := sse.TrimContinuationOverlap(s.text.String(), cleanedText)
if trimmed == "" {
if p.RawText == "" || p.CitationOnly || p.VisibleText == "" {
continue
}
s.text.WriteString(trimmed)
if s.bufferContent {
continue
}
@@ -149,23 +163,39 @@ func (s *geminiStreamRuntime) onParsed(parsed sse.LineResult) streamengine.Parse
"index": 0,
"content": map[string]any{
"role": "model",
"parts": []map[string]any{{"text": trimmed}},
"parts": []map[string]any{{"text": p.VisibleText}},
},
},
},
"modelVersion": s.model,
})
}
return streamengine.ParsedDecision{ContentSeen: contentSeen}
return streamengine.ParsedDecision{ContentSeen: accumulated.ContentSeen}
}
//nolint:unused // retained for native Gemini stream handling path.
func (s *geminiStreamRuntime) finalize() {
finalThinking := s.thinking.String()
finalText := cleanVisibleOutput(s.text.String(), s.stripReferenceMarkers)
rawText, text, rawThinking, thinking, detectionThinking := s.accumulator.Snapshot()
turn := assistantturn.BuildTurnFromStreamSnapshot(assistantturn.StreamSnapshot{
RawText: rawText,
VisibleText: text,
RawThinking: rawThinking,
VisibleThinking: thinking,
DetectionThinking: detectionThinking,
ContentFilter: s.contentFilter,
ResponseMessageID: s.responseMessageID,
}, assistantturn.BuildOptions{
Model: s.model,
Prompt: s.finalPrompt,
SearchEnabled: s.searchEnabled,
StripReferenceMarkers: s.stripReferenceMarkers,
ToolNames: s.toolNames,
ToolsRaw: s.toolsRaw,
})
outcome := assistantturn.FinalizeTurn(turn, assistantturn.FinalizeOptions{})
if s.bufferContent {
parts := buildGeminiPartsFromFinal(finalText, finalThinking, s.toolNames)
parts := buildGeminiPartsFromTurn(turn)
s.sendChunk(map[string]any{
"candidates": []map[string]any{
{
@@ -193,7 +223,11 @@ func (s *geminiStreamRuntime) finalize() {
"finishReason": "STOP",
},
},
"modelVersion": s.model,
"usageMetadata": buildGeminiUsage(s.model, s.finalPrompt, finalThinking, finalText),
"modelVersion": s.model,
"usageMetadata": map[string]any{
"promptTokenCount": outcome.Usage.InputTokens,
"candidatesTokenCount": outcome.Usage.OutputTokens,
"totalTokenCount": outcome.Usage.TotalTokens,
},
})
}
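
The `finalize` hunk above drops `buildGeminiUsage` and builds `usageMetadata` inline from `outcome.Usage`. A minimal sketch of that mapping follows. The `Usage` struct and the `geminiUsageMetadata` helper are hypothetical stand-ins; only the three field names and the three JSON keys come from the diff.

```go
package main

import "fmt"

// Usage is a hypothetical stand-in for the type of outcome.Usage;
// only the three field names are taken from the diff.
type Usage struct {
	InputTokens  int
	OutputTokens int
	TotalTokens  int
}

// geminiUsageMetadata maps token counts onto the Gemini usageMetadata keys
// that the finalize chunk emits.
func geminiUsageMetadata(u Usage) map[string]any {
	return map[string]any{
		"promptTokenCount":     u.InputTokens,
		"candidatesTokenCount": u.OutputTokens,
		"totalTokenCount":      u.TotalTokens,
	}
}

func main() {
	fmt.Println(geminiUsageMetadata(Usage{InputTokens: 12, OutputTokens: 30, TotalTokens: 42}))
}
```

Keeping the mapping in one small function would let the buffered and unbuffered paths share it, though the diff itself inlines the map.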


@@ -13,12 +13,14 @@ import (
"github.com/go-chi/chi/v5"
"ds2api/internal/auth"
dsclient "ds2api/internal/deepseek/client"
)
type testGeminiConfig struct{}
func (testGeminiConfig) ModelAliases() map[string]string { return nil }
func (testGeminiConfig) CompatStripReferenceMarkers() bool { return true }
func (testGeminiConfig) ModelAliases() map[string]string { return nil }
func (testGeminiConfig) CurrentInputFileEnabled() bool { return true }
func (testGeminiConfig) CurrentInputFileMinChars() int { return 0 }
type testGeminiAuth struct {
a *auth.RequestAuth
@@ -44,22 +46,31 @@ func (testGeminiAuth) Release(_ *auth.RequestAuth) {}
//nolint:unused // reserved test double for native Gemini DS-call path coverage.
type testGeminiDS struct {
resp *http.Response
err error
resp *http.Response
err error
uploadCalls []dsclient.UploadFileRequest
payloads []map[string]any
}
//nolint:unused // reserved test double for native Gemini DS-call path coverage.
func (m testGeminiDS) CreateSession(_ context.Context, _ *auth.RequestAuth, _ int) (string, error) {
func (m *testGeminiDS) CreateSession(_ context.Context, _ *auth.RequestAuth, _ int) (string, error) {
return "session-id", nil
}
//nolint:unused // reserved test double for native Gemini DS-call path coverage.
func (m testGeminiDS) GetPow(_ context.Context, _ *auth.RequestAuth, _ int) (string, error) {
func (m *testGeminiDS) GetPow(_ context.Context, _ *auth.RequestAuth, _ int) (string, error) {
return "pow", nil
}
//nolint:unused // reserved test double for native Gemini DS-call path coverage.
func (m testGeminiDS) CallCompletion(_ context.Context, _ *auth.RequestAuth, _ map[string]any, _ string, _ int) (*http.Response, error) {
func (m *testGeminiDS) UploadFile(_ context.Context, _ *auth.RequestAuth, req dsclient.UploadFileRequest, _ int) (*dsclient.UploadFileResult, error) {
m.uploadCalls = append(m.uploadCalls, req)
return &dsclient.UploadFileResult{ID: "file-gemini-history"}, nil
}
//nolint:unused // reserved test double for native Gemini DS-call path coverage.
func (m *testGeminiDS) CallCompletion(_ context.Context, _ *auth.RequestAuth, payload map[string]any, _ string, _ int) (*http.Response, error) {
m.payloads = append(m.payloads, payload)
if m.err != nil {
return nil, m.err
}
@@ -123,6 +134,46 @@ func makeGeminiUpstreamResponse(lines ...string) *http.Response {
}
}
func TestGeminiDirectAppliesCurrentInputFile(t *testing.T) {
ds := &testGeminiDS{
resp: makeGeminiUpstreamResponse(`data: {"p":"response/content","v":"ok"}`),
}
h := &Handler{
Store: testGeminiConfig{},
Auth: testGeminiAuth{},
DS: ds,
}
reqBody := `{"contents":[{"role":"user","parts":[{"text":"hello from gemini"}]}]}`
req := httptest.NewRequest(http.MethodPost, "/v1beta/models/gemini-2.5-pro:generateContent", strings.NewReader(reqBody))
req.Header.Set("Content-Type", "application/json")
rec := httptest.NewRecorder()
r := chi.NewRouter()
RegisterRoutes(r, h)
r.ServeHTTP(rec, req)
if rec.Code != http.StatusOK {
t.Fatalf("expected 200, got %d body=%s", rec.Code, rec.Body.String())
}
if len(ds.uploadCalls) != 1 {
t.Fatalf("expected one current input upload, got %d", len(ds.uploadCalls))
}
if ds.uploadCalls[0].Filename != "DS2API_HISTORY.txt" {
t.Fatalf("unexpected upload filename: %q", ds.uploadCalls[0].Filename)
}
if len(ds.payloads) != 1 {
t.Fatalf("expected one completion payload, got %d", len(ds.payloads))
}
refIDs, _ := ds.payloads[0]["ref_file_ids"].([]any)
if len(refIDs) != 1 || refIDs[0] != "file-gemini-history" {
t.Fatalf("expected uploaded history ref id, got %#v", ds.payloads[0]["ref_file_ids"])
}
prompt, _ := ds.payloads[0]["prompt"].(string)
if !strings.Contains(prompt, "Continue from the latest state in the attached DS2API_HISTORY.txt context.") {
t.Fatalf("expected continuation prompt, got %q", prompt)
}
}
func TestGeminiRoutesRegistered(t *testing.T) {
h := &Handler{
Store: testGeminiConfig{},
@@ -257,6 +308,56 @@ func TestStreamGenerateContentEmitsSSE(t *testing.T) {
}
}
func TestNativeStreamGenerateContentEmitsThoughtParts(t *testing.T) {
h := &Handler{}
resp := makeGeminiUpstreamResponse(
`data: {"p":"response/thinking_content","v":"think"}`,
`data: {"p":"response/content","v":"answer"}`,
`data: [DONE]`,
)
rec := httptest.NewRecorder()
req := httptest.NewRequest(http.MethodPost, "/v1beta/models/gemini-2.5-pro:streamGenerateContent", nil)
h.handleStreamGenerateContent(rec, req, resp, "gemini-2.5-pro", "prompt", true, false, nil, nil)
frames := extractGeminiSSEFrames(t, rec.Body.String())
if len(frames) < 2 {
t.Fatalf("expected thought and text stream frames, body=%s", rec.Body.String())
}
var gotThought, gotText string
for _, frame := range frames {
for _, part := range geminiPartsFromFrame(frame) {
if part["thought"] == true {
gotThought += asString(part["text"])
} else {
gotText += asString(part["text"])
}
}
}
if gotThought != "think" {
t.Fatalf("expected thought part, got %q body=%s", gotThought, rec.Body.String())
}
if !strings.Contains(gotText, "answer") {
t.Fatalf("expected text part answer, got %q body=%s", gotText, rec.Body.String())
}
}
func TestBuildGeminiPartsFromFinalIncludesThoughtPart(t *testing.T) {
parts := buildGeminiPartsFromFinal("answer", "think", nil)
if len(parts) != 2 {
t.Fatalf("expected thought + answer parts, got %#v", parts)
}
if parts[0]["thought"] != true || parts[0]["text"] != "think" {
t.Fatalf("expected first part to be thought, got %#v", parts[0])
}
if _, ok := parts[1]["thought"]; ok {
t.Fatalf("expected second part to be visible text, got %#v", parts[1])
}
if parts[1]["text"] != "answer" {
t.Fatalf("expected answer text, got %#v", parts[1])
}
}
func TestGeminiProxyTranslatesInlineImageToOpenAIDataURL(t *testing.T) {
openAI := &geminiOpenAISuccessStub{}
h := &Handler{Store: testGeminiConfig{}, OpenAI: openAI}
@@ -396,3 +497,21 @@ func extractGeminiSSEFrames(t *testing.T, body string) []map[string]any {
}
return out
}
func geminiPartsFromFrame(frame map[string]any) []map[string]any {
candidates, _ := frame["candidates"].([]any)
if len(candidates) == 0 {
return nil
}
c0, _ := candidates[0].(map[string]any)
content, _ := c0["content"].(map[string]any)
rawParts, _ := content["parts"].([]any)
parts := make([]map[string]any, 0, len(rawParts))
for _, raw := range rawParts {
part, _ := raw.(map[string]any)
if part != nil {
parts = append(parts, part)
}
}
return parts
}
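
The test helper `geminiPartsFromFrame` walks `candidates[0].content.parts` through untyped maps. For comparison, here is a sketch of the same extraction with typed decoding, splitting thought parts from visible text the way `TestNativeStreamGenerateContentEmitsThoughtParts` does. The function name `splitGeminiParts` is an assumption for illustration, and the frame shape is inferred from the chunks built in this diff rather than from the full Gemini schema.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// splitGeminiParts decodes one stream frame and concatenates the text of
// thought parts and visible parts separately. Hypothetical helper; the
// frame shape (candidates[0].content.parts[]) follows the diff.
func splitGeminiParts(frameJSON string) (thought, text string, err error) {
	var frame struct {
		Candidates []struct {
			Content struct {
				Parts []struct {
					Text    string `json:"text"`
					Thought bool   `json:"thought"`
				} `json:"parts"`
			} `json:"content"`
		} `json:"candidates"`
	}
	if err = json.Unmarshal([]byte(frameJSON), &frame); err != nil {
		return "", "", err
	}
	if len(frame.Candidates) == 0 {
		return "", "", nil
	}
	for _, p := range frame.Candidates[0].Content.Parts {
		if p.Thought {
			thought += p.Text
		} else {
			text += p.Text
		}
	}
	return thought, text, nil
}

func main() {
	frame := `{"candidates":[{"index":0,"content":{"role":"model","parts":[{"text":"think","thought":true},{"text":"answer"}]}}]}`
	thought, text, err := splitGeminiParts(frame)
	fmt.Println(thought, text, err)
}
```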


@@ -57,7 +57,7 @@ func blockChatHistoryDetailDir(t *testing.T, detailDir string) func() {
func TestChatCompletionsNonStreamPersistsHistory(t *testing.T) {
historyStore := newTestChatHistoryStore(t)
h := &Handler{
Store: mockOpenAIConfig{wideInput: true},
Store: mockOpenAIConfig{},
Auth: streamStatusAuthStub{},
DS: streamStatusDSStub{resp: makeOpenAISSEHTTPResponse(`data: {"p":"response/content","v":"hello world"}`, `data: [DONE]`)},
ChatHistory: historyStore,
@@ -216,7 +216,7 @@ func TestHandleStreamContextCancelledMarksHistoryStopped(t *testing.T) {
func TestChatCompletionsSkipsAdminWebUISource(t *testing.T) {
historyStore := newTestChatHistoryStore(t)
h := &Handler{
Store: mockOpenAIConfig{wideInput: true},
Store: mockOpenAIConfig{},
Auth: streamStatusAuthStub{},
DS: streamStatusDSStub{resp: makeOpenAISSEHTTPResponse(`data: {"p":"response/content","v":"hello world"}`, `data: [DONE]`)},
ChatHistory: historyStore,
@@ -248,7 +248,7 @@ func TestChatCompletionsSkipsHistoryWhenDisabled(t *testing.T) {
t.Fatalf("disable history store failed: %v", err)
}
h := &Handler{
Store: mockOpenAIConfig{wideInput: true},
Store: mockOpenAIConfig{},
Auth: streamStatusAuthStub{},
DS: streamStatusDSStub{resp: makeOpenAISSEHTTPResponse(`data: {"p":"response/content","v":"hello world"}`, `data: [DONE]`)},
ChatHistory: historyStore,
@@ -278,7 +278,6 @@ func TestChatCompletionsCurrentInputFilePersistsNeutralPrompt(t *testing.T) {
ds := &inlineUploadDSStub{}
h := &Handler{
Store: mockOpenAIConfig{
wideInput: true,
currentInputEnabled: true,
},
Auth: streamStatusAuthStub{},


@@ -5,7 +5,10 @@ import (
"net/http"
"strings"
"ds2api/internal/assistantturn"
openaifmt "ds2api/internal/format/openai"
"ds2api/internal/httpapi/openai/shared"
"ds2api/internal/promptcompat"
"ds2api/internal/sse"
streamengine "ds2api/internal/stream"
"ds2api/internal/toolstream"
@@ -23,6 +26,7 @@ type chatStreamRuntime struct {
refFileTokens int
toolNames []string
toolsRaw any
toolChoice promptcompat.ToolChoicePolicy
thinkingEnabled bool
searchEnabled bool
@@ -34,15 +38,11 @@ type chatStreamRuntime struct {
toolCallsEmitted bool
toolCallsDoneEmitted bool
toolSieve toolstream.State
streamToolCallIDs map[int]string
streamToolNames map[int]string
rawThinking strings.Builder
thinking strings.Builder
toolDetectionThinking strings.Builder
rawText strings.Builder
text strings.Builder
responseMessageID int
toolSieve toolstream.State
streamToolCallIDs map[int]string
streamToolNames map[int]string
accumulator shared.StreamAccumulator
responseMessageID int
finalThinking string
finalText string
@@ -92,6 +92,7 @@ func newChatStreamRuntime(
stripReferenceMarkers bool,
toolNames []string,
toolsRaw any,
toolChoice promptcompat.ToolChoicePolicy,
bufferToolContent bool,
emitEarlyToolDeltas bool,
) *chatStreamRuntime {
@@ -105,6 +106,7 @@ func newChatStreamRuntime(
finalPrompt: finalPrompt,
toolNames: toolNames,
toolsRaw: toolsRaw,
toolChoice: toolChoice,
thinkingEnabled: thinkingEnabled,
searchEnabled: searchEnabled,
stripReferenceMarkers: stripReferenceMarkers,
@@ -112,6 +114,11 @@ func newChatStreamRuntime(
emitEarlyToolDeltas: emitEarlyToolDeltas,
streamToolCallIDs: map[int]string{},
streamToolNames: map[int]string{},
accumulator: shared.StreamAccumulator{
ThinkingEnabled: thinkingEnabled,
SearchEnabled: searchEnabled,
StripReferenceMarkers: stripReferenceMarkers,
},
}
}
@@ -120,7 +127,13 @@ func (s *chatStreamRuntime) sendKeepAlive() {
return
}
_, _ = s.w.Write([]byte(": keep-alive\n\n"))
_ = s.rc.Flush()
s.sendChunk(openaifmt.BuildChatStreamChunk(
s.completionID,
s.created,
s.model,
[]map[string]any{},
nil,
))
}
func (s *chatStreamRuntime) sendChunk(v any) {
@@ -177,8 +190,8 @@ func (s *chatStreamRuntime) markContextCancelled() {
s.finalErrorStatus = 499
s.finalErrorMessage = "request context cancelled"
s.finalErrorCode = string(streamengine.StopReasonContextCancelled)
s.finalThinking = s.thinking.String()
s.finalText = cleanVisibleOutput(s.text.String(), s.stripReferenceMarkers)
s.finalThinking = s.accumulator.Thinking.String()
s.finalText = cleanVisibleOutput(s.accumulator.Text.String(), s.stripReferenceMarkers)
s.finalFinishReason = string(streamengine.StopReasonContextCancelled)
}
@@ -191,16 +204,34 @@ func (s *chatStreamRuntime) finalize(finishReason string, deferEmptyOutput bool)
s.finalErrorStatus = 0
s.finalErrorMessage = ""
s.finalErrorCode = ""
finalThinking := s.thinking.String()
finalToolDetectionThinking := s.toolDetectionThinking.String()
finalText := cleanVisibleOutput(s.text.String(), s.stripReferenceMarkers)
s.finalThinking = finalThinking
s.finalText = finalText
detected := detectAssistantToolCalls(s.rawText.String(), finalText, s.rawThinking.String(), finalToolDetectionThinking, s.toolNames)
if len(detected.Calls) > 0 && !s.toolCallsDoneEmitted {
finishReason = "tool_calls"
finalThinking := s.accumulator.Thinking.String()
finalToolDetectionThinking := s.accumulator.ToolDetectionThinking.String()
finalText := s.accumulator.Text.String()
turn := assistantturn.BuildTurnFromStreamSnapshot(assistantturn.StreamSnapshot{
RawText: s.accumulator.RawText.String(),
VisibleText: finalText,
RawThinking: s.accumulator.RawThinking.String(),
VisibleThinking: finalThinking,
DetectionThinking: finalToolDetectionThinking,
ContentFilter: finishReason == "content_filter",
ResponseMessageID: s.responseMessageID,
AlreadyEmittedCalls: s.toolCallsEmitted,
AlreadyEmittedToolRaw: s.toolCallsDoneEmitted,
}, assistantturn.BuildOptions{
Model: s.model,
Prompt: s.finalPrompt,
RefFileTokens: s.refFileTokens,
SearchEnabled: s.searchEnabled,
StripReferenceMarkers: s.stripReferenceMarkers,
ToolNames: s.toolNames,
ToolsRaw: s.toolsRaw,
ToolChoice: s.toolChoice,
})
s.finalThinking = turn.Thinking
s.finalText = turn.Text
if len(turn.ToolCalls) > 0 && !s.toolCallsDoneEmitted {
s.sendDelta(map[string]any{
"tool_calls": formatFinalStreamToolCallsWithStableIDs(detected.Calls, s.streamToolCallIDs, s.toolsRaw),
"tool_calls": formatFinalStreamToolCallsWithStableIDs(turn.ToolCalls, s.streamToolCallIDs, s.toolsRaw),
})
s.toolCallsEmitted = true
s.toolCallsDoneEmitted = true
@@ -209,7 +240,6 @@ func (s *chatStreamRuntime) finalize(finishReason string, deferEmptyOutput bool)
for _, evt := range toolstream.Flush(&s.toolSieve, s.toolNames) {
if len(evt.ToolCalls) > 0 {
batch.flush()
finishReason = "tool_calls"
s.toolCallsEmitted = true
s.toolCallsDoneEmitted = true
s.sendDelta(map[string]any{
@@ -229,11 +259,11 @@ func (s *chatStreamRuntime) finalize(finishReason string, deferEmptyOutput bool)
batch.flush()
}
if len(detected.Calls) > 0 || s.toolCallsEmitted {
finishReason = "tool_calls"
}
if len(detected.Calls) == 0 && !s.toolCallsEmitted && strings.TrimSpace(finalText) == "" {
status, message, code := upstreamEmptyOutputDetail(finishReason == "content_filter", finalText, finalThinking)
outcome := assistantturn.FinalizeTurn(turn, assistantturn.FinalizeOptions{
AlreadyEmittedToolCalls: s.toolCallsEmitted || s.toolCallsDoneEmitted,
})
if outcome.ShouldFail {
status, message, code := outcome.Error.Status, outcome.Error.Message, outcome.Error.Code
if deferEmptyOutput {
s.finalErrorStatus = status
s.finalErrorMessage = message
@@ -243,14 +273,14 @@ func (s *chatStreamRuntime) finalize(finishReason string, deferEmptyOutput bool)
s.sendFailedChunk(status, message, code)
return true
}
usage := openaifmt.BuildChatUsageForModel(s.model, s.finalPrompt, finalThinking, finalText, s.refFileTokens)
s.finalFinishReason = finishReason
usage := assistantturn.OpenAIChatUsage(turn)
s.finalFinishReason = outcome.FinishReason
s.finalUsage = usage
s.sendChunk(openaifmt.BuildChatStreamChunk(
s.completionID,
s.created,
s.model,
[]map[string]any{openaifmt.BuildChatStreamFinishChoice(0, finishReason)},
[]map[string]any{openaifmt.BuildChatStreamFinishChoice(0, outcome.FinishReason)},
usage,
))
s.sendDone()
@@ -265,7 +295,7 @@ func (s *chatStreamRuntime) onParsed(parsed sse.LineResult) streamengine.ParsedD
s.responseMessageID = parsed.ResponseMessageID
}
if parsed.ContentFilter {
if strings.TrimSpace(s.text.String()) == "" {
if strings.TrimSpace(s.accumulator.Text.String()) == "" {
return streamengine.ParsedDecision{Stop: true, StopReason: streamengine.StopReason("content_filter")}
}
return streamengine.ParsedDecision{Stop: true, StopReason: streamengine.StopReasonHandlerRequested}
@@ -277,98 +307,65 @@ func (s *chatStreamRuntime) onParsed(parsed sse.LineResult) streamengine.ParsedD
return streamengine.ParsedDecision{Stop: true, StopReason: streamengine.StopReasonHandlerRequested}
}
contentSeen := false
batch := chatDeltaBatch{runtime: s}
for _, p := range parsed.ToolDetectionThinkingParts {
trimmed := sse.TrimContinuationOverlap(s.toolDetectionThinking.String(), p.Text)
if trimmed != "" {
s.toolDetectionThinking.WriteString(trimmed)
}
}
for _, p := range parsed.Parts {
accumulated := s.accumulator.Apply(parsed)
for _, p := range accumulated.Parts {
if p.Type == "thinking" {
rawTrimmed := sse.TrimContinuationOverlap(s.rawThinking.String(), p.Text)
if rawTrimmed != "" {
s.rawThinking.WriteString(rawTrimmed)
contentSeen = true
}
if s.thinkingEnabled {
cleanedText := cleanVisibleOutput(rawTrimmed, s.stripReferenceMarkers)
if cleanedText == "" {
continue
}
trimmed := sse.TrimContinuationOverlap(s.thinking.String(), cleanedText)
if trimmed == "" {
continue
}
s.thinking.WriteString(trimmed)
batch.append("reasoning_content", trimmed)
}
batch.append("reasoning_content", p.VisibleText)
continue
}
if p.RawText == "" {
continue
}
if p.CitationOnly {
continue
}
if !s.bufferToolContent {
batch.append("content", p.VisibleText)
} else {
rawTrimmed := sse.TrimContinuationOverlap(s.rawText.String(), p.Text)
if rawTrimmed == "" {
continue
}
s.rawText.WriteString(rawTrimmed)
contentSeen = true
cleanedText := cleanVisibleOutput(rawTrimmed, s.stripReferenceMarkers)
if s.searchEnabled && sse.IsCitation(cleanedText) {
continue
}
trimmed := sse.TrimContinuationOverlap(s.text.String(), cleanedText)
if trimmed != "" {
s.text.WriteString(trimmed)
}
if !s.bufferToolContent {
if trimmed == "" {
events := toolstream.ProcessChunk(&s.toolSieve, p.RawText, s.toolNames)
for _, evt := range events {
if len(evt.ToolCallDeltas) > 0 {
if !s.emitEarlyToolDeltas {
continue
}
filtered := filterIncrementalToolCallDeltasByAllowed(evt.ToolCallDeltas, s.streamToolNames)
if len(filtered) == 0 {
continue
}
formatted := formatIncrementalStreamToolCallDeltas(filtered, s.streamToolCallIDs)
if len(formatted) == 0 {
continue
}
batch.flush()
tcDelta := map[string]any{
"tool_calls": formatted,
}
s.toolCallsEmitted = true
s.sendDelta(tcDelta)
continue
}
batch.append("content", trimmed)
} else {
events := toolstream.ProcessChunk(&s.toolSieve, rawTrimmed, s.toolNames)
for _, evt := range events {
if len(evt.ToolCallDeltas) > 0 {
if !s.emitEarlyToolDeltas {
continue
}
filtered := filterIncrementalToolCallDeltasByAllowed(evt.ToolCallDeltas, s.streamToolNames)
if len(filtered) == 0 {
continue
}
formatted := formatIncrementalStreamToolCallDeltas(filtered, s.streamToolCallIDs)
if len(formatted) == 0 {
continue
}
batch.flush()
tcDelta := map[string]any{
"tool_calls": formatted,
}
s.toolCallsEmitted = true
s.sendDelta(tcDelta)
if len(evt.ToolCalls) > 0 {
batch.flush()
s.toolCallsEmitted = true
s.toolCallsDoneEmitted = true
tcDelta := map[string]any{
"tool_calls": formatFinalStreamToolCallsWithStableIDs(evt.ToolCalls, s.streamToolCallIDs, s.toolsRaw),
}
s.sendDelta(tcDelta)
s.resetStreamToolCallState()
continue
}
if evt.Content != "" {
cleaned := cleanVisibleOutput(evt.Content, s.stripReferenceMarkers)
if cleaned == "" || (s.searchEnabled && sse.IsCitation(cleaned)) {
continue
}
if len(evt.ToolCalls) > 0 {
batch.flush()
s.toolCallsEmitted = true
s.toolCallsDoneEmitted = true
tcDelta := map[string]any{
"tool_calls": formatFinalStreamToolCallsWithStableIDs(evt.ToolCalls, s.streamToolCallIDs, s.toolsRaw),
}
s.sendDelta(tcDelta)
s.resetStreamToolCallState()
continue
}
if evt.Content != "" {
cleaned := cleanVisibleOutput(evt.Content, s.stripReferenceMarkers)
if cleaned == "" || (s.searchEnabled && sse.IsCitation(cleaned)) {
continue
}
batch.append("content", cleaned)
}
batch.append("content", cleaned)
}
}
}
}
batch.flush()
return streamengine.ParsedDecision{ContentSeen: contentSeen}
return streamengine.ParsedDecision{ContentSeen: accumulated.ContentSeen}
}
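
The `finalize` hunks above move the finish-reason and empty-output decision out of the runtime and into `assistantturn.FinalizeTurn`, whose implementation is not part of this diff. As a rough sketch of the inline rules being removed (any tool call forces `tool_calls`; blank text with no tool call is a failure), here is a hypothetical `decideFinish` helper. It is not the project's code and ignores tool-choice policy and content filtering.

```go
package main

import (
	"fmt"
	"strings"
)

// decideFinish is a hypothetical helper, not the project's FinalizeTurn.
// It mirrors the inline rules the diff removes from finalize:
//   - detected or already-emitted tool calls force "tool_calls";
//   - blank visible text with no tool call is reported as a failure.
func decideFinish(base, text string, toolCalls int, alreadyEmitted bool) (finishReason string, shouldFail bool) {
	if toolCalls > 0 || alreadyEmitted {
		return "tool_calls", false
	}
	if strings.TrimSpace(text) == "" {
		return base, true
	}
	return base, false
}

func main() {
	fmt.Println(decideFinish("stop", "hello", 0, false))
	fmt.Println(decideFinish("stop", "", 1, false))
	fmt.Println(decideFinish("stop", "   ", 0, false))
}
```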


@@ -0,0 +1,87 @@
package chat
import (
"net/http"
"net/http/httptest"
"strings"
"testing"
"time"
"ds2api/internal/promptcompat"
)
func TestChatStreamKeepAliveEmitsEmptyChoiceDataFrame(t *testing.T) {
rec := httptest.NewRecorder()
runtime := newChatStreamRuntime(
rec,
http.NewResponseController(rec),
true,
"chatcmpl-test",
time.Now().Unix(),
"deepseek-v4-flash",
"prompt",
false,
false,
true,
nil,
nil,
promptcompat.DefaultToolChoicePolicy(),
false,
false,
)
runtime.sendKeepAlive()
body := rec.Body.String()
if !strings.Contains(body, ": keep-alive\n\n") {
t.Fatalf("expected keep-alive comment, got %q", body)
}
frames, done := parseSSEDataFrames(t, body)
if done {
t.Fatalf("keep-alive must not emit [DONE], body=%q", body)
}
if len(frames) != 1 {
t.Fatalf("expected one data frame, got %d body=%q", len(frames), body)
}
if got := asString(frames[0]["id"]); got != "chatcmpl-test" {
t.Fatalf("expected completion id to be preserved, got %q", got)
}
if got := asString(frames[0]["object"]); got != "chat.completion.chunk" {
t.Fatalf("expected chat chunk object, got %q", got)
}
choices, _ := frames[0]["choices"].([]any)
if len(choices) != 0 {
t.Fatalf("expected empty choices heartbeat, got %#v", choices)
}
}
func TestChatStreamFinalizeEnforcesRequiredToolChoice(t *testing.T) {
rec := httptest.NewRecorder()
runtime := newChatStreamRuntime(
rec,
http.NewResponseController(rec),
true,
"chatcmpl-test",
time.Now().Unix(),
"deepseek-v4-flash",
"prompt",
false,
false,
true,
[]string{"Write"},
nil,
promptcompat.ToolChoicePolicy{Mode: promptcompat.ToolChoiceRequired},
true,
false,
)
if !runtime.finalize("stop", false) {
t.Fatalf("expected terminal error to be written")
}
if runtime.finalErrorCode != "tool_choice_violation" {
t.Fatalf("expected tool_choice_violation, got %q body=%s", runtime.finalErrorCode, rec.Body.String())
}
if !strings.Contains(rec.Body.String(), "tool_choice requires") {
t.Fatalf("expected tool choice error in stream body, got %s", rec.Body.String())
}
}


@@ -7,10 +7,12 @@ import (
"strings"
"time"
"ds2api/internal/assistantturn"
"ds2api/internal/auth"
"ds2api/internal/config"
dsprotocol "ds2api/internal/deepseek/protocol"
openaifmt "ds2api/internal/format/openai"
"ds2api/internal/promptcompat"
"ds2api/internal/sse"
streamengine "ds2api/internal/stream"
)
@@ -26,6 +28,7 @@ type chatNonStreamResult struct {
body map[string]any
finishReason string
responseMessageID int
outputError *assistantturn.OutputError
}
func (h *Handler) handleNonStreamWithRetry(w http.ResponseWriter, ctx context.Context, a *auth.RequestAuth, resp *http.Response, payload map[string]any, pow, completionID, model, finalPrompt string, refFileTokens int, thinkingEnabled, searchEnabled bool, toolNames []string, toolsRaw any, historySession *chatHistorySession) {
@@ -86,35 +89,40 @@ func (h *Handler) collectChatNonStreamAttempt(w http.ResponseWriter, resp *http.
return chatNonStreamResult{}, false
}
result := sse.CollectStream(resp, thinkingEnabled, true)
stripReferenceMarkers := h.compatStripReferenceMarkers()
finalThinking := cleanVisibleOutput(result.Thinking, stripReferenceMarkers)
finalText := cleanVisibleOutput(result.Text, stripReferenceMarkers)
if searchEnabled {
finalText = replaceCitationMarkersWithLinks(finalText, result.CitationLinks)
}
detected := detectAssistantToolCalls(result.Text, finalText, result.Thinking, result.ToolDetectionThinking, toolNames)
respBody := openaifmt.BuildChatCompletionWithToolCalls(completionID, model, usagePrompt, finalThinking, finalText, detected.Calls, toolsRaw)
turn := assistantturn.BuildTurnFromCollected(result, assistantturn.BuildOptions{
Model: model,
Prompt: usagePrompt,
SearchEnabled: searchEnabled,
StripReferenceMarkers: stripReferenceMarkersEnabled(),
ToolNames: toolNames,
ToolsRaw: toolsRaw,
})
respBody := openaifmt.BuildChatCompletionWithToolCalls(completionID, model, usagePrompt, turn.Thinking, turn.Text, turn.ToolCalls, toolsRaw)
return chatNonStreamResult{
rawThinking: result.Thinking,
rawText: result.Text,
thinking: finalThinking,
thinking: turn.Thinking,
toolDetectionThinking: result.ToolDetectionThinking,
text: finalText,
text: turn.Text,
contentFilter: result.ContentFilter,
detectedCalls: len(detected.Calls),
detectedCalls: len(turn.ToolCalls),
body: respBody,
finishReason: chatFinishReason(respBody),
responseMessageID: result.ResponseMessageID,
outputError: turn.Error,
}, true
}
func (h *Handler) finishChatNonStreamResult(w http.ResponseWriter, result chatNonStreamResult, attempts int, usagePrompt string, refFileTokens int, historySession *chatHistorySession) {
if result.detectedCalls == 0 && shouldWriteUpstreamEmptyOutputError(result.text) {
if result.detectedCalls == 0 && strings.TrimSpace(result.text) == "" {
status, message, code := upstreamEmptyOutputDetail(result.contentFilter, result.text, result.thinking)
if result.outputError != nil {
status, message, code = result.outputError.Status, result.outputError.Message, result.outputError.Code
}
if historySession != nil {
historySession.error(status, message, code, result.thinking, result.text)
}
writeUpstreamEmptyOutputError(w, result.text, result.thinking, result.contentFilter)
writeOpenAIErrorWithCode(w, status, message, code)
config.Logger.Info("[openai_empty_retry] terminal empty output", "surface", "chat.completions", "stream", false, "retry_attempts", attempts, "success_source", "none", "content_filter", result.contentFilter)
return
}
@@ -146,8 +154,8 @@ func shouldRetryChatNonStream(result chatNonStreamResult, attempts int) bool {
strings.TrimSpace(result.text) == ""
}
func (h *Handler) handleStreamWithRetry(w http.ResponseWriter, r *http.Request, a *auth.RequestAuth, resp *http.Response, payload map[string]any, pow, completionID, model, finalPrompt string, refFileTokens int, thinkingEnabled, searchEnabled bool, toolNames []string, toolsRaw any, historySession *chatHistorySession) {
streamRuntime, initialType, ok := h.prepareChatStreamRuntime(w, resp, completionID, model, finalPrompt, refFileTokens, thinkingEnabled, searchEnabled, toolNames, toolsRaw, historySession)
func (h *Handler) handleStreamWithRetry(w http.ResponseWriter, r *http.Request, a *auth.RequestAuth, resp *http.Response, payload map[string]any, pow, completionID, model, finalPrompt string, refFileTokens int, thinkingEnabled, searchEnabled bool, toolNames []string, toolsRaw any, toolChoice promptcompat.ToolChoicePolicy, historySession *chatHistorySession) {
streamRuntime, initialType, ok := h.prepareChatStreamRuntime(w, resp, completionID, model, finalPrompt, refFileTokens, thinkingEnabled, searchEnabled, toolNames, toolsRaw, toolChoice, historySession)
if !ok {
return
}
@@ -189,7 +197,7 @@ func (h *Handler) handleStreamWithRetry(w http.ResponseWriter, r *http.Request,
}
}
func (h *Handler) prepareChatStreamRuntime(w http.ResponseWriter, resp *http.Response, completionID, model, finalPrompt string, refFileTokens int, thinkingEnabled, searchEnabled bool, toolNames []string, toolsRaw any, historySession *chatHistorySession) (*chatStreamRuntime, string, bool) {
func (h *Handler) prepareChatStreamRuntime(w http.ResponseWriter, resp *http.Response, completionID, model, finalPrompt string, refFileTokens int, thinkingEnabled, searchEnabled bool, toolNames []string, toolsRaw any, toolChoice promptcompat.ToolChoicePolicy, historySession *chatHistorySession) (*chatStreamRuntime, string, bool) {
if resp.StatusCode != http.StatusOK {
defer func() { _ = resp.Body.Close() }()
body, _ := io.ReadAll(resp.Body)
@@ -214,7 +222,8 @@ func (h *Handler) prepareChatStreamRuntime(w http.ResponseWriter, resp *http.Res
}
streamRuntime := newChatStreamRuntime(
w, rc, canFlush, completionID, time.Now().Unix(), model, finalPrompt,
thinkingEnabled, searchEnabled, h.compatStripReferenceMarkers(), toolNames, toolsRaw,
thinkingEnabled, searchEnabled, stripReferenceMarkersEnabled(), toolNames, toolsRaw,
toolChoice,
len(toolNames) > 0, h.toolcallFeatureMatchEnabled() && h.toolcallEarlyEmitHighConfidence(),
)
streamRuntime.refFileTokens = refFileTokens
@@ -237,7 +246,7 @@ func (h *Handler) consumeChatStreamAttempt(r *http.Request, resp *http.Response,
OnParsed: func(parsed sse.LineResult) streamengine.ParsedDecision {
decision := streamRuntime.onParsed(parsed)
if historySession != nil {
historySession.progress(streamRuntime.thinking.String(), streamRuntime.text.String())
historySession.progress(streamRuntime.accumulator.Thinking.String(), streamRuntime.accumulator.Text.String())
}
return decision
},
@@ -249,7 +258,7 @@ func (h *Handler) consumeChatStreamAttempt(r *http.Request, resp *http.Response,
OnContextDone: func() {
streamRuntime.markContextCancelled()
if historySession != nil {
historySession.stopped(streamRuntime.thinking.String(), streamRuntime.text.String(), string(streamengine.StopReasonContextCancelled))
historySession.stopped(streamRuntime.accumulator.Thinking.String(), streamRuntime.accumulator.Text.String(), string(streamengine.StopReasonContextCancelled))
}
},
})
@@ -269,7 +278,7 @@ func recordChatStreamHistory(streamRuntime *chatStreamRuntime, historySession *c
return
}
if streamRuntime.finalErrorMessage != "" {
historySession.error(streamRuntime.finalErrorStatus, streamRuntime.finalErrorMessage, streamRuntime.finalErrorCode, streamRuntime.thinking.String(), streamRuntime.text.String())
historySession.error(streamRuntime.finalErrorStatus, streamRuntime.finalErrorMessage, streamRuntime.finalErrorCode, streamRuntime.accumulator.Thinking.String(), streamRuntime.accumulator.Text.String())
return
}
historySession.success(http.StatusOK, streamRuntime.finalThinking, streamRuntime.finalText, streamRuntime.finalFinishReason, streamRuntime.finalUsage)
@@ -278,7 +287,7 @@ func recordChatStreamHistory(streamRuntime *chatStreamRuntime, historySession *c
func failChatStreamRetry(streamRuntime *chatStreamRuntime, historySession *chatHistorySession, status int, message, code string) {
streamRuntime.sendFailedChunk(status, message, code)
if historySession != nil {
historySession.error(status, message, code, streamRuntime.thinking.String(), streamRuntime.text.String())
historySession.error(status, message, code, streamRuntime.accumulator.Thinking.String(), streamRuntime.accumulator.Text.String())
}
}


@@ -8,6 +8,7 @@ import (
"time"
"ds2api/internal/chathistory"
"ds2api/internal/promptcompat"
"ds2api/internal/stream"
)
@@ -48,6 +49,7 @@ func TestConsumeChatStreamAttemptMarksContextCancelledState(t *testing.T) {
true,
nil,
nil,
promptcompat.DefaultToolChoicePolicy(),
false,
false,
)


@@ -12,6 +12,7 @@ import (
"ds2api/internal/httpapi/openai/history"
"ds2api/internal/httpapi/openai/shared"
"ds2api/internal/promptcompat"
"ds2api/internal/textclean"
"ds2api/internal/toolcall"
"ds2api/internal/toolstream"
)
@@ -35,11 +36,8 @@ type streamLease struct {
ExpiresAt time.Time
}
func (h *Handler) compatStripReferenceMarkers() bool {
if h == nil {
return true
}
return shared.CompatStripReferenceMarkers(h.Store)
func stripReferenceMarkersEnabled() bool {
return textclean.StripReferenceMarkersEnabled()
}
func (h *Handler) applyCurrentInputFile(ctx context.Context, a *auth.RequestAuth, stdReq promptcompat.StandardRequest) (promptcompat.StandardRequest, error) {
@@ -80,6 +78,10 @@ func writeOpenAIError(w http.ResponseWriter, status int, message string) {
shared.WriteOpenAIError(w, status, message)
}
func writeOpenAIErrorWithCode(w http.ResponseWriter, status int, message, code string) {
shared.WriteOpenAIErrorWithCode(w, status, message, code)
}
func openAIErrorType(status int) string {
return shared.OpenAIErrorType(status)
}
@@ -104,22 +106,10 @@ func cleanVisibleOutput(text string, stripReferenceMarkers bool) string {
return shared.CleanVisibleOutput(text, stripReferenceMarkers)
}
func replaceCitationMarkersWithLinks(text string, links map[int]string) string {
return shared.ReplaceCitationMarkersWithLinks(text, links)
}
func shouldWriteUpstreamEmptyOutputError(text string) bool {
return shared.ShouldWriteUpstreamEmptyOutputError(text)
}
func upstreamEmptyOutputDetail(contentFilter bool, text, thinking string) (int, string, string) {
return shared.UpstreamEmptyOutputDetail(contentFilter, text, thinking)
}
func writeUpstreamEmptyOutputError(w http.ResponseWriter, text, thinking string, contentFilter bool) bool {
return shared.WriteUpstreamEmptyOutputError(w, text, thinking, contentFilter)
}
func emptyOutputRetryEnabled() bool {
return shared.EmptyOutputRetryEnabled()
}

@@ -8,7 +8,9 @@ import (
"strings"
"time"
"ds2api/internal/assistantturn"
"ds2api/internal/auth"
"ds2api/internal/completionruntime"
"ds2api/internal/config"
dsprotocol "ds2api/internal/deepseek/protocol"
openaifmt "ds2api/internal/format/openai"
@@ -76,44 +78,44 @@ func (h *Handler) ChatCompletions(w http.ResponseWriter, r *http.Request) {
}
historySession := startChatHistory(h.ChatHistory, r, a, stdReq)
sessionID, err = h.DS.CreateSession(r.Context(), a, 3)
if err != nil {
if a.UseConfigToken {
if !stdReq.Stream {
result, outErr := completionruntime.ExecuteNonStreamWithRetry(r.Context(), h.DS, a, stdReq, completionruntime.Options{
StripReferenceMarkers: stripReferenceMarkersEnabled(),
RetryEnabled: true,
CurrentInputFile: h.Store,
})
sessionID = result.SessionID
if outErr != nil {
if historySession != nil {
historySession.error(http.StatusUnauthorized, "Account token is invalid. Please re-login the account in admin.", "error", "", "")
historySession.error(outErr.Status, outErr.Message, outErr.Code, result.Turn.Thinking, result.Turn.Text)
}
writeOpenAIError(w, http.StatusUnauthorized, "Account token is invalid. Please re-login the account in admin.")
} else {
if historySession != nil {
historySession.error(http.StatusUnauthorized, "Invalid token. If this should be a DS2API key, add it to config.keys first.", "error", "", "")
}
writeOpenAIError(w, http.StatusUnauthorized, "Invalid token. If this should be a DS2API key, add it to config.keys first.")
writeOpenAIErrorWithCode(w, outErr.Status, outErr.Message, outErr.Code)
return
}
return
}
pow, err := h.DS.GetPow(r.Context(), a, 3)
if err != nil {
respBody := openaifmt.BuildChatCompletionWithToolCalls(result.SessionID, stdReq.ResponseModel, result.Turn.Prompt, result.Turn.Thinking, result.Turn.Text, result.Turn.ToolCalls, stdReq.ToolsRaw)
respBody["usage"] = assistantturn.OpenAIChatUsage(result.Turn)
finishReason := assistantturn.FinalizeTurn(result.Turn, assistantturn.FinalizeOptions{}).FinishReason
if historySession != nil {
historySession.error(http.StatusUnauthorized, "Failed to get PoW (invalid token or unknown error).", "error", "", "")
historySession.success(http.StatusOK, result.Turn.Thinking, result.Turn.Text, finishReason, assistantturn.OpenAIChatUsage(result.Turn))
}
writeOpenAIError(w, http.StatusUnauthorized, "Failed to get PoW (invalid token or unknown error).")
writeJSON(w, http.StatusOK, respBody)
return
}
payload := stdReq.CompletionPayload(sessionID)
resp, err := h.DS.CallCompletion(r.Context(), a, payload, pow, 3)
if err != nil {
start, outErr := completionruntime.StartCompletion(r.Context(), h.DS, a, stdReq, completionruntime.Options{
CurrentInputFile: h.Store,
})
sessionID = start.SessionID
if outErr != nil {
if historySession != nil {
historySession.error(http.StatusInternalServerError, "Failed to get completion.", "error", "", "")
historySession.error(outErr.Status, outErr.Message, outErr.Code, "", "")
}
writeOpenAIError(w, http.StatusInternalServerError, "Failed to get completion.")
writeOpenAIErrorWithCode(w, outErr.Status, outErr.Message, outErr.Code)
return
}
refFileTokens := stdReq.RefFileTokens
if stdReq.Stream {
h.handleStreamWithRetry(w, r, a, resp, payload, pow, sessionID, stdReq.ResponseModel, stdReq.PromptTokenText, refFileTokens, stdReq.Thinking, stdReq.Search, stdReq.ToolNames, stdReq.ToolsRaw, historySession)
return
}
h.handleNonStreamWithRetry(w, r.Context(), a, resp, payload, pow, sessionID, stdReq.ResponseModel, stdReq.PromptTokenText, refFileTokens, stdReq.Thinking, stdReq.Search, stdReq.ToolNames, stdReq.ToolsRaw, historySession)
streamReq := start.Request
refFileTokens := streamReq.RefFileTokens
h.handleStreamWithRetry(w, r, a, start.Response, start.Payload, start.Pow, sessionID, streamReq.ResponseModel, streamReq.PromptTokenText, refFileTokens, streamReq.Thinking, streamReq.Search, streamReq.ToolNames, streamReq.ToolsRaw, streamReq.ToolChoice, historySession)
}
func (h *Handler) autoDeleteRemoteSession(ctx context.Context, a *auth.RequestAuth, sessionID string) {
@@ -161,33 +163,29 @@ func (h *Handler) handleNonStream(w http.ResponseWriter, resp *http.Response, co
}
result := sse.CollectStream(resp, thinkingEnabled, true)
stripReferenceMarkers := h.compatStripReferenceMarkers()
finalThinking := cleanVisibleOutput(result.Thinking, stripReferenceMarkers)
finalText := cleanVisibleOutput(result.Text, stripReferenceMarkers)
if searchEnabled {
finalText = replaceCitationMarkersWithLinks(finalText, result.CitationLinks)
}
detected := detectAssistantToolCalls(result.Text, finalText, result.Thinking, result.ToolDetectionThinking, toolNames)
if shouldWriteUpstreamEmptyOutputError(finalText) && len(detected.Calls) == 0 {
status, message, code := upstreamEmptyOutputDetail(result.ContentFilter, finalText, finalThinking)
turn := assistantturn.BuildTurnFromCollected(result, assistantturn.BuildOptions{
Model: model,
Prompt: finalPrompt,
RefFileTokens: refFileTokens,
SearchEnabled: searchEnabled,
StripReferenceMarkers: stripReferenceMarkersEnabled(),
ToolNames: toolNames,
ToolsRaw: toolsRaw,
ToolChoice: promptcompat.DefaultToolChoicePolicy(),
})
outcome := assistantturn.FinalizeTurn(turn, assistantturn.FinalizeOptions{})
if outcome.ShouldFail {
status, message, code := outcome.Error.Status, outcome.Error.Message, outcome.Error.Code
if historySession != nil {
historySession.error(status, message, code, finalThinking, finalText)
historySession.error(status, message, code, turn.Thinking, turn.Text)
}
writeUpstreamEmptyOutputError(w, finalText, finalThinking, result.ContentFilter)
writeOpenAIErrorWithCode(w, status, message, code)
return
}
respBody := openaifmt.BuildChatCompletionWithToolCalls(completionID, model, finalPrompt, finalThinking, finalText, detected.Calls, toolsRaw)
if refFileTokens > 0 {
addRefFileTokensToUsage(respBody, refFileTokens)
}
finishReason := "stop"
if choices, ok := respBody["choices"].([]map[string]any); ok && len(choices) > 0 {
if fr, _ := choices[0]["finish_reason"].(string); strings.TrimSpace(fr) != "" {
finishReason = fr
}
}
respBody := openaifmt.BuildChatCompletionWithToolCalls(completionID, model, finalPrompt, turn.Thinking, turn.Text, turn.ToolCalls, toolsRaw)
respBody["usage"] = assistantturn.OpenAIChatUsage(turn)
if historySession != nil {
historySession.success(http.StatusOK, finalThinking, finalText, finishReason, openaifmt.BuildChatUsageForModel(model, finalPrompt, finalThinking, finalText, refFileTokens))
historySession.success(http.StatusOK, turn.Thinking, turn.Text, outcome.FinishReason, assistantturn.OpenAIChatUsage(turn))
}
writeJSON(w, http.StatusOK, respBody)
}
@@ -215,7 +213,7 @@ func (h *Handler) handleStream(w http.ResponseWriter, r *http.Request, resp *htt
created := time.Now().Unix()
bufferToolContent := len(toolNames) > 0
emitEarlyToolDeltas := h.toolcallFeatureMatchEnabled() && h.toolcallEarlyEmitHighConfidence()
stripReferenceMarkers := h.compatStripReferenceMarkers()
stripReferenceMarkers := stripReferenceMarkersEnabled()
initialType := "text"
if thinkingEnabled {
initialType = "thinking"
@@ -234,6 +232,7 @@ func (h *Handler) handleStream(w http.ResponseWriter, r *http.Request, resp *htt
stripReferenceMarkers,
toolNames,
toolsRaw,
promptcompat.DefaultToolChoicePolicy(),
bufferToolContent,
emitEarlyToolDeltas,
)
@@ -254,7 +253,7 @@ func (h *Handler) handleStream(w http.ResponseWriter, r *http.Request, resp *htt
OnParsed: func(parsed sse.LineResult) streamengine.ParsedDecision {
decision := streamRuntime.onParsed(parsed)
if historySession != nil {
historySession.progress(streamRuntime.thinking.String(), streamRuntime.text.String())
historySession.progress(streamRuntime.accumulator.Thinking.String(), streamRuntime.accumulator.Text.String())
}
return decision
},
@@ -268,14 +267,14 @@ func (h *Handler) handleStream(w http.ResponseWriter, r *http.Request, resp *htt
return
}
if streamRuntime.finalErrorMessage != "" {
historySession.error(streamRuntime.finalErrorStatus, streamRuntime.finalErrorMessage, streamRuntime.finalErrorCode, streamRuntime.thinking.String(), streamRuntime.text.String())
historySession.error(streamRuntime.finalErrorStatus, streamRuntime.finalErrorMessage, streamRuntime.finalErrorCode, streamRuntime.accumulator.Thinking.String(), streamRuntime.accumulator.Text.String())
return
}
historySession.success(http.StatusOK, streamRuntime.finalThinking, streamRuntime.finalText, streamRuntime.finalFinishReason, streamRuntime.finalUsage)
},
OnContextDone: func() {
if historySession != nil {
historySession.stopped(streamRuntime.thinking.String(), streamRuntime.text.String(), string(streamengine.StopReasonContextCancelled))
historySession.stopped(streamRuntime.accumulator.Thinking.String(), streamRuntime.accumulator.Text.String(), string(streamengine.StopReasonContextCancelled))
}
},
})

@@ -75,7 +75,6 @@ func TestChatCompletionsAutoDeleteModes(t *testing.T) {
}
h := &Handler{
Store: mockOpenAIConfig{
wideInput: true,
autoDeleteMode: tc.mode,
},
Auth: streamStatusAuthStub{},
@@ -123,7 +122,6 @@ func TestAutoDeleteRemoteSessionIgnoresCanceledParentContext(t *testing.T) {
ds := &autoDeleteCtxDSStub{}
h := &Handler{
Store: mockOpenAIConfig{
wideInput: true,
autoDeleteMode: "single",
},
DS: ds,

@@ -12,25 +12,18 @@ import (
type mockOpenAIConfig struct {
aliases map[string]string
wideInput bool
autoDeleteMode string
toolMode string
earlyEmit string
responsesTTL int
embedProv string
historySplitEnabled bool
historySplitTurns int
currentInputEnabled bool
currentInputMin int
thinkingInjection *bool
thinkingPrompt string
}
func (m mockOpenAIConfig) ModelAliases() map[string]string { return m.aliases }
func (m mockOpenAIConfig) CompatWideInputStrictOutput() bool {
return m.wideInput
}
func (m mockOpenAIConfig) CompatStripReferenceMarkers() bool { return true }
func (m mockOpenAIConfig) ModelAliases() map[string]string { return m.aliases }
func (m mockOpenAIConfig) ToolcallMode() string { return m.toolMode }
func (m mockOpenAIConfig) ToolcallEarlyEmitConfidence() string { return m.earlyEmit }
func (m mockOpenAIConfig) ResponsesStoreTTLSeconds() int { return m.responsesTTL }
@@ -41,14 +34,7 @@ func (m mockOpenAIConfig) AutoDeleteMode() string {
}
return m.autoDeleteMode
}
func (m mockOpenAIConfig) AutoDeleteSessions() bool { return false }
func (m mockOpenAIConfig) HistorySplitEnabled() bool { return m.historySplitEnabled }
func (m mockOpenAIConfig) HistorySplitTriggerAfterTurns() int {
if m.historySplitTurns <= 0 {
return 1
}
return m.historySplitTurns
}
func (m mockOpenAIConfig) AutoDeleteSessions() bool { return false }
func (m mockOpenAIConfig) CurrentInputFileEnabled() bool { return m.currentInputEnabled }
func (m mockOpenAIConfig) CurrentInputFileMinChars() int {
return m.currentInputMin

@@ -94,7 +94,6 @@ func TestHandleVercelStreamPrepareAppliesCurrentInputFile(t *testing.T) {
ds := &inlineUploadDSStub{}
h := &Handler{
Store: mockOpenAIConfig{
wideInput: true,
currentInputEnabled: true,
},
Auth: streamStatusAuthStub{},
@@ -151,7 +150,6 @@ func TestHandleVercelStreamPrepareMapsCurrentInputFileManagedAuthFailureTo401(t
}
h := &Handler{
Store: mockOpenAIConfig{
wideInput: true,
currentInputEnabled: true,
},
Auth: streamStatusManagedAuthStub{},

@@ -109,13 +109,10 @@ func (h *Handler) handleVercelStreamPrepare(w http.ResponseWriter, r *http.Reque
"final_prompt": stdReq.FinalPrompt,
"thinking_enabled": stdReq.Thinking,
"search_enabled": stdReq.Search,
"compat": map[string]any{
"strip_reference_markers": h.compatStripReferenceMarkers(),
},
"tool_names": stdReq.ToolNames,
"deepseek_token": a.DeepSeekToken,
"pow_header": powHeader,
"payload": payload,
"tool_names": stdReq.ToolNames,
"deepseek_token": a.DeepSeekToken,
"pow_header": powHeader,
"payload": payload,
})
}

@@ -1,6 +1,7 @@
package openai
import (
"strings"
"testing"
"ds2api/internal/promptcompat"
@@ -8,25 +9,18 @@ import (
type mockOpenAIConfig struct {
aliases map[string]string
wideInput bool
autoDeleteMode string
toolMode string
earlyEmit string
responsesTTL int
embedProv string
historySplitEnabled bool
historySplitTurns int
currentInputEnabled bool
currentInputMin int
thinkingInjection *bool
thinkingPrompt string
}
func (m mockOpenAIConfig) ModelAliases() map[string]string { return m.aliases }
func (m mockOpenAIConfig) CompatWideInputStrictOutput() bool {
return m.wideInput
}
func (m mockOpenAIConfig) CompatStripReferenceMarkers() bool { return true }
func (m mockOpenAIConfig) ModelAliases() map[string]string { return m.aliases }
func (m mockOpenAIConfig) ToolcallMode() string { return m.toolMode }
func (m mockOpenAIConfig) ToolcallEarlyEmitConfidence() string { return m.earlyEmit }
func (m mockOpenAIConfig) ResponsesStoreTTLSeconds() int { return m.responsesTTL }
@@ -37,14 +31,7 @@ func (m mockOpenAIConfig) AutoDeleteMode() string {
}
return m.autoDeleteMode
}
func (m mockOpenAIConfig) AutoDeleteSessions() bool { return false }
func (m mockOpenAIConfig) HistorySplitEnabled() bool { return m.historySplitEnabled }
func (m mockOpenAIConfig) HistorySplitTriggerAfterTurns() int {
if m.historySplitTurns <= 0 {
return 1
}
return m.historySplitTurns
}
func (m mockOpenAIConfig) AutoDeleteSessions() bool { return false }
func (m mockOpenAIConfig) CurrentInputFileEnabled() bool { return m.currentInputEnabled }
func (m mockOpenAIConfig) CurrentInputFileMinChars() int {
return m.currentInputMin
@@ -62,7 +49,6 @@ func TestNormalizeOpenAIChatRequestWithConfigInterface(t *testing.T) {
aliases: map[string]string{
"my-model": "deepseek-v4-flash-search",
},
wideInput: true,
}
req := map[string]any{
"model": "my-model",
@@ -81,7 +67,7 @@ func TestNormalizeOpenAIChatRequestWithConfigInterface(t *testing.T) {
}
func TestNormalizeOpenAIChatRequestDisablesThinkingForNoThinkingModel(t *testing.T) {
cfg := mockOpenAIConfig{wideInput: true}
cfg := mockOpenAIConfig{}
req := map[string]any{
"model": "deepseek-v4-pro-nothinking",
"messages": []any{map[string]any{"role": "user", "content": "hello"}},
@@ -102,28 +88,22 @@ func TestNormalizeOpenAIChatRequestDisablesThinkingForNoThinkingModel(t *testing
}
}
func TestNormalizeOpenAIResponsesRequestWideInputPolicyFromInterface(t *testing.T) {
func TestNormalizeOpenAIResponsesRequestAlwaysAcceptsWideInput(t *testing.T) {
req := map[string]any{
"model": "deepseek-v4-flash",
"input": "hi",
}
_, err := promptcompat.NormalizeOpenAIResponsesRequest(mockOpenAIConfig{
aliases: map[string]string{},
wideInput: false,
}, req, "")
if err == nil {
t.Fatal("expected error when wide input is disabled and only input is provided")
}
out, err := promptcompat.NormalizeOpenAIResponsesRequest(mockOpenAIConfig{
aliases: map[string]string{},
wideInput: true,
aliases: map[string]string{},
}, req, "")
if err != nil {
t.Fatalf("unexpected error when wide input is enabled: %v", err)
t.Fatalf("unexpected error for wide input request: %v", err)
}
if out.Surface != "openai_responses" {
t.Fatalf("unexpected surface: %q", out.Surface)
}
if !strings.Contains(out.FinalPrompt, "<User>hi") {
t.Fatalf("unexpected final prompt: %q", out.FinalPrompt)
}
}

@@ -151,7 +151,7 @@ func TestPreprocessInlineFileInputsDeduplicatesIdenticalPayloads(t *testing.T) {
func TestChatCompletionsUploadsInlineFilesBeforeCompletion(t *testing.T) {
ds := &inlineUploadDSStub{}
h := &openAITestSurface{Store: mockOpenAIConfig{wideInput: true}, Auth: streamStatusAuthStub{}, DS: ds}
h := &openAITestSurface{Store: mockOpenAIConfig{}, Auth: streamStatusAuthStub{}, DS: ds}
reqBody := `{"model":"deepseek-v4-vision","messages":[{"role":"user","content":[{"type":"input_text","text":"hi"},{"type":"image_url","image_url":{"url":"data:image/png;base64,QUJDRA=="}}]}],"stream":false}`
req := httptest.NewRequest(http.MethodPost, "/v1/chat/completions", strings.NewReader(reqBody))
req.Header.Set("Authorization", "Bearer direct-token")
@@ -180,7 +180,7 @@ func TestChatCompletionsUploadsInlineFilesBeforeCompletion(t *testing.T) {
func TestResponsesUploadsInlineFilesBeforeCompletion(t *testing.T) {
ds := &inlineUploadDSStub{}
h := &openAITestSurface{Store: mockOpenAIConfig{wideInput: true}, Auth: streamStatusAuthStub{}, DS: ds}
h := &openAITestSurface{Store: mockOpenAIConfig{}, Auth: streamStatusAuthStub{}, DS: ds}
r := chi.NewRouter()
registerOpenAITestRoutes(r, h)
reqBody := `{"model":"deepseek-v4-pro","input":[{"role":"user","content":[{"type":"input_text","text":"hi"},{"type":"input_image","image_url":{"url":"data:image/png;base64,QUJDRA=="}}]}],"stream":false}`
@@ -208,7 +208,7 @@ func TestResponsesUploadsInlineFilesBeforeCompletion(t *testing.T) {
func TestChatCompletionsInlineUploadFailureReturnsBadRequest(t *testing.T) {
ds := &inlineUploadDSStub{}
h := &openAITestSurface{Store: mockOpenAIConfig{wideInput: true}, Auth: streamStatusAuthStub{}, DS: ds}
h := &openAITestSurface{Store: mockOpenAIConfig{}, Auth: streamStatusAuthStub{}, DS: ds}
reqBody := `{"model":"deepseek-v4-flash","messages":[{"role":"user","content":[{"type":"image_url","image_url":{"url":"data:image/png;base64,%%%"}}]}],"stream":false}`
req := httptest.NewRequest(http.MethodPost, "/v1/chat/completions", strings.NewReader(reqBody))
req.Header.Set("Authorization", "Bearer direct-token")
@@ -227,7 +227,7 @@ func TestChatCompletionsInlineUploadFailureReturnsBadRequest(t *testing.T) {
func TestChatCompletionsInlineUploadLimitReturnsBadRequest(t *testing.T) {
ds := &inlineUploadDSStub{}
h := &openAITestSurface{Store: mockOpenAIConfig{wideInput: true}, Auth: streamStatusAuthStub{}, DS: ds}
h := &openAITestSurface{Store: mockOpenAIConfig{}, Auth: streamStatusAuthStub{}, DS: ds}
content := []any{map[string]any{"type": "input_text", "text": "hi"}}
for i := 0; i < 51; i++ {
content = append(content, map[string]any{
@@ -266,7 +266,7 @@ func TestChatCompletionsInlineUploadLimitReturnsBadRequest(t *testing.T) {
func TestResponsesInlineUploadFailureReturnsInternalServerError(t *testing.T) {
ds := &inlineUploadDSStub{uploadErr: errors.New("boom")}
h := &openAITestSurface{Store: mockOpenAIConfig{wideInput: true}, Auth: streamStatusAuthStub{}, DS: ds}
h := &openAITestSurface{Store: mockOpenAIConfig{}, Auth: streamStatusAuthStub{}, DS: ds}
r := chi.NewRouter()
registerOpenAITestRoutes(r, h)
reqBody := `{"model":"deepseek-v4-flash","input":[{"role":"user","content":[{"type":"image_url","image_url":{"url":"data:image/png;base64,QUJDRA=="}}]}],"stream":false}`
@@ -289,7 +289,7 @@ func TestVercelPrepareUploadsInlineFilesBeforeLeasePayload(t *testing.T) {
t.Setenv("VERCEL", "1")
t.Setenv("DS2API_VERCEL_INTERNAL_SECRET", "stream-secret")
ds := &inlineUploadDSStub{}
h := &openAITestSurface{Store: mockOpenAIConfig{wideInput: true}, Auth: streamStatusAuthStub{}, DS: ds}
h := &openAITestSurface{Store: mockOpenAIConfig{}, Auth: streamStatusAuthStub{}, DS: ds}
r := chi.NewRouter()
registerOpenAITestRoutes(r, h)
reqBody := `{"model":"deepseek-v4-flash","messages":[{"role":"user","content":[{"type":"input_text","text":"hi"},{"type":"image_url","image_url":{"url":"data:image/png;base64,QUJDRA=="}}]}],"stream":true}`

@@ -1,11 +1,15 @@
package files
import (
"context"
"errors"
"io"
"net/http"
"strings"
"time"
"github.com/go-chi/chi/v5"
"ds2api/internal/auth"
"ds2api/internal/chathistory"
"ds2api/internal/config"
@@ -22,6 +26,10 @@ type Handler struct {
ChatHistory *chathistory.Store
}
type fileFetcher interface {
FetchUploadedFile(ctx context.Context, a *auth.RequestAuth, fileID string) (*dsclient.UploadFileResult, error)
}
func (h *Handler) UploadFile(w http.ResponseWriter, r *http.Request) {
a, err := h.Auth.Determine(r)
if err != nil {
@@ -85,6 +93,44 @@ func (h *Handler) UploadFile(w http.ResponseWriter, r *http.Request) {
shared.WriteJSON(w, http.StatusOK, buildOpenAIFileObject(result))
}
func (h *Handler) RetrieveFile(w http.ResponseWriter, r *http.Request) {
a, err := h.Auth.Determine(r)
if err != nil {
status := http.StatusUnauthorized
detail := err.Error()
if err == auth.ErrNoAccount {
status = http.StatusTooManyRequests
}
shared.WriteOpenAIError(w, status, detail)
return
}
defer h.Auth.Release(a)
fileID := strings.TrimSpace(chi.URLParam(r, "file_id"))
if fileID == "" {
shared.WriteOpenAIError(w, http.StatusBadRequest, "file_id is required")
return
}
fetcher, ok := h.DS.(fileFetcher)
if !ok {
shared.WriteOpenAIError(w, http.StatusNotImplemented, "file retrieval is not available")
return
}
result, err := fetcher.FetchUploadedFile(r.Context(), a, fileID)
if err != nil {
if errors.Is(err, dsclient.ErrUploadFileNotFound) {
shared.WriteOpenAIError(w, http.StatusNotFound, "file not found")
return
}
shared.WriteOpenAIError(w, http.StatusInternalServerError, "Failed to retrieve file.")
return
}
if result != nil && result.AccountID == "" {
result.AccountID = a.AccountID
}
shared.WriteJSON(w, http.StatusOK, buildOpenAIFileObject(result))
}
func resolveUploadModelType(store shared.ConfigReader, r *http.Request) string {
for _, candidate := range []string{r.FormValue("model_type"), r.Header.Get("X-Model-Type")} {
if modelType := normalizeUploadModelType(candidate); modelType != "" {

@@ -43,6 +43,7 @@ func (managedFilesAuthStub) Release(_ *auth.RequestAuth) {}
type filesRouteDSStub struct {
lastReq dsclient.UploadFileRequest
upload *dsclient.UploadFileResult
fetched *dsclient.UploadFileResult
err error
}
@@ -65,6 +66,16 @@ func (m *filesRouteDSStub) UploadFile(_ context.Context, _ *auth.RequestAuth, re
return &dsclient.UploadFileResult{ID: "file-123", Filename: req.Filename, Bytes: int64(len(req.Data)), Purpose: req.Purpose, Status: "uploaded"}, nil
}
func (m *filesRouteDSStub) FetchUploadedFile(_ context.Context, _ *auth.RequestAuth, fileID string) (*dsclient.UploadFileResult, error) {
if m.err != nil {
return nil, m.err
}
if m.fetched != nil {
return m.fetched, nil
}
return &dsclient.UploadFileResult{ID: fileID, Filename: "notes.txt", Bytes: 11, Purpose: "assistants", Status: "processed"}, nil
}
func (m *filesRouteDSStub) CallCompletion(_ context.Context, _ *auth.RequestAuth, _ map[string]any, _ string, _ int) (*http.Response, error) {
return nil, errors.New("not implemented")
}
@@ -109,7 +120,7 @@ func newMultipartUploadRequest(t *testing.T, purpose string, filename string, da
func TestFilesRouteUploadSuccess(t *testing.T) {
ds := &filesRouteDSStub{}
h := &openAITestSurface{Store: mockOpenAIConfig{wideInput: true}, Auth: streamStatusAuthStub{}, DS: ds}
h := &openAITestSurface{Store: mockOpenAIConfig{}, Auth: streamStatusAuthStub{}, DS: ds}
r := chi.NewRouter()
registerOpenAITestRoutes(r, h)
@@ -149,7 +160,7 @@ func TestFilesRouteUploadSuccess(t *testing.T) {
func TestFilesRouteUploadIncludesAccountIDForManagedAccount(t *testing.T) {
ds := &filesRouteDSStub{}
h := &openAITestSurface{Store: mockOpenAIConfig{wideInput: true}, Auth: managedFilesAuthStub{}, DS: ds}
h := &openAITestSurface{Store: mockOpenAIConfig{}, Auth: managedFilesAuthStub{}, DS: ds}
r := chi.NewRouter()
registerOpenAITestRoutes(r, h)
@@ -169,8 +180,56 @@ func TestFilesRouteUploadIncludesAccountIDForManagedAccount(t *testing.T) {
}
}
func TestFilesRouteRetrieveSuccess(t *testing.T) {
ds := &filesRouteDSStub{fetched: &dsclient.UploadFileResult{
ID: "file-123",
Filename: "notes.txt",
Bytes: 11,
Purpose: "assistants",
Status: "processed",
}}
h := &openAITestSurface{Store: mockOpenAIConfig{}, Auth: managedFilesAuthStub{}, DS: ds}
r := chi.NewRouter()
registerOpenAITestRoutes(r, h)
req := httptest.NewRequest(http.MethodGet, "/v1/files/file-123", nil)
req.Header.Set("Authorization", "Bearer direct-token")
rec := httptest.NewRecorder()
r.ServeHTTP(rec, req)
if rec.Code != http.StatusOK {
t.Fatalf("expected 200, got %d body=%s", rec.Code, rec.Body.String())
}
var out map[string]any
if err := json.Unmarshal(rec.Body.Bytes(), &out); err != nil {
t.Fatalf("decode response failed: %v body=%s", err, rec.Body.String())
}
if out["id"] != "file-123" || out["filename"] != "notes.txt" || out["status"] != "processed" {
t.Fatalf("unexpected file object: %#v", out)
}
if out["account_id"] != "acct-123" {
t.Fatalf("expected account_id acct-123, got %#v", out["account_id"])
}
}
func TestFilesRouteRetrieveNotFound(t *testing.T) {
ds := &filesRouteDSStub{err: dsclient.ErrUploadFileNotFound}
h := &openAITestSurface{Store: mockOpenAIConfig{}, Auth: streamStatusAuthStub{}, DS: ds}
r := chi.NewRouter()
registerOpenAITestRoutes(r, h)
req := httptest.NewRequest(http.MethodGet, "/v1/files/missing-file", nil)
req.Header.Set("Authorization", "Bearer direct-token")
rec := httptest.NewRecorder()
r.ServeHTTP(rec, req)
if rec.Code != http.StatusNotFound {
t.Fatalf("expected 404, got %d body=%s", rec.Code, rec.Body.String())
}
}
func TestFilesRouteRejectsNonMultipart(t *testing.T) {
h := &openAITestSurface{Store: mockOpenAIConfig{wideInput: true}, Auth: streamStatusAuthStub{}, DS: &filesRouteDSStub{}}
h := &openAITestSurface{Store: mockOpenAIConfig{}, Auth: streamStatusAuthStub{}, DS: &filesRouteDSStub{}}
r := chi.NewRouter()
registerOpenAITestRoutes(r, h)
@@ -186,7 +245,7 @@ func TestFilesRouteRejectsNonMultipart(t *testing.T) {
}
func TestFilesRouteRequiresFileField(t *testing.T) {
h := &openAITestSurface{Store: mockOpenAIConfig{wideInput: true}, Auth: streamStatusAuthStub{}, DS: &filesRouteDSStub{}}
h := &openAITestSurface{Store: mockOpenAIConfig{}, Auth: streamStatusAuthStub{}, DS: &filesRouteDSStub{}}
r := chi.NewRouter()
registerOpenAITestRoutes(r, h)

@@ -19,8 +19,22 @@ const (
currentInputPurpose = "assistants"
)
type CurrentInputConfigReader interface {
CurrentInputFileEnabled() bool
CurrentInputFileMinChars() int
}
type CurrentInputUploader interface {
UploadFile(ctx context.Context, a *auth.RequestAuth, req dsclient.UploadFileRequest, maxAttempts int) (*dsclient.UploadFileResult, error)
}
type Service struct {
Store CurrentInputConfigReader
DS CurrentInputUploader
}
func (s Service) ApplyCurrentInputFile(ctx context.Context, a *auth.RequestAuth, stdReq promptcompat.StandardRequest) (promptcompat.StandardRequest, error) {
if s.DS == nil || s.Store == nil || a == nil || !s.Store.CurrentInputFileEnabled() {
if stdReq.CurrentInputFileApplied || s.DS == nil || s.Store == nil || a == nil || !s.Store.CurrentInputFileEnabled() {
return stdReq, nil
}
threshold := s.Store.CurrentInputFileMinChars()
@@ -95,3 +109,20 @@ func latestUserInputForFile(messages []any) (int, string) {
func currentInputFilePrompt() string {
return "Continue from the latest state in the attached DS2API_HISTORY.txt context. Treat it as the current working state and answer the latest user request directly."
}
func prependUniqueRefFileID(existing []string, fileID string) []string {
fileID = strings.TrimSpace(fileID)
if fileID == "" {
return existing
}
out := make([]string, 0, len(existing)+1)
out = append(out, fileID)
for _, id := range existing {
trimmed := strings.TrimSpace(id)
if trimmed == "" || strings.EqualFold(trimmed, fileID) {
continue
}
out = append(out, trimmed)
}
return out
}

@@ -1,90 +0,0 @@
package history
import (
"context"
"strings"
"ds2api/internal/auth"
"ds2api/internal/httpapi/openai/shared"
"ds2api/internal/promptcompat"
)
type Service struct {
Store shared.ConfigReader
DS shared.DeepSeekCaller
}
// Apply is retained for legacy compatibility only. The active split path is
// current input file handling in ApplyCurrentInputFile.
func (s Service) Apply(ctx context.Context, a *auth.RequestAuth, stdReq promptcompat.StandardRequest) (promptcompat.StandardRequest, error) {
return stdReq, nil
}
func SplitOpenAIHistoryMessages(messages []any, triggerAfterTurns int) ([]any, []any) {
if triggerAfterTurns <= 0 {
triggerAfterTurns = 1
}
lastUserIndex := -1
userTurns := 0
for i, raw := range messages {
msg, ok := raw.(map[string]any)
if !ok {
continue
}
role := strings.ToLower(strings.TrimSpace(shared.AsString(msg["role"])))
if role != "user" {
continue
}
userTurns++
lastUserIndex = i
}
if userTurns <= triggerAfterTurns || lastUserIndex < 0 {
return messages, nil
}
promptMessages := make([]any, 0, len(messages)-lastUserIndex)
historyMessages := make([]any, 0, lastUserIndex)
for i, raw := range messages {
msg, ok := raw.(map[string]any)
if !ok {
if i >= lastUserIndex {
promptMessages = append(promptMessages, raw)
} else {
historyMessages = append(historyMessages, raw)
}
continue
}
role := strings.ToLower(strings.TrimSpace(shared.AsString(msg["role"])))
switch role {
case "system", "developer":
promptMessages = append(promptMessages, raw)
default:
if i >= lastUserIndex {
promptMessages = append(promptMessages, raw)
} else {
historyMessages = append(historyMessages, raw)
}
}
}
if len(promptMessages) == 0 {
return messages, nil
}
return promptMessages, historyMessages
}
func prependUniqueRefFileID(existing []string, fileID string) []string {
fileID = strings.TrimSpace(fileID)
if fileID == "" {
return existing
}
out := make([]string, 0, len(existing)+1)
out = append(out, fileID)
for _, id := range existing {
trimmed := strings.TrimSpace(id)
if trimmed == "" || strings.EqualFold(trimmed, fileID) {
continue
}
out = append(out, trimmed)
}
return out
}

@@ -62,8 +62,7 @@ func (streamStatusManagedAuthStub) DetermineCaller(_ *http.Request) (*auth.Reque
func (streamStatusManagedAuthStub) Release(_ *auth.RequestAuth) {}
func TestBuildOpenAICurrentInputContextTranscriptUsesNumberedHistorySections(t *testing.T) {
_, historyMessages := splitOpenAIHistoryMessages(historySplitTestMessages(), 1)
transcript := buildOpenAICurrentInputContextTranscript(historyMessages)
transcript := buildOpenAICurrentInputContextTranscript(historySplitTestMessages())
if strings.Contains(transcript, "[file content end]") || strings.Contains(transcript, "[file content begin]") || strings.Contains(transcript, "[file name]:") {
t.Fatalf("expected transcript without file wrapper tags, got %q", transcript)
@@ -75,11 +74,14 @@ func TestBuildOpenAICurrentInputContextTranscriptUsesNumberedHistorySections(t *
t.Fatalf("expected history transcript description, got %q", transcript)
}
for _, want := range []string{
"=== 1. USER ===",
"=== 2. ASSISTANT ===",
"=== 3. TOOL ===",
"=== 1. SYSTEM ===",
"=== 2. USER ===",
"=== 3. ASSISTANT ===",
"=== 4. TOOL ===",
"=== 5. USER ===",
"first user turn",
"tool result",
"latest user turn",
"[reasoning_content]",
"hidden reasoning",
"<|DSML|tool_calls>",
@@ -90,43 +92,10 @@ func TestBuildOpenAICurrentInputContextTranscriptUsesNumberedHistorySections(t *
}
}
func TestSplitOpenAIHistoryMessagesUsesLatestUserTurn(t *testing.T) {
messages := []any{
map[string]any{"role": "system", "content": "system instructions"},
map[string]any{"role": "user", "content": "first user turn"},
map[string]any{"role": "assistant", "content": "first assistant turn"},
map[string]any{"role": "user", "content": "middle user turn"},
map[string]any{"role": "assistant", "content": "middle assistant turn"},
map[string]any{"role": "user", "content": "latest user turn"},
}
promptMessages, historyMessages := splitOpenAIHistoryMessages(messages, 1)
if len(promptMessages) == 0 || len(historyMessages) == 0 {
t.Fatalf("expected both prompt and history messages, got prompt=%d history=%d", len(promptMessages), len(historyMessages))
}
promptText, _ := promptcompat.BuildOpenAIPrompt(promptMessages, nil, "", defaultToolChoicePolicy(), true)
if !strings.Contains(promptText, "latest user turn") {
t.Fatalf("expected latest user turn in prompt, got %s", promptText)
}
if strings.Contains(promptText, "middle user turn") {
t.Fatalf("expected middle user turn to be moved into history, got %s", promptText)
}
historyText := buildOpenAICurrentInputContextTranscript(historyMessages)
if !strings.Contains(historyText, "middle user turn") {
t.Fatalf("expected middle user turn in split history, got %s", historyText)
}
if strings.Contains(historyText, "latest user turn") {
t.Fatalf("expected latest user turn to remain live, got %s", historyText)
}
}
func TestApplyCurrentInputFileSkipsShortInputWhenThresholdNotReached(t *testing.T) {
ds := &inlineUploadDSStub{}
h := &openAITestSurface{
Store: mockOpenAIConfig{
wideInput: true,
currentInputEnabled: true,
currentInputMin: 10,
},
@@ -159,7 +128,6 @@ func TestApplyThinkingInjectionAppendsLatestUserPrompt(t *testing.T) {
ds := &inlineUploadDSStub{}
h := &openAITestSurface{
Store: mockOpenAIConfig{
wideInput: true,
thinkingInjection: boolPtr(true),
},
DS: ds,
@@ -191,7 +159,6 @@ func TestApplyThinkingInjectionUsesCustomPrompt(t *testing.T) {
ds := &inlineUploadDSStub{}
h := &openAITestSurface{
Store: mockOpenAIConfig{
wideInput: true,
thinkingInjection: boolPtr(true),
thinkingPrompt: "custom thinking format",
},
@@ -221,7 +188,6 @@ func TestApplyCurrentInputFileDisabledPassThrough(t *testing.T) {
ds := &inlineUploadDSStub{}
h := &openAITestSurface{
Store: mockOpenAIConfig{
wideInput: true,
currentInputEnabled: false,
},
DS: ds,
@@ -254,7 +220,6 @@ func TestApplyCurrentInputFileUploadsFirstTurnWithNumberedHistoryTranscript(t *t
ds := &inlineUploadDSStub{}
h := &openAITestSurface{
Store: mockOpenAIConfig{
wideInput: true,
currentInputEnabled: true,
currentInputMin: 10,
thinkingInjection: boolPtr(true),
@@ -324,7 +289,6 @@ func TestApplyCurrentInputFilePreservesFullContextPromptForTokenCounting(t *test
ds := &inlineUploadDSStub{}
h := &openAITestSurface{
Store: mockOpenAIConfig{
wideInput: true,
currentInputEnabled: true,
currentInputMin: 0,
thinkingInjection: boolPtr(true),
@@ -370,7 +334,6 @@ func TestApplyCurrentInputFileUploadsFullContextFile(t *testing.T) {
ds := &inlineUploadDSStub{}
h := &openAITestSurface{
Store: mockOpenAIConfig{
wideInput: true,
currentInputEnabled: true,
currentInputMin: 0,
thinkingInjection: boolPtr(true),
@@ -421,7 +384,6 @@ func TestApplyCurrentInputFileCarriesHistoryText(t *testing.T) {
ds := &inlineUploadDSStub{}
h := &openAITestSurface{
Store: mockOpenAIConfig{
wideInput: true,
currentInputEnabled: true,
},
DS: ds,
@@ -454,7 +416,6 @@ func TestChatCompletionsCurrentInputFileUploadsContextAndKeepsNeutralPrompt(t *t
ds := &inlineUploadDSStub{}
h := &openAITestSurface{
Store: mockOpenAIConfig{
wideInput: true,
currentInputEnabled: true,
},
Auth: streamStatusAuthStub{},
@@ -525,7 +486,6 @@ func TestResponsesCurrentInputFileUploadsContextAndKeepsNeutralPrompt(t *testing
ds := &inlineUploadDSStub{}
h := &openAITestSurface{
Store: mockOpenAIConfig{
wideInput: true,
currentInputEnabled: true,
},
Auth: streamStatusAuthStub{},
@@ -583,7 +543,6 @@ func TestChatCompletionsCurrentInputFileMapsManagedAuthFailureTo401(t *testing.T
}
h := &openAITestSurface{
Store: mockOpenAIConfig{
wideInput: true,
currentInputEnabled: true,
},
Auth: streamStatusManagedAuthStub{},
@@ -615,7 +574,6 @@ func TestResponsesCurrentInputFileMapsDirectAuthFailureTo401(t *testing.T) {
}
h := &openAITestSurface{
Store: mockOpenAIConfig{
wideInput: true,
currentInputEnabled: true,
},
Auth: streamStatusAuthStub{},
@@ -647,7 +605,6 @@ func TestChatCompletionsCurrentInputFileUploadFailureReturnsInternalServerError(
ds := &inlineUploadDSStub{uploadErr: errors.New("boom")}
h := &openAITestSurface{
Store: mockOpenAIConfig{
wideInput: true,
currentInputEnabled: true,
},
Auth: streamStatusAuthStub{},
@@ -676,7 +633,6 @@ func TestCurrentInputFileWorksAcrossAutoDeleteModes(t *testing.T) {
ds := &inlineUploadDSStub{}
h := &openAITestSurface{
Store: mockOpenAIConfig{
wideInput: true,
autoDeleteMode: mode,
currentInputEnabled: true,
},
@@ -716,10 +672,6 @@ func TestCurrentInputFileWorksAcrossAutoDeleteModes(t *testing.T) {
}
}
func defaultToolChoicePolicy() promptcompat.ToolChoicePolicy {
return promptcompat.DefaultToolChoicePolicy()
}
func boolPtr(v bool) *bool {
return &v
}


@@ -1,7 +1,6 @@
package responses
import (
"context"
"io"
"net/http"
"strings"
@@ -10,128 +9,10 @@ import (
"ds2api/internal/auth"
"ds2api/internal/config"
dsprotocol "ds2api/internal/deepseek/protocol"
openaifmt "ds2api/internal/format/openai"
"ds2api/internal/promptcompat"
"ds2api/internal/sse"
streamengine "ds2api/internal/stream"
"ds2api/internal/toolcall"
)
type responsesNonStreamResult struct {
rawThinking string
rawText string
thinking string
toolDetectionThinking string
text string
contentFilter bool
parsed toolcall.ToolCallParseResult
body map[string]any
responseMessageID int
}
func (h *Handler) handleResponsesNonStreamWithRetry(w http.ResponseWriter, ctx context.Context, a *auth.RequestAuth, resp *http.Response, payload map[string]any, pow, owner, responseID, model, finalPrompt string, refFileTokens int, thinkingEnabled, searchEnabled bool, toolNames []string, toolsRaw any, toolChoice promptcompat.ToolChoicePolicy, traceID string) {
attempts := 0
currentResp := resp
usagePrompt := finalPrompt
accumulatedThinking := ""
accumulatedRawThinking := ""
accumulatedToolDetectionThinking := ""
for {
result, ok := h.collectResponsesNonStreamAttempt(w, currentResp, responseID, model, usagePrompt, thinkingEnabled, searchEnabled, toolNames, toolsRaw)
if !ok {
return
}
accumulatedThinking += sse.TrimContinuationOverlap(accumulatedThinking, result.thinking)
accumulatedRawThinking += sse.TrimContinuationOverlap(accumulatedRawThinking, result.rawThinking)
accumulatedToolDetectionThinking += sse.TrimContinuationOverlap(accumulatedToolDetectionThinking, result.toolDetectionThinking)
result.thinking = accumulatedThinking
result.rawThinking = accumulatedRawThinking
result.toolDetectionThinking = accumulatedToolDetectionThinking
result.parsed = detectAssistantToolCalls(result.rawText, result.text, result.rawThinking, result.toolDetectionThinking, toolNames)
result.body = openaifmt.BuildResponseObjectWithToolCalls(responseID, model, usagePrompt, result.thinking, result.text, result.parsed.Calls, toolsRaw)
if refFileTokens > 0 {
addRefFileTokensToUsage(result.body, refFileTokens)
}
if !shouldRetryResponsesNonStream(result, attempts) {
h.finishResponsesNonStreamResult(w, result, attempts, owner, responseID, toolChoice, traceID)
return
}
attempts++
config.Logger.Info("[openai_empty_retry] attempting synthetic retry", "surface", "responses", "stream", false, "retry_attempt", attempts, "parent_message_id", result.responseMessageID)
retryPow, powErr := h.DS.GetPow(ctx, a, 3)
if powErr != nil {
config.Logger.Warn("[openai_empty_retry] retry PoW fetch failed, falling back to original PoW", "surface", "responses", "stream", false, "retry_attempt", attempts, "error", powErr)
retryPow = pow
}
nextResp, err := h.DS.CallCompletion(ctx, a, clonePayloadForEmptyOutputRetry(payload, result.responseMessageID), retryPow, 3)
if err != nil {
writeOpenAIError(w, http.StatusInternalServerError, "Failed to get completion.")
config.Logger.Warn("[openai_empty_retry] retry request failed", "surface", "responses", "stream", false, "retry_attempt", attempts, "error", err)
return
}
usagePrompt = usagePromptWithEmptyOutputRetry(usagePrompt, attempts)
currentResp = nextResp
}
}
func (h *Handler) collectResponsesNonStreamAttempt(w http.ResponseWriter, resp *http.Response, responseID, model, usagePrompt string, thinkingEnabled, searchEnabled bool, toolNames []string, toolsRaw any) (responsesNonStreamResult, bool) {
defer func() { _ = resp.Body.Close() }()
if resp.StatusCode != http.StatusOK {
body, _ := io.ReadAll(resp.Body)
writeOpenAIError(w, resp.StatusCode, strings.TrimSpace(string(body)))
return responsesNonStreamResult{}, false
}
result := sse.CollectStream(resp, thinkingEnabled, false)
stripReferenceMarkers := h.compatStripReferenceMarkers()
sanitizedThinking := cleanVisibleOutput(result.Thinking, stripReferenceMarkers)
sanitizedText := cleanVisibleOutput(result.Text, stripReferenceMarkers)
if searchEnabled {
sanitizedText = replaceCitationMarkersWithLinks(sanitizedText, result.CitationLinks)
}
textParsed := detectAssistantToolCalls(result.Text, sanitizedText, result.Thinking, result.ToolDetectionThinking, toolNames)
responseObj := openaifmt.BuildResponseObjectWithToolCalls(responseID, model, usagePrompt, sanitizedThinking, sanitizedText, textParsed.Calls, toolsRaw)
return responsesNonStreamResult{
rawThinking: result.Thinking,
rawText: result.Text,
thinking: sanitizedThinking,
toolDetectionThinking: result.ToolDetectionThinking,
text: sanitizedText,
contentFilter: result.ContentFilter,
parsed: textParsed,
body: responseObj,
responseMessageID: result.ResponseMessageID,
}, true
}
func (h *Handler) finishResponsesNonStreamResult(w http.ResponseWriter, result responsesNonStreamResult, attempts int, owner, responseID string, toolChoice promptcompat.ToolChoicePolicy, traceID string) {
if len(result.parsed.Calls) == 0 && writeUpstreamEmptyOutputError(w, result.text, result.thinking, result.contentFilter) {
config.Logger.Info("[openai_empty_retry] terminal empty output", "surface", "responses", "stream", false, "retry_attempts", attempts, "success_source", "none", "content_filter", result.contentFilter)
return
}
logResponsesToolPolicyRejection(traceID, toolChoice, result.parsed, "text")
if toolChoice.IsRequired() && len(result.parsed.Calls) == 0 {
writeOpenAIErrorWithCode(w, http.StatusUnprocessableEntity, "tool_choice requires at least one valid tool call.", "tool_choice_violation")
return
}
h.getResponseStore().put(owner, responseID, result.body)
writeJSON(w, http.StatusOK, result.body)
source := "first_attempt"
if attempts > 0 {
source = "synthetic_retry"
}
config.Logger.Info("[openai_empty_retry] completed", "surface", "responses", "stream", false, "retry_attempts", attempts, "success_source", source)
}
func shouldRetryResponsesNonStream(result responsesNonStreamResult, attempts int) bool {
return emptyOutputRetryEnabled() &&
attempts < emptyOutputRetryMaxAttempts() &&
!result.contentFilter &&
len(result.parsed.Calls) == 0 &&
strings.TrimSpace(result.text) == ""
}
func (h *Handler) handleResponsesStreamWithRetry(w http.ResponseWriter, r *http.Request, a *auth.RequestAuth, resp *http.Response, payload map[string]any, pow, owner, responseID, model, finalPrompt string, refFileTokens int, thinkingEnabled, searchEnabled bool, toolNames []string, toolsRaw any, toolChoice promptcompat.ToolChoicePolicy, traceID string) {
streamRuntime, initialType, ok := h.prepareResponsesStreamRuntime(w, resp, owner, responseID, model, finalPrompt, refFileTokens, thinkingEnabled, searchEnabled, toolNames, toolsRaw, toolChoice, traceID)
if !ok {
@@ -193,7 +74,7 @@ func (h *Handler) prepareResponsesStreamRuntime(w http.ResponseWriter, resp *htt
}
streamRuntime := newResponsesStreamRuntime(
w, rc, canFlush, responseID, model, finalPrompt, thinkingEnabled, searchEnabled,
h.compatStripReferenceMarkers(), toolNames, toolsRaw, len(toolNames) > 0,
stripReferenceMarkersEnabled(), toolNames, toolsRaw, len(toolNames) > 0,
h.toolcallFeatureMatchEnabled() && h.toolcallEarlyEmitHighConfidence(),
toolChoice, traceID, func(obj map[string]any) {
h.getResponseStore().put(owner, responseID, obj)
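Reviewer note: the non-stream retry loop removed in this hunk comes down to a bounded loop guarded by one predicate. The sketch below restates that shape in isolation. `attemptResult`, `shouldRetry`, and `runWithRetry` are hypothetical names for illustration only; the real logic now lives behind `completionruntime.ExecuteNonStreamWithRetry`, whose internals are not shown in this diff and may differ.

```go
package main

import (
	"fmt"
	"strings"
)

// attemptResult is a hypothetical stand-in for responsesNonStreamResult.
type attemptResult struct {
	text          string
	toolCalls     int
	contentFilter bool
}

// shouldRetry mirrors the predicate in the diff: retry only while attempts
// remain, the upstream did not filter the content, no tool calls were
// parsed, and the visible text is blank.
func shouldRetry(r attemptResult, attempts, maxAttempts int) bool {
	return attempts < maxAttempts &&
		!r.contentFilter &&
		r.toolCalls == 0 &&
		strings.TrimSpace(r.text) == ""
}

// runWithRetry calls attempt until shouldRetry reports false.
// It returns the final result and the number of retries performed.
func runWithRetry(attempt func(n int) attemptResult, maxAttempts int) (attemptResult, int) {
	attempts := 0
	for {
		r := attempt(attempts)
		if !shouldRetry(r, attempts, maxAttempts) {
			return r, attempts
		}
		attempts++
	}
}

func main() {
	// Two blank outputs followed by a real answer.
	outputs := []string{"  ", "", "hello"}
	r, n := runWithRetry(func(i int) attemptResult {
		return attemptResult{text: outputs[i]}
	}, 3)
	fmt.Printf("text=%q retries=%d\n", r.text, n)
}
```

A content-filtered result or a result with at least one parsed tool call ends the loop immediately, even when the text is blank, which matches the conditions listed in `shouldRetryResponsesNonStream`.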


@@ -11,7 +11,7 @@ import (
"ds2api/internal/httpapi/openai/history"
"ds2api/internal/httpapi/openai/shared"
"ds2api/internal/promptcompat"
"ds2api/internal/toolcall"
"ds2api/internal/textclean"
"ds2api/internal/toolstream"
)
@@ -29,11 +29,8 @@ type Handler struct {
responses *responseStore
}
func (h *Handler) compatStripReferenceMarkers() bool {
if h == nil {
return true
}
return shared.CompatStripReferenceMarkers(h.Store)
func stripReferenceMarkersEnabled() bool {
return textclean.StripReferenceMarkersEnabled()
}
func (h *Handler) applyCurrentInputFile(ctx context.Context, a *auth.RequestAuth, stdReq promptcompat.StandardRequest) (promptcompat.StandardRequest, error) {
@@ -98,18 +95,6 @@ func cleanVisibleOutput(text string, stripReferenceMarkers bool) string {
return shared.CleanVisibleOutput(text, stripReferenceMarkers)
}
func replaceCitationMarkersWithLinks(text string, links map[int]string) string {
return shared.ReplaceCitationMarkersWithLinks(text, links)
}
func upstreamEmptyOutputDetail(contentFilter bool, text, thinking string) (int, string, string) {
return shared.UpstreamEmptyOutputDetail(contentFilter, text, thinking)
}
func writeUpstreamEmptyOutputError(w http.ResponseWriter, text, thinking string, contentFilter bool) bool {
return shared.WriteUpstreamEmptyOutputError(w, text, thinking, contentFilter)
}
func emptyOutputRetryEnabled() bool {
return shared.EmptyOutputRetryEnabled()
}
@@ -129,7 +114,3 @@ func usagePromptWithEmptyOutputRetry(originalPrompt string, retryAttempts int) s
func filterIncrementalToolCallDeltasByAllowed(deltas []toolstream.ToolCallDelta, seenNames map[int]string) []toolstream.ToolCallDelta {
return shared.FilterIncrementalToolCallDeltasByAllowed(deltas, seenNames)
}
func detectAssistantToolCalls(rawText, visibleText, exposedThinking, detectionThinking string, toolNames []string) toolcall.ToolCallParseResult {
return shared.DetectAssistantToolCalls(rawText, visibleText, exposedThinking, detectionThinking, toolNames)
}


@@ -11,7 +11,9 @@ import (
"github.com/go-chi/chi/v5"
"github.com/google/uuid"
"ds2api/internal/assistantturn"
"ds2api/internal/auth"
"ds2api/internal/completionruntime"
"ds2api/internal/config"
dsprotocol "ds2api/internal/deepseek/protocol"
openaifmt "ds2api/internal/format/openai"
@@ -92,34 +94,35 @@ func (h *Handler) Responses(w http.ResponseWriter, r *http.Request) {
return
}
sessionID, err := h.DS.CreateSession(r.Context(), a, 3)
if err != nil {
if a.UseConfigToken {
writeOpenAIError(w, http.StatusUnauthorized, "Account token is invalid. Please re-login the account in admin.")
} else {
writeOpenAIError(w, http.StatusUnauthorized, "Invalid token. If this should be a DS2API key, add it to config.keys first.")
responseID := "resp_" + strings.ReplaceAll(uuid.NewString(), "-", "")
if !stdReq.Stream {
result, outErr := completionruntime.ExecuteNonStreamWithRetry(r.Context(), h.DS, a, stdReq, completionruntime.Options{
StripReferenceMarkers: stripReferenceMarkersEnabled(),
RetryEnabled: true,
CurrentInputFile: h.Store,
})
if outErr != nil {
writeOpenAIErrorWithCode(w, outErr.Status, outErr.Message, outErr.Code)
return
}
return
}
pow, err := h.DS.GetPow(r.Context(), a, 3)
if err != nil {
writeOpenAIError(w, http.StatusUnauthorized, "Failed to get PoW (invalid token or unknown error).")
return
}
payload := stdReq.CompletionPayload(sessionID)
resp, err := h.DS.CallCompletion(r.Context(), a, payload, pow, 3)
if err != nil {
writeOpenAIError(w, http.StatusInternalServerError, "Failed to get completion.")
responseObj := openaifmt.BuildResponseObjectWithToolCalls(responseID, stdReq.ResponseModel, result.Turn.Prompt, result.Turn.Thinking, result.Turn.Text, result.Turn.ToolCalls, stdReq.ToolsRaw)
responseObj["usage"] = assistantturn.OpenAIResponsesUsage(result.Turn)
h.getResponseStore().put(owner, responseID, responseObj)
writeJSON(w, http.StatusOK, responseObj)
return
}
responseID := "resp_" + strings.ReplaceAll(uuid.NewString(), "-", "")
refFileTokens := stdReq.RefFileTokens
if stdReq.Stream {
h.handleResponsesStreamWithRetry(w, r, a, resp, payload, pow, owner, responseID, stdReq.ResponseModel, stdReq.PromptTokenText, refFileTokens, stdReq.Thinking, stdReq.Search, stdReq.ToolNames, stdReq.ToolsRaw, stdReq.ToolChoice, traceID)
start, outErr := completionruntime.StartCompletion(r.Context(), h.DS, a, stdReq, completionruntime.Options{
CurrentInputFile: h.Store,
})
if outErr != nil {
writeOpenAIErrorWithCode(w, outErr.Status, outErr.Message, outErr.Code)
return
}
h.handleResponsesNonStreamWithRetry(w, r.Context(), a, resp, payload, pow, owner, responseID, stdReq.ResponseModel, stdReq.PromptTokenText, refFileTokens, stdReq.Thinking, stdReq.Search, stdReq.ToolNames, stdReq.ToolsRaw, stdReq.ToolChoice, traceID)
streamReq := start.Request
refFileTokens := streamReq.RefFileTokens
h.handleResponsesStreamWithRetry(w, r, a, start.Response, start.Payload, start.Pow, owner, responseID, streamReq.ResponseModel, streamReq.PromptTokenText, refFileTokens, streamReq.Thinking, streamReq.Search, streamReq.ToolNames, streamReq.ToolsRaw, streamReq.ToolChoice, traceID)
}
func (h *Handler) handleResponsesNonStream(w http.ResponseWriter, resp *http.Response, owner, responseID, model, finalPrompt string, refFileTokens int, thinkingEnabled, searchEnabled bool, toolNames []string, toolsRaw any, toolChoice promptcompat.ToolChoicePolicy, traceID string) {
@@ -130,28 +133,26 @@ func (h *Handler) handleResponsesNonStream(w http.ResponseWriter, resp *http.Res
return
}
result := sse.CollectStream(resp, thinkingEnabled, true)
stripReferenceMarkers := h.compatStripReferenceMarkers()
sanitizedThinking := cleanVisibleOutput(result.Thinking, stripReferenceMarkers)
sanitizedText := cleanVisibleOutput(result.Text, stripReferenceMarkers)
if searchEnabled {
sanitizedText = replaceCitationMarkersWithLinks(sanitizedText, result.CitationLinks)
}
textParsed := detectAssistantToolCalls(result.Text, sanitizedText, result.Thinking, result.ToolDetectionThinking, toolNames)
if len(textParsed.Calls) == 0 && writeUpstreamEmptyOutputError(w, sanitizedText, sanitizedThinking, result.ContentFilter) {
return
}
logResponsesToolPolicyRejection(traceID, toolChoice, textParsed, "text")
callCount := len(textParsed.Calls)
if toolChoice.IsRequired() && callCount == 0 {
writeOpenAIErrorWithCode(w, http.StatusUnprocessableEntity, "tool_choice requires at least one valid tool call.", "tool_choice_violation")
turn := assistantturn.BuildTurnFromCollected(result, assistantturn.BuildOptions{
Model: model,
Prompt: finalPrompt,
RefFileTokens: refFileTokens,
SearchEnabled: searchEnabled,
StripReferenceMarkers: stripReferenceMarkersEnabled(),
ToolNames: toolNames,
ToolsRaw: toolsRaw,
ToolChoice: toolChoice,
})
logResponsesToolPolicyRejection(traceID, toolChoice, turn.ParsedToolCalls, "text")
outcome := assistantturn.FinalizeTurn(turn, assistantturn.FinalizeOptions{})
if outcome.ShouldFail {
writeOpenAIErrorWithCode(w, outcome.Error.Status, outcome.Error.Message, outcome.Error.Code)
return
}
responseObj := openaifmt.BuildResponseObjectWithToolCalls(responseID, model, finalPrompt, sanitizedThinking, sanitizedText, textParsed.Calls, toolsRaw)
if refFileTokens > 0 {
addRefFileTokensToUsage(responseObj, refFileTokens)
}
responseObj := openaifmt.BuildResponseObjectWithToolCalls(responseID, model, finalPrompt, turn.Thinking, turn.Text, turn.ToolCalls, toolsRaw)
responseObj["usage"] = assistantturn.OpenAIResponsesUsage(turn)
h.getResponseStore().put(owner, responseID, responseObj)
writeJSON(w, http.StatusOK, responseObj)
}
@@ -176,7 +177,7 @@ func (h *Handler) handleResponsesStream(w http.ResponseWriter, r *http.Request,
}
bufferToolContent := len(toolNames) > 0
emitEarlyToolDeltas := h.toolcallFeatureMatchEnabled() && h.toolcallEarlyEmitHighConfidence()
stripReferenceMarkers := h.compatStripReferenceMarkers()
stripReferenceMarkers := stripReferenceMarkersEnabled()
streamRuntime := newResponsesStreamRuntime(
w,


@@ -1,12 +1,14 @@
package responses
import (
"ds2api/internal/assistantturn"
"ds2api/internal/toolcall"
"net/http"
"strings"
"ds2api/internal/config"
openaifmt "ds2api/internal/format/openai"
"ds2api/internal/httpapi/openai/shared"
"ds2api/internal/promptcompat"
"ds2api/internal/sse"
streamengine "ds2api/internal/stream"
@@ -36,31 +38,27 @@ type responsesStreamRuntime struct {
toolCallsEmitted bool
toolCallsDoneEmitted bool
sieve toolstream.State
rawThinking strings.Builder
thinking strings.Builder
toolDetectionThinking strings.Builder
rawText strings.Builder
text strings.Builder
visibleText strings.Builder
responseMessageID int
streamToolCallIDs map[int]string
functionItemIDs map[int]string
functionOutputIDs map[int]int
functionArgs map[int]string
functionDone map[int]bool
functionAdded map[int]bool
functionNames map[int]string
messageItemID string
messageOutputID int
nextOutputID int
messageAdded bool
messagePartAdded bool
sequence int
failed bool
finalErrorStatus int
finalErrorMessage string
finalErrorCode string
sieve toolstream.State
accumulator shared.StreamAccumulator
visibleText strings.Builder
responseMessageID int
streamToolCallIDs map[int]string
functionItemIDs map[int]string
functionOutputIDs map[int]int
functionArgs map[int]string
functionDone map[int]bool
functionAdded map[int]bool
functionNames map[int]string
messageItemID string
messageOutputID int
nextOutputID int
messageAdded bool
messagePartAdded bool
sequence int
failed bool
finalErrorStatus int
finalErrorMessage string
finalErrorCode string
persistResponse func(obj map[string]any)
}
@@ -108,6 +106,11 @@ func newResponsesStreamRuntime(
toolChoice: toolChoice,
traceID: traceID,
persistResponse: persistResponse,
accumulator: shared.StreamAccumulator{
ThinkingEnabled: thinkingEnabled,
SearchEnabled: searchEnabled,
StripReferenceMarkers: stripReferenceMarkers,
},
}
}
@@ -155,11 +158,31 @@ func (s *responsesStreamRuntime) finalize(finishReason string, deferEmptyOutput
s.processToolStreamEvents(toolstream.Flush(&s.sieve, s.toolNames), true, true)
}
finalThinking := s.thinking.String()
finalToolDetectionThinking := s.toolDetectionThinking.String()
finalText := cleanVisibleOutput(s.text.String(), s.stripReferenceMarkers)
textParsed := detectAssistantToolCalls(s.rawText.String(), finalText, s.rawThinking.String(), finalToolDetectionThinking, s.toolNames)
detected := textParsed.Calls
finalThinking := s.accumulator.Thinking.String()
finalToolDetectionThinking := s.accumulator.ToolDetectionThinking.String()
finalText := s.accumulator.Text.String()
turn := assistantturn.BuildTurnFromStreamSnapshot(assistantturn.StreamSnapshot{
RawText: s.accumulator.RawText.String(),
VisibleText: finalText,
RawThinking: s.accumulator.RawThinking.String(),
VisibleThinking: finalThinking,
DetectionThinking: finalToolDetectionThinking,
ContentFilter: finishReason == "content_filter",
ResponseMessageID: s.responseMessageID,
AlreadyEmittedCalls: s.toolCallsEmitted,
AlreadyEmittedToolRaw: s.toolCallsDoneEmitted,
}, assistantturn.BuildOptions{
Model: s.model,
Prompt: s.finalPrompt,
RefFileTokens: s.refFileTokens,
SearchEnabled: s.searchEnabled,
StripReferenceMarkers: s.stripReferenceMarkers,
ToolNames: s.toolNames,
ToolsRaw: s.toolsRaw,
ToolChoice: s.toolChoice,
})
textParsed := turn.ParsedToolCalls
detected := turn.ToolCalls
s.logToolPolicyRejections(textParsed)
if len(detected) > 0 {
@@ -171,12 +194,11 @@ func (s *responsesStreamRuntime) finalize(finishReason string, deferEmptyOutput
s.closeMessageItem()
if s.toolChoice.IsRequired() && len(detected) == 0 {
s.failResponse(http.StatusUnprocessableEntity, "tool_choice requires at least one valid tool call.", "tool_choice_violation")
return true
}
if len(detected) == 0 && strings.TrimSpace(finalText) == "" {
status, message, code := upstreamEmptyOutputDetail(finishReason == "content_filter", finalText, finalThinking)
outcome := assistantturn.FinalizeTurn(turn, assistantturn.FinalizeOptions{
AlreadyEmittedToolCalls: s.toolCallsEmitted || s.toolCallsDoneEmitted,
})
if outcome.ShouldFail {
status, message, code := outcome.Error.Status, outcome.Error.Message, outcome.Error.Code
if deferEmptyOutput {
s.finalErrorStatus = status
s.finalErrorMessage = message
@@ -188,7 +210,7 @@ func (s *responsesStreamRuntime) finalize(finishReason string, deferEmptyOutput
}
s.closeIncompleteFunctionItems()
obj := s.buildCompletedResponseObject(finalThinking, finalText, detected)
obj := s.buildCompletedResponseObject(turn.Thinking, turn.Text, detected)
if s.persistResponse != nil {
s.persistResponse(obj)
}
@@ -228,62 +250,27 @@ func (s *responsesStreamRuntime) onParsed(parsed sse.LineResult) streamengine.Pa
return streamengine.ParsedDecision{Stop: true}
}
contentSeen := false
batch := responsesDeltaBatch{runtime: s}
for _, p := range parsed.ToolDetectionThinkingParts {
trimmed := sse.TrimContinuationOverlap(s.toolDetectionThinking.String(), p.Text)
if trimmed != "" {
s.toolDetectionThinking.WriteString(trimmed)
}
}
for _, p := range parsed.Parts {
accumulated := s.accumulator.Apply(parsed)
for _, p := range accumulated.Parts {
if p.Type == "thinking" {
rawTrimmed := sse.TrimContinuationOverlap(s.rawThinking.String(), p.Text)
if rawTrimmed != "" {
s.rawThinking.WriteString(rawTrimmed)
contentSeen = true
}
if !s.thinkingEnabled {
continue
}
cleanedText := cleanVisibleOutput(rawTrimmed, s.stripReferenceMarkers)
if cleanedText == "" {
continue
}
trimmed := sse.TrimContinuationOverlap(s.thinking.String(), cleanedText)
if trimmed == "" {
continue
}
s.thinking.WriteString(trimmed)
batch.append("reasoning", trimmed)
batch.append("reasoning", p.VisibleText)
continue
}
rawTrimmed := sse.TrimContinuationOverlap(s.rawText.String(), p.Text)
if rawTrimmed == "" {
if p.RawText == "" {
continue
}
s.rawText.WriteString(rawTrimmed)
contentSeen = true
cleanedText := cleanVisibleOutput(rawTrimmed, s.stripReferenceMarkers)
if s.searchEnabled && sse.IsCitation(cleanedText) {
if p.CitationOnly {
continue
}
trimmed := sse.TrimContinuationOverlap(s.text.String(), cleanedText)
if trimmed != "" {
s.text.WriteString(trimmed)
}
if !s.bufferToolContent {
if trimmed == "" {
continue
}
batch.append("text", trimmed)
batch.append("text", p.VisibleText)
continue
}
batch.flush()
s.processToolStreamEvents(toolstream.ProcessChunk(&s.sieve, rawTrimmed, s.toolNames), true, true)
s.processToolStreamEvents(toolstream.ProcessChunk(&s.sieve, p.RawText, s.toolNames), true, true)
}
batch.flush()
return streamengine.ParsedDecision{ContentSeen: contentSeen}
return streamengine.ParsedDecision{ContentSeen: accumulated.ContentSeen}
}


@@ -35,16 +35,12 @@ type DeepSeekCaller interface {
type ConfigReader interface {
ModelAliases() map[string]string
CompatWideInputStrictOutput() bool
CompatStripReferenceMarkers() bool
ToolcallMode() string
ToolcallEarlyEmitConfidence() string
ResponsesStoreTTLSeconds() int
EmbeddingsProvider() string
AutoDeleteMode() string
AutoDeleteSessions() bool
HistorySplitEnabled() bool
HistorySplitTriggerAfterTurns() int
CurrentInputFileEnabled() bool
CurrentInputFileMinChars() int
ThinkingInjectionEnabled() bool
@@ -58,13 +54,6 @@ type Deps struct {
ChatHistory *chathistory.Store
}
func CompatStripReferenceMarkers(store ConfigReader) bool {
if store == nil {
return true
}
return store.CompatStripReferenceMarkers()
}
var WriteJSON = util.WriteJSON
var _ AuthResolver = (*auth.Resolver)(nil)


@@ -0,0 +1,104 @@
package shared
import (
"strings"
"ds2api/internal/sse"
)
type StreamAccumulator struct {
ThinkingEnabled bool
SearchEnabled bool
StripReferenceMarkers bool
RawThinking strings.Builder
Thinking strings.Builder
ToolDetectionThinking strings.Builder
RawText strings.Builder
Text strings.Builder
}
type StreamPartDelta struct {
Type string
RawText string
VisibleText string
CitationOnly bool
}
type StreamAccumulatorResult struct {
ContentSeen bool
Parts []StreamPartDelta
}
func (a *StreamAccumulator) Apply(parsed sse.LineResult) StreamAccumulatorResult {
out := StreamAccumulatorResult{}
for _, p := range parsed.ToolDetectionThinkingParts {
trimmed := sse.TrimContinuationOverlapFromBuilder(&a.ToolDetectionThinking, p.Text)
if trimmed != "" {
a.ToolDetectionThinking.WriteString(trimmed)
}
}
for _, p := range parsed.Parts {
if p.Type == "thinking" {
delta := a.applyThinkingPart(p.Text)
if delta.RawText != "" {
out.ContentSeen = true
}
if delta.RawText != "" || delta.VisibleText != "" {
out.Parts = append(out.Parts, delta)
}
continue
}
delta := a.applyTextPart(p.Text)
if delta.RawText != "" {
out.ContentSeen = true
}
if delta.RawText != "" || delta.VisibleText != "" || delta.CitationOnly {
out.Parts = append(out.Parts, delta)
}
}
return out
}
func (a *StreamAccumulator) applyThinkingPart(text string) StreamPartDelta {
rawTrimmed := sse.TrimContinuationOverlapFromBuilder(&a.RawThinking, text)
if rawTrimmed != "" {
a.RawThinking.WriteString(rawTrimmed)
}
delta := StreamPartDelta{Type: "thinking", RawText: rawTrimmed}
if !a.ThinkingEnabled || rawTrimmed == "" {
return delta
}
cleanedText := CleanVisibleOutput(rawTrimmed, a.StripReferenceMarkers)
if cleanedText == "" {
return delta
}
trimmed := sse.TrimContinuationOverlapFromBuilder(&a.Thinking, cleanedText)
if trimmed == "" {
return delta
}
a.Thinking.WriteString(trimmed)
delta.VisibleText = trimmed
return delta
}
func (a *StreamAccumulator) applyTextPart(text string) StreamPartDelta {
rawTrimmed := sse.TrimContinuationOverlapFromBuilder(&a.RawText, text)
if rawTrimmed == "" {
return StreamPartDelta{Type: "text"}
}
a.RawText.WriteString(rawTrimmed)
delta := StreamPartDelta{Type: "text", RawText: rawTrimmed}
cleanedText := CleanVisibleOutput(rawTrimmed, a.StripReferenceMarkers)
if a.SearchEnabled && sse.IsCitation(cleanedText) {
delta.CitationOnly = true
return delta
}
trimmed := sse.TrimContinuationOverlapFromBuilder(&a.Text, cleanedText)
if trimmed == "" {
return delta
}
a.Text.WriteString(trimmed)
delta.VisibleText = trimmed
return delta
}


@@ -0,0 +1,97 @@
package shared
import (
"testing"
"ds2api/internal/sse"
)
func TestStreamAccumulatorAppliesThinkingAndTextDedupe(t *testing.T) {
acc := StreamAccumulator{ThinkingEnabled: true, StripReferenceMarkers: true}
thinkingPrefix := "this is a long thinking snapshot prefix used by DeepSeek continue replay"
textPrefix := "this is a long visible answer snapshot prefix used by DeepSeek continue replay"
first := acc.Apply(sse.LineResult{
Parsed: true,
Parts: []sse.ContentPart{
{Type: "thinking", Text: thinkingPrefix},
{Type: "text", Text: textPrefix},
},
})
second := acc.Apply(sse.LineResult{
Parsed: true,
Parts: []sse.ContentPart{
{Type: "thinking", Text: thinkingPrefix + " next"},
{Type: "text", Text: textPrefix + " world"},
},
})
if !first.ContentSeen || !second.ContentSeen {
t.Fatalf("expected both chunks to mark content seen")
}
if got := acc.RawThinking.String(); got != thinkingPrefix+" next" {
t.Fatalf("raw thinking = %q", got)
}
if got := acc.Thinking.String(); got != thinkingPrefix+" next" {
t.Fatalf("thinking = %q", got)
}
if got := acc.RawText.String(); got != textPrefix+" world" {
t.Fatalf("raw text = %q", got)
}
if got := acc.Text.String(); got != textPrefix+" world" {
t.Fatalf("text = %q", got)
}
if got := second.Parts[0].VisibleText; got != " next" {
t.Fatalf("thinking delta = %q", got)
}
if got := second.Parts[1].VisibleText; got != " world" {
t.Fatalf("text delta = %q", got)
}
}
func TestStreamAccumulatorKeepsHiddenThinkingForToolDetection(t *testing.T) {
acc := StreamAccumulator{ThinkingEnabled: false, StripReferenceMarkers: true}
result := acc.Apply(sse.LineResult{
Parsed: true,
Parts: []sse.ContentPart{
{Type: "thinking", Text: "<tool_calls></tool_calls>"},
},
ToolDetectionThinkingParts: []sse.ContentPart{
{Type: "thinking", Text: "detect"},
{Type: "thinking", Text: " tools"},
},
})
if !result.ContentSeen {
t.Fatalf("expected hidden thinking to count as upstream content")
}
if got := acc.RawThinking.String(); got != "<tool_calls></tool_calls>" {
t.Fatalf("raw thinking = %q", got)
}
if got := acc.Thinking.String(); got != "" {
t.Fatalf("visible thinking = %q", got)
}
if got := acc.ToolDetectionThinking.String(); got != "detect tools" {
t.Fatalf("tool detection thinking = %q", got)
}
}
func TestStreamAccumulatorSuppressesCitationTextWhenSearchEnabled(t *testing.T) {
acc := StreamAccumulator{SearchEnabled: true, StripReferenceMarkers: true}
result := acc.Apply(sse.LineResult{
Parsed: true,
Parts: []sse.ContentPart{{Type: "text", Text: "[citation:1]"}},
})
if !result.ContentSeen {
t.Fatalf("expected citation chunk to mark upstream content")
}
if len(result.Parts) != 1 || !result.Parts[0].CitationOnly {
t.Fatalf("expected citation-only delta, got %#v", result.Parts)
}
if got := acc.RawText.String(); got != "[citation:1]" {
t.Fatalf("raw text = %q", got)
}
if got := acc.Text.String(); got != "" {
t.Fatalf("visible text = %q", got)
}
}


@@ -1,9 +1,12 @@
package shared
import "net/http"
import (
"net/http"
"strings"
)
func ShouldWriteUpstreamEmptyOutputError(text string) bool {
return text == ""
func ShouldWriteUpstreamEmptyOutputError(text, thinking string) bool {
return strings.TrimSpace(text) == ""
}
func UpstreamEmptyOutputDetail(contentFilter bool, text, thinking string) (int, string, string) {
@@ -18,7 +21,7 @@ func UpstreamEmptyOutputDetail(contentFilter bool, text, thinking string) (int,
}
func WriteUpstreamEmptyOutputError(w http.ResponseWriter, text, thinking string, contentFilter bool) bool {
if !ShouldWriteUpstreamEmptyOutputError(text) {
if !ShouldWriteUpstreamEmptyOutputError(text, thinking) {
return false
}
status, message, code := UpstreamEmptyOutputDetail(contentFilter, text, thinking)
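
The hunk above changes the empty-output check so that only visible text counts, after trimming whitespace. The following is a minimal, self-contained sketch of that rule. The lowercase function name is a hypothetical stand-in, and the `thinking` parameter is unused here, mirroring the new signature shown in the diff.

```go
package main

import (
	"fmt"
	"strings"
)

// shouldWriteEmptyOutputError is an illustrative stand-in for the helper in
// the diff: the turn is treated as empty when the visible text is blank,
// regardless of how much thinking content was produced.
func shouldWriteEmptyOutputError(text, thinking string) bool {
	_ = thinking // kept only to mirror the two-argument signature
	return strings.TrimSpace(text) == ""
}

func main() {
	fmt.Println(shouldWriteEmptyOutputError("", ""))           // no output at all
	fmt.Println(shouldWriteEmptyOutputError("  \n", "plan"))   // thinking-only turn
	fmt.Println(shouldWriteEmptyOutputError("visible", "plan")) // real answer
}
```

Under this rule a thinking-only response is reported as empty, which is consistent with the test changes later in this diff that stop asserting on preserved first-attempt reasoning.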

@@ -135,7 +135,7 @@ func captureStatusMiddleware(statuses *[]int) func(http.Handler) http.Handler {
func TestChatCompletionsStreamStatusCapturedAs200(t *testing.T) {
statuses := make([]int, 0, 1)
h := &openAITestSurface{
Store: mockOpenAIConfig{wideInput: true},
Store: mockOpenAIConfig{},
Auth: streamStatusAuthStub{},
DS: streamStatusDSStub{resp: makeOpenAISSEHTTPResponse(`data: {"p":"response/content","v":"hello"}`, "data: [DONE]")},
}
@@ -164,7 +164,7 @@ func TestChatCompletionsStreamStatusCapturedAs200(t *testing.T) {
func TestResponsesStreamStatusCapturedAs200(t *testing.T) {
statuses := make([]int, 0, 1)
h := &openAITestSurface{
Store: mockOpenAIConfig{wideInput: true},
Store: mockOpenAIConfig{},
Auth: streamStatusAuthStub{},
DS: streamStatusDSStub{resp: makeOpenAISSEHTTPResponse(`data: {"p":"response/content","v":"hello"}`, "data: [DONE]")},
}
@@ -193,7 +193,7 @@ func TestResponsesStreamStatusCapturedAs200(t *testing.T) {
func TestChatCompletionsStreamContentFilterStopsNormallyWithoutLeak(t *testing.T) {
statuses := make([]int, 0, 1)
h := &openAITestSurface{
Store: mockOpenAIConfig{wideInput: true},
Store: mockOpenAIConfig{},
Auth: streamStatusAuthStub{},
DS: streamStatusDSStub{resp: makeOpenAISSEHTTPResponse(
`data: {"p":"response/content","v":"合法前缀"}`,
@@ -243,7 +243,7 @@ func TestChatCompletionsStreamContentFilterStopsNormallyWithoutLeak(t *testing.T
func TestChatCompletionsStreamEmitsFailureFrameWhenUpstreamOutputEmpty(t *testing.T) {
statuses := make([]int, 0, 1)
h := &openAITestSurface{
Store: mockOpenAIConfig{wideInput: true},
Store: mockOpenAIConfig{},
Auth: streamStatusAuthStub{},
DS: streamStatusDSStub{resp: makeOpenAISSEHTTPResponse("data: [DONE]")},
}
@@ -289,7 +289,7 @@ func TestChatCompletionsStreamRetriesEmptyOutputOnSameSession(t *testing.T) {
makeOpenAISSEHTTPResponse(`data: {"p":"response/content","v":"visible"}`, "data: [DONE]"),
}}
h := &openAITestSurface{
Store: mockOpenAIConfig{wideInput: true},
Store: mockOpenAIConfig{},
Auth: streamStatusAuthStub{},
DS: ds,
}
@@ -345,11 +345,11 @@ func TestChatCompletionsStreamRetriesEmptyOutputOnSameSession(t *testing.T) {
func TestChatCompletionsNonStreamRetriesThinkingOnlyOutput(t *testing.T) {
ds := &streamStatusDSSeqStub{resps: []*http.Response{
makeOpenAISSEHTTPResponse(`data: {"response_message_id":99,"p":"response/thinking_content","v":"plan"}`, "data: [DONE]"),
makeOpenAISSEHTTPResponse(`data: {"response_message_id":99}`, "data: [DONE]"),
makeOpenAISSEHTTPResponse(`data: {"p":"response/content","v":"visible"}`, "data: [DONE]"),
}}
h := &openAITestSurface{
Store: mockOpenAIConfig{wideInput: true},
Store: mockOpenAIConfig{},
Auth: streamStatusAuthStub{},
DS: ds,
}
@@ -380,9 +380,6 @@ func TestChatCompletionsNonStreamRetriesThinkingOnlyOutput(t *testing.T) {
if asString(message["content"]) != "visible" {
t.Fatalf("expected retry visible content, got %#v", message)
}
if !strings.Contains(asString(message["reasoning_content"]), "plan") {
t.Fatalf("expected first-attempt reasoning to be preserved, got %#v", message)
}
}
func TestChatCompletionsContentFilterDoesNotRetry(t *testing.T) {
@@ -391,7 +388,7 @@ func TestChatCompletionsContentFilterDoesNotRetry(t *testing.T) {
makeOpenAISSEHTTPResponse(`data: {"p":"response/content","v":"visible"}`, "data: [DONE]"),
}}
h := &openAITestSurface{
Store: mockOpenAIConfig{wideInput: true},
Store: mockOpenAIConfig{},
Auth: streamStatusAuthStub{},
DS: ds,
}
@@ -413,7 +410,7 @@ func TestChatCompletionsContentFilterDoesNotRetry(t *testing.T) {
func TestResponsesStreamUsageIgnoresBatchAccumulatedTokenUsage(t *testing.T) {
statuses := make([]int, 0, 1)
h := &openAITestSurface{
Store: mockOpenAIConfig{wideInput: true},
Store: mockOpenAIConfig{},
Auth: streamStatusAuthStub{},
DS: streamStatusDSStub{resp: makeOpenAISSEHTTPResponse(
`data: {"p":"response/content","v":"hello"}`,
@@ -464,7 +461,7 @@ func TestResponsesStreamRetriesThinkingOnlyOutput(t *testing.T) {
makeOpenAISSEHTTPResponse(`data: {"p":"response/content","v":"visible"}`, "data: [DONE]"),
}}
h := &openAITestSurface{
Store: mockOpenAIConfig{wideInput: true},
Store: mockOpenAIConfig{},
Auth: streamStatusAuthStub{},
DS: ds,
}
@@ -499,11 +496,11 @@ func TestResponsesStreamRetriesThinkingOnlyOutput(t *testing.T) {
func TestResponsesNonStreamRetriesThinkingOnlyOutput(t *testing.T) {
ds := &streamStatusDSSeqStub{resps: []*http.Response{
makeOpenAISSEHTTPResponse(`data: {"response_message_id":88,"p":"response/thinking_content","v":"plan"}`, "data: [DONE]"),
makeOpenAISSEHTTPResponse(`data: {"response_message_id":88}`, "data: [DONE]"),
makeOpenAISSEHTTPResponse(`data: {"p":"response/content","v":"visible"}`, "data: [DONE]"),
}}
h := &openAITestSurface{
Store: mockOpenAIConfig{wideInput: true},
Store: mockOpenAIConfig{},
Auth: streamStatusAuthStub{},
DS: ds,
}
@@ -540,16 +537,16 @@ func TestResponsesNonStreamRetriesThinkingOnlyOutput(t *testing.T) {
if len(content) == 0 {
t.Fatalf("expected content entries, got %#v", item)
}
reasoning, _ := content[0].(map[string]any)
if asString(reasoning["type"]) != "reasoning" || !strings.Contains(asString(reasoning["text"]), "plan") {
t.Fatalf("expected preserved reasoning entry, got %#v", content)
textEntry, _ := content[0].(map[string]any)
if asString(textEntry["type"]) != "output_text" || asString(textEntry["text"]) != "visible" {
t.Fatalf("expected visible text entry, got %#v", content)
}
}
func TestResponsesNonStreamUsageIgnoresPromptAndOutputTokenUsage(t *testing.T) {
statuses := make([]int, 0, 1)
h := &openAITestSurface{
Store: mockOpenAIConfig{wideInput: true},
Store: mockOpenAIConfig{},
Auth: streamStatusAuthStub{},
DS: streamStatusDSStub{resp: makeOpenAISSEHTTPResponse(
`data: {"p":"response/content","v":"ok"}`,

@@ -104,13 +104,10 @@ func registerOpenAITestRoutes(r chi.Router, h *openAITestSurface) {
r.Post("/v1/responses", h.responsesHandler().Responses)
r.Get("/v1/responses/{response_id}", h.responsesHandler().GetResponseByID)
r.Post("/v1/files", h.filesHandler().UploadFile)
r.Get("/v1/files/{file_id}", h.filesHandler().RetrieveFile)
r.Post("/v1/embeddings", h.embeddingsHandler().Embeddings)
}
func splitOpenAIHistoryMessages(messages []any, triggerAfterTurns int) ([]any, []any) {
return history.SplitOpenAIHistoryMessages(messages, triggerAfterTurns)
}
func buildOpenAICurrentInputContextTranscript(messages []any) string {
return promptcompat.BuildOpenAICurrentInputContextTranscript(messages)
}

@@ -1,7 +1,7 @@
'use strict';
const MIN_DELTA_FLUSH_CHARS = 160;
const MAX_DELTA_FLUSH_WAIT_MS = 80;
const MIN_DELTA_FLUSH_CHARS = 16;
const MAX_DELTA_FLUSH_WAIT_MS = 20;
function createChatCompletionEmitter({ res, sessionID, created, model, isClosed }) {
let firstChunkSent = false;

@@ -17,7 +17,6 @@ const {
resolveToolcallPolicy,
formatIncrementalToolCallDeltas,
filterIncrementalToolCallDeltasByAllowed,
boolDefaultTrue,
resetStreamToolCallState,
} = require('./toolcall_policy');
const { createChatCompletionEmitter, createDeltaCoalescer } = require('./stream_emitter');
@@ -58,7 +57,7 @@ async function handleVercelStream(req, res, rawBody, payload) {
const toolPolicy = resolveToolcallPolicy(prep.body, payload.tools);
const toolNames = toolPolicy.toolNames;
const emitEarlyToolDeltas = toolPolicy.emitEarlyToolDeltas;
const stripReferenceMarkers = boolDefaultTrue(prep.body.compat && prep.body.compat.strip_reference_markers);
const stripReferenceMarkers = true;
if (!model || !leaseID || !deepseekToken || !initialPowHeader || !completionPayload) {
writeOpenAIError(res, 500, 'invalid vercel prepare response');

@@ -2,7 +2,12 @@
const CDATA_PATTERN = /^<!\[CDATA\[([\s\S]*?)]]>$/i;
const XML_ATTR_PATTERN = /\b([a-z0-9_:-]+)\s*=\s*("([^"]*)"|'([^']*)')/gi;
const TOOL_MARKUP_NAMES = ['tool_calls', 'invoke', 'parameter'];
const TOOL_MARKUP_NAMES = [
{ raw: 'tool_calls', canonical: 'tool_calls' },
{ raw: 'tool-calls', canonical: 'tool_calls', dsmlOnly: true },
{ raw: 'invoke', canonical: 'invoke' },
{ raw: 'parameter', canonical: 'parameter' },
];
const {
toStringSafe,
@@ -437,7 +442,7 @@ function scanToolMarkupTagAt(text, start) {
const prefix = consumeToolMarkupNamePrefix(raw, lower, i);
i = prefix.next;
const dsmlLike = prefix.dsmlLike;
const { name, len } = matchToolMarkupName(lower, i);
const { name, len } = matchToolMarkupName(lower, i, dsmlLike);
if (!name) {
return null;
}
@@ -610,24 +615,31 @@ function consumeToolMarkupNamePrefixOnce(raw, lower, idx) {
return { next: idx + 1, ok: true };
}
if (lower.startsWith('dsml', idx)) {
return { next: idx + 'dsml'.length, ok: true };
let next = idx + 'dsml'.length;
if (next < raw.length && raw[next] === '-') {
next += 1;
}
return { next, ok: true };
}
return { next: idx, ok: false };
}
function hasToolMarkupNamePrefix(lowerTail) {
for (const name of TOOL_MARKUP_NAMES) {
if (lowerTail.startsWith(name) || name.startsWith(lowerTail)) {
if (lowerTail.startsWith(name.raw) || name.raw.startsWith(lowerTail)) {
return true;
}
}
return false;
}
function matchToolMarkupName(lower, start) {
function matchToolMarkupName(lower, start, dsmlLike) {
for (const name of TOOL_MARKUP_NAMES) {
if (lower.startsWith(name, start)) {
return { name, len: name.length };
if (name.dsmlOnly && !dsmlLike) {
continue;
}
if (lower.startsWith(name.raw, start)) {
return { name: name.canonical, len: name.raw.length };
}
}
return { name: '', len: 0 };
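
The alias table in this hunk maps each raw tag spelling to a canonical name and marks `tool-calls` as valid only behind a DSML-style prefix. A rough Go sketch of the same lookup follows. The type and function names are illustrative and are not taken from the repository, and the caller is assumed to have already consumed the prefix.

```go
package main

import (
	"fmt"
	"strings"
)

// markupAlias pairs a raw spelling with its canonical tag name.
// dsmlOnly entries match only when a DSML-like prefix was seen.
type markupAlias struct {
	raw       string
	canonical string
	dsmlOnly  bool
}

var markupNames = []markupAlias{
	{raw: "tool_calls", canonical: "tool_calls"},
	{raw: "tool-calls", canonical: "tool_calls", dsmlOnly: true},
	{raw: "invoke", canonical: "invoke"},
	{raw: "parameter", canonical: "parameter"},
}

// matchMarkupName returns the canonical name and the number of bytes
// consumed from lower[start:], or ("", 0) when nothing matches.
func matchMarkupName(lower string, start int, dsmlLike bool) (string, int) {
	if start < 0 || start > len(lower) {
		return "", 0
	}
	for _, n := range markupNames {
		if n.dsmlOnly && !dsmlLike {
			continue
		}
		if strings.HasPrefix(lower[start:], n.raw) {
			return n.canonical, len(n.raw)
		}
	}
	return "", 0
}

func main() {
	// "dsml-" is five bytes, so the name starts at offset 5.
	name, n := matchMarkupName("dsml-tool-calls>", 5, true)
	fmt.Println(name, n)
	// Without a DSML prefix the hyphenated spelling is rejected.
	name, n = matchMarkupName("tool-calls>", 0, false)
	fmt.Printf("%q %d\n", name, n)
}
```

Returning the raw length alongside the canonical name lets the scanner advance past the spelling that was actually present while the rest of the parser only sees canonical names.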

@@ -10,7 +10,6 @@ import (
type ConfigReader interface {
ModelAliases() map[string]string
CompatWideInputStrictOutput() bool
}
func NormalizeOpenAIChatRequest(store ConfigReader, req map[string]any, traceID string) (StandardRequest, error) {
@@ -74,17 +73,7 @@ func NormalizeOpenAIResponsesRequest(store ConfigReader, req map[string]any, tra
thinkingEnabled = false
}
// Keep width-control as an explicit policy hook even if current default is true.
allowWideInput := true
if store != nil {
allowWideInput = store.CompatWideInputStrictOutput()
}
var messagesRaw []any
if allowWideInput {
messagesRaw = ResponsesMessagesFromRequest(req)
} else if msgs, ok := req["messages"].([]any); ok && len(msgs) > 0 {
messagesRaw = msgs
}
messagesRaw := ResponsesMessagesFromRequest(req)
if len(messagesRaw) == 0 {
return StandardRequest{}, fmt.Errorf("request must include 'input' or 'messages'")
}

@@ -99,6 +99,7 @@ func NewApp() (*App, error) {
r.Post("/v1/responses", responsesHandler.Responses)
r.Get("/v1/responses/{response_id}", responsesHandler.GetResponseByID)
r.Post("/v1/files", filesHandler.UploadFile)
r.Get("/v1/files/{file_id}", filesHandler.RetrieveFile)
r.Post("/v1/embeddings", embeddingsHandler.Embeddings)
// Root OpenAI aliases support clients configured with the bare DS2API service URL.
r.Get("/models", modelsHandler.ListModels)
@@ -107,6 +108,7 @@ func NewApp() (*App, error) {
r.Post("/responses", responsesHandler.Responses)
r.Get("/responses/{response_id}", responsesHandler.GetResponseByID)
r.Post("/files", filesHandler.UploadFile)
r.Get("/files/{file_id}", filesHandler.RetrieveFile)
r.Post("/embeddings", embeddingsHandler.Embeddings)
claude.RegisterRoutes(r, claudeHandler)
gemini.RegisterRoutes(r, geminiHandler)

@@ -36,6 +36,7 @@ func TestAPIRoutesRemainRegistered(t *testing.T) {
"POST /v1/responses",
"GET /v1/responses/{response_id}",
"POST /v1/files",
"GET /v1/files/{file_id}",
"POST /v1/embeddings",
"GET /models",
"GET /models/{model_id}",
@@ -43,6 +44,7 @@ func TestAPIRoutesRemainRegistered(t *testing.T) {
"POST /responses",
"GET /responses/{response_id}",
"POST /files",
"GET /files/{file_id}",
"POST /embeddings",
"GET /anthropic/v1/models",
"POST /anthropic/v1/messages",

@@ -1,12 +1,12 @@
package sse
import "strings"
import (
"strings"
"unicode/utf8"
)
const minContinuationSnapshotLen = 32
// TrimContinuationOverlap removes the already-seen prefix when DeepSeek
// continue rounds resend the full fragment snapshot instead of only the new
// suffix. Non-overlapping chunks are returned unchanged.
func TrimContinuationOverlap(existing, incoming string) string {
if incoming == "" {
return ""
@@ -14,11 +14,44 @@ func TrimContinuationOverlap(existing, incoming string) string {
if existing == "" {
return incoming
}
if len(incoming) >= minContinuationSnapshotLen && strings.HasPrefix(incoming, existing) {
return incoming[len(existing):]
if utf8.RuneCountInString(incoming) < minContinuationSnapshotLen {
return incoming
}
if len(incoming) >= minContinuationSnapshotLen && strings.HasPrefix(existing, incoming) {
if len(incoming) > len(existing) {
if strings.HasPrefix(incoming, existing) {
return incoming[len(existing):]
}
return incoming
}
if len(incoming) < len(existing) && strings.HasPrefix(existing, incoming) {
return ""
}
return incoming
}
func TrimContinuationOverlapFromBuilder(existing *strings.Builder, incoming string) string {
if incoming == "" {
return ""
}
if existing == nil || existing.Len() == 0 {
return incoming
}
if utf8.RuneCountInString(incoming) < minContinuationSnapshotLen {
return incoming
}
existingLen := existing.Len()
if len(incoming) > existingLen {
existingStr := existing.String()
if strings.HasPrefix(incoming, existingStr) {
return incoming[existingLen:]
}
return incoming
}
if len(incoming) < existingLen {
existingStr := existing.String()
if strings.HasPrefix(existingStr, incoming) {
return ""
}
}
return incoming
}

@@ -1,6 +1,9 @@
package sse
import "testing"
import (
"strings"
"testing"
)
func TestTrimContinuationOverlapReturnsSuffixForSnapshotReplay(t *testing.T) {
existing := "我们被问到:这是一个很长的续答快照前缀,用来验证去重逻辑不会误伤正常 token。"
@@ -37,3 +40,12 @@ func TestTrimContinuationOverlapKeepsShortPrefixLikeNormalToken(t *testing.T) {
t.Fatalf("expected short token preserved, got %q", got)
}
}
func TestTrimContinuationOverlapKeepsShortMultibyteChunk(t *testing.T) {
existing := strings.Repeat("字", 36)
incoming := strings.Repeat("字", 16)
got := TrimContinuationOverlap(existing, incoming)
if got != incoming {
t.Fatalf("expected short multibyte chunk preserved, got %q", got)
}
}

@@ -4,149 +4,359 @@ import (
"bufio"
"context"
"io"
"strings"
"time"
"unicode/utf8"
)
const (
parsedLineBufferSize = 128
lineReaderBufferSize = 64 * 1024
minFlushChars = 160
maxFlushWait = 80 * time.Millisecond
)
// StartParsedLinePump scans an upstream DeepSeek SSE body and emits normalized
// line parse results. It centralizes scanner setup + current fragment type
// tracking for all streaming adapters.
type AccumulateConfig struct {
Enabled bool
MinChars int
MaxWait time.Duration
FlushOnFinish bool
WordBoundary bool
FlushOnNewline bool
}
var productionAccumulate = AccumulateConfig{
Enabled: true,
MinChars: 16,
MaxWait: 10 * time.Millisecond,
FlushOnFinish: true,
WordBoundary: false,
FlushOnNewline: true,
}
func StartParsedLinePump(ctx context.Context, body io.Reader, thinkingEnabled bool, initialType string) (<-chan LineResult, <-chan error) {
return startParsedLinePumpWithConfig(ctx, body, thinkingEnabled, initialType, productionAccumulate)
}
func startParsedLinePumpWithConfig(ctx context.Context, body io.Reader, thinkingEnabled bool, initialType string, cfg AccumulateConfig) (<-chan LineResult, <-chan error) {
out := make(chan LineResult, parsedLineBufferSize)
done := make(chan error, 1)
go func() {
defer close(out)
type scanItem struct {
line []byte
err error
eof bool
}
lineCh := make(chan scanItem, 1)
stopReader := make(chan struct{})
defer close(stopReader)
reader := bufio.NewReaderSize(body, lineReaderBufferSize)
currentType := initialType
var pumpErr error
var textBuffer strings.Builder
var thinkingBuffer strings.Builder
var toolDetectionThinkingBuffer strings.Builder
var textPendingType string
var thinkingPendingType string
var anyFlushed bool
var pendingResponseMessageID int
scanCh := make(chan []byte, parsedLineBufferSize)
scanDone := make(chan error, 1)
go func() {
sendScanItem := func(item scanItem) bool {
select {
case lineCh <- item:
return true
case <-ctx.Done():
return false
case <-stopReader:
return false
}
}
defer close(lineCh)
reader := bufio.NewReaderSize(body, lineReaderBufferSize)
for {
line, err := reader.ReadBytes('\n')
if len(line) > 0 {
line = append([]byte{}, line...)
if !sendScanItem(scanItem{line: line}) {
copied := append([]byte(nil), line...)
select {
case scanCh <- copied:
case <-ctx.Done():
close(scanCh)
scanDone <- ctx.Err()
return
}
}
if err != nil {
close(scanCh)
if err == io.EOF {
err = nil
}
_ = sendScanItem(scanItem{err: err, eof: true})
scanDone <- err
return
}
}
}()
ticker := time.NewTicker(maxFlushWait)
defer ticker.Stop()
currentType := initialType
var pending *LineResult
pendingChars := 0
maxWaitTimer := time.NewTimer(0)
if !maxWaitTimer.Stop() {
<-maxWaitTimer.C
}
maxWaitActive := false
sendResult := func(r LineResult) bool {
select {
case out <- r:
return true
case <-ctx.Done():
done <- ctx.Err()
return false
resetMaxWait := func() {
if maxWaitActive {
if !maxWaitTimer.Stop() {
select {
case <-maxWaitTimer.C:
default:
}
}
}
maxWaitTimer.Reset(cfg.MaxWait)
maxWaitActive = true
}
stopMaxWait := func() {
if maxWaitActive {
if !maxWaitTimer.Stop() {
select {
case <-maxWaitTimer.C:
default:
}
}
maxWaitActive = false
}
}
flushPending := func() bool {
if pending == nil {
defer stopMaxWait()
shouldFlushImmediate := func(text string) bool {
if cfg.FlushOnNewline && strings.ContainsAny(text, "\n\r") {
return true
}
if !sendResult(*pending) {
return false
return false
}
hasBufferedData := func() bool {
return textBuffer.Len() > 0 || thinkingBuffer.Len() > 0 || toolDetectionThinkingBuffer.Len() > 0
}
flushBuffer := func(force bool) {
if !cfg.Enabled {
return
}
textChars := utf8.RuneCountInString(textBuffer.String())
thinkingChars := utf8.RuneCountInString(thinkingBuffer.String())
shouldFlush := force ||
!anyFlushed ||
textChars >= cfg.MinChars ||
(thinkingChars > 0 && textChars >= 50)
if !shouldFlush {
return
}
anyFlushed = true
var parts []ContentPart
if thinkingChars > 0 {
parts = append(parts, ContentPart{Text: thinkingBuffer.String(), Type: thinkingPendingType})
thinkingBuffer.Reset()
}
if textChars > 0 {
parts = append(parts, ContentPart{Text: textBuffer.String(), Type: textPendingType})
textBuffer.Reset()
}
if len(parts) > 0 || toolDetectionThinkingBuffer.Len() > 0 {
var detectionParts []ContentPart
if toolDetectionThinkingBuffer.Len() > 0 {
detectionParts = append(detectionParts, ContentPart{Text: toolDetectionThinkingBuffer.String(), Type: "thinking"})
toolDetectionThinkingBuffer.Reset()
}
result := LineResult{
Parsed: true,
Stop: false,
Parts: parts,
ToolDetectionThinkingParts: detectionParts,
NextType: currentType,
ResponseMessageID: pendingResponseMessageID,
}
pendingResponseMessageID = 0
select {
case out <- result:
case <-ctx.Done():
pumpErr = ctx.Err()
return
}
}
if hasBufferedData() {
resetMaxWait()
} else {
stopMaxWait()
}
}
processLine := func(result LineResult) bool {
currentType = result.NextType
if result.ResponseMessageID > 0 {
pendingResponseMessageID = result.ResponseMessageID
}
if result.Stop {
if cfg.Enabled && cfg.FlushOnFinish {
for _, p := range result.ToolDetectionThinkingParts {
toolDetectionThinkingBuffer.WriteString(p.Text)
}
if textBuffer.Len() > 0 || len(result.Parts) > 0 || toolDetectionThinkingBuffer.Len() > 0 {
for _, p := range result.Parts {
if p.Type == "thinking" {
thinkingBuffer.WriteString(p.Text)
thinkingPendingType = "thinking"
} else {
textBuffer.WriteString(p.Text)
textPendingType = p.Type
}
}
flushBuffer(true)
}
} else if !cfg.Enabled {
var filteredParts []ContentPart
for _, p := range result.Parts {
if p.Type == "thinking" && !thinkingEnabled {
continue
}
filteredParts = append(filteredParts, p)
}
result.Parts = filteredParts
}
if result.ErrorMessage != "" || result.ContentFilter {
select {
case out <- result:
case <-ctx.Done():
pumpErr = ctx.Err()
return false
}
} else {
stopResult := LineResult{
Parsed: true,
Stop: true,
NextType: currentType,
ResponseMessageID: pendingResponseMessageID,
}
pendingResponseMessageID = 0
select {
case out <- stopResult:
case <-ctx.Done():
pumpErr = ctx.Err()
return false
}
}
return true
}
if !result.Parsed {
return true
}
if cfg.Enabled {
for _, p := range result.ToolDetectionThinkingParts {
toolDetectionThinkingBuffer.WriteString(p.Text)
}
for _, p := range result.Parts {
if p.Type == "thinking" {
if textBuffer.Len() > 0 {
flushBuffer(true)
}
thinkingBuffer.WriteString(p.Text)
thinkingPendingType = "thinking"
} else {
textBuffer.WriteString(p.Text)
textPendingType = p.Type
if shouldFlushImmediate(p.Text) {
flushBuffer(true)
}
}
}
if utf8.RuneCountInString(textBuffer.String()) >= cfg.MinChars {
flushBuffer(false)
}
if hasBufferedData() && !maxWaitActive {
resetMaxWait()
}
} else {
var parts []ContentPart
for _, p := range result.Parts {
if p.Type == "thinking" && !thinkingEnabled {
continue
}
parts = append(parts, p)
}
if len(parts) > 0 || len(result.ToolDetectionThinkingParts) > 0 {
filteredResult := LineResult{
Parsed: true,
Stop: false,
Parts: parts,
ToolDetectionThinkingParts: result.ToolDetectionThinkingParts,
NextType: currentType,
}
select {
case out <- filteredResult:
case <-ctx.Done():
pumpErr = ctx.Err()
return false
}
}
}
pending = nil
pendingChars = 0
return true
}
for {
select {
case <-ctx.Done():
done <- ctx.Err()
return
case <-ticker.C:
if !flushPending() {
return
}
case item, ok := <-lineCh:
if !ok || item.eof {
if !flushPending() {
return
pumpErr = ctx.Err()
goto done
case line, ok := <-scanCh:
if !ok {
scanCh = nil
err := <-scanDone
if err != nil {
pumpErr = err
}
done <- item.err
return
goto done
}
line := item.line
result := ParseDeepSeekContentLine(line, thinkingEnabled, currentType)
currentType = result.NextType
canAccumulate := result.Parsed && !result.Stop && result.ErrorMessage == "" && !result.ContentFilter && result.ResponseMessageID == 0
if canAccumulate {
lineChars := 0
for _, p := range result.Parts {
lineChars += len(p.Text)
}
for _, p := range result.ToolDetectionThinkingParts {
lineChars += len(p.Text)
}
if lineChars > 0 {
if pending == nil {
cp := result
pending = &cp
} else {
pending.Parts = append(pending.Parts, result.Parts...)
pending.ToolDetectionThinkingParts = append(pending.ToolDetectionThinkingParts, result.ToolDetectionThinkingParts...)
pending.NextType = result.NextType
}
pendingChars += lineChars
if pendingChars < minFlushChars {
continue
}
if !flushPending() {
return
}
continue
}
if !processLine(result) {
goto done
}
if !flushPending() {
return
case err, ok := <-scanDone:
if !ok || scanCh == nil {
goto done
}
if !sendResult(result) {
return
if err != nil {
pumpErr = err
}
for line := range scanCh {
result := ParseDeepSeekContentLine(line, thinkingEnabled, currentType)
if !processLine(result) {
goto done
}
}
goto done
case <-maxWaitTimer.C:
maxWaitActive = false
if hasBufferedData() {
flushBuffer(true)
}
}
}
done:
stopMaxWait()
if cfg.Enabled {
flushBuffer(true)
}
if pumpErr != nil {
done <- pumpErr
} else {
done <- nil
}
}()
return out, done
}
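
The rewritten pump buffers small text deltas and flushes them when the buffer reaches `MinChars` runes, when a chunk contains a newline, when the max-wait timer fires, or when the stream finishes. The sketch below isolates only the size and newline rules in a synchronous type so they can be traced by hand. The `coalescer` type is hypothetical, and the explicit `Flush` call stands in for the timer and end-of-stream paths.

```go
package main

import (
	"fmt"
	"strings"
	"unicode/utf8"
)

// coalescer is a simplified, timer-free model of the text buffer.
type coalescer struct {
	minChars       int
	flushOnNewline bool
	buf            strings.Builder
}

// Add appends a chunk and reports any text that should be emitted now.
func (c *coalescer) Add(text string) (string, bool) {
	c.buf.WriteString(text)
	if c.flushOnNewline && strings.ContainsAny(text, "\n\r") {
		return c.Flush()
	}
	if utf8.RuneCountInString(c.buf.String()) >= c.minChars {
		return c.Flush()
	}
	return "", false
}

// Flush drains the buffer; in the real pump a timer or a stop line
// would trigger this.
func (c *coalescer) Flush() (string, bool) {
	if c.buf.Len() == 0 {
		return "", false
	}
	out := c.buf.String()
	c.buf.Reset()
	return out, true
}

func main() {
	c := &coalescer{minChars: 16, flushOnNewline: true}
	for _, chunk := range []string{"Hel", "lo, ", "wor", "ld! How ", "are\n", "you"} {
		if out, ok := c.Add(chunk); ok {
			fmt.Printf("flush %q\n", out)
		}
	}
	if out, ok := c.Flush(); ok {
		fmt.Printf("final %q\n", out)
	}
}
```

Because emission boundaries now depend on buffer size and timing, the updated tests in this diff assert on the concatenated text and on the presence of part types rather than on an exact number of results.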

@@ -5,6 +5,7 @@ import (
"io"
"strings"
"testing"
"time"
)
func TestStartParsedLinePumpEmptyBody(t *testing.T) {
@@ -41,11 +42,17 @@ func TestStartParsedLinePumpMultipleLines(t *testing.T) {
if len(collected) < 2 {
t.Fatalf("expected at least 2 results, got %d", len(collected))
}
// First should be thinking
if collected[0].Parts[0].Type != "thinking" {
t.Fatalf("expected first part thinking, got %q", collected[0].Parts[0].Type)
hasThinking := false
for _, r := range collected {
for _, p := range r.Parts {
if p.Type == "thinking" {
hasThinking = true
}
}
}
if !hasThinking {
t.Fatal("expected thinking part in results")
}
// Last should be stop
last := collected[len(collected)-1]
if !last.Stop {
t.Fatal("expected last result to be stop")
@@ -70,15 +77,24 @@ func TestStartParsedLinePumpTypeTracking(t *testing.T) {
}
<-done
// Should have: thinking, thinking, text, text
expected := []string{"thinking", "thinking", "text", "text"}
if len(types) != len(expected) {
t.Fatalf("expected types %v, got %v", expected, types)
if len(types) == 0 {
t.Fatal("expected some parts, got none")
}
for i, want := range expected {
if types[i] != want {
t.Fatalf("type[%d] mismatch: want %q got %q (all=%v)", i, want, types[i], types)
hasThinking := false
hasText := false
for _, tp := range types {
if tp == "thinking" {
hasThinking = true
}
if tp == "text" {
hasText = true
}
}
if !hasThinking {
t.Fatalf("expected thinking type in results, got %v", types)
}
if !hasText {
t.Fatalf("expected text type in results, got %v", types)
}
}
@@ -88,29 +104,23 @@ func TestStartParsedLinePumpContextCancellation(t *testing.T) {
ctx, cancel := context.WithCancel(context.Background())
results, done := StartParsedLinePump(ctx, pr, false, "text")
// Write one line to allow it to start
go func() {
_, _ = io.WriteString(pw, "data: {\"p\":\"response/content\",\"v\":\"hello\"}\n")
// Don't close yet - wait for context cancel
time.Sleep(50 * time.Millisecond)
_ = pw.Close()
}()
// Read first result
r := <-results
if !r.Parsed || len(r.Parts) == 0 {
t.Fatalf("expected first parsed result, got %#v", r)
}
// Cancel context - this will cause the pump to exit on next send
cancel()
// Close the pipe to unblock scanner.Scan()
_ = pw.Close()
// Drain remaining results
for range results {
}
err := <-done
// Error may be context.Canceled or nil (if pipe closed first)
if err != nil && err != context.Canceled {
t.Fatalf("expected context.Canceled or nil error, got %v", err)
}
@@ -202,13 +212,47 @@ func TestStartParsedLinePumpAccumulatesSmallChunks(t *testing.T) {
t.Fatalf("unexpected error: %v", err)
}
if len(collected) != 2 {
t.Fatalf("expected 2 results (accumulated content + done), got %d", len(collected))
last := collected[len(collected)-1]
if !last.Stop {
t.Fatal("expected last result to stop")
}
if len(collected[0].Parts) != 2 {
t.Fatalf("expected 2 accumulated parts, got %d", len(collected[0].Parts))
allText := strings.Builder{}
for _, r := range collected {
for _, p := range r.Parts {
allText.WriteString(p.Text)
}
}
if !collected[1].Stop {
t.Fatal("expected second result to stop")
if allText.String() != "hi" {
t.Fatalf("expected accumulated text 'hi', got %q", allText.String())
}
}
func TestStartParsedLinePumpFirstFlushImmediate(t *testing.T) {
body := strings.NewReader(
"data: {\"p\":\"response/content\",\"v\":\"Hi\"}\n" +
"data: [DONE]\n",
)
results, done := StartParsedLinePump(context.Background(), body, false, "text")
collected := make([]LineResult, 0)
for r := range results {
collected = append(collected, r)
}
if err := <-done; err != nil {
t.Fatalf("unexpected error: %v", err)
}
hasContent := false
for _, r := range collected {
for _, p := range r.Parts {
if p.Text == "Hi" {
hasContent = true
}
}
}
if !hasContent {
t.Fatal("expected 'Hi' content in results")
}
}

@@ -43,7 +43,7 @@ func TestStartParsedLinePumpParsesAndStops(t *testing.T) {
}
func TestStartParsedLinePumpHandlesLongSingleSSELine(t *testing.T) {
payload := strings.Repeat("x", 2*1024*1024+4096)
payload := strings.Repeat("x", 5*1024*1024+4096)
results, done := StartParsedLinePump(context.Background(), strings.NewReader(makeLargeContentSSEBody(t, payload)), false, "text")
var got strings.Builder

@@ -10,3 +10,11 @@ func StripReferenceMarkers(text string) string {
}
return referenceMarkerPattern.ReplaceAllString(text, "")
}
// StripReferenceMarkersEnabled returns true while reference-marker
// stripping remains the fixed runtime default. When the behaviour is
// eventually removed this function can be deleted and callers can drop
// the conditional.
func StripReferenceMarkersEnabled() bool {
return true
}
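
A caller would presumably consult this switch before cleaning visible text. The sketch below shows one way that could look. The regular expression is an assumption made for illustration, since the real `referenceMarkerPattern` is not shown in this diff, and `cleanVisibleText` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"regexp"
)

// referenceMarker is an assumed pattern covering markers such as
// "[citation:1]"; the repository's actual pattern may differ.
var referenceMarker = regexp.MustCompile(`\[(?:citation|reference):\d+\]`)

// stripReferenceMarkersEnabled mirrors the fixed runtime default.
func stripReferenceMarkersEnabled() bool {
	return true
}

// cleanVisibleText removes reference markers when stripping is enabled.
func cleanVisibleText(text string) string {
	if text == "" || !stripReferenceMarkersEnabled() {
		return text
	}
	return referenceMarker.ReplaceAllString(text, "")
}

func main() {
	fmt.Printf("%q\n", cleanVisibleText("Go was released in 2009[citation:1]."))
}
```

Routing every protocol handler through one function keeps the default in a single place, so a later removal only touches that function and its call sites.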

@@ -21,7 +21,7 @@ func rewriteDSMLToolMarkupOutsideIgnored(text string) string {
var b strings.Builder
b.Grow(len(text))
for i := 0; i < len(text); {
next, advanced, blocked := skipXMLIgnoredSection(lower, i)
next, advanced, blocked := skipXMLIgnoredSection(text, lower, i)
if blocked {
b.WriteString(text[i:])
break

@@ -147,13 +147,14 @@ func stripFencedCodeBlocks(text string) string {
inFence := false
fenceMarker := ""
inCDATA := false
cdataFenceMarker := ""
// Track builder length when a fence opens so we can preserve content
// collected before the unclosed fence.
beforeFenceLen := 0
for _, line := range lines {
if inCDATA || cdataStartsBeforeFence(line) {
b.WriteString(line)
inCDATA = updateCDATAState(inCDATA, line)
inCDATA, cdataFenceMarker = updateCDATAStateForStrip(inCDATA, cdataFenceMarker, line)
continue
}
trimmed := strings.TrimLeft(line, " \t")
@@ -210,28 +211,65 @@ func firstFenceMarkerIndex(line string) int {
}
}
func updateCDATAState(inCDATA bool, line string) bool {
func updateCDATAStateForStrip(inCDATA bool, cdataFenceMarker, line string) (bool, string) {
lower := strings.ToLower(line)
pos := 0
state := inCDATA
for pos < len(lower) {
if state {
end := strings.Index(lower[pos:], "]]>")
if end < 0 {
return true
}
pos += end + len("]]>")
state = false
continue
}
fenceMarker := cdataFenceMarker
lineForFence := line
if !state {
start := strings.Index(lower[pos:], "<![cdata[")
if start < 0 {
return false
return false, ""
}
pos += start + len("<![cdata[")
state = true
lineForFence = line[pos:]
}
return state
if !state {
return false, ""
}
trimmed := strings.TrimLeft(lineForFence, " \t")
if fenceMarker == "" {
if marker, ok := parseFenceOpen(trimmed); ok {
fenceMarker = marker
}
} else if isFenceClose(trimmed, fenceMarker) {
fenceMarker = ""
}
for pos < len(lower) {
end := strings.Index(lower[pos:], "]]>")
if end < 0 {
return true, fenceMarker
}
endPos := pos + end
pos = endPos + len("]]>")
if fenceMarker != "" {
continue
}
if cdataEndLooksStructural(lower, pos) || strings.TrimSpace(lower[pos:]) == "" {
state = false
for pos < len(lower) {
start := strings.Index(lower[pos:], "<![cdata[")
if start < 0 {
return false, ""
}
pos += start + len("<![cdata[")
state = true
trimmedTail := strings.TrimLeft(line[pos:], " \t")
if marker, ok := parseFenceOpen(trimmedTail); ok {
fenceMarker = marker
} else {
fenceMarker = ""
}
break
}
continue
}
}
return state, fenceMarker
}
func parseFenceOpen(line string) (string, bool) {

@@ -144,7 +144,7 @@ func findXMLStartTagOutsideCDATA(text, tag string, from int) (start, bodyStart i
lower := strings.ToLower(text)
target := "<" + strings.ToLower(tag)
for i := maxInt(from, 0); i < len(text); {
next, advanced, blocked := skipXMLIgnoredSection(lower, i)
next, advanced, blocked := skipXMLIgnoredSection(text, lower, i)
if blocked {
return -1, -1, "", false
}
@@ -170,7 +170,7 @@ func findMatchingXMLEndTagOutsideCDATA(text, tag string, from int) (closeStart,
closeTarget := "</" + strings.ToLower(tag)
depth := 1
for i := maxInt(from, 0); i < len(text); {
next, advanced, blocked := skipXMLIgnoredSection(lower, i)
next, advanced, blocked := skipXMLIgnoredSection(text, lower, i)
if blocked {
return -1, -1, false
}
@@ -206,14 +206,14 @@ func findMatchingXMLEndTagOutsideCDATA(text, tag string, from int) (closeStart,
return -1, -1, false
}
func skipXMLIgnoredSection(lower string, i int) (next int, advanced bool, blocked bool) {
func skipXMLIgnoredSection(text, lower string, i int) (next int, advanced bool, blocked bool) {
switch {
case strings.HasPrefix(lower[i:], "<![cdata["):
end := strings.Index(lower[i+len("<![cdata["):], "]]>")
end := findToolCDATAEnd(text, lower, i+len("<![cdata["))
if end < 0 {
return 0, false, true
}
return i + len("<![cdata[") + end + len("]]>"), true, false
return end + len("]]>"), true, false
case strings.HasPrefix(lower[i:], "<!--"):
end := strings.Index(lower[i+len("<!--"):], "-->")
if end < 0 {
@@ -225,6 +225,69 @@ func skipXMLIgnoredSection(lower string, i int) (next int, advanced bool, blocke
}
}
func findToolCDATAEnd(text, lower string, from int) int {
if from < 0 || from > len(text) {
return -1
}
const closeMarker = "]]>"
firstNonFenceEnd := -1
for searchFrom := from; searchFrom < len(text); {
rel := strings.Index(lower[searchFrom:], closeMarker)
if rel < 0 {
break
}
end := searchFrom + rel
searchFrom = end + len(closeMarker)
if cdataOffsetIsInsideMarkdownFence(text[from:end]) {
continue
}
if firstNonFenceEnd < 0 {
firstNonFenceEnd = end
}
if cdataEndLooksStructural(lower, searchFrom) {
return end
}
}
return firstNonFenceEnd
}
func cdataEndLooksStructural(lower string, after int) bool {
for after < len(lower) {
switch lower[after] {
case ' ', '\t', '\r', '\n':
after++
continue
default:
}
break
}
return strings.HasPrefix(lower[after:], "</")
}
func cdataOffsetIsInsideMarkdownFence(fragment string) bool {
if fragment == "" {
return false
}
lines := strings.SplitAfter(fragment, "\n")
inFence := false
fenceMarker := ""
for _, line := range lines {
trimmed := strings.TrimLeft(line, " \t")
if !inFence {
if marker, ok := parseFenceOpen(trimmed); ok {
inFence = true
fenceMarker = marker
}
continue
}
if isFenceClose(trimmed, fenceMarker) {
inFence = false
fenceMarker = ""
}
}
return inFence
}
func findXMLTagEnd(text string, from int) int {
quote := byte(0)
for i := maxInt(from, 0); i < len(text); i++ {
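
The CDATA end search added in this hunk can be tried in isolation with the minimal sketch below. It is not the project's code. `fenceOpen` and `fenceClose` are simplified, hypothetical stand-ins for `parseFenceOpen` and `isFenceClose`, whose bodies are not shown in this diff, and the lowercase copy of the text is dropped because `]]>` has no letters. The sketch shows one thing: a `]]>` that sits inside an unclosed markdown fence is skipped, and a terminator followed by a closing tag is preferred.

```go
package main

import (
	"fmt"
	"strings"
)

// fenceOpen is a hypothetical, simplified stand-in for parseFenceOpen:
// it returns the leading run of 3+ backticks or tildes, if any.
func fenceOpen(line string) (string, bool) {
	if line == "" || (line[0] != '`' && line[0] != '~') {
		return "", false
	}
	n := 0
	for n < len(line) && line[n] == line[0] {
		n++
	}
	if n < 3 {
		return "", false
	}
	return line[:n], true
}

// fenceClose is a hypothetical stand-in for isFenceClose: the trimmed line
// must consist only of fence characters and start with the opening marker.
func fenceClose(line, marker string) bool {
	t := strings.TrimSpace(line)
	return marker != "" && strings.HasPrefix(t, marker) && strings.Trim(t, marker[:1]) == ""
}

// insideFence reports whether fragment ends while a fence is still open.
func insideFence(fragment string) bool {
	inFence, marker := false, ""
	for _, line := range strings.SplitAfter(fragment, "\n") {
		trimmed := strings.TrimLeft(line, " \t")
		if !inFence {
			if m, ok := fenceOpen(trimmed); ok {
				inFence, marker = true, m
			}
			continue
		}
		if fenceClose(trimmed, marker) {
			inFence, marker = false, ""
		}
	}
	return inFence
}

// looksStructural reports whether the next non-whitespace text is a closing tag.
func looksStructural(s string, after int) bool {
	for after < len(s) && strings.ContainsRune(" \t\r\n", rune(s[after])) {
		after++
	}
	return strings.HasPrefix(s[after:], "</")
}

// findCDATAEnd returns the index of the "]]>" that closes the CDATA body
// starting at from, or -1 when no candidate outside a fence exists.
func findCDATAEnd(text string, from int) int {
	const closeMarker = "]]>"
	first := -1
	for search := from; search < len(text); {
		rel := strings.Index(text[search:], closeMarker)
		if rel < 0 {
			break
		}
		end := search + rel
		search = end + len(closeMarker)
		if insideFence(text[from:end]) {
			continue // terminator is part of a fenced example, keep scanning
		}
		if first < 0 {
			first = end // fallback if no structural terminator follows
		}
		if looksStructural(text, search) {
			return end
		}
	}
	return first
}

func main() {
	body := "intro\n```xml\n<![CDATA[inner]]></parameter>\n```\ntail"
	text := "<![CDATA[" + body + "]]></parameter>"
	from := len("<![CDATA[")
	end := findCDATAEnd(text, from)
	fmt.Println(text[from:end] == body)
}
```

The fallback to the first non-fenced terminator mirrors `firstNonFenceEnd` in the hunk: when no candidate is followed by a closing tag, the scan degrades to the earliest plausible end instead of failing.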


@@ -2,7 +2,18 @@ package toolcall
import "strings"
var toolMarkupNames = []string{"tool_calls", "invoke", "parameter"}
type toolMarkupNameAlias struct {
raw string
canonical string
dsmlOnly bool
}
var toolMarkupNames = []toolMarkupNameAlias{
{raw: "tool_calls", canonical: "tool_calls"},
{raw: "tool-calls", canonical: "tool_calls", dsmlOnly: true},
{raw: "invoke", canonical: "invoke"},
{raw: "parameter", canonical: "parameter"},
}
type ToolMarkupTag struct {
Start int
@@ -19,7 +30,7 @@ type ToolMarkupTag struct {
func ContainsToolMarkupSyntaxOutsideIgnored(text string) (hasDSML, hasCanonical bool) {
lower := strings.ToLower(text)
for i := 0; i < len(text); {
next, advanced, blocked := skipXMLIgnoredSection(lower, i)
next, advanced, blocked := skipXMLIgnoredSection(text, lower, i)
if blocked {
return hasDSML, hasCanonical
}
@@ -47,7 +58,7 @@ func ContainsToolMarkupSyntaxOutsideIgnored(text string) (hasDSML, hasCanonical
func ContainsToolCallWrapperSyntaxOutsideIgnored(text string) (hasDSML, hasCanonical bool) {
lower := strings.ToLower(text)
for i := 0; i < len(text); {
next, advanced, blocked := skipXMLIgnoredSection(lower, i)
next, advanced, blocked := skipXMLIgnoredSection(text, lower, i)
if blocked {
return hasDSML, hasCanonical
}
@@ -79,7 +90,7 @@ func ContainsToolCallWrapperSyntaxOutsideIgnored(text string) (hasDSML, hasCanon
func FindToolMarkupTagOutsideIgnored(text string, start int) (ToolMarkupTag, bool) {
lower := strings.ToLower(text)
for i := maxInt(start, 0); i < len(text); {
next, advanced, blocked := skipXMLIgnoredSection(lower, i)
next, advanced, blocked := skipXMLIgnoredSection(text, lower, i)
if blocked {
return ToolMarkupTag{}, false
}
@@ -137,7 +148,7 @@ func scanToolMarkupTagAt(text string, start int) (ToolMarkupTag, bool) {
i++
}
i, dsmlLike := consumeToolMarkupNamePrefix(lower, text, i)
name, nameLen := matchToolMarkupName(lower, i)
name, nameLen := matchToolMarkupName(lower, i, dsmlLike)
if nameLen == 0 {
return ToolMarkupTag{}, false
}
@@ -230,24 +241,31 @@ func consumeToolMarkupNamePrefixOnce(lower, text string, idx int) (int, bool) {
return idx + 1, true
}
if strings.HasPrefix(lower[idx:], "dsml") {
return idx + len("dsml"), true
next := idx + len("dsml")
if next < len(text) && (text[next] == '-' || text[next] == '_') {
next++
}
return next, true
}
return idx, false
}
func hasToolMarkupNamePrefix(lowerTail string) bool {
for _, name := range toolMarkupNames {
if strings.HasPrefix(lowerTail, name) || strings.HasPrefix(name, lowerTail) {
if strings.HasPrefix(lowerTail, name.raw) || strings.HasPrefix(name.raw, lowerTail) {
return true
}
}
return false
}
func matchToolMarkupName(lower string, start int) (string, int) {
func matchToolMarkupName(lower string, start int, dsmlLike bool) (string, int) {
for _, name := range toolMarkupNames {
if strings.HasPrefix(lower[start:], name) {
return name, len(name)
if name.dsmlOnly && !dsmlLike {
continue
}
if strings.HasPrefix(lower[start:], name.raw) {
return name.canonical, len(name.raw)
}
}
return "", 0
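
The alias table and the `dsmlOnly` gate in this hunk can be exercised with the small standalone sketch below. It is an illustration under assumptions, not the package's parser. `tagName` and `consumeDSMLPrefix` are hypothetical helpers that handle only the `dsml`, `dsml-` and `dsml_` prefixes, and they ignore the pipe, space and other forms the real scanner accepts.

```go
package main

import (
	"fmt"
	"strings"
)

// nameAlias mirrors the shape of toolMarkupNameAlias from the diff.
type nameAlias struct {
	raw       string
	canonical string
	dsmlOnly  bool
}

var aliases = []nameAlias{
	{raw: "tool_calls", canonical: "tool_calls"},
	{raw: "tool-calls", canonical: "tool_calls", dsmlOnly: true},
	{raw: "invoke", canonical: "invoke"},
	{raw: "parameter", canonical: "parameter"},
}

// consumeDSMLPrefix (hypothetical helper) skips "dsml" plus one optional
// '-' or '_' separator and reports whether a DSML prefix was present.
func consumeDSMLPrefix(lower string, idx int) (int, bool) {
	if !strings.HasPrefix(lower[idx:], "dsml") {
		return idx, false
	}
	next := idx + len("dsml")
	if next < len(lower) && (lower[next] == '-' || lower[next] == '_') {
		next++
	}
	return next, true
}

// matchName returns the canonical name and the length of the raw spelling.
// Aliases flagged dsmlOnly are skipped unless a DSML prefix was consumed.
func matchName(lower string, start int, dsmlLike bool) (string, int) {
	for _, a := range aliases {
		if a.dsmlOnly && !dsmlLike {
			continue
		}
		if strings.HasPrefix(lower[start:], a.raw) {
			return a.canonical, len(a.raw)
		}
	}
	return "", 0
}

// tagName (hypothetical helper) resolves the canonical name of an opening tag.
func tagName(tag string) string {
	lower := strings.ToLower(tag)
	if !strings.HasPrefix(lower, "<") {
		return ""
	}
	i, dsmlLike := consumeDSMLPrefix(lower, 1)
	name, _ := matchName(lower, i, dsmlLike)
	return name
}

func main() {
	tags := []string{"<dsml-tool-calls>", "<dsml_invoke name=\"Bash\">", "<tool_calls>", "<tool-calls>"}
	for _, tag := range tags {
		fmt.Printf("%s -> %q\n", tag, tagName(tag))
	}
}
```

In this sketch the first three tags resolve to `tool_calls`, `invoke` and `tool_calls`, while the bare `<tool-calls>` resolves to the empty string. That matches the intent of `TestParseToolCallsIgnoresBareHyphenatedToolCallsLookalike`: the hyphenated spelling is accepted only behind a DSML prefix.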


@@ -41,6 +41,45 @@ func TestParseToolCallsSupportsDSMLShell(t *testing.T) {
}
}
func TestParseToolCallsSupportsHyphenatedDSMLShellWithHereDocCDATA(t *testing.T) {
text := `<dsml-tool-calls>
<dsml-invoke name="Bash">
<dsml-parameter name="command"><![CDATA[git commit -m "$(cat <<'EOF'
docs: add missing directory entries and package descriptions to architecture docs
Fill gaps identified in architecture audit: add artifacts/ and static/ to
directory tree, and document 7 auxiliary internal/ packages (textclean,
claudeconv, compat, rawsample, devcapture, util, version) in Section 3.
Co-Authored-By: Claude Opus 4.7 noreply@anthropic.com
EOF
)"]]></dsml-parameter>
<dsml-parameter name="description"><![CDATA[Create commit with architecture doc updates]]></dsml-parameter>
</dsml-invoke>
</dsml-tool-calls>`
calls := ParseToolCalls(text, []string{"Bash"})
if len(calls) != 1 {
t.Fatalf("expected 1 hyphenated DSML call, got %#v", calls)
}
if calls[0].Name != "Bash" {
t.Fatalf("expected Bash tool, got %#v", calls[0])
}
command, _ := calls[0].Input["command"].(string)
if !strings.Contains(command, `git commit -m "$(cat <<'EOF'`) || !strings.Contains(command, "Co-Authored-By: Claude Opus 4.7") {
t.Fatalf("expected here-doc CDATA command to be preserved, got %q", command)
}
if calls[0].Input["description"] != "Create commit with architecture doc updates" {
t.Fatalf("expected description parameter, got %#v", calls[0].Input)
}
}
func TestParseToolCallsIgnoresBareHyphenatedToolCallsLookalike(t *testing.T) {
text := `<tool-calls><invoke name="Bash"><parameter name="command">pwd</parameter></invoke></tool-calls>`
calls := ParseToolCalls(text, []string{"Bash"})
if len(calls) != 0 {
t.Fatalf("expected bare hyphenated lookalike to be ignored, got %#v", calls)
}
}
func TestParseToolCallsToleratesDSMLTrailingPipeTagTerminator(t *testing.T) {
text := strings.Join([]string{
`<|DSML|tool_calls| `,
@@ -99,6 +138,61 @@ func TestParseToolCallsSupportsDSMLShellWithCanonicalExampleInCDATA(t *testing.T
}
}
func TestParseToolCallsKeepsHereDocCDATAWithFencedDSMLAndLiteralCDATAEnd(t *testing.T) {
command := strings.Join([]string{
"cat > docs/project-value.md << 'ENDOFFILE'",
"# DS2API project value",
"",
"```xml",
`<|DSML|tool_calls>`,
` <|DSML|invoke name="Bash">`,
` <|DSML|parameter name="command"><![CDATA[grep -E "error|fail" < input.log 2>&1]]></|DSML|parameter>`,
` </|DSML|invoke>`,
`</|DSML|tool_calls>`,
"```",
"",
"Only the literal `]]>` needs special handling.",
"",
"ENDOFFILE",
`echo "Done. Lines: $(wc -l < docs/project-value.md)"`,
}, "\n")
text := `<|DSML|tool_calls><|DSML|invoke name="Bash"><|DSML|parameter name="command"><![CDATA[` + command + `]]></|DSML|parameter><|DSML|parameter name="description"><![CDATA[Write project value doc]]></|DSML|parameter></|DSML|invoke></|DSML|tool_calls>`
calls := ParseToolCalls(text, []string{"Bash"})
if len(calls) != 1 {
t.Fatalf("expected one DSML call with extreme heredoc CDATA, got %#v", calls)
}
got, _ := calls[0].Input["command"].(string)
if got != command {
t.Fatalf("expected full heredoc command to survive, got:\n%q\nwant:\n%q", got, command)
}
if calls[0].Input["description"] != "Write project value doc" {
t.Fatalf("expected sibling parameter after command, got %#v", calls[0].Input)
}
}
func TestParseToolCallsKeepsCompactCDATAWithImmediateFencedDSML(t *testing.T) {
content := strings.Join([]string{
"```xml",
`<|DSML|tool_calls>`,
` <|DSML|invoke name="Bash">`,
` <|DSML|parameter name="command"><![CDATA[echo compact]]></|DSML|parameter>`,
` </|DSML|invoke>`,
`</|DSML|tool_calls>`,
"```",
"tail",
}, "\n")
text := `<tool_calls><invoke name="Write"><parameter name="content"><![CDATA[` + content + `]]></parameter></invoke></tool_calls>`
calls := ParseToolCalls(text, []string{"Write"})
if len(calls) != 1 {
t.Fatalf("expected one compact CDATA call, got %#v", calls)
}
if calls[0].Input["content"] != content {
t.Fatalf("expected compact CDATA content to survive, got %#v", calls[0].Input["content"])
}
}
func TestParseToolCallsPreservesSimpleCDATAInlineMarkupAsText(t *testing.T) {
text := `<tool_calls><invoke name="Write"><parameter name="description"><![CDATA[<b>urgent</b>]]></parameter></invoke></tool_calls>`
calls := ParseToolCalls(text, []string{"Write"})


@@ -555,6 +555,51 @@ func TestSieve_ChineseReviewSamplePreservesInlineDSMLMention(t *testing.T) {
}
}
func TestSieve_HyphenatedDSMLShellWithHereDocCDATA(t *testing.T) {
var state State
chunks := []string{
"<dsml-tool-calls>\n",
"<dsml-invoke name=\"Bash\">\n",
"<dsml-parameter name=\"command\"><![CDATA[git commit -m \"$(cat <<'EOF'\n",
"docs: add missing directory entries and package descriptions to architecture docs\n",
"Fill gaps identified in architecture audit: add artifacts/ and static/ to\n",
"directory tree, and document 7 auxiliary internal/ packages (textclean,\n",
"claudeconv, compat, rawsample, devcapture, util, version) in Section 3.\n\n",
"Co-Authored-By: Claude Opus 4.7 noreply@anthropic.com\n",
"EOF\n",
")\"]]></dsml-parameter>\n",
"<dsml-parameter name=\"description\"><![CDATA[Create commit with architecture doc updates]]></dsml-parameter>\n",
"</dsml-invoke>\n",
"</dsml-tool-calls>",
}
var events []Event
for _, c := range chunks {
events = append(events, ProcessChunk(&state, c, []string{"Bash"})...)
}
events = append(events, Flush(&state, []string{"Bash"})...)
var text strings.Builder
var command string
callCount := 0
for _, e := range events {
text.WriteString(e.Content)
for _, call := range e.ToolCalls {
callCount++
command, _ = call.Input["command"].(string)
}
}
if callCount != 1 {
t.Fatalf("expected 1 hyphenated DSML tool call, got %d, text=%q", callCount, text.String())
}
if !strings.Contains(command, `git commit -m "$(cat <<'EOF'`) || !strings.Contains(command, "Co-Authored-By: Claude Opus 4.7") {
t.Fatalf("here-doc command was not fully preserved, got %q", command)
}
if strings.Contains(text.String(), "dsml-tool-calls") || strings.Contains(text.String(), "git commit -m") {
t.Fatalf("real tool block should not leak into the text content, got %q", text.String())
}
}
func TestSieve_ToleratesDSMLSpaceSeparatorTypo(t *testing.T) {
var state State
chunks := []string{


@@ -265,6 +265,117 @@ func TestProcessToolSieveKeepsCDATAEmbeddedToolClosingBuffered(t *testing.T) {
}
}
func TestProcessToolSieveKeepsExtremeHereDocCDATAUntilOuterClose(t *testing.T) {
var state State
command := strings.Join([]string{
"cat > docs/project-value.md << 'ENDOFFILE'",
"# DS2API project value",
"",
"```xml",
`<|DSML|tool_calls>`,
` <|DSML|invoke name="Bash">`,
` <|DSML|parameter name="command"><![CDATA[grep -E "error|fail" < input.log 2>&1]]></|DSML|parameter>`,
` </|DSML|invoke>`,
`</|DSML|tool_calls>`,
"```",
"",
"Only the literal `]]>` needs special handling.",
"",
"ENDOFFILE",
`echo "Done. Lines: $(wc -l < docs/project-value.md)"`,
}, "\n")
innerClose := strings.Index(command, `</|DSML|tool_calls>`) + len(`</|DSML|tool_calls>`)
chunks := []string{
`<|DSML|tool_calls>` + "\n",
`<|DSML|invoke name="Bash">` + "\n",
`<|DSML|parameter name="command"><![CDATA[` + command[:innerClose],
command[innerClose:],
`]]></|DSML|parameter>` + "\n",
`<|DSML|parameter name="description"><![CDATA[Write project value doc]]></|DSML|parameter>` + "\n",
`</|DSML|invoke>` + "\n",
`</|DSML|tool_calls>`,
}
var events []Event
for i, c := range chunks {
next := ProcessChunk(&state, c, []string{"Bash"})
if i <= 2 {
for _, evt := range next {
if evt.Content != "" || len(evt.ToolCalls) > 0 {
t.Fatalf("expected no events before outer close, chunk=%d events=%#v", i, next)
}
}
}
events = append(events, next...)
}
events = append(events, Flush(&state, []string{"Bash"})...)
var textContent strings.Builder
var gotCommand string
toolCalls := 0
for _, evt := range events {
textContent.WriteString(evt.Content)
if len(evt.ToolCalls) > 0 {
toolCalls += len(evt.ToolCalls)
gotCommand, _ = evt.ToolCalls[0].Input["command"].(string)
}
}
if toolCalls != 1 {
t.Fatalf("expected one parsed tool call, got %d events=%#v", toolCalls, events)
}
if textContent.Len() != 0 {
t.Fatalf("expected no leaked text, got %q", textContent.String())
}
if gotCommand != command {
t.Fatalf("expected full heredoc command to survive, got len=%d want=%d", len(gotCommand), len(command))
}
}
func TestProcessToolSieveKeepsCompactCDATAWithImmediateFencedDSML(t *testing.T) {
var state State
content := strings.Join([]string{
"```xml",
`<|DSML|tool_calls>`,
` <|DSML|invoke name="Bash">`,
` <|DSML|parameter name="command"><![CDATA[echo compact]]></|DSML|parameter>`,
` </|DSML|invoke>`,
`</|DSML|tool_calls>`,
"```",
"tail",
}, "\n")
chunks := []string{
`<tool_calls><invoke name="Write"><parameter name="content"><![CDATA[` + content[:len("```xml\n")],
content[len("```xml\n"):],
`]]></parameter></invoke></tool_calls>`,
}
var events []Event
for _, c := range chunks {
events = append(events, ProcessChunk(&state, c, []string{"Write"})...)
}
events = append(events, Flush(&state, []string{"Write"})...)
var textContent strings.Builder
var gotContent string
toolCalls := 0
for _, evt := range events {
textContent.WriteString(evt.Content)
if len(evt.ToolCalls) > 0 {
toolCalls += len(evt.ToolCalls)
gotContent, _ = evt.ToolCalls[0].Input["content"].(string)
}
}
if toolCalls != 1 {
t.Fatalf("expected one compact CDATA tool call, got %d events=%#v", toolCalls, events)
}
if textContent.Len() != 0 {
t.Fatalf("expected no leaked text, got %q", textContent.String())
}
if gotContent != content {
t.Fatalf("expected compact CDATA content to survive, got len=%d want=%d", len(gotContent), len(content))
}
}
func TestProcessToolSieveFallsBackWhenCDATANeverCloses(t *testing.T) {
var state State
chunks := []string{


@@ -0,0 +1,60 @@
'use strict';
const test = require('node:test');
const assert = require('node:assert/strict');
async function loadUtils() {
return import('../../webui/src/features/chatHistory/chatHistoryUtils.js');
}
test('chat history strict parser merges current input file placeholder', async () => {
const {
buildListModeMessages,
} = await loadUtils();
const t = (key) => key;
const item = {
messages: [{
role: 'user',
content: 'Continue from the latest state in the attached DS2API_HISTORY.txt context. Treat it as the current working state and answer the latest user request directly.',
}],
history_text: [
'<begin▁of▁sentence>',
'<User>hello',
'<Assistant>hi<end▁of▁sentence>',
].join(''),
};
const result = buildListModeMessages(item, t);
assert.equal(result.historyMerged, true);
assert.deepEqual(result.messages, [
{ role: 'user', content: 'hello' },
{ role: 'assistant', content: 'hi' },
]);
});
test('chat history strict parser inserts history after system messages', async () => {
const {
buildListModeMessages,
} = await loadUtils();
const t = (key) => key;
const item = {
messages: [
{ role: 'system', content: 'policy' },
{ role: 'user', content: 'latest' },
],
history_text: [
'<begin▁of▁sentence>',
'<User>old',
'<Assistant>done<end▁of▁sentence>',
].join(''),
};
const result = buildListModeMessages(item, t);
assert.equal(result.historyMerged, true);
assert.deepEqual(result.messages, [
{ role: 'system', content: 'policy' },
{ role: 'user', content: 'old' },
{ role: 'assistant', content: 'done' },
{ role: 'user', content: 'latest' },
]);
});

Some files were not shown because too many files have changed in this diff