feat: expand tool-call parsing resilience, refine model alias resolution, and update API documentation

CJACK
2026-05-10 01:35:43 +08:00
parent 740a78ad5a
commit 77b6d83266
22 changed files with 145 additions and 108 deletions


@@ -32,7 +32,7 @@ Docs: [Overview](README.en.md) / [Architecture](docs/ARCHITECTURE.en.md) / [Depl
| Base URL | `http://localhost:5001` or your deployment domain |
| Default Content-Type | `application/json` |
| Health probes | `GET /healthz`, `GET /readyz` |
| CORS | Enabled (uniformly covers `/v1/*`, `/anthropic/*`, `/v1beta/models/*`, and `/admin/*`; echoes the browser `Origin` when present, otherwise `*`; default allow-list includes `Content-Type`, `Authorization`, `X-API-Key`, `X-Ds2-Target-Account`, `X-Ds2-Source`, `X-Vercel-Protection-Bypass`, `X-Goog-Api-Key`, `Anthropic-Version`, `Anthropic-Beta`, and also accepts third-party preflight-requested headers such as `x-stainless-*`; `/v1/chat/completions` on Vercel Node Runtime matches the same behavior; internal-only `X-Ds2-Internal-Token` remains blocked) |
| CORS | Enabled (uniformly covers `/v1/*`, `/anthropic/*`, `/v1beta/models/*`, `/api/*`, and `/admin/*`; echoes the browser `Origin` when present, otherwise `*`; default allow-list includes `Content-Type`, `Authorization`, `X-API-Key`, `X-Ds2-Target-Account`, `X-Ds2-Source`, `X-Vercel-Protection-Bypass`, `X-Goog-Api-Key`, `Anthropic-Version`, `Anthropic-Beta`, and also accepts third-party preflight-requested headers such as `x-stainless-*`; `/v1/chat/completions` on Vercel Node Runtime matches the same behavior; internal-only `X-Ds2-Internal-Token` remains blocked) |
- All JSON request bodies must be valid UTF-8; malformed byte sequences are rejected on ingress with `400 invalid json`.
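The CORS row above can be sketched as two small helpers. This is a minimal illustration under stated assumptions, not DS2API's actual middleware: `allowOrigin` and `allowHeaders` are hypothetical names, and only the origin-echo rule, the merge of preflight-requested headers, and the block on `X-Ds2-Internal-Token` are taken from the table.

```go
package main

import "strings"

// allowOrigin echoes the browser Origin when present, otherwise "*".
// (Hypothetical helper; the rule itself comes from the CORS row.)
func allowOrigin(origin string) string {
	if strings.TrimSpace(origin) == "" {
		return "*"
	}
	return origin
}

// allowHeaders merges a default allow-list with the headers named in the
// preflight's Access-Control-Request-Headers value. The internal-only
// header X-Ds2-Internal-Token is always dropped, and duplicates are
// removed case-insensitively.
func allowHeaders(defaults []string, requested string) []string {
	seen := map[string]bool{}
	var out []string
	add := func(h string) {
		h = strings.TrimSpace(h)
		key := strings.ToLower(h)
		if h == "" || key == "x-ds2-internal-token" || seen[key] {
			return
		}
		seen[key] = true
		out = append(out, h)
	}
	for _, h := range defaults {
		add(h)
	}
	for _, h := range strings.Split(requested, ",") {
		add(h)
	}
	return out
}
```

Calling `allowHeaders([]string{"Content-Type"}, "x-stainless-os, X-Ds2-Internal-Token")` keeps the third-party `x-stainless-os` header and drops the internal one, which mirrors the behavior the table describes.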
@@ -40,9 +40,10 @@ Docs: [Overview](README.en.md) / [Architecture](docs/ARCHITECTURE.en.md) / [Depl
- OpenAI / Claude / Gemini protocols are now mounted on one shared `chi` router tree assembled in `internal/server/router.go`.
- Adapter responsibilities are streamlined to: **request normalization → DeepSeek invocation → protocol-shaped rendering**, reducing legacy split-logic paths.
- Tool-calling semantics are aligned between Go and Node runtime: models should output the DSML shell `<|DSML|tool_calls>` → `<|DSML|invoke name="...">` → `<|DSML|parameter name="...">`; DS2API also accepts legacy canonical XML `<tool_calls>` → `<invoke name="...">` → `<parameter name="...">`. DSML is normalized back to XML at the parser entry, so internal parsing remains XML-based, with stream-time anti-leak filtering.
- Tool-calling semantics are aligned between Go and Node runtime: models should output the fullwidth-separator DSML shell `<|DSML|tool_calls>` → `<|DSML|invoke name="...">` → `<|DSML|parameter name="...">`; DS2API also accepts the halfwidth DSML wrapper `<|DSML|tool_calls>`, DSML wrapper aliases such as `<dsml|tool_calls>`, `<|tool_calls>`, `<tool_calls>`, common DSML separator drift such as `<|DSML tool_calls>`, collapsed DSML local names such as `<DSMLtool_calls>`, control-separator drift such as `<DSML␂tool_calls>` / raw STX `\x02`, arbitrary protocol prefixes such as `<proto💥tool_calls>`, and legacy canonical XML `<tool_calls>` → `<invoke name="...">` → `<parameter name="...">`. The scanner normalizes fixed local names (`tool_calls` / `invoke` / `parameter`) back to XML before parsing; only wrapped tool blocks or the narrow missing-opening-wrapper repair path enter the tool path, while bare `<invoke>` does not count as supported syntax. JSON literal parameter bodies are preserved as structured values, explicit empty or whitespace-only parameters are preserved as empty strings, malformed complete wrappers are released as plain text, and loose CDATA is narrowly repaired at final parse/flush when it can preserve a complete outer tool call.
- `Admin API` separates static config from runtime policy: `/admin/config*` for configuration state, `/admin/settings*` for runtime behavior.
- When upstream returns a thinking-only response with no visible text, the Go main path for both streaming and non-streaming completions retries once in the same DeepSeek session: it appends the prompt suffix `"Previous reply had no visible output. Please regenerate the visible final answer or tool call now."` and sets `parent_message_id`. If that same-account retry would still end as `429 upstream_empty_output`, managed-account mode switches to the next available account, creates a fresh session, and retries the original payload once before returning 429.
- Citation/reference marker boundary: streaming output hides upstream `[citation:N]` / `[reference:N]` placeholders by default; non-stream output converts DeepSeek search reference markers into Markdown links.
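The empty-output retry order described above can be sketched as follows. This is an illustration only: `attempt`, `callFunc`, and `completeWithRetry` are hypothetical names, the upstream call is a stub, and how the suffix is joined to the prompt is an assumption. Only the retry order and the suffix text come from the notes.

```go
package main

import "errors"

// retrySuffix is the prompt suffix quoted in the notes above.
const retrySuffix = "Previous reply had no visible output. Please regenerate the visible final answer or tool call now."

var errEmptyOutput = errors.New("429 upstream_empty_output")

// attempt describes one upstream call; all fields are illustrative.
type attempt struct {
	Account      string
	FreshSession bool // false means "continue the same session via parent_message_id"
	Prompt       string
}

// callFunc stands in for the DeepSeek invocation and returns the visible text.
type callFunc func(a attempt) string

// completeWithRetry tries the original request, then one same-session retry
// with the suffix appended, then (managed-account mode only) one fresh retry
// of the original prompt on the next account, before giving up with 429.
func completeWithRetry(call callFunc, prompt string, accounts []string, managed bool) (string, error) {
	if len(accounts) == 0 {
		return "", errors.New("no account available")
	}
	first := accounts[0]
	if text := call(attempt{Account: first, FreshSession: true, Prompt: prompt}); text != "" {
		return text, nil
	}
	// Same account, same session: ask the model to regenerate visible output.
	// Joining with a newline is an assumption of this sketch.
	if text := call(attempt{Account: first, Prompt: prompt + "\n" + retrySuffix}); text != "" {
		return text, nil
	}
	// Managed-account mode: one fresh retry of the original payload elsewhere.
	if managed && len(accounts) > 1 {
		if text := call(attempt{Account: accounts[1], FreshSession: true, Prompt: prompt}); text != "" {
			return text, nil
		}
	}
	return "", errEmptyOutput
}
```

Each branch runs at most once, so a request makes at most three upstream calls before the final 429.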
---
@@ -227,16 +228,18 @@ For `chat` / `responses` / `embeddings`, DS2API follows a wide-input/strict-outp
1. Match DeepSeek native model IDs first.
2. Then match exact keys in `model_aliases`.
3. If still unmatched, fall back by known family heuristics (`o*`, `gpt-*`, `claude-*`, etc.).
4. If still unmatched, return `invalid_request_error`.
3. If the request name ends with `-nothinking`, resolve the base alias and append the corresponding no-thinking variant.
4. If still unmatched, return `invalid_request_error`. Unknown model families are not guessed heuristically; add explicit compatibility names through `model_aliases`.
Built-in aliases come from `internal/config/models.go`; `config.model_aliases` can override or add mappings at runtime. Excerpt:
- OpenAI / Codex: `gpt-4o`, `gpt-4.1`, `gpt-5`, `gpt-5.5`, `gpt-5-codex`, `gpt-5.3-codex`, `codex-mini-latest`
- OpenAI reasoning: `o1`, `o3`, `o3-deep-research`, `o4-mini`
- Claude: `claude-opus-4-6`, `claude-sonnet-4-6`, `claude-haiku-4-5`, `claude-3-5-sonnet-latest`
- Gemini: `gemini-2.5-pro`, `gemini-2.5-flash`, `gemini-pro-vision`
- Other compatibility families: `llama-*`, `qwen-*`, `mistral-*`, and `command-*` fall back through family heuristics
- Gemini: `gemini-2.5-pro`, `gemini-2.5-flash`, `gemini-3.1-pro`, `gemini-3-pro`, `gemini-3-flash`, `gemini-3.1-flash-lite`, `gemini-pro-vision`
- Other exact built-in aliases: `llama-3.1-70b-instruct`, `qwen-max`
Aliases with a `-nothinking` suffix also map to the corresponding forced no-thinking DeepSeek model.
Current vision support resolves only to `deepseek-v4-vision` and does not expose a separate `vision-search` variant.
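A minimal sketch of the updated resolution order, assuming the native model set and alias table are plain maps. `resolveModel` is a hypothetical name, not the function in `internal/config/models.go`, and the alias `my-vision-alias` is invented for illustration.

```go
package main

import (
	"errors"
	"strings"
)

var errInvalidRequest = errors.New("invalid_request_error")

const noThinkingSuffix = "-nothinking"

// resolveModel applies the documented order: native IDs, exact alias keys,
// then the -nothinking suffix rule. There is no family-heuristic fallback.
func resolveModel(name string, native map[string]bool, aliases map[string]string) (string, error) {
	if native[name] {
		return name, nil
	}
	if target, ok := aliases[name]; ok {
		return target, nil
	}
	if strings.HasSuffix(name, noThinkingSuffix) {
		base := strings.TrimSuffix(name, noThinkingSuffix)
		if target, ok := aliases[base]; ok {
			// Only accept the variant if it is a known native model.
			if variant := target + noThinkingSuffix; native[variant] {
				return variant, nil
			}
		}
	}
	return "", errInvalidRequest
}
```

With `my-vision-alias` mapped to `deepseek-v4-vision`, the request name `my-vision-alias-nothinking` resolves to `deepseek-v4-vision-nothinking`, while an unlisted name such as `llama-unknown` returns `invalid_request_error` instead of being guessed by family.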
@@ -244,7 +247,7 @@ Retired historical families such as `claude-1.*`, `claude-2.*`, `claude-instant-
### `POST /v1/chat/completions`
> Path note: besides the canonical `/v1/chat/completions`, DS2API also accepts the root shortcut `/chat/completions`. On Vercel Runtime, `stream=true` on either path is handled by the Node streaming bridge, while non-stream stays on the Go primary path.
> Path note: besides the canonical `/v1/chat/completions`, DS2API also accepts the root shortcut `/chat/completions`. On Vercel Runtime, `vercel.json` rewrites only the canonical `/v1/chat/completions` path to the Node streaming bridge; the root shortcut stays on the Go primary path. Use `/v1/chat/completions` on Vercel when real-time streaming is required.
**Headers**:
@@ -257,7 +260,7 @@ Content-Type: application/json
| Field | Type | Required | Notes |
| --- | --- | --- | --- |
| `model` | string | ✅ | DeepSeek native models + common aliases (`gpt-5.5`, `gpt-5.4-mini`, `gpt-5.3-codex`, `o3`, `claude-opus-4-6`, `gemini-2.5-pro`, `gemini-2.5-flash`, etc.) |
| `model` | string | ✅ | DeepSeek native models + common aliases (`gpt-5.5`, `gpt-5.4-mini`, `gpt-5.3-codex`, `o3`, `claude-opus-4-6`, `gemini-2.5-pro`, `gemini-3.1-pro`, `gemini-3-flash`, etc.); `-nothinking` suffixes force thinking / reasoning off |
| `messages` | array | ✅ | OpenAI-style messages |
| `stream` | boolean | ❌ | Default `false` |
| `tools` | array | ❌ | Function calling schema |
@@ -352,7 +355,8 @@ When `tools` is present, DS2API performs anti-leak handling:
Additional notes:
- The parser treats DSML shell tool blocks (`<|DSML|tool_calls>` / `<|DSML|invoke name="...">` / `<|DSML|parameter name="...">`) and legacy canonical XML tool blocks (`<tool_calls>` / `<invoke name="...">` / `<parameter name="...">`) as executable tool calls. DSML is normalized back to XML at the parser entry; internal parsing remains XML-based. Legacy `<tools>`, `<tool_call>`, `<tool_name>`, `<param>`, `<function_call>`, `tool_use`, antml variants, and standalone JSON `tool_calls` payloads are treated as plain text.
- The parser treats the recommended DSML shell tool blocks (`<|DSML|tool_calls>` / `<|DSML|invoke name="...">` / `<|DSML|parameter name="...">`), halfwidth DSML shell blocks (`<|DSML|tool_calls>` / `<|DSML|invoke name="...">` / `<|DSML|parameter name="...">`), DSML wrapper aliases (`<dsml|tool_calls>`, `<|tool_calls>`, `<tool_calls>`), common DSML separator drift (`<|DSML tool_calls>` / `<|DSML invoke>` / `<|DSML parameter>`), collapsed DSML local names (`<DSMLtool_calls>` / `<DSMLinvoke>` / `<DSMLparameter>`), control-separator drift (`<DSML␂tool_calls>` / raw STX `\x02`), arbitrary protocol prefixes (`<proto💥tool_calls>`), and legacy canonical XML tool blocks (`<tool_calls>` / `<invoke name="...">` / `<parameter name="...">`) as executable tool calls. These shells are normalized back to XML first, and internal parsing remains XML-based. Legacy `<tools>`, `<tool_call>`, `<tool_name>`, `<param>`, `<function_call>`, `tool_use`, antml variants, and standalone JSON `tool_calls` payloads are treated as plain text; complete but malformed wrappers are also released as plain text.
- The parser no longer drops tool calls solely because parameter values are empty; explicit empty strings or whitespace-only parameters become empty strings in structured `tool_calls`. Prompting still tells the model not to emit blank parameters, and missing/empty argument rejection belongs in the tool executor or client schema validation.
- If the final visible response text is empty but the reasoning stream contains an executable tool call, Chat / Responses emits a standard OpenAI `tool_calls` / `function_call` output during finalization. If thinking/reasoning was not enabled by the client, that reasoning text is used only for detection and is not exposed as visible text or `reasoning_content`.
- `tool_calls` shown inside fenced markdown code blocks (for example, ```json ... ```) are treated as examples, not executable calls.
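The tag-normalization step in the notes above can be approximated with one regular expression. This is a deliberately permissive sketch, not the scanner DS2API ships: `shellTag` and `normalizeToolTags` are hypothetical names, the tool name `get_weather` in the usage note is invented, and the sketch omits the wrapper gate, fenced-code-block detection, and CDATA repair.

```go
package main

import "regexp"

// shellTag matches an opening or closing tag whose local name is one of the
// fixed tool names, optionally preceded by a prefix shell (for example
// "|DSML|", "|DSML|", "DSML", or "proto💥") and at most one space.
var shellTag = regexp.MustCompile(`<(/?)[^<>"'=\s/]*? ?(tool_calls|invoke|parameter)([\s>/])`)

// normalizeToolTags rewrites prefixed tool tags back to canonical XML tags.
// Tags with other local names, such as <tool_call>, are left untouched.
func normalizeToolTags(s string) string {
	return shellTag.ReplaceAllString(s, "<${1}${2}${3}")
}
```

For example, `<|DSML|invoke name="get_weather">` becomes `<invoke name="get_weather">`, and `<|DSML tool_calls>` becomes `<tool_calls>`. An XML parser can then consume the result, which is the "normalize first, parse as XML" design the notes describe.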
@@ -765,6 +769,7 @@ Reads runtime settings and status, including:
- `responses` / `embeddings`
- `auto_delete` (`mode`: `none` / `single` / `all`; legacy `sessions=true` is still treated as `all`)
- `current_input_file` (`enabled` defaults to `true`, plus `min_chars`)
- `thinking_injection` (`enabled` defaults to `true`, `prompt`, and `default_prompt`)
- `model_aliases`
- `env_backed`, `needs_vercel_sync`
- `toolcall` policy is fixed to `feature_match + high` and is no longer returned or editable via settings
@@ -779,6 +784,7 @@ Hot-updates runtime settings. Supported fields:
- `embeddings.provider`
- `auto_delete.mode`
- `current_input_file.enabled` / `current_input_file.min_chars`
- `thinking_injection.enabled` / `thinking_injection.prompt`
- `model_aliases`
- `toolcall` policy is fixed and is no longer writable through settings

24
API.md

@@ -32,7 +32,7 @@
| Base URL | `http://localhost:5001` or your deployment domain |
| Default Content-Type | `application/json` |
| Health probes | `GET /healthz`, `GET /readyz` |
| CORS | Enabled (uniformly covers `/v1/*`, `/anthropic/*`, `/v1beta/models/*`, and `/admin/*`; echoes the browser `Origin` when present, otherwise `*`; allows `Content-Type`, `Authorization`, `X-API-Key`, `X-Ds2-Target-Account`, `X-Ds2-Source`, `X-Vercel-Protection-Bypass`, `X-Goog-Api-Key`, `Anthropic-Version`, `Anthropic-Beta` by default, and also lets through third-party request headers declared in the preflight, such as `x-stainless-*`; the Node Runtime for `/v1/chat/completions` on Vercel matches the same behavior; the internal-only header `X-Ds2-Internal-Token` remains blocked) |
| CORS | Enabled (uniformly covers `/v1/*`, `/anthropic/*`, `/v1beta/models/*`, `/api/*`, and `/admin/*`; echoes the browser `Origin` when present, otherwise `*`; allows `Content-Type`, `Authorization`, `X-API-Key`, `X-Ds2-Target-Account`, `X-Ds2-Source`, `X-Vercel-Protection-Bypass`, `X-Goog-Api-Key`, `Anthropic-Version`, `Anthropic-Beta` by default, and also lets through third-party request headers declared in the preflight, such as `x-stainless-*`; the Node Runtime for `/v1/chat/completions` on Vercel matches the same behavior; the internal-only header `X-Ds2-Internal-Token` remains blocked) |
- All JSON request bodies must be valid UTF-8; invalid byte sequences are rejected on ingress with `400 invalid json`.
@@ -40,7 +40,7 @@
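- The OpenAI / Claude / Gemini protocols are mounted on one shared `chi` router tree, assembled in `internal/server/router.go`.
- Adapter responsibilities are narrowed to: **request normalization → DeepSeek invocation → protocol-shaped rendering**, reducing the legacy forks where one capability was implemented in several places.
- Tool Calling parsing is kept consistent between Go and Node Runtime: models should output the DSML shell `<|DSML|tool_calls>` → `<|DSML|invoke name="...">` → `<|DSML|parameter name="...">`; the compatibility layer also accepts the DSML wrapper aliases `<dsml|tool_calls>`, `<|tool_calls>`, `<tool_calls>`, common forms that drop a DSML separator (such as `<|DSML tool_calls>`), the common typo that glues `DSML` to the tool tag name (such as `<DSMLtool_calls>`), control-separator drift (such as `<DSML␂tool_calls>` / raw STX `\x02`), arbitrary protocol-prefix shells (such as `<proto💥tool_calls>`), and legacy canonical XML `<tool_calls>` → `<invoke name="...">` → `<parameter name="...">`. The implementation uses a structural scan: as long as the fixed local tag name is `tool_calls` / `invoke` / `parameter`, the prefix shell is normalized at the parser entry; only a `tool_calls` wrapper or a repairable missing opening wrapper enters the tool path, and bare `<invoke>` does not count as supported syntax; streaming continues to apply anti-leak filtering. If a parameter body is itself a valid JSON literal (such as `123`, `true`, `null`, an array, or an object), it is emitted as a structured value instead of always being treated as a string; if CDATA is occasionally left unclosed, a narrow repair runs during the final parse / flush recovery stage to preserve the already fully wrapped outer tool call where possible.
- Tool Calling parsing is kept consistent between Go and Node Runtime: models should output the fullwidth-separator DSML shell `<|DSML|tool_calls>` → `<|DSML|invoke name="...">` → `<|DSML|parameter name="...">`; the compatibility layer also accepts the halfwidth DSML wrapper `<|DSML|tool_calls>`, the DSML wrapper aliases `<dsml|tool_calls>`, `<|tool_calls>`, `<tool_calls>`, common forms that drop a DSML separator (such as `<|DSML tool_calls>`), the common typo that glues `DSML` to the tool tag name (such as `<DSMLtool_calls>`), control-separator drift (such as `<DSML␂tool_calls>` / raw STX `\x02`), arbitrary protocol-prefix shells (such as `<proto💥tool_calls>`), and legacy canonical XML `<tool_calls>` → `<invoke name="...">` → `<parameter name="...">`. The implementation uses a structural scan: as long as the fixed local tag name is `tool_calls` / `invoke` / `parameter`, the prefix shell is normalized at the parser entry; only a `tool_calls` wrapper or a repairable missing opening wrapper enters the tool path, and bare `<invoke>` does not count as supported syntax; streaming continues to apply anti-leak filtering. If a parameter body is itself a valid JSON literal (such as `123`, `true`, `null`, an array, or an object), it is emitted as a structured value instead of always being treated as a string; explicit empty-string and whitespace-only parameters are preserved as structured empty strings, and whether to reject missing arguments is left to the tool executor; a complete but malformed wrapper is released as plain text, neither swallowed nor forged into a tool call; if CDATA is occasionally left unclosed, a narrow repair runs during the final parse / flush recovery stage to preserve the already fully wrapped outer tool call where possible.
- `Admin API` separates configuration from runtime policy: `/admin/config*` manages static configuration, `/admin/settings*` manages runtime behavior.
- When upstream returns a thinking-only response (the model produced a reasoning chain but no visible text), both streaming and non-streaming completions on the Go main path first retry automatically once: as a multi-turn follow-up, they append the prompt suffix `"Previous reply had no visible output. Please regenerate the visible final answer or tool call now."` and set `parent_message_id` so the model regenerates its output within the same DeepSeek session; the same-account retry runs at most once. If the request would still return `429 upstream_empty_output` after the same-account retry, managed-account mode switches to the next available account before returning 429, creates a new session, and performs one fresh retry with the original payload.
- Citation marker boundary: streaming output hides upstream-internal placeholders such as `[citation:N]` / `[reference:N]` by default; non-streaming output converts DeepSeek search citation markers into Markdown reference links by default.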
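@@ -172,12 +172,12 @@ Gemini-compatible clients can also use `x-goog-api-key`, `?key=`, or `?api_key=`
| GET | `/admin/chat-history/{id}` | Admin | View a single server-side conversation record |
| DELETE | `/admin/chat-history/{id}` | Admin | Delete a single server-side conversation record |
| PUT | `/admin/chat-history/settings` | Admin | Update the conversation record retention count |
Server-side records are essentially an archive of DeepSeek upstream responses: generation endpoints that talk to DeepSeek directly, such as OpenAI Chat, OpenAI Responses, Claude Messages, and Gemini GenerateContent, write the record after receiving the upstream response and before each protocol's back-translation/trimming; the list is shown in reverse order of request creation time; streaming requests keep refreshing status and details during generation. Requests sent from the WebUI "API Test" page also enter these records.
| GET | `/admin/version` | Admin | Query the current version and the latest Release |
OpenAI `/v1/*` remains the canonical path. For clients configured only with the DS2API root URL, the same OpenAI handlers are also exposed through root-path shortcut routes: `/models`, `/models/{id}`, `/chat/completions`, `/responses`, `/responses/{response_id}`, `/embeddings`, `/files`, `/files/{file_id}`.
Server-side records are essentially an archive of DeepSeek upstream responses: generation endpoints that talk to DeepSeek directly, such as OpenAI Chat, OpenAI Responses, Claude Messages, and Gemini GenerateContent, write the record after receiving the upstream response and before each protocol's back-translation/trimming; the list is shown in reverse order of request creation time; streaming requests keep refreshing status and details during generation. Requests sent from the WebUI "API Test" page also enter these records.
---
## Health Checks
@@ -231,16 +231,15 @@ OpenAI `/v1/*` remains the canonical path. For clients configured only with the DS2API root URL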
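1. Match DeepSeek native models first.
2. Then match exact `model_aliases` mappings.
3. If the request name ends with `-nothinking`, append the corresponding no-thinking variant to the finally resolved canonical model.
4. If unmatched, fall back by model-family rules (such as `o*`, `gpt-*`, `claude-*`).
5. If still unmatched, return `invalid_request_error`.
4. If unmatched, return `invalid_request_error`. Unknown model families are currently not resolved by heuristic fallback; to add a compatibility name, configure it explicitly through `model_aliases`.
The built-in default aliases come from `internal/config/models.go`; `config.model_aliases` overrides or supplements same-name mappings at runtime. Excerpt:
- OpenAI / Codex: `gpt-4o`, `gpt-4.1`, `gpt-5`, `gpt-5.5`, `gpt-5-codex`, `gpt-5.3-codex`, `codex-mini-latest`
- OpenAI reasoning: `o1`, `o3`, `o3-deep-research`, `o4-mini`
- Claude: `claude-opus-4-6`, `claude-sonnet-4-6`, `claude-haiku-4-5`, `claude-3-5-sonnet-latest`
- Gemini: `gemini-2.5-pro`, `gemini-2.5-flash`, `gemini-pro-vision`
- Other compatibility families: `llama-*`, `qwen-*`, `mistral-*`, `command-*` fall back through family heuristics
- Gemini: `gemini-2.5-pro`, `gemini-2.5-flash`, `gemini-3.1-pro`, `gemini-3-pro`, `gemini-3-flash`, `gemini-3.1-flash-lite`, `gemini-pro-vision`
- Other exact built-in aliases: `llama-3.1-70b-instruct`, `qwen-max`
When a request appends the `-nothinking` suffix to any of the aliases above, it also maps to the corresponding version with thinking forcibly disabled.
Current vision support corresponds only to `deepseek-v4-vision` / `deepseek-v4-vision-nothinking` and does not resolve to a separate `vision-search` variant.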
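@@ -249,7 +248,7 @@ OpenAI `/v1/*` remains the canonical path. For clients configured only with the DS2API root URL
### `POST /v1/chat/completions`
> Path note: besides the canonical `/v1/chat/completions`, the root shortcut alias `/chat/completions` is also supported. On Vercel Runtime, `stream=true` requests on both paths enter the Node streaming bridge, while non-stream requests stay on the Go main path.
> Path note: besides the canonical `/v1/chat/completions`, the root shortcut alias `/chat/completions` is also supported. On Vercel Runtime, `vercel.json` rewrites only the canonical `/v1/chat/completions` path to the Node streaming bridge; the root shortcut alias stays on the Go main path. Use `/v1/chat/completions` on Vercel when real-time streaming is required.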
**Headers**:
@@ -262,7 +261,7 @@ Content-Type: application/json
| Field | Type | Required | Notes |
| --- | --- | --- | --- |
| `model` | string | ✅ | Supports DeepSeek native models + common aliases (`gpt-5.5`, `gpt-5.4-mini`, `gpt-5.3-codex`, `o3`, `claude-opus-4-6`, `claude-sonnet-4-6`, `gemini-2.5-pro`, `gemini-2.5-flash`, etc.); a `-nothinking` suffix on the model name forces thinking / reasoning off |
| `model` | string | ✅ | Supports DeepSeek native models + common aliases (`gpt-5.5`, `gpt-5.4-mini`, `gpt-5.3-codex`, `o3`, `claude-opus-4-6`, `claude-sonnet-4-6`, `gemini-2.5-pro`, `gemini-3.1-pro`, `gemini-3-flash`, etc.); a `-nothinking` suffix on the model name forces thinking / reasoning off |
| `messages` | array | ✅ | OpenAI-style message array |
| `stream` | boolean | ❌ | Default `false` |
| `tools` | array | ❌ | Function Calling definitions |
@@ -358,7 +357,8 @@ data: [DONE]
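Additional notes:
- In **non-code-block context**, a tool payload is recognized by feature matching and produces an executable tool call even when it is mixed with ordinary text (the surrounding ordinary text is still passed through).
- The parser currently parses the DSML shell (`<|DSML|tool_calls>` / `<|DSML|invoke name="...">` / `<|DSML|parameter name="...">`), DSML wrapper aliases (`<dsml|tool_calls>`, `<|tool_calls>`, `<tool_calls>`), common forms that drop a DSML separator (such as `<|DSML tool_calls>` / `<|DSML invoke>` / `<|DSML parameter>`), the common typo that glues `DSML` to the tool tag name (such as `<DSMLtool_calls>` / `<DSMLinvoke>` / `<DSMLparameter>`), control-separator drift (such as `<DSML␂tool_calls>` / raw STX `\x02`), arbitrary protocol-prefix shells (such as `<proto💥tool_calls>`), and legacy canonical XML tool blocks (`<tool_calls>` / `<invoke name="...">` / `<parameter name="...">`) as executable calls; these prefix shells are first normalized back to XML, and internal parsing still follows XML semantics. Legacy `<tools>`, `<tool_call>`, `<tool_name>`, `<param>`, `<function_call>`, `tool_use`, antml-style variants, and pure JSON `tool_calls` fragments are all treated as plain text by default.
- The parser currently parses the recommended DSML shell (`<|DSML|tool_calls>` / `<|DSML|invoke name="...">` / `<|DSML|parameter name="...">`), the halfwidth DSML shell (`<|DSML|tool_calls>` / `<|DSML|invoke name="...">` / `<|DSML|parameter name="...">`), DSML wrapper aliases (`<dsml|tool_calls>`, `<|tool_calls>`, `<tool_calls>`), common forms that drop a DSML separator (such as `<|DSML tool_calls>` / `<|DSML invoke>` / `<|DSML parameter>`), the common typo that glues `DSML` to the tool tag name (such as `<DSMLtool_calls>` / `<DSMLinvoke>` / `<DSMLparameter>`), control-separator drift (such as `<DSML␂tool_calls>` / raw STX `\x02`), arbitrary protocol-prefix shells (such as `<proto💥tool_calls>`), and legacy canonical XML tool blocks (`<tool_calls>` / `<invoke name="...">` / `<parameter name="...">`) as executable calls; these prefix shells are first normalized back to XML, and internal parsing still follows XML semantics. Legacy `<tools>`, `<tool_call>`, `<tool_name>`, `<param>`, `<function_call>`, `tool_use`, antml-style variants, and pure JSON `tool_calls` fragments are all treated as plain text by default; complete but malformed wrappers are likewise released as plain text.
- The parsing layer does not drop a tool call because a parameter value is empty; explicit empty-string or whitespace-only parameters enter structured `tool_calls` as empty strings. The prompt asks the model not to emit empty parameters on its own, and rejecting missing arguments or empty commands is the responsibility of the tool executor or client schema validation.
- When the final visible text is empty but the reasoning chain contains an executable tool call, Chat / Responses emits a standard OpenAI `tool_calls` / `function_call` output during finalization; if the client did not enable thinking / reasoning, that reasoning chain is used only for detection and is not exposed as visible text or `reasoning_content`.
- `tool_calls` inside Markdown fenced code blocks (for example, ```json ... ```) are treated as example text only and are not executed.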
@@ -775,6 +775,7 @@ data: {"type":"message_stop"}
- `responses` / `embeddings`
- `auto_delete` (`mode`: `none` / `single` / `all`; the legacy setting `sessions=true` is still treated as `all`)
- `current_input_file` (`enabled` returns `true` by default, plus `min_chars`)
- `thinking_injection` (`enabled` returns `true` by default, plus `prompt` and `default_prompt`)
- `model_aliases`
- `env_backed`, `needs_vercel_sync`
- `toolcall` policy is fixed to `feature_match + high` and is no longer returned or modified through settings
@@ -789,6 +790,7 @@ data: {"type":"message_stop"}
- `embeddings.provider`
- `auto_delete.mode`
- `current_input_file.enabled` / `current_input_file.min_chars`
- `thinking_injection.enabled` / `thinking_injection.prompt`
- `model_aliases`
- `toolcall` policy is fixed and is no longer a writable field


@@ -134,7 +134,8 @@ flowchart LR
| OpenAI compatible | `GET /v1/models`, `GET /v1/models/{id}`, `POST /v1/chat/completions`, `POST /v1/responses`, `GET /v1/responses/{response_id}`, `POST /v1/embeddings`, `POST /v1/files`, `GET /v1/files/{file_id}` |
| Claude compatible | `GET /anthropic/v1/models`, `POST /anthropic/v1/messages`, `POST /anthropic/v1/messages/count_tokens` (plus shortcut paths `/v1/messages`, `/messages`) |
| Gemini compatible | `POST /v1beta/models/{model}:generateContent`, `POST /v1beta/models/{model}:streamGenerateContent` (plus `/v1/models/{model}:*` paths) |
| Unified CORS compatibility | `/v1/*`, `/anthropic/*`, `/v1beta/models/*`, and `/admin/*` share one CORS policy; on Vercel, the Node Runtime for `/v1/chat/completions` follows the same allow rules, minimizing restrictions on third-party preflight request headers |
| Ollama compatible | `GET /api/version`, `GET /api/tags`, `POST /api/show` |
| Unified CORS compatibility | `/v1/*`, `/anthropic/*`, `/v1beta/models/*`, `/api/*`, and `/admin/*` share one CORS policy; on Vercel, the Node Runtime for `/v1/chat/completions` follows the same allow rules, minimizing restrictions on third-party preflight request headers |
| Multi-account rotation | Automatic token refresh, email/mobile dual login |
| Concurrency queue control | Per-account in-flight limit + waiting queue, dynamically computed recommended concurrency |
| DeepSeek PoW | Pure Go high-performance implementation (DeepSeekHashV1), millisecond-level response |
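@@ -195,11 +196,11 @@ OpenAI `/v1/*` remains the recommended canonical path; also supports `/models`, `/chat/com
- Point `ANTHROPIC_BASE_URL` directly at the DS2API root URL (for example `http://127.0.0.1:5001`); Claude Code requests `/v1/messages?beta=true`.
- `ANTHROPIC_API_KEY` must match `keys` in `config.json`; keeping both a regular key and an `sk-ant-*` style key is recommended for compatibility with different clients' validation habits.
- If a system proxy is set, configure `NO_PROXY=127.0.0.1,localhost,<your_host_ip>` for the DS2API address so local loopback requests are not intercepted by the proxy.
- If tool calls are rendered as text and not executed, first check whether the model output is the recommended DSML tool block: `<|DSML|tool_calls><|DSML|invoke name="..."><|DSML|parameter name="...">...`. The compatibility layer also accepts legacy canonical XML: `<tool_calls><invoke name="..."><parameter name="...">...`; legacy `<tools>` / `<tool_call>` / `<tool_name>` / `<param>`, `<function_call>`, `tool_use`, or pure JSON `tool_calls` fragments are not executed.
- If tool calls are rendered as text and not executed, first check whether the model output is the recommended fullwidth-separator DSML tool block: `<|DSML|tool_calls><|DSML|invoke name="..."><|DSML|parameter name="...">...`. The compatibility layer also accepts halfwidth DSML and legacy canonical XML: `<tool_calls><invoke name="..."><parameter name="...">...`; legacy `<tools>` / `<tool_call>` / `<tool_name>` / `<param>`, `<function_call>`, `tool_use`, or pure JSON `tool_calls` fragments are not executed and are handled as plain text.
### Gemini Endpoint
The Gemini adapter maps model names to DeepSeek native models through `model_aliases` or built-in rules, supports both the `generateContent` and `streamGenerateContent` call patterns, and fully supports Tool Calling (`functionDeclarations` → `functionCall` output). If the Gemini model name carries a `-nothinking` suffix, for example `gemini-2.5-pro-nothinking`, it maps to the corresponding model with thinking forcibly disabled.
The Gemini adapter maps model names to DeepSeek native models through `model_aliases` or exact built-in aliases (covering common names such as `gemini-2.5-*`, `gemini-3*`, and `gemini-pro-vision`), supports both the `generateContent` and `streamGenerateContent` call patterns, and fully supports Tool Calling (`functionDeclarations` → `functionCall` output). If the Gemini model name carries a `-nothinking` suffix, for example `gemini-2.5-pro-nothinking`, it maps to the corresponding model with thinking forcibly disabled.
## Quick Start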
@@ -295,13 +296,13 @@ cp config.example.json config.json
base64 < config.json | tr -d '\n'
```
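> **Streaming note**: On Vercel, OpenAI Chat streaming is handled by `api/chat-stream.js` (Node Runtime), supporting the canonical path `/v1/chat/completions` and the root shortcut alias `/chat/completions`. Auth, account selection, and session/PoW preparation are still completed by the Go internal prepare endpoint; streaming responses (including `tools`) are assembled on the Node side with Go-aligned output assembly and anti-leak handling. Although only OpenAI chat streaming goes through Node here, the CORS allow policy stays consistent with the Go main router and uniformly covers third-party client preflight scenarios.
> **Streaming note**: On Vercel, OpenAI Chat streaming is handled by `api/chat-stream.js` (Node Runtime), but `vercel.json` rewrites only the canonical path `/v1/chat/completions` to Node; the root shortcut alias `/chat/completions` stays on the Go main path. Auth, account selection, and session/PoW preparation are still completed by the Go internal prepare endpoint; streaming responses (including `tools`) are assembled on the Node side with Go-aligned output assembly and anti-leak handling. Use `/v1/chat/completions` on Vercel when real-time streaming is required.
For detailed deployment instructions, see the [Deployment Guide](docs/DEPLOY.md).
### Option 4: Run from Local Source
**Prerequisites**: Go 1.26+, Node.js `20.19+` or `22.12+` (only when the WebUI needs to be built); also make sure `npm` is available, `npm 10+` recommended
**Prerequisites**: Go 1.26+, Node.js `20.19+` or `22.12+` (only when the WebUI needs to be built; CI / Docker builds use Node 24); also make sure `npm` is available, `npm 10+` recommended
```bash
# 1. Clone the repository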
@@ -320,7 +321,7 @@ go run ./cmd/ds2api
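The server actually binds to `0.0.0.0:5001`, so devices on the same LAN can usually reach it through your private IP as well.
> **WebUI auto-build**: On first local startup, if `static/admin` does not exist, DS2API automatically tries to run `npm ci` (only when dependencies are missing) and `npm run build -- --outDir static/admin --emptyOutDir` (requires Node.js and npm on the machine). You can also build manually: `./scripts/build-webui.sh`
> **WebUI auto-build**: On first local startup, if the WebUI static directory does not exist, DS2API automatically tries to run `npm ci --prefix webui` (only when dependencies are missing) and `npm run build --prefix webui -- --outDir static/admin --emptyOutDir` (requires Node.js and npm on the machine; the static directory can be overridden with `DS2API_STATIC_ADMIN_DIR`). You can also build manually: `./scripts/build-webui.sh`
## Configuration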
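@@ -372,12 +373,13 @@ Gemini routes can also use `x-goog-api-key`, or, when no auth header is present, use `
When a request carries `tools`, DS2API performs anti-leak handling and structured translation:
1. Executable toolcall recognition is enabled only in **non-code-block context** (code-block examples do not trigger it by default)
2. The parsing layer currently treats the DSML shell as the recommended executable call: `<|DSML|tool_calls>` → `<|DSML|invoke name="...">` → `<|DSML|parameter name="...">`; it also accepts legacy canonical XML `<tool_calls>` → `<invoke name="...">` → `<parameter name="...">`. DSML is only a shell alias, and internal parsing still follows XML semantics; legacy `<tools>` / `<tool_call>` / `<tool_name>` / `<param>`, `<function_call>`, `tool_use` / antml variants, and pure JSON `tool_calls` fragments are all treated as plain text
2. The parsing layer currently treats the fullwidth-separator DSML shell as the recommended executable call: `<|DSML|tool_calls>` → `<|DSML|invoke name="...">` → `<|DSML|parameter name="...">`; it also accepts halfwidth DSML, legacy canonical XML `<tool_calls>` → `<invoke name="...">` → `<parameter name="...">`, and several forms of DSML prefix/separator drift. DSML is only a shell alias, and internal parsing still follows XML semantics; legacy `<tools>` / `<tool_call>` / `<tool_name>` / `<param>`, `<function_call>`, `tool_use` / antml variants, and pure JSON `tool_calls` fragments are all treated as plain text, and complete but malformed wrappers are released as plain text too
3. `responses` streaming strictly uses the official item lifecycle events (`response.output_item.*`, `response.content_part.*`, `response.function_call_arguments.*`)
4. `responses` supports and enforces `tool_choice` (`auto`/`none`/`required`/forced function); a `required` violation returns `422` for non-streaming and `response.failed` for streaming
5. Tool calls are returned in whichever protocol the client requested (the native OpenAI/Claude/Gemini structures); the model side is first constrained to output canonical XML, which the compatibility layer then translates
> Note: in the current version the parser layer prioritizes "parse successfully whenever possible"; every well-formed XML tool call passes through, with no tool-name allow-list filtering.
> The parsing layer preserves explicit empty-string or whitespace-only parameters; the prompt asks the model not to emit empty parameters on its own, and rejecting missing arguments or empty commands is the responsibility of the tool executor or client schema validation.
>
> To evaluate the approach of "wrapping tool calls as XML before feeding them to the model", see `docs/toolcall-semantics.md`.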


@@ -131,7 +131,8 @@ For the full module-by-module architecture and directory responsibilities, see [
| OpenAI compatible | `GET /v1/models`, `GET /v1/models/{id}`, `POST /v1/chat/completions`, `POST /v1/responses`, `GET /v1/responses/{response_id}`, `POST /v1/embeddings`, `POST /v1/files`, `GET /v1/files/{file_id}` |
| Claude compatible | `GET /anthropic/v1/models`, `POST /anthropic/v1/messages`, `POST /anthropic/v1/messages/count_tokens` (plus shortcut paths `/v1/messages`, `/messages`) |
| Gemini compatible | `POST /v1beta/models/{model}:generateContent`, `POST /v1beta/models/{model}:streamGenerateContent` (plus `/v1/models/{model}:*` paths) |
| Unified CORS compatibility | `/v1/*`, `/anthropic/*`, `/v1beta/models/*`, and `/admin/*` share one CORS policy; on Vercel, the Node Runtime for `/v1/chat/completions` mirrors the same relaxed preflight behavior for third-party clients |
| Ollama compatible | `GET /api/version`, `GET /api/tags`, `POST /api/show` |
| Unified CORS compatibility | `/v1/*`, `/anthropic/*`, `/v1beta/models/*`, `/api/*`, and `/admin/*` share one CORS policy; on Vercel, the Node Runtime for `/v1/chat/completions` mirrors the same relaxed preflight behavior for third-party clients |
| Multi-account rotation | Auto token refresh, email/mobile dual login |
| Concurrency control | Per-account in-flight limit + waiting queue, dynamic recommended concurrency |
| DeepSeek PoW | Pure Go high-performance solver (DeepSeekHashV1), ms-level response |
@@ -184,11 +185,11 @@ Besides the primary aliases above, `/anthropic/v1/models` also returns Claude 4.
- Set `ANTHROPIC_BASE_URL` to the DS2API root URL (for example `http://127.0.0.1:5001`). Claude Code sends requests to `/v1/messages?beta=true`.
- `ANTHROPIC_API_KEY` must match an entry in `keys` from `config.json`. Keeping both a regular key and an `sk-ant-*` style key improves client compatibility.
- If your environment has proxy variables, set `NO_PROXY=127.0.0.1,localhost,<your_host_ip>` for DS2API to avoid proxy interception of local traffic.
- If tool calls are rendered as plain text and not executed, first verify the model output uses the recommended DSML block: `<|DSML|tool_calls><|DSML|invoke name="..."><|DSML|parameter name="...">...`. DS2API also accepts legacy canonical XML: `<tool_calls><invoke name="..."><parameter name="...">...`; legacy `<tools>` / `<tool_call>` / `<tool_name>` / `<param>`, `<function_call>`, `tool_use`, or standalone JSON `tool_calls` are not executed.
- If tool calls are rendered as plain text and not executed, first verify the model output uses the recommended fullwidth-separator DSML block: `<|DSML|tool_calls><|DSML|invoke name="..."><|DSML|parameter name="...">...`. DS2API also accepts halfwidth DSML and legacy canonical XML: `<tool_calls><invoke name="..."><parameter name="...">...`; legacy `<tools>` / `<tool_call>` / `<tool_name>` / `<param>`, `<function_call>`, `tool_use`, or standalone JSON `tool_calls` are not executed and stay plain text.
### Gemini Endpoint
The Gemini adapter maps model names to DeepSeek native models via `model_aliases` or built-in heuristics, supporting both `generateContent` and `streamGenerateContent` call patterns with full Tool Calling support (`functionDeclarations` → `functionCall` output).
The Gemini adapter maps model names to DeepSeek native models via `model_aliases` or exact built-in aliases (covering common `gemini-2.5-*`, `gemini-3*`, and `gemini-pro-vision` names), supporting both `generateContent` and `streamGenerateContent` call patterns with full Tool Calling support (`functionDeclarations` → `functionCall` output). If the Gemini model name has a `-nothinking` suffix, such as `gemini-2.5-pro-nothinking`, it maps to the corresponding forced no-thinking model.
## Quick Start
@@ -283,13 +284,13 @@ Recommended: convert `config.json` to Base64 locally, then paste into `DS2API_CO
base64 < config.json | tr -d '\n'
```
> **Streaming note**: OpenAI Chat streaming on Vercel is routed to `api/chat-stream.js` (Node Runtime), with both the canonical `/v1/chat/completions` path and the root shortcut `/chat/completions` supported. Auth, account selection, and session/PoW preparation are still handled by the Go internal prepare endpoint; streaming output (including `tools`) is assembled on Node with Go-aligned anti-leak handling. This is the only interface family currently routed through Node, and its CORS allow behavior is kept aligned with the Go router so third-party preflight handling stays unified.
> **Streaming note**: OpenAI Chat streaming on Vercel is routed to `api/chat-stream.js` (Node Runtime), but `vercel.json` rewrites only the canonical `/v1/chat/completions` path to Node; the root shortcut `/chat/completions` stays on the Go main path. Auth, account selection, and session/PoW preparation are still handled by the Go internal prepare endpoint; streaming output (including `tools`) is assembled on Node with Go-aligned anti-leak handling. Use `/v1/chat/completions` on Vercel when real-time streaming is required.
For detailed deployment instructions, see the [Deployment Guide](docs/DEPLOY.en.md).
### Option 4: Local Run
**Prerequisites**: Go 1.26+, Node.js `20.19+` or `22.12+` (only if building WebUI locally)
**Prerequisites**: Go 1.26+, Node.js `20.19+` or `22.12+` (only if building WebUI locally; CI / Docker builds use Node 24), and npm available; npm 10+ is recommended
```bash
# 1. Clone
@@ -308,7 +309,7 @@ Default local URL: `http://127.0.0.1:5001`
The server actually binds to `0.0.0.0:5001`, so devices on the same LAN can usually reach it through your private IP as well.
> **WebUI auto-build**: On first local startup, if `static/admin` is missing, DS2API will auto-run `npm ci` (only when dependencies are missing) and `npm run build -- --outDir static/admin --emptyOutDir` (requires Node.js). You can also build manually: `./scripts/build-webui.sh`
> **WebUI auto-build**: On first local startup, if the WebUI static directory is missing, DS2API auto-runs `npm ci --prefix webui` (only when dependencies are missing) and `npm run build --prefix webui -- --outDir static/admin --emptyOutDir` (requires Node.js; `DS2API_STATIC_ADMIN_DIR` can override the static directory). You can also build manually: `./scripts/build-webui.sh`
## Configuration
@@ -349,7 +350,7 @@ Queue limit = DS2API_ACCOUNT_MAX_QUEUE (default = recommended concurrency)
```
- When inflight slots are full, requests enter a waiting queue — **no immediate 429**
- 429 is returned only when total load exceeds inflight + queue capacity
- 429 is returned only when total load exceeds inflight + queue capacity; current responses do not include `Retry-After`
- Completion empty-output 429s first get the same-account compensation retry; managed-account mode also tries one alternate-account fresh retry before returning the final 429
- `GET /admin/queue/status` returns real-time concurrency state
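
The admission rule in the bullets above can be sketched in Go as follows. This is a minimal illustration, not DS2API's actual scheduler: the `admit` helper, the `Decision` type, and the limits used in `main` are all hypothetical names and values chosen for the example.

```go
package main

import "fmt"

// Decision describes what happens to an incoming request for one account.
type Decision string

const (
	RunNow Decision = "run"    // an inflight slot is free
	Queued Decision = "queued" // slots are full, but the waiting queue has room
	Reject Decision = "429"    // inflight + queue capacity is exhausted
)

// admit is a hypothetical helper that mirrors the documented rule:
// requests wait in the queue when slots are full, and 429 is returned
// only once both the inflight slots and the queue are full.
func admit(inflight, waiting, maxInflight, maxQueue int) Decision {
	switch {
	case inflight < maxInflight:
		return RunNow
	case waiting < maxQueue:
		return Queued
	default:
		return Reject
	}
}

func main() {
	// Assumed limits for illustration: 2 inflight slots, queue of 2.
	fmt.Println(admit(1, 0, 2, 2)) // run
	fmt.Println(admit(2, 1, 2, 2)) // queued
	fmt.Println(admit(2, 2, 2, 2)) // 429
}
```

Since the current responses carry no `Retry-After`, a client that receives the final 429 would have to pick its own backoff interval.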
@@ -358,12 +359,13 @@ Queue limit = DS2API_ACCOUNT_MAX_QUEUE (default = recommended concurrency)
When `tools` is present in the request, DS2API performs anti-leak handling:
1. Toolcall feature matching is enabled only in **non-code-block context** (fenced examples are ignored)
2. The parser now treats the DSML shell as the recommended executable tool-calling syntax: `<|DSML|tool_calls>` → `<|DSML|invoke name="...">` → `<|DSML|parameter name="...">`; it also accepts legacy canonical XML `<tool_calls>` → `<invoke name="...">` → `<parameter name="...">`. DSML is a shell alias and internal parsing remains XML-based; legacy `<tools>` / `<tool_call>` / `<tool_name>` / `<param>`, `<function_call>`, `tool_use`, antml variants, and standalone JSON `tool_calls` payloads are treated as plain text
2. The parser treats the fullwidth-separator DSML shell as the recommended executable tool-calling syntax: `<｜DSML｜tool_calls>` → `<｜DSML｜invoke name="...">` → `<｜DSML｜parameter name="...">`; it also accepts halfwidth DSML, legacy canonical XML `<tool_calls>` → `<invoke name="...">` → `<parameter name="...">`, plus common DSML prefix/separator drift. DSML is a shell alias and internal parsing remains XML-based; legacy `<tools>` / `<tool_call>` / `<tool_name>` / `<param>`, `<function_call>`, `tool_use`, antml variants, and standalone JSON `tool_calls` payloads are treated as plain text, and complete but malformed wrappers are released as plain text too
3. `responses` streaming strictly uses official item lifecycle events (`response.output_item.*`, `response.content_part.*`, `response.function_call_arguments.*`)
4. `responses` supports and enforces `tool_choice` (`auto`/`none`/`required`/forced function); `required` violations return `422` for non-stream and `response.failed` for stream
5. The output protocol follows the client request (OpenAI / Claude / Gemini native shapes); model-side prompting can prefer XML, and the compatibility layer handles the protocol-specific translation
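
To make the "DSML is a shell alias" point in item 2 concrete, here is a minimal Go sketch that rewrites DSML shell tags into the canonical XML tags before any parsing. It assumes the fullwidth separator is U+FF5C; `normalizeDSMLShell` and `dsmlShellToXML` are hypothetical names, and the real parser is described as far more tolerant (prefix/separator drift, fenced-code awareness), none of which this sketch attempts.

```go
package main

import (
	"fmt"
	"strings"
)

// fw is U+FF5C (fullwidth vertical line). Treating it as the
// "fullwidth separator" is an assumption made for this example.
const fw = "\uFF5C"

// dsmlShellToXML maps fullwidth and halfwidth DSML tag prefixes onto
// plain XML tag prefixes. Illustrative only, not the project's parser.
var dsmlShellToXML = strings.NewReplacer(
	"</"+fw+"DSML"+fw, "</", // fullwidth closing tags
	"<"+fw+"DSML"+fw, "<", // fullwidth opening tags
	"</|DSML|", "</", // halfwidth closing tags
	"<|DSML|", "<", // halfwidth opening tags
)

// normalizeDSMLShell returns s with DSML shell tags rewritten as canonical XML.
func normalizeDSMLShell(s string) string {
	return dsmlShellToXML.Replace(s)
}

func main() {
	in := "<" + fw + "DSML" + fw + `invoke name="read_file">` +
		"</" + fw + "DSML" + fw + "invoke>"
	fmt.Println(normalizeDSMLShell(in))
}
```

After this step a single XML-oriented parser can handle both shells, which is the design the list describes; text that matches neither prefix passes through unchanged.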
> Note: the current parser still prioritizes “parse successfully whenever possible”; hard allow-list rejection for undeclared tool names is not enabled yet.
> Explicit empty strings or whitespace-only parameters are preserved by the parser; prompting tells the model not to emit blank parameters, and missing/empty argument rejection belongs in the tool executor or client schema validation.
## Local Dev Packet Capture


@@ -41,6 +41,7 @@ ds2api/
│ │ ├── admin/ # Admin API root assembly and resource packages
│ │ ├── claude/ # Claude HTTP protocol adapter
│ │ ├── gemini/ # Gemini HTTP protocol adapter
│ │ ├── ollama/ # Ollama-compatible model/capability query endpoints
│ │ ├── openai/ # OpenAI HTTP surface
│ │ │ ├── chat/ # Chat Completions execution entrypoint
│ │ │ ├── responses/ # Responses API and response store
@@ -57,6 +58,7 @@ ds2api/
│ ├── prompt/ # Prompt composition
│ ├── promptcompat/ # API request -> DeepSeek web-chat plain-text compatibility
│ ├── rawsample/ # Raw sample read/write and management
│ ├── responsehistory/ # DeepSeek upstream response archive and session snapshots
│ ├── server/ # Router and middleware assembly
│ │ └── data/ # Router/runtime helper data
│ ├── sse/ # SSE parsing utilities
@@ -188,6 +190,7 @@ flowchart LR
- `internal/server`: router tree + middlewares (health, protocol routes, Admin/WebUI).
- `internal/httpapi/openai/*`: OpenAI HTTP surface split into chat, responses, files, embeddings, history, and shared packages; chat/responses share the promptcompat, stream, and toolcall semantics.
- `internal/httpapi/{claude,gemini}`: protocol adapters that normalize into the same prompt compatibility semantics; normal direct paths must share DeepSeek session/PoW/completion execution through `completionruntime`, while `translatorcliproxy` is reserved for Vercel prepare/release, missing-backend fallback, and regression tests.
- `internal/httpapi/ollama`: Ollama-compatible model list and capability query endpoints.
- `internal/httpapi/requestbody`: shared HTTP body reading, JSON pre-validation, and UTF-8 error helpers across protocol adapters.
- `internal/promptcompat`: compatibility core for turning OpenAI/Claude/Gemini requests into DeepSeek web-chat plain-text context.
- `internal/assistantturn`: Go output-side canonical semantics, converting DeepSeek SSE collection results and stream finalization state into assistant turns and centralizing thinking, tool call, citation, usage, stop/error behavior.
@@ -199,6 +202,7 @@ flowchart LR
- `internal/toolcall` + `internal/toolstream`: DSML shell compatibility plus canonical XML tool-call parsing and anti-leak sieve; DSML is normalized back to XML at the entrypoint, and internal parsing remains XML-based.
- `internal/httpapi/admin/*`: Admin API root assembly plus auth/accounts/config/settings/proxies/rawsamples/vercel/history/devcapture/version resource packages.
- `internal/chathistory`: server-side conversation history persistence, pagination, detail lookup, and retention policy.
- `internal/responsehistory`: DeepSeek upstream response archive, saving assistant text, thinking, raw tool-call fragments, and streaming detail before protocol rendering/trimming.
- `internal/config`: config loading/validation + runtime settings hot-reload.
- `internal/account`: managed account pool, inflight slots, waiting queue.
- `internal/textclean`: text cleanup helpers, e.g. stripping `[reference: N]` markers.


@@ -41,6 +41,7 @@ ds2api/
│ │ ├── admin/ # Admin API root assembly and resource packages
│ │ ├── claude/ # Claude HTTP protocol adapter
│ │ ├── gemini/ # Gemini HTTP protocol adapter
│ │ ├── ollama/ # Ollama-compatible model/capability query endpoints
│ │ ├── openai/ # OpenAI HTTP surface
│ │ │ ├── chat/ # Chat Completions execution entrypoint
│ │ │ ├── responses/ # Responses API and response store
@@ -57,6 +58,7 @@ ds2api/
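│ ├── prompt/ # Prompt composition
│ ├── promptcompat/ # Compatibility layer from API requests to DeepSeek web-chat plain-text context
│ ├── rawsample/ # Raw sample read/write and management
│ ├── responsehistory/ # DeepSeek upstream response archive and session snapshots
│ ├── server/ # Router and middleware assembly
│ │ └── data/ # Router/runtime helper data
│ ├── sse/ # SSE parsing utilities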
@@ -188,6 +190,7 @@ flowchart LR
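- `internal/server`: router tree and middleware mounting (health checks, protocol entrypoints, Admin/WebUI).
- `internal/httpapi/openai/*`: OpenAI HTTP surface, split into chat, responses, files, embeddings, history, and shared; chat/responses share core semantics such as promptcompat, stream, and toolcall.
- `internal/httpapi/{claude,gemini}`: protocol input/output adapters normalized to the same prompt compatibility semantics; normal direct paths must share DeepSeek session/PoW/completion calls through `completionruntime`, while `translatorcliproxy` is kept only for Vercel prepare/release, missing-backend fallback, and regression tests.
- `internal/httpapi/ollama`: Ollama-compatible model list and capability query entrypoints.
- `internal/httpapi/requestbody`: request body reading, pre-decode JSON validation, and UTF-8 error handling helpers shared across protocols.
- `internal/promptcompat`: compatibility core that turns OpenAI/Claude/Gemini requests into DeepSeek web-chat plain-text context.
- `internal/assistantturn`: unified Go output-side semantic layer that normalizes DeepSeek SSE collection results and stream finalization state into assistant turns, centralizing thinking, tool call, citation, usage, and stop/error semantics.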
@@ -199,6 +202,7 @@ flowchart LR
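- `internal/toolcall` + `internal/toolstream`: DSML shell compatibility plus canonical XML tool-call parsing and anti-leak sieving; DSML is normalized back to XML at the entrypoint, and internal parsing remains XML-based.
- `internal/httpapi/admin/*`: Admin API root assembly plus resource packages such as auth/accounts/config/settings/proxies/rawsamples/vercel/history/devcapture/version.
- `internal/chathistory`: server-side conversation history persistence, pagination, single-record detail, and retention policy.
- `internal/responsehistory`: DeepSeek upstream response archive; saves assistant text, thinking, raw tool-call fragments, and streaming detail before protocol back-translation/trimming.
- `internal/config`: config loading, validation, and runtime settings hot-reload.
- `internal/account`: managed account pool, concurrency slots, waiting queue.
- `internal/textclean`: text cleanup, removing noise such as `[reference: N]` markers.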


@@ -9,8 +9,8 @@ Thanks for your interest in contributing to DS2API!
### Prerequisites
- Go 1.26+
- Node.js `20.19+` or `22.12+` (for WebUI development)
- npm (bundled with Node.js)
- Node.js `20.19+` or `22.12+` (for WebUI development; CI / Docker builds use Node 24)
- npm (bundled with Node.js; 10+ recommended)
### Backend Development


@@ -9,8 +9,8 @@
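### Prerequisites
- Go 1.26+
- Node.js `20.19+` or `22.12+` (for WebUI development)
- npm (bundled with Node.js)
- Node.js `20.19+` or `22.12+` (for WebUI development; CI / Docker builds use Node 24)
- npm (bundled with Node.js; 10+ recommended)
### Backend Development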


@@ -39,8 +39,8 @@ Recommended order when choosing a deployment method:
| Dependency | Minimum Version | Notes |
| --- | --- | --- |
| Go | 1.26+ | Build backend |
| Node.js | `20.19+` or `22.12+` | Only needed to build WebUI locally |
| npm | Bundled with Node.js | Install WebUI dependencies |
| Node.js | `20.19+` or `22.12+` (CI / Docker builds use Node 24) | Only needed to build WebUI locally |
| npm | Bundled with Node.js; 10+ recommended | Install WebUI dependencies |
Config source (choose one):
@@ -299,6 +299,8 @@ VERCEL_TEAM_ID=team_xxxxxxxxxxxx # optional for personal accounts
| `DS2API_VERCEL_INTERNAL_SECRET` | Hybrid streaming internal auth | Falls back to `DS2API_ADMIN_KEY` |
| `DS2API_VERCEL_STREAM_LEASE_TTL_SECONDS` | Stream lease TTL | `900` |
| `DS2API_RAW_STREAM_SAMPLE_ROOT` | Raw stream sample root for saving/reading samples | `tests/raw_stream_samples` |
| `DS2API_STATIC_ADMIN_DIR` | WebUI static asset directory | `static/admin` |
| `DS2API_AUTO_BUILD_WEBUI` | Whether local startup auto-builds missing WebUI assets (`1/true/yes/on` or `0/false/no/off`) | Enabled outside Vercel |
| `VERCEL_TOKEN` | Vercel sync token | — |
| `VERCEL_PROJECT_ID` | Vercel project ID | — |
| `VERCEL_TEAM_ID` | Vercel team ID | — |
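
The accepted values for `DS2API_AUTO_BUILD_WEBUI` in the table above suggest a simple boolean-flag parser. The Go sketch below shows one way such a flag could be read; `parseBoolFlag` is a hypothetical helper, and both the case-insensitive matching and the fallback to a default for empty or unrecognized values are assumptions rather than documented behavior.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// parseBoolFlag interprets the spellings listed in the table
// (1/true/yes/on and 0/false/no/off). Anything else, including an
// unset variable, yields def. This fallback is an assumption.
func parseBoolFlag(raw string, def bool) bool {
	switch strings.ToLower(strings.TrimSpace(raw)) {
	case "1", "true", "yes", "on":
		return true
	case "0", "false", "no", "off":
		return false
	default:
		return def
	}
}

func main() {
	// Assumed default: enabled, as the table states for non-Vercel runs.
	autoBuild := parseBoolFlag(os.Getenv("DS2API_AUTO_BUILD_WEBUI"), true)
	fmt.Println("auto-build WebUI:", autoBuild)
}
```

Keeping the default as an explicit argument lets the caller pass `false` on Vercel and `true` elsewhere without a second code path.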
@@ -321,7 +323,7 @@ Request ──────┐
```
- **Go entry**: `api/index.go` (Serverless Go)
- **Stream entry**: `api/chat-stream.js` (Node Runtime for real-time SSE)
- **Stream entry**: `api/chat-stream.js` (Node Runtime for real-time SSE; `vercel.json` rewrites only the canonical `/v1/chat/completions` path here, while the root shortcut `/chat/completions` stays on the Go entry)
- **Routing**: `vercel.json`
- **Build command**: `npm ci --prefix webui && npm run build --prefix webui` (automatic)
@@ -438,7 +440,7 @@ Default local access URL: `http://127.0.0.1:5001`; the server actually binds to
### 4.2 WebUI Build
On first local startup, if `static/admin/` is missing, DS2API will automatically attempt to build the WebUI (requires Node.js/npm; when dependencies are missing it runs `npm ci` first, then `npm run build -- --outDir static/admin --emptyOutDir`).
On first local startup, if the WebUI static directory is missing, DS2API automatically attempts to build it (requires Node.js/npm; when dependencies are missing it runs `npm ci --prefix webui`, then `npm run build --prefix webui -- --outDir <static-dir> --emptyOutDir`). The default static directory is `static/admin/`, and `DS2API_STATIC_ADMIN_DIR` can override it.
Manual build:


@@ -39,8 +39,8 @@
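| Dependency | Minimum Version | Notes |
| --- | --- | --- |
| Go | 1.26+ | Build backend |
| Node.js | `20.19+` or `22.12+` | Only needed to build WebUI locally |
| npm | Bundled with Node.js | Install WebUI dependencies |
| Node.js | `20.19+` or `22.12+` (CI / Docker builds use Node 24) | Only needed to build WebUI locally |
| npm | Bundled with Node.js; 10+ recommended | Install WebUI dependencies |
Config source (choose one):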
@@ -299,6 +299,8 @@ VERCEL_TEAM_ID=team_xxxxxxxxxxxx # optional for personal accounts
| `DS2API_VERCEL_INTERNAL_SECRET` | Hybrid streaming internal auth | Falls back to `DS2API_ADMIN_KEY` |
| `DS2API_VERCEL_STREAM_LEASE_TTL_SECONDS` | Stream lease TTL | `900` |
| `DS2API_RAW_STREAM_SAMPLE_ROOT` | Root directory for saving/reading raw stream samples | `tests/raw_stream_samples` |
| `DS2API_STATIC_ADMIN_DIR` | WebUI static asset directory | `static/admin` |
| `DS2API_AUTO_BUILD_WEBUI` | Whether local startup auto-builds missing WebUI assets (`1/true/yes/on` or `0/false/no/off`) | Enabled by default outside Vercel |
| `VERCEL_TOKEN` | Vercel sync token | — |
| `VERCEL_PROJECT_ID` | Vercel project ID | — |
| `VERCEL_TEAM_ID` | Vercel team ID | — |
@@ -331,7 +333,7 @@ api/index.go api/chat-stream.js
```
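- **Go entry**: `api/index.go` (Serverless Go)
- **Stream entry**: `api/chat-stream.js` (Node Runtime, ensures real-time SSE)
- **Stream entry**: `api/chat-stream.js` (Node Runtime, ensures real-time SSE; `vercel.json` rewrites only the canonical `/v1/chat/completions` path here, while the root shortcut alias `/chat/completions` stays on the Go entry)
- **Route rewrites**: `vercel.json`
- **Build command**: `npm ci --prefix webui && npm run build --prefix webui` (runs automatically)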
@@ -448,7 +450,7 @@ go run ./cmd/ds2api
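### 4.2 WebUI Build
On first local startup, if `static/admin/` does not exist, the service automatically attempts to build the WebUI (requires Node.js/npm; when dependencies are missing it first runs `npm ci`, then `npm run build -- --outDir static/admin --emptyOutDir`).
On first local startup, if the WebUI static directory does not exist, the service automatically attempts to build the WebUI (requires Node.js/npm; when dependencies are missing it first runs `npm ci --prefix webui`, then `npm run build --prefix webui -- --outDir <static-dir> --emptyOutDir`). The default static directory is `static/admin/`, and `DS2API_STATIC_ADMIN_DIR` can override it.
You can also build manually: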


@@ -81,7 +81,7 @@ For tool call issues, run first:
```bash
go test -v ./internal/toolcall ./internal/toolstream -count=1
node --test tests/node/stream-tool-sieve.test.js tests/node/chat-stream.test.js
./tests/scripts/run-unit-node.sh
```
## 5. Test Selection


@@ -75,7 +75,7 @@ npm run build --prefix webui
1. **Preflight checks**:
- `go test ./... -count=1` (unit tests)
- `./tests/scripts/check-node-split-syntax.sh` (syntax gate for the split Node modules)
- `node --test tests/node/stream-tool-sieve.test.js tests/node/chat-stream.test.js tests/node/js_compat_test.js`
- `node --test --test-concurrency=1 tests/node/stream-tool-sieve.test.js tests/node/chat-stream.test.js tests/node/chat-history-utils.test.js tests/node/js_compat_test.js`
- `npm run build --prefix webui` (WebUI build check)
2. **Isolated startup**: copy `config.json` to a temporary directory and start a standalone service process
@@ -203,10 +203,10 @@ go test ./...
```bash
# Run tool-call-related tests (recommended for debugging tool-call parsing issues)
go test -v -run 'TestParseToolCalls|TestRepair' ./internal/toolcall/
go test -v -run 'TestParseToolCalls|TestProcessToolSieve|TestRepair' ./internal/toolcall ./internal/toolstream
# Run a single test case
go test -v -run TestParseToolCallsWithDeepSeekHallucination ./internal/toolcall/
go test -v -run TestParseToolCallsAllowsAllEmptyParameterPayload ./internal/toolcall
# Run format-related tests
go test -v ./internal/format/...
@@ -221,23 +221,23 @@ go test -v ./internal/httpapi/openai/...
```bash
# 1. Run all tool-call-related tests
go test -v -run 'TestParseToolCalls|TestRepair' ./internal/toolcall/
go test -v -run 'TestParseToolCalls|TestProcessToolSieve|TestRepair' ./internal/toolcall ./internal/toolstream
# 2. Inspect the detailed debug information in the test output
go test -v -run TestParseToolCallsWithDeepSeekHallucination ./internal/toolcall/ 2>&1
go test -v -run TestProcessToolSieveReleasesMalformedExecutableXMLBlock ./internal/toolstream 2>&1
# 3. Check the repair behavior of specific test cases
# Test cases live in internal/toolcall/toolcalls_test.go and include:
# - TestParseToolCallsWithDeepSeekHallucination: typical DeepSeek hallucinated output
# Key tests live in internal/toolcall/toolcalls_test.go and internal/toolstream/tool_sieve_xml_test.go and include:
# - TestParseToolCallsAllowsAllEmptyParameterPayload: empty parameters preserved structurally
# - TestProcessToolSieveReleasesMalformedExecutableXMLBlock: malformed XML wrapper released as text
# - TestRepairLooseJSONWithNestedObjects: bracket repair for nested objects
# - TestParseToolCallsWithMixedWindowsPaths: Windows path handling
```
### Running Node.js Tests
```bash
# Run the Node tests
node --test tests/node/stream-tool-sieve.test.js
node --test --test-concurrency=1 tests/node/stream-tool-sieve.test.js tests/node/chat-stream.test.js tests/node/chat-history-utils.test.js tests/node/js_compat_test.js
# Or use the script
./tests/scripts/run-unit-node.sh


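@@ -111,7 +111,7 @@ The core idea of DS2API today is not to take the client's `messages`, `tools`
- OpenAI Chat / Responses natively go through unified OpenAI normalization and DeepSeek payload assembly; Claude / Gemini reuse the OpenAI prompt/tool semantics wherever possible, and Gemini directly reuses `promptcompat.BuildOpenAIPromptForAdapter`. The Go main service adds a `completionruntime` startup layer that uniformly executes DeepSeek session/PoW/call, and the output side adds an `assistantturn` semantic layer: non-streaming OpenAI Chat / Responses / Claude / Gemini first normalize the DeepSeek SSE collection result into one shared assistant turn and then render it into each protocol's native shape; streaming OpenAI Chat / Responses / Claude / Gemini keep each protocol's real-time SSE framing, but the final tool fallback, schema normalization, usage, and empty-output / content-filter error semantics are likewise decided by `assistantturn`. The regular Go main path for Claude / Gemini no longer relies on internal `httptest` forwarding to the OpenAI handler; `translatorcliproxy` is kept only for the Vercel bridge, missing-backend fallback, and regression tests, and is not the main protocol-translation hub.
- The Vercel Node streaming path is not migrated in this round and still uses the existing Node bridge / stream-tool-sieve implementation; if the Node streaming semantics change later, they must be kept aligned with the Go canonical output semantics of `assistantturn`.
- Client-supplied thinking / reasoning switches are normalized to the downstream `thinking_enabled`. Gemini `generationConfig.thinkingConfig.thinkingBudget` is translated into the same thinking switch; when it is off, the compatibility layer does not emit `response/thinking_content` as visible text even if the upstream returns it. If the finally resolved model name carries the `-nothinking` suffix, thinking is unconditionally forced off, taking priority over `thinking` / `reasoning` / `reasoning_effort` in the request body. When not explicitly disabled, each surface enables thinking according to the resolved DeepSeek model's default capability and exposes it in its own native form: OpenAI Chat as `reasoning_content`, OpenAI Responses as `response.reasoning.delta` / `reasoning` content, Claude as `thinking` block / `thinking_delta`, and Gemini as a `thought: true` part.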
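- For the non-streaming finalization of OpenAI Chat / Responses, if the final visible text is empty, the compatibility layer first tries to parse standalone DSML / XML tool blocks in the chain of thought as real tool calls. The streaming path performs the same fallback detection at finalization, but does not intercept or rewrite streamed output midway because of chain-of-thought content; real tool recognition is always based on the raw upstream text rather than on a version already cleaned for visible output, so even though the final visible layer strips complete leaked DSML / XML `tool_calls` wrappers and suppresses invalid wrapper blocks, real tool calls are still converted into structured `tool_calls` / `function_call`. The recovered result is returned as this turn's structured assistant `tool_calls` / `function_call` output rather than being stuffed into `content` text; if the client has not enabled thinking / reasoning, the chain of thought is used only for detection and is not exposed as `reasoning_content` or visible text. Only when the text is empty and the chain of thought contains no executable tool call does handling continue as an empty-reply error.
- For the non-streaming finalization of OpenAI Chat / Responses, if the final visible text is empty, the compatibility layer first tries to parse standalone DSML / XML tool blocks in the chain of thought as real tool calls. The streaming path performs the same fallback detection at finalization, but does not intercept or rewrite streamed output midway because of chain-of-thought content; real tool recognition is always based on the raw upstream text rather than on a version already cleaned for visible output. The final visible layer strips complete leaked DSML / XML `tool_calls` wrappers that have already been parsed into tool calls; if a wrapper is complete but its interior does not match executable tool-call semantics (for example a malformed XML tool shell such as `<param>`), the streaming sieve releases the block as plain text instead of swallowing it or fabricating a tool call. The recovered result is returned as this turn's structured assistant `tool_calls` / `function_call` output rather than being stuffed into `content` text; if the client has not enabled thinking / reasoning, the chain of thought is used only for detection and is not exposed as `reasoning_content` or visible text. Only when the text is empty and the chain of thought contains no executable tool call does handling continue as an empty-reply error.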
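- Before the empty-reply error handling of OpenAI Chat / Responses, Claude Messages, and Gemini generateContent, one internal compensation retry is performed by default: after the first upstream run finishes completely, if the final visible text is empty, no tool call was parsed, no tool call has already been streamed to the client, and the stop reason is not `content_filter`, the compatibility layer reuses the same `chat_session_id`, account, token, and tool policy, appends the fixed suffix `Previous reply had no visible output. Please regenerate the visible final answer or tool call now.` to the original completion `prompt`, and submits it once more. On the Go main path, non-streaming retries are handled uniformly by `completionruntime.ExecuteNonStreamWithRetry` and streaming retries by `completionruntime.ExecuteStreamWithRetry`; each protocol runtime is only responsible for consuming/rendering its own SSE framing. The retry follows the DeepSeek multi-turn conversation protocol: it extracts `response_message_id` from the first upstream SSE stream and sets `parent_message_id` to that value in the retry payload, so the retry becomes a follow-up turn in the same session rather than a disconnected root message; it also fetches PoW again and falls back to the original PoW if that fails. This same-account retry does not re-normalize messages, does not create a new session, and does not insert a retry marker into the streaming client; the second round of thinking / reasoning is appended as normal deltas directly after the first and continues to be deduplicated with overlap trim. If a 429 `upstream_empty_output` is about to be returned after the same-account compensation retry and the current mode is managed-account mode, the Go main path switches to the next available account before returning the 429, creates a new `chat_session_id`, and performs one fresh retry with the original completion payload; this account-switch retry carries no empty-reply prompt suffix and does not set the previous account's `parent_message_id`. If no account is available to switch to, or the fresh retry after switching still yields no visible text or tool call, the original error is returned: 503 `upstream_unavailable` when there is no output at all, and 429 `upstream_empty_output` when there is reasoning but no visible text or tool call. If any attempt triggers an empty `content_filter`, no compensation retry is made and the `content_filter` error is kept. The JS Vercel runtime also sets `parent_message_id`, but reuses the original PoW because it cannot call the PoW API directly; the account-switch fresh retry is currently provided by the Go main path.
- In the final visible-text rendering stage, non-streaming OpenAI Chat / Responses, Claude Messages, and Gemini generateContent replace the `[citation:N]` / `[reference:N]` markers in DeepSeek search results with the corresponding Markdown links. `citation` markers are resolved as one-based indices; `reference` markers are mapped as zero-based indices only when `[reference:0]` (a space after the colon is allowed) appears in the same text segment, and this does not affect `citation` markers in that segment.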
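
The index rule in the last bullet (one-based `citation`, zero-based `reference` only when `[reference:0]` is present) can be sketched in Go as below. This is an illustration under stated assumptions, not the project's renderer: `linkMarkers` is a hypothetical name, the `[N](url)` link shape is invented for the example, `reference` markers are assumed to be one-based when no `[reference:0]` appears, and out-of-range markers are simply left untouched.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

var (
	// markerPattern matches [citation:N] and [reference:N], allowing
	// optional whitespace after the colon.
	markerPattern = regexp.MustCompile(`\[(citation|reference):\s*(\d+)\]`)
	// referenceZero detects the zero-based signal described above.
	referenceZero = regexp.MustCompile(`\[reference:\s*0\]`)
)

// linkMarkers replaces markers in text with Markdown links into urls.
// The [N](url) output shape is an assumption made for this sketch.
func linkMarkers(text string, urls []string) string {
	zeroBased := referenceZero.MatchString(text)
	return markerPattern.ReplaceAllStringFunc(text, func(m string) string {
		parts := markerPattern.FindStringSubmatch(m)
		n, err := strconv.Atoi(parts[2])
		if err != nil {
			return m // number too large to parse; keep the marker
		}
		idx := n - 1 // citation markers are always one-based
		if parts[1] == "reference" && zeroBased {
			idx = n // reference markers switch to zero-based
		}
		if idx < 0 || idx >= len(urls) {
			return m // no matching source; keep the marker
		}
		return fmt.Sprintf("[%d](%s)", n, urls[idx])
	})
}

func main() {
	urls := []string{"https://a.example", "https://b.example"}
	fmt.Println(linkMarkers("See [reference:0] and [citation:1].", urls))
}
```

Scanning the whole segment for `[reference:0]` before replacing anything is what keeps the two marker families independent: the zero-based switch applies to `reference` only, while `citation` keeps its one-based mapping in the same text.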
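@@ -167,7 +167,7 @@ OpenAI Chat / Responses, after normalization and before the current input file, by default
3. Then append the unified DSML tool-call shell format constraints.
4. Merge the whole block into the system prompt.
Positive tool-call examples now demonstrate the official DSML style first: `<|DSML|tool_calls>` → `<|DSML|invoke name="...">` → `<|DSML|parameter name="...">`.
Positive tool-call examples now demonstrate the fullwidth-separator DSML style first: `<｜DSML｜tool_calls>` → `<｜DSML｜invoke name="...">` → `<｜DSML｜parameter name="...">`.
The compatibility layer still accepts the legacy plain `<tool_calls>` wrapper and tolerates several DSML tag variants, including the hyphenated forms `<dsml-tool-calls>` / `<dsml-invoke>` / `<dsml-parameter>`, the underscore forms `<dsml_tool_calls>` / `<dsml_invoke>` / `<dsml_parameter>`, and other prefix-separator shapes such as `<vendor|tool_calls>` / `<vendor_tool_calls>` / `<vendor - tool_calls>`; the tag-shell scan also normalizes fullwidth ASCII drift, for example `<tool_calls>` and the fullwidth `＞` terminator. More generally, the Go / Node tag scan is keyed on the fixed local tag names `tool_calls` / `invoke` / `parameter`, and any protocol prefix shell before the tag name is stripped at the parser entrypoint; control-character or non-ASCII separator drift such as `<DSML␂tool_calls>` or `<proto💥tool_calls>` is also normalized back to the existing XML tags and then goes through the same parser. The prompt, however, still asks the model to output the official DSML tags first and stresses that it must not emit only the closing wrapper while omitting the opening tag. Note that this is "compatible with the DSML shell while internal parsing still follows XML semantics", not a native end-to-end DSML implementation. The parser first intercepts suspected tool wrappers outside code blocks, and releases them as plain text when full parsing fails or the tool semantics are invalid.
Array parameters are expressed with `<item>...</item>` child nodes; when a parameter body contains only item children, the Go / Node parsers restore it to an array, so parameters that the schema requires to be arrays, such as `questions` / `options`, are not misparsed as `{ "item": ... }` objects. Beyond that, the parser also recovers some looser list forms, such as JSON array literals or comma-separated sequences of JSON items, as long as they are unambiguous enough; `<item>` remains the preferred form. If the model mistakenly wraps a complete structured XML fragment in CDATA, the compatibility layer tries to restore CDATA XML fragments in non-verbatim fields into object / array while protecting verbatim fields such as `content` / `command`. However, if the CDATA is just a single flat XML/HTML tag, such as the inline markup `<b>urgent</b>`, the compatibility layer keeps the original string and does not force it into an object / array; only CDATA fragments that clearly express structure, such as multiple sibling nodes, nested children, or `item` lists, trigger structured recovery. For long-text parameters such as `command` / `content`, Markdown-fenced DSML / XML examples inside CDATA are protected as verbatim text; a `]]></parameter>` or `</tool_calls>` inside the example does not truncate the outer tool call, and the parser keeps waiting for the real parameter / wrapper closing tag outside the fence.
On the Go side, reading DeepSeek SSE no longer depends on the fixed 2MiB single-line limit of `bufio.Scanner`; when a file-writing tool returns a very long `content` in a single `data:` line, non-streaming collection, streaming parsing, and auto-continue passthrough all keep the complete line before entering the same tool parsing and serialization flow.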
@@ -215,11 +215,11 @@ assistant reasoning becomes an explicit tagged block:
Assistant history `tool_calls` is not kept as OpenAI-native JSON; it is converted into a prompt-visible DSML shell:
```xml
<|DSML|tool_calls>
<|DSML|invoke name="read_file">
<|DSML|parameter name="path"><![CDATA[src/main.go]]></|DSML|parameter>
</|DSML|invoke>
</|DSML|tool_calls>
<｜DSML｜tool_calls>
<｜DSML｜invoke name="read_file">
<｜DSML｜parameter name="path"><![CDATA[src/main.go]]></｜DSML｜parameter>
</｜DSML｜invoke>
</｜DSML｜tool_calls>
```
The parsing layer is also compatible with the legacy plain XML form: `<tool_calls>` / `<invoke>` / `<parameter>`. Both are first normalized to the existing XML parsing semantics; all other legacy formats are kept as plain text and are not treated as executable call syntax.
@@ -424,7 +424,8 @@ Prior conversation history and tool progress.
If the change touches tool-call compatibility semantics, also check:
- `go test ./internal/toolcall/...`
- `node --test tests/node/stream-tool-sieve.test.js`
- `go test ./internal/toolstream/...`
- `./tests/scripts/run-unit-node.sh`
## 14. Documentation Sync Conventions


@@ -6,14 +6,14 @@
## 1) Currently Executable Formats
The current version recommends that the model output the DSML shell:
The current version recommends that the model output the fullwidth-separator DSML shell:
```xml
<|DSML|tool_calls>
<|DSML|invoke name="read_file">
<|DSML|parameter name="path"><![CDATA[README.MD]]></|DSML|parameter>
</|DSML|invoke>
</|DSML|tool_calls>
<｜DSML｜tool_calls>
<｜DSML｜invoke name="read_file">
<｜DSML｜parameter name="path"><![CDATA[README.MD]]></｜DSML｜parameter>
</｜DSML｜invoke>
</｜DSML｜tool_calls>
```
The compatibility layer still accepts legacy canonical XML:
@@ -30,10 +30,10 @@
Constraints:
- A `<|DSML|tool_calls>...</|DSML|tool_calls>` or `<tool_calls>...</tool_calls>` wrapper is required
- Each call must be inside `<|DSML|invoke name="...">...</|DSML|invoke>` or `<invoke name="...">...</invoke>`
- A `<｜DSML｜tool_calls>...</｜DSML｜tool_calls>` or `<tool_calls>...</tool_calls>` wrapper is required
- Each call must be inside `<｜DSML｜invoke name="...">...</｜DSML｜invoke>` or `<invoke name="...">...</invoke>`
- The tool name must go in the `name` attribute of `invoke`
- Parameters must use `<|DSML|parameter name="...">...</|DSML|parameter>` or `<parameter name="...">...</parameter>`
- Parameters must use `<｜DSML｜parameter name="...">...</｜DSML｜parameter>` or `<parameter name="...">...</parameter>`
- Do not mix DSML tags and legacy XML tool tags within one tool block; mixed blocks are treated as invalid tool blocks
Compatibility repairs:
@@ -54,7 +54,7 @@
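In the streaming path (consistent between Go and Node):
- The DSML `<|DSML|tool_calls>` wrapper, the hyphenated forms (`<dsml-tool-calls>` / `<dsml-invoke>` / `<dsml-parameter>`), DSML noise-tolerant shapes based on the fixed local tag names, trailing-pipe shapes (such as `<|DSML|tool_calls|`), and the canonical `<tool_calls>` wrapper all enter structured capture
- The DSML `<｜DSML｜tool_calls>` wrapper, the hyphenated forms (`<dsml-tool-calls>` / `<dsml-invoke>` / `<dsml-parameter>`), DSML noise-tolerant shapes based on the fixed local tag names, trailing-pipe shapes (such as `<|DSML|tool_calls|`), and the canonical `<tool_calls>` wrapper all enter structured capture
- If the stream starts directly with invoke but a closing wrapper follows later, the Go streaming sieve also attempts recovery through the missing-opening-wrapper repair path
- Tool calls that were recognized successfully are not fed back into plain text again
- Blocks that do not match the new format are not executed and continue to pass through as verbatim text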
@@ -80,6 +80,8 @@
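The parsing layer does not drop a tool call because a parameter value is empty. If the model emits explicit empty-string or whitespace-only parameters, they enter the structured `tool_calls` as empty strings; whether to reject missing parameters or empty commands is decided by the downstream tool executor / client schema validation. The prompt layer still asks the model not to emit empty parameters on its own.
A complete DSML / XML wrapper becomes a structured tool call only after a valid `invoke name` has been parsed and the parameter nodes (if any) conform to `parameter` semantics; genuine zero-parameter tool calls remain valid. If the wrapper is complete but its interior is not an executable tool-call shape (for example it uses `<param>`, lacks a valid `invoke name`, or is some other malformed XML tool shell), the streaming sieve releases the original wrapper as plain text; it neither swallows the content nor produces an empty tool call.
## 5) Rollout Recommendations
1. Demonstrate only the DSML shell syntax in the prompt.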
@@ -93,17 +95,18 @@
```bash
go test -v -run 'TestParseToolCalls|TestProcessToolSieve' ./internal/toolcall ./internal/toolstream ./internal/httpapi/openai/...
node --test tests/node/stream-tool-sieve.test.js
./tests/scripts/run-unit-node.sh
```
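Key coverage:
- The DSML `<|DSML|tool_calls>` wrapper parses normally
- The DSML `<｜DSML｜tool_calls>` wrapper parses normally
- The legacy canonical `<tool_calls>` wrapper parses normally
- DSML noise-tolerant shapes with the fixed local tag names (such as `<DSML|tool_calls>`, `<<|DSML|tool_calls>`, `<|DSML tool_calls>`, `<DSMLtool_calls>`, `<<DSML|DSML|tool_calls>`) parse normally
- Mixed tags (DSML wrapper + canonical inner) parse normally after normalization
- Examples inside tilde fences `~~~` are not executed
- Examples inside nested fences (4 backticks nesting 3 backticks) are not executed
- A text mention of the tag name immediately followed by a real tool call (including the same wrapper variant)
- Empty parameters are preserved structurally; malformed executable-looking XML wrappers are released as text
- Incompatible content passes through as plain text
- Code-block examples are not executed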


@@ -93,10 +93,10 @@ func TestNormalizeClaudeMessagesToolUseToAssistantToolCalls(t *testing.T) {
t.Fatalf("expected call id preserved, got %#v", call)
}
content, _ := m["content"].(string)
if !containsStr(content, "<|DSML|tool_calls>") || !containsStr(content, `<|DSML|invoke name="search_web">`) {
if !containsStr(content, "<｜DSML｜tool_calls>") || !containsStr(content, `<｜DSML｜invoke name="search_web">`) {
t.Fatalf("expected assistant content to include DSML tool call history, got %q", content)
}
if !containsStr(content, `<|DSML|parameter name="query"><![CDATA[latest]]></|DSML|parameter>`) {
if !containsStr(content, `<｜DSML｜parameter name="query"><![CDATA[latest]]></｜DSML｜parameter>`) {
t.Fatalf("expected assistant content to include serialized parameters, got %q", content)
}
}
@@ -133,7 +133,7 @@ func TestNormalizeClaudeMessagesPreservesThinkingOnToolUseHistory(t *testing.T)
if !containsStr(prompt, "[reasoning_content]\nneed live search before answering\n[/reasoning_content]") {
t.Fatalf("expected thinking in prompt history, got %q", prompt)
}
if !containsStr(prompt, `<|DSML|invoke name="search_web">`) {
if !containsStr(prompt, `<｜DSML｜invoke name="search_web">`) {
t.Fatalf("expected tool call in prompt history, got %q", prompt)
}
}


@@ -89,7 +89,7 @@ func TestGeminiMessagesFromRequestPreservesThoughtOnFunctionCallHistory(t *testi
if !strings.Contains(prompt, "[reasoning_content]\nneed current state before answering\n[/reasoning_content]") {
t.Fatalf("expected thought in prompt history, got %q", prompt)
}
if !strings.Contains(prompt, `<|DSML|invoke name="search_web">`) {
if !strings.Contains(prompt, `<｜DSML｜invoke name="search_web">`) {
t.Fatalf("expected tool call in prompt history, got %q", prompt)
}
}


@@ -84,7 +84,7 @@ func TestBuildOpenAICurrentInputContextTranscriptUsesNumberedHistorySections(t *
"latest user turn",
"[reasoning_content]",
"hidden reasoning",
"<|DSML|tool_calls>",
"<｜DSML｜tool_calls>",
} {
if !strings.Contains(transcript, want) {
t.Fatalf("expected transcript to contain %q, got %q", want, transcript)


@@ -16,6 +16,15 @@ var promptXMLTextEscaper = strings.NewReplacer(
var promptXMLNamePattern = regexp.MustCompile(`^[A-Za-z_][A-Za-z0-9_.:-]*$`)
const (
promptDSMLToolCallsOpen = "<｜DSML｜tool_calls>"
promptDSMLToolCallsClose = "</｜DSML｜tool_calls>"
promptDSMLInvokeOpen = "<｜DSML｜invoke"
promptDSMLInvokeClose = "</｜DSML｜invoke>"
promptDSMLParameterOpen = "<｜DSML｜parameter"
promptDSMLParameterClose = "</｜DSML｜parameter>"
)
// FormatToolCallsForPrompt renders a tool_calls slice into the prompt-visible
// invoke/parameter history block used across adapters.
func FormatToolCallsForPrompt(raw any) string {
@@ -38,7 +47,7 @@ func FormatToolCallsForPrompt(raw any) string {
if len(blocks) == 0 {
return ""
}
return "<|DSML|tool_calls>\n" + strings.Join(blocks, "\n") + "\n</|DSML|tool_calls>"
return promptDSMLToolCallsOpen + "\n" + strings.Join(blocks, "\n") + "\n" + promptDSMLToolCallsClose
}
// StringifyToolCallArguments normalizes tool arguments into a compact string
@@ -94,12 +103,12 @@ func formatToolCallForPrompt(call map[string]any) string {
parameters := formatToolCallParametersForPrompt(argsRaw)
if parameters == "" {
return ` <|DSML|invoke name="` + escapeXMLAttribute(name) + `"></|DSML|invoke>`
return ` ` + promptDSMLInvokeOpen + ` name="` + escapeXMLAttribute(name) + `">` + promptDSMLInvokeClose
}
return " <|DSML|invoke name=\"" + escapeXMLAttribute(name) + "\">\n" +
return " " + promptDSMLInvokeOpen + " name=\"" + escapeXMLAttribute(name) + "\">\n" +
parameters + "\n" +
" </|DSML|invoke>"
" " + promptDSMLInvokeClose
}
func formatToolCallParametersForPrompt(raw any) string {
@@ -113,7 +122,7 @@ func formatToolCallParametersForPrompt(raw any) string {
if strings.TrimSpace(fallback) == "" {
return ""
}
return " <|DSML|parameter name=\"content\">" + renderPromptXMLText(fallback) + "</|DSML|parameter>"
return " " + promptDSMLParameterOpen + " name=\"content\">" + renderPromptXMLText(fallback) + promptDSMLParameterClose
}
func renderPromptToolParameters(value any, indent string) (string, bool) {
@@ -149,9 +158,9 @@ func renderPromptToolParameters(value any, indent string) (string, bool) {
}
return strings.Join(lines, "\n"), true
case string:
return indent + `<|DSML|parameter name="content">` + renderPromptXMLText(v) + `</|DSML|parameter>`, true
return indent + promptDSMLParameterOpen + ` name="content">` + renderPromptXMLText(v) + promptDSMLParameterClose, true
default:
return indent + `<|DSML|parameter name="value">` + renderPromptXMLText(fmt.Sprint(v)) + `</|DSML|parameter>`, true
return indent + promptDSMLParameterOpen + ` name="value">` + renderPromptXMLText(fmt.Sprint(v)) + promptDSMLParameterClose, true
}
}
@@ -162,29 +171,29 @@ func renderPromptParameterNode(name string, value any, indent string) (string, b
}
switch v := value.(type) {
case nil:
return indent + `<|DSML|parameter name="` + escapeXMLAttribute(trimmedName) + `"></|DSML|parameter>`, true
return indent + promptDSMLParameterOpen + ` name="` + escapeXMLAttribute(trimmedName) + `">` + promptDSMLParameterClose, true
case map[string]any:
body, ok := renderPromptToolXMLBody(v, indent+" ")
if !ok {
return "", false
}
if strings.TrimSpace(body) == "" {
return indent + `<|DSML|parameter name="` + escapeXMLAttribute(trimmedName) + `"></|DSML|parameter>`, true
return indent + promptDSMLParameterOpen + ` name="` + escapeXMLAttribute(trimmedName) + `">` + promptDSMLParameterClose, true
}
return indent + `<|DSML|parameter name="` + escapeXMLAttribute(trimmedName) + "\">\n" + body + "\n" + indent + `</|DSML|parameter>`, true
return indent + promptDSMLParameterOpen + ` name="` + escapeXMLAttribute(trimmedName) + "\">\n" + body + "\n" + indent + promptDSMLParameterClose, true
case []any:
body, ok := renderPromptToolXMLArray(v, indent+" ")
if !ok {
return "", false
}
if strings.TrimSpace(body) == "" {
return indent + `<|DSML|parameter name="` + escapeXMLAttribute(trimmedName) + `"></|DSML|parameter>`, true
return indent + promptDSMLParameterOpen + ` name="` + escapeXMLAttribute(trimmedName) + `">` + promptDSMLParameterClose, true
}
return indent + `<|DSML|parameter name="` + escapeXMLAttribute(trimmedName) + "\">\n" + body + "\n" + indent + `</|DSML|parameter>`, true
return indent + promptDSMLParameterOpen + ` name="` + escapeXMLAttribute(trimmedName) + "\">\n" + body + "\n" + indent + promptDSMLParameterClose, true
case string:
return indent + `<|DSML|parameter name="` + escapeXMLAttribute(trimmedName) + `">` + renderPromptXMLText(v) + `</|DSML|parameter>`, true
return indent + promptDSMLParameterOpen + ` name="` + escapeXMLAttribute(trimmedName) + `">` + renderPromptXMLText(v) + promptDSMLParameterClose, true
default:
return indent + `<|DSML|parameter name="` + escapeXMLAttribute(trimmedName) + `">` + renderPromptXMLText(fmt.Sprint(v)) + `</|DSML|parameter>`, true
return indent + promptDSMLParameterOpen + ` name="` + escapeXMLAttribute(trimmedName) + `">` + renderPromptXMLText(fmt.Sprint(v)) + promptDSMLParameterClose, true
}
}


@@ -22,7 +22,7 @@ func TestFormatToolCallsForPromptDSML(t *testing.T) {
if got == "" {
t.Fatal("expected non-empty formatted tool calls")
}
if got != "<|DSML|tool_calls>\n <|DSML|invoke name=\"search_web\">\n <|DSML|parameter name=\"query\"><![CDATA[latest]]></|DSML|parameter>\n </|DSML|invoke>\n</|DSML|tool_calls>" {
if got != "<｜DSML｜tool_calls>\n <｜DSML｜invoke name=\"search_web\">\n <｜DSML｜parameter name=\"query\"><![CDATA[latest]]></｜DSML｜parameter>\n </｜DSML｜invoke>\n</｜DSML｜tool_calls>" {
t.Fatalf("unexpected formatted tool call DSML: %q", got)
}
}
@@ -34,7 +34,7 @@ func TestFormatToolCallsForPromptEscapesXMLEntities(t *testing.T) {
"arguments": `{"q":"a < b && c > d"}`,
},
})
want := "<|DSML|tool_calls>\n <|DSML|invoke name=\"search&lt;&amp;&gt;\">\n <|DSML|parameter name=\"q\"><![CDATA[a < b && c > d]]></|DSML|parameter>\n </|DSML|invoke>\n</|DSML|tool_calls>"
want := "<DSMLtool_calls>\n <DSMLinvoke name=\"search&lt;&amp;&gt;\">\n <DSMLparameter name=\"q\"><![CDATA[a < b && c > d]]></DSMLparameter>\n </DSMLinvoke>\n</DSMLtool_calls>"
if got != want {
t.Fatalf("unexpected escaped tool call XML: %q", got)
}
@@ -50,7 +50,7 @@ func TestFormatToolCallsForPromptUsesCDATAForMultilineContent(t *testing.T) {
},
},
})
want := "<|DSML|tool_calls>\n <|DSML|invoke name=\"write_file\">\n <|DSML|parameter name=\"content\"><![CDATA[#!/bin/bash\nprintf \"hello\"\n]]></|DSML|parameter>\n <|DSML|parameter name=\"path\"><![CDATA[script.sh]]></|DSML|parameter>\n </|DSML|invoke>\n</|DSML|tool_calls>"
want := "<DSMLtool_calls>\n <DSMLinvoke name=\"write_file\">\n <DSMLparameter name=\"content\"><![CDATA[#!/bin/bash\nprintf \"hello\"\n]]></DSMLparameter>\n <DSMLparameter name=\"path\"><![CDATA[script.sh]]></DSMLparameter>\n </DSMLinvoke>\n</DSMLtool_calls>"
if got != want {
t.Fatalf("unexpected multiline cdata tool call XML: %q", got)
}

@@ -38,10 +38,10 @@ func TestNormalizeOpenAIMessagesForPrompt_AssistantToolCallsAndToolResult(t *tes
t.Fatalf("expected 4 normalized messages with assistant tool history preserved, got %d", len(normalized))
}
assistantContent, _ := normalized[2]["content"].(string)
if !strings.Contains(assistantContent, "<|DSML|tool_calls>") {
if !strings.Contains(assistantContent, "<DSMLtool_calls>") {
t.Fatalf("assistant tool history should be preserved in DSML form, got %q", assistantContent)
}
if !strings.Contains(assistantContent, `<|DSML|invoke name="get_weather">`) {
if !strings.Contains(assistantContent, `<DSMLinvoke name="get_weather">`) {
t.Fatalf("expected tool name in preserved history, got %q", assistantContent)
}
if !strings.Contains(normalized[3]["content"].(string), `"temp":18`) {
@@ -49,7 +49,7 @@ func TestNormalizeOpenAIMessagesForPrompt_AssistantToolCallsAndToolResult(t *tes
}
prompt := util.MessagesPrepare(normalized)
if !strings.Contains(prompt, "<|DSML|tool_calls>") {
if !strings.Contains(prompt, "<DSMLtool_calls>") {
t.Fatalf("expected preserved assistant tool history in prompt: %q", prompt)
}
}
@@ -177,10 +177,10 @@ func TestNormalizeOpenAIMessagesForPrompt_AssistantMultipleToolCallsRemainSepara
t.Fatalf("expected assistant tool_call-only message preserved, got %#v", normalized)
}
content, _ := normalized[0]["content"].(string)
if strings.Count(content, "<|DSML|invoke name=") != 2 {
if strings.Count(content, "<DSMLinvoke name=") != 2 {
t.Fatalf("expected two preserved tool call blocks, got %q", content)
}
if !strings.Contains(content, `<|DSML|invoke name="search_web">`) || !strings.Contains(content, `<|DSML|invoke name="eval_javascript">`) {
if !strings.Contains(content, `<DSMLinvoke name="search_web">`) || !strings.Contains(content, `<DSMLinvoke name="eval_javascript">`) {
t.Fatalf("expected both tool names in preserved history, got %q", content)
}
}
@@ -258,7 +258,7 @@ func TestNormalizeOpenAIMessagesForPrompt_AssistantNilContentDoesNotInjectNullLi
if strings.Contains(content, "null") {
t.Fatalf("expected no null literal injection, got %q", content)
}
if !strings.Contains(content, "<|DSML|tool_calls>") {
if !strings.Contains(content, "<DSMLtool_calls>") {
t.Fatalf("expected assistant tool history in normalized content, got %q", content)
}
}

@@ -47,10 +47,10 @@ func TestBuildOpenAIFinalPrompt_HandlerPathIncludesToolRoundtripSemantics(t *tes
if !strings.Contains(finalPrompt, `"condition":"sunny"`) {
t.Fatalf("handler finalPrompt should preserve tool output content: %q", finalPrompt)
}
if !strings.Contains(finalPrompt, "<|DSML|tool_calls>") {
if !strings.Contains(finalPrompt, "<DSMLtool_calls>") {
t.Fatalf("handler finalPrompt should preserve assistant tool history: %q", finalPrompt)
}
if !strings.Contains(finalPrompt, `<|DSML|invoke name="get_weather">`) {
if !strings.Contains(finalPrompt, `<DSMLinvoke name="get_weather">`) {
t.Fatalf("handler finalPrompt should include tool name history: %q", finalPrompt)
}
}

@@ -88,7 +88,7 @@ func TestNormalizeResponsesInputArrayMergesReasoningMessageIntoFunctionCallHisto
if !strings.Contains(history, "[reasoning_content]\nneed fresh docs before answering\n[/reasoning_content]") {
t.Fatalf("expected reasoning in history transcript, got %q", history)
}
if !strings.Contains(history, `<|DSML|invoke name="search_web">`) {
if !strings.Contains(history, `<DSMLinvoke name="search_web">`) {
t.Fatalf("expected tool call in history transcript, got %q", history)
}
}