mirror of
https://github.com/CJackHwang/ds2api.git
synced 2026-05-10 03:07:41 +08:00
Merge pull request #469 from CJackHwang/dev
feat: tool-call markup parsing resilience — CJK, arbitrary prefixes, control separators, and retry hardening
This commit is contained in:
@@ -22,6 +22,13 @@ These rules apply to all agent-made changes in this repository.
|
||||
- Keep changes additive and tightly scoped to the requested feature or bugfix.
|
||||
- Do not mix unrelated refactors into feature PRs unless they are required to make the change pass gates.
|
||||
|
||||
## Protocol Adapter Boundary
|
||||
|
||||
- Do not let OpenAI Chat, OpenAI Responses, Claude, Gemini, or other interface protocol formatting own shared business behavior.
|
||||
- Normalize protocol-specific request shapes into the project standard request/turn model first, run shared business logic in one place, then render back to the target protocol at the boundary.
|
||||
- Business logic that must stay globally consistent includes empty-output retry, thinking/reasoning handling, tool-call detection and policy, usage accounting, current-input-file injection, history persistence, file/reference handling, and completion payload assembly.
|
||||
- If a behavior must differ by protocol, keep the difference as an explicit adapter/rendering concern and document why it cannot live in the shared normalized path.
|
||||
|
||||
## Documentation Sync
|
||||
|
||||
- When business logic or user-visible behavior changes, update the corresponding documentation in the same change.
|
||||
|
||||
29
API.en.md
29
API.en.md
@@ -32,7 +32,7 @@ Docs: [Overview](README.en.md) / [Architecture](docs/ARCHITECTURE.en.md) / [Depl
|
||||
| Base URL | `http://localhost:5001` or your deployment domain |
|
||||
| Default Content-Type | `application/json` |
|
||||
| Health probes | `GET /healthz`, `GET /readyz` |
|
||||
| CORS | Enabled (uniformly covers `/v1/*`, `/anthropic/*`, `/v1beta/models/*`, and `/admin/*`; echoes the browser `Origin` when present, otherwise `*`; default allow-list includes `Content-Type`, `Authorization`, `X-API-Key`, `X-Ds2-Target-Account`, `X-Ds2-Source`, `X-Vercel-Protection-Bypass`, `X-Goog-Api-Key`, `Anthropic-Version`, `Anthropic-Beta`, and also accepts third-party preflight-requested headers such as `x-stainless-*`; `/v1/chat/completions` on Vercel Node Runtime matches the same behavior; internal-only `X-Ds2-Internal-Token` remains blocked) |
|
||||
| CORS | Enabled (uniformly covers `/v1/*`, `/anthropic/*`, `/v1beta/models/*`, `/api/*`, and `/admin/*`; echoes the browser `Origin` when present, otherwise `*`; default allow-list includes `Content-Type`, `Authorization`, `X-API-Key`, `X-Ds2-Target-Account`, `X-Ds2-Source`, `X-Vercel-Protection-Bypass`, `X-Goog-Api-Key`, `Anthropic-Version`, `Anthropic-Beta`, and also accepts third-party preflight-requested headers such as `x-stainless-*`; `/v1/chat/completions` on Vercel Node Runtime matches the same behavior; internal-only `X-Ds2-Internal-Token` remains blocked) |
|
||||
|
||||
- All JSON request bodies must be valid UTF-8; malformed byte sequences are rejected on ingress with `400 invalid json`.
|
||||
|
||||
@@ -40,8 +40,10 @@ Docs: [Overview](README.en.md) / [Architecture](docs/ARCHITECTURE.en.md) / [Depl
|
||||
|
||||
- OpenAI / Claude / Gemini protocols are now mounted on one shared `chi` router tree assembled in `internal/server/router.go`.
|
||||
- Adapter responsibilities are streamlined to: **request normalization → DeepSeek invocation → protocol-shaped rendering**, reducing legacy split-logic paths.
|
||||
- Tool-calling semantics are aligned between Go and Node runtime: models should output the DSML shell `<|DSML|tool_calls>` → `<|DSML|invoke name="...">` → `<|DSML|parameter name="...">`; DS2API also accepts legacy canonical XML `<tool_calls>` → `<invoke name="...">` → `<parameter name="...">`. DSML is normalized back to XML at the parser entry, so internal parsing remains XML-based, with stream-time anti-leak filtering.
|
||||
- Tool-calling semantics are aligned between Go and Node runtime: models should output the fullwidth-separator DSML shell `<|DSML|tool_calls>` → `<|DSML|invoke name="...">` → `<|DSML|parameter name="...">`; DS2API also accepts the halfwidth DSML wrapper `<|DSML|tool_calls>`, DSML wrapper aliases such as `<dsml|tool_calls>`, `<|tool_calls>`, `<|tool_calls>`, common DSML separator drift such as `<|DSML tool_calls>`, collapsed DSML local names such as `<DSMLtool_calls>`, control-separator drift such as `<DSML␂tool_calls>` / raw STX `\x02`, CJK angle bracket and trailing attribute separator drift such as `<DSM|parameter name="command"|>...〈/DSM|parameter〉`, arbitrary protocol prefixes such as `<proto💥tool_calls>`, and legacy canonical XML `<tool_calls>` → `<invoke name="...">` → `<parameter name="...">`. The scanner normalizes fixed local names (`tool_calls` / `invoke` / `parameter`) back to XML before parsing; only wrapped tool blocks or the narrow missing-opening-wrapper repair path enter the tool path, while bare `<invoke>` does not count as supported syntax. JSON literal parameter bodies are preserved as structured values, explicit empty or whitespace-only parameters are preserved as empty strings, malformed complete wrappers are released as plain text, and loose CDATA is narrowly repaired at final parse/flush when it can preserve a complete outer tool call.
|
||||
- `Admin API` separates static config from runtime policy: `/admin/config*` for configuration state, `/admin/settings*` for runtime behavior.
|
||||
- When upstream returns a thinking-only response with no visible text, the Go main path for both streaming and non-streaming completions retries once in the same DeepSeek session: it appends the prompt suffix `"Previous reply had no visible output. Please regenerate the visible final answer or tool call now."` and sets `parent_message_id`. If that same-account retry would still end as `429 upstream_empty_output`, managed-account mode switches to the next available account, creates a fresh session, and retries the original payload once before returning 429.
|
||||
- Citation/reference marker boundary: streaming output hides upstream `[citation:N]` / `[reference:N]` placeholders by default; non-stream output converts DeepSeek search reference markers into Markdown links.
|
||||
|
||||
---
|
||||
|
||||
@@ -84,7 +86,7 @@ Two header formats accepted:
|
||||
- Token is in `config.keys` → **Managed account mode**: DS2API auto-selects an account via rotation
|
||||
- Token is not in `config.keys` → **Direct token mode**: treated as a DeepSeek token directly
|
||||
|
||||
**Optional header**: `X-Ds2-Target-Account: <email_or_mobile>` — Pin a specific managed account; if the target account does not exist or the managed-account queue is exhausted, the request returns `429`, and current responses do not include `Retry-After`. If the account exists but login/refresh fails, the request returns the underlying `401` or upstream error.
|
||||
**Optional header**: `X-Ds2-Target-Account: <email_or_mobile>` — Pin a specific managed account; if the target account does not exist or the managed-account queue is exhausted, the request returns `429`, and current responses do not include `Retry-After`. If the account exists but login/refresh fails, the request returns the underlying `401` or upstream error. Without a pinned target, managed-account completion requests try one alternate-account fresh retry before returning an empty-output 429; pinned-target requests and requests with no other available account do not switch.
|
||||
Gemini-compatible clients can also send `x-goog-api-key`, `?key=`, or `?api_key=` as the caller credential source.
|
||||
|
||||
### Admin Endpoints (`/admin/*`)
|
||||
@@ -226,16 +228,18 @@ For `chat` / `responses` / `embeddings`, DS2API follows a wide-input/strict-outp
|
||||
|
||||
1. Match DeepSeek native model IDs first.
|
||||
2. Then match exact keys in `model_aliases`.
|
||||
3. If still unmatched, fall back by known family heuristics (`o*`, `gpt-*`, `claude-*`, etc.).
|
||||
4. If still unmatched, return `invalid_request_error`.
|
||||
3. If the request name ends with `-nothinking`, resolve the base alias and append the corresponding no-thinking variant.
|
||||
4. If still unmatched, return `invalid_request_error`. Unknown model families are not guessed heuristically; add explicit compatibility names through `model_aliases`.
|
||||
|
||||
Built-in aliases come from `internal/config/models.go`; `config.model_aliases` can override or add mappings at runtime. Excerpt:
|
||||
|
||||
- OpenAI / Codex: `gpt-4o`, `gpt-4.1`, `gpt-5`, `gpt-5.5`, `gpt-5-codex`, `gpt-5.3-codex`, `codex-mini-latest`
|
||||
- OpenAI reasoning: `o1`, `o3`, `o3-deep-research`, `o4-mini`
|
||||
- Claude: `claude-opus-4-6`, `claude-sonnet-4-6`, `claude-haiku-4-5`, `claude-3-5-sonnet-latest`
|
||||
- Gemini: `gemini-2.5-pro`, `gemini-2.5-flash`, `gemini-pro-vision`
|
||||
- Other compatibility families: `llama-*`, `qwen-*`, `mistral-*`, and `command-*` fall back through family heuristics
|
||||
- Gemini: `gemini-2.5-pro`, `gemini-2.5-flash`, `gemini-3.1-pro`, `gemini-3-pro`, `gemini-3-flash`, `gemini-3.1-flash-lite`, `gemini-pro-vision`
|
||||
- Other exact built-in aliases: `llama-3.1-70b-instruct`, `qwen-max`
|
||||
|
||||
Aliases with a `-nothinking` suffix also map to the corresponding forced no-thinking DeepSeek model.
|
||||
|
||||
Current vision support resolves only to `deepseek-v4-vision` and does not expose a separate `vision-search` variant.
|
||||
|
||||
@@ -243,7 +247,7 @@ Retired historical families such as `claude-1.*`, `claude-2.*`, `claude-instant-
|
||||
|
||||
### `POST /v1/chat/completions`
|
||||
|
||||
> Path note: besides the canonical `/v1/chat/completions`, DS2API also accepts the root shortcut `/chat/completions`. On Vercel Runtime, `stream=true` on either path is handled by the Node streaming bridge, while non-stream stays on the Go primary path.
|
||||
> Path note: besides the canonical `/v1/chat/completions`, DS2API also accepts the root shortcut `/chat/completions`. On Vercel Runtime, `vercel.json` rewrites only the canonical `/v1/chat/completions` path to the Node streaming bridge; the root shortcut stays on the Go primary path. Use `/v1/chat/completions` on Vercel when real-time streaming is required.
|
||||
|
||||
**Headers**:
|
||||
|
||||
@@ -256,7 +260,7 @@ Content-Type: application/json
|
||||
|
||||
| Field | Type | Required | Notes |
|
||||
| --- | --- | --- | --- |
|
||||
| `model` | string | ✅ | DeepSeek native models + common aliases (`gpt-5.5`, `gpt-5.4-mini`, `gpt-5.3-codex`, `o3`, `claude-opus-4-6`, `gemini-2.5-pro`, `gemini-2.5-flash`, etc.) |
|
||||
| `model` | string | ✅ | DeepSeek native models + common aliases (`gpt-5.5`, `gpt-5.4-mini`, `gpt-5.3-codex`, `o3`, `claude-opus-4-6`, `gemini-2.5-pro`, `gemini-3.1-pro`, `gemini-3-flash`, etc.); `-nothinking` suffixes force thinking / reasoning off |
|
||||
| `messages` | array | ✅ | OpenAI-style messages |
|
||||
| `stream` | boolean | ❌ | Default `false` |
|
||||
| `tools` | array | ❌ | Function calling schema |
|
||||
@@ -351,7 +355,8 @@ When `tools` is present, DS2API performs anti-leak handling:
|
||||
|
||||
Additional notes:
|
||||
|
||||
- The parser treats DSML shell tool blocks (`<|DSML|tool_calls>` / `<|DSML|invoke name="...">` / `<|DSML|parameter name="...">`) and legacy canonical XML tool blocks (`<tool_calls>` / `<invoke name="...">` / `<parameter name="...">`) as executable tool calls. DSML is normalized back to XML at the parser entry; internal parsing remains XML-based. Legacy `<tools>`, `<tool_call>`, `<tool_name>`, `<param>`, `<function_call>`, `tool_use`, antml variants, and standalone JSON `tool_calls` payloads are treated as plain text.
|
||||
- The parser treats the recommended DSML shell tool blocks (`<|DSML|tool_calls>` / `<|DSML|invoke name="...">` / `<|DSML|parameter name="...">`), halfwidth DSML shell blocks (`<|DSML|tool_calls>` / `<|DSML|invoke name="...">` / `<|DSML|parameter name="...">`), DSML wrapper aliases (`<dsml|tool_calls>`, `<|tool_calls>`, `<|tool_calls>`), common DSML separator drift (`<|DSML tool_calls>` / `<|DSML invoke>` / `<|DSML parameter>`), collapsed DSML local names (`<DSMLtool_calls>` / `<DSMLinvoke>` / `<DSMLparameter>`), control-separator drift (`<DSML␂tool_calls>` / raw STX `\x02`), CJK angle bracket and trailing attribute separator drift (`<DSM|parameter name="command"|>...〈/DSM|parameter〉`), arbitrary protocol prefixes (`<proto💥tool_calls>`), and legacy canonical XML tool blocks (`<tool_calls>` / `<invoke name="...">` / `<parameter name="...">`) as executable tool calls. These shells normalize back to XML first, while internal parsing remains XML-based. Legacy `<tools>`, `<tool_call>`, `<tool_name>`, `<param>`, `<function_call>`, `tool_use`, antml variants, and standalone JSON `tool_calls` payloads are treated as plain text; complete but malformed wrappers are also released as plain text.
|
||||
- The parser no longer drops tool calls solely because parameter values are empty; explicit empty strings or whitespace-only parameters become empty strings in structured `tool_calls`. Prompting still tells the model not to emit blank parameters, and missing/empty argument rejection belongs in the tool executor or client schema validation.
|
||||
- If the final visible response text is empty but the reasoning stream contains an executable tool call, Chat / Responses emits a standard OpenAI `tool_calls` / `function_call` output during finalization. If thinking/reasoning was not enabled by the client, that reasoning text is used only for detection and is not exposed as visible text or `reasoning_content`.
|
||||
- `tool_calls` shown inside fenced markdown code blocks (for example, ```json ... ```) are treated as examples, not executable calls.
|
||||
|
||||
@@ -764,6 +769,7 @@ Reads runtime settings and status, including:
|
||||
- `responses` / `embeddings`
|
||||
- `auto_delete` (`mode`: `none` / `single` / `all`; legacy `sessions=true` is still treated as `all`)
|
||||
- `current_input_file` (`enabled` defaults to `true`, plus `min_chars`)
|
||||
- `thinking_injection` (`enabled` defaults to `true`, `prompt`, and `default_prompt`)
|
||||
- `model_aliases`
|
||||
- `env_backed`, `needs_vercel_sync`
|
||||
- `toolcall` policy is fixed to `feature_match + high` and is no longer returned or editable via settings
|
||||
@@ -778,6 +784,7 @@ Hot-updates runtime settings. Supported fields:
|
||||
- `embeddings.provider`
|
||||
- `auto_delete.mode`
|
||||
- `current_input_file.enabled` / `current_input_file.min_chars`
|
||||
- `thinking_injection.enabled` / `thinking_injection.prompt`
|
||||
- `model_aliases`
|
||||
- `toolcall` policy is fixed and is no longer writable through settings
|
||||
|
||||
@@ -1258,7 +1265,7 @@ Clients should handle HTTP status code plus `error` / `detail` fields.
|
||||
| Code | Meaning |
|
||||
| --- | --- |
|
||||
| `401` | Authentication failed (invalid key/token, or expired admin JWT) |
|
||||
| `429` | Too many requests (exceeded inflight + queue capacity; current responses do not include `Retry-After`) |
|
||||
| `429` | Too many requests (exceeded inflight + queue capacity, or upstream thinking-only output with no visible answer; managed-account mode first tries one alternate-account fresh retry; current responses do not include `Retry-After`) |
|
||||
| `503` | Model unavailable or upstream error |
|
||||
|
||||
---
|
||||
|
||||
30
API.md
30
API.md
@@ -32,7 +32,7 @@
|
||||
| Base URL | `http://localhost:5001` 或你的部署域名 |
|
||||
| 默认 Content-Type | `application/json` |
|
||||
| 健康检查 | `GET /healthz`、`GET /readyz` |
|
||||
| CORS | 已启用(统一覆盖 `/v1/*`、`/anthropic/*`、`/v1beta/models/*`、`/admin/*`;浏览器有 `Origin` 时回显该 Origin,否则为 `*`;默认允许 `Content-Type`, `Authorization`, `X-API-Key`, `X-Ds2-Target-Account`, `X-Ds2-Source`, `X-Vercel-Protection-Bypass`, `X-Goog-Api-Key`, `Anthropic-Version`, `Anthropic-Beta`,并会放行预检里声明的第三方请求头,如 `x-stainless-*`;Vercel 上 `/v1/chat/completions` 的 Node Runtime 也对齐相同行为;内部专用头 `X-Ds2-Internal-Token` 仍被拦截) |
|
||||
| CORS | 已启用(统一覆盖 `/v1/*`、`/anthropic/*`、`/v1beta/models/*`、`/api/*`、`/admin/*`;浏览器有 `Origin` 时回显该 Origin,否则为 `*`;默认允许 `Content-Type`, `Authorization`, `X-API-Key`, `X-Ds2-Target-Account`, `X-Ds2-Source`, `X-Vercel-Protection-Bypass`, `X-Goog-Api-Key`, `Anthropic-Version`, `Anthropic-Beta`,并会放行预检里声明的第三方请求头,如 `x-stainless-*`;Vercel 上 `/v1/chat/completions` 的 Node Runtime 也对齐相同行为;内部专用头 `X-Ds2-Internal-Token` 仍被拦截) |
|
||||
|
||||
- 所有 JSON 请求体都必须是合法 UTF-8;非法字节序列会在入站阶段被拒绝为 `400 invalid json`。
|
||||
|
||||
@@ -40,9 +40,9 @@
|
||||
|
||||
- OpenAI / Claude / Gemini 三套协议已统一挂在同一 `chi` 路由树上,由 `internal/server/router.go` 负责装配。
|
||||
- 适配器层职责收敛为:**请求归一化 → DeepSeek 调用 → 协议形态渲染**,减少历史版本中“同能力多处实现”的分叉。
|
||||
- Tool Calling 的解析策略在 Go 与 Node Runtime 间保持一致:推荐模型输出 DSML 外壳 `<|DSML|tool_calls>` → `<|DSML|invoke name="...">` → `<|DSML|parameter name="...">`;兼容层也接受 DSML wrapper 别名 `<dsml|tool_calls>`、`<|tool_calls>`、`<|tool_calls>`、常见 DSML 分隔符漏写形态(如 `<|DSML tool_calls>`)、`DSML` 与工具标签名黏连的常见 typo(如 `<DSMLtool_calls>`),以及旧式 canonical XML `<tool_calls>` → `<invoke name="...">` → `<parameter name="...">`。实现上采用窄容错结构扫描:只有 `tool_calls` wrapper 或可修复的缺失 opening wrapper 会进入工具路径,裸 `<invoke>` 不计为已支持语法;流式场景继续执行防泄漏筛分。若参数体本身是合法 JSON 字面量(如 `123`、`true`、`null`、数组或对象),会按结构化值输出,不再一律当作字符串;若 CDATA 偶发漏闭合,则会在最终 parse / flush 恢复阶段做窄修复,尽量保住已完整包裹的外层工具调用。
|
||||
- Tool Calling 的解析策略在 Go 与 Node Runtime 间保持一致:推荐模型输出全角分隔符 DSML 外壳 `<|DSML|tool_calls>` → `<|DSML|invoke name="...">` → `<|DSML|parameter name="...">`;兼容层也接受半角 DSML wrapper `<|DSML|tool_calls>`、DSML wrapper 别名 `<dsml|tool_calls>`、`<|tool_calls>`、`<|tool_calls>`、常见 DSML 分隔符漏写形态(如 `<|DSML tool_calls>`)、`DSML` 与工具标签名黏连的常见 typo(如 `<DSMLtool_calls>`)、控制分隔符漂移(如 `<DSML␂tool_calls>` / 原始 STX `\x02`)、CJK 尖括号与属性尾部分隔符漂移(如 `<DSM|parameter name="command"|>...〈/DSM|parameter〉`)、任意协议前缀壳(如 `<proto💥tool_calls>`),以及旧式 canonical XML `<tool_calls>` → `<invoke name="...">` → `<parameter name="...">`。实现上采用结构扫描:只要固定本地标签名是 `tool_calls` / `invoke` / `parameter`,前缀壳会在解析入口归一化;只有 `tool_calls` wrapper 或可修复的缺失 opening wrapper 会进入工具路径,裸 `<invoke>` 不计为已支持语法;流式场景继续执行防泄漏筛分。若参数体本身是合法 JSON 字面量(如 `123`、`true`、`null`、数组或对象),会按结构化值输出,不再一律当作字符串;显式空字符串和纯空白参数会结构化保留为空字符串,是否拒绝缺参由工具执行侧决定;完整但 malformed 的 wrapper 会作为普通文本释放,不会吞掉或伪造成工具调用;若 CDATA 偶发漏闭合,则会在最终 parse / flush 恢复阶段做窄修复,尽量保住已完整包裹的外层工具调用。
|
||||
- `Admin API` 将配置与运行时策略分开:`/admin/config*` 管静态配置,`/admin/settings*` 管运行时行为。
|
||||
- 当上游返回 thinking-only 响应(模型输出了推理链但无可见文本)时,非流式补全会自动重试一次:以多轮对话 follow-up 方式追加 prompt 后缀 `"Previous reply had no visible output. Please regenerate the visible final answer or tool call now."` 并设置 `parent_message_id` 在同一 DeepSeek session 内让模型重新输出;重试最大 1 次。
|
||||
- 当上游返回 thinking-only 响应(模型输出了推理链但无可见文本)时,Go 主路径的流式与非流式补全都会先自动重试一次:以多轮对话 follow-up 方式追加 prompt 后缀 `"Previous reply had no visible output. Please regenerate the visible final answer or tool call now."` 并设置 `parent_message_id` 在同一 DeepSeek session 内让模型重新输出;同账号重试最大 1 次。若同账号重试后仍即将返回 `429 upstream_empty_output`,托管账号模式会在返回 429 前自动切换到下一个可用账号,新建 session,用原始 payload 再 fresh retry 一次。
|
||||
- 引用标记处理边界:流式输出默认隐藏 `[citation:N]` / `[reference:N]` 这类上游内部占位符;非流式输出默认把 DeepSeek 搜索引用标记转换为 Markdown 引用链接。
|
||||
|
||||
---
|
||||
@@ -86,7 +86,7 @@ Vercel 一键部署可先只填 `DS2API_ADMIN_KEY`,部署后在 `/admin` 导
|
||||
- token 在 `config.keys` 中 → **托管账号模式**,自动轮询选择账号
|
||||
- token 不在 `config.keys` 中 → **直通 token 模式**,直接作为 DeepSeek token 使用
|
||||
|
||||
**可选请求头**:`X-Ds2-Target-Account: <email_or_mobile>` — 指定使用某个托管账号;如果目标账号不存在,或管理账号队列已耗尽,相关业务请求会返回 `429`,当前不会附带 `Retry-After` 头。若账号存在但登录/刷新失败,则返回对应的 `401` 或上游错误。
|
||||
**可选请求头**:`X-Ds2-Target-Account: <email_or_mobile>` — 指定使用某个托管账号;如果目标账号不存在,或管理账号队列已耗尽,相关业务请求会返回 `429`,当前不会附带 `Retry-After` 头。若账号存在但登录/刷新失败,则返回对应的 `401` 或上游错误。未指定目标账号时,托管账号模式的 completion 空输出 429 会先尝试切到另一个可用账号 fresh retry 一次;指定目标账号或无其他可用账号时不会切号。
|
||||
Gemini 兼容客户端还可以使用 `x-goog-api-key`、`?key=` 或 `?api_key=` 作为凭据来源。
|
||||
|
||||
### Admin 接口(`/admin/*`)
|
||||
@@ -172,12 +172,12 @@ Gemini 兼容客户端还可以使用 `x-goog-api-key`、`?key=` 或 `?api_key=`
|
||||
| GET | `/admin/chat-history/{id}` | Admin | 查看单条服务器端对话记录 |
|
||||
| DELETE | `/admin/chat-history/{id}` | Admin | 删除单条服务器端对话记录 |
|
||||
| PUT | `/admin/chat-history/settings` | Admin | 更新对话记录保留条数 |
|
||||
|
||||
服务器端记录本质上是 DeepSeek 上游响应归档:OpenAI Chat、OpenAI Responses、Claude Messages、Gemini GenerateContent 等直连 DeepSeek 的生成接口,在收到上游响应后会于各协议回译/裁剪前写入记录;列表按请求创建时间倒序展示,流式请求会在生成过程中持续刷新状态与详情。WebUI「API 测试」发出的请求也会进入该记录。
|
||||
| GET | `/admin/version` | Admin | 查询当前版本与最新 Release |
|
||||
|
||||
OpenAI `/v1/*` 仍是规范路径。对于只配置 DS2API 根地址的客户端,同一套 OpenAI handler 也通过根路径快捷路由暴露:`/models`、`/models/{id}`、`/chat/completions`、`/responses`、`/responses/{response_id}`、`/embeddings`、`/files`、`/files/{file_id}`。
|
||||
|
||||
服务器端记录本质上是 DeepSeek 上游响应归档:OpenAI Chat、OpenAI Responses、Claude Messages、Gemini GenerateContent 等直连 DeepSeek 的生成接口,在收到上游响应后会于各协议回译/裁剪前写入记录;列表按请求创建时间倒序展示,流式请求会在生成过程中持续刷新状态与详情。WebUI「API 测试」发出的请求也会进入该记录。
|
||||
|
||||
---
|
||||
|
||||
## 健康检查
|
||||
@@ -231,16 +231,15 @@ OpenAI `/v1/*` 仍是规范路径。对于只配置 DS2API 根地址的客户端
|
||||
1. 先匹配 DeepSeek 原生模型。
|
||||
2. 再匹配 `model_aliases` 精确映射。
|
||||
3. 如果请求名以 `-nothinking` 结尾,则在最终解析出的规范模型上追加对应的无思考变体。
|
||||
4. 未命中时按模型家族规则回退(如 `o*`、`gpt-*`、`claude-*`)。
|
||||
5. 仍未命中则返回 `invalid_request_error`。
|
||||
4. 仍未命中则返回 `invalid_request_error`。当前不会按未知模型家族做启发式兜底;需要新增兼容名时请通过 `model_aliases` 明确配置。
|
||||
|
||||
当前内置默认 alias 来自 `internal/config/models.go`,`config.model_aliases` 会在运行时覆盖或补充同名映射。节选:
|
||||
|
||||
- OpenAI / Codex:`gpt-4o`、`gpt-4.1`、`gpt-5`、`gpt-5.5`、`gpt-5-codex`、`gpt-5.3-codex`、`codex-mini-latest`
|
||||
- OpenAI reasoning:`o1`、`o3`、`o3-deep-research`、`o4-mini`
|
||||
- Claude:`claude-opus-4-6`、`claude-sonnet-4-6`、`claude-haiku-4-5`、`claude-3-5-sonnet-latest`
|
||||
- Gemini:`gemini-2.5-pro`、`gemini-2.5-flash`、`gemini-pro-vision`
|
||||
- 其他兼容族:`llama-*`、`qwen-*`、`mistral-*`、`command-*` 会按家族启发式回退
|
||||
- Gemini:`gemini-2.5-pro`、`gemini-2.5-flash`、`gemini-3.1-pro`、`gemini-3-pro`、`gemini-3-flash`、`gemini-3.1-flash-lite`、`gemini-pro-vision`
|
||||
- 其他内置精确 alias:`llama-3.1-70b-instruct`、`qwen-max`
|
||||
|
||||
上述 alias 若在请求名后追加 `-nothinking` 后缀,也会映射到对应的强制关闭 thinking 版本。
|
||||
当前视觉能力仅对应 `deepseek-v4-vision` / `deepseek-v4-vision-nothinking`,不会解析出独立的 `vision-search` 变体。
|
||||
@@ -249,7 +248,7 @@ OpenAI `/v1/*` 仍是规范路径。对于只配置 DS2API 根地址的客户端
|
||||
|
||||
### `POST /v1/chat/completions`
|
||||
|
||||
> 路径说明:除规范路径 `/v1/chat/completions` 外,也支持根路径快捷别名 `/chat/completions`;在 Vercel Runtime 上,这两个路径的 `stream=true` 请求都会进入 Node 流式桥接逻辑,非流式仍走 Go 主链路。
|
||||
> 路径说明:除规范路径 `/v1/chat/completions` 外,也支持根路径快捷别名 `/chat/completions`。在 Vercel Runtime 上,`vercel.json` 仅把规范路径 `/v1/chat/completions` 重写到 Node 流式桥接;根路径快捷别名仍走 Go 主链路。因此 Vercel 上需要实时流式时请使用 `/v1/chat/completions`。
|
||||
|
||||
**请求头**:
|
||||
|
||||
@@ -262,7 +261,7 @@ Content-Type: application/json
|
||||
|
||||
| 字段 | 类型 | 必填 | 说明 |
|
||||
| --- | --- | --- | --- |
|
||||
| `model` | string | ✅ | 支持 DeepSeek 原生模型 + 常见 alias(如 `gpt-5.5`、`gpt-5.4-mini`、`gpt-5.3-codex`、`o3`、`claude-opus-4-6`、`claude-sonnet-4-6`、`gemini-2.5-pro`、`gemini-2.5-flash` 等);若模型名带 `-nothinking` 后缀,则强制关闭 thinking / reasoning |
|
||||
| `model` | string | ✅ | 支持 DeepSeek 原生模型 + 常见 alias(如 `gpt-5.5`、`gpt-5.4-mini`、`gpt-5.3-codex`、`o3`、`claude-opus-4-6`、`claude-sonnet-4-6`、`gemini-2.5-pro`、`gemini-3.1-pro`、`gemini-3-flash` 等);若模型名带 `-nothinking` 后缀,则强制关闭 thinking / reasoning |
|
||||
| `messages` | array | ✅ | OpenAI 风格消息数组 |
|
||||
| `stream` | boolean | ❌ | 默认 `false` |
|
||||
| `tools` | array | ❌ | Function Calling 定义 |
|
||||
@@ -358,7 +357,8 @@ data: [DONE]
|
||||
补充说明:
|
||||
|
||||
- **非代码块上下文**下,工具负载即使与普通文本混合,也会按特征识别并产出可执行 tool call(前后普通文本仍可透传)。
|
||||
- 解析器当前把 DSML 外壳(`<|DSML|tool_calls>` / `<|DSML|invoke name="...">` / `<|DSML|parameter name="...">`)、DSML wrapper 别名(`<dsml|tool_calls>`、`<|tool_calls>`、`<|tool_calls>`)、常见 DSML 分隔符漏写形态(如 `<|DSML tool_calls>` / `<|DSML invoke>` / `<|DSML parameter>`)、`DSML` 与工具标签名黏连的常见 typo(如 `<DSMLtool_calls>` / `<DSMLinvoke>` / `<DSMLparameter>`)和旧式 canonical XML 工具块(`<tool_calls>` / `<invoke name="...">` / `<parameter name="...">`)作为可执行调用解析;DSML 会先归一化回 XML,内部仍以 XML 解析语义为准。旧式 `<tools>`、`<tool_call>`、`<tool_name>`、`<param>`、`<function_call>`、`tool_use`、antml 风格与纯 JSON `tool_calls` 片段默认都会按普通文本处理。
|
||||
- 解析器当前把推荐 DSML 外壳(`<|DSML|tool_calls>` / `<|DSML|invoke name="...">` / `<|DSML|parameter name="...">`)、半角 DSML 外壳(`<|DSML|tool_calls>` / `<|DSML|invoke name="...">` / `<|DSML|parameter name="...">`)、DSML wrapper 别名(`<dsml|tool_calls>`、`<|tool_calls>`、`<|tool_calls>`)、常见 DSML 分隔符漏写形态(如 `<|DSML tool_calls>` / `<|DSML invoke>` / `<|DSML parameter>`)、`DSML` 与工具标签名黏连的常见 typo(如 `<DSMLtool_calls>` / `<DSMLinvoke>` / `<DSMLparameter>`)、控制分隔符漂移(如 `<DSML␂tool_calls>` / 原始 STX `\x02`)、CJK 尖括号与属性尾部分隔符漂移(如 `<DSM|parameter name="command"|>...〈/DSM|parameter〉`)、任意协议前缀壳(如 `<proto💥tool_calls>`)和旧式 canonical XML 工具块(`<tool_calls>` / `<invoke name="...">` / `<parameter name="...">`)作为可执行调用解析;这些前缀壳会先归一化回 XML,内部仍以 XML 解析语义为准。旧式 `<tools>`、`<tool_call>`、`<tool_name>`、`<param>`、`<function_call>`、`tool_use`、antml 风格与纯 JSON `tool_calls` 片段默认都会按普通文本处理;完整但 malformed 的 wrapper 同样会作为普通文本释放。
|
||||
- 解析层不会因为参数值为空而丢弃工具调用;显式空字符串或纯空白参数会按空字符串进入结构化 `tool_calls`。Prompt 会要求模型不要主动输出空参数,缺参/空命令的拒绝应由工具执行侧或客户端 schema 校验负责。
|
||||
- 当最终可见正文为空但思维链里包含可执行工具调用时,Chat / Responses 会在收尾阶段补发标准 OpenAI `tool_calls` / `function_call` 输出;如果客户端未开启 thinking / reasoning,该思维链只用于检测,不会作为可见正文或 `reasoning_content` 暴露。
|
||||
- Markdown fenced code block(例如 ```json ... ```)中的 `tool_calls` 仅视为示例文本,不会被执行。
|
||||
|
||||
@@ -775,6 +775,7 @@ data: {"type":"message_stop"}
|
||||
- `responses` / `embeddings`
|
||||
- `auto_delete`(`mode`:`none` / `single` / `all`;旧配置 `sessions=true` 仍按 `all` 处理)
|
||||
- `current_input_file`(`enabled` 默认返回 `true`、`min_chars`)
|
||||
- `thinking_injection`(`enabled` 默认返回 `true`、`prompt`、`default_prompt`)
|
||||
- `model_aliases`
|
||||
- `env_backed`、`needs_vercel_sync`
|
||||
- `toolcall` 策略已固定为 `feature_match + high`,不再通过 settings 返回或修改
|
||||
@@ -789,6 +790,7 @@ data: {"type":"message_stop"}
|
||||
- `embeddings.provider`
|
||||
- `auto_delete.mode`
|
||||
- `current_input_file.enabled` / `current_input_file.min_chars`
|
||||
- `thinking_injection.enabled` / `thinking_injection.prompt`
|
||||
- `model_aliases`
|
||||
- `toolcall` 策略已固定,不再作为可写入字段
|
||||
|
||||
@@ -1271,7 +1273,7 @@ Gemini 路由使用 Google 风格错误结构:
|
||||
| 状态码 | 说明 |
|
||||
| --- | --- |
|
||||
| `401` | 鉴权失败(key/token 无效,或 Admin JWT 过期) |
|
||||
| `429` | 请求过多(超出并发上限 + 等待队列;当前不附带 `Retry-After` 头) |
|
||||
| `429` | 请求过多(超出并发上限 + 等待队列,或上游账号 thinking-only 后仍无可见输出;托管账号模式会先尝试一次切号 fresh retry;当前不附带 `Retry-After` 头) |
|
||||
| `503` | 模型不可用或上游服务异常 |
|
||||
|
||||
---
|
||||
|
||||
18
README.MD
18
README.MD
@@ -134,7 +134,8 @@ flowchart LR
|
||||
| OpenAI 兼容 | `GET /v1/models`、`GET /v1/models/{id}`、`POST /v1/chat/completions`、`POST /v1/responses`、`GET /v1/responses/{response_id}`、`POST /v1/embeddings`、`POST /v1/files`、`GET /v1/files/{file_id}` |
|
||||
| Claude 兼容 | `GET /anthropic/v1/models`、`POST /anthropic/v1/messages`、`POST /anthropic/v1/messages/count_tokens`(及快捷路径 `/v1/messages`、`/messages`) |
|
||||
| Gemini 兼容 | `POST /v1beta/models/{model}:generateContent`、`POST /v1beta/models/{model}:streamGenerateContent`(及 `/v1/models/{model}:*` 路径) |
|
||||
| 统一 CORS 兼容 | `/v1/*`、`/anthropic/*`、`/v1beta/models/*`、`/admin/*` 统一走同一套 CORS 策略;Vercel 上 `/v1/chat/completions` 的 Node Runtime 也对齐相同放行规则,尽量减少第三方预检请求头限制 |
|
||||
| Ollama 兼容 | `GET /api/version`、`GET /api/tags`、`POST /api/show` |
|
||||
| 统一 CORS 兼容 | `/v1/*`、`/anthropic/*`、`/v1beta/models/*`、`/api/*`、`/admin/*` 统一走同一套 CORS 策略;Vercel 上 `/v1/chat/completions` 的 Node Runtime 也对齐相同放行规则,尽量减少第三方预检请求头限制 |
|
||||
| 多账号轮询 | 自动 token 刷新、邮箱/手机号双登录方式 |
|
||||
| 并发队列控制 | 每账号 in-flight 上限 + 等待队列,动态计算建议并发值 |
|
||||
| DeepSeek PoW | 纯 Go 高性能实现(DeepSeekHashV1),毫秒级响应 |
|
||||
@@ -195,11 +196,11 @@ OpenAI `/v1/*` 仍是推荐的规范路径;同时支持 `/models`、`/chat/com
|
||||
- `ANTHROPIC_BASE_URL` 推荐直接指向 DS2API 根地址(例如 `http://127.0.0.1:5001`),Claude Code 会请求 `/v1/messages?beta=true`。
|
||||
- `ANTHROPIC_API_KEY` 需要与 `config.json` 中 `keys` 一致;建议同时保留常规 key 与 `sk-ant-*` 形态 key,兼容不同客户端校验习惯。
|
||||
- 若系统设置了代理,建议对 DS2API 地址配置 `NO_PROXY=127.0.0.1,localhost,<你的主机IP>`,避免本地回环请求被代理拦截。
|
||||
- 如遇“工具调用输出成文本、未执行”问题,请优先检查模型输出是否为推荐的 DSML 工具块:`<|DSML|tool_calls><|DSML|invoke name="..."><|DSML|parameter name="...">...`。兼容层也接受旧式 canonical XML:`<tool_calls><invoke name="..."><parameter name="...">...`;旧式 `<tools>` / `<tool_call>` / `<tool_name>` / `<param>`、`<function_call>`、`tool_use` 或纯 JSON `tool_calls` 片段不会执行。
|
||||
- 如遇“工具调用输出成文本、未执行”问题,请优先检查模型输出是否为推荐的全角分隔符 DSML 工具块:`<|DSML|tool_calls><|DSML|invoke name="..."><|DSML|parameter name="...">...`。兼容层也接受半角 DSML 与旧式 canonical XML:`<tool_calls><invoke name="..."><parameter name="...">...`;旧式 `<tools>` / `<tool_call>` / `<tool_name>` / `<param>`、`<function_call>`、`tool_use` 或纯 JSON `tool_calls` 片段不会执行,会作为普通文本处理。
|
||||
|
||||
### Gemini 接口
|
||||
|
||||
Gemini 适配器将模型名通过 `model_aliases` 或内置规则映射到 DeepSeek 原生模型,支持 `generateContent` 和 `streamGenerateContent` 两种调用方式,并完整支持 Tool Calling(`functionDeclarations` → `functionCall` 输出)。若 Gemini 模型名带 `-nothinking` 后缀,例如 `gemini-2.5-pro-nothinking`,会映射到对应的强制关闭思考模型。
|
||||
Gemini 适配器将模型名通过 `model_aliases` 或内置精确 alias 映射到 DeepSeek 原生模型(覆盖 `gemini-2.5-*`、`gemini-3*`、`gemini-pro-vision` 等常见名称),支持 `generateContent` 和 `streamGenerateContent` 两种调用方式,并完整支持 Tool Calling(`functionDeclarations` → `functionCall` 输出)。若 Gemini 模型名带 `-nothinking` 后缀,例如 `gemini-2.5-pro-nothinking`,会映射到对应的强制关闭思考模型。
|
||||
|
||||
## 快速开始
|
||||
|
||||
@@ -295,13 +296,13 @@ cp config.example.json config.json
|
||||
base64 < config.json | tr -d '\n'
|
||||
```
|
||||
|
||||
> **流式说明**:OpenAI Chat 流式在 Vercel 上会由 `api/chat-stream.js`(Node Runtime)承接,支持规范路径 `/v1/chat/completions` 与根路径快捷别名 `/chat/completions`。鉴权、账号选择、会话/PoW 准备仍由 Go 内部 prepare 接口完成;流式响应(含 `tools`)在 Node 侧执行与 Go 对齐的输出组装与防泄漏处理。虽然这里只有 OpenAI chat 流式走 Node,但 CORS 放行策略仍与 Go 主路由保持一致,统一覆盖第三方客户端预检场景。
|
||||
> **流式说明**:OpenAI Chat 流式在 Vercel 上会由 `api/chat-stream.js`(Node Runtime)承接,但 `vercel.json` 只把规范路径 `/v1/chat/completions` 重写到 Node;根路径快捷别名 `/chat/completions` 仍走 Go 主链路。鉴权、账号选择、会话/PoW 准备仍由 Go 内部 prepare 接口完成;流式响应(含 `tools`)在 Node 侧执行与 Go 对齐的输出组装与防泄漏处理。Vercel 上需要实时流式时请使用 `/v1/chat/completions`。
|
||||
|
||||
详细部署说明请参阅 [部署指南](docs/DEPLOY.md)。
|
||||
|
||||
### 方式四:本地源码运行
|
||||
|
||||
**前置要求**:Go 1.26+,Node.js `20.19+` 或 `22.12+`(仅在需要构建 WebUI 时);同时确保 `npm` 可用,建议 `npm 10+`
|
||||
**前置要求**:Go 1.26+,Node.js `20.19+` 或 `22.12+`(仅在需要构建 WebUI 时;CI / Docker 构建使用 Node 24);同时确保 `npm` 可用,建议 `npm 10+`
|
||||
|
||||
```bash
|
||||
# 1. 克隆仓库
|
||||
@@ -320,7 +321,7 @@ go run ./cmd/ds2api
|
||||
|
||||
服务实际绑定:`0.0.0.0:5001`,因此同一局域网设备通常也可以通过你的内网 IP 访问。
|
||||
|
||||
> **WebUI 自动构建**:本地首次启动时,若 `static/admin` 不存在,会自动尝试执行 `npm ci`(仅在缺少依赖时)和 `npm run build -- --outDir static/admin --emptyOutDir`(需要本机有 Node.js 和 npm)。你也可以手动构建:`./scripts/build-webui.sh`
|
||||
> **WebUI 自动构建**:本地首次启动时,若 WebUI 静态目录不存在,会自动尝试执行 `npm ci --prefix webui`(仅在缺少依赖时)和 `npm run build --prefix webui -- --outDir static/admin --emptyOutDir`(需要本机有 Node.js 和 npm;静态目录可用 `DS2API_STATIC_ADMIN_DIR` 覆盖)。你也可以手动构建:`./scripts/build-webui.sh`
|
||||
|
||||
## 配置说明
|
||||
|
||||
@@ -350,6 +351,7 @@ go run ./cmd/ds2api
|
||||
|
||||
可选请求头 `X-Ds2-Target-Account`:指定使用某个托管账号(值为 email 或 mobile)。
|
||||
如果指定账号不存在,或者当前管理账号队列已满,请求会返回 `429`;当前 `429` 不附带 `Retry-After` 头。若账号存在但登录/刷新失败,则返回对应的鉴权错误。
|
||||
未指定目标账号时,如果 completion 因上游 thinking-only 空输出在同账号补偿重试后仍将返回 `429 upstream_empty_output`,托管账号模式会自动切到下一个可用账号,新建 session,并用原始 payload 再 fresh retry 一次。
|
||||
Gemini 路由还可以使用 `x-goog-api-key`,或在没有认证头时使用 `?key=` / `?api_key=` 作为调用方凭据。
|
||||
|
||||
## 并发模型
|
||||
@@ -363,6 +365,7 @@ Gemini 路由还可以使用 `x-goog-api-key`,或在没有认证头时使用 `
|
||||
|
||||
- 当 in-flight 槽位满时,请求进入等待队列,**不会立即 429**
|
||||
- 超出总承载上限后才返回 `429 Too Many Requests`,当前响应不附带 `Retry-After`
|
||||
- completion 空输出类 429 会先做同账号补偿重试;托管账号模式还会在最终返回 429 前切到另一个可用账号 fresh retry 一次
|
||||
- `GET /admin/queue/status` 返回实时并发状态
|
||||
|
||||
## Tool Call 适配
|
||||
@@ -370,12 +373,13 @@ Gemini 路由还可以使用 `x-goog-api-key`,或在没有认证头时使用 `
|
||||
当请求中带 `tools` 时,DS2API 会做防泄漏处理与结构化转译:
|
||||
|
||||
1. 只在**非代码块上下文**启用执行型 toolcall 识别(代码块示例默认不触发)
|
||||
2. 解析层当前把 DSML 外壳视为推荐可执行调用:`<|DSML|tool_calls>` → `<|DSML|invoke name="...">` → `<|DSML|parameter name="...">`;兼容旧式 canonical XML `<tool_calls>` → `<invoke name="...">` → `<parameter name="...">`。DSML 只是外壳别名,内部仍以 XML 解析语义为准;旧式 `<tools>` / `<tool_call>` / `<tool_name>` / `<param>`、`<function_call>`、`tool_use` / antml 变体与纯 JSON `tool_calls` 片段都会按普通文本处理
|
||||
2. 解析层当前把全角分隔符 DSML 外壳视为推荐可执行调用:`<|DSML|tool_calls>` → `<|DSML|invoke name="...">` → `<|DSML|parameter name="...">`;兼容半角 DSML、旧式 canonical XML `<tool_calls>` → `<invoke name="...">` → `<parameter name="...">`,以及若干 DSML 前缀/分隔符漂移。DSML 只是外壳别名,内部仍以 XML 解析语义为准;旧式 `<tools>` / `<tool_call>` / `<tool_name>` / `<param>`、`<function_call>`、`tool_use` / antml 变体与纯 JSON `tool_calls` 片段都会按普通文本处理,完整但 malformed 的 wrapper 也会作为普通文本释放
|
||||
3. `responses` 流式严格使用官方 item 生命周期事件(`response.output_item.*`、`response.content_part.*`、`response.function_call_arguments.*`)
|
||||
4. `responses` 支持并执行 `tool_choice`(`auto`/`none`/`required`/强制函数);`required` 违规时非流式返回 `422`,流式返回 `response.failed`
|
||||
5. 客户端请求哪种协议,就按该协议返回工具调用(OpenAI/Claude/Gemini 各自原生结构);模型侧优先约束输出规范 XML,再由兼容层转译
|
||||
|
||||
> 说明:当前版本 parser 层以”尽量解析成功”为优先,所有格式合法的 XML 工具调用都会通过,不做工具名 allow-list 过滤。
|
||||
> 解析层会保留显式空字符串或纯空白参数;Prompt 会要求模型不要主动输出空参数,缺参/空命令的拒绝应由工具执行侧或客户端 schema 校验负责。
|
||||
>
|
||||
> 想评估”把工具调用封装成 XML 再输入模型”的方案,可参考:`docs/toolcall-semantics.md`。
|
||||
|
||||
|
||||
20
README.en.md
20
README.en.md
@@ -131,7 +131,8 @@ For the full module-by-module architecture and directory responsibilities, see [
|
||||
| OpenAI compatible | `GET /v1/models`, `GET /v1/models/{id}`, `POST /v1/chat/completions`, `POST /v1/responses`, `GET /v1/responses/{response_id}`, `POST /v1/embeddings`, `POST /v1/files`, `GET /v1/files/{file_id}` |
|
||||
| Claude compatible | `GET /anthropic/v1/models`, `POST /anthropic/v1/messages`, `POST /anthropic/v1/messages/count_tokens` (plus shortcut paths `/v1/messages`, `/messages`) |
|
||||
| Gemini compatible | `POST /v1beta/models/{model}:generateContent`, `POST /v1beta/models/{model}:streamGenerateContent` (plus `/v1/models/{model}:*` paths) |
|
||||
| Unified CORS compatibility | `/v1/*`, `/anthropic/*`, `/v1beta/models/*`, and `/admin/*` share one CORS policy; on Vercel, the Node Runtime for `/v1/chat/completions` mirrors the same relaxed preflight behavior for third-party clients |
|
||||
| Ollama compatible | `GET /api/version`, `GET /api/tags`, `POST /api/show` |
|
||||
| Unified CORS compatibility | `/v1/*`, `/anthropic/*`, `/v1beta/models/*`, `/api/*`, and `/admin/*` share one CORS policy; on Vercel, the Node Runtime for `/v1/chat/completions` mirrors the same relaxed preflight behavior for third-party clients |
|
||||
| Multi-account rotation | Auto token refresh, email/mobile dual login |
|
||||
| Concurrency control | Per-account in-flight limit + waiting queue, dynamic recommended concurrency |
|
||||
| DeepSeek PoW | Pure Go high-performance solver (DeepSeekHashV1), ms-level response |
|
||||
@@ -184,11 +185,11 @@ Besides the primary aliases above, `/anthropic/v1/models` also returns Claude 4.
|
||||
- Set `ANTHROPIC_BASE_URL` to the DS2API root URL (for example `http://127.0.0.1:5001`). Claude Code sends requests to `/v1/messages?beta=true`.
|
||||
- `ANTHROPIC_API_KEY` must match an entry in `keys` from `config.json`. Keeping both a regular key and an `sk-ant-*` style key improves client compatibility.
|
||||
- If your environment has proxy variables, set `NO_PROXY=127.0.0.1,localhost,<your_host_ip>` for DS2API to avoid proxy interception of local traffic.
|
||||
- If tool calls are rendered as plain text and not executed, first verify the model output uses the recommended DSML block: `<|DSML|tool_calls><|DSML|invoke name="..."><|DSML|parameter name="...">...`. DS2API also accepts legacy canonical XML: `<tool_calls><invoke name="..."><parameter name="...">...`; legacy `<tools>` / `<tool_call>` / `<tool_name>` / `<param>`, `<function_call>`, `tool_use`, or standalone JSON `tool_calls` are not executed.
|
||||
- If tool calls are rendered as plain text and not executed, first verify the model output uses the recommended fullwidth-separator DSML block: `<|DSML|tool_calls><|DSML|invoke name="..."><|DSML|parameter name="...">...`. DS2API also accepts halfwidth DSML and legacy canonical XML: `<tool_calls><invoke name="..."><parameter name="...">...`; legacy `<tools>` / `<tool_call>` / `<tool_name>` / `<param>`, `<function_call>`, `tool_use`, or standalone JSON `tool_calls` are not executed and stay plain text.
|
||||
|
||||
### Gemini Endpoint
|
||||
|
||||
The Gemini adapter maps model names to DeepSeek native models via `model_aliases` or built-in heuristics, supporting both `generateContent` and `streamGenerateContent` call patterns with full Tool Calling support (`functionDeclarations` → `functionCall` output).
|
||||
The Gemini adapter maps model names to DeepSeek native models via `model_aliases` or exact built-in aliases (covering common `gemini-2.5-*`, `gemini-3*`, and `gemini-pro-vision` names), supporting both `generateContent` and `streamGenerateContent` call patterns with full Tool Calling support (`functionDeclarations` → `functionCall` output). If the Gemini model name has a `-nothinking` suffix, such as `gemini-2.5-pro-nothinking`, it maps to the corresponding forced no-thinking model.
|
||||
|
||||
## Quick Start
|
||||
|
||||
@@ -283,13 +284,13 @@ Recommended: convert `config.json` to Base64 locally, then paste into `DS2API_CO
|
||||
base64 < config.json | tr -d '\n'
|
||||
```
|
||||
|
||||
> **Streaming note**: OpenAI Chat streaming on Vercel is routed to `api/chat-stream.js` (Node Runtime), with both the canonical `/v1/chat/completions` path and the root shortcut `/chat/completions` supported. Auth, account selection, and session/PoW preparation are still handled by the Go internal prepare endpoint; streaming output (including `tools`) is assembled on Node with Go-aligned anti-leak handling. This is the only interface family currently routed through Node, and its CORS allow behavior is kept aligned with the Go router so third-party preflight handling stays unified.
|
||||
> **Streaming note**: OpenAI Chat streaming on Vercel is routed to `api/chat-stream.js` (Node Runtime), but `vercel.json` rewrites only the canonical `/v1/chat/completions` path to Node; the root shortcut `/chat/completions` stays on the Go main path. Auth, account selection, and session/PoW preparation are still handled by the Go internal prepare endpoint; streaming output (including `tools`) is assembled on Node with Go-aligned anti-leak handling. Use `/v1/chat/completions` on Vercel when real-time streaming is required.
|
||||
|
||||
For detailed deployment instructions, see the [Deployment Guide](docs/DEPLOY.en.md).
|
||||
|
||||
### Option 4: Local Run
|
||||
|
||||
**Prerequisites**: Go 1.26+, Node.js `20.19+` or `22.12+` (only if building WebUI locally)
|
||||
**Prerequisites**: Go 1.26+, Node.js `20.19+` or `22.12+` (only if building WebUI locally; CI / Docker builds use Node 24), and npm available; npm 10+ is recommended
|
||||
|
||||
```bash
|
||||
# 1. Clone
|
||||
@@ -308,7 +309,7 @@ Default local URL: `http://127.0.0.1:5001`
|
||||
|
||||
The server actually binds to `0.0.0.0:5001`, so devices on the same LAN can usually reach it through your private IP as well.
|
||||
|
||||
> **WebUI auto-build**: On first local startup, if `static/admin` is missing, DS2API will auto-run `npm ci` (only when dependencies are missing) and `npm run build -- --outDir static/admin --emptyOutDir` (requires Node.js). You can also build manually: `./scripts/build-webui.sh`
|
||||
> **WebUI auto-build**: On first local startup, if the WebUI static directory is missing, DS2API auto-runs `npm ci --prefix webui` (only when dependencies are missing) and `npm run build --prefix webui -- --outDir static/admin --emptyOutDir` (requires Node.js; `DS2API_STATIC_ADMIN_DIR` can override the static directory). You can also build manually: `./scripts/build-webui.sh`
|
||||
|
||||
## Configuration
|
||||
|
||||
@@ -336,6 +337,7 @@ For business endpoints (`/v1/*`, `/anthropic/*`, Gemini routes), DS2API supports
|
||||
| **Direct token** | If the token is not in `config.keys`, DS2API treats it as a DeepSeek token directly |
|
||||
|
||||
Optional header `X-Ds2-Target-Account`: Pin a specific managed account (value is email or mobile).
|
||||
When no target account is pinned, if a completion would end as `429 upstream_empty_output` after the same-account empty-output retry, managed-account mode switches to the next available account, creates a fresh session, and retries the original payload once.
|
||||
Gemini routes also accept `x-goog-api-key`, or `?key=` / `?api_key=` when no auth header is present.
|
||||
|
||||
## Concurrency Model
|
||||
@@ -348,7 +350,8 @@ Queue limit = DS2API_ACCOUNT_MAX_QUEUE (default = recommended concurrency)
|
||||
```
|
||||
|
||||
- When inflight slots are full, requests enter a waiting queue — **no immediate 429**
|
||||
- 429 is returned only when total load exceeds inflight + queue capacity
|
||||
- 429 is returned only when total load exceeds inflight + queue capacity; current responses do not include `Retry-After`
|
||||
- Completion empty-output 429s first get the same-account compensation retry; managed-account mode also tries one alternate-account fresh retry before returning the final 429
|
||||
- `GET /admin/queue/status` returns real-time concurrency state
|
||||
|
||||
## Tool Call Adaptation
|
||||
@@ -356,12 +359,13 @@ Queue limit = DS2API_ACCOUNT_MAX_QUEUE (default = recommended concurrency)
|
||||
When `tools` is present in the request, DS2API performs anti-leak handling:
|
||||
|
||||
1. Toolcall feature matching is enabled only in **non-code-block context** (fenced examples are ignored)
|
||||
2. The parser now treats the DSML shell as the recommended executable tool-calling syntax: `<|DSML|tool_calls>` → `<|DSML|invoke name="...">` → `<|DSML|parameter name="...">`; it also accepts legacy canonical XML `<tool_calls>` → `<invoke name="...">` → `<parameter name="...">`. DSML is a shell alias and internal parsing remains XML-based; legacy `<tools>` / `<tool_call>` / `<tool_name>` / `<param>`, `<function_call>`, `tool_use`, antml variants, and standalone JSON `tool_calls` payloads are treated as plain text
|
||||
2. The parser treats the fullwidth-separator DSML shell as the recommended executable tool-calling syntax: `<|DSML|tool_calls>` → `<|DSML|invoke name="...">` → `<|DSML|parameter name="...">`; it also accepts halfwidth DSML, legacy canonical XML `<tool_calls>` → `<invoke name="...">` → `<parameter name="...">`, plus common DSML prefix/separator drift. DSML is a shell alias and internal parsing remains XML-based; legacy `<tools>` / `<tool_call>` / `<tool_name>` / `<param>`, `<function_call>`, `tool_use`, antml variants, and standalone JSON `tool_calls` payloads are treated as plain text, and complete but malformed wrappers are released as plain text too
|
||||
3. `responses` streaming strictly uses official item lifecycle events (`response.output_item.*`, `response.content_part.*`, `response.function_call_arguments.*`)
|
||||
4. `responses` supports and enforces `tool_choice` (`auto`/`none`/`required`/forced function); `required` violations return `422` for non-stream and `response.failed` for stream
|
||||
5. The output protocol follows the client request (OpenAI / Claude / Gemini native shapes); model-side prompting can prefer XML, and the compatibility layer handles the protocol-specific translation
|
||||
|
||||
> Note: the current parser still prioritizes “parse successfully whenever possible”; hard allow-list rejection for undeclared tool names is not enabled yet.
|
||||
> Explicit empty strings or whitespace-only parameters are preserved by the parser; prompting tells the model not to emit blank parameters, and missing/empty argument rejection belongs in the tool executor or client schema validation.
|
||||
|
||||
## Local Dev Packet Capture
|
||||
|
||||
|
||||
@@ -27,7 +27,7 @@ ds2api/
|
||||
│ ├── claudeconv/ # Claude message conversion helpers
|
||||
│ ├── compat/ # Compatibility and regression helpers
|
||||
│ ├── assistantturn/ # Upstream output to canonical assistant turn / stream event semantics
|
||||
│ ├── completionruntime/ # Shared Go DeepSeek completion startup, non-stream collection, and retry
|
||||
│ ├── completionruntime/ # Shared Go DeepSeek completion startup, collection, empty-output/account-switch retry
|
||||
│ ├── config/ # Config loading/validation/hot reload
|
||||
│ ├── deepseek/ # DeepSeek upstream client/protocol/transport
|
||||
│ │ ├── client/ # Login/session/completion/upload/delete calls
|
||||
@@ -41,6 +41,7 @@ ds2api/
|
||||
│ │ ├── admin/ # Admin API root assembly and resource packages
|
||||
│ │ ├── claude/ # Claude HTTP protocol adapter
|
||||
│ │ ├── gemini/ # Gemini HTTP protocol adapter
|
||||
│ │ ├── ollama/ # Ollama-compatible model/capability query endpoints
|
||||
│ │ ├── openai/ # OpenAI HTTP surface
|
||||
│ │ │ ├── chat/ # Chat Completions execution entrypoint
|
||||
│ │ │ ├── responses/ # Responses API and response store
|
||||
@@ -57,6 +58,7 @@ ds2api/
|
||||
│ ├── prompt/ # Prompt composition
|
||||
│ ├── promptcompat/ # API request -> DeepSeek web-chat plain-text compatibility
|
||||
│ ├── rawsample/ # Raw sample read/write and management
|
||||
│ ├── responsehistory/ # DeepSeek upstream response archive and session snapshots
|
||||
│ ├── server/ # Router and middleware assembly
|
||||
│ │ └── data/ # Router/runtime helper data
|
||||
│ ├── sse/ # SSE parsing utilities
|
||||
@@ -188,10 +190,11 @@ flowchart LR
|
||||
- `internal/server`: router tree + middlewares (health, protocol routes, Admin/WebUI).
|
||||
- `internal/httpapi/openai/*`: OpenAI HTTP surface split into chat, responses, files, embeddings, history, and shared packages; chat/responses share the promptcompat, stream, and toolcall semantics.
|
||||
- `internal/httpapi/{claude,gemini}`: protocol adapters that normalize into the same prompt compatibility semantics; normal direct paths must share DeepSeek session/PoW/completion execution through `completionruntime`, while `translatorcliproxy` is reserved for Vercel prepare/release, missing-backend fallback, and regression tests.
|
||||
- `internal/httpapi/ollama`: Ollama-compatible model list and capability query endpoints.
|
||||
- `internal/httpapi/requestbody`: shared HTTP body reading, JSON pre-validation, and UTF-8 error helpers across protocol adapters.
|
||||
- `internal/promptcompat`: compatibility core for turning OpenAI/Claude/Gemini requests into DeepSeek web-chat plain-text context.
|
||||
- `internal/assistantturn`: Go output-side canonical semantics, converting DeepSeek SSE collection results and stream finalization state into assistant turns and centralizing thinking, tool call, citation, usage, stop/error behavior.
|
||||
- `internal/completionruntime`: shared Go completion execution helpers for DeepSeek session/PoW/call startup, non-stream collection, and empty-output retry; streaming paths use it to start upstream requests, continue to use `internal/stream` for real-time consumption, and use `assistantturn` during finalization.
|
||||
- `internal/completionruntime`: shared Go completion execution helpers for DeepSeek session/PoW/call startup, non-stream collection, empty-output retry, and one managed-account fresh retry before a final 429; streaming paths use it to start upstream requests, continue to use `internal/stream` for real-time consumption, and use `assistantturn` during finalization.
|
||||
- `internal/translatorcliproxy`: bridge compatibility layer for Claude/Gemini and OpenAI shape translation; it is not the main business protocol conversion center.
|
||||
- `internal/deepseek/{client,protocol,transport}`: upstream requests, sessions, PoW adaptation, protocol constants, and transport details.
|
||||
- `internal/js/chat-stream` + `api/chat-stream.js`: Vercel Node streaming bridge; Go prepare/release owns auth, account lease, and completion payload assembly, while Node relays real-time SSE with Go-aligned finalization and tool sieve semantics.
|
||||
@@ -199,6 +202,7 @@ flowchart LR
|
||||
- `internal/toolcall` + `internal/toolstream`: DSML shell compatibility plus canonical XML tool-call parsing and anti-leak sieve; DSML is normalized back to XML at the entrypoint, and internal parsing remains XML-based.
|
||||
- `internal/httpapi/admin/*`: Admin API root assembly plus auth/accounts/config/settings/proxies/rawsamples/vercel/history/devcapture/version resource packages.
|
||||
- `internal/chathistory`: server-side conversation history persistence, pagination, detail lookup, and retention policy.
|
||||
- `internal/responsehistory`: DeepSeek upstream response archive, saving assistant text, thinking, raw tool-call fragments, and streaming detail before protocol rendering/trimming.
|
||||
- `internal/config`: config loading/validation + runtime settings hot-reload.
|
||||
- `internal/account`: managed account pool, inflight slots, waiting queue.
|
||||
- `internal/textclean`: text cleanup helpers, e.g. stripping `[reference: N]` markers.
|
||||
|
||||
@@ -27,7 +27,7 @@ ds2api/
|
||||
│ ├── claudeconv/ # Claude 消息格式转换工具
|
||||
│ ├── compat/ # 兼容性辅助与回归支持
|
||||
│ ├── assistantturn/ # 上游输出到统一 assistant turn / stream event 的语义层
|
||||
│ ├── completionruntime/ # Go 主路径共享 DeepSeek completion 启动、非流式收集与 retry
|
||||
│ ├── completionruntime/ # Go 主路径共享 DeepSeek completion 启动、收集、空输出/切号 retry
|
||||
│ ├── config/ # 配置加载、校验、热更新
|
||||
│ ├── deepseek/ # DeepSeek 上游 client/protocol/transport
|
||||
│ │ ├── client/ # 登录、会话、completion、上传/删除等上游调用
|
||||
@@ -41,6 +41,7 @@ ds2api/
|
||||
│ │ ├── admin/ # Admin API 根装配与资源子包
|
||||
│ │ ├── claude/ # Claude HTTP 协议适配
|
||||
│ │ ├── gemini/ # Gemini HTTP 协议适配
|
||||
│ │ ├── ollama/ # Ollama 兼容模型/能力查询接口
|
||||
│ │ ├── openai/ # OpenAI HTTP surface
|
||||
│ │ │ ├── chat/ # Chat Completions 执行入口
|
||||
│ │ │ ├── responses/ # Responses API 与 response store
|
||||
@@ -57,6 +58,7 @@ ds2api/
|
||||
│ ├── prompt/ # Prompt 组装
|
||||
│ ├── promptcompat/ # API 请求到 DeepSeek 网页纯文本上下文兼容层
|
||||
│ ├── rawsample/ # raw sample 读写与管理
|
||||
│ ├── responsehistory/ # DeepSeek 上游响应归档与会话快照
|
||||
│ ├── server/ # 路由与中间件装配
|
||||
│ │ └── data/ # 路由/运行时辅助数据
|
||||
│ ├── sse/ # SSE 解析工具
|
||||
@@ -188,10 +190,11 @@ flowchart LR
|
||||
- `internal/server`:路由树和中间件挂载(健康检查、协议入口、Admin/WebUI)。
|
||||
- `internal/httpapi/openai/*`:OpenAI HTTP surface,按 chat、responses、files、embeddings、history、shared 拆分;chat/responses 共享 promptcompat、stream、toolcall 等核心语义。
|
||||
- `internal/httpapi/{claude,gemini}`:协议输入输出适配,归一到同一套 prompt compatibility 语义;正常直连路径必须通过 `completionruntime` 共享 DeepSeek session/PoW/completion 调用,`translatorcliproxy` 仅保留给 Vercel prepare/release、后端缺失 fallback 和回归测试。
|
||||
- `internal/httpapi/ollama`:Ollama 兼容的模型列表与能力查询入口。
|
||||
- `internal/httpapi/requestbody`:跨协议复用的请求体读取、JSON 解码前置校验与 UTF-8 错误处理辅助。
|
||||
- `internal/promptcompat`:OpenAI/Claude/Gemini 请求到 DeepSeek 网页纯文本上下文的兼容内核。
|
||||
- `internal/assistantturn`:Go 输出侧统一语义层,把 DeepSeek SSE 收集结果和流式收尾状态归一成 assistant turn,集中处理 thinking、tool call、citation、usage、stop/error 语义。
|
||||
- `internal/completionruntime`:Go surface 共享的 completion 执行辅助,负责 DeepSeek session/PoW/call 启动、非流式 collect 和 empty-output retry;流式路径复用它启动上游请求,继续用 `internal/stream` 做实时消费,并在最终收尾阶段接入 `assistantturn`。
|
||||
- `internal/completionruntime`:Go surface 共享的 completion 执行辅助,负责 DeepSeek session/PoW/call 启动、非流式 collect、empty-output retry,以及托管账号在最终 429 前的一次切号 fresh retry;流式路径复用它启动上游请求,继续用 `internal/stream` 做实时消费,并在最终收尾阶段接入 `assistantturn`。
|
||||
- `internal/translatorcliproxy`:Claude/Gemini 与 OpenAI 结构互转的桥接兼容层,不作为主业务协议转换中心。
|
||||
- `internal/deepseek/{client,protocol,transport}`:上游请求、会话、PoW 适配、协议常量与传输层。
|
||||
- `internal/js/chat-stream` + `api/chat-stream.js`:Vercel Node 流式桥;Go prepare/release 管理鉴权、账号租约和 completion payload,Node 侧负责实时 SSE 转发并保持 Go 对齐的终结态和 tool sieve 语义。
|
||||
@@ -199,6 +202,7 @@ flowchart LR
|
||||
- `internal/toolcall` + `internal/toolstream`:DSML 外壳兼容与 canonical XML 工具调用解析、防泄漏筛分;DSML 会在入口归一化回 XML,内部仍按 XML 语义解析。
|
||||
- `internal/httpapi/admin/*`:Admin API 根装配与 auth/accounts/config/settings/proxies/rawsamples/vercel/history/devcapture/version 等资源子包。
|
||||
- `internal/chathistory`:服务器端对话记录持久化、分页、单条详情和保留策略。
|
||||
- `internal/responsehistory`:DeepSeek 上游响应归档,会在协议回译/裁剪前保存 assistant text、thinking、tool-call 原始片段和流式详情。
|
||||
- `internal/config`:配置加载、校验、运行时 settings 热更新。
|
||||
- `internal/account`:托管账号池、并发槽位、等待队列。
|
||||
- `internal/textclean`:文本清洗,移除 `[reference: N]` 标记等噪声。
|
||||
|
||||
@@ -9,8 +9,8 @@ Thanks for your interest in contributing to DS2API!
|
||||
### Prerequisites
|
||||
|
||||
- Go 1.26+
|
||||
- Node.js `20.19+` or `22.12+` (for WebUI development)
|
||||
- npm (bundled with Node.js)
|
||||
- Node.js `20.19+` or `22.12+` (for WebUI development; CI / Docker builds use Node 24)
|
||||
- npm (bundled with Node.js; 10+ recommended)
|
||||
|
||||
### Backend Development
|
||||
|
||||
|
||||
@@ -9,8 +9,8 @@
|
||||
### 前置要求
|
||||
|
||||
- Go 1.26+
|
||||
- Node.js `20.19+` 或 `22.12+`(WebUI 开发时)
|
||||
- npm(随 Node.js 提供)
|
||||
- Node.js `20.19+` 或 `22.12+`(WebUI 开发时;CI / Docker 构建使用 Node 24)
|
||||
- npm(随 Node.js 提供,建议 10+)
|
||||
|
||||
### 后端开发
|
||||
|
||||
|
||||
@@ -39,8 +39,8 @@ Recommended order when choosing a deployment method:
|
||||
| Dependency | Minimum Version | Notes |
|
||||
| --- | --- | --- |
|
||||
| Go | 1.26+ | Build backend |
|
||||
| Node.js | `20.19+` or `22.12+` | Only needed to build WebUI locally |
|
||||
| npm | Bundled with Node.js | Install WebUI dependencies |
|
||||
| Node.js | `20.19+` or `22.12+` (CI / Docker builds use Node 24) | Only needed to build WebUI locally |
|
||||
| npm | Bundled with Node.js; 10+ recommended | Install WebUI dependencies |
|
||||
|
||||
Config source (choose one):
|
||||
|
||||
@@ -299,6 +299,8 @@ VERCEL_TEAM_ID=team_xxxxxxxxxxxx # optional for personal accounts
|
||||
| `DS2API_VERCEL_INTERNAL_SECRET` | Hybrid streaming internal auth | Falls back to `DS2API_ADMIN_KEY` |
|
||||
| `DS2API_VERCEL_STREAM_LEASE_TTL_SECONDS` | Stream lease TTL | `900` |
|
||||
| `DS2API_RAW_STREAM_SAMPLE_ROOT` | Raw stream sample root for saving/reading samples | `tests/raw_stream_samples` |
|
||||
| `DS2API_STATIC_ADMIN_DIR` | WebUI static asset directory | `static/admin` |
|
||||
| `DS2API_AUTO_BUILD_WEBUI` | Whether local startup auto-builds missing WebUI assets (`1/true/yes/on` or `0/false/no/off`) | Enabled outside Vercel |
|
||||
| `VERCEL_TOKEN` | Vercel sync token | — |
|
||||
| `VERCEL_PROJECT_ID` | Vercel project ID | — |
|
||||
| `VERCEL_TEAM_ID` | Vercel team ID | — |
|
||||
@@ -321,7 +323,7 @@ Request ──────┐
|
||||
```
|
||||
|
||||
- **Go entry**: `api/index.go` (Serverless Go)
|
||||
- **Stream entry**: `api/chat-stream.js` (Node Runtime for real-time SSE)
|
||||
- **Stream entry**: `api/chat-stream.js` (Node Runtime for real-time SSE; `vercel.json` rewrites only the canonical `/v1/chat/completions` path here, while the root shortcut `/chat/completions` stays on the Go entry)
|
||||
- **Routing**: `vercel.json`
|
||||
- **Build command**: `npm ci --prefix webui && npm run build --prefix webui` (automatic)
|
||||
|
||||
@@ -438,7 +440,7 @@ Default local access URL: `http://127.0.0.1:5001`; the server actually binds to
|
||||
|
||||
### 4.2 WebUI Build
|
||||
|
||||
On first local startup, if `static/admin/` is missing, DS2API will automatically attempt to build the WebUI (requires Node.js/npm; when dependencies are missing it runs `npm ci` first, then `npm run build -- --outDir static/admin --emptyOutDir`).
|
||||
On first local startup, if the WebUI static directory is missing, DS2API automatically attempts to build it (requires Node.js/npm; when dependencies are missing it runs `npm ci --prefix webui`, then `npm run build --prefix webui -- --outDir <static-dir> --emptyOutDir`). The default static directory is `static/admin/`, and `DS2API_STATIC_ADMIN_DIR` can override it.
|
||||
|
||||
Manual build:
|
||||
|
||||
|
||||
@@ -39,8 +39,8 @@
|
||||
| 依赖 | 最低版本 | 说明 |
|
||||
| --- | --- | --- |
|
||||
| Go | 1.26+ | 编译后端 |
|
||||
| Node.js | `20.19+` 或 `22.12+` | 仅在需要本地构建 WebUI 时 |
|
||||
| npm | 随 Node.js 提供 | 安装 WebUI 依赖 |
|
||||
| Node.js | `20.19+` 或 `22.12+`(CI / Docker 构建使用 Node 24) | 仅在需要本地构建 WebUI 时 |
|
||||
| npm | 随 Node.js 提供,建议 10+ | 安装 WebUI 依赖 |
|
||||
|
||||
配置来源(任选其一):
|
||||
|
||||
@@ -299,6 +299,8 @@ VERCEL_TEAM_ID=team_xxxxxxxxxxxx # 个人账号可留空
|
||||
| `DS2API_VERCEL_INTERNAL_SECRET` | 混合流式内部鉴权 | 回退用 `DS2API_ADMIN_KEY` |
|
||||
| `DS2API_VERCEL_STREAM_LEASE_TTL_SECONDS` | 流式 lease TTL | `900` |
|
||||
| `DS2API_RAW_STREAM_SAMPLE_ROOT` | raw stream 样本保存/读取根目录 | `tests/raw_stream_samples` |
|
||||
| `DS2API_STATIC_ADMIN_DIR` | WebUI 静态资源目录 | `static/admin` |
|
||||
| `DS2API_AUTO_BUILD_WEBUI` | 本地启动时是否自动构建缺失的 WebUI(`1/true/yes/on` 或 `0/false/no/off`) | 非 Vercel 默认开启 |
|
||||
| `VERCEL_TOKEN` | Vercel 同步 token | — |
|
||||
| `VERCEL_PROJECT_ID` | Vercel 项目 ID | — |
|
||||
| `VERCEL_TEAM_ID` | Vercel 团队 ID | — |
|
||||
@@ -331,7 +333,7 @@ api/index.go api/chat-stream.js
|
||||
```
|
||||
|
||||
- **入口文件**:`api/index.go`(Serverless Go)
|
||||
- **流式入口**:`api/chat-stream.js`(Node Runtime,保证实时 SSE)
|
||||
- **流式入口**:`api/chat-stream.js`(Node Runtime,保证实时 SSE;`vercel.json` 仅把规范路径 `/v1/chat/completions` 重写到这里,根路径快捷别名 `/chat/completions` 仍走 Go 入口)
|
||||
- **路由重写**:`vercel.json`
|
||||
- **构建命令**:`npm ci --prefix webui && npm run build --prefix webui`(自动执行)
|
||||
|
||||
@@ -448,7 +450,7 @@ go run ./cmd/ds2api
|
||||
|
||||
### 4.2 WebUI 构建
|
||||
|
||||
本地首次启动时,若 `static/admin/` 不存在,服务会自动尝试构建 WebUI(需要 Node.js/npm;缺依赖时会先执行 `npm ci`,再执行 `npm run build -- --outDir static/admin --emptyOutDir`)。
|
||||
本地首次启动时,若 WebUI 静态目录不存在,服务会自动尝试构建 WebUI(需要 Node.js/npm;缺依赖时会先执行 `npm ci --prefix webui`,再执行 `npm run build --prefix webui -- --outDir <静态目录> --emptyOutDir`)。默认静态目录为 `static/admin/`,可用 `DS2API_STATIC_ADMIN_DIR` 覆盖。
|
||||
|
||||
你也可以手动构建:
|
||||
|
||||
|
||||
@@ -81,7 +81,7 @@ Tool call 问题优先跑:
|
||||
|
||||
```bash
|
||||
go test -v ./internal/toolcall ./internal/toolstream -count=1
|
||||
node --test tests/node/stream-tool-sieve.test.js tests/node/chat-stream.test.js
|
||||
./tests/scripts/run-unit-node.sh
|
||||
```
|
||||
|
||||
## 5. 测试选择
|
||||
|
||||
@@ -75,7 +75,7 @@ npm run build --prefix webui
|
||||
1. **Preflight 检查**:
|
||||
- `go test ./... -count=1`(单元测试)
|
||||
- `./tests/scripts/check-node-split-syntax.sh`(Node 拆分模块语法门禁)
|
||||
- `node --test tests/node/stream-tool-sieve.test.js tests/node/chat-stream.test.js tests/node/js_compat_test.js`
|
||||
- `node --test --test-concurrency=1 tests/node/stream-tool-sieve.test.js tests/node/chat-stream.test.js tests/node/chat-history-utils.test.js tests/node/js_compat_test.js`
|
||||
- `npm run build --prefix webui`(WebUI 构建检查)
|
||||
|
||||
2. **隔离启动**:复制 `config.json` 到临时目录,启动独立服务进程
|
||||
@@ -203,10 +203,10 @@ go test ./...
|
||||
|
||||
```bash
|
||||
# 运行 tool calls 相关测试(推荐用于调试 tool call 解析问题)
|
||||
go test -v -run 'TestParseToolCalls|TestRepair' ./internal/toolcall/
|
||||
go test -v -run 'TestParseToolCalls|TestProcessToolSieve|TestRepair' ./internal/toolcall ./internal/toolstream
|
||||
|
||||
# 运行单个测试用例
|
||||
go test -v -run TestParseToolCallsWithDeepSeekHallucination ./internal/toolcall/
|
||||
go test -v -run TestParseToolCallsAllowsAllEmptyParameterPayload ./internal/toolcall
|
||||
|
||||
# 运行 format 相关测试
|
||||
go test -v ./internal/format/...
|
||||
@@ -221,23 +221,23 @@ go test -v ./internal/httpapi/openai/...
|
||||
|
||||
```bash
|
||||
# 1. 运行 tool calls 相关的所有测试
|
||||
go test -v -run 'TestParseToolCalls|TestRepair' ./internal/toolcall/
|
||||
go test -v -run 'TestParseToolCalls|TestProcessToolSieve|TestRepair' ./internal/toolcall ./internal/toolstream
|
||||
|
||||
# 2. 查看测试输出中的详细调试信息
|
||||
go test -v -run TestParseToolCallsWithDeepSeekHallucination ./internal/toolcall/ 2>&1
|
||||
go test -v -run TestProcessToolSieveReleasesMalformedExecutableXMLBlock ./internal/toolstream 2>&1
|
||||
|
||||
# 3. 检查具体测试用例的修复效果
|
||||
# 测试用例位于 internal/toolcall/toolcalls_test.go,包含:
|
||||
# - TestParseToolCallsWithDeepSeekHallucination: DeepSeek 典型幻觉输出
|
||||
# 重点测试位于 internal/toolcall/toolcalls_test.go 与 internal/toolstream/tool_sieve_xml_test.go,包含:
|
||||
# - TestParseToolCallsAllowsAllEmptyParameterPayload: 空参数结构化保留
|
||||
# - TestProcessToolSieveReleasesMalformedExecutableXMLBlock: malformed XML wrapper 释放为文本
|
||||
# - TestRepairLooseJSONWithNestedObjects: 嵌套对象的方括号修复
|
||||
# - TestParseToolCallsWithMixedWindowsPaths: Windows 路径处理
|
||||
```
|
||||
|
||||
### 运行 Node.js 测试
|
||||
|
||||
```bash
|
||||
# 运行 Node 测试
|
||||
node --test tests/node/stream-tool-sieve.test.js
|
||||
node --test --test-concurrency=1 tests/node/stream-tool-sieve.test.js tests/node/chat-stream.test.js tests/node/chat-history-utils.test.js tests/node/js_compat_test.js
|
||||
|
||||
# 或使用脚本
|
||||
./tests/scripts/run-unit-node.sh
|
||||
|
||||
@@ -111,8 +111,8 @@ DS2API 当前的核心思路,不是把客户端传来的 `messages`、`tools`
|
||||
- OpenAI Chat / Responses 原生走统一 OpenAI 标准化与 DeepSeek payload 组装;Claude / Gemini 会尽量复用 OpenAI prompt/tool 语义,其中 Gemini 直接复用 `promptcompat.BuildOpenAIPromptForAdapter`。Go 主服务新增 `completionruntime` 启动层,统一执行 DeepSeek session/PoW/call;输出侧新增 `assistantturn` 语义层:非流式 OpenAI Chat / Responses / Claude / Gemini 会把 DeepSeek SSE 收集结果先归一成同一份 assistant turn,再分别渲染成各协议原生外形;流式 OpenAI Chat / Responses / Claude / Gemini 继续保持各协议实时 SSE framing,但最终收尾的 tool fallback、schema 归一、usage、empty-output / content-filter 错误语义同样由 `assistantturn` 判定。Claude / Gemini 的常规 Go 主路径不再依赖内部 `httptest` 转发到 OpenAI handler;`translatorcliproxy` 仅保留用于 Vercel bridge、后端缺失 fallback 和回归测试,不作为主业务协议转换中心。
|
||||
- Vercel Node 流式路径本轮不迁移,仍使用现有 Node bridge / stream-tool-sieve 实现;后续若变更 Node 流式语义,需要按 `assistantturn` 的 Go canonical 输出语义同步对齐。
|
||||
- 客户端传入的 thinking / reasoning 开关会被归一到下游 `thinking_enabled`。Gemini `generationConfig.thinkingConfig.thinkingBudget` 会翻译成同一套 thinking 开关;关闭时即使上游返回 `response/thinking_content`,兼容层也不会把它当作可见正文输出。若最终解析出的模型名带 `-nothinking` 后缀,则会无条件强制关闭 thinking,优先级高于请求体中的 `thinking` / `reasoning` / `reasoning_effort`。未显式关闭时,各 surface 会按解析后的 DeepSeek 模型默认能力开启 thinking,并用各自协议的原生形态暴露:OpenAI Chat 为 `reasoning_content`,OpenAI Responses 为 `response.reasoning.delta` / `reasoning` content,Claude 为 `thinking` block / `thinking_delta`,Gemini 为 `thought: true` part。
|
||||
- 对 OpenAI Chat / Responses 的非流式收尾,如果最终可见正文为空,兼容层会优先尝试把思维链中的独立 DSML / XML 工具块当作真实工具调用解析出来。流式链路也会在收尾阶段做同样的 fallback 检测,但不会因为思维链内容去中途拦截或改写流式输出;真正的工具识别始终基于原始上游文本,而不是基于“已经做过可见输出清洗”的版本,因此即使最终可见层会剥离完整 leaked DSML / XML `tool_calls` wrapper、并抑制全空参数或无效 wrapper 块,也不会影响真实工具调用转成结构化 `tool_calls` / `function_call`。补发结果会作为本轮 assistant 的结构化 `tool_calls` / `function_call` 输出返回,而不是塞进 `content` 文本;如果客户端没有开启 thinking / reasoning,思维链只用于检测,不会作为 `reasoning_content` 或可见正文暴露。只有正文为空且思维链里也没有可执行工具调用时,才继续按空回复错误处理。
|
||||
- OpenAI Chat / Responses 的空回复错误处理之前会默认做一次内部补偿重试:第一次上游完整结束后,如果最终可见正文为空、没有解析到工具调用、也没有已经向客户端流式发出工具调用,并且终止原因不是 `content_filter`,兼容层会复用同一个 `chat_session_id`、账号、token 与工具策略,把原始 completion `prompt` 追加固定后缀 `Previous reply had no visible output. Please regenerate the visible final answer or tool call now.` 后重新提交一次。重试遵循 DeepSeek 多轮对话协议:从第一次上游 SSE 流中提取 `response_message_id`,并在重试 payload 中设置 `parent_message_id` 为该值,使重试成为同一会话的后续轮次而非断裂的根消息;同时重新获取一次 PoW(若 PoW 获取失败则回退到原始 PoW)。该重试不会重新标准化消息、不会新建 session、不会切换账号,也不会向流式客户端插入重试标记;第二次 thinking / reasoning 会按正常增量直接接到第一次之后,并继续使用 overlap trim 去重。若第二次仍为空,终端错误码仍保持现有 `upstream_empty_output`;若任一尝试触发空 `content_filter`,不做补偿重试并保持 `content_filter` 错误。JS Vercel 运行时同样设置 `parent_message_id`,但因无法直接调用 PoW API 而复用原始 PoW。
|
||||
- 对 OpenAI Chat / Responses 的非流式收尾,如果最终可见正文为空,兼容层会优先尝试把思维链中的独立 DSML / XML 工具块当作真实工具调用解析出来。流式链路也会在收尾阶段做同样的 fallback 检测,但不会因为思维链内容去中途拦截或改写流式输出;真正的工具识别始终基于原始上游文本,而不是基于“已经做过可见输出清洗”的版本。最终可见层会剥离已经成功解析成工具调用的完整 leaked DSML / XML `tool_calls` wrapper;如果遇到完整 wrapper 但内部形态不符合可执行工具调用语义(例如 `<param>` 这类 malformed XML 工具壳),流式 sieve 会把该块作为普通文本释放,而不是吞掉或伪造成工具调用。补发结果会作为本轮 assistant 的结构化 `tool_calls` / `function_call` 输出返回,而不是塞进 `content` 文本;如果客户端没有开启 thinking / reasoning,思维链只用于检测,不会作为 `reasoning_content` 或可见正文暴露。只有正文为空且思维链里也没有可执行工具调用时,才继续按空回复错误处理。
|
||||
- OpenAI Chat / Responses、Claude Messages、Gemini generateContent 的空回复错误处理之前会默认做一次内部补偿重试:第一次上游完整结束后,如果最终可见正文为空、没有解析到工具调用、也没有已经向客户端流式发出工具调用,并且终止原因不是 `content_filter`,兼容层会复用同一个 `chat_session_id`、账号、token 与工具策略,把原始 completion `prompt` 追加固定后缀 `Previous reply had no visible output. Please regenerate the visible final answer or tool call now.` 后重新提交一次。Go 主路径的非流式重试由 `completionruntime.ExecuteNonStreamWithRetry` 统一处理;流式重试由 `completionruntime.ExecuteStreamWithRetry` 统一处理,各协议 runtime 只负责消费/渲染本协议 SSE framing。重试遵循 DeepSeek 多轮对话协议:从第一次上游 SSE 流中提取 `response_message_id`,并在重试 payload 中设置 `parent_message_id` 为该值,使重试成为同一会话的后续轮次而非断裂的根消息;同时重新获取一次 PoW(若 PoW 获取失败则回退到原始 PoW)。该同账号重试不会重新标准化消息、不会新建 session,也不会向流式客户端插入重试标记;第二次 thinking / reasoning 会按正常增量直接接到第一次之后,并继续使用 overlap trim 去重。若同账号补偿重试后即将返回 429 `upstream_empty_output`,并且当前是托管账号模式,Go 主路径会在返回 429 前切换到下一个可用账号,新建 `chat_session_id`,使用原始 completion payload 再做一次 fresh retry;该切号重试不携带空回复 prompt 后缀,也不设置上一账号的 `parent_message_id`。如果没有可切换账号,或切号后的 fresh retry 仍没有可见正文或工具调用,则继续按原错误返回:无任何输出为 503 `upstream_unavailable`,有 reasoning 但没有可见正文或工具调用为 429 `upstream_empty_output`。若任一尝试触发空 `content_filter`,不做补偿重试并保持 `content_filter` 错误。JS Vercel 运行时同样设置 `parent_message_id`,但因无法直接调用 PoW API 而复用原始 PoW;切号 fresh retry 目前由 Go 主路径提供。
|
||||
|
||||
- 非流式 OpenAI Chat / Responses、Claude Messages、Gemini generateContent 在最终可见正文渲染阶段,会把 DeepSeek 搜索返回中的 `[citation:N]` / `[reference:N]` 标记替换成对应 Markdown 链接。`citation` 标记按一基序号解析;`reference` 标记只有在同一段正文中出现 `[reference:0]`(允许冒号后有空格)时才按零基序号映射,并且不会影响同段正文里的 `citation` 标记。
|
||||
- 流式输出仍默认隐藏 `[citation:N]` / `[reference:N]` 这类上游内部标记,避免分片输出中泄漏尚未完成映射的引用占位符。
|
||||
@@ -167,14 +167,15 @@ OpenAI Chat / Responses 在标准化后、current input file 之前,会默认
|
||||
3. 再附上统一的 DSML tool call 外壳格式约束。
|
||||
4. 把这整段内容并入 system prompt。
|
||||
|
||||
工具调用正例现在优先示范官方 DSML 风格:`<|DSML|tool_calls>` → `<|DSML|invoke name="...">` → `<|DSML|parameter name="...">`。
|
||||
兼容层仍接受旧式纯 `<tool_calls>` wrapper,并会容错若干 DSML 标签变体,包括短横线形式 `<dsml-tool-calls>` / `<dsml-invoke>` / `<dsml-parameter>`;但提示词会优先要求模型输出官方 DSML 标签,并强调不能只输出 closing wrapper 而漏掉 opening tag。需要注意:这是“兼容 DSML 外壳,内部仍以 XML 解析语义为准”,不是原生 DSML 全链路实现;DSML 标签会在解析入口归一化回现有 XML 标签后继续走同一套 parser。
|
||||
工具调用正例现在优先示范全角分隔符 DSML 风格:`<|DSML|tool_calls>` → `<|DSML|invoke name="...">` → `<|DSML|parameter name="...">`。
|
||||
兼容层仍接受旧式纯 `<tool_calls>` wrapper,并会容错若干 DSML 标签变体,包括短横线形式 `<dsml-tool-calls>` / `<dsml-invoke>` / `<dsml-parameter>`、下划线形式 `<dsml_tool_calls>` / `<dsml_invoke>` / `<dsml_parameter>`,以及其他前缀分隔形态如 `<vendor|tool_calls>` / `<vendor_tool_calls>` / `<vendor - tool_calls>`;标签壳扫描还会把全角 ASCII 漂移归一化,例如 `<dSML|tool_calls>` 与全角 `>` 结束符,也会容错 CJK 尖括号和属性尾部分隔符漂移,例如 `<DSM|parameter name="command"|>...〈/DSM|parameter〉`。更一般地,Go / Node tag 扫描以固定本地标签名 `tool_calls` / `invoke` / `parameter` 为准,标签名前任意协议前缀壳都会在解析入口剥离,例如 `<DSML␂tool_calls>`、`<proto💥tool_calls>` 这类控制符或非 ASCII 分隔符漂移也会归一化回现有 XML 标签后继续走同一套 parser。但提示词会优先要求模型输出官方 DSML 标签,并强调不能只输出 closing wrapper 而漏掉 opening tag。需要注意:这是“兼容 DSML 外壳,内部仍以 XML 解析语义为准”,不是原生 DSML 全链路实现。解析器会先截获非代码块中的疑似工具 wrapper,完整解析失败或工具语义无效时再按普通文本放行。
|
||||
数组参数使用 `<item>...</item>` 子节点表示;当某个参数体只包含 item 子节点时,Go / Node 解析器会把它还原成数组,避免 `questions` / `options` 这类 schema 中要求 array 的参数被误解析成 `{ "item": ... }` 对象。除此之外,解析器还会回收一些更松散的列表写法,例如 JSON array 字面量或逗号分隔的 JSON 项序列,只要它们足够明确;但 `<item>` 仍然是首选形态。若模型把完整结构化 XML fragment 误包进 CDATA,兼容层会在保护 `content` / `command` 等原文字段的前提下,尝试把非原文字段中的 CDATA XML fragment 还原成 object / array。不过,如果 CDATA 只是单个平面的 XML/HTML 标签,例如 `<b>urgent</b>` 这种行内标记,兼容层会保留原始字符串,不会强行升成 object / array;只有明显表示结构的 CDATA 片段,例如多兄弟节点、嵌套子节点或 `item` 列表,才会触发结构化恢复。对 `command` / `content` 等长文本参数,CDATA 内部的 Markdown fenced DSML / XML 示例会作为原文保护;示例里的 `]]></parameter>` 或 `</tool_calls>` 不会截断外层工具调用,解析器会继续等待围栏外真正的参数 / wrapper 结束标签。
|
||||
Go 侧读取 DeepSeek SSE 时不再依赖 `bufio.Scanner` 的固定 2MiB 单行上限;当写文件类工具把很长的 `content` 放在单个 `data:` 行里返回时,非流式收集、流式解析和 auto-continue 透传都会保留完整行,再进入同一套工具解析与序列化流程。
|
||||
在 assistant 最终回包阶段,如果某个 tool 参数在声明 schema 中明确是 `string`,兼容层会在把解析后的 `tool_calls` / `function_call` 重新序列化成 OpenAI / Responses / Claude 可见参数前,递归把该路径上的 number / bool / object / array 统一转成字符串;其中 object / array 会压成紧凑 JSON 字符串。这个保护只对 schema 明确声明为 string 的路径生效,不会改写本来就是 `number` / `boolean` / `object` / `array` 的参数。这样可以兼容 DeepSeek 输出了结构化片段、但上游客户端工具 schema 又严格要求字符串参数的场景(例如 `content`、`prompt`、`path`、`taskId` 等)。
|
||||
工具 schema 的权威来源始终是**当前请求实际携带的 schema**,而不是同名工具在其他 runtime(Claude Code / OpenCode / Codex 等)里的默认印象。兼容层现在会同时兼容 OpenAI 风格 `function.parameters`、直接工具对象上的 `parameters` / `input_schema`、以及 camelCase 的 `inputSchema` / `schema`,并在最终输出阶段按这份请求内 schema 决定是保留 array/object,还是仅对明确声明为 `string` 的路径做字符串化。该规则同样适用于 Claude 的流式收尾和 Vercel Node 流式 tool-call formatter,避免不同 runtime 因 schema shape 差异而出现同名工具参数类型漂移。
|
||||
正例中的工具名只会来自当前请求实际声明的工具;如果当前请求没有足够的已知工具形态,就省略对应的单工具、多工具或嵌套示例,避免把不可用工具名写进 prompt。
|
||||
对执行类工具,脚本内容必须进入执行参数本身:`Bash` / `execute_command` 使用 `command`,`exec_command` 使用 `cmd`;不要把脚本示范成 `path` / `content` 文件写入参数。
|
||||
工具提示词也会明确要求模型按本次调用实际需要填写参数,禁止输出 placeholder、空字符串或纯空白参数;如果必填参数未知,应先追问用户或正常文字回复,而不是输出空工具壳。对 `Bash` / `execute_command` 这类 shell 工具,命令或脚本必须写入 `command` 参数。解析层仍会把空字符串参数结构化返回;是否拒绝空 `command` 由后续工具执行侧 / 客户端 schema 校验决定。
|
||||
如果当前请求声明了 `Read` / `read_file` 这类读取工具,兼容层会额外注入一条 read-tool cache guard:当读取结果只表示“文件未变更 / 已在历史中 / 请引用先前上下文 / 没有正文内容”时,模型必须把它视为内容不可用,不能反复调用同一个无正文读取;应改为请求完整正文读取能力,或向用户说明需要重新提供文件内容。这个约束只缓解客户端缓存返回空内容导致的死循环,DS2API 不会也无法凭空恢复客户端本地文件正文。
|
||||
|
||||
OpenAI 路径实现:
|
||||
@@ -205,16 +206,20 @@ assistant 的 reasoning 会变成一个显式标签块:
|
||||
|
||||
然后再接可见回答正文。
|
||||
|
||||
对最终返回给客户端的 assistant 轮次,reasoning 不会因为本轮输出了工具调用而被丢弃。OpenAI Chat 会在同一个 assistant message 上同时返回 `reasoning_content` 和 `tool_calls`;OpenAI Responses 会先返回一个包含 `reasoning` content 的 assistant message item,再返回后续 `function_call` item;Claude / Gemini 也会在各自原生 thinking / thought 结构后继续返回 tool_use / functionCall。
|
||||
|
||||
对进入后续 prompt / `DS2API_HISTORY.txt` 的历史轮次,兼容层也会把同一轮工具调用前的 reasoning 绑定到 assistant tool call 历史上。OpenAI Chat 原生 `reasoning_content + tool_calls` 会直接保留;OpenAI Responses 若以 `reasoning` message item 后接 `function_call` item 的形式回放历史,会在归一化时合并为同一个 assistant 历史块;Claude 的 `thinking` block 会绑定到后续 `tool_use`;Gemini 的 `thought: true` part 会绑定到后续 `functionCall`。最终 prompt 中的顺序固定为 `[reasoning_content]...[/reasoning_content]`,再接 DSML tool call 外壳。
|
||||
|
||||
### 7.2 历史 tool_calls 保留方式
|
||||
|
||||
assistant 历史 `tool_calls` 不会保留成 OpenAI 原生 JSON,而会转成 prompt 可见的 DSML 外壳:
|
||||
|
||||
```xml
|
||||
<|DSML|tool_calls>
|
||||
<|DSML|invoke name="read_file">
|
||||
<|DSML|parameter name="path"><![CDATA[src/main.go]]></|DSML|parameter>
|
||||
</|DSML|invoke>
|
||||
</|DSML|tool_calls>
|
||||
<|DSML|tool_calls>
|
||||
<|DSML|invoke name="read_file">
|
||||
<|DSML|parameter name="path"><![CDATA[src/main.go]]></|DSML|parameter>
|
||||
</|DSML|invoke>
|
||||
</|DSML|tool_calls>
|
||||
```
|
||||
|
||||
解析层同时兼容旧式纯 XML 形态:`<tool_calls>` / `<invoke>` / `<parameter>`。两者都会先归一到现有 XML 解析语义;其他旧格式都会作为普通文本保留,不会作为可执行调用语法。
|
||||
@@ -419,7 +424,8 @@ Prior conversation history and tool progress.
|
||||
如果改的是 tool call 相关兼容语义,还应同时检查:
|
||||
|
||||
- `go test ./internal/toolcall/...`
|
||||
- `node --test tests/node/stream-tool-sieve.test.js`
|
||||
- `go test ./internal/toolstream/...`
|
||||
- `./tests/scripts/run-unit-node.sh`
|
||||
|
||||
## 14. 文档同步约定
|
||||
|
||||
|
||||
@@ -6,14 +6,14 @@
|
||||
|
||||
## 1) 当前可执行格式
|
||||
|
||||
当前版本推荐模型输出 DSML 外壳:
|
||||
当前版本推荐模型输出全角分隔符 DSML 外壳:
|
||||
|
||||
```xml
|
||||
<|DSML|tool_calls>
|
||||
<|DSML|invoke name="read_file">
|
||||
<|DSML|parameter name="path"><![CDATA[README.MD]]></|DSML|parameter>
|
||||
</|DSML|invoke>
|
||||
</|DSML|tool_calls>
|
||||
<|DSML|tool_calls>
|
||||
<|DSML|invoke name="read_file">
|
||||
<|DSML|parameter name="path"><![CDATA[README.MD]]></|DSML|parameter>
|
||||
</|DSML|invoke>
|
||||
</|DSML|tool_calls>
|
||||
```
|
||||
|
||||
兼容层仍接受旧式 canonical XML:
|
||||
@@ -30,17 +30,17 @@
|
||||
|
||||
约束:
|
||||
|
||||
- 必须有 `<|DSML|tool_calls>...</|DSML|tool_calls>` 或 `<tool_calls>...</tool_calls>` wrapper
|
||||
- 每个调用必须在 `<|DSML|invoke name="...">...</|DSML|invoke>` 或 `<invoke name="...">...</invoke>` 内
|
||||
- 必须有 `<|DSML|tool_calls>...</|DSML|tool_calls>` 或 `<tool_calls>...</tool_calls>` wrapper
|
||||
- 每个调用必须在 `<|DSML|invoke name="...">...</|DSML|invoke>` 或 `<invoke name="...">...</invoke>` 内
|
||||
- 工具名必须放在 `invoke` 的 `name` 属性
|
||||
- 参数必须使用 `<|DSML|parameter name="...">...</|DSML|parameter>` 或 `<parameter name="...">...</parameter>`
|
||||
- 参数必须使用 `<|DSML|parameter name="...">...</|DSML|parameter>` 或 `<parameter name="...">...</parameter>`
|
||||
- 同一个工具块内不要混用 DSML 标签和旧 XML 工具标签;混搭会被视为非法工具块
|
||||
|
||||
兼容修复:
|
||||
|
||||
- 如果模型漏掉 opening wrapper,但后面仍输出了一个或多个 invoke 并以 closing wrapper 收尾,Go 解析链路会在解析前补回缺失的 opening wrapper。
|
||||
- Go / Node 解析层不再枚举每一种 DSML typo。它会把工具标签名前的 `DSML`、管道符 `|` / `|`、空白、重复 leading `<` 视为可容忍的协议噪声,然后只匹配固定本地标签名 `tool_calls` / `invoke` / `parameter`。例如 `<DSML|tool_calls>`、`<<|DSML|tool_calls>`、`<|DSML tool_calls>`、`<DSMLtool_calls>`、`<<DSML|DSML|tool_calls>` 都会归一化;相似但非固定标签名(如 `tool_calls_extra`)仍按普通文本处理。
|
||||
- 如果模型在固定工具标签名后多输出一个尾部管道符,例如 `<|DSML|tool_calls|` / `<|DSML|invoke|` / `<|DSML|parameter|`,兼容层会把这个尾部 `|` 当作异常标签终止符并补齐缺失的 `>`;如果后面已经有 `>`,也会消费这个多余 `|` 后再归一化。
|
||||
- Go / Node 解析层不再枚举每一种 DSML typo。它以固定本地标签名 `tool_calls` / `invoke` / `parameter` 为准,把标签名前的任意协议前缀壳视为可容忍噪声,并继续兼容管道符 `|` / `|`、空白、重复 leading `<`、可视控制符 `␂`、原始 STX `\x02`、非 ASCII 分隔符、CJK 尖括号 `〈` / `〉` 等漂移。例如 `<DSML|tool_calls>`、`<<|DSML|tool_calls>`、`<|DSML tool_calls>`、`<DSMLtool_calls>`、`<<DSML|DSML|tool_calls>`、`<DSML␂tool_calls>`、`<proto💥tool_calls>`、`<DSM|tool_calls>...〈/DSM|tool_calls〉` 都会归一化;相似但非固定标签名(如 `tool_calls_extra`)仍按普通文本处理。
|
||||
- 如果模型在固定工具标签名后多输出一个尾部管道符,例如 `<|DSML|tool_calls|` / `<|DSML|invoke|` / `<|DSML|parameter|`,或在带属性标签的结束符前多输出一个尾部管道符(如 `<DSM|parameter name="command"|>`),兼容层会把这个尾部 `|` / `|` 当作异常标签终止符并补齐或归一化;如果后面已经有 `>` / `〉`,也会消费这个多余分隔符后再归一化。
|
||||
- 这是一个针对常见模型失误的窄修复,不改变推荐输出格式;prompt 仍要求模型直接输出完整 DSML 外壳。
|
||||
- 裸 `<invoke ...>` / `<parameter ...>` 不会被当成“已支持的工具语法”;只有 `tool_calls` wrapper 或可修复的缺失 opening wrapper 才会进入工具调用路径。
|
||||
|
||||
@@ -54,7 +54,7 @@
|
||||
|
||||
在流式链路中(Go / Node 一致):
|
||||
|
||||
- DSML `<|DSML|tool_calls>` wrapper、短横线形式(如 `<dsml-tool-calls>` / `<dsml-invoke>` / `<dsml-parameter>`)、基于固定本地标签名的 DSML 噪声容错形态、尾部管道符形态(如 `<|DSML|tool_calls|`)和 canonical `<tool_calls>` wrapper 都会进入结构化捕获
|
||||
- DSML `<|DSML|tool_calls>` wrapper、短横线形式(如 `<dsml-tool-calls>` / `<dsml-invoke>` / `<dsml-parameter>`)、基于固定本地标签名的 DSML 噪声容错形态、尾部管道符形态(如 `<|DSML|tool_calls|`)和 canonical `<tool_calls>` wrapper 都会进入结构化捕获
|
||||
- 如果流里直接从 invoke 开始,但后面补上了 closing wrapper,Go 流式筛分也会按缺失 opening wrapper 的修复路径尝试恢复
|
||||
- 已识别成功的工具调用不会再次回流到普通文本
|
||||
- 不符合新格式的块不会执行,并继续按原样文本透传
|
||||
@@ -78,11 +78,16 @@
|
||||
- `rejectedByPolicy`:当前固定为 `false`
|
||||
- `rejectedToolNames`:当前固定为空数组
|
||||
|
||||
解析层不会因为参数值为空而丢弃工具调用。若模型输出了显式空字符串或纯空白参数,它们会按空字符串进入结构化 `tool_calls`;是否拒绝缺参或空命令应由后续工具执行侧 / 客户端 schema 校验决定。Prompt 层仍会要求模型不要主动输出空参数。
|
||||
|
||||
完整的 DSML / XML wrapper 只有在成功解析出有效 `invoke name`,并且参数节点(如存在)符合 `parameter` 语义后,才会变成结构化工具调用;真正的零参数工具调用仍然有效。如果 wrapper 完整但内部不是可执行工具调用形态(例如使用 `<param>`、缺少有效 `invoke name`、或其他 malformed XML 工具壳),流式 sieve 会把原始 wrapper 作为普通文本释放,不会吞掉内容,也不会生成空的工具调用。
|
||||
|
||||
## 5) 落地建议
|
||||
|
||||
1. Prompt 里只示范 DSML 外壳语法。
|
||||
2. 上游客户端应直接输出完整 DSML 外壳;DS2API 兼容旧式 canonical XML,并只对“closing tag 在、opening tag 漏掉”的常见失误做窄修复,不会泛化接受其他旧格式。
|
||||
3. 不要依赖 parser 做安全控制;执行器侧仍应做工具名和参数校验。
|
||||
3. 模型只有在知道本次调用所需参数值时才应输出工具调用;不要输出 placeholder、空字符串或纯空白参数。对 `Bash` / `execute_command`,实际命令必须在 `command` 参数里。
|
||||
4. 不要依赖 parser 做安全控制;执行器侧仍应做工具名和参数校验。
|
||||
|
||||
## 6) 回归验证
|
||||
|
||||
@@ -90,17 +95,18 @@
|
||||
|
||||
```bash
|
||||
go test -v -run 'TestParseToolCalls|TestProcessToolSieve' ./internal/toolcall ./internal/toolstream ./internal/httpapi/openai/...
|
||||
node --test tests/node/stream-tool-sieve.test.js
|
||||
./tests/scripts/run-unit-node.sh
|
||||
```
|
||||
|
||||
重点覆盖:
|
||||
|
||||
- DSML `<|DSML|tool_calls>` wrapper 正常解析
|
||||
- DSML `<|DSML|tool_calls>` wrapper 正常解析
|
||||
- legacy canonical `<tool_calls>` wrapper 正常解析
|
||||
- 固定本地标签名的 DSML 噪声容错形态(如 `<DSML|tool_calls>`、`<<|DSML|tool_calls>`、`<|DSML tool_calls>`、`<DSMLtool_calls>`、`<<DSML|DSML|tool_calls>`)正常解析
|
||||
- 固定本地标签名的 DSML 噪声容错形态(如 `<DSML|tool_calls>`、`<<|DSML|tool_calls>`、`<|DSML tool_calls>`、`<DSMLtool_calls>`、`<<DSML|DSML|tool_calls>`、`<DSM|tool_calls>...〈/DSM|tool_calls〉`)正常解析
|
||||
- 混搭标签(DSML wrapper + canonical inner)归一化后正常解析
|
||||
- 波浪线围栏 `~~~` 内的示例不执行
|
||||
- 嵌套围栏(4 反引号嵌套 3 反引号)内的示例不执行
|
||||
- 文本 mention 标签名后紧跟真正工具调用的场景(含同一 wrapper 变体)
|
||||
- 空参数结构化保留,malformed executable-looking XML wrapper 作为文本释放
|
||||
- 非兼容内容按普通文本透传
|
||||
- 代码块示例不执行
|
||||
|
||||
@@ -218,7 +218,7 @@ func UpstreamEmptyOutputDetail(contentFilter bool, text, thinking string) (int,
|
||||
if strings.TrimSpace(thinking) != "" {
|
||||
return http.StatusTooManyRequests, "Upstream account hit a rate limit and returned reasoning without visible output.", "upstream_empty_output"
|
||||
}
|
||||
return http.StatusTooManyRequests, "Upstream account hit a rate limit and returned empty output.", "upstream_empty_output"
|
||||
return http.StatusServiceUnavailable, "Upstream service is unavailable and returned no output.", "upstream_unavailable"
|
||||
}
|
||||
|
||||
// ShouldRetryEmptyOutput returns true when the turn produced no visible text
|
||||
|
||||
@@ -1,6 +1,7 @@
|
||||
package assistantturn
|
||||
|
||||
import (
|
||||
"net/http"
|
||||
"testing"
|
||||
|
||||
"ds2api/internal/promptcompat"
|
||||
@@ -70,6 +71,13 @@ func TestBuildTurnFromCollectedThinkingOnlyIsEmptyOutput(t *testing.T) {
|
||||
}
|
||||
}
|
||||
|
||||
func TestBuildTurnFromCollectedPureEmptyOutputIsUpstreamUnavailable(t *testing.T) {
|
||||
turn := BuildTurnFromCollected(sse.CollectResult{}, BuildOptions{})
|
||||
if turn.Error == nil || turn.Error.Status != http.StatusServiceUnavailable || turn.Error.Code != "upstream_unavailable" {
|
||||
t.Fatalf("expected upstream unavailable error, got %#v", turn.Error)
|
||||
}
|
||||
}
|
||||
|
||||
func TestBuildTurnFromCollectedToolChoiceRequired(t *testing.T) {
|
||||
turn := BuildTurnFromCollected(sse.CollectResult{Text: "hello"}, BuildOptions{
|
||||
ToolChoice: promptcompat.ToolChoicePolicy{Mode: promptcompat.ToolChoiceRequired},
|
||||
|
||||
@@ -241,6 +241,36 @@ func TestSwitchAccountSkipsLoginFailureAndContinues(t *testing.T) {
|
||||
}
|
||||
}
|
||||
|
||||
func TestSwitchAccountRespectsPinnedTargetAccount(t *testing.T) {
|
||||
t.Setenv("DS2API_CONFIG_JSON", `{
|
||||
"keys":["managed-key"],
|
||||
"accounts":[
|
||||
{"email":"acc1@test.com","token":"t1"},
|
||||
{"email":"acc2@test.com","token":"t2"}
|
||||
]
|
||||
}`)
|
||||
store := config.LoadStore()
|
||||
pool := account.NewPool(store)
|
||||
r := NewResolver(store, pool, func(_ context.Context, _ config.Account) (string, error) {
|
||||
return "new-token", nil
|
||||
})
|
||||
|
||||
req, _ := http.NewRequest("POST", "/", nil)
|
||||
req.Header.Set("Authorization", "Bearer managed-key")
|
||||
req.Header.Set("X-Ds2-Target-Account", "acc1@test.com")
|
||||
a, err := r.Determine(req)
|
||||
if err != nil {
|
||||
t.Fatalf("determine failed: %v", err)
|
||||
}
|
||||
defer r.Release(a)
|
||||
if r.SwitchAccount(context.Background(), a) {
|
||||
t.Fatal("expected switch to be disabled for pinned target account")
|
||||
}
|
||||
if a.AccountID != "acc1@test.com" {
|
||||
t.Fatalf("expected pinned account to remain selected, got %q", a.AccountID)
|
||||
}
|
||||
}
|
||||
|
||||
// ─── Release edge cases ─────────────────────────────────────────────
|
||||
|
||||
func TestReleaseNilAuth(t *testing.T) {
|
||||
|
||||
@@ -28,6 +28,7 @@ type RequestAuth struct {
|
||||
DeepSeekToken string
|
||||
CallerID string
|
||||
AccountID string
|
||||
TargetAccount string
|
||||
Account config.Account
|
||||
TriedAccounts map[string]bool
|
||||
resolver *Resolver
|
||||
@@ -99,6 +100,7 @@ func (r *Resolver) acquireManagedRequestAuth(ctx context.Context, callerID, targ
|
||||
UseConfigToken: true,
|
||||
CallerID: callerID,
|
||||
AccountID: acc.Identifier(),
|
||||
TargetAccount: target,
|
||||
Account: acc,
|
||||
TriedAccounts: tried,
|
||||
resolver: r,
|
||||
@@ -185,6 +187,9 @@ func (r *Resolver) SwitchAccount(ctx context.Context, a *RequestAuth) bool {
|
||||
if !a.UseConfigToken {
|
||||
return false
|
||||
}
|
||||
if strings.TrimSpace(a.TargetAccount) != "" {
|
||||
return false
|
||||
}
|
||||
if a.TriedAccounts == nil {
|
||||
a.TriedAccounts = map[string]bool{}
|
||||
}
|
||||
@@ -208,6 +213,13 @@ func (r *Resolver) SwitchAccount(ctx context.Context, a *RequestAuth) bool {
|
||||
}
|
||||
}
|
||||
|
||||
func (a *RequestAuth) SwitchAccount(ctx context.Context) bool {
|
||||
if a == nil || a.resolver == nil {
|
||||
return false
|
||||
}
|
||||
return a.resolver.SwitchAccount(ctx, a)
|
||||
}
|
||||
|
||||
func (r *Resolver) Release(a *RequestAuth) {
|
||||
if a == nil || !a.UseConfigToken || a.AccountID == "" {
|
||||
return
|
||||
|
||||
@@ -90,7 +90,11 @@ func ExecuteNonStreamWithRetry(ctx context.Context, ds DeepSeekCaller, a *auth.R
|
||||
if startErr != nil {
|
||||
return NonStreamResult{SessionID: start.SessionID, Payload: start.Payload}, startErr
|
||||
}
|
||||
stdReq = start.Request
|
||||
return ExecuteNonStreamStartedWithRetry(ctx, ds, a, start, opts)
|
||||
}
|
||||
|
||||
func ExecuteNonStreamStartedWithRetry(ctx context.Context, ds DeepSeekCaller, a *auth.RequestAuth, start StartResult, opts Options) (NonStreamResult, *assistantturn.OutputError) {
|
||||
stdReq := start.Request
|
||||
maxAttempts := opts.MaxAttempts
|
||||
if maxAttempts <= 0 {
|
||||
maxAttempts = 3
|
||||
@@ -100,6 +104,7 @@ func ExecuteNonStreamWithRetry(ctx context.Context, ds DeepSeekCaller, a *auth.R
|
||||
pow := start.Pow
|
||||
|
||||
attempts := 0
|
||||
accountSwitchAttempted := false
|
||||
currentResp := start.Response
|
||||
usagePrompt := stdReq.PromptTokenText
|
||||
accumulatedThinking := ""
|
||||
@@ -108,6 +113,24 @@ func ExecuteNonStreamWithRetry(ctx context.Context, ds DeepSeekCaller, a *auth.R
|
||||
for {
|
||||
turn, outErr := collectAttempt(currentResp, stdReq, usagePrompt, opts)
|
||||
if outErr != nil {
|
||||
if canRetryOnAlternateAccount(ctx, a, outErr, opts.RetryEnabled, &accountSwitchAttempted) {
|
||||
switched, switchErr := startStandardCompletionOnAlternateAccount(ctx, ds, a, stdReq, maxAttempts)
|
||||
if switchErr != nil {
|
||||
return NonStreamResult{SessionID: sessionID, Payload: payload, Attempts: attempts}, switchErr
|
||||
}
|
||||
if switched.Response != nil {
|
||||
config.Logger.Info("[completion_runtime_account_switch_retry] retrying after 429", "surface", stdReq.Surface, "stream", false, "account", a.AccountID)
|
||||
sessionID = switched.SessionID
|
||||
payload = switched.Payload
|
||||
pow = switched.Pow
|
||||
currentResp = switched.Response
|
||||
usagePrompt = stdReq.PromptTokenText
|
||||
accumulatedThinking = ""
|
||||
accumulatedRawThinking = ""
|
||||
accumulatedToolDetectionThinking = ""
|
||||
continue
|
||||
}
|
||||
}
|
||||
return NonStreamResult{SessionID: sessionID, Payload: payload, Attempts: attempts}, outErr
|
||||
}
|
||||
accumulatedThinking += sse.TrimContinuationOverlap(accumulatedThinking, turn.Thinking)
|
||||
@@ -130,6 +153,24 @@ func ExecuteNonStreamWithRetry(ctx context.Context, ds DeepSeekCaller, a *auth.R
|
||||
retryMax = shared.EmptyOutputRetryMaxAttempts()
|
||||
}
|
||||
if !opts.RetryEnabled || !assistantturn.ShouldRetryEmptyOutput(turn, attempts, retryMax) {
|
||||
if canRetryOnAlternateAccount(ctx, a, turn.Error, opts.RetryEnabled, &accountSwitchAttempted) {
|
||||
switched, switchErr := startStandardCompletionOnAlternateAccount(ctx, ds, a, stdReq, maxAttempts)
|
||||
if switchErr != nil {
|
||||
return NonStreamResult{SessionID: sessionID, Payload: payload, Turn: turn, Attempts: attempts}, switchErr
|
||||
}
|
||||
if switched.Response != nil {
|
||||
config.Logger.Info("[completion_runtime_account_switch_retry] retrying after 429", "surface", stdReq.Surface, "stream", false, "account", a.AccountID)
|
||||
sessionID = switched.SessionID
|
||||
payload = switched.Payload
|
||||
pow = switched.Pow
|
||||
currentResp = switched.Response
|
||||
usagePrompt = stdReq.PromptTokenText
|
||||
accumulatedThinking = ""
|
||||
accumulatedRawThinking = ""
|
||||
accumulatedToolDetectionThinking = ""
|
||||
continue
|
||||
}
|
||||
}
|
||||
return NonStreamResult{SessionID: sessionID, Payload: payload, Turn: turn, Attempts: attempts}, turn.Error
|
||||
}
|
||||
|
||||
@@ -150,6 +191,37 @@ func ExecuteNonStreamWithRetry(ctx context.Context, ds DeepSeekCaller, a *auth.R
|
||||
}
|
||||
}
|
||||
|
||||
func canRetryOnAlternateAccount(ctx context.Context, a *auth.RequestAuth, outErr *assistantturn.OutputError, retryEnabled bool, attempted *bool) bool {
|
||||
if outErr == nil || outErr.Status != http.StatusTooManyRequests {
|
||||
return false
|
||||
}
|
||||
if !retryEnabled || attempted == nil || *attempted {
|
||||
return false
|
||||
}
|
||||
if a == nil || !a.UseConfigToken {
|
||||
return false
|
||||
}
|
||||
*attempted = true
|
||||
return a.SwitchAccount(ctx)
|
||||
}
|
||||
|
||||
func startStandardCompletionOnAlternateAccount(ctx context.Context, ds DeepSeekCaller, a *auth.RequestAuth, stdReq promptcompat.StandardRequest, maxAttempts int) (StartResult, *assistantturn.OutputError) {
|
||||
sessionID, err := ds.CreateSession(ctx, a, maxAttempts)
|
||||
if err != nil {
|
||||
return StartResult{}, authOutputError(a)
|
||||
}
|
||||
pow, err := ds.GetPow(ctx, a, maxAttempts)
|
||||
if err != nil {
|
||||
return StartResult{SessionID: sessionID}, &assistantturn.OutputError{Status: http.StatusUnauthorized, Message: "Failed to get PoW (invalid token or unknown error).", Code: "error"}
|
||||
}
|
||||
payload := stdReq.CompletionPayload(sessionID)
|
||||
resp, err := ds.CallCompletion(ctx, a, payload, pow, maxAttempts)
|
||||
if err != nil {
|
||||
return StartResult{SessionID: sessionID, Payload: payload, Pow: pow}, &assistantturn.OutputError{Status: http.StatusInternalServerError, Message: "Failed to get completion.", Code: "error"}
|
||||
}
|
||||
return StartResult{SessionID: sessionID, Payload: payload, Pow: pow, Response: resp, Request: stdReq}, nil
|
||||
}
|
||||
|
||||
func collectAttempt(resp *http.Response, stdReq promptcompat.StandardRequest, usagePrompt string, opts Options) (assistantturn.Turn, *assistantturn.OutputError) {
|
||||
defer func() {
|
||||
if err := resp.Body.Close(); err != nil {
|
||||
|
||||
@@ -7,15 +7,19 @@ import (
|
||||
"strings"
|
||||
"testing"
|
||||
|
||||
"ds2api/internal/account"
|
||||
"ds2api/internal/auth"
|
||||
"ds2api/internal/config"
|
||||
dsclient "ds2api/internal/deepseek/client"
|
||||
"ds2api/internal/promptcompat"
|
||||
)
|
||||
|
||||
type fakeDeepSeekCaller struct {
|
||||
responses []*http.Response
|
||||
payloads []map[string]any
|
||||
uploads []dsclient.UploadFileRequest
|
||||
responses []*http.Response
|
||||
payloads []map[string]any
|
||||
uploads []dsclient.UploadFileRequest
|
||||
completionAccounts []string
|
||||
sessionByAccount bool
|
||||
}
|
||||
|
||||
type currentInputRuntimeConfig struct{}
|
||||
@@ -23,7 +27,10 @@ type currentInputRuntimeConfig struct{}
|
||||
func (currentInputRuntimeConfig) CurrentInputFileEnabled() bool { return true }
|
||||
func (currentInputRuntimeConfig) CurrentInputFileMinChars() int { return 0 }
|
||||
|
||||
func (f *fakeDeepSeekCaller) CreateSession(context.Context, *auth.RequestAuth, int) (string, error) {
|
||||
func (f *fakeDeepSeekCaller) CreateSession(_ context.Context, a *auth.RequestAuth, _ int) (string, error) {
|
||||
if f.sessionByAccount && a != nil && a.AccountID != "" {
|
||||
return "session-" + a.AccountID, nil
|
||||
}
|
||||
return "session-1", nil
|
||||
}
|
||||
|
||||
@@ -36,8 +43,11 @@ func (f *fakeDeepSeekCaller) UploadFile(_ context.Context, _ *auth.RequestAuth,
|
||||
return &dsclient.UploadFileResult{ID: "file-runtime-1"}, nil
|
||||
}
|
||||
|
||||
func (f *fakeDeepSeekCaller) CallCompletion(_ context.Context, _ *auth.RequestAuth, payload map[string]any, _ string, _ int) (*http.Response, error) {
|
||||
func (f *fakeDeepSeekCaller) CallCompletion(_ context.Context, a *auth.RequestAuth, payload map[string]any, _ string, _ int) (*http.Response, error) {
|
||||
f.payloads = append(f.payloads, payload)
|
||||
if a != nil {
|
||||
f.completionAccounts = append(f.completionAccounts, a.AccountID)
|
||||
}
|
||||
if len(f.responses) == 0 {
|
||||
return sseHTTPResponse(http.StatusOK, `data: {"p":"response/content","v":"fallback"}`), nil
|
||||
}
|
||||
@@ -89,9 +99,72 @@ func TestExecuteNonStreamWithRetryBuildsCanonicalTurn(t *testing.T) {
|
||||
}
|
||||
}
|
||||
|
||||
func TestExecuteNonStreamWithRetrySwitchesManagedAccountBeforeFinal429(t *testing.T) {
|
||||
t.Setenv("DS2API_CONFIG_JSON", `{
|
||||
"keys":["managed-key"],
|
||||
"accounts":[
|
||||
{"email":"acc1@test.com","password":"pwd"},
|
||||
{"email":"acc2@test.com","password":"pwd"}
|
||||
]
|
||||
}`)
|
||||
store := config.LoadStore()
|
||||
resolver := auth.NewResolver(store, account.NewPool(store), func(_ context.Context, acc config.Account) (string, error) {
|
||||
return "token-" + acc.Identifier(), nil
|
||||
})
|
||||
req, _ := http.NewRequest(http.MethodPost, "/", nil)
|
||||
req.Header.Set("Authorization", "Bearer managed-key")
|
||||
a, err := resolver.Determine(req)
|
||||
if err != nil {
|
||||
t.Fatalf("determine failed: %v", err)
|
||||
}
|
||||
defer resolver.Release(a)
|
||||
|
||||
ds := &fakeDeepSeekCaller{
|
||||
sessionByAccount: true,
|
||||
responses: []*http.Response{
|
||||
sseHTTPResponse(http.StatusOK, `data: {"response_message_id":11,"p":"response/thinking_content","v":"first empty"}`),
|
||||
sseHTTPResponse(http.StatusOK, `data: {"response_message_id":12,"p":"response/thinking_content","v":"retry empty"}`),
|
||||
sseHTTPResponse(http.StatusOK, `data: {"response_message_id":21,"p":"response/content","v":"ok from second account"}`),
|
||||
},
|
||||
}
|
||||
stdReq := promptcompat.StandardRequest{
|
||||
Surface: "test",
|
||||
ResponseModel: "deepseek-v4-flash",
|
||||
PromptTokenText: "prompt",
|
||||
FinalPrompt: "final prompt",
|
||||
Thinking: true,
|
||||
}
|
||||
|
||||
result, outErr := ExecuteNonStreamWithRetry(context.Background(), ds, a, stdReq, Options{RetryEnabled: true})
|
||||
if outErr != nil {
|
||||
t.Fatalf("unexpected output error after account switch retry: %#v", outErr)
|
||||
}
|
||||
if result.Turn.Text != "ok from second account" {
|
||||
t.Fatalf("text mismatch after switch retry: %q", result.Turn.Text)
|
||||
}
|
||||
if result.SessionID != "session-acc2@test.com" {
|
||||
t.Fatalf("expected switched account session, got %q", result.SessionID)
|
||||
}
|
||||
wantAccounts := []string{"acc1@test.com", "acc1@test.com", "acc2@test.com"}
|
||||
if len(ds.completionAccounts) != len(wantAccounts) {
|
||||
t.Fatalf("completion account count mismatch: got %v want %v", ds.completionAccounts, wantAccounts)
|
||||
}
|
||||
for i, want := range wantAccounts {
|
||||
if ds.completionAccounts[i] != want {
|
||||
t.Fatalf("completion account %d = %q want %q (all=%v)", i, ds.completionAccounts[i], want, ds.completionAccounts)
|
||||
}
|
||||
}
|
||||
if got := ds.payloads[2]["chat_session_id"]; got != "session-acc2@test.com" {
|
||||
t.Fatalf("switched payload session mismatch: %#v", got)
|
||||
}
|
||||
if prompt, _ := ds.payloads[2]["prompt"].(string); strings.Contains(prompt, "Previous reply had no visible output") {
|
||||
t.Fatalf("expected fresh switched-account prompt without empty-output suffix, got %q", prompt)
|
||||
}
|
||||
}
|
||||
|
||||
func TestExecuteNonStreamWithRetryUsesParentMessageForEmptyRetry(t *testing.T) {
|
||||
ds := &fakeDeepSeekCaller{responses: []*http.Response{
|
||||
sseHTTPResponse(http.StatusOK, `data: {"response_message_id":77,"p":"response/status","v":"FINISHED"}`),
|
||||
sseHTTPResponse(http.StatusOK, `data: {"response_message_id":77,"p":"response/thinking_content","v":"plan"}`),
|
||||
sseHTTPResponse(http.StatusOK, `data: {"response_message_id":78,"p":"response/content","v":"ok"}`),
|
||||
}}
|
||||
stdReq := promptcompat.StandardRequest{
|
||||
|
||||
179
internal/completionruntime/stream_retry.go
Normal file
179
internal/completionruntime/stream_retry.go
Normal file
@@ -0,0 +1,179 @@
|
||||
package completionruntime
|
||||
|
||||
import (
|
||||
"context"
|
||||
"io"
|
||||
"net/http"
|
||||
"strings"
|
||||
|
||||
"ds2api/internal/assistantturn"
|
||||
"ds2api/internal/auth"
|
||||
"ds2api/internal/config"
|
||||
"ds2api/internal/httpapi/openai/shared"
|
||||
)
|
||||
|
||||
type StreamRetryOptions struct {
|
||||
Surface string
|
||||
Stream bool
|
||||
RetryEnabled bool
|
||||
RetryMaxAttempts int
|
||||
MaxAttempts int
|
||||
UsagePrompt string
|
||||
}
|
||||
|
||||
type StreamRetryHooks struct {
|
||||
ConsumeAttempt func(resp *http.Response, allowDeferEmpty bool) (terminalWritten bool, retryable bool)
|
||||
Finalize func(attempts int)
|
||||
ParentMessageID func() int
|
||||
OnRetry func(attempts int)
|
||||
OnRetryPrompt func(prompt string)
|
||||
OnRetryFailure func(status int, message, code string)
|
||||
OnAccountSwitch func(sessionID string)
|
||||
OnTerminal func(attempts int)
|
||||
}
|
||||
|
||||
func ExecuteStreamWithRetry(ctx context.Context, ds DeepSeekCaller, a *auth.RequestAuth, initialResp *http.Response, payload map[string]any, pow string, opts StreamRetryOptions, hooks StreamRetryHooks) {
|
||||
if hooks.ConsumeAttempt == nil {
|
||||
return
|
||||
}
|
||||
surface := strings.TrimSpace(opts.Surface)
|
||||
if surface == "" {
|
||||
surface = "completion"
|
||||
}
|
||||
maxAttempts := opts.MaxAttempts
|
||||
if maxAttempts <= 0 {
|
||||
maxAttempts = 3
|
||||
}
|
||||
retryMax := opts.RetryMaxAttempts
|
||||
if retryMax <= 0 {
|
||||
retryMax = shared.EmptyOutputRetryMaxAttempts()
|
||||
}
|
||||
|
||||
attempts := 0
|
||||
accountSwitchAttempted := false
|
||||
currentResp := initialResp
|
||||
currentPayload := clonePayload(payload)
|
||||
for {
|
||||
allowAccountSwitch := opts.RetryEnabled && attempts >= retryMax && !accountSwitchAttempted && a != nil && a.UseConfigToken
|
||||
terminalWritten, retryable := hooks.ConsumeAttempt(currentResp, opts.RetryEnabled && (attempts < retryMax || allowAccountSwitch))
|
||||
if terminalWritten {
|
||||
if hooks.OnTerminal != nil {
|
||||
hooks.OnTerminal(attempts)
|
||||
}
|
||||
return
|
||||
}
|
||||
if !retryable || !opts.RetryEnabled {
|
||||
if hooks.Finalize != nil {
|
||||
hooks.Finalize(attempts)
|
||||
}
|
||||
return
|
||||
}
|
||||
|
||||
if attempts >= retryMax {
|
||||
if canRetryOnAlternateAccount(ctx, a, &assistantturn.OutputError{Status: http.StatusTooManyRequests}, opts.RetryEnabled, &accountSwitchAttempted) {
|
||||
switched, switchErr := startPayloadCompletionOnAlternateAccount(ctx, ds, a, payload, maxAttempts)
|
||||
if switchErr != nil {
|
||||
if hooks.OnRetryFailure != nil {
|
||||
hooks.OnRetryFailure(switchErr.Status, switchErr.Message, switchErr.Code)
|
||||
}
|
||||
return
|
||||
}
|
||||
if switched.Response != nil {
|
||||
config.Logger.Info("[completion_runtime_account_switch_retry] retrying after 429", "surface", surface, "stream", opts.Stream, "account", a.AccountID)
|
||||
currentResp = switched.Response
|
||||
currentPayload = switched.Payload
|
||||
pow = switched.Pow
|
||||
if hooks.OnAccountSwitch != nil {
|
||||
hooks.OnAccountSwitch(switched.SessionID)
|
||||
}
|
||||
if hooks.OnRetryPrompt != nil {
|
||||
hooks.OnRetryPrompt(opts.UsagePrompt)
|
||||
}
|
||||
continue
|
||||
}
|
||||
}
|
||||
if hooks.Finalize != nil {
|
||||
hooks.Finalize(attempts)
|
||||
}
|
||||
return
|
||||
}
|
||||
|
||||
attempts++
|
||||
parentMessageID := 0
|
||||
if hooks.ParentMessageID != nil {
|
||||
parentMessageID = hooks.ParentMessageID()
|
||||
}
|
||||
config.Logger.Info("[completion_runtime_empty_retry] attempting synthetic retry", "surface", surface, "stream", opts.Stream, "retry_attempt", attempts, "parent_message_id", parentMessageID)
|
||||
retryPow, powErr := ds.GetPow(ctx, a, maxAttempts)
|
||||
if powErr != nil {
|
||||
config.Logger.Warn("[completion_runtime_empty_retry] retry PoW fetch failed, falling back to original PoW", "surface", surface, "stream", opts.Stream, "retry_attempt", attempts, "error", powErr)
|
||||
retryPow = pow
|
||||
}
|
||||
nextResp, err := ds.CallCompletion(ctx, a, shared.ClonePayloadForEmptyOutputRetry(currentPayload, parentMessageID), retryPow, maxAttempts)
|
||||
if err != nil {
|
||||
if hooks.OnRetryFailure != nil {
|
||||
hooks.OnRetryFailure(http.StatusInternalServerError, "Failed to get completion.", "error")
|
||||
}
|
||||
config.Logger.Warn("[completion_runtime_empty_retry] retry request failed", "surface", surface, "stream", opts.Stream, "retry_attempt", attempts, "error", err)
|
||||
return
|
||||
}
|
||||
if nextResp.StatusCode != http.StatusOK {
|
||||
body, readErr := io.ReadAll(nextResp.Body)
|
||||
if readErr != nil {
|
||||
config.Logger.Warn("[completion_runtime_empty_retry] retry error body read failed", "surface", surface, "stream", opts.Stream, "retry_attempt", attempts, "error", readErr)
|
||||
}
|
||||
closeRetryBody(surface, nextResp.Body)
|
||||
msg := strings.TrimSpace(string(body))
|
||||
if msg == "" {
|
||||
msg = http.StatusText(nextResp.StatusCode)
|
||||
}
|
||||
if hooks.OnRetryFailure != nil {
|
||||
hooks.OnRetryFailure(nextResp.StatusCode, msg, "error")
|
||||
}
|
||||
return
|
||||
}
|
||||
if hooks.OnRetry != nil {
|
||||
hooks.OnRetry(attempts)
|
||||
}
|
||||
if hooks.OnRetryPrompt != nil {
|
||||
hooks.OnRetryPrompt(shared.UsagePromptWithEmptyOutputRetry(opts.UsagePrompt, attempts))
|
||||
}
|
||||
currentResp = nextResp
|
||||
}
|
||||
}
|
||||
|
||||
func startPayloadCompletionOnAlternateAccount(ctx context.Context, ds DeepSeekCaller, a *auth.RequestAuth, payload map[string]any, maxAttempts int) (StartResult, *assistantturn.OutputError) {
|
||||
sessionID, err := ds.CreateSession(ctx, a, maxAttempts)
|
||||
if err != nil {
|
||||
return StartResult{}, authOutputError(a)
|
||||
}
|
||||
pow, err := ds.GetPow(ctx, a, maxAttempts)
|
||||
if err != nil {
|
||||
return StartResult{SessionID: sessionID}, &assistantturn.OutputError{Status: http.StatusUnauthorized, Message: "Failed to get PoW (invalid token or unknown error).", Code: "error"}
|
||||
}
|
||||
nextPayload := clonePayload(payload)
|
||||
nextPayload["chat_session_id"] = sessionID
|
||||
delete(nextPayload, "parent_message_id")
|
||||
resp, err := ds.CallCompletion(ctx, a, nextPayload, pow, maxAttempts)
|
||||
if err != nil {
|
||||
return StartResult{SessionID: sessionID, Payload: nextPayload, Pow: pow}, &assistantturn.OutputError{Status: http.StatusInternalServerError, Message: "Failed to get completion.", Code: "error"}
|
||||
}
|
||||
return StartResult{SessionID: sessionID, Payload: nextPayload, Pow: pow, Response: resp}, nil
|
||||
}
|
||||
|
||||
func clonePayload(payload map[string]any) map[string]any {
|
||||
clone := make(map[string]any, len(payload))
|
||||
for k, v := range payload {
|
||||
clone[k] = v
|
||||
}
|
||||
return clone
|
||||
}
|
||||
|
||||
func closeRetryBody(surface string, body io.Closer) {
|
||||
if body == nil {
|
||||
return
|
||||
}
|
||||
if err := body.Close(); err != nil {
|
||||
config.Logger.Warn("[completion_runtime_empty_retry] retry response body close failed", "surface", surface, "error", err)
|
||||
}
|
||||
}
|
||||
150
internal/completionruntime/stream_retry_test.go
Normal file
150
internal/completionruntime/stream_retry_test.go
Normal file
@@ -0,0 +1,150 @@
|
||||
package completionruntime
|
||||
|
||||
import (
|
||||
"context"
|
||||
"io"
|
||||
"net/http"
|
||||
"strings"
|
||||
"testing"
|
||||
|
||||
"ds2api/internal/account"
|
||||
"ds2api/internal/auth"
|
||||
"ds2api/internal/config"
|
||||
"ds2api/internal/httpapi/openai/shared"
|
||||
)
|
||||
|
||||
func TestExecuteStreamWithRetryUsesSharedRetryPayloadAndUsagePrompt(t *testing.T) {
|
||||
ds := &fakeDeepSeekCaller{responses: []*http.Response{
|
||||
sseHTTPResponse(http.StatusOK, `data: {"p":"response/content","v":"ok"}`),
|
||||
}}
|
||||
initial := sseHTTPResponse(http.StatusOK, `data: {"response_message_id":77,"p":"response/thinking_content","v":"plan"}`)
|
||||
payload := map[string]any{"prompt": "original prompt"}
|
||||
attemptsSeen := 0
|
||||
retryPrompt := ""
|
||||
|
||||
ExecuteStreamWithRetry(context.Background(), ds, &auth.RequestAuth{}, initial, payload, "pow", StreamRetryOptions{
|
||||
Surface: "test.stream",
|
||||
Stream: true,
|
||||
RetryEnabled: true,
|
||||
UsagePrompt: "original prompt",
|
||||
}, StreamRetryHooks{
|
||||
ConsumeAttempt: func(resp *http.Response, allowDeferEmpty bool) (bool, bool) {
|
||||
defer func() {
|
||||
if err := resp.Body.Close(); err != nil {
|
||||
t.Fatalf("close failed: %v", err)
|
||||
}
|
||||
}()
|
||||
_, _ = io.ReadAll(resp.Body)
|
||||
attemptsSeen++
|
||||
return attemptsSeen == 2, attemptsSeen == 1 && allowDeferEmpty
|
||||
},
|
||||
ParentMessageID: func() int {
|
||||
return 77
|
||||
},
|
||||
OnRetryPrompt: func(prompt string) {
|
||||
retryPrompt = prompt
|
||||
},
|
||||
})
|
||||
|
||||
if attemptsSeen != 2 {
|
||||
t.Fatalf("expected two stream attempts, got %d", attemptsSeen)
|
||||
}
|
||||
if len(ds.payloads) != 1 {
|
||||
t.Fatalf("expected one retry completion call, got %d", len(ds.payloads))
|
||||
}
|
||||
if got := ds.payloads[0]["parent_message_id"]; got != 77 {
|
||||
t.Fatalf("retry parent_message_id mismatch: %#v", got)
|
||||
}
|
||||
if prompt, _ := ds.payloads[0]["prompt"].(string); !strings.Contains(prompt, shared.EmptyOutputRetrySuffix) {
|
||||
t.Fatalf("expected retry suffix in payload prompt, got %q", prompt)
|
||||
}
|
||||
if !strings.Contains(retryPrompt, shared.EmptyOutputRetrySuffix) {
|
||||
t.Fatalf("expected retry suffix in usage prompt, got %q", retryPrompt)
|
||||
}
|
||||
}
|
||||
|
||||
func TestExecuteStreamWithRetrySwitchesManagedAccountBeforeFinal429(t *testing.T) {
|
||||
t.Setenv("DS2API_CONFIG_JSON", `{
|
||||
"keys":["managed-key"],
|
||||
"accounts":[
|
||||
{"email":"acc1@test.com","password":"pwd"},
|
||||
{"email":"acc2@test.com","password":"pwd"}
|
||||
]
|
||||
}`)
|
||||
store := config.LoadStore()
|
||||
resolver := auth.NewResolver(store, account.NewPool(store), func(_ context.Context, acc config.Account) (string, error) {
|
||||
return "token-" + acc.Identifier(), nil
|
||||
})
|
||||
req, _ := http.NewRequest(http.MethodPost, "/", nil)
|
||||
req.Header.Set("Authorization", "Bearer managed-key")
|
||||
a, err := resolver.Determine(req)
|
||||
if err != nil {
|
||||
t.Fatalf("determine failed: %v", err)
|
||||
}
|
||||
defer resolver.Release(a)
|
||||
|
||||
ds := &fakeDeepSeekCaller{
|
||||
sessionByAccount: true,
|
||||
responses: []*http.Response{
|
||||
sseHTTPResponse(http.StatusOK, `data: {"response_message_id":12,"p":"response/thinking_content","v":"retry empty"}`),
|
||||
sseHTTPResponse(http.StatusOK, `data: {"response_message_id":21,"p":"response/content","v":"ok from second account"}`),
|
||||
},
|
||||
}
|
||||
initial := sseHTTPResponse(http.StatusOK, `data: {"response_message_id":11,"p":"response/thinking_content","v":"first empty"}`)
|
||||
payload := map[string]any{"prompt": "original prompt", "chat_session_id": "session-acc1@test.com"}
|
||||
attemptsSeen := 0
|
||||
switchedSession := ""
|
||||
|
||||
ExecuteStreamWithRetry(context.Background(), ds, a, initial, payload, "pow", StreamRetryOptions{
|
||||
Surface: "test.stream",
|
||||
Stream: true,
|
||||
RetryEnabled: true,
|
||||
RetryMaxAttempts: 1,
|
||||
UsagePrompt: "original prompt",
|
||||
}, StreamRetryHooks{
|
||||
ConsumeAttempt: func(resp *http.Response, allowDeferEmpty bool) (bool, bool) {
|
||||
defer func() {
|
||||
if err := resp.Body.Close(); err != nil {
|
||||
t.Fatalf("close failed: %v", err)
|
||||
}
|
||||
}()
|
||||
body, _ := io.ReadAll(resp.Body)
|
||||
attemptsSeen++
|
||||
if strings.Contains(string(body), "ok from second account") {
|
||||
return true, false
|
||||
}
|
||||
if !allowDeferEmpty {
|
||||
t.Fatalf("expected empty attempt %d to be deferred before final 429", attemptsSeen)
|
||||
}
|
||||
return false, true
|
||||
},
|
||||
ParentMessageID: func() int {
|
||||
return 11 + attemptsSeen
|
||||
},
|
||||
OnAccountSwitch: func(sessionID string) {
|
||||
switchedSession = sessionID
|
||||
},
|
||||
})
|
||||
|
||||
if attemptsSeen != 3 {
|
||||
t.Fatalf("expected three stream attempts, got %d", attemptsSeen)
|
||||
}
|
||||
if switchedSession != "session-acc2@test.com" {
|
||||
t.Fatalf("expected switched session id, got %q", switchedSession)
|
||||
}
|
||||
wantAccounts := []string{"acc1@test.com", "acc2@test.com"}
|
||||
if len(ds.completionAccounts) != len(wantAccounts) {
|
||||
t.Fatalf("completion accounts mismatch: got %v want %v", ds.completionAccounts, wantAccounts)
|
||||
}
|
||||
for i, want := range wantAccounts {
|
||||
if ds.completionAccounts[i] != want {
|
||||
t.Fatalf("completion account %d = %q want %q (all=%v)", i, ds.completionAccounts[i], want, ds.completionAccounts)
|
||||
}
|
||||
}
|
||||
if got := ds.payloads[1]["chat_session_id"]; got != "session-acc2@test.com" {
|
||||
t.Fatalf("switched payload session mismatch: %#v", got)
|
||||
}
|
||||
if prompt, _ := ds.payloads[1]["prompt"].(string); strings.Contains(prompt, shared.EmptyOutputRetrySuffix) {
|
||||
t.Fatalf("expected switched-account prompt without empty-output suffix, got %q", prompt)
|
||||
}
|
||||
}
|
||||
@@ -21,6 +21,18 @@ func BuildResponseObjectWithToolCalls(responseID, model, finalPrompt, finalThink
|
||||
output := make([]any, 0, 2)
|
||||
if len(detected) > 0 {
|
||||
exposedOutputText = ""
|
||||
if strings.TrimSpace(finalThinking) != "" {
|
||||
output = append(output, map[string]any{
|
||||
"type": "message",
|
||||
"id": "msg_" + strings.ReplaceAll(uuid.NewString(), "-", ""),
|
||||
"role": "assistant",
|
||||
"status": "completed",
|
||||
"content": []any{map[string]any{
|
||||
"type": "reasoning",
|
||||
"text": finalThinking,
|
||||
}},
|
||||
})
|
||||
}
|
||||
output = append(output, toResponsesFunctionCallItems(detected, toolsRaw)...)
|
||||
} else {
|
||||
content := make([]any, 0, 2)
|
||||
|
||||
@@ -85,12 +85,24 @@ func TestBuildResponseObjectPromotesToolCallFromThinkingWhenTextEmpty(t *testing
|
||||
)
|
||||
|
||||
output, _ := obj["output"].([]any)
|
||||
if len(output) != 1 {
|
||||
t.Fatalf("expected one output item, got %#v", obj["output"])
|
||||
if len(output) != 2 {
|
||||
t.Fatalf("expected reasoning message plus function_call output, got %#v", obj["output"])
|
||||
}
|
||||
first, _ := output[0].(map[string]any)
|
||||
if first["type"] != "function_call" {
|
||||
t.Fatalf("expected function_call output, got %#v", first["type"])
|
||||
if first["type"] != "message" {
|
||||
t.Fatalf("expected reasoning message output first, got %#v", first["type"])
|
||||
}
|
||||
content, _ := first["content"].([]any)
|
||||
if len(content) != 1 {
|
||||
t.Fatalf("expected reasoning content, got %#v", first["content"])
|
||||
}
|
||||
block0, _ := content[0].(map[string]any)
|
||||
if block0["type"] != "reasoning" {
|
||||
t.Fatalf("expected reasoning block, got %#v", block0["type"])
|
||||
}
|
||||
second, _ := output[1].(map[string]any)
|
||||
if second["type"] != "function_call" {
|
||||
t.Fatalf("expected function_call output, got %#v", second["type"])
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
@@ -145,7 +145,7 @@ func (h *Handler) handleClaudeDirectStream(w http.ResponseWriter, r *http.Reques
|
||||
return
|
||||
}
|
||||
streamReq := start.Request
|
||||
h.handleClaudeStreamRealtime(w, r, start.Response, streamReq.ResponseModel, streamReq.Messages, streamReq.Thinking, streamReq.Search, streamReq.ToolNames, streamReq.ToolsRaw, historySession)
|
||||
h.handleClaudeStreamRealtimeWithRetry(w, r, a, start.Response, start.Payload, start.Pow, streamReq.ResponseModel, streamReq.Messages, streamReq.Thinking, streamReq.Search, streamReq.ToolNames, streamReq.ToolsRaw, streamReq.PromptTokenText, historySession)
|
||||
}
|
||||
|
||||
func (h *Handler) proxyViaOpenAI(w http.ResponseWriter, r *http.Request, store ConfigReader) bool {
|
||||
@@ -360,3 +360,112 @@ func (h *Handler) handleClaudeStreamRealtime(w http.ResponseWriter, r *http.Requ
|
||||
OnFinalize: streamRuntime.onFinalize,
|
||||
})
|
||||
}
|
||||
|
||||
func (h *Handler) handleClaudeStreamRealtimeWithRetry(w http.ResponseWriter, r *http.Request, a *auth.RequestAuth, resp *http.Response, payload map[string]any, pow, model string, messages []any, thinkingEnabled, searchEnabled bool, toolNames []string, toolsRaw any, promptTokenText string, historySession *responsehistory.Session) {
|
||||
if resp.StatusCode != http.StatusOK {
|
||||
defer func() { _ = resp.Body.Close() }()
|
||||
body, _ := io.ReadAll(resp.Body)
|
||||
if historySession != nil {
|
||||
historySession.Error(resp.StatusCode, strings.TrimSpace(string(body)), "error", "", "")
|
||||
}
|
||||
writeClaudeError(w, http.StatusInternalServerError, string(body))
|
||||
return
|
||||
}
|
||||
|
||||
w.Header().Set("Content-Type", "text/event-stream")
|
||||
w.Header().Set("Cache-Control", "no-cache, no-transform")
|
||||
w.Header().Set("Connection", "keep-alive")
|
||||
w.Header().Set("X-Accel-Buffering", "no")
|
||||
rc := http.NewResponseController(w)
|
||||
_, canFlush := w.(http.Flusher)
|
||||
if !canFlush {
|
||||
config.Logger.Warn("[claude_stream] response writer does not support flush; streaming may be buffered")
|
||||
}
|
||||
|
||||
streamRuntime := newClaudeStreamRuntime(
|
||||
w,
|
||||
rc,
|
||||
canFlush,
|
||||
model,
|
||||
messages,
|
||||
thinkingEnabled,
|
||||
searchEnabled,
|
||||
stripReferenceMarkersEnabled(),
|
||||
toolNames,
|
||||
toolsRaw,
|
||||
promptTokenText,
|
||||
historySession,
|
||||
)
|
||||
streamRuntime.sendMessageStart()
|
||||
|
||||
completionruntime.ExecuteStreamWithRetry(r.Context(), h.DS, a, resp, payload, pow, completionruntime.StreamRetryOptions{
|
||||
Surface: "claude.messages",
|
||||
Stream: true,
|
||||
RetryEnabled: true,
|
||||
MaxAttempts: 3,
|
||||
UsagePrompt: promptTokenText,
|
||||
}, completionruntime.StreamRetryHooks{
|
||||
ConsumeAttempt: func(currentResp *http.Response, allowDeferEmpty bool) (bool, bool) {
|
||||
return h.consumeClaudeStreamAttempt(r, currentResp, streamRuntime, thinkingEnabled, allowDeferEmpty)
|
||||
},
|
||||
Finalize: func(_ int) {
|
||||
streamRuntime.finalize("end_turn", false)
|
||||
},
|
||||
ParentMessageID: func() int {
|
||||
return streamRuntime.responseMessageID
|
||||
},
|
||||
OnRetryPrompt: func(prompt string) {
|
||||
streamRuntime.promptTokenText = prompt
|
||||
},
|
||||
OnRetryFailure: func(status int, message, code string) {
|
||||
streamRuntime.sendErrorWithCode(status, strings.TrimSpace(message), code)
|
||||
},
|
||||
})
|
||||
}
|
||||
|
||||
func (h *Handler) consumeClaudeStreamAttempt(r *http.Request, resp *http.Response, streamRuntime *claudeStreamRuntime, thinkingEnabled bool, allowDeferEmpty bool) (bool, bool) {
|
||||
defer func() { _ = resp.Body.Close() }()
|
||||
initialType := "text"
|
||||
if thinkingEnabled {
|
||||
initialType = "thinking"
|
||||
}
|
||||
finalReason := streamengine.StopReason("")
|
||||
var scannerErr error
|
||||
streamengine.ConsumeSSE(streamengine.ConsumeConfig{
|
||||
Context: r.Context(),
|
||||
Body: resp.Body,
|
||||
ThinkingEnabled: thinkingEnabled,
|
||||
InitialType: initialType,
|
||||
KeepAliveInterval: claudeStreamPingInterval,
|
||||
IdleTimeout: claudeStreamIdleTimeout,
|
||||
MaxKeepAliveNoInput: claudeStreamMaxKeepaliveCnt,
|
||||
}, streamengine.ConsumeHooks{
|
||||
OnKeepAlive: func() {
|
||||
streamRuntime.sendPing()
|
||||
},
|
||||
OnParsed: streamRuntime.onParsed,
|
||||
OnFinalize: func(reason streamengine.StopReason, err error) {
|
||||
finalReason = reason
|
||||
scannerErr = err
|
||||
},
|
||||
})
|
||||
if string(finalReason) == "upstream_error" {
|
||||
if streamRuntime.history != nil {
|
||||
streamRuntime.history.Error(500, streamRuntime.upstreamErr, "upstream_error", responsehistory.ThinkingForArchive(streamRuntime.rawThinking.String(), streamRuntime.toolDetectionThinking.String(), streamRuntime.thinking.String()), responsehistory.TextForArchive(streamRuntime.rawText.String(), streamRuntime.text.String()))
|
||||
}
|
||||
streamRuntime.sendError(streamRuntime.upstreamErr)
|
||||
return true, false
|
||||
}
|
||||
if scannerErr != nil {
|
||||
if streamRuntime.history != nil {
|
||||
streamRuntime.history.Error(500, scannerErr.Error(), "error", responsehistory.ThinkingForArchive(streamRuntime.rawThinking.String(), streamRuntime.toolDetectionThinking.String(), streamRuntime.thinking.String()), responsehistory.TextForArchive(streamRuntime.rawText.String(), streamRuntime.text.String()))
|
||||
}
|
||||
streamRuntime.sendError(scannerErr.Error())
|
||||
return true, false
|
||||
}
|
||||
terminalWritten := streamRuntime.finalize("end_turn", allowDeferEmpty)
|
||||
if terminalWritten {
|
||||
return true, false
|
||||
}
|
||||
return false, true
|
||||
}
|
||||
|
||||
@@ -93,14 +93,51 @@ func TestNormalizeClaudeMessagesToolUseToAssistantToolCalls(t *testing.T) {
|
||||
t.Fatalf("expected call id preserved, got %#v", call)
|
||||
}
|
||||
content, _ := m["content"].(string)
|
||||
if !containsStr(content, "<|DSML|tool_calls>") || !containsStr(content, `<|DSML|invoke name="search_web">`) {
|
||||
if !containsStr(content, "<|DSML|tool_calls>") || !containsStr(content, `<|DSML|invoke name="search_web">`) {
|
||||
t.Fatalf("expected assistant content to include DSML tool call history, got %q", content)
|
||||
}
|
||||
if !containsStr(content, `<|DSML|parameter name="query"><![CDATA[latest]]></|DSML|parameter>`) {
|
||||
if !containsStr(content, `<|DSML|parameter name="query"><![CDATA[latest]]></|DSML|parameter>`) {
|
||||
t.Fatalf("expected assistant content to include serialized parameters, got %q", content)
|
||||
}
|
||||
}
|
||||
|
||||
func TestNormalizeClaudeMessagesPreservesThinkingOnToolUseHistory(t *testing.T) {
|
||||
msgs := []any{
|
||||
map[string]any{
|
||||
"role": "assistant",
|
||||
"content": []any{
|
||||
map[string]any{"type": "thinking", "thinking": "need live search before answering"},
|
||||
map[string]any{
|
||||
"type": "tool_use",
|
||||
"id": "call_1",
|
||||
"name": "search_web",
|
||||
"input": map[string]any{"query": "latest"},
|
||||
},
|
||||
},
|
||||
},
|
||||
}
|
||||
|
||||
got := normalizeClaudeMessages(msgs)
|
||||
if len(got) != 1 {
|
||||
t.Fatalf("expected one normalized tool-call message, got %#v", got)
|
||||
}
|
||||
m := got[0].(map[string]any)
|
||||
if m["reasoning_content"] != "need live search before answering" {
|
||||
t.Fatalf("expected thinking preserved as reasoning_content, got %#v", m)
|
||||
}
|
||||
tc, _ := m["tool_calls"].([]any)
|
||||
if len(tc) != 1 {
|
||||
t.Fatalf("expected one tool call, got %#v", m["tool_calls"])
|
||||
}
|
||||
prompt := buildClaudePromptTokenText(got, true)
|
||||
if !containsStr(prompt, "[reasoning_content]\nneed live search before answering\n[/reasoning_content]") {
|
||||
t.Fatalf("expected thinking in prompt history, got %q", prompt)
|
||||
}
|
||||
if !containsStr(prompt, `<|DSML|invoke name="search_web">`) {
|
||||
t.Fatalf("expected tool call in prompt history, got %q", prompt)
|
||||
}
|
||||
}
|
||||
|
||||
func TestNormalizeClaudeMessagesDoesNotPromoteUserToolUse(t *testing.T) {
|
||||
msgs := []any{
|
||||
map[string]any{
|
||||
|
||||
@@ -25,14 +25,21 @@ func normalizeClaudeMessages(messages []any) []any {
|
||||
switch content := msg["content"].(type) {
|
||||
case []any:
|
||||
textParts := make([]string, 0, len(content))
|
||||
pendingThinking := ""
|
||||
flushText := func() {
|
||||
if len(textParts) == 0 {
|
||||
return
|
||||
}
|
||||
out = append(out, map[string]any{
|
||||
message := map[string]any{
|
||||
"role": role,
|
||||
"content": strings.Join(textParts, "\n"),
|
||||
})
|
||||
}
|
||||
if role == "assistant" && strings.TrimSpace(pendingThinking) != "" {
|
||||
message["reasoning_content"] = pendingThinking
|
||||
message["content"] = prependClaudeReasoningForPrompt(pendingThinking, safeStringValue(message["content"]))
|
||||
pendingThinking = ""
|
||||
}
|
||||
out = append(out, message)
|
||||
textParts = textParts[:0]
|
||||
}
|
||||
for _, block := range content {
|
||||
@@ -46,10 +53,29 @@ func normalizeClaudeMessages(messages []any) []any {
|
||||
if t, ok := b["text"].(string); ok {
|
||||
textParts = append(textParts, t)
|
||||
}
|
||||
case "thinking":
|
||||
if role == "assistant" {
|
||||
if thinking := extractClaudeThinkingBlockText(b); thinking != "" {
|
||||
if pendingThinking == "" {
|
||||
pendingThinking = thinking
|
||||
} else {
|
||||
pendingThinking += "\n" + thinking
|
||||
}
|
||||
}
|
||||
continue
|
||||
}
|
||||
if raw := strings.TrimSpace(formatClaudeUnknownBlockForPrompt(b)); raw != "" {
|
||||
textParts = append(textParts, raw)
|
||||
}
|
||||
case "tool_use":
|
||||
if role == "assistant" {
|
||||
flushText()
|
||||
if toolMsg := normalizeClaudeToolUseToAssistant(b, state); toolMsg != nil {
|
||||
if strings.TrimSpace(pendingThinking) != "" {
|
||||
toolMsg["reasoning_content"] = pendingThinking
|
||||
toolMsg["content"] = prependClaudeReasoningForPrompt(pendingThinking, safeStringValue(toolMsg["content"]))
|
||||
pendingThinking = ""
|
||||
}
|
||||
out = append(out, toolMsg)
|
||||
}
|
||||
continue
|
||||
@@ -69,6 +95,13 @@ func normalizeClaudeMessages(messages []any) []any {
|
||||
}
|
||||
}
|
||||
flushText()
|
||||
if role == "assistant" && strings.TrimSpace(pendingThinking) != "" {
|
||||
out = append(out, map[string]any{
|
||||
"role": "assistant",
|
||||
"reasoning_content": pendingThinking,
|
||||
"content": formatClaudeReasoningForPrompt(pendingThinking),
|
||||
})
|
||||
}
|
||||
default:
|
||||
copied := cloneMap(msg)
|
||||
out = append(out, copied)
|
||||
@@ -77,6 +110,39 @@ func normalizeClaudeMessages(messages []any) []any {
|
||||
return out
|
||||
}
|
||||
|
||||
func prependClaudeReasoningForPrompt(reasoning, content string) string {
|
||||
reasoning = strings.TrimSpace(reasoning)
|
||||
content = strings.TrimSpace(content)
|
||||
if reasoning == "" {
|
||||
return content
|
||||
}
|
||||
block := formatClaudeReasoningForPrompt(reasoning)
|
||||
if content == "" {
|
||||
return block
|
||||
}
|
||||
return block + "\n\n" + content
|
||||
}
|
||||
|
||||
func formatClaudeReasoningForPrompt(reasoning string) string {
|
||||
reasoning = strings.TrimSpace(reasoning)
|
||||
if reasoning == "" {
|
||||
return ""
|
||||
}
|
||||
return "[reasoning_content]\n" + reasoning + "\n[/reasoning_content]"
|
||||
}
|
||||
|
||||
func extractClaudeThinkingBlockText(block map[string]any) string {
|
||||
if block == nil {
|
||||
return ""
|
||||
}
|
||||
for _, key := range []string{"thinking", "text", "content"} {
|
||||
if text := strings.TrimSpace(safeStringValue(block[key])); text != "" {
|
||||
return text
|
||||
}
|
||||
}
|
||||
return ""
|
||||
}
|
||||
|
||||
func buildClaudeToolPrompt(tools []any) string {
|
||||
toolSchemas := make([]string, 0, len(tools))
|
||||
names := make([]string, 0, len(tools))
|
||||
|
||||
@@ -29,9 +29,10 @@ type claudeStreamRuntime struct {
|
||||
bufferToolContent bool
|
||||
stripReferenceMarkers bool
|
||||
|
||||
messageID string
|
||||
thinking strings.Builder
|
||||
text strings.Builder
|
||||
messageID string
|
||||
thinking strings.Builder
|
||||
text strings.Builder
|
||||
responseMessageID int
|
||||
|
||||
sieve toolstream.State
|
||||
rawText strings.Builder
|
||||
@@ -92,6 +93,9 @@ func (s *claudeStreamRuntime) onParsed(parsed sse.LineResult) streamengine.Parse
|
||||
s.upstreamErr = parsed.ErrorMessage
|
||||
return streamengine.ParsedDecision{Stop: true, StopReason: streamengine.StopReason("upstream_error")}
|
||||
}
|
||||
if parsed.ResponseMessageID > 0 {
|
||||
s.responseMessageID = parsed.ResponseMessageID
|
||||
}
|
||||
if parsed.Stop {
|
||||
return streamengine.ParsedDecision{Stop: true}
|
||||
}
|
||||
|
||||
@@ -22,16 +22,27 @@ func (s *claudeStreamRuntime) send(event string, v any) {
|
||||
}
|
||||
|
||||
func (s *claudeStreamRuntime) sendError(message string) {
|
||||
s.sendErrorWithCode(500, message, "internal_error")
|
||||
}
|
||||
|
||||
func (s *claudeStreamRuntime) sendErrorWithCode(status int, message, code string) {
|
||||
msg := strings.TrimSpace(message)
|
||||
if msg == "" {
|
||||
msg = "upstream stream error"
|
||||
}
|
||||
if code == "" {
|
||||
code = "internal_error"
|
||||
}
|
||||
errType := "api_error"
|
||||
if status == 429 {
|
||||
errType = "rate_limit_error"
|
||||
}
|
||||
s.send("error", map[string]any{
|
||||
"type": "error",
|
||||
"error": map[string]any{
|
||||
"type": "api_error",
|
||||
"type": errType,
|
||||
"message": msg,
|
||||
"code": "internal_error",
|
||||
"code": code,
|
||||
"param": nil,
|
||||
},
|
||||
})
|
||||
|
||||
@@ -63,13 +63,10 @@ func (s *claudeStreamRuntime) sendToolUseBlock(idx int, tc toolcall.ParsedToolCa
|
||||
})
|
||||
}
|
||||
|
||||
func (s *claudeStreamRuntime) finalize(stopReason string) {
|
||||
func (s *claudeStreamRuntime) finalize(stopReason string, deferEmptyOutput bool) bool {
|
||||
if s.ended {
|
||||
return
|
||||
return true
|
||||
}
|
||||
s.ended = true
|
||||
|
||||
s.closeThinkingBlock()
|
||||
|
||||
if s.bufferToolContent {
|
||||
for _, evt := range toolstream.Flush(&s.sieve, s.toolNames) {
|
||||
@@ -123,6 +120,7 @@ func (s *claudeStreamRuntime) finalize(stopReason string) {
|
||||
RawThinking: s.rawThinking.String(),
|
||||
VisibleThinking: s.thinking.String(),
|
||||
DetectionThinking: s.toolDetectionThinking.String(),
|
||||
ResponseMessageID: s.responseMessageID,
|
||||
AlreadyEmittedCalls: s.toolCallsDetected,
|
||||
AlreadyEmittedToolRaw: s.toolCallsDetected,
|
||||
}, assistantturn.BuildOptions{
|
||||
@@ -137,6 +135,22 @@ func (s *claudeStreamRuntime) finalize(stopReason string) {
|
||||
outcome := assistantturn.FinalizeTurn(turn, assistantturn.FinalizeOptions{
|
||||
AlreadyEmittedToolCalls: s.toolCallsDetected,
|
||||
})
|
||||
if outcome.ShouldFail {
|
||||
if deferEmptyOutput {
|
||||
return false
|
||||
}
|
||||
s.ended = true
|
||||
s.closeThinkingBlock()
|
||||
s.closeTextBlock()
|
||||
if s.history != nil {
|
||||
s.history.Error(outcome.Error.Status, outcome.Error.Message, outcome.Error.Code, responsehistory.ThinkingForArchive(turn.RawThinking, turn.DetectionThinking, turn.Thinking), responsehistory.TextForArchive(turn.RawText, turn.Text))
|
||||
}
|
||||
s.sendErrorWithCode(outcome.Error.Status, outcome.Error.Message, outcome.Error.Code)
|
||||
return true
|
||||
}
|
||||
|
||||
s.ended = true
|
||||
s.closeThinkingBlock()
|
||||
|
||||
if s.bufferToolContent && !s.toolCallsDetected {
|
||||
if len(turn.ToolCalls) > 0 {
|
||||
@@ -197,6 +211,7 @@ func (s *claudeStreamRuntime) finalize(stopReason string) {
|
||||
},
|
||||
})
|
||||
s.send("message_stop", map[string]any{"type": "message_stop"})
|
||||
return true
|
||||
}
|
||||
|
||||
func (s *claudeStreamRuntime) onFinalize(reason streamengine.StopReason, scannerErr error) {
|
||||
@@ -214,5 +229,5 @@ func (s *claudeStreamRuntime) onFinalize(reason streamengine.StopReason, scanner
|
||||
s.sendError(scannerErr.Error())
|
||||
return
|
||||
}
|
||||
s.finalize("end_turn")
|
||||
s.finalize("end_turn", false)
|
||||
}
|
||||
|
||||
@@ -44,14 +44,20 @@ func geminiMessagesFromRequest(req map[string]any) []any {
|
||||
}
|
||||
|
||||
textParts := make([]string, 0, len(parts))
|
||||
pendingThinking := ""
|
||||
flushText := func() {
|
||||
if len(textParts) == 0 {
|
||||
return
|
||||
}
|
||||
out = append(out, map[string]any{
|
||||
msg := map[string]any{
|
||||
"role": role,
|
||||
"content": strings.Join(textParts, "\n"),
|
||||
})
|
||||
}
|
||||
if role == "assistant" && strings.TrimSpace(pendingThinking) != "" {
|
||||
msg["reasoning_content"] = pendingThinking
|
||||
pendingThinking = ""
|
||||
}
|
||||
out = append(out, msg)
|
||||
textParts = textParts[:0]
|
||||
}
|
||||
|
||||
@@ -61,6 +67,14 @@ func geminiMessagesFromRequest(req map[string]any) []any {
|
||||
continue
|
||||
}
|
||||
if text := strings.TrimSpace(asString(part["text"])); text != "" {
|
||||
if role == "assistant" && isGeminiThoughtPart(part) {
|
||||
if pendingThinking == "" {
|
||||
pendingThinking = text
|
||||
} else {
|
||||
pendingThinking += "\n" + text
|
||||
}
|
||||
continue
|
||||
}
|
||||
textParts = append(textParts, text)
|
||||
continue
|
||||
}
|
||||
@@ -75,7 +89,7 @@ func geminiMessagesFromRequest(req map[string]any) []any {
|
||||
}
|
||||
}
|
||||
lastToolCallIDByName[strings.ToLower(name)] = callID
|
||||
out = append(out, map[string]any{
|
||||
msg := map[string]any{
|
||||
"role": "assistant",
|
||||
"tool_calls": []any{
|
||||
map[string]any{
|
||||
@@ -87,7 +101,12 @@ func geminiMessagesFromRequest(req map[string]any) []any {
|
||||
},
|
||||
},
|
||||
},
|
||||
})
|
||||
}
|
||||
if strings.TrimSpace(pendingThinking) != "" {
|
||||
msg["reasoning_content"] = pendingThinking
|
||||
pendingThinking = ""
|
||||
}
|
||||
out = append(out, msg)
|
||||
}
|
||||
continue
|
||||
}
|
||||
@@ -132,10 +151,29 @@ func geminiMessagesFromRequest(req map[string]any) []any {
|
||||
}
|
||||
}
|
||||
flushText()
|
||||
if role == "assistant" && strings.TrimSpace(pendingThinking) != "" {
|
||||
out = append(out, map[string]any{
|
||||
"role": "assistant",
|
||||
"reasoning_content": pendingThinking,
|
||||
})
|
||||
}
|
||||
}
|
||||
return out
|
||||
}
|
||||
|
||||
func isGeminiThoughtPart(part map[string]any) bool {
|
||||
if part == nil {
|
||||
return false
|
||||
}
|
||||
if v, ok := part["thought"].(bool); ok {
|
||||
return v
|
||||
}
|
||||
if v, ok := part["thoughtSignature"].(string); ok && strings.TrimSpace(v) != "" {
|
||||
return true
|
||||
}
|
||||
return false
|
||||
}
|
||||
|
||||
func normalizeGeminiSystemInstruction(raw any) string {
|
||||
switch v := raw.(type) {
|
||||
case string:
|
||||
|
||||
@@ -1,6 +1,7 @@
|
||||
package gemini
|
||||
|
||||
import (
|
||||
"ds2api/internal/promptcompat"
|
||||
"strings"
|
||||
"testing"
|
||||
)
|
||||
@@ -53,6 +54,46 @@ func TestGeminiMessagesFromRequestPreservesFunctionRoundtrip(t *testing.T) {
|
||||
}
|
||||
}
|
||||
|
||||
func TestGeminiMessagesFromRequestPreservesThoughtOnFunctionCallHistory(t *testing.T) {
|
||||
req := map[string]any{
|
||||
"contents": []any{
|
||||
map[string]any{
|
||||
"role": "model",
|
||||
"parts": []any{
|
||||
map[string]any{"text": "need current state before answering", "thought": true},
|
||||
map[string]any{
|
||||
"functionCall": map[string]any{
|
||||
"id": "call_g1",
|
||||
"name": "search_web",
|
||||
"args": map[string]any{"query": "ai"},
|
||||
},
|
||||
},
|
||||
},
|
||||
},
|
||||
},
|
||||
}
|
||||
|
||||
got := geminiMessagesFromRequest(req)
|
||||
if len(got) != 1 {
|
||||
t.Fatalf("expected one normalized message, got %#v", got)
|
||||
}
|
||||
assistant, _ := got[0].(map[string]any)
|
||||
if assistant["reasoning_content"] != "need current state before answering" {
|
||||
t.Fatalf("expected thought preserved as reasoning_content, got %#v", assistant)
|
||||
}
|
||||
tc, _ := assistant["tool_calls"].([]any)
|
||||
if len(tc) != 1 {
|
||||
t.Fatalf("expected one tool call, got %#v", assistant["tool_calls"])
|
||||
}
|
||||
prompt, _ := promptcompat.BuildOpenAIPromptForAdapter(got, nil, "", true)
|
||||
if !strings.Contains(prompt, "[reasoning_content]\nneed current state before answering\n[/reasoning_content]") {
|
||||
t.Fatalf("expected thought in prompt history, got %q", prompt)
|
||||
}
|
||||
if !strings.Contains(prompt, `<|DSML|invoke name="search_web">`) {
|
||||
t.Fatalf("expected tool call in prompt history, got %q", prompt)
|
||||
}
|
||||
}
|
||||
|
||||
func TestGeminiMessagesFromRequestPreservesUnknownPartAsRawJSONText(t *testing.T) {
|
||||
req := map[string]any{
|
||||
"contents": []any{
|
||||
|
||||
@@ -137,7 +137,7 @@ func (h *Handler) handleGeminiDirectStream(w http.ResponseWriter, r *http.Reques
|
||||
return
|
||||
}
|
||||
streamReq := start.Request
|
||||
h.handleStreamGenerateContent(w, r, start.Response, streamReq.ResponseModel, streamReq.PromptTokenText, streamReq.Thinking, streamReq.Search, streamReq.ToolNames, streamReq.ToolsRaw, historySession)
|
||||
h.handleStreamGenerateContentWithRetry(w, r, a, start.Response, start.Payload, start.Pow, streamReq.ResponseModel, streamReq.PromptTokenText, streamReq.Thinking, streamReq.Search, streamReq.ToolNames, streamReq.ToolsRaw, historySession)
|
||||
}
|
||||
|
||||
func (h *Handler) proxyViaOpenAI(w http.ResponseWriter, r *http.Request, stream bool) bool {
|
||||
|
||||
@@ -1,6 +1,7 @@
|
||||
package gemini
|
||||
|
||||
import (
|
||||
"context"
|
||||
"encoding/json"
|
||||
"io"
|
||||
"net/http"
|
||||
@@ -8,6 +9,8 @@ import (
|
||||
"time"
|
||||
|
||||
"ds2api/internal/assistantturn"
|
||||
"ds2api/internal/auth"
|
||||
"ds2api/internal/completionruntime"
|
||||
dsprotocol "ds2api/internal/deepseek/protocol"
|
||||
"ds2api/internal/responsehistory"
|
||||
"ds2api/internal/sse"
|
||||
@@ -54,7 +57,7 @@ func (h *Handler) handleStreamGenerateContent(w http.ResponseWriter, r *http.Req
|
||||
}, streamengine.ConsumeHooks{
|
||||
OnParsed: runtime.onParsed,
|
||||
OnFinalize: func(_ streamengine.StopReason, _ error) {
|
||||
runtime.finalize()
|
||||
runtime.finalize(false)
|
||||
},
|
||||
})
|
||||
}
|
||||
@@ -78,9 +81,83 @@ type geminiStreamRuntime struct {
|
||||
accumulator *assistantturn.Accumulator
|
||||
contentFilter bool
|
||||
responseMessageID int
|
||||
finalErrorStatus int
|
||||
finalErrorMessage string
|
||||
finalErrorCode string
|
||||
history *responsehistory.Session
|
||||
}
|
||||
|
||||
func (h *Handler) handleStreamGenerateContentWithRetry(w http.ResponseWriter, r *http.Request, a *auth.RequestAuth, resp *http.Response, payload map[string]any, pow, model, finalPrompt string, thinkingEnabled, searchEnabled bool, toolNames []string, toolsRaw any, historySession *responsehistory.Session) {
|
||||
if resp.StatusCode != http.StatusOK {
|
||||
defer func() { _ = resp.Body.Close() }()
|
||||
body, _ := io.ReadAll(resp.Body)
|
||||
if historySession != nil {
|
||||
historySession.Error(resp.StatusCode, strings.TrimSpace(string(body)), "error", "", "")
|
||||
}
|
||||
writeGeminiError(w, resp.StatusCode, strings.TrimSpace(string(body)))
|
||||
return
|
||||
}
|
||||
|
||||
w.Header().Set("Content-Type", "text/event-stream")
|
||||
w.Header().Set("Cache-Control", "no-cache, no-transform")
|
||||
w.Header().Set("Connection", "keep-alive")
|
||||
w.Header().Set("X-Accel-Buffering", "no")
|
||||
|
||||
rc := http.NewResponseController(w)
|
||||
_, canFlush := w.(http.Flusher)
|
||||
runtime := newGeminiStreamRuntime(w, rc, canFlush, model, finalPrompt, thinkingEnabled, searchEnabled, stripReferenceMarkersEnabled(), toolNames, toolsRaw, historySession)
|
||||
|
||||
completionruntime.ExecuteStreamWithRetry(r.Context(), h.DS, a, resp, payload, pow, completionruntime.StreamRetryOptions{
|
||||
Surface: "gemini.generate_content",
|
||||
Stream: true,
|
||||
RetryEnabled: true,
|
||||
MaxAttempts: 3,
|
||||
UsagePrompt: finalPrompt,
|
||||
}, completionruntime.StreamRetryHooks{
|
||||
ConsumeAttempt: func(currentResp *http.Response, allowDeferEmpty bool) (bool, bool) {
|
||||
return h.consumeGeminiStreamAttempt(r.Context(), currentResp, runtime, thinkingEnabled, allowDeferEmpty)
|
||||
},
|
||||
Finalize: func(_ int) {
|
||||
runtime.finalize(false)
|
||||
},
|
||||
ParentMessageID: func() int {
|
||||
return runtime.responseMessageID
|
||||
},
|
||||
OnRetryPrompt: func(prompt string) {
|
||||
runtime.finalPrompt = prompt
|
||||
},
|
||||
OnRetryFailure: func(status int, message, _ string) {
|
||||
runtime.sendErrorChunk(status, strings.TrimSpace(message))
|
||||
},
|
||||
})
|
||||
}
|
||||
|
||||
func (h *Handler) consumeGeminiStreamAttempt(ctx context.Context, resp *http.Response, runtime *geminiStreamRuntime, thinkingEnabled bool, allowDeferEmpty bool) (bool, bool) {
|
||||
defer func() { _ = resp.Body.Close() }()
|
||||
initialType := "text"
|
||||
if thinkingEnabled {
|
||||
initialType = "thinking"
|
||||
}
|
||||
streamengine.ConsumeSSE(streamengine.ConsumeConfig{
|
||||
Context: ctx,
|
||||
Body: resp.Body,
|
||||
ThinkingEnabled: thinkingEnabled,
|
||||
InitialType: initialType,
|
||||
KeepAliveInterval: time.Duration(dsprotocol.KeepAliveTimeout) * time.Second,
|
||||
IdleTimeout: time.Duration(dsprotocol.StreamIdleTimeout) * time.Second,
|
||||
MaxKeepAliveNoInput: dsprotocol.MaxKeepaliveCount,
|
||||
}, streamengine.ConsumeHooks{
|
||||
OnParsed: runtime.onParsed,
|
||||
OnFinalize: func(_ streamengine.StopReason, _ error) {
|
||||
},
|
||||
})
|
||||
terminalWritten := runtime.finalize(allowDeferEmpty)
|
||||
if terminalWritten {
|
||||
return true, false
|
||||
}
|
||||
return false, true
|
||||
}
|
||||
|
||||
//nolint:unused // retained for native Gemini stream handling path.
|
||||
func newGeminiStreamRuntime(
|
||||
w http.ResponseWriter,
|
||||
@@ -127,6 +204,35 @@ func (s *geminiStreamRuntime) sendChunk(payload map[string]any) {
|
||||
}
|
||||
}
|
||||
|
||||
func (s *geminiStreamRuntime) sendErrorChunk(status int, message string) {
|
||||
msg := strings.TrimSpace(message)
|
||||
if msg == "" {
|
||||
msg = http.StatusText(status)
|
||||
}
|
||||
errorStatus := "INVALID_ARGUMENT"
|
||||
switch status {
|
||||
case http.StatusUnauthorized:
|
||||
errorStatus = "UNAUTHENTICATED"
|
||||
case http.StatusForbidden:
|
||||
errorStatus = "PERMISSION_DENIED"
|
||||
case http.StatusTooManyRequests:
|
||||
errorStatus = "RESOURCE_EXHAUSTED"
|
||||
case http.StatusNotFound:
|
||||
errorStatus = "NOT_FOUND"
|
||||
default:
|
||||
if status >= 500 {
|
||||
errorStatus = "INTERNAL"
|
||||
}
|
||||
}
|
||||
s.sendChunk(map[string]any{
|
||||
"error": map[string]any{
|
||||
"code": status,
|
||||
"message": msg,
|
||||
"status": errorStatus,
|
||||
},
|
||||
})
|
||||
}
|
||||
|
||||
//nolint:unused // retained for native Gemini stream handling path.
|
||||
func (s *geminiStreamRuntime) onParsed(parsed sse.LineResult) streamengine.ParsedDecision {
|
||||
if !parsed.Parsed {
|
||||
@@ -192,7 +298,7 @@ func (s *geminiStreamRuntime) onParsed(parsed sse.LineResult) streamengine.Parse
|
||||
}
|
||||
|
||||
//nolint:unused // retained for native Gemini stream handling path.
|
||||
func (s *geminiStreamRuntime) finalize() {
|
||||
func (s *geminiStreamRuntime) finalize(deferEmptyOutput bool) bool {
|
||||
rawText, text, rawThinking, thinking, detectionThinking := s.accumulator.Snapshot()
|
||||
turn := assistantturn.BuildTurnFromStreamSnapshot(assistantturn.StreamSnapshot{
|
||||
RawText: rawText,
|
||||
@@ -211,6 +317,19 @@ func (s *geminiStreamRuntime) finalize() {
|
||||
ToolsRaw: s.toolsRaw,
|
||||
})
|
||||
outcome := assistantturn.FinalizeTurn(turn, assistantturn.FinalizeOptions{})
|
||||
if outcome.ShouldFail {
|
||||
if deferEmptyOutput {
|
||||
s.finalErrorStatus = outcome.Error.Status
|
||||
s.finalErrorMessage = outcome.Error.Message
|
||||
s.finalErrorCode = outcome.Error.Code
|
||||
return false
|
||||
}
|
||||
if s.history != nil {
|
||||
s.history.Error(outcome.Error.Status, outcome.Error.Message, outcome.Error.Code, responsehistory.ThinkingForArchive(turn.RawThinking, turn.DetectionThinking, turn.Thinking), responsehistory.TextForArchive(turn.RawText, turn.Text))
|
||||
}
|
||||
s.sendErrorChunk(outcome.Error.Status, outcome.Error.Message)
|
||||
return true
|
||||
}
|
||||
if s.history != nil {
|
||||
s.history.Success(
|
||||
http.StatusOK,
|
||||
@@ -257,4 +376,5 @@ func (s *geminiStreamRuntime) finalize() {
|
||||
"totalTokenCount": outcome.Usage.TotalTokens,
|
||||
},
|
||||
})
|
||||
return true
|
||||
}
|
||||
|
||||
@@ -4,11 +4,11 @@ import (
|
||||
"context"
|
||||
"io"
|
||||
"net/http"
|
||||
"strings"
|
||||
"time"
|
||||
|
||||
"ds2api/internal/assistantturn"
|
||||
"ds2api/internal/auth"
|
||||
"ds2api/internal/completionruntime"
|
||||
"ds2api/internal/config"
|
||||
dsprotocol "ds2api/internal/deepseek/protocol"
|
||||
openaifmt "ds2api/internal/format/openai"
|
||||
@@ -17,191 +17,94 @@ import (
|
||||
streamengine "ds2api/internal/stream"
|
||||
)
|
||||
|
||||
type chatNonStreamResult struct {
|
||||
rawThinking string
|
||||
rawText string
|
||||
thinking string
|
||||
toolDetectionThinking string
|
||||
text string
|
||||
contentFilter bool
|
||||
detectedCalls int
|
||||
body map[string]any
|
||||
finishReason string
|
||||
responseMessageID int
|
||||
outputError *assistantturn.OutputError
|
||||
}
|
||||
|
||||
func (r chatNonStreamResult) historyText() string {
|
||||
return historyTextForArchive(r.rawText, r.text)
|
||||
}
|
||||
|
||||
func (r chatNonStreamResult) historyThinking() string {
|
||||
return historyThinkingForArchive(r.rawThinking, r.toolDetectionThinking, r.thinking)
|
||||
}
|
||||
|
||||
func (h *Handler) handleNonStreamWithRetry(w http.ResponseWriter, ctx context.Context, a *auth.RequestAuth, resp *http.Response, payload map[string]any, pow, completionID, model, finalPrompt string, refFileTokens int, thinkingEnabled, searchEnabled bool, toolNames []string, toolsRaw any, historySession *chatHistorySession) {
|
||||
attempts := 0
|
||||
currentResp := resp
|
||||
usagePrompt := finalPrompt
|
||||
accumulatedThinking := ""
|
||||
accumulatedRawThinking := ""
|
||||
accumulatedToolDetectionThinking := ""
|
||||
for {
|
||||
result, ok := h.collectChatNonStreamAttempt(w, currentResp, completionID, model, usagePrompt, thinkingEnabled, searchEnabled, toolNames, toolsRaw)
|
||||
if !ok {
|
||||
return
|
||||
}
|
||||
accumulatedThinking += sse.TrimContinuationOverlap(accumulatedThinking, result.thinking)
|
||||
accumulatedRawThinking += sse.TrimContinuationOverlap(accumulatedRawThinking, result.rawThinking)
|
||||
accumulatedToolDetectionThinking += sse.TrimContinuationOverlap(accumulatedToolDetectionThinking, result.toolDetectionThinking)
|
||||
result.thinking = accumulatedThinking
|
||||
result.rawThinking = accumulatedRawThinking
|
||||
result.toolDetectionThinking = accumulatedToolDetectionThinking
|
||||
detected := detectAssistantToolCalls(result.rawText, result.text, result.rawThinking, result.toolDetectionThinking, toolNames)
|
||||
result.detectedCalls = len(detected.Calls)
|
||||
result.body = openaifmt.BuildChatCompletionWithToolCalls(completionID, model, usagePrompt, result.thinking, result.text, detected.Calls, toolsRaw)
|
||||
addRefFileTokensToUsage(result.body, refFileTokens)
|
||||
result.finishReason = chatFinishReason(result.body)
|
||||
if !shouldRetryChatNonStream(result, attempts) {
|
||||
h.finishChatNonStreamResult(w, result, attempts, usagePrompt, refFileTokens, historySession)
|
||||
return
|
||||
}
|
||||
|
||||
attempts++
|
||||
config.Logger.Info("[openai_empty_retry] attempting synthetic retry", "surface", "chat.completions", "stream", false, "retry_attempt", attempts, "parent_message_id", result.responseMessageID)
|
||||
retryPow, powErr := h.DS.GetPow(ctx, a, 3)
|
||||
if powErr != nil {
|
||||
config.Logger.Warn("[openai_empty_retry] retry PoW fetch failed, falling back to original PoW", "surface", "chat.completions", "stream", false, "retry_attempt", attempts, "error", powErr)
|
||||
retryPow = pow
|
||||
}
|
||||
retryPayload := clonePayloadForEmptyOutputRetry(payload, result.responseMessageID)
|
||||
nextResp, err := h.DS.CallCompletion(ctx, a, retryPayload, retryPow, 3)
|
||||
if err != nil {
|
||||
if historySession != nil {
|
||||
historySession.error(http.StatusInternalServerError, "Failed to get completion.", "error", result.historyThinking(), result.historyText())
|
||||
}
|
||||
writeOpenAIError(w, http.StatusInternalServerError, "Failed to get completion.")
|
||||
config.Logger.Warn("[openai_empty_retry] retry request failed", "surface", "chat.completions", "stream", false, "retry_attempt", attempts, "error", err)
|
||||
return
|
||||
}
|
||||
usagePrompt = usagePromptWithEmptyOutputRetry(usagePrompt, attempts)
|
||||
currentResp = nextResp
|
||||
}
|
||||
}
|
||||
|
||||
func (h *Handler) collectChatNonStreamAttempt(w http.ResponseWriter, resp *http.Response, completionID, model, usagePrompt string, thinkingEnabled, searchEnabled bool, toolNames []string, toolsRaw any) (chatNonStreamResult, bool) {
|
||||
if resp.StatusCode != http.StatusOK {
|
||||
defer func() { _ = resp.Body.Close() }()
|
||||
body, _ := io.ReadAll(resp.Body)
|
||||
writeOpenAIError(w, resp.StatusCode, string(body))
|
||||
return chatNonStreamResult{}, false
|
||||
}
|
||||
result := sse.CollectStream(resp, thinkingEnabled, true)
|
||||
turn := assistantturn.BuildTurnFromCollected(result, assistantturn.BuildOptions{
|
||||
Model: model,
|
||||
Prompt: usagePrompt,
|
||||
SearchEnabled: searchEnabled,
|
||||
ToolNames: toolNames,
|
||||
ToolsRaw: toolsRaw,
|
||||
})
|
||||
respBody := openaifmt.BuildChatCompletionWithToolCalls(completionID, model, usagePrompt, turn.Thinking, turn.Text, turn.ToolCalls, toolsRaw)
|
||||
return chatNonStreamResult{
|
||||
rawThinking: result.Thinking,
|
||||
rawText: result.Text,
|
||||
thinking: turn.Thinking,
|
||||
toolDetectionThinking: result.ToolDetectionThinking,
|
||||
text: turn.Text,
|
||||
contentFilter: result.ContentFilter,
|
||||
detectedCalls: len(turn.ToolCalls),
|
||||
body: respBody,
|
||||
finishReason: chatFinishReason(respBody),
|
||||
responseMessageID: result.ResponseMessageID,
|
||||
outputError: turn.Error,
|
||||
}, true
|
||||
}
|
||||
|
||||
func (h *Handler) finishChatNonStreamResult(w http.ResponseWriter, result chatNonStreamResult, attempts int, usagePrompt string, refFileTokens int, historySession *chatHistorySession) {
|
||||
if result.detectedCalls == 0 && strings.TrimSpace(result.text) == "" {
|
||||
status, message, code := upstreamEmptyOutputDetail(result.contentFilter, result.text, result.thinking)
|
||||
if result.outputError != nil {
|
||||
status, message, code = result.outputError.Status, result.outputError.Message, result.outputError.Code
|
||||
}
|
||||
if historySession != nil {
|
||||
historySession.error(status, message, code, result.historyThinking(), result.historyText())
|
||||
historySession.error(resp.StatusCode, string(body), "error", "", "")
|
||||
}
|
||||
writeOpenAIErrorWithCode(w, status, message, code)
|
||||
config.Logger.Info("[openai_empty_retry] terminal empty output", "surface", "chat.completions", "stream", false, "retry_attempts", attempts, "success_source", "none", "content_filter", result.contentFilter)
|
||||
writeOpenAIError(w, resp.StatusCode, string(body))
|
||||
return
|
||||
}
|
||||
if historySession != nil {
|
||||
historySession.success(http.StatusOK, result.historyThinking(), result.historyText(), result.finishReason, openaifmt.BuildChatUsageForModel("", usagePrompt, result.thinking, result.text, refFileTokens))
|
||||
stdReq := promptcompat.StandardRequest{
|
||||
Surface: "chat.completions",
|
||||
ResponseModel: model,
|
||||
PromptTokenText: finalPrompt,
|
||||
FinalPrompt: finalPrompt,
|
||||
RefFileTokens: refFileTokens,
|
||||
Thinking: thinkingEnabled,
|
||||
Search: searchEnabled,
|
||||
ToolNames: toolNames,
|
||||
ToolsRaw: toolsRaw,
|
||||
ToolChoice: promptcompat.DefaultToolChoicePolicy(),
|
||||
}
|
||||
writeJSON(w, http.StatusOK, result.body)
|
||||
source := "first_attempt"
|
||||
if attempts > 0 {
|
||||
source = "synthetic_retry"
|
||||
}
|
||||
config.Logger.Info("[openai_empty_retry] completed", "surface", "chat.completions", "stream", false, "retry_attempts", attempts, "success_source", source)
|
||||
}
|
||||
|
||||
func chatFinishReason(respBody map[string]any) string {
|
||||
if choices, ok := respBody["choices"].([]map[string]any); ok && len(choices) > 0 {
|
||||
if fr, _ := choices[0]["finish_reason"].(string); strings.TrimSpace(fr) != "" {
|
||||
return fr
|
||||
retryEnabled := h != nil && h.DS != nil && emptyOutputRetryEnabled()
|
||||
result, outErr := completionruntime.ExecuteNonStreamStartedWithRetry(ctx, h.DS, a, completionruntime.StartResult{
|
||||
SessionID: completionID,
|
||||
Payload: payload,
|
||||
Pow: pow,
|
||||
Response: resp,
|
||||
Request: stdReq,
|
||||
}, completionruntime.Options{
|
||||
RetryEnabled: retryEnabled,
|
||||
RetryMaxAttempts: emptyOutputRetryMaxAttempts(),
|
||||
})
|
||||
if outErr != nil {
|
||||
if historySession != nil {
|
||||
historySession.error(outErr.Status, outErr.Message, outErr.Code, historyThinkingForArchive(result.Turn.RawThinking, result.Turn.DetectionThinking, result.Turn.Thinking), historyTextForArchive(result.Turn.RawText, result.Turn.Text))
|
||||
}
|
||||
writeOpenAIErrorWithCode(w, outErr.Status, outErr.Message, outErr.Code)
|
||||
return
|
||||
}
|
||||
return "stop"
|
||||
respBody := openaifmt.BuildChatCompletionWithToolCalls(result.SessionID, model, result.Turn.Prompt, result.Turn.Thinking, result.Turn.Text, result.Turn.ToolCalls, toolsRaw)
|
||||
respBody["usage"] = assistantturn.OpenAIChatUsage(result.Turn)
|
||||
outcome := assistantturn.FinalizeTurn(result.Turn, assistantturn.FinalizeOptions{})
|
||||
if historySession != nil {
|
||||
historySession.success(http.StatusOK, historyThinkingForArchive(result.Turn.RawThinking, result.Turn.DetectionThinking, result.Turn.Thinking), historyTextForArchive(result.Turn.RawText, result.Turn.Text), outcome.FinishReason, assistantturn.OpenAIChatUsage(result.Turn))
|
||||
}
|
||||
writeJSON(w, http.StatusOK, respBody)
|
||||
}
|
||||
|
||||
func shouldRetryChatNonStream(result chatNonStreamResult, attempts int) bool {
|
||||
return emptyOutputRetryEnabled() &&
|
||||
attempts < emptyOutputRetryMaxAttempts() &&
|
||||
!result.contentFilter &&
|
||||
result.detectedCalls == 0 &&
|
||||
strings.TrimSpace(result.text) == ""
|
||||
}
|
||||
|
||||
func (h *Handler) handleStreamWithRetry(w http.ResponseWriter, r *http.Request, a *auth.RequestAuth, resp *http.Response, payload map[string]any, pow, completionID, model, finalPrompt string, refFileTokens int, thinkingEnabled, searchEnabled bool, toolNames []string, toolsRaw any, toolChoice promptcompat.ToolChoicePolicy, historySession *chatHistorySession) {
|
||||
func (h *Handler) handleStreamWithRetry(w http.ResponseWriter, r *http.Request, a *auth.RequestAuth, resp *http.Response, payload map[string]any, pow, completionID string, sessionIDRef *string, model, finalPrompt string, refFileTokens int, thinkingEnabled, searchEnabled bool, toolNames []string, toolsRaw any, toolChoice promptcompat.ToolChoicePolicy, historySession *chatHistorySession) {
|
||||
streamRuntime, initialType, ok := h.prepareChatStreamRuntime(w, resp, completionID, model, finalPrompt, refFileTokens, thinkingEnabled, searchEnabled, toolNames, toolsRaw, toolChoice, historySession)
|
||||
if !ok {
|
||||
return
|
||||
}
|
||||
attempts := 0
|
||||
currentResp := resp
|
||||
for {
|
||||
terminalWritten, retryable := h.consumeChatStreamAttempt(r, currentResp, streamRuntime, initialType, thinkingEnabled, historySession, attempts < emptyOutputRetryMaxAttempts())
|
||||
if terminalWritten {
|
||||
logChatStreamTerminal(streamRuntime, attempts)
|
||||
return
|
||||
}
|
||||
if !retryable || !emptyOutputRetryEnabled() || attempts >= emptyOutputRetryMaxAttempts() {
|
||||
completionruntime.ExecuteStreamWithRetry(r.Context(), h.DS, a, resp, payload, pow, completionruntime.StreamRetryOptions{
|
||||
Surface: "chat.completions",
|
||||
Stream: true,
|
||||
RetryEnabled: emptyOutputRetryEnabled(),
|
||||
RetryMaxAttempts: emptyOutputRetryMaxAttempts(),
|
||||
MaxAttempts: 3,
|
||||
UsagePrompt: finalPrompt,
|
||||
}, completionruntime.StreamRetryHooks{
|
||||
ConsumeAttempt: func(currentResp *http.Response, allowDeferEmpty bool) (bool, bool) {
|
||||
return h.consumeChatStreamAttempt(r, currentResp, streamRuntime, initialType, thinkingEnabled, historySession, allowDeferEmpty)
|
||||
},
|
||||
Finalize: func(attempts int) {
|
||||
streamRuntime.finalize("stop", false)
|
||||
recordChatStreamHistory(streamRuntime, historySession)
|
||||
config.Logger.Info("[openai_empty_retry] terminal empty output", "surface", "chat.completions", "stream", true, "retry_attempts", attempts, "success_source", "none")
|
||||
return
|
||||
}
|
||||
attempts++
|
||||
config.Logger.Info("[openai_empty_retry] attempting synthetic retry", "surface", "chat.completions", "stream", true, "retry_attempt", attempts, "parent_message_id", streamRuntime.responseMessageID)
|
||||
retryPow, powErr := h.DS.GetPow(r.Context(), a, 3)
|
||||
if powErr != nil {
|
||||
config.Logger.Warn("[openai_empty_retry] retry PoW fetch failed, falling back to original PoW", "surface", "chat.completions", "stream", true, "retry_attempt", attempts, "error", powErr)
|
||||
retryPow = pow
|
||||
}
|
||||
nextResp, err := h.DS.CallCompletion(r.Context(), a, clonePayloadForEmptyOutputRetry(payload, streamRuntime.responseMessageID), retryPow, 3)
|
||||
if err != nil {
|
||||
failChatStreamRetry(streamRuntime, historySession, http.StatusInternalServerError, "Failed to get completion.", "error")
|
||||
config.Logger.Warn("[openai_empty_retry] retry request failed", "surface", "chat.completions", "stream", true, "retry_attempt", attempts, "error", err)
|
||||
return
|
||||
}
|
||||
if nextResp.StatusCode != http.StatusOK {
|
||||
defer func() { _ = nextResp.Body.Close() }()
|
||||
body, _ := io.ReadAll(nextResp.Body)
|
||||
failChatStreamRetry(streamRuntime, historySession, nextResp.StatusCode, string(body), "error")
|
||||
return
|
||||
}
|
||||
streamRuntime.finalPrompt = usagePromptWithEmptyOutputRetry(finalPrompt, attempts)
|
||||
currentResp = nextResp
|
||||
}
|
||||
},
|
||||
ParentMessageID: func() int {
|
||||
return streamRuntime.responseMessageID
|
||||
},
|
||||
OnRetryPrompt: func(prompt string) {
|
||||
streamRuntime.finalPrompt = prompt
|
||||
},
|
||||
OnRetryFailure: func(status int, message, code string) {
|
||||
failChatStreamRetry(streamRuntime, historySession, status, message, code)
|
||||
},
|
||||
OnAccountSwitch: func(sessionID string) {
|
||||
if sessionIDRef != nil {
|
||||
*sessionIDRef = sessionID
|
||||
}
|
||||
},
|
||||
OnTerminal: func(attempts int) {
|
||||
logChatStreamTerminal(streamRuntime, attempts)
|
||||
},
|
||||
})
|
||||
}
|
||||
|
||||
func (h *Handler) prepareChatStreamRuntime(w http.ResponseWriter, resp *http.Response, completionID, model, finalPrompt string, refFileTokens int, thinkingEnabled, searchEnabled bool, toolNames []string, toolsRaw any, toolChoice promptcompat.ToolChoicePolicy, historySession *chatHistorySession) (*chatStreamRuntime, string, bool) {
|
||||
|
||||
@@ -106,10 +106,6 @@ func cleanVisibleOutput(text string, stripReferenceMarkers bool) string {
|
||||
return shared.CleanVisibleOutput(text, stripReferenceMarkers)
|
||||
}
|
||||
|
||||
func upstreamEmptyOutputDetail(contentFilter bool, text, thinking string) (int, string, string) {
|
||||
return shared.UpstreamEmptyOutputDetail(contentFilter, text, thinking)
|
||||
}
|
||||
|
||||
func emptyOutputRetryEnabled() bool {
|
||||
return shared.EmptyOutputRetryEnabled()
|
||||
}
|
||||
@@ -118,14 +114,6 @@ func emptyOutputRetryMaxAttempts() int {
|
||||
return shared.EmptyOutputRetryMaxAttempts()
|
||||
}
|
||||
|
||||
func clonePayloadForEmptyOutputRetry(payload map[string]any, parentMessageID int) map[string]any {
|
||||
return shared.ClonePayloadForEmptyOutputRetry(payload, parentMessageID)
|
||||
}
|
||||
|
||||
func usagePromptWithEmptyOutputRetry(originalPrompt string, retryAttempts int) string {
|
||||
return shared.UsagePromptWithEmptyOutputRetry(originalPrompt, retryAttempts)
|
||||
}
|
||||
|
||||
func formatIncrementalStreamToolCallDeltas(deltas []toolstream.ToolCallDelta, ids map[int]string) []map[string]any {
|
||||
return shared.FormatIncrementalStreamToolCallDeltas(deltas, ids)
|
||||
}
|
||||
@@ -137,7 +125,3 @@ func filterIncrementalToolCallDeltasByAllowed(deltas []toolstream.ToolCallDelta,
|
||||
func formatFinalStreamToolCallsWithStableIDs(calls []toolcall.ParsedToolCall, ids map[int]string, toolsRaw any) []map[string]any {
|
||||
return shared.FormatFinalStreamToolCallsWithStableIDs(calls, ids, toolsRaw)
|
||||
}
|
||||
|
||||
func detectAssistantToolCalls(rawText, visibleText, exposedThinking, detectionThinking string, toolNames []string) toolcall.ToolCallParseResult {
|
||||
return shared.DetectAssistantToolCalls(rawText, visibleText, exposedThinking, detectionThinking, toolNames)
|
||||
}
|
||||
|
||||
@@ -114,7 +114,7 @@ func (h *Handler) ChatCompletions(w http.ResponseWriter, r *http.Request) {
|
||||
}
|
||||
streamReq := start.Request
|
||||
refFileTokens := streamReq.RefFileTokens
|
||||
h.handleStreamWithRetry(w, r, a, start.Response, start.Payload, start.Pow, sessionID, streamReq.ResponseModel, streamReq.PromptTokenText, refFileTokens, streamReq.Thinking, streamReq.Search, streamReq.ToolNames, streamReq.ToolsRaw, streamReq.ToolChoice, historySession)
|
||||
h.handleStreamWithRetry(w, r, a, start.Response, start.Payload, start.Pow, sessionID, &sessionID, streamReq.ResponseModel, streamReq.PromptTokenText, refFileTokens, streamReq.Thinking, streamReq.Search, streamReq.ToolNames, streamReq.ToolsRaw, streamReq.ToolChoice, historySession)
|
||||
}
|
||||
|
||||
func (h *Handler) autoDeleteRemoteSession(ctx context.Context, a *auth.RequestAuth, sessionID string) {
|
||||
|
||||
@@ -85,8 +85,7 @@ func streamFinishReason(frames []map[string]any) string {
|
||||
return ""
|
||||
}
|
||||
|
||||
// Backward-compatible alias for historical test name used in CI logs.
|
||||
func TestHandleNonStreamReturns429WhenUpstreamOutputEmpty(t *testing.T) {
|
||||
func TestHandleNonStreamSingleAttemptReturns503WhenUpstreamOutputEmpty(t *testing.T) {
|
||||
h := &Handler{}
|
||||
resp := makeSSEHTTPResponse(
|
||||
`data: {"p":"response/content","v":""}`,
|
||||
@@ -95,17 +94,17 @@ func TestHandleNonStreamReturns429WhenUpstreamOutputEmpty(t *testing.T) {
|
||||
rec := httptest.NewRecorder()
|
||||
|
||||
h.handleNonStream(rec, resp, "cid-empty", "deepseek-v4-flash", "prompt", 0, false, false, nil, nil, nil)
|
||||
if rec.Code != http.StatusTooManyRequests {
|
||||
t.Fatalf("expected status 429 for empty upstream output, got %d body=%s", rec.Code, rec.Body.String())
|
||||
if rec.Code != http.StatusServiceUnavailable {
|
||||
t.Fatalf("expected status 503 for empty upstream output, got %d body=%s", rec.Code, rec.Body.String())
|
||||
}
|
||||
out := decodeJSONBody(t, rec.Body.String())
|
||||
errObj, _ := out["error"].(map[string]any)
|
||||
if asString(errObj["code"]) != "upstream_empty_output" {
|
||||
t.Fatalf("expected code=upstream_empty_output, got %#v", out)
|
||||
if asString(errObj["code"]) != "upstream_unavailable" {
|
||||
t.Fatalf("expected code=upstream_unavailable, got %#v", out)
|
||||
}
|
||||
}
|
||||
|
||||
func TestHandleNonStreamReturnsContentFilterErrorWhenUpstreamFilteredWithoutOutput(t *testing.T) {
|
||||
func TestHandleNonStreamSingleAttemptReturnsContentFilterErrorWhenUpstreamFilteredWithoutOutput(t *testing.T) {
|
||||
h := &Handler{}
|
||||
resp := makeSSEHTTPResponse(
|
||||
`data: {"code":"content_filter"}`,
|
||||
@@ -124,7 +123,7 @@ func TestHandleNonStreamReturnsContentFilterErrorWhenUpstreamFilteredWithoutOutp
|
||||
}
|
||||
}
|
||||
|
||||
func TestHandleNonStreamReturns429WhenUpstreamHasOnlyThinking(t *testing.T) {
|
||||
func TestHandleNonStreamSingleAttemptReturns429WhenUpstreamHasOnlyThinking(t *testing.T) {
|
||||
h := &Handler{}
|
||||
resp := makeSSEHTTPResponse(
|
||||
`data: {"p":"response/thinking_content","v":"Only thinking"}`,
|
||||
|
||||
@@ -1,26 +0,0 @@
|
||||
package chat
|
||||
|
||||
// addRefFileTokensToUsage adds inline-uploaded file token estimates to an existing
|
||||
// usage map inside a response object. This keeps the token accounting aware of file
|
||||
// content that the upstream model processes but that is not part of the prompt text.
|
||||
func addRefFileTokensToUsage(obj map[string]any, refFileTokens int) {
|
||||
if refFileTokens <= 0 || obj == nil {
|
||||
return
|
||||
}
|
||||
usage, ok := obj["usage"].(map[string]any)
|
||||
if !ok || usage == nil {
|
||||
return
|
||||
}
|
||||
for _, key := range []string{"input_tokens", "prompt_tokens"} {
|
||||
if v, ok := usage[key]; ok {
|
||||
if n, ok := v.(int); ok {
|
||||
usage[key] = n + refFileTokens
|
||||
}
|
||||
}
|
||||
}
|
||||
if v, ok := usage["total_tokens"]; ok {
|
||||
if n, ok := v.(int); ok {
|
||||
usage["total_tokens"] = n + refFileTokens
|
||||
}
|
||||
}
|
||||
}
|
||||
@@ -84,7 +84,7 @@ func TestBuildOpenAICurrentInputContextTranscriptUsesNumberedHistorySections(t *
|
||||
"latest user turn",
|
||||
"[reasoning_content]",
|
||||
"hidden reasoning",
|
||||
"<|DSML|tool_calls>",
|
||||
"<|DSML|tool_calls>",
|
||||
} {
|
||||
if !strings.Contains(transcript, want) {
|
||||
t.Fatalf("expected transcript to contain %q, got %q", want, transcript)
|
||||
|
||||
@@ -7,6 +7,7 @@ import (
|
||||
"time"
|
||||
|
||||
"ds2api/internal/auth"
|
||||
"ds2api/internal/completionruntime"
|
||||
"ds2api/internal/config"
|
||||
dsprotocol "ds2api/internal/deepseek/protocol"
|
||||
"ds2api/internal/promptcompat"
|
||||
@@ -19,41 +20,34 @@ func (h *Handler) handleResponsesStreamWithRetry(w http.ResponseWriter, r *http.
|
||||
if !ok {
|
||||
return
|
||||
}
|
||||
attempts := 0
|
||||
currentResp := resp
|
||||
for {
|
||||
terminalWritten, retryable := h.consumeResponsesStreamAttempt(r, currentResp, streamRuntime, initialType, thinkingEnabled, attempts < emptyOutputRetryMaxAttempts())
|
||||
if terminalWritten {
|
||||
logResponsesStreamTerminal(streamRuntime, attempts)
|
||||
return
|
||||
}
|
||||
if !retryable || !emptyOutputRetryEnabled() || attempts >= emptyOutputRetryMaxAttempts() {
|
||||
completionruntime.ExecuteStreamWithRetry(r.Context(), h.DS, a, resp, payload, pow, completionruntime.StreamRetryOptions{
|
||||
Surface: "responses",
|
||||
Stream: true,
|
||||
RetryEnabled: emptyOutputRetryEnabled(),
|
||||
RetryMaxAttempts: emptyOutputRetryMaxAttempts(),
|
||||
MaxAttempts: 3,
|
||||
UsagePrompt: finalPrompt,
|
||||
}, completionruntime.StreamRetryHooks{
|
||||
ConsumeAttempt: func(currentResp *http.Response, allowDeferEmpty bool) (bool, bool) {
|
||||
return h.consumeResponsesStreamAttempt(r, currentResp, streamRuntime, initialType, thinkingEnabled, allowDeferEmpty)
|
||||
},
|
||||
Finalize: func(attempts int) {
|
||||
streamRuntime.finalize("stop", false)
|
||||
config.Logger.Info("[openai_empty_retry] terminal empty output", "surface", "responses", "stream", true, "retry_attempts", attempts, "success_source", "none", "error_code", streamRuntime.finalErrorCode)
|
||||
return
|
||||
}
|
||||
attempts++
|
||||
config.Logger.Info("[openai_empty_retry] attempting synthetic retry", "surface", "responses", "stream", true, "retry_attempt", attempts, "parent_message_id", streamRuntime.responseMessageID)
|
||||
retryPow, powErr := h.DS.GetPow(r.Context(), a, 3)
|
||||
if powErr != nil {
|
||||
config.Logger.Warn("[openai_empty_retry] retry PoW fetch failed, falling back to original PoW", "surface", "responses", "stream", true, "retry_attempt", attempts, "error", powErr)
|
||||
retryPow = pow
|
||||
}
|
||||
nextResp, err := h.DS.CallCompletion(r.Context(), a, clonePayloadForEmptyOutputRetry(payload, streamRuntime.responseMessageID), retryPow, 3)
|
||||
if err != nil {
|
||||
streamRuntime.failResponse(http.StatusInternalServerError, "Failed to get completion.", "error")
|
||||
config.Logger.Warn("[openai_empty_retry] retry request failed", "surface", "responses", "stream", true, "retry_attempt", attempts, "error", err)
|
||||
return
|
||||
}
|
||||
if nextResp.StatusCode != http.StatusOK {
|
||||
defer func() { _ = nextResp.Body.Close() }()
|
||||
body, _ := io.ReadAll(nextResp.Body)
|
||||
streamRuntime.failResponse(nextResp.StatusCode, strings.TrimSpace(string(body)), "error")
|
||||
return
|
||||
}
|
||||
streamRuntime.finalPrompt = usagePromptWithEmptyOutputRetry(finalPrompt, attempts)
|
||||
currentResp = nextResp
|
||||
}
|
||||
},
|
||||
ParentMessageID: func() int {
|
||||
return streamRuntime.responseMessageID
|
||||
},
|
||||
OnRetryPrompt: func(prompt string) {
|
||||
streamRuntime.finalPrompt = prompt
|
||||
},
|
||||
OnRetryFailure: func(status int, message, code string) {
|
||||
streamRuntime.failResponse(status, strings.TrimSpace(message), code)
|
||||
},
|
||||
OnTerminal: func(attempts int) {
|
||||
logResponsesStreamTerminal(streamRuntime, attempts)
|
||||
},
|
||||
})
|
||||
}
|
||||
|
||||
func (h *Handler) prepareResponsesStreamRuntime(w http.ResponseWriter, resp *http.Response, owner, responseID, model, finalPrompt string, refFileTokens int, thinkingEnabled, searchEnabled bool, toolNames []string, toolsRaw any, toolChoice promptcompat.ToolChoicePolicy, traceID string, historySession *responsehistory.Session) (*responsesStreamRuntime, string, bool) {
|
||||
|
||||
@@ -103,14 +103,6 @@ func emptyOutputRetryMaxAttempts() int {
|
||||
return shared.EmptyOutputRetryMaxAttempts()
|
||||
}
|
||||
|
||||
func clonePayloadForEmptyOutputRetry(payload map[string]any, parentMessageID int) map[string]any {
|
||||
return shared.ClonePayloadForEmptyOutputRetry(payload, parentMessageID)
|
||||
}
|
||||
|
||||
func usagePromptWithEmptyOutputRetry(originalPrompt string, retryAttempts int) string {
|
||||
return shared.UsagePromptWithEmptyOutputRetry(originalPrompt, retryAttempts)
|
||||
}
|
||||
|
||||
func filterIncrementalToolCallDeltasByAllowed(deltas []toolstream.ToolCallDelta, seenNames map[int]string) []toolstream.ToolCallDelta {
|
||||
return shared.FilterIncrementalToolCallDeltasByAllowed(deltas, seenNames)
|
||||
}
|
||||
|
||||
@@ -81,6 +81,22 @@ func (s *responsesStreamRuntime) buildCompletedResponseObject(finalThinking, fin
|
||||
},
|
||||
},
|
||||
})
|
||||
} else if len(calls) > 0 && strings.TrimSpace(finalThinking) != "" {
|
||||
indexed = append(indexed, indexedItem{
|
||||
index: s.ensureMessageOutputIndex(),
|
||||
item: map[string]any{
|
||||
"id": s.ensureMessageItemID(),
|
||||
"type": "message",
|
||||
"role": "assistant",
|
||||
"status": "completed",
|
||||
"content": []map[string]any{
|
||||
{
|
||||
"type": "reasoning",
|
||||
"text": finalThinking,
|
||||
},
|
||||
},
|
||||
},
|
||||
})
|
||||
} else if len(calls) == 0 {
|
||||
content := make([]map[string]any, 0, 2)
|
||||
if finalThinking != "" {
|
||||
|
||||
@@ -397,7 +397,7 @@ func TestHandleResponsesNonStreamRequiredToolChoiceIgnoresThinkingToolPayloadWhe
|
||||
}
|
||||
}
|
||||
|
||||
func TestHandleResponsesNonStreamReturns429WhenUpstreamOutputEmpty(t *testing.T) {
|
||||
func TestHandleResponsesNonStreamSingleAttemptReturns503WhenUpstreamOutputEmpty(t *testing.T) {
|
||||
h := &Handler{}
|
||||
rec := httptest.NewRecorder()
|
||||
resp := &http.Response{
|
||||
@@ -409,17 +409,17 @@ func TestHandleResponsesNonStreamReturns429WhenUpstreamOutputEmpty(t *testing.T)
|
||||
}
|
||||
|
||||
h.handleResponsesNonStream(rec, resp, "owner-a", "resp_test", "deepseek-v4-flash", "prompt", 0, false, false, nil, nil, promptcompat.DefaultToolChoicePolicy(), "")
|
||||
if rec.Code != http.StatusTooManyRequests {
|
||||
t.Fatalf("expected 429 for empty upstream output, got %d body=%s", rec.Code, rec.Body.String())
|
||||
if rec.Code != http.StatusServiceUnavailable {
|
||||
t.Fatalf("expected 503 for empty upstream output, got %d body=%s", rec.Code, rec.Body.String())
|
||||
}
|
||||
out := decodeJSONBody(t, rec.Body.String())
|
||||
errObj, _ := out["error"].(map[string]any)
|
||||
if asString(errObj["code"]) != "upstream_empty_output" {
|
||||
t.Fatalf("expected code=upstream_empty_output, got %#v", out)
|
||||
if asString(errObj["code"]) != "upstream_unavailable" {
|
||||
t.Fatalf("expected code=upstream_unavailable, got %#v", out)
|
||||
}
|
||||
}
|
||||
|
||||
func TestHandleResponsesNonStreamReturnsContentFilterErrorWhenUpstreamFilteredWithoutOutput(t *testing.T) {
|
||||
func TestHandleResponsesNonStreamSingleAttemptReturnsContentFilterErrorWhenUpstreamFilteredWithoutOutput(t *testing.T) {
|
||||
h := &Handler{}
|
||||
rec := httptest.NewRecorder()
|
||||
resp := &http.Response{
|
||||
@@ -441,7 +441,7 @@ func TestHandleResponsesNonStreamReturnsContentFilterErrorWhenUpstreamFilteredWi
|
||||
}
|
||||
}
|
||||
|
||||
func TestHandleResponsesNonStreamReturns429WhenUpstreamHasOnlyThinking(t *testing.T) {
|
||||
func TestHandleResponsesNonStreamSingleAttemptReturns429WhenUpstreamHasOnlyThinking(t *testing.T) {
|
||||
h := &Handler{}
|
||||
rec := httptest.NewRecorder()
|
||||
resp := &http.Response{
|
||||
|
||||
@@ -17,7 +17,7 @@ func UpstreamEmptyOutputDetail(contentFilter bool, text, thinking string) (int,
|
||||
if thinking != "" {
|
||||
return http.StatusTooManyRequests, "Upstream account hit a rate limit and returned reasoning without visible output.", "upstream_empty_output"
|
||||
}
|
||||
return http.StatusTooManyRequests, "Upstream account hit a rate limit and returned empty output.", "upstream_empty_output"
|
||||
return http.StatusServiceUnavailable, "Upstream service is unavailable and returned no output.", "upstream_unavailable"
|
||||
}
|
||||
|
||||
func WriteUpstreamEmptyOutputError(w http.ResponseWriter, text, thinking string, contentFilter bool) bool {
|
||||
|
||||
@@ -274,12 +274,12 @@ func TestChatCompletionsStreamEmitsFailureFrameWhenUpstreamOutputEmpty(t *testin
|
||||
}
|
||||
last := frames[0]
|
||||
statusCode, ok := last["status_code"].(float64)
|
||||
if !ok || int(statusCode) != http.StatusTooManyRequests {
|
||||
t.Fatalf("expected status_code=429, got %#v body=%s", last["status_code"], rec.Body.String())
|
||||
if !ok || int(statusCode) != http.StatusServiceUnavailable {
|
||||
t.Fatalf("expected status_code=503, got %#v body=%s", last["status_code"], rec.Body.String())
|
||||
}
|
||||
errObj, _ := last["error"].(map[string]any)
|
||||
if asString(errObj["code"]) != "upstream_empty_output" {
|
||||
t.Fatalf("expected code=upstream_empty_output, got %#v", last)
|
||||
if asString(errObj["code"]) != "upstream_unavailable" {
|
||||
t.Fatalf("expected code=upstream_unavailable, got %#v", last)
|
||||
}
|
||||
}
|
||||
|
||||
@@ -345,7 +345,7 @@ func TestChatCompletionsStreamRetriesEmptyOutputOnSameSession(t *testing.T) {
|
||||
|
||||
func TestChatCompletionsNonStreamRetriesThinkingOnlyOutput(t *testing.T) {
|
||||
ds := &streamStatusDSSeqStub{resps: []*http.Response{
|
||||
makeOpenAISSEHTTPResponse(`data: {"response_message_id":99}`, "data: [DONE]"),
|
||||
makeOpenAISSEHTTPResponse(`data: {"response_message_id":99,"p":"response/thinking_content","v":"plan"}`, "data: [DONE]"),
|
||||
makeOpenAISSEHTTPResponse(`data: {"p":"response/content","v":"visible"}`, "data: [DONE]"),
|
||||
}}
|
||||
h := &openAITestSurface{
|
||||
@@ -496,7 +496,7 @@ func TestResponsesStreamRetriesThinkingOnlyOutput(t *testing.T) {
|
||||
|
||||
func TestResponsesNonStreamRetriesThinkingOnlyOutput(t *testing.T) {
|
||||
ds := &streamStatusDSSeqStub{resps: []*http.Response{
|
||||
makeOpenAISSEHTTPResponse(`data: {"response_message_id":88}`, "data: [DONE]"),
|
||||
makeOpenAISSEHTTPResponse(`data: {"response_message_id":88,"p":"response/thinking_content","v":"plan"}`, "data: [DONE]"),
|
||||
makeOpenAISSEHTTPResponse(`data: {"p":"response/content","v":"visible"}`, "data: [DONE]"),
|
||||
}}
|
||||
h := &openAITestSurface{
|
||||
@@ -537,8 +537,15 @@ func TestResponsesNonStreamRetriesThinkingOnlyOutput(t *testing.T) {
|
||||
if len(content) == 0 {
|
||||
t.Fatalf("expected content entries, got %#v", item)
|
||||
}
|
||||
textEntry, _ := content[0].(map[string]any)
|
||||
if asString(textEntry["type"]) != "output_text" || asString(textEntry["text"]) != "visible" {
|
||||
var textEntry map[string]any
|
||||
for _, entry := range content {
|
||||
obj, _ := entry.(map[string]any)
|
||||
if asString(obj["type"]) == "output_text" {
|
||||
textEntry = obj
|
||||
break
|
||||
}
|
||||
}
|
||||
if asString(textEntry["text"]) != "visible" {
|
||||
t.Fatalf("expected visible text entry, got %#v", content)
|
||||
}
|
||||
}
|
||||
|
||||
@@ -641,9 +641,9 @@ function upstreamEmptyOutputDetail(contentFilter, _text, thinking) {
|
||||
};
|
||||
}
|
||||
return {
|
||||
status: 429,
|
||||
message: 'Upstream account hit a rate limit and returned empty output.',
|
||||
code: 'upstream_empty_output',
|
||||
status: 503,
|
||||
message: 'Upstream service is unavailable and returned no output.',
|
||||
code: 'upstream_unavailable',
|
||||
};
|
||||
}
|
||||
|
||||
|
||||
@@ -113,9 +113,10 @@ function filterToolCallsDetailed(parsed, toolNames) {
|
||||
if (!tc || !tc.name) {
|
||||
continue;
|
||||
}
|
||||
const input = tc.input && typeof tc.input === 'object' && !Array.isArray(tc.input) ? tc.input : {};
|
||||
calls.push({
|
||||
name: tc.name,
|
||||
input: tc.input && typeof tc.input === 'object' && !Array.isArray(tc.input) ? tc.input : {},
|
||||
input,
|
||||
});
|
||||
}
|
||||
return { calls, rejectedToolNames: [] };
|
||||
|
||||
@@ -1,6 +1,6 @@
|
||||
'use strict';
|
||||
|
||||
const CDATA_PATTERN = /^<!\[CDATA\[([\s\S]*?)]]>$/i;
|
||||
const CDATA_PATTERN = /^(?:<|〈)!\[CDATA\[([\s\S]*?)]](?:>|>|〉)$/i;
|
||||
const XML_ATTR_PATTERN = /\b([a-z0-9_:-]+)\s*=\s*("([^"]*)"|'([^']*)')/gi;
|
||||
const TOOL_MARKUP_NAMES = [
|
||||
{ raw: 'tool_calls', canonical: 'tool_calls' },
|
||||
@@ -102,9 +102,10 @@ function updateCDATAStateLine(inCDATA, line) {
|
||||
let state = inCDATA;
|
||||
while (pos < lower.length) {
|
||||
if (state) {
|
||||
const end = lower.indexOf(']]>', pos);
|
||||
const cdataEnd = findCDATAEnd(lower, pos);
|
||||
const end = cdataEnd.index;
|
||||
if (end < 0) return true;
|
||||
pos = end + ']]>'.length;
|
||||
pos = end + cdataEnd.len;
|
||||
state = false;
|
||||
continue;
|
||||
}
|
||||
@@ -252,8 +253,9 @@ function replaceDSMLToolMarkupOutsideIgnored(text) {
|
||||
const tag = scanToolMarkupTagAt(raw, i);
|
||||
if (tag) {
|
||||
if (tag.dsmlLike) {
|
||||
out += `<${tag.closing ? '/' : ''}${tag.name}${raw.slice(tag.nameEnd, tag.end + 1)}`;
|
||||
if (raw[tag.end] !== '>') {
|
||||
const tail = normalizeToolMarkupTagTailForXML(raw.slice(tag.nameEnd, tag.end + 1));
|
||||
out += `<${tag.closing ? '/' : ''}${tag.name}${tail}`;
|
||||
if (!tail.endsWith('>')) {
|
||||
out += '>';
|
||||
}
|
||||
} else {
|
||||
@@ -409,11 +411,12 @@ function findMatchingXmlEndTagOutsideCDATA(text, tag, from) {
|
||||
|
||||
function skipXmlIgnoredSection(lower, i) {
|
||||
if (lower.startsWith('<![cdata[', i)) {
|
||||
const end = lower.indexOf(']]>', i + '<![cdata['.length);
|
||||
const cdataEnd = findCDATAEnd(lower, i + '<![cdata['.length);
|
||||
const end = cdataEnd.index;
|
||||
if (end < 0) {
|
||||
return { advanced: false, blocked: true, next: i };
|
||||
}
|
||||
return { advanced: true, blocked: false, next: end + ']]>'.length };
|
||||
return { advanced: true, blocked: false, next: end + cdataEnd.len };
|
||||
}
|
||||
if (lower.startsWith('<!--', i)) {
|
||||
const end = lower.indexOf('-->', i + '<!--'.length);
|
||||
@@ -425,14 +428,34 @@ function skipXmlIgnoredSection(lower, i) {
|
||||
return { advanced: false, blocked: false, next: i };
|
||||
}
|
||||
|
||||
function findCDATAEnd(text, from) {
|
||||
const ascii = text.indexOf(']]>', from);
|
||||
const fullwidth = text.indexOf(']]>', from);
|
||||
const cjk = text.indexOf(']]〉', from);
|
||||
if (ascii < 0 && fullwidth < 0 && cjk < 0) {
|
||||
return { index: -1, len: 0 };
|
||||
}
|
||||
let best = { index: -1, len: 0 };
|
||||
for (const candidate of [
|
||||
{ index: ascii, len: ']]>'.length },
|
||||
{ index: fullwidth, len: ']]>'.length },
|
||||
{ index: cjk, len: ']]〉'.length },
|
||||
]) {
|
||||
if (candidate.index >= 0 && (best.index < 0 || candidate.index < best.index)) {
|
||||
best = candidate;
|
||||
}
|
||||
}
|
||||
return best;
|
||||
}
|
||||
|
||||
function scanToolMarkupTagAt(text, start) {
|
||||
const raw = toStringSafe(text);
|
||||
if (!raw || start < 0 || start >= raw.length || raw[start] !== '<') {
|
||||
if (!raw || start < 0 || start >= raw.length || normalizeFullwidthASCIIChar(raw[start]) !== '<') {
|
||||
return null;
|
||||
}
|
||||
const lower = raw.toLowerCase();
|
||||
let i = start + 1;
|
||||
while (i < raw.length && raw[i] === '<') {
|
||||
while (i < raw.length && normalizeFullwidthASCIIChar(raw[i]) === '<') {
|
||||
i += 1;
|
||||
}
|
||||
const closing = raw[i] === '/';
|
||||
@@ -440,11 +463,19 @@ function scanToolMarkupTagAt(text, start) {
|
||||
i += 1;
|
||||
}
|
||||
const prefix = consumeToolMarkupNamePrefix(raw, lower, i);
|
||||
const prefixStart = i;
|
||||
i = prefix.next;
|
||||
const dsmlLike = prefix.dsmlLike;
|
||||
const { name, len } = matchToolMarkupName(lower, i, dsmlLike);
|
||||
let dsmlLike = prefix.dsmlLike;
|
||||
let { name, len } = matchToolMarkupName(raw, i, dsmlLike);
|
||||
if (!name) {
|
||||
return null;
|
||||
const fallback = matchToolMarkupNameAfterArbitraryPrefix(raw, prefixStart);
|
||||
if (!fallback.ok) {
|
||||
return null;
|
||||
}
|
||||
name = fallback.name;
|
||||
i = fallback.start;
|
||||
len = fallback.len;
|
||||
dsmlLike = true;
|
||||
}
|
||||
const originalNameEnd = i + len;
|
||||
let nameEnd = originalNameEnd;
|
||||
@@ -541,7 +572,7 @@ function findPartialToolMarkupStart(text) {
|
||||
}
|
||||
const start = includeDuplicateLeadingLessThan(raw, lastLT);
|
||||
const tail = raw.slice(start);
|
||||
if (tail.includes('>')) {
|
||||
if (tail.includes('>') || tail.includes('>')) {
|
||||
return -1;
|
||||
}
|
||||
return isPartialToolMarkupTagPrefix(tail) ? start : -1;
|
||||
@@ -556,7 +587,7 @@ function includeDuplicateLeadingLessThan(text, idx) {
|
||||
}
|
||||
|
||||
function isToolMarkupPipe(ch) {
|
||||
return ch === '|' || ch === '|';
|
||||
return ch === '|' || ch === '|' || ch === '␂' || ch === '\x02';
|
||||
}
|
||||
|
||||
function isPartialToolMarkupTagPrefix(text) {
|
||||
@@ -579,10 +610,13 @@ function isPartialToolMarkupTagPrefix(text) {
|
||||
if (i === raw.length) {
|
||||
return true;
|
||||
}
|
||||
if (hasToolMarkupNamePrefix(lower.slice(i))) {
|
||||
if (hasToolMarkupNamePrefix(raw, i)) {
|
||||
return true;
|
||||
}
|
||||
if ('dsml'.startsWith(lower.slice(i))) {
|
||||
if (normalizedASCIITailAt(raw, i).startsWith('dsml') || 'dsml'.startsWith(normalizedASCIITailAt(raw, i))) {
|
||||
return true;
|
||||
}
|
||||
if (hasPartialToolMarkupNameAfterArbitraryPrefix(raw, i)) {
|
||||
return true;
|
||||
}
|
||||
const next = consumeToolMarkupNamePrefixOnce(raw, lower, i);
|
||||
@@ -607,6 +641,61 @@ function consumeToolMarkupNamePrefix(raw, lower, idx) {
|
||||
}
|
||||
}
|
||||
|
||||
function matchToolMarkupNameAfterArbitraryPrefix(raw, start) {
|
||||
for (let idx = start; idx < raw.length;) {
|
||||
if (isToolMarkupTagTerminator(raw, idx)) {
|
||||
return { ok: false };
|
||||
}
|
||||
for (const name of TOOL_MARKUP_NAMES) {
|
||||
const matched = matchNormalizedASCII(raw, idx, name.raw);
|
||||
if (!matched.ok) continue;
|
||||
if (!toolMarkupPrefixAllowsLocalName(raw.slice(start, idx))) continue;
|
||||
return { ok: true, name: name.canonical, start: idx, len: matched.len };
|
||||
}
|
||||
idx += 1;
|
||||
}
|
||||
return { ok: false };
|
||||
}
|
||||
|
||||
function hasPartialToolMarkupNameAfterArbitraryPrefix(raw, start) {
|
||||
for (let idx = start; idx < raw.length;) {
|
||||
if (isToolMarkupTagTerminator(raw, idx)) {
|
||||
return false;
|
||||
}
|
||||
if (toolMarkupPrefixAllowsLocalName(raw.slice(start, idx)) && hasToolMarkupNamePrefix(raw, idx)) {
|
||||
return true;
|
||||
}
|
||||
if (toolMarkupPrefixAllowsLocalName(raw.slice(start, idx)) && hasDSMLNamePrefixOrPartial(raw, idx)) {
|
||||
return true;
|
||||
}
|
||||
idx += 1;
|
||||
}
|
||||
return toolMarkupPrefixAllowsLocalName(raw.slice(start));
|
||||
}
|
||||
|
||||
function hasDSMLNamePrefixOrPartial(raw, start) {
|
||||
const tail = normalizedASCIITailAt(raw, start);
|
||||
return tail.startsWith('dsml') || 'dsml'.startsWith(tail);
|
||||
}
|
||||
|
||||
function toolMarkupPrefixAllowsLocalName(prefix) {
|
||||
if (!prefix) {
|
||||
return false;
|
||||
}
|
||||
if (normalizedASCIITailAt(prefix, 0).includes('dsml')) {
|
||||
return true;
|
||||
}
|
||||
if (/[="'"]/.test(prefix)) {
|
||||
return false;
|
||||
}
|
||||
const previous = normalizeFullwidthASCIIChar(prefix[prefix.length - 1] || '');
|
||||
return !/^[A-Za-z0-9]$/.test(previous);
|
||||
}
|
||||
|
||||
function isToolMarkupTagTerminator(raw, idx) {
|
||||
return raw[idx] === '>' || normalizeFullwidthASCIIChar(raw[idx] || '') === '>';
|
||||
}
|
||||
|
||||
function consumeToolMarkupNamePrefixOnce(raw, lower, idx) {
|
||||
if (idx < raw.length && isToolMarkupPipe(raw[idx])) {
|
||||
return { next: idx + 1, ok: true };
|
||||
@@ -614,32 +703,87 @@ function consumeToolMarkupNamePrefixOnce(raw, lower, idx) {
|
||||
if (idx < raw.length && [' ', '\t', '\r', '\n'].includes(raw[idx])) {
|
||||
return { next: idx + 1, ok: true };
|
||||
}
|
||||
if (lower.startsWith('dsml', idx)) {
|
||||
let next = idx + 'dsml'.length;
|
||||
if (next < raw.length && raw[next] === '-') {
|
||||
const dsml = matchNormalizedASCII(raw, idx, 'dsml');
|
||||
if (dsml.ok) {
|
||||
let next = idx + dsml.len;
|
||||
const sep = normalizeFullwidthASCIIChar(raw[next] || '');
|
||||
if (next < raw.length && (sep === '-' || sep === '_')) {
|
||||
next += 1;
|
||||
}
|
||||
return { next, ok: true };
|
||||
}
|
||||
const arbitrary = consumeArbitraryToolMarkupNamePrefix(raw, lower, idx);
|
||||
if (arbitrary.ok) {
|
||||
return arbitrary;
|
||||
}
|
||||
return { next: idx, ok: false };
|
||||
}
|
||||
|
||||
function hasToolMarkupNamePrefix(lowerTail) {
|
||||
function consumeArbitraryToolMarkupNamePrefix(raw, lower, idx) {
|
||||
const first = consumeToolMarkupPrefixSegment(raw, idx);
|
||||
if (!first.ok) {
|
||||
return { next: idx, ok: false };
|
||||
}
|
||||
let j = first.next;
|
||||
while (j < raw.length) {
|
||||
const segment = consumeToolMarkupPrefixSegment(raw, j);
|
||||
if (!segment.ok) break;
|
||||
j = segment.next;
|
||||
}
|
||||
let k = j;
|
||||
while (k < raw.length && [' ', '\t', '\r', '\n'].includes(raw[k])) {
|
||||
k += 1;
|
||||
}
|
||||
let next = k;
|
||||
let ok = false;
|
||||
if (next < raw.length && isToolMarkupPipe(raw[next])) {
|
||||
next += 1;
|
||||
ok = true;
|
||||
} else if (next < raw.length && ['_', '-'].includes(normalizeFullwidthASCIIChar(raw[next]))) {
|
||||
next += 1;
|
||||
ok = true;
|
||||
}
|
||||
if (!ok) {
|
||||
return { next: idx, ok: false };
|
||||
}
|
||||
while (next < raw.length && [' ', '\t', '\r', '\n'].includes(raw[next])) {
|
||||
next += 1;
|
||||
}
|
||||
if (!hasToolMarkupNamePrefix(raw, next)) {
|
||||
return { next: idx, ok: false };
|
||||
}
|
||||
return { next, ok: true };
|
||||
}
|
||||
|
||||
function consumeToolMarkupPrefixSegment(raw, idx) {
|
||||
if (idx < 0 || idx >= raw.length) {
|
||||
return { next: idx, ok: false };
|
||||
}
|
||||
const ch = normalizeFullwidthASCIIChar(raw[idx]);
|
||||
if (/^[A-Za-z0-9]$/.test(ch)) {
|
||||
return { next: idx + 1, ok: true };
|
||||
}
|
||||
return { next: idx, ok: false };
|
||||
}
|
||||
|
||||
function hasToolMarkupNamePrefix(raw, start) {
|
||||
const tail = normalizedASCIITailAt(raw, start);
|
||||
for (const name of TOOL_MARKUP_NAMES) {
|
||||
if (lowerTail.startsWith(name.raw) || name.raw.startsWith(lowerTail)) {
|
||||
if (tail.startsWith(name.raw) || name.raw.startsWith(tail)) {
|
||||
return true;
|
||||
}
|
||||
}
|
||||
return false;
|
||||
}
|
||||
|
||||
function matchToolMarkupName(lower, start, dsmlLike) {
|
||||
function matchToolMarkupName(raw, start, dsmlLike) {
|
||||
for (const name of TOOL_MARKUP_NAMES) {
|
||||
if (name.dsmlOnly && !dsmlLike) {
|
||||
continue;
|
||||
}
|
||||
if (lower.startsWith(name.raw, start)) {
|
||||
return { name: name.canonical, len: name.raw.length };
|
||||
const matched = matchNormalizedASCII(raw, start, name.raw);
|
||||
if (matched.ok) {
|
||||
return { name: name.canonical, len: matched.len };
|
||||
}
|
||||
}
|
||||
return { name: '', len: 0 };
|
||||
@@ -649,17 +793,18 @@ function findXmlTagEnd(text, from) {
|
||||
let quote = '';
|
||||
for (let i = Math.max(0, from || 0); i < text.length; i += 1) {
|
||||
const ch = text[i];
|
||||
const normalized = normalizeFullwidthASCIIChar(ch);
|
||||
if (quote) {
|
||||
if (ch === quote) {
|
||||
if (normalized === quote) {
|
||||
quote = '';
|
||||
}
|
||||
continue;
|
||||
}
|
||||
if (ch === '"' || ch === "'") {
|
||||
quote = ch;
|
||||
if (normalized === '"' || normalized === "'") {
|
||||
quote = normalized;
|
||||
continue;
|
||||
}
|
||||
if (ch === '>') {
|
||||
if (normalized === '>') {
|
||||
return i;
|
||||
}
|
||||
}
|
||||
@@ -670,13 +815,90 @@ function hasXmlTagBoundary(text, idx) {
|
||||
if (idx >= text.length) {
|
||||
return true;
|
||||
}
|
||||
return [' ', '\t', '\n', '\r', '>', '/'].includes(text[idx]);
|
||||
return [' ', '\t', '\n', '\r', '>', '/'].includes(text[idx])
|
||||
|| normalizeFullwidthASCIIChar(text[idx]) === '>';
|
||||
}
|
||||
|
||||
function isSelfClosingXmlTag(startTag) {
|
||||
return toStringSafe(startTag).trim().endsWith('/');
|
||||
}
|
||||
|
||||
function normalizeFullwidthASCIIChar(ch) {
|
||||
if (!ch) {
|
||||
return ch;
|
||||
}
|
||||
if (ch === '〈') {
|
||||
return '<';
|
||||
}
|
||||
if (ch === '〉') {
|
||||
return '>';
|
||||
}
|
||||
const code = ch.charCodeAt(0);
|
||||
if (code >= 0xff01 && code <= 0xff5e) {
|
||||
return String.fromCharCode(code - 0xfee0);
|
||||
}
|
||||
return ch;
|
||||
}
|
||||
|
||||
function normalizedASCIITailAt(raw, start) {
|
||||
let out = '';
|
||||
for (let i = Math.max(0, start || 0); i < raw.length; i += 1) {
|
||||
const ch = normalizeFullwidthASCIIChar(raw[i]).toLowerCase();
|
||||
if (ch.charCodeAt(0) > 0x7f) {
|
||||
break;
|
||||
}
|
||||
out += ch;
|
||||
}
|
||||
return out;
|
||||
}
|
||||
|
||||
function matchNormalizedASCII(raw, start, expected) {
|
||||
let idx = start;
|
||||
for (let j = 0; j < expected.length; j += 1) {
|
||||
if (idx >= raw.length) {
|
||||
return { ok: false, len: 0 };
|
||||
}
|
||||
const ch = normalizeFullwidthASCIIChar(raw[idx]).toLowerCase();
|
||||
if (ch !== expected[j].toLowerCase()) {
|
||||
return { ok: false, len: 0 };
|
||||
}
|
||||
idx += 1;
|
||||
}
|
||||
return { ok: true, len: idx - start };
|
||||
}
|
||||
|
||||
function normalizeToolMarkupTagTailForXML(tail) {
|
||||
let out = '';
|
||||
const raw = typeof tail === 'string' ? tail : String(tail || '');
|
||||
let quote = '';
|
||||
for (let i = 0; i < raw.length; i += 1) {
|
||||
const ch = raw[i];
|
||||
const normalized = normalizeFullwidthASCIIChar(ch);
|
||||
if (quote) {
|
||||
out += normalized;
|
||||
if (normalized === quote) {
|
||||
quote = '';
|
||||
}
|
||||
} else if (normalized === '"' || normalized === "'") {
|
||||
quote = normalized;
|
||||
out += normalized;
|
||||
} else if (normalized === '|') {
|
||||
let j = i + 1;
|
||||
while (j < raw.length && [' ', '\t', '\r', '\n'].includes(raw[j])) {
|
||||
j += 1;
|
||||
}
|
||||
if (normalizeFullwidthASCIIChar(raw[j] || '') !== '>') {
|
||||
out += normalized;
|
||||
}
|
||||
} else if (['>', '/', '='].includes(normalized)) {
|
||||
out += normalized;
|
||||
} else {
|
||||
out += ch;
|
||||
}
|
||||
}
|
||||
return out;
|
||||
}
|
||||
|
||||
function parseMarkupInput(raw) {
|
||||
const s = toStringSafe(raw).trim();
|
||||
if (!s) {
|
||||
|
||||
@@ -1,5 +1,5 @@
|
||||
'use strict';
|
||||
const { parseToolCalls } = require('./parse');
|
||||
const { parseToolCallsDetailed } = require('./parse');
|
||||
const {
|
||||
findToolMarkupTagOutsideIgnored,
|
||||
findMatchingToolMarkupClose,
|
||||
@@ -27,19 +27,30 @@ function consumeXMLToolCapture(captured, toolNames, trimWrappingJSONFence) {
|
||||
const xmlBlock = captured.slice(openTag.start, closeTag.end + 1);
|
||||
const prefixPart = captured.slice(0, openTag.start);
|
||||
const suffixPart = captured.slice(closeTag.end + 1);
|
||||
const parsed = parseToolCalls(xmlBlock, toolNames);
|
||||
if (Array.isArray(parsed) && parsed.length > 0) {
|
||||
const parsed = parseToolCallsDetailed(xmlBlock, toolNames);
|
||||
if (Array.isArray(parsed.calls) && parsed.calls.length > 0) {
|
||||
const trimmedFence = trimWrappingJSONFence(prefixPart, suffixPart);
|
||||
if (!best || openTag.start < best.start) {
|
||||
best = {
|
||||
start: openTag.start,
|
||||
prefix: trimmedFence.prefix,
|
||||
calls: parsed,
|
||||
calls: parsed.calls,
|
||||
suffix: trimmedFence.suffix,
|
||||
};
|
||||
}
|
||||
break;
|
||||
}
|
||||
if (parsed.sawToolCallSyntax) {
|
||||
if (!rejected || openTag.start < rejected.start) {
|
||||
rejected = {
|
||||
start: openTag.start,
|
||||
prefix: prefixPart + xmlBlock,
|
||||
suffix: suffixPart,
|
||||
};
|
||||
}
|
||||
searchFrom = openTag.end + 1;
|
||||
continue;
|
||||
}
|
||||
if (!rejected || openTag.start < rejected.start) {
|
||||
rejected = {
|
||||
start: openTag.start,
|
||||
@@ -69,16 +80,19 @@ function consumeXMLToolCapture(captured, toolNames, trimWrappingJSONFence) {
|
||||
const xmlBlock = '<tool_calls>' + captured.slice(invokeTag.start, closeTag.end + 1);
|
||||
const prefixPart = captured.slice(0, invokeTag.start);
|
||||
const suffixPart = captured.slice(closeTag.end + 1);
|
||||
const parsed = parseToolCalls(xmlBlock, toolNames);
|
||||
if (Array.isArray(parsed) && parsed.length > 0) {
|
||||
const parsed = parseToolCallsDetailed(xmlBlock, toolNames);
|
||||
if (Array.isArray(parsed.calls) && parsed.calls.length > 0) {
|
||||
const trimmedFence = trimWrappingJSONFence(prefixPart, suffixPart);
|
||||
return {
|
||||
ready: true,
|
||||
prefix: trimmedFence.prefix,
|
||||
calls: parsed,
|
||||
calls: parsed.calls,
|
||||
suffix: trimmedFence.suffix,
|
||||
};
|
||||
}
|
||||
if (parsed.sawToolCallSyntax) {
|
||||
return { ready: true, prefix: prefixPart + captured.slice(invokeTag.start, closeTag.end + 1), calls: [], suffix: suffixPart };
|
||||
}
|
||||
return { ready: true, prefix: prefixPart + captured.slice(invokeTag.start, closeTag.end + 1), calls: [], suffix: suffixPart };
|
||||
}
|
||||
}
|
||||
|
||||
@@ -16,6 +16,15 @@ var promptXMLTextEscaper = strings.NewReplacer(
|
||||
|
||||
var promptXMLNamePattern = regexp.MustCompile(`^[A-Za-z_][A-Za-z0-9_.:-]*$`)
|
||||
|
||||
const (
|
||||
promptDSMLToolCallsOpen = "<|DSML|tool_calls>"
|
||||
promptDSMLToolCallsClose = "</|DSML|tool_calls>"
|
||||
promptDSMLInvokeOpen = "<|DSML|invoke"
|
||||
promptDSMLInvokeClose = "</|DSML|invoke>"
|
||||
promptDSMLParameterOpen = "<|DSML|parameter"
|
||||
promptDSMLParameterClose = "</|DSML|parameter>"
|
||||
)
|
||||
|
||||
// FormatToolCallsForPrompt renders a tool_calls slice into the prompt-visible
|
||||
// invoke/parameter history block used across adapters.
|
||||
func FormatToolCallsForPrompt(raw any) string {
|
||||
@@ -38,7 +47,7 @@ func FormatToolCallsForPrompt(raw any) string {
|
||||
if len(blocks) == 0 {
|
||||
return ""
|
||||
}
|
||||
return "<|DSML|tool_calls>\n" + strings.Join(blocks, "\n") + "\n</|DSML|tool_calls>"
|
||||
return promptDSMLToolCallsOpen + "\n" + strings.Join(blocks, "\n") + "\n" + promptDSMLToolCallsClose
|
||||
}
|
||||
|
||||
// StringifyToolCallArguments normalizes tool arguments into a compact string
|
||||
@@ -94,12 +103,12 @@ func formatToolCallForPrompt(call map[string]any) string {
|
||||
|
||||
parameters := formatToolCallParametersForPrompt(argsRaw)
|
||||
if parameters == "" {
|
||||
return ` <|DSML|invoke name="` + escapeXMLAttribute(name) + `"></|DSML|invoke>`
|
||||
return ` ` + promptDSMLInvokeOpen + ` name="` + escapeXMLAttribute(name) + `">` + promptDSMLInvokeClose
|
||||
}
|
||||
|
||||
return " <|DSML|invoke name=\"" + escapeXMLAttribute(name) + "\">\n" +
|
||||
return " " + promptDSMLInvokeOpen + " name=\"" + escapeXMLAttribute(name) + "\">\n" +
|
||||
parameters + "\n" +
|
||||
" </|DSML|invoke>"
|
||||
" " + promptDSMLInvokeClose
|
||||
}
|
||||
|
||||
func formatToolCallParametersForPrompt(raw any) string {
|
||||
@@ -113,7 +122,7 @@ func formatToolCallParametersForPrompt(raw any) string {
|
||||
if strings.TrimSpace(fallback) == "" {
|
||||
return ""
|
||||
}
|
||||
return " <|DSML|parameter name=\"content\">" + renderPromptXMLText(fallback) + "</|DSML|parameter>"
|
||||
return " " + promptDSMLParameterOpen + " name=\"content\">" + renderPromptXMLText(fallback) + promptDSMLParameterClose
|
||||
}
|
||||
|
||||
func renderPromptToolParameters(value any, indent string) (string, bool) {
|
||||
@@ -149,9 +158,9 @@ func renderPromptToolParameters(value any, indent string) (string, bool) {
|
||||
}
|
||||
return strings.Join(lines, "\n"), true
|
||||
case string:
|
||||
return indent + `<|DSML|parameter name="content">` + renderPromptXMLText(v) + `</|DSML|parameter>`, true
|
||||
return indent + promptDSMLParameterOpen + ` name="content">` + renderPromptXMLText(v) + promptDSMLParameterClose, true
|
||||
default:
|
||||
return indent + `<|DSML|parameter name="value">` + renderPromptXMLText(fmt.Sprint(v)) + `</|DSML|parameter>`, true
|
||||
return indent + promptDSMLParameterOpen + ` name="value">` + renderPromptXMLText(fmt.Sprint(v)) + promptDSMLParameterClose, true
|
||||
}
|
||||
}
|
||||
|
||||
@@ -162,29 +171,29 @@ func renderPromptParameterNode(name string, value any, indent string) (string, b
|
||||
}
|
||||
switch v := value.(type) {
|
||||
case nil:
|
||||
return indent + `<|DSML|parameter name="` + escapeXMLAttribute(trimmedName) + `"></|DSML|parameter>`, true
|
||||
return indent + promptDSMLParameterOpen + ` name="` + escapeXMLAttribute(trimmedName) + `">` + promptDSMLParameterClose, true
|
||||
case map[string]any:
|
||||
body, ok := renderPromptToolXMLBody(v, indent+" ")
|
||||
if !ok {
|
||||
return "", false
|
||||
}
|
||||
if strings.TrimSpace(body) == "" {
|
||||
return indent + `<|DSML|parameter name="` + escapeXMLAttribute(trimmedName) + `"></|DSML|parameter>`, true
|
||||
return indent + promptDSMLParameterOpen + ` name="` + escapeXMLAttribute(trimmedName) + `">` + promptDSMLParameterClose, true
|
||||
}
|
||||
return indent + `<|DSML|parameter name="` + escapeXMLAttribute(trimmedName) + "\">\n" + body + "\n" + indent + `</|DSML|parameter>`, true
|
||||
return indent + promptDSMLParameterOpen + ` name="` + escapeXMLAttribute(trimmedName) + "\">\n" + body + "\n" + indent + promptDSMLParameterClose, true
|
||||
case []any:
|
||||
body, ok := renderPromptToolXMLArray(v, indent+" ")
|
||||
if !ok {
|
||||
return "", false
|
||||
}
|
||||
if strings.TrimSpace(body) == "" {
|
||||
return indent + `<|DSML|parameter name="` + escapeXMLAttribute(trimmedName) + `"></|DSML|parameter>`, true
|
||||
return indent + promptDSMLParameterOpen + ` name="` + escapeXMLAttribute(trimmedName) + `">` + promptDSMLParameterClose, true
|
||||
}
|
||||
return indent + `<|DSML|parameter name="` + escapeXMLAttribute(trimmedName) + "\">\n" + body + "\n" + indent + `</|DSML|parameter>`, true
|
||||
return indent + promptDSMLParameterOpen + ` name="` + escapeXMLAttribute(trimmedName) + "\">\n" + body + "\n" + indent + promptDSMLParameterClose, true
|
||||
case string:
|
||||
return indent + `<|DSML|parameter name="` + escapeXMLAttribute(trimmedName) + `">` + renderPromptXMLText(v) + `</|DSML|parameter>`, true
|
||||
return indent + promptDSMLParameterOpen + ` name="` + escapeXMLAttribute(trimmedName) + `">` + renderPromptXMLText(v) + promptDSMLParameterClose, true
|
||||
default:
|
||||
return indent + `<|DSML|parameter name="` + escapeXMLAttribute(trimmedName) + `">` + renderPromptXMLText(fmt.Sprint(v)) + `</|DSML|parameter>`, true
|
||||
return indent + promptDSMLParameterOpen + ` name="` + escapeXMLAttribute(trimmedName) + `">` + renderPromptXMLText(fmt.Sprint(v)) + promptDSMLParameterClose, true
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
@@ -22,7 +22,7 @@ func TestFormatToolCallsForPromptDSML(t *testing.T) {
|
||||
if got == "" {
|
||||
t.Fatal("expected non-empty formatted tool calls")
|
||||
}
|
||||
if got != "<|DSML|tool_calls>\n <|DSML|invoke name=\"search_web\">\n <|DSML|parameter name=\"query\"><![CDATA[latest]]></|DSML|parameter>\n </|DSML|invoke>\n</|DSML|tool_calls>" {
|
||||
if got != "<|DSML|tool_calls>\n <|DSML|invoke name=\"search_web\">\n <|DSML|parameter name=\"query\"><![CDATA[latest]]></|DSML|parameter>\n </|DSML|invoke>\n</|DSML|tool_calls>" {
|
||||
t.Fatalf("unexpected formatted tool call DSML: %q", got)
|
||||
}
|
||||
}
|
||||
@@ -34,7 +34,7 @@ func TestFormatToolCallsForPromptEscapesXMLEntities(t *testing.T) {
|
||||
"arguments": `{"q":"a < b && c > d"}`,
|
||||
},
|
||||
})
|
||||
want := "<|DSML|tool_calls>\n <|DSML|invoke name=\"search<&>\">\n <|DSML|parameter name=\"q\"><![CDATA[a < b && c > d]]></|DSML|parameter>\n </|DSML|invoke>\n</|DSML|tool_calls>"
|
||||
want := "<|DSML|tool_calls>\n <|DSML|invoke name=\"search<&>\">\n <|DSML|parameter name=\"q\"><![CDATA[a < b && c > d]]></|DSML|parameter>\n </|DSML|invoke>\n</|DSML|tool_calls>"
|
||||
if got != want {
|
||||
t.Fatalf("unexpected escaped tool call XML: %q", got)
|
||||
}
|
||||
@@ -50,7 +50,7 @@ func TestFormatToolCallsForPromptUsesCDATAForMultilineContent(t *testing.T) {
|
||||
},
|
||||
},
|
||||
})
|
||||
want := "<|DSML|tool_calls>\n <|DSML|invoke name=\"write_file\">\n <|DSML|parameter name=\"content\"><![CDATA[#!/bin/bash\nprintf \"hello\"\n]]></|DSML|parameter>\n <|DSML|parameter name=\"path\"><![CDATA[script.sh]]></|DSML|parameter>\n </|DSML|invoke>\n</|DSML|tool_calls>"
|
||||
want := "<|DSML|tool_calls>\n <|DSML|invoke name=\"write_file\">\n <|DSML|parameter name=\"content\"><![CDATA[#!/bin/bash\nprintf \"hello\"\n]]></|DSML|parameter>\n <|DSML|parameter name=\"path\"><![CDATA[script.sh]]></|DSML|parameter>\n </|DSML|invoke>\n</|DSML|tool_calls>"
|
||||
if got != want {
|
||||
t.Fatalf("unexpected multiline cdata tool call XML: %q", got)
|
||||
}
|
||||
|
||||
@@ -38,10 +38,10 @@ func TestNormalizeOpenAIMessagesForPrompt_AssistantToolCallsAndToolResult(t *tes
|
||||
t.Fatalf("expected 4 normalized messages with assistant tool history preserved, got %d", len(normalized))
|
||||
}
|
||||
assistantContent, _ := normalized[2]["content"].(string)
|
||||
if !strings.Contains(assistantContent, "<|DSML|tool_calls>") {
|
||||
if !strings.Contains(assistantContent, "<|DSML|tool_calls>") {
|
||||
t.Fatalf("assistant tool history should be preserved in DSML form, got %q", assistantContent)
|
||||
}
|
||||
if !strings.Contains(assistantContent, `<|DSML|invoke name="get_weather">`) {
|
||||
if !strings.Contains(assistantContent, `<|DSML|invoke name="get_weather">`) {
|
||||
t.Fatalf("expected tool name in preserved history, got %q", assistantContent)
|
||||
}
|
||||
if !strings.Contains(normalized[3]["content"].(string), `"temp":18`) {
|
||||
@@ -49,7 +49,7 @@ func TestNormalizeOpenAIMessagesForPrompt_AssistantToolCallsAndToolResult(t *tes
|
||||
}
|
||||
|
||||
prompt := util.MessagesPrepare(normalized)
|
||||
if !strings.Contains(prompt, "<|DSML|tool_calls>") {
|
||||
if !strings.Contains(prompt, "<|DSML|tool_calls>") {
|
||||
t.Fatalf("expected preserved assistant tool history in prompt: %q", prompt)
|
||||
}
|
||||
}
|
||||
@@ -177,10 +177,10 @@ func TestNormalizeOpenAIMessagesForPrompt_AssistantMultipleToolCallsRemainSepara
|
||||
t.Fatalf("expected assistant tool_call-only message preserved, got %#v", normalized)
|
||||
}
|
||||
content, _ := normalized[0]["content"].(string)
|
||||
if strings.Count(content, "<|DSML|invoke name=") != 2 {
|
||||
if strings.Count(content, "<|DSML|invoke name=") != 2 {
|
||||
t.Fatalf("expected two preserved tool call blocks, got %q", content)
|
||||
}
|
||||
if !strings.Contains(content, `<|DSML|invoke name="search_web">`) || !strings.Contains(content, `<|DSML|invoke name="eval_javascript">`) {
|
||||
if !strings.Contains(content, `<|DSML|invoke name="search_web">`) || !strings.Contains(content, `<|DSML|invoke name="eval_javascript">`) {
|
||||
t.Fatalf("expected both tool names in preserved history, got %q", content)
|
||||
}
|
||||
}
|
||||
@@ -258,7 +258,7 @@ func TestNormalizeOpenAIMessagesForPrompt_AssistantNilContentDoesNotInjectNullLi
|
||||
if strings.Contains(content, "null") {
|
||||
t.Fatalf("expected no null literal injection, got %q", content)
|
||||
}
|
||||
if !strings.Contains(content, "<|DSML|tool_calls>") {
|
||||
if !strings.Contains(content, "<|DSML|tool_calls>") {
|
||||
t.Fatalf("expected assistant tool history in normalized content, got %q", content)
|
||||
}
|
||||
}
|
||||
|
||||
@@ -47,10 +47,10 @@ func TestBuildOpenAIFinalPrompt_HandlerPathIncludesToolRoundtripSemantics(t *tes
|
||||
if !strings.Contains(finalPrompt, `"condition":"sunny"`) {
|
||||
t.Fatalf("handler finalPrompt should preserve tool output content: %q", finalPrompt)
|
||||
}
|
||||
if !strings.Contains(finalPrompt, "<|DSML|tool_calls>") {
|
||||
if !strings.Contains(finalPrompt, "<|DSML|tool_calls>") {
|
||||
t.Fatalf("handler finalPrompt should preserve assistant tool history: %q", finalPrompt)
|
||||
}
|
||||
if !strings.Contains(finalPrompt, `<|DSML|invoke name="get_weather">`) {
|
||||
if !strings.Contains(finalPrompt, `<|DSML|invoke name="get_weather">`) {
|
||||
t.Fatalf("handler finalPrompt should include tool name history: %q", finalPrompt)
|
||||
}
|
||||
}
|
||||
|
||||
@@ -1,6 +1,9 @@
|
||||
package promptcompat
|
||||
|
||||
import "testing"
|
||||
import (
|
||||
"strings"
|
||||
"testing"
|
||||
)
|
||||
|
||||
func TestNormalizeResponsesInputItemPreservesAssistantReasoningContent(t *testing.T) {
|
||||
item := map[string]any{
|
||||
@@ -48,3 +51,44 @@ func TestNormalizeResponsesInputItemAssistantMessageWithReasoningBlocks(t *testi
|
||||
t.Fatalf("expected content blocks preserved, got %#v", got["content"])
|
||||
}
|
||||
}
|
||||
|
||||
func TestNormalizeResponsesInputArrayMergesReasoningMessageIntoFunctionCallHistory(t *testing.T) {
|
||||
input := []any{
|
||||
map[string]any{
|
||||
"type": "message",
|
||||
"role": "assistant",
|
||||
"content": []any{
|
||||
map[string]any{"type": "reasoning", "text": "need fresh docs before answering"},
|
||||
},
|
||||
},
|
||||
map[string]any{
|
||||
"type": "function_call",
|
||||
"call_id": "call_search",
|
||||
"name": "search_web",
|
||||
"arguments": `{"query":"docs"}`,
|
||||
},
|
||||
}
|
||||
|
||||
got := NormalizeResponsesInputAsMessages(input)
|
||||
if len(got) != 1 {
|
||||
t.Fatalf("expected reasoning and function_call merged into one assistant message, got %#v", got)
|
||||
}
|
||||
msg, _ := got[0].(map[string]any)
|
||||
if msg["role"] != "assistant" {
|
||||
t.Fatalf("expected assistant message, got %#v", msg)
|
||||
}
|
||||
if msg["reasoning_content"] != "need fresh docs before answering" {
|
||||
t.Fatalf("expected reasoning_content on tool-call message, got %#v", msg)
|
||||
}
|
||||
toolCalls, _ := msg["tool_calls"].([]any)
|
||||
if len(toolCalls) != 1 {
|
||||
t.Fatalf("expected one tool call, got %#v", msg["tool_calls"])
|
||||
}
|
||||
history := BuildOpenAIHistoryTranscript(got)
|
||||
if !strings.Contains(history, "[reasoning_content]\nneed fresh docs before answering\n[/reasoning_content]") {
|
||||
t.Fatalf("expected reasoning in history transcript, got %q", history)
|
||||
}
|
||||
if !strings.Contains(history, `<|DSML|invoke name="search_web">`) {
|
||||
t.Fatalf("expected tool call in history transcript, got %q", history)
|
||||
}
|
||||
}
|
||||
|
||||
@@ -61,19 +61,52 @@ func normalizeResponsesInputArray(items []any) []any {
|
||||
out := make([]any, 0, len(items))
|
||||
callNameByID := map[string]string{}
|
||||
fallbackParts := make([]string, 0, len(items))
|
||||
pendingAssistantReasoning := ""
|
||||
flushFallback := func() {
|
||||
if len(fallbackParts) == 0 {
|
||||
return
|
||||
}
|
||||
if pendingAssistantReasoning != "" {
|
||||
out = append(out, map[string]any{"role": "assistant", "reasoning_content": pendingAssistantReasoning})
|
||||
pendingAssistantReasoning = ""
|
||||
}
|
||||
out = append(out, map[string]any{"role": "user", "content": strings.Join(fallbackParts, "\n")})
|
||||
fallbackParts = fallbackParts[:0]
|
||||
}
|
||||
flushPendingReasoning := func() {
|
||||
if pendingAssistantReasoning == "" {
|
||||
return
|
||||
}
|
||||
out = append(out, map[string]any{"role": "assistant", "reasoning_content": pendingAssistantReasoning})
|
||||
pendingAssistantReasoning = ""
|
||||
}
|
||||
|
||||
for _, item := range items {
|
||||
switch x := item.(type) {
|
||||
case map[string]any:
|
||||
if msg := normalizeResponsesInputItemWithState(x, callNameByID); msg != nil {
|
||||
if reasoning := assistantReasoningOnlyContent(msg); reasoning != "" {
|
||||
if pendingAssistantReasoning == "" {
|
||||
pendingAssistantReasoning = reasoning
|
||||
} else {
|
||||
pendingAssistantReasoning += "\n" + reasoning
|
||||
}
|
||||
continue
|
||||
}
|
||||
if isAssistantToolCallMessage(msg) && pendingAssistantReasoning != "" {
|
||||
if strings.TrimSpace(normalizeOpenAIReasoningContentForPrompt(msg["reasoning_content"])) == "" {
|
||||
msg["reasoning_content"] = pendingAssistantReasoning
|
||||
}
|
||||
pendingAssistantReasoning = ""
|
||||
} else {
|
||||
flushPendingReasoning()
|
||||
}
|
||||
flushFallback()
|
||||
if isAssistantToolCallMessage(msg) && len(out) > 0 {
|
||||
if merged := mergeResponsesAssistantToolCalls(out[len(out)-1], msg); merged {
|
||||
continue
|
||||
}
|
||||
}
|
||||
out = append(out, msg)
|
||||
continue
|
||||
}
|
||||
@@ -86,9 +119,55 @@ func normalizeResponsesInputArray(items []any) []any {
|
||||
}
|
||||
}
|
||||
}
|
||||
flushPendingReasoning()
|
||||
flushFallback()
|
||||
if len(out) == 0 {
|
||||
return nil
|
||||
}
|
||||
return out
|
||||
}
|
||||
|
||||
func assistantReasoningOnlyContent(msg map[string]any) string {
|
||||
if !isAssistantMessage(msg) || isAssistantToolCallMessage(msg) {
|
||||
return ""
|
||||
}
|
||||
if _, hasContent := msg["content"]; hasContent {
|
||||
normalizedContent := strings.TrimSpace(NormalizeOpenAIContentForPrompt(msg["content"]))
|
||||
reasoningFromContent := strings.TrimSpace(extractOpenAIReasoningContentFromMessage(msg["content"]))
|
||||
if normalizedContent != "" && normalizedContent != reasoningFromContent {
|
||||
return ""
|
||||
}
|
||||
if reasoningFromContent != "" {
|
||||
return reasoningFromContent
|
||||
}
|
||||
}
|
||||
return strings.TrimSpace(normalizeOpenAIReasoningContentForPrompt(msg["reasoning_content"]))
|
||||
}
|
||||
|
||||
func isAssistantMessage(msg map[string]any) bool {
|
||||
return strings.EqualFold(strings.TrimSpace(asString(msg["role"])), "assistant")
|
||||
}
|
||||
|
||||
func isAssistantToolCallMessage(msg map[string]any) bool {
|
||||
if !isAssistantMessage(msg) {
|
||||
return false
|
||||
}
|
||||
toolCalls, ok := msg["tool_calls"].([]any)
|
||||
return ok && len(toolCalls) > 0
|
||||
}
|
||||
|
||||
func mergeResponsesAssistantToolCalls(prev any, next map[string]any) bool {
|
||||
prevMsg, ok := prev.(map[string]any)
|
||||
if !ok || !isAssistantToolCallMessage(prevMsg) || !isAssistantToolCallMessage(next) {
|
||||
return false
|
||||
}
|
||||
prevCalls, _ := prevMsg["tool_calls"].([]any)
|
||||
nextCalls, _ := next["tool_calls"].([]any)
|
||||
prevMsg["tool_calls"] = append(prevCalls, nextCalls...)
|
||||
if strings.TrimSpace(normalizeOpenAIReasoningContentForPrompt(prevMsg["reasoning_content"])) == "" {
|
||||
if reasoning := strings.TrimSpace(normalizeOpenAIReasoningContentForPrompt(next["reasoning_content"])); reasoning != "" {
|
||||
prevMsg["reasoning_content"] = reasoning
|
||||
}
|
||||
}
|
||||
return true
|
||||
}
|
||||
|
||||
@@ -26,10 +26,13 @@ RULES:
|
||||
6) Objects use nested XML elements inside the parameter body. Arrays may repeat <item> children.
|
||||
7) Numbers, booleans, and null stay plain text.
|
||||
8) Use only the parameter names in the tool schema. Do not invent fields.
|
||||
9) Do NOT wrap XML in markdown fences. Do NOT output explanations, role markers, or internal monologue.
|
||||
10) If you call a tool, the first non-whitespace characters of that tool block must be exactly <|DSML|tool_calls>.
|
||||
11) Never omit the opening <|DSML|tool_calls> tag, even if you already plan to close with </|DSML|tool_calls>.
|
||||
12) Compatibility note: the runtime also accepts the legacy XML tags <tool_calls> / <invoke> / <parameter>, but prefer the DSML-prefixed form above.
|
||||
9) Fill parameters with the actual values required for this call. Do not emit placeholder, blank, or whitespace-only parameters.
|
||||
10) If a required parameter value is unknown, ask the user or answer normally instead of outputting an empty tool call.
|
||||
11) For shell tools such as Bash / execute_command, the command/script must be inside the command parameter. Never call them with an empty command.
|
||||
12) Do NOT wrap XML in markdown fences. Do NOT output explanations, role markers, or internal monologue.
|
||||
13) If you call a tool, the first non-whitespace characters of that tool block must be exactly <|DSML|tool_calls>.
|
||||
14) Never omit the opening <|DSML|tool_calls> tag, even if you already plan to close with </|DSML|tool_calls>.
|
||||
15) Compatibility note: the runtime also accepts the legacy XML tags <tool_calls> / <invoke> / <parameter>, but prefer the DSML-prefixed form above.
|
||||
|
||||
PARAMETER SHAPES:
|
||||
- string => <|DSML|parameter name="x"><![CDATA[value]]></|DSML|parameter>
|
||||
@@ -48,6 +51,12 @@ Wrong 2 — Markdown code fences:
|
||||
Wrong 3 — missing opening wrapper:
|
||||
<|DSML|invoke name="TOOL_NAME">...</|DSML|invoke>
|
||||
</|DSML|tool_calls>
|
||||
Wrong 4 — empty parameters:
|
||||
<|DSML|tool_calls>
|
||||
<|DSML|invoke name="Bash">
|
||||
<|DSML|parameter name="command"></|DSML|parameter>
|
||||
</|DSML|invoke>
|
||||
</|DSML|tool_calls>
|
||||
|
||||
Remember: The ONLY valid way to use tools is the <|DSML|tool_calls>...</|DSML|tool_calls> block at the end of your response.
|
||||
` + buildCorrectToolExamples(toolNames)
|
||||
|
||||
@@ -119,6 +119,20 @@ func TestBuildToolCallInstructions_AnchorsMissingOpeningWrapperFailureMode(t *te
|
||||
}
|
||||
}
|
||||
|
||||
func TestBuildToolCallInstructions_RejectsEmptyParametersInPrompt(t *testing.T) {
|
||||
out := BuildToolCallInstructions([]string{"Bash"})
|
||||
for _, want := range []string{
|
||||
"Do not emit placeholder, blank, or whitespace-only parameters.",
|
||||
"If a required parameter value is unknown, ask the user or answer normally instead of outputting an empty tool call.",
|
||||
"Never call them with an empty command.",
|
||||
"Wrong 4 — empty parameters",
|
||||
} {
|
||||
if !strings.Contains(out, want) {
|
||||
t.Fatalf("expected empty-parameter instruction %q, got: %s", want, out)
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
func findInvokeBlocks(text, name string) []string {
|
||||
open := `<|DSML|invoke name="` + name + `">`
|
||||
remaining := text
|
||||
|
||||
@@ -1,6 +1,9 @@
|
||||
package toolcall
|
||||
|
||||
import "strings"
|
||||
import (
|
||||
"strings"
|
||||
"unicode/utf8"
|
||||
)
|
||||
|
||||
func normalizeDSMLToolCallMarkup(text string) (string, bool) {
|
||||
if text == "" {
|
||||
@@ -42,8 +45,9 @@ func rewriteDSMLToolMarkupOutsideIgnored(text string) string {
|
||||
b.WriteByte('/')
|
||||
}
|
||||
b.WriteString(tag.Name)
|
||||
b.WriteString(text[tag.NameEnd : tag.End+1])
|
||||
if text[tag.End] != '>' {
|
||||
tail := normalizeToolMarkupTagTailForXML(text[tag.NameEnd : tag.End+1])
|
||||
b.WriteString(tail)
|
||||
if !strings.HasSuffix(tail, ">") {
|
||||
b.WriteByte('>')
|
||||
}
|
||||
i = tag.End + 1
|
||||
@@ -54,3 +58,57 @@ func rewriteDSMLToolMarkupOutsideIgnored(text string) string {
|
||||
}
|
||||
return b.String()
|
||||
}
|
||||
|
||||
func normalizeToolMarkupTagTailForXML(tail string) string {
|
||||
if tail == "" {
|
||||
return ""
|
||||
}
|
||||
var b strings.Builder
|
||||
b.Grow(len(tail))
|
||||
quote := rune(0)
|
||||
for i := 0; i < len(tail); {
|
||||
r, size := utf8.DecodeRuneInString(tail[i:])
|
||||
if r == utf8.RuneError && size == 1 {
|
||||
b.WriteByte(tail[i])
|
||||
i++
|
||||
continue
|
||||
}
|
||||
ch := normalizeFullwidthASCII(r)
|
||||
if quote != 0 {
|
||||
b.WriteRune(ch)
|
||||
if ch == quote {
|
||||
quote = 0
|
||||
}
|
||||
i += size
|
||||
continue
|
||||
}
|
||||
switch ch {
|
||||
case '"', '\'':
|
||||
quote = ch
|
||||
b.WriteRune(ch)
|
||||
case '|':
|
||||
j := i + size
|
||||
for j < len(tail) {
|
||||
next, nextSize := utf8.DecodeRuneInString(tail[j:])
|
||||
if nextSize <= 0 {
|
||||
break
|
||||
}
|
||||
if next == ' ' || next == '\t' || next == '\r' || next == '\n' {
|
||||
j += nextSize
|
||||
continue
|
||||
}
|
||||
break
|
||||
}
|
||||
next, _ := normalizedASCIIAt(tail, j)
|
||||
if next != '>' {
|
||||
b.WriteRune(ch)
|
||||
}
|
||||
case '>', '/', '=':
|
||||
b.WriteRune(ch)
|
||||
default:
|
||||
b.WriteString(tail[i : i+size])
|
||||
}
|
||||
i += size
|
||||
}
|
||||
return b.String()
|
||||
}
|
||||
|
||||
@@ -10,7 +10,7 @@ import (
|
||||
var toolCallMarkupKVPattern = regexp.MustCompile(`(?is)<(?:[a-z0-9_:-]+:)?([a-z0-9_\-.]+)\b[^>]*>(.*?)</(?:[a-z0-9_:-]+:)?([a-z0-9_\-.]+)>`)
|
||||
|
||||
// cdataPattern matches a standalone CDATA section.
|
||||
var cdataPattern = regexp.MustCompile(`(?is)^<!\[CDATA\[(.*?)]]>$`)
|
||||
var cdataPattern = regexp.MustCompile(`(?is)^(?:<|〈)!\[CDATA\[(.*?)]](?:>|>|〉)$`)
|
||||
|
||||
func parseMarkupKVObject(text string) map[string]any {
|
||||
matches := toolCallMarkupKVPattern.FindAllStringSubmatch(strings.TrimSpace(text), -1)
|
||||
|
||||
@@ -92,45 +92,11 @@ func filterToolCallsDetailed(parsed []ParsedToolCall) ([]ParsedToolCall, []strin
|
||||
if tc.Input == nil {
|
||||
tc.Input = map[string]any{}
|
||||
}
|
||||
if len(tc.Input) > 0 && !toolCallInputHasMeaningfulValue(tc.Input) {
|
||||
continue
|
||||
}
|
||||
out = append(out, tc)
|
||||
}
|
||||
return out, nil
|
||||
}
|
||||
|
||||
func toolCallInputHasMeaningfulValue(v any) bool {
|
||||
switch x := v.(type) {
|
||||
case nil:
|
||||
return false
|
||||
case string:
|
||||
return strings.TrimSpace(x) != ""
|
||||
case map[string]any:
|
||||
if len(x) == 0 {
|
||||
return false
|
||||
}
|
||||
for _, child := range x {
|
||||
if toolCallInputHasMeaningfulValue(child) {
|
||||
return true
|
||||
}
|
||||
}
|
||||
return false
|
||||
case []any:
|
||||
if len(x) == 0 {
|
||||
return false
|
||||
}
|
||||
for _, child := range x {
|
||||
if toolCallInputHasMeaningfulValue(child) {
|
||||
return true
|
||||
}
|
||||
}
|
||||
return false
|
||||
default:
|
||||
return true
|
||||
}
|
||||
}
|
||||
|
||||
func looksLikeToolCallSyntax(text string) bool {
|
||||
hasDSML, hasCanonical := ContainsToolCallWrapperSyntaxOutsideIgnored(text)
|
||||
return hasDSML || hasCanonical
|
||||
|
||||
@@ -6,6 +6,7 @@ import (
|
||||
"html"
|
||||
"regexp"
|
||||
"strings"
|
||||
"unicode/utf8"
|
||||
)
|
||||
|
||||
var xmlAttrPattern = regexp.MustCompile(`(?is)\b([a-z0-9_:-]+)\s*=\s*("([^"]*)"|'([^']*)')`)
|
||||
@@ -214,7 +215,7 @@ func skipXMLIgnoredSection(text string, i int) (next int, advanced bool, blocked
|
||||
if end < 0 {
|
||||
return 0, false, true
|
||||
}
|
||||
return end + len("]]>"), true, false
|
||||
return end + toolCDATACloseLenAt(text, end), true, false
|
||||
case strings.HasPrefix(text[i:], "<!--"):
|
||||
end := strings.Index(text[i+len("<!--"):], "-->")
|
||||
if end < 0 {
|
||||
@@ -227,15 +228,26 @@ func skipXMLIgnoredSection(text string, i int) (next int, advanced bool, blocked
|
||||
}
|
||||
|
||||
func hasASCIIPrefixFoldAt(text string, start int, prefix string) bool {
|
||||
if start < 0 || len(text)-start < len(prefix) {
|
||||
return false
|
||||
_, ok := matchASCIIPrefixFoldAt(text, start, prefix)
|
||||
return ok
|
||||
}
|
||||
|
||||
func matchASCIIPrefixFoldAt(text string, start int, prefix string) (int, bool) {
|
||||
if start < 0 || start >= len(text) && prefix != "" {
|
||||
return 0, false
|
||||
}
|
||||
idx := start
|
||||
for j := 0; j < len(prefix); j++ {
|
||||
if asciiLower(text[start+j]) != asciiLower(prefix[j]) {
|
||||
return false
|
||||
if idx >= len(text) {
|
||||
return 0, false
|
||||
}
|
||||
ch, size := normalizedASCIIAt(text, idx)
|
||||
if size <= 0 || asciiLower(ch) != asciiLower(prefix[j]) {
|
||||
return 0, false
|
||||
}
|
||||
idx += size
|
||||
}
|
||||
return true
|
||||
return idx - start, true
|
||||
}
|
||||
|
||||
func asciiLower(b byte) byte {
|
||||
@@ -266,15 +278,14 @@ func findToolCDATAEnd(text string, from int) int {
|
||||
if from < 0 || from >= len(text) {
|
||||
return -1
|
||||
}
|
||||
const closeMarker = "]]>"
|
||||
firstNonFenceEnd := -1
|
||||
for searchFrom := from; searchFrom < len(text); {
|
||||
rel := strings.Index(text[searchFrom:], closeMarker)
|
||||
if rel < 0 {
|
||||
end := indexToolCDATAClose(text, searchFrom)
|
||||
if end < 0 {
|
||||
break
|
||||
}
|
||||
end := searchFrom + rel
|
||||
searchFrom = end + len(closeMarker)
|
||||
closeLen := toolCDATACloseLenAt(text, end)
|
||||
searchFrom = end + closeLen
|
||||
if cdataOffsetIsInsideMarkdownFence(text[from:end]) {
|
||||
continue
|
||||
}
|
||||
@@ -288,6 +299,35 @@ func findToolCDATAEnd(text string, from int) int {
|
||||
return firstNonFenceEnd
|
||||
}
|
||||
|
||||
func indexToolCDATAClose(text string, from int) int {
|
||||
if from < 0 {
|
||||
from = 0
|
||||
}
|
||||
asciiIdx := strings.Index(text[from:], "]]>")
|
||||
fullIdx := strings.Index(text[from:], "]]>")
|
||||
cjkIdx := strings.Index(text[from:], "]]〉")
|
||||
if asciiIdx < 0 && fullIdx < 0 && cjkIdx < 0 {
|
||||
return -1
|
||||
}
|
||||
best := -1
|
||||
for _, idx := range []int{asciiIdx, fullIdx, cjkIdx} {
|
||||
if idx >= 0 && (best < 0 || idx < best) {
|
||||
best = idx
|
||||
}
|
||||
}
|
||||
return from + best
|
||||
}
|
||||
|
||||
func toolCDATACloseLenAt(text string, idx int) int {
|
||||
if strings.HasPrefix(text[idx:], "]]〉") {
|
||||
return len("]]〉")
|
||||
}
|
||||
if strings.HasPrefix(text[idx:], "]]>") {
|
||||
return len("]]>")
|
||||
}
|
||||
return len("]]>")
|
||||
}
|
||||
|
||||
func cdataEndLooksStructural(text string, after int) bool {
|
||||
for after < len(text) {
|
||||
switch {
|
||||
@@ -327,22 +367,29 @@ func cdataOffsetIsInsideMarkdownFence(fragment string) bool {
|
||||
}
|
||||
|
||||
func findXMLTagEnd(text string, from int) int {
|
||||
quote := byte(0)
|
||||
for i := maxInt(from, 0); i < len(text); i++ {
|
||||
ch := text[i]
|
||||
quote := rune(0)
|
||||
for i := maxInt(from, 0); i < len(text); {
|
||||
r, size := utf8.DecodeRuneInString(text[i:])
|
||||
if r == utf8.RuneError && size == 0 {
|
||||
break
|
||||
}
|
||||
ch := normalizeFullwidthASCII(r)
|
||||
if quote != 0 {
|
||||
if ch == quote {
|
||||
quote = 0
|
||||
}
|
||||
i += size
|
||||
continue
|
||||
}
|
||||
if ch == '"' || ch == '\'' {
|
||||
quote = ch
|
||||
i += size
|
||||
continue
|
||||
}
|
||||
if ch == '>' {
|
||||
return i
|
||||
return i + size - 1
|
||||
}
|
||||
i += size
|
||||
}
|
||||
return -1
|
||||
}
|
||||
@@ -355,7 +402,8 @@ func hasXMLTagBoundary(text string, idx int) bool {
|
||||
case ' ', '\t', '\n', '\r', '>', '/':
|
||||
return true
|
||||
default:
|
||||
return false
|
||||
r, _ := utf8.DecodeRuneInString(text[idx:])
|
||||
return normalizeFullwidthASCII(r) == '>'
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
@@ -1,6 +1,9 @@
|
||||
package toolcall
|
||||
|
||||
import "strings"
|
||||
import (
|
||||
"strings"
|
||||
"unicode/utf8"
|
||||
)
|
||||
|
||||
type toolMarkupNameAlias struct {
|
||||
raw string
|
||||
@@ -131,22 +134,35 @@ func FindMatchingToolMarkupClose(text string, open ToolMarkupTag) (ToolMarkupTag
|
||||
}
|
||||
|
||||
func scanToolMarkupTagAt(text string, start int) (ToolMarkupTag, bool) {
|
||||
if start < 0 || start >= len(text) || text[start] != '<' {
|
||||
next, ok := consumeToolMarkupLessThan(text, start)
|
||||
if !ok {
|
||||
return ToolMarkupTag{}, false
|
||||
}
|
||||
i := start + 1
|
||||
for i < len(text) && text[i] == '<' {
|
||||
i++
|
||||
i := next
|
||||
for {
|
||||
next, ok := consumeToolMarkupLessThan(text, i)
|
||||
if !ok {
|
||||
break
|
||||
}
|
||||
i = next
|
||||
}
|
||||
closing := false
|
||||
if i < len(text) && text[i] == '/' {
|
||||
closing = true
|
||||
i++
|
||||
}
|
||||
prefixStart := i
|
||||
i, dsmlLike := consumeToolMarkupNamePrefix(text, i)
|
||||
name, nameLen := matchToolMarkupName(text, i, dsmlLike)
|
||||
if nameLen == 0 {
|
||||
return ToolMarkupTag{}, false
|
||||
fallbackName, fallbackStart, fallbackLen, ok := matchToolMarkupNameAfterArbitraryPrefix(text, prefixStart)
|
||||
if !ok {
|
||||
return ToolMarkupTag{}, false
|
||||
}
|
||||
name = fallbackName
|
||||
i = fallbackStart
|
||||
nameLen = fallbackLen
|
||||
dsmlLike = true
|
||||
}
|
||||
nameEnd := i + nameLen
|
||||
nameEndBeforePipes := nameEnd
|
||||
@@ -184,7 +200,7 @@ func scanToolMarkupTagAt(text string, start int) (ToolMarkupTag, bool) {
|
||||
}
|
||||
|
||||
func IsPartialToolMarkupTagPrefix(text string) bool {
|
||||
if text == "" || text[0] != '<' || strings.Contains(text, ">") {
|
||||
if text == "" || text[0] != '<' || strings.Contains(text, ">") || strings.Contains(text, ">") {
|
||||
return false
|
||||
}
|
||||
i := 1
|
||||
@@ -207,6 +223,9 @@ func IsPartialToolMarkupTagPrefix(text string) bool {
|
||||
if hasASCIIPartialPrefixFoldAt(text, i, "dsml") {
|
||||
return true
|
||||
}
|
||||
if hasPartialToolMarkupNameAfterArbitraryPrefix(text, i) {
|
||||
return true
|
||||
}
|
||||
next, ok := consumeToolMarkupNamePrefixOnce(text, i)
|
||||
if !ok {
|
||||
return false
|
||||
@@ -236,26 +255,81 @@ func consumeToolMarkupNamePrefixOnce(text string, idx int) (int, bool) {
|
||||
return idx + 1, true
|
||||
}
|
||||
if hasASCIIPrefixFoldAt(text, idx, "dsml") {
|
||||
next := idx + len("dsml")
|
||||
if next < len(text) && (text[next] == '-' || text[next] == '_') {
|
||||
next++
|
||||
dsmlLen, _ := matchASCIIPrefixFoldAt(text, idx, "dsml")
|
||||
next := idx + dsmlLen
|
||||
if sep, size := normalizedASCIIAt(text, next); sep == '-' || sep == '_' {
|
||||
next += size
|
||||
}
|
||||
return next, true
|
||||
}
|
||||
if next, ok := consumeArbitraryToolMarkupNamePrefix(text, idx); ok {
|
||||
return next, true
|
||||
}
|
||||
return idx, false
|
||||
}
|
||||
|
||||
func hasASCIIPartialPrefixFoldAt(text string, start int, prefix string) bool {
|
||||
remain := len(text) - start
|
||||
if remain <= 0 || remain > len(prefix) {
|
||||
return false
|
||||
func consumeArbitraryToolMarkupNamePrefix(text string, idx int) (int, bool) {
|
||||
nextSegment, ok := consumeToolMarkupPrefixSegment(text, idx)
|
||||
if !ok {
|
||||
return idx, false
|
||||
}
|
||||
for j := 0; j < remain; j++ {
|
||||
if asciiLower(text[start+j]) != asciiLower(prefix[j]) {
|
||||
return false
|
||||
j := nextSegment
|
||||
for {
|
||||
nextSegment, ok = consumeToolMarkupPrefixSegment(text, j)
|
||||
if !ok {
|
||||
break
|
||||
}
|
||||
j = nextSegment
|
||||
}
|
||||
k := j
|
||||
for k < len(text) && (text[k] == ' ' || text[k] == '\t' || text[k] == '\r' || text[k] == '\n') {
|
||||
k++
|
||||
}
|
||||
next, ok := consumeToolMarkupPipe(text, k)
|
||||
if !ok {
|
||||
if sep, size := normalizedASCIIAt(text, k); sep == '_' || sep == '-' {
|
||||
next = k + size
|
||||
ok = true
|
||||
}
|
||||
}
|
||||
return true
|
||||
if !ok {
|
||||
return idx, false
|
||||
}
|
||||
for next < len(text) && (text[next] == ' ' || text[next] == '\t' || text[next] == '\r' || text[next] == '\n') {
|
||||
next++
|
||||
}
|
||||
if !hasToolMarkupNamePrefix(text, next) {
|
||||
return idx, false
|
||||
}
|
||||
return next, true
|
||||
}
|
||||
|
||||
func consumeToolMarkupPrefixSegment(text string, idx int) (int, bool) {
|
||||
ch, size := normalizedASCIIAt(text, idx)
|
||||
if size <= 0 {
|
||||
return idx, false
|
||||
}
|
||||
if (ch >= 'a' && ch <= 'z') || (ch >= 'A' && ch <= 'Z') || (ch >= '0' && ch <= '9') {
|
||||
return idx + size, true
|
||||
}
|
||||
return idx, false
|
||||
}
|
||||
|
||||
func hasASCIIPartialPrefixFoldAt(text string, start int, prefix string) bool {
|
||||
if start < 0 || start >= len(text) {
|
||||
return false
|
||||
}
|
||||
idx := start
|
||||
matched := 0
|
||||
for matched < len(prefix) && idx < len(text) {
|
||||
ch, size := normalizedASCIIAt(text, idx)
|
||||
if size <= 0 || asciiLower(ch) != asciiLower(prefix[matched]) {
|
||||
return false
|
||||
}
|
||||
idx += size
|
||||
matched++
|
||||
}
|
||||
return matched > 0 && matched < len(prefix) && idx == len(text)
|
||||
}
|
||||
|
||||
func hasToolMarkupNamePrefix(text string, start int) bool {
|
||||
@@ -275,13 +349,102 @@ func matchToolMarkupName(text string, start int, dsmlLike bool) (string, int) {
|
||||
if name.dsmlOnly && !dsmlLike {
|
||||
continue
|
||||
}
|
||||
if hasASCIIPrefixFoldAt(text, start, name.raw) {
|
||||
return name.canonical, len(name.raw)
|
||||
if nameLen, ok := matchASCIIPrefixFoldAt(text, start, name.raw); ok {
|
||||
return name.canonical, nameLen
|
||||
}
|
||||
}
|
||||
return "", 0
|
||||
}
|
||||
|
||||
func matchToolMarkupNameAfterArbitraryPrefix(text string, start int) (string, int, int, bool) {
|
||||
for idx := start; idx < len(text); {
|
||||
if isToolMarkupTagTerminator(text, idx) {
|
||||
return "", 0, 0, false
|
||||
}
|
||||
for _, name := range toolMarkupNames {
|
||||
nameLen, ok := matchASCIIPrefixFoldAt(text, idx, name.raw)
|
||||
if !ok {
|
||||
continue
|
||||
}
|
||||
if !toolMarkupPrefixAllowsLocalName(text[start:idx]) {
|
||||
continue
|
||||
}
|
||||
return name.canonical, idx, nameLen, true
|
||||
}
|
||||
_, size := utf8.DecodeRuneInString(text[idx:])
|
||||
if size <= 0 {
|
||||
size = 1
|
||||
}
|
||||
idx += size
|
||||
}
|
||||
return "", 0, 0, false
|
||||
}
|
||||
|
||||
func hasPartialToolMarkupNameAfterArbitraryPrefix(text string, start int) bool {
|
||||
for idx := start; idx < len(text); {
|
||||
if isToolMarkupTagTerminator(text, idx) {
|
||||
return false
|
||||
}
|
||||
if toolMarkupPrefixAllowsLocalName(text[start:idx]) && hasToolMarkupNamePrefix(text, idx) {
|
||||
return true
|
||||
}
|
||||
if toolMarkupPrefixAllowsLocalName(text[start:idx]) && hasDSMLNamePrefixOrPartial(text, idx) {
|
||||
return true
|
||||
}
|
||||
_, size := utf8.DecodeRuneInString(text[idx:])
|
||||
if size <= 0 {
|
||||
size = 1
|
||||
}
|
||||
idx += size
|
||||
}
|
||||
return toolMarkupPrefixAllowsLocalName(text[start:])
|
||||
}
|
||||
|
||||
func hasDSMLNamePrefixOrPartial(text string, start int) bool {
|
||||
return hasASCIIPrefixFoldAt(text, start, "dsml") || hasASCIIPartialPrefixFoldAt(text, start, "dsml")
|
||||
}
|
||||
|
||||
func toolMarkupPrefixAllowsLocalName(prefix string) bool {
|
||||
if prefix == "" {
|
||||
return false
|
||||
}
|
||||
if strings.Contains(normalizedASCIILowerString(prefix), "dsml") {
|
||||
return true
|
||||
}
|
||||
if strings.ContainsAny(prefix, "=\"'") {
|
||||
return false
|
||||
}
|
||||
r, _ := utf8.DecodeLastRuneInString(prefix)
|
||||
r = normalizeFullwidthASCII(r)
|
||||
return (r < 'a' || r > 'z') && (r < 'A' || r > 'Z') && (r < '0' || r > '9')
|
||||
}
|
||||
|
||||
func normalizedASCIILowerString(text string) string {
|
||||
var b strings.Builder
|
||||
b.Grow(len(text))
|
||||
for _, r := range text {
|
||||
r = normalizeFullwidthASCII(r)
|
||||
if r >= 'A' && r <= 'Z' {
|
||||
r += 'a' - 'A'
|
||||
}
|
||||
if r <= 0x7f {
|
||||
b.WriteRune(r)
|
||||
}
|
||||
}
|
||||
return b.String()
|
||||
}
|
||||
|
||||
func isToolMarkupTagTerminator(text string, idx int) bool {
|
||||
if idx >= len(text) {
|
||||
return false
|
||||
}
|
||||
if text[idx] == '>' {
|
||||
return true
|
||||
}
|
||||
r, _ := utf8.DecodeRuneInString(text[idx:])
|
||||
return normalizeFullwidthASCII(r) == '>'
|
||||
}
|
||||
|
||||
func consumeToolMarkupPipe(text string, idx int) (int, bool) {
|
||||
if idx >= len(text) {
|
||||
return idx, false
|
||||
@@ -289,12 +452,26 @@ func consumeToolMarkupPipe(text string, idx int) (int, bool) {
|
||||
if text[idx] == '|' {
|
||||
return idx + 1, true
|
||||
}
|
||||
if text[idx] == '\x02' {
|
||||
return idx + 1, true
|
||||
}
|
||||
if strings.HasPrefix(text[idx:], "|") {
|
||||
return idx + len("|"), true
|
||||
}
|
||||
if strings.HasPrefix(text[idx:], "␂") {
|
||||
return idx + len("␂"), true
|
||||
}
|
||||
return idx, false
|
||||
}
|
||||
|
||||
func consumeToolMarkupLessThan(text string, idx int) (int, bool) {
|
||||
ch, size := normalizedASCIIAt(text, idx)
|
||||
if size <= 0 || ch != '<' {
|
||||
return idx, false
|
||||
}
|
||||
return idx + size, true
|
||||
}
|
||||
|
||||
func hasToolMarkupBoundary(text string, idx int) bool {
|
||||
if idx >= len(text) {
|
||||
return true
|
||||
@@ -303,6 +480,35 @@ func hasToolMarkupBoundary(text string, idx int) bool {
|
||||
case ' ', '\t', '\n', '\r', '>', '/':
|
||||
return true
|
||||
default:
|
||||
return false
|
||||
r, _ := utf8.DecodeRuneInString(text[idx:])
|
||||
return normalizeFullwidthASCII(r) == '>'
|
||||
}
|
||||
}
|
||||
|
||||
func normalizedASCIIAt(text string, idx int) (byte, int) {
|
||||
if idx < 0 || idx >= len(text) {
|
||||
return 0, 0
|
||||
}
|
||||
r, size := utf8.DecodeRuneInString(text[idx:])
|
||||
if r == utf8.RuneError && size == 0 {
|
||||
return 0, 0
|
||||
}
|
||||
normalized := normalizeFullwidthASCII(r)
|
||||
if normalized > 0x7f {
|
||||
return 0, 0
|
||||
}
|
||||
return byte(normalized), size
|
||||
}
|
||||
|
||||
func normalizeFullwidthASCII(r rune) rune {
|
||||
switch r {
|
||||
case '〈':
|
||||
return '<'
|
||||
case '〉':
|
||||
return '>'
|
||||
}
|
||||
if r >= '!' && r <= '~' {
|
||||
return r - 0xFEE0
|
||||
}
|
||||
return r
|
||||
}
|
||||
|
||||
@@ -72,6 +72,97 @@ EOF
|
||||
}
|
||||
}
|
||||
|
||||
func TestParseToolCallsSupportsUnderscoredDSMLShell(t *testing.T) {
|
||||
text := `<dsml_tool_calls>
|
||||
<dsml_invoke name="search_web">
|
||||
<dsml_parameter name="query"><![CDATA[2026年5月 热点事件]]></dsml_parameter>
|
||||
<dsml_parameter name="topic"><![CDATA[news]]></dsml_parameter>
|
||||
</dsml_invoke>
|
||||
<dsml_invoke name="eval_javascript">
|
||||
<dsml_parameter name="code"><![CDATA[1 + 1]]></dsml_parameter>
|
||||
</dsml_invoke>
|
||||
</dsml_tool_calls>`
|
||||
calls := ParseToolCalls(text, []string{"search_web", "eval_javascript"})
|
||||
if len(calls) != 2 {
|
||||
t.Fatalf("expected two underscored DSML calls, got %#v", calls)
|
||||
}
|
||||
if calls[0].Name != "search_web" || calls[0].Input["query"] != "2026年5月 热点事件" || calls[0].Input["topic"] != "news" {
|
||||
t.Fatalf("unexpected first underscored DSML call: %#v", calls[0])
|
||||
}
|
||||
if calls[1].Name != "eval_javascript" || calls[1].Input["code"] != "1 + 1" {
|
||||
t.Fatalf("unexpected second underscored DSML call: %#v", calls[1])
|
||||
}
|
||||
}
|
||||
|
||||
func TestParseToolCallsSupportsArbitraryPrefixedToolMarkup(t *testing.T) {
|
||||
cases := []string{
|
||||
`<abc|tool_calls><abc|invoke name="Read"><abc|parameter name="file_path">README.md</abc|parameter></abc|invoke></abc|tool_calls>`,
|
||||
`<vendor_tool_calls><vendor_invoke name="Read"><vendor_parameter name="file_path">README.md</vendor_parameter></vendor_invoke></vendor_tool_calls>`,
|
||||
`<agent - tool_calls><agent - invoke name="Read"><agent - parameter name="file_path">README.md</agent - parameter></agent - invoke></agent - tool_calls>`,
|
||||
}
|
||||
for _, text := range cases {
|
||||
calls := ParseToolCalls(text, []string{"Read"})
|
||||
if len(calls) != 1 {
|
||||
t.Fatalf("expected one arbitrary-prefixed tool call for %q, got %#v", text, calls)
|
||||
}
|
||||
if calls[0].Name != "Read" || calls[0].Input["file_path"] != "README.md" {
|
||||
t.Fatalf("unexpected arbitrary-prefixed parse result: %#v", calls[0])
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
func TestParseToolCallsSupportsFullwidthDSMLShell(t *testing.T) {
|
||||
text := `<dSML|tool_calls>
|
||||
<dSML|invoke name="Read">
|
||||
<dSML|parameter name="file_path"><![CDATA[/Users/aq/Desktop/myproject/Personal_Blog/README.md]]></dSML|parameter>
|
||||
</dSML|invoke>
|
||||
<dSML|invoke name="Read">
|
||||
<dSML|parameter name="file_path"><![CDATA[/Users/aq/Desktop/myproject/Personal_Blog/index.html]]></dSML|parameter>
|
||||
</dSML|invoke>
|
||||
</dSML|tool_calls>`
|
||||
calls := ParseToolCalls(text, []string{"Read"})
|
||||
if len(calls) != 2 {
|
||||
t.Fatalf("expected two fullwidth DSML calls, got %#v", calls)
|
||||
}
|
||||
if calls[0].Name != "Read" || calls[0].Input["file_path"] != "/Users/aq/Desktop/myproject/Personal_Blog/README.md" {
|
||||
t.Fatalf("unexpected first fullwidth DSML call: %#v", calls[0])
|
||||
}
|
||||
if calls[1].Name != "Read" || calls[1].Input["file_path"] != "/Users/aq/Desktop/myproject/Personal_Blog/index.html" {
|
||||
t.Fatalf("unexpected second fullwidth DSML call: %#v", calls[1])
|
||||
}
|
||||
}
|
||||
|
||||
func TestParseToolCallsSupportsCJKAngleDSMDrift(t *testing.T) {
|
||||
text := `<DSM|tool_calls>
|
||||
<DSM|invoke name="Bash">
|
||||
<DSM|parameter name="description"|>〈![CDATA[Show commits on local dev not on origin/dev]]〉〈/DSM|parameter〉
|
||||
<DSM|parameter name="command"|>〈![CDATA[git log --oneline origin/dev..dev]]〉〈/DSM|parameter〉
|
||||
〈/DSM|invoke〉
|
||||
<DSM|invoke name="Bash">
|
||||
<DSM|parameter name="description"|>〈![CDATA[Show commits on origin/dev not on local dev]]〉〈/DSM|parameter〉
|
||||
<DSM|parameter name="command"|>〈![CDATA[git log --oneline dev..origin/dev]]〉〈/DSM|parameter〉
|
||||
〈/DSM|invoke〉
|
||||
<DSM|invoke name="Bash">
|
||||
<DSM|parameter name="description"|>〈![CDATA[Check tracking branch status]]〉〈/DSM|parameter〉
|
||||
<DSM|parameter name="command"|>〈![CDATA[git status -b --short]]〉〈/DSM|parameter〉
|
||||
〈/DSM|invoke〉
|
||||
〈/DSM|tool_calls〉`
|
||||
|
||||
calls := ParseToolCalls(text, []string{"Bash"})
|
||||
if len(calls) != 3 {
|
||||
t.Fatalf("expected three CJK-angle DSM drift calls, got %#v", calls)
|
||||
}
|
||||
if calls[0].Name != "Bash" || calls[0].Input["command"] != "git log --oneline origin/dev..dev" {
|
||||
t.Fatalf("unexpected first CJK-angle DSM drift call: %#v", calls[0])
|
||||
}
|
||||
if calls[1].Name != "Bash" || calls[1].Input["description"] != "Show commits on origin/dev not on local dev" {
|
||||
t.Fatalf("unexpected second CJK-angle DSM drift call: %#v", calls[1])
|
||||
}
|
||||
if calls[2].Name != "Bash" || calls[2].Input["command"] != "git status -b --short" {
|
||||
t.Fatalf("unexpected third CJK-angle DSM drift call: %#v", calls[2])
|
||||
}
|
||||
}
|
||||
|
||||
func TestParseToolCallsIgnoresBareHyphenatedToolCallsLookalike(t *testing.T) {
|
||||
text := `<tool-calls><invoke name="Bash"><parameter name="command">pwd</parameter></invoke></tool-calls>`
|
||||
calls := ParseToolCalls(text, []string{"Bash"})
|
||||
@@ -516,14 +607,17 @@ func TestParseToolCallsDetailedMarksToolCallsSyntax(t *testing.T) {
|
||||
}
|
||||
}
|
||||
|
||||
func TestParseToolCallsRejectsAllEmptyParameterPayload(t *testing.T) {
|
||||
func TestParseToolCallsAllowsAllEmptyParameterPayload(t *testing.T) {
|
||||
text := `<tool_calls><invoke name="Bash"><parameter name="command"></parameter><parameter name="description"> </parameter><parameter name="timeout"></parameter></invoke></tool_calls>`
|
||||
res := ParseToolCallsDetailed(text, []string{"Bash"})
|
||||
if !res.SawToolCallSyntax {
|
||||
t.Fatalf("expected tool syntax to be detected, got %#v", res)
|
||||
}
|
||||
if len(res.Calls) != 0 {
|
||||
t.Fatalf("expected all-empty payload to be rejected, got %#v", res.Calls)
|
||||
if len(res.Calls) != 1 {
|
||||
t.Fatalf("expected all-empty payload to be parsed, got %#v", res.Calls)
|
||||
}
|
||||
if res.Calls[0].Input["command"] != "" || res.Calls[0].Input["description"] != "" || res.Calls[0].Input["timeout"] != "" {
|
||||
t.Fatalf("expected empty parameters to be preserved, got %#v", res.Calls[0].Input)
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
@@ -54,7 +54,7 @@ func consumeXMLToolCapture(captured string, toolNames []string) (prefix string,
|
||||
}
|
||||
if parsed.SawToolCallSyntax {
|
||||
if rejected == nil || tag.Start < rejected.start {
|
||||
rejected = &rejectedBlock{start: tag.Start, prefix: prefixPart, suffix: suffixPart}
|
||||
rejected = &rejectedBlock{start: tag.Start, prefix: prefixPart + xmlBlock, suffix: suffixPart}
|
||||
}
|
||||
searchFrom = tag.End + 1
|
||||
continue
|
||||
@@ -88,7 +88,7 @@ func consumeXMLToolCapture(captured string, toolNames []string) (prefix string,
|
||||
return prefixPart, parsed.Calls, suffixPart, true
|
||||
}
|
||||
if parsed.SawToolCallSyntax {
|
||||
return prefixPart, nil, suffixPart, true
|
||||
return prefixPart + captured[invokeTag.Start:closeTag.End+1], nil, suffixPart, true
|
||||
}
|
||||
return prefixPart + captured[invokeTag.Start:closeTag.End+1], nil, suffixPart, true
|
||||
}
|
||||
|
||||
@@ -1,6 +1,7 @@
|
||||
package toolstream
|
||||
|
||||
import (
|
||||
"ds2api/internal/toolcall"
|
||||
"strings"
|
||||
"testing"
|
||||
)
|
||||
@@ -104,6 +105,99 @@ func TestProcessToolSieveInterceptsDSMLTrailingPipeToolCallWithoutLeak(t *testin
|
||||
}
|
||||
}
|
||||
|
||||
func TestProcessToolSieveInterceptsDSMLControlSeparatorWithoutLeak(t *testing.T) {
|
||||
for _, tc := range []struct {
|
||||
name string
|
||||
sep string
|
||||
}{
|
||||
{name: "control_picture", sep: "␂"},
|
||||
{name: "raw_stx", sep: "\x02"},
|
||||
} {
|
||||
t.Run(tc.name, func(t *testing.T) {
|
||||
sep := tc.sep
|
||||
var state State
|
||||
chunks := []string{
|
||||
"<DSML" + sep + "tool",
|
||||
"_calls>\n",
|
||||
` <DSML` + sep + `invoke name="Read">` + "\n",
|
||||
` <DSML` + sep + `parameter name="file_path"><![CDATA[/tmp/input.txt]]></DSML` + sep + `parameter>` + "\n",
|
||||
" </DSML" + sep + "invoke>\n",
|
||||
"</DSML" + sep + "tool_calls>",
|
||||
}
|
||||
var events []Event
|
||||
for _, c := range chunks {
|
||||
events = append(events, ProcessChunk(&state, c, []string{"Read"})...)
|
||||
}
|
||||
events = append(events, Flush(&state, []string{"Read"})...)
|
||||
|
||||
var textContent strings.Builder
|
||||
var calls []any
|
||||
for _, evt := range events {
|
||||
textContent.WriteString(evt.Content)
|
||||
for _, call := range evt.ToolCalls {
|
||||
calls = append(calls, call)
|
||||
}
|
||||
}
|
||||
if text := textContent.String(); strings.Contains(strings.ToLower(text), "dsml") || strings.Contains(text, "Read") || strings.Contains(text, sep) {
|
||||
t.Fatalf("control-separator DSML tool call leaked to text: %q events=%#v", text, events)
|
||||
}
|
||||
if len(calls) != 1 {
|
||||
t.Fatalf("expected one control-separator DSML tool call, got %d events=%#v", len(calls), events)
|
||||
}
|
||||
})
|
||||
}
|
||||
}
|
||||
|
||||
func TestProcessToolSieveInterceptsArbitraryPrefixedToolTagsWithoutLeak(t *testing.T) {
|
||||
var state State
|
||||
chunks := []string{
|
||||
"<proto💥tool",
|
||||
"_calls>\n",
|
||||
` <proto💥invoke name="Read">` + "\n",
|
||||
` <proto💥parameter name="file_path"><![CDATA[/tmp/input.txt]]></proto💥parameter>` + "\n",
|
||||
" </proto💥invoke>\n",
|
||||
"</proto💥tool_calls>",
|
||||
}
|
||||
var events []Event
|
||||
for _, c := range chunks {
|
||||
events = append(events, ProcessChunk(&state, c, []string{"Read"})...)
|
||||
}
|
||||
events = append(events, Flush(&state, []string{"Read"})...)
|
||||
|
||||
var textContent strings.Builder
|
||||
var calls []any
|
||||
for _, evt := range events {
|
||||
textContent.WriteString(evt.Content)
|
||||
for _, call := range evt.ToolCalls {
|
||||
calls = append(calls, call)
|
||||
}
|
||||
}
|
||||
if text := textContent.String(); strings.Contains(text, "proto") || strings.Contains(text, "Read") || strings.Contains(text, "💥") {
|
||||
t.Fatalf("arbitrary-prefixed tool call leaked to text: %q events=%#v", text, events)
|
||||
}
|
||||
if len(calls) != 1 {
|
||||
t.Fatalf("expected one arbitrary-prefixed tool call, got %d events=%#v", len(calls), events)
|
||||
}
|
||||
}
|
||||
|
||||
func TestProcessToolSieveEmitsEmptyDSMLControlSeparatorBlockWithoutLeak(t *testing.T) {
|
||||
sep := "␂"
|
||||
chunks := []string{
|
||||
"<DSML" + sep + "tool_calls>\n",
|
||||
` <DSML` + sep + `invoke name="Read">` + "\n",
|
||||
` <DSML` + sep + `parameter name="file_path"></DSML` + sep + `parameter>` + "\n",
|
||||
" </DSML" + sep + "invoke>\n",
|
||||
"</DSML" + sep + "tool_calls>",
|
||||
}
|
||||
calls := collectToolCallsForChunks(t, chunks, []string{"Read"})
|
||||
if len(calls) != 1 {
|
||||
t.Fatalf("expected empty control-separator block to produce one call, got %#v", calls)
|
||||
}
|
||||
if calls[0].Name != "Read" || calls[0].Input["file_path"] != "" {
|
||||
t.Fatalf("expected empty file_path parameter to be preserved, got %#v", calls)
|
||||
}
|
||||
}
|
||||
|
||||
func TestProcessToolSieveInterceptsExtraLeadingLessThanDSMLToolCallWithoutLeak(t *testing.T) {
|
||||
var state State
|
||||
chunks := []string{
|
||||
@@ -490,7 +584,7 @@ func TestProcessToolSieveNonToolXMLKeepsSuffixForToolParsing(t *testing.T) {
|
||||
}
|
||||
}
|
||||
|
||||
func TestProcessToolSieveSuppressesMalformedExecutableXMLBlock(t *testing.T) {
|
||||
func TestProcessToolSieveReleasesMalformedExecutableXMLBlock(t *testing.T) {
|
||||
var state State
|
||||
chunk := `<tool_calls><invoke name="read_file"><param>{"path":"README.md"}</param></invoke></tool_calls>`
|
||||
events := ProcessChunk(&state, chunk, []string{"read_file"})
|
||||
@@ -506,13 +600,12 @@ func TestProcessToolSieveSuppressesMalformedExecutableXMLBlock(t *testing.T) {
|
||||
if toolCalls != 0 {
|
||||
t.Fatalf("expected malformed executable-looking XML not to become a tool call, got %d events=%#v", toolCalls, events)
|
||||
}
|
||||
if textContent.Len() != 0 {
|
||||
t.Fatalf("expected malformed executable-looking XML to be suppressed, got %q", textContent.String())
|
||||
if textContent.String() != chunk {
|
||||
t.Fatalf("expected malformed executable-looking XML to be released as text, got %q", textContent.String())
|
||||
}
|
||||
}
|
||||
|
||||
func TestProcessToolSieveSuppressesAllEmptyDSMLToolBlock(t *testing.T) {
|
||||
var state State
|
||||
func TestProcessToolSieveEmitsAllEmptyDSMLToolBlock(t *testing.T) {
|
||||
chunk := strings.Join([]string{
|
||||
`<|DSML|tool_calls>`,
|
||||
`<|DSML|invoke name="Bash">`,
|
||||
@@ -522,22 +615,69 @@ func TestProcessToolSieveSuppressesAllEmptyDSMLToolBlock(t *testing.T) {
|
||||
`</|DSML|invoke>`,
|
||||
`</|DSML|tool_calls>`,
|
||||
}, "\n")
|
||||
events := ProcessChunk(&state, chunk, []string{"Bash"})
|
||||
events = append(events, Flush(&state, []string{"Bash"})...)
|
||||
calls := collectToolCallsForChunks(t, []string{chunk}, []string{"Bash"})
|
||||
if len(calls) != 1 {
|
||||
t.Fatalf("expected all-empty DSML block to produce one tool call, got %#v", calls)
|
||||
}
|
||||
if calls[0].Input["command"] != "" || calls[0].Input["description"] != "" || calls[0].Input["timeout"] != "" {
|
||||
t.Fatalf("expected empty parameters to be preserved, got %#v", calls[0].Input)
|
||||
}
|
||||
}
|
||||
|
||||
func TestProcessToolSieveEmitsChunkedAllEmptyArbitraryPrefixedToolBlock(t *testing.T) {
|
||||
chunk := strings.Join([]string{
|
||||
`<T|DSML|tool_calls>`,
|
||||
` <T|DSML|invoke name="TaskOutput">`,
|
||||
` <T|DSML|parameter name="task_id"></T|DSML|parameter>`,
|
||||
` <T|DSML|parameter name="block"></T|DSML|parameter>`,
|
||||
` <T|DSML|parameter name="timeout"></T|DSML|parameter>`,
|
||||
` </T|DSML|invoke>`,
|
||||
` </T|DSML|tool_calls>`,
|
||||
}, "\n")
|
||||
calls := collectToolCallsForChunks(t, splitEveryNRBytes(chunk, 8), []string{"TaskOutput"})
|
||||
if len(calls) != 1 {
|
||||
t.Fatalf("expected chunked all-empty arbitrary-prefixed block to produce one tool call, got %#v", calls)
|
||||
}
|
||||
if calls[0].Name != "TaskOutput" || calls[0].Input["task_id"] != "" || calls[0].Input["block"] != "" || calls[0].Input["timeout"] != "" {
|
||||
t.Fatalf("expected empty TaskOutput parameters to be preserved, got %#v", calls)
|
||||
}
|
||||
}
|
||||
|
||||
func collectToolCallsForChunks(t *testing.T, chunks []string, toolNames []string) []toolcall.ParsedToolCall {
|
||||
t.Helper()
|
||||
var state State
|
||||
var events []Event
|
||||
for _, chunk := range chunks {
|
||||
events = append(events, ProcessChunk(&state, chunk, toolNames)...)
|
||||
}
|
||||
events = append(events, Flush(&state, toolNames)...)
|
||||
|
||||
var textContent strings.Builder
|
||||
toolCalls := 0
|
||||
var calls []toolcall.ParsedToolCall
|
||||
for _, evt := range events {
|
||||
textContent.WriteString(evt.Content)
|
||||
toolCalls += len(evt.ToolCalls)
|
||||
}
|
||||
|
||||
if toolCalls != 0 {
|
||||
t.Fatalf("expected all-empty DSML block not to produce tool calls, got %d events=%#v", toolCalls, events)
|
||||
calls = append(calls, evt.ToolCalls...)
|
||||
}
|
||||
if textContent.Len() != 0 {
|
||||
t.Fatalf("expected all-empty DSML block not to leak as text, got %q", textContent.String())
|
||||
t.Fatalf("expected tool block not to leak as text, got %q", textContent.String())
|
||||
}
|
||||
return calls
|
||||
}
|
||||
|
||||
func splitEveryNRBytes(s string, n int) []string {
|
||||
if n <= 0 {
|
||||
return []string{s}
|
||||
}
|
||||
out := make([]string, 0, len(s)/n+1)
|
||||
for len(s) > 0 {
|
||||
if len(s) <= n {
|
||||
out = append(out, s)
|
||||
break
|
||||
}
|
||||
out = append(out, s[:n])
|
||||
s = s[n:]
|
||||
}
|
||||
return out
|
||||
}
|
||||
|
||||
func TestProcessToolSievePassesThroughFencedXMLToolCallExamples(t *testing.T) {
|
||||
@@ -671,6 +811,8 @@ func TestFindPartialXMLToolTagStart(t *testing.T) {
|
||||
{"partial_tool_calls", "Hello <tool_ca", 6},
|
||||
{"partial_dsml_trailing_pipe", "Hello <|DSML|tool_calls|", 6},
|
||||
{"partial_dsml_extra_leading_less_than", "Hello <<|DSML|tool_calls", 6},
|
||||
{"partial_arbitrary_prefix_before_dsml", "Hello <T|DS", 6},
|
||||
{"partial_arbitrary_prefix_after_dsml_pipe", "Hello <T|DSML|", 6},
|
||||
{"partial_invoke", "Hello <inv", 6},
|
||||
{"bare_tool_call_not_held", "Hello <tool_name", -1},
|
||||
{"partial_lt_only", "Text <", 5},
|
||||
@@ -1086,3 +1228,37 @@ func TestProcessToolSieveDSMLBarePrefixVariantDoesNotLeak(t *testing.T) {
|
||||
t.Fatalf("expected one tool call from DSML bare prefix variant, got %d events=%#v", toolCalls, events)
|
||||
}
|
||||
}
|
||||
|
||||
func TestProcessToolSieveCJKAngleDSMDriftDoesNotLeak(t *testing.T) {
|
||||
var state State
|
||||
chunks := []string{
|
||||
"<DSM|tool_calls>\n",
|
||||
"<DSM|invoke name=\"Bash\">\n",
|
||||
"<DSM|parameter name=\"description\"|>〈![CDATA[Check tracking branch status]]〉〈/DSM|parameter〉\n",
|
||||
"<DSM|parameter name=\"command\"|>〈![CDATA[git status -b --short]]〉〈/DSM|parameter〉\n",
|
||||
"〈/DSM|invoke〉\n",
|
||||
"〈/DSM|tool_calls〉",
|
||||
}
|
||||
var events []Event
|
||||
for _, c := range chunks {
|
||||
events = append(events, ProcessChunk(&state, c, []string{"Bash"})...)
|
||||
}
|
||||
events = append(events, Flush(&state, []string{"Bash"})...)
|
||||
|
||||
var textContent string
|
||||
var calls []toolcall.ParsedToolCall
|
||||
for _, evt := range events {
|
||||
textContent += evt.Content
|
||||
calls = append(calls, evt.ToolCalls...)
|
||||
}
|
||||
|
||||
if strings.Contains(textContent, "DSM") || strings.Contains(textContent, "git status") {
|
||||
t.Fatalf("CJK-angle DSM drift leaked to text: %q events=%#v", textContent, events)
|
||||
}
|
||||
if len(calls) != 1 {
|
||||
t.Fatalf("expected one CJK-angle DSM drift tool call, got %d events=%#v", len(calls), events)
|
||||
}
|
||||
if calls[0].Name != "Bash" || calls[0].Input["command"] != "git status -b --short" {
|
||||
t.Fatalf("unexpected CJK-angle DSM drift call: %#v", calls[0])
|
||||
}
|
||||
}
|
||||
|
||||
@@ -187,9 +187,9 @@ test('vercel stream emits Go-parity empty-output failure on DONE', async () => {
|
||||
const { frames } = await runMockVercelStream(['data: [DONE]\n\n']);
|
||||
assert.equal(frames.length, 2);
|
||||
const failed = JSON.parse(frames[0]);
|
||||
assert.equal(failed.status_code, 429);
|
||||
assert.equal(failed.error.type, 'rate_limit_error');
|
||||
assert.equal(failed.error.code, 'upstream_empty_output');
|
||||
assert.equal(failed.status_code, 503);
|
||||
assert.equal(failed.error.type, 'service_unavailable_error');
|
||||
assert.equal(failed.error.code, 'upstream_unavailable');
|
||||
assert.equal(frames[1], '[DONE]');
|
||||
});
|
||||
|
||||
@@ -209,6 +209,21 @@ test('vercel stream retries empty output once and keeps one terminal frame', asy
|
||||
assert.match(completionBodies[1].prompt, /Previous reply had no visible output\. Please regenerate the visible final answer or tool call now\.$/);
|
||||
});
|
||||
|
||||
test('vercel stream retries thinking-only output once', async () => {
|
||||
const { frames, fetchURLs, fetchBodies } = await runMockVercelStreamSequence([
|
||||
['data: {"response_message_id":42,"p":"response/thinking_content","v":"plan"}\n\n', 'data: [DONE]\n\n'],
|
||||
['data: {"p":"response/content","v":"visible"}\n\n', 'data: [DONE]\n\n'],
|
||||
], { thinking_enabled: true });
|
||||
const parsed = frames.filter((frame) => frame !== '[DONE]').map((frame) => JSON.parse(frame));
|
||||
const completionBodies = fetchBodies.filter((body) => Object.hasOwn(body, 'prompt'));
|
||||
assert.equal(fetchURLs.filter((url) => url === 'https://chat.deepseek.com/api/v0/chat/completion').length, 2);
|
||||
assert.equal(frames.filter((frame) => frame === '[DONE]').length, 1);
|
||||
assert.equal(completionBodies[1].parent_message_id, 42);
|
||||
assert.equal(parsed[0].choices[0].delta.reasoning_content, 'plan');
|
||||
assert.equal(parsed[1].choices[0].delta.content, 'visible');
|
||||
assert.equal(parsed[2].choices[0].finish_reason, 'stop');
|
||||
});
|
||||
|
||||
test('vercel stream coalesces many small content deltas while keeping one choice', async () => {
|
||||
const lines = Array.from({ length: 100 }, () => `data: ${JSON.stringify({ p: 'response/content', v: '字' })}\n\n`);
|
||||
lines.push('data: [DONE]\n\n');
|
||||
|
||||
@@ -80,6 +80,118 @@ EOF
|
||||
assert.equal(calls[0].input.command.includes('Co-Authored-By: Claude Opus 4.7'), true);
|
||||
});
|
||||
|
||||
test('parseToolCalls parses underscored DSML shell (Vercel parity)', () => {
|
||||
const payload = `<dsml_tool_calls>
|
||||
<dsml_invoke name="search_web">
|
||||
<dsml_parameter name="query"><![CDATA[2026年5月 热点事件]]></dsml_parameter>
|
||||
<dsml_parameter name="topic"><![CDATA[news]]></dsml_parameter>
|
||||
</dsml_invoke>
|
||||
<dsml_invoke name="eval_javascript">
|
||||
<dsml_parameter name="code"><![CDATA[1 + 1]]></dsml_parameter>
|
||||
</dsml_invoke>
|
||||
</dsml_tool_calls>`;
|
||||
const calls = parseToolCalls(payload, ['search_web', 'eval_javascript']);
|
||||
assert.equal(calls.length, 2);
|
||||
assert.equal(calls[0].name, 'search_web');
|
||||
assert.deepEqual(calls[0].input, { query: '2026年5月 热点事件', topic: 'news' });
|
||||
assert.equal(calls[1].name, 'eval_javascript');
|
||||
assert.deepEqual(calls[1].input, { code: '1 + 1' });
|
||||
});
|
||||
|
||||
test('parseToolCalls parses arbitrary-prefixed tool markup shells', () => {
|
||||
const samples = [
|
||||
'<abc|tool_calls><abc|invoke name="Read"><abc|parameter name="file_path">README.md</abc|parameter></abc|invoke></abc|tool_calls>',
|
||||
'<vendor_tool_calls><vendor_invoke name="Read"><vendor_parameter name="file_path">README.md</vendor_parameter></vendor_invoke></vendor_tool_calls>',
|
||||
'<agent - tool_calls><agent - invoke name="Read"><agent - parameter name="file_path">README.md</agent - parameter></agent - invoke></agent - tool_calls>',
|
||||
];
|
||||
for (const payload of samples) {
|
||||
const calls = parseToolCalls(payload, ['Read']);
|
||||
assert.equal(calls.length, 1);
|
||||
assert.equal(calls[0].name, 'Read');
|
||||
assert.deepEqual(calls[0].input, { file_path: 'README.md' });
|
||||
}
|
||||
});
|
||||
|
||||
test('parseToolCalls parses fullwidth DSML shell drift', () => {
|
||||
const payload = `<dSML|tool_calls>
|
||||
<dSML|invoke name="Read">
|
||||
<dSML|parameter name="file_path"><![CDATA[/Users/aq/Desktop/myproject/Personal_Blog/README.md]]></dSML|parameter>
|
||||
</dSML|invoke>
|
||||
<dSML|invoke name="Read">
|
||||
<dSML|parameter name="file_path"><![CDATA[/Users/aq/Desktop/myproject/Personal_Blog/index.html]]></dSML|parameter>
|
||||
</dSML|invoke>
|
||||
</dSML|tool_calls>`;
|
||||
const calls = parseToolCalls(payload, ['Read']);
|
||||
assert.equal(calls.length, 2);
|
||||
assert.equal(calls[0].name, 'Read');
|
||||
assert.deepEqual(calls[0].input, { file_path: '/Users/aq/Desktop/myproject/Personal_Blog/README.md' });
|
||||
assert.equal(calls[1].name, 'Read');
|
||||
assert.deepEqual(calls[1].input, { file_path: '/Users/aq/Desktop/myproject/Personal_Blog/index.html' });
|
||||
});
|
||||
|
||||
test('parseToolCalls parses CJK-angle DSM drift', () => {
|
||||
const payload = `<DSM|tool_calls>
|
||||
<DSM|invoke name="Bash">
|
||||
<DSM|parameter name="description"|>〈![CDATA[Show commits on local dev not on origin/dev]]〉〈/DSM|parameter〉
|
||||
<DSM|parameter name="command"|>〈![CDATA[git log --oneline origin/dev..dev]]〉〈/DSM|parameter〉
|
||||
〈/DSM|invoke〉
|
||||
<DSM|invoke name="Bash">
|
||||
<DSM|parameter name="description"|>〈![CDATA[Show commits on origin/dev not on local dev]]〉〈/DSM|parameter〉
|
||||
<DSM|parameter name="command"|>〈![CDATA[git log --oneline dev..origin/dev]]〉〈/DSM|parameter〉
|
||||
〈/DSM|invoke〉
|
||||
<DSM|invoke name="Bash">
|
||||
<DSM|parameter name="description"|>〈![CDATA[Check tracking branch status]]〉〈/DSM|parameter〉
|
||||
<DSM|parameter name="command"|>〈![CDATA[git status -b --short]]〉〈/DSM|parameter〉
|
||||
〈/DSM|invoke〉
|
||||
〈/DSM|tool_calls〉`;
|
||||
const calls = parseToolCalls(payload, ['Bash']);
|
||||
assert.equal(calls.length, 3);
|
||||
assert.equal(calls[0].name, 'Bash');
|
||||
assert.equal(calls[0].input.command, 'git log --oneline origin/dev..dev');
|
||||
assert.equal(calls[1].input.description, 'Show commits on origin/dev not on local dev');
|
||||
assert.equal(calls[2].input.command, 'git status -b --short');
|
||||
});
|
||||
|
||||
test('parseToolCalls parses DSML control separator drift', () => {
|
||||
for (const sep of ['␂', '\x02']) {
|
||||
const payload = `<DSML${sep}tool_calls>
|
||||
<DSML${sep}invoke name="Read">
|
||||
<DSML${sep}parameter name="file_path"><![CDATA[/tmp/input.txt]]></DSML${sep}parameter>
|
||||
</DSML${sep}invoke>
|
||||
</DSML${sep}tool_calls>`;
|
||||
const calls = parseToolCalls(payload, ['Read']);
|
||||
assert.equal(calls.length, 1);
|
||||
assert.equal(calls[0].name, 'Read');
|
||||
assert.deepEqual(calls[0].input, { file_path: '/tmp/input.txt' });
|
||||
}
|
||||
});
|
||||
|
||||
test('parseToolCalls parses arbitrary-prefixed tool tags', () => {
|
||||
const payload = `<proto💥tool_calls>
|
||||
<proto💥invoke name="Read">
|
||||
<proto💥parameter name="file_path"><![CDATA[/tmp/input.txt]]></proto💥parameter>
|
||||
</proto💥invoke>
|
||||
</proto💥tool_calls>`;
|
||||
const calls = parseToolCalls(payload, ['Read']);
|
||||
assert.equal(calls.length, 1);
|
||||
assert.equal(calls[0].name, 'Read');
|
||||
assert.deepEqual(calls[0].input, { file_path: '/tmp/input.txt' });
|
||||
});
|
||||
|
||||
test('parseToolCalls allows all-empty parameter payloads', () => {
|
||||
const payload = `<T|DSML|tool_calls>
|
||||
<T|DSML|invoke name="TaskOutput">
|
||||
<T|DSML|parameter name="task_id"></T|DSML|parameter>
|
||||
<T|DSML|parameter name="block"></T|DSML|parameter>
|
||||
<T|DSML|parameter name="timeout"></T|DSML|parameter>
|
||||
</T|DSML|invoke>
|
||||
</T|DSML|tool_calls>`;
|
||||
const calls = parseToolCalls(payload, ['TaskOutput']);
|
||||
assert.equal(calls.length, 1);
|
||||
assert.equal(calls[0].name, 'TaskOutput');
|
||||
assert.deepEqual(calls[0].input, { task_id: '', block: '', timeout: '' });
|
||||
});
|
||||
|
||||
test('parseToolCalls ignores bare hyphenated tool_calls lookalike', () => {
|
||||
const payload = '<tool-calls><invoke name="Bash"><parameter name="command">pwd</parameter></invoke></tool-calls>';
|
||||
const calls = parseToolCalls(payload, ['Bash']);
|
||||
@@ -396,6 +508,80 @@ test('sieve emits tool_calls for DSML trailing pipe tag terminator', () => {
|
||||
assert.equal(text.toLowerCase().includes('dsml'), false);
|
||||
});
|
||||
|
||||
test('sieve emits tool_calls for DSML control separator drift', () => {
|
||||
for (const sep of ['␂', '\x02']) {
|
||||
const events = runSieve([
|
||||
`<DSML${sep}tool`,
|
||||
'_calls>\n',
|
||||
`<DSML${sep}invoke name="Read">\n`,
|
||||
`<DSML${sep}parameter name="file_path"><![CDATA[/tmp/input.txt]]></DSML${sep}parameter>\n`,
|
||||
`</DSML${sep}invoke>\n`,
|
||||
`</DSML${sep}tool_calls>`,
|
||||
], ['Read']);
|
||||
const finalCalls = events.filter((evt) => evt.type === 'tool_calls').flatMap((evt) => evt.calls || []);
|
||||
assert.equal(finalCalls.length, 1);
|
||||
assert.equal(finalCalls[0].name, 'Read');
|
||||
assert.equal(finalCalls[0].input.file_path, '/tmp/input.txt');
|
||||
const text = collectText(events);
|
||||
assert.equal(text.toLowerCase().includes('dsml'), false);
|
||||
assert.equal(text.includes(sep), false);
|
||||
}
|
||||
});
|
||||
|
||||
test('sieve emits tool_calls for arbitrary-prefixed tool tags', () => {
|
||||
const events = runSieve([
|
||||
'<proto💥tool',
|
||||
'_calls>\n',
|
||||
'<proto💥invoke name="Read">\n',
|
||||
'<proto💥parameter name="file_path"><![CDATA[/tmp/input.txt]]></proto💥parameter>\n',
|
||||
'</proto💥invoke>\n',
|
||||
'</proto💥tool_calls>',
|
||||
], ['Read']);
|
||||
const finalCalls = events.filter((evt) => evt.type === 'tool_calls').flatMap((evt) => evt.calls || []);
|
||||
assert.equal(finalCalls.length, 1);
|
||||
assert.equal(finalCalls[0].name, 'Read');
|
||||
assert.equal(finalCalls[0].input.file_path, '/tmp/input.txt');
|
||||
const text = collectText(events);
|
||||
assert.equal(text.includes('proto'), false);
|
||||
assert.equal(text.includes('💥'), false);
|
||||
});
|
||||
|
||||
test('sieve emits tool_calls for CJK-angle DSM drift', () => {
|
||||
const events = runSieve([
|
||||
'<DSM|tool_calls>\n',
|
||||
'<DSM|invoke name="Bash">\n',
|
||||
'<DSM|parameter name="description"|>〈![CDATA[Check tracking branch status]]〉〈/DSM|parameter〉\n',
|
||||
'<DSM|parameter name="command"|>〈![CDATA[git status -b --short]]〉〈/DSM|parameter〉\n',
|
||||
'〈/DSM|invoke〉\n',
|
||||
'〈/DSM|tool_calls〉',
|
||||
], ['Bash']);
|
||||
const finalCalls = events.flatMap((evt) => (evt.type === 'tool_calls' ? evt.calls : []));
|
||||
assert.equal(finalCalls.length, 1);
|
||||
assert.equal(finalCalls[0].name, 'Bash');
|
||||
assert.equal(finalCalls[0].input.command, 'git status -b --short');
|
||||
assert.equal(collectText(events), '');
|
||||
});
|
||||
|
||||
test('sieve emits all-empty arbitrary-prefixed tool tags without leaking text', () => {
|
||||
const payload = [
|
||||
'<T|DSML|tool_calls>\n',
|
||||
' <T|DSML|invoke name="TaskOutput">\n',
|
||||
' <T|DSML|parameter name="task_id"></T|DSML|parameter>\n',
|
||||
' <T|DSML|parameter name="block"></T|DSML|parameter>\n',
|
||||
' <T|DSML|parameter name="timeout"></T|DSML|parameter>\n',
|
||||
' </T|DSML|invoke>\n',
|
||||
'</T|DSML|tool_calls>',
|
||||
].join('');
|
||||
for (const chunks of [[payload], payload.match(/.{1,8}/gs)]) {
|
||||
const events = runSieve(chunks, ['TaskOutput']);
|
||||
const finalCalls = events.filter((evt) => evt.type === 'tool_calls').flatMap((evt) => evt.calls || []);
|
||||
assert.equal(finalCalls.length, 1);
|
||||
assert.equal(finalCalls[0].name, 'TaskOutput');
|
||||
assert.deepEqual(finalCalls[0].input, { task_id: '', block: '', timeout: '' });
|
||||
assert.equal(collectText(events), '');
|
||||
}
|
||||
});
|
||||
|
||||
test('sieve emits tool_calls for extra leading less-than DSML tags without leaking prefix', () => {
|
||||
const events = runSieve([
|
||||
'<<|DSML|tool_calls>\n',
|
||||
@@ -734,7 +920,7 @@ test('sieve keeps embedded invalid tool-like json as normal text to avoid stream
|
||||
assert.equal(leakedText.toLowerCase().includes('tool_calls'), true);
|
||||
});
|
||||
|
||||
test('sieve passes malformed executable-looking XML through as text', () => {
|
||||
test('sieve releases malformed executable-looking XML wrappers as text', () => {
|
||||
const chunk = '<tool_calls><invoke name="read_file"><param>{"path":"README.MD"}</param></invoke></tool_calls>';
|
||||
const events = runSieve([chunk], ['read_file']);
|
||||
const leakedText = collectText(events);
|
||||
|
||||
Reference in New Issue
Block a user