From 08f32c4c4091f11c3398af6f26ec745d25fe744e Mon Sep 17 00:00:00 2001 From: "CJACK." <155826701+CJackHwang@users.noreply.github.com> Date: Sun, 19 Apr 2026 21:04:06 +0800 Subject: [PATCH 1/3] docs: align API docs with implemented routes --- API.en.md | 52 +++++++++++++++++++++++++++++++++++++++++++++++++++- API.md | 52 +++++++++++++++++++++++++++++++++++++++++++++++++++- README.MD | 4 ++-- README.en.md | 4 ++-- 4 files changed, 106 insertions(+), 6 deletions(-) diff --git a/API.en.md b/API.en.md index d363d3e..7d1ef36 100644 --- a/API.en.md +++ b/API.en.md @@ -108,6 +108,7 @@ Gemini-compatible clients can also send `x-goog-api-key`, `?key=`, or `?api_key= | POST | `/v1/responses` | Business | OpenAI Responses API (stream/non-stream) | | GET | `/v1/responses/{response_id}` | Business | Query stored response (in-memory TTL) | | POST | `/v1/embeddings` | Business | OpenAI Embeddings API | +| POST | `/v1/files` | Business | OpenAI Files upload (multipart/form-data) | | GET | `/anthropic/v1/models` | None | Claude model list | | POST | `/anthropic/v1/messages` | Business | Claude messages | | POST | `/anthropic/v1/messages/count_tokens` | Business | Claude token counting | @@ -131,9 +132,15 @@ Gemini-compatible clients can also send `x-goog-api-key`, `?key=`, or `?api_key= | GET | `/admin/config/export` | Admin | Export full config (`config`/`json`/`base64`) | | POST | `/admin/keys` | Admin | Add API key | | DELETE | `/admin/keys/{key}` | Admin | Delete API key | +| GET | `/admin/proxies` | Admin | List proxies | +| POST | `/admin/proxies` | Admin | Add proxy | +| PUT | `/admin/proxies/{proxyID}` | Admin | Update proxy (empty password keeps old secret) | +| DELETE | `/admin/proxies/{proxyID}` | Admin | Delete proxy (auto-unbind referenced accounts) | +| POST | `/admin/proxies/test` | Admin | Test proxy connectivity | | GET | `/admin/accounts` | Admin | Paginated account list | | POST | `/admin/accounts` | Admin | Add account | | DELETE | `/admin/accounts/{identifier}` 
| Admin | Delete account | +| PUT | `/admin/accounts/{identifier}/proxy` | Admin | Bind/unbind proxy for an account | | GET | `/admin/queue/status` | Admin | Account queue status | | POST | `/admin/accounts/test` | Admin | Test one account | | POST | `/admin/accounts/test-all` | Admin | Test all accounts | @@ -391,6 +398,21 @@ Business auth required. Returns OpenAI-compatible embeddings shape. > Requires `embeddings.provider`. Current supported values: `mock` / `deterministic` / `builtin`. If missing/unsupported, returns standard error shape with HTTP 501. +### `POST /v1/files` + +Business auth required. OpenAI Files-compatible upload endpoint; currently only `multipart/form-data` is supported. + +| Field | Type | Required | Notes | +| --- | --- | --- | --- | +| `file` | file | ✅ | Binary payload | +| `purpose` | string | ❌ | Forwarded purpose field | + +Constraints and behavior: + +- `Content-Type` must be `multipart/form-data` (otherwise `400`). +- Total request size limit is `100 MiB` (over-limit returns `413`). +- Success returns an OpenAI `file` object (`id/object/bytes/filename/purpose/status`, etc.) and includes `account_id` for source-account tracing. + --- ## Claude-Compatible API @@ -723,6 +745,26 @@ Exports full config in three forms: `config`, `json`, and `base64`. **Response**: `{"success": true, "total_keys": 2}` +### `GET /admin/proxies` + +Lists proxy configs (password is never returned; use `has_password` as a marker). + +### `POST /admin/proxies` + +Adds a proxy. Request accepts `id` (optional; auto-generated when omitted), `name`, `type` (`http` / `socks5`), `host`, `port`, `username`, `password`. + +### `PUT /admin/proxies/{proxyID}` + +Updates a proxy. If `password` is an empty string, the existing secret is preserved. + +### `DELETE /admin/proxies/{proxyID}` + +Deletes a proxy and automatically clears `proxy_id` on all accounts that reference it. 
+ +### `POST /admin/proxies/test` + +Tests proxy connectivity: provide `proxy_id` to test a saved proxy; omit it to run a one-off test using proxy fields in the request body. + ### `GET /admin/accounts` **Query params**: @@ -730,7 +772,7 @@ Exports full config in three forms: `config`, `json`, and `base64`. | Param | Default | Range | | --- | --- | --- | | `page` | `1` | ≥ 1 | -| `page_size` | `10` | 1–100 | +| `page_size` | `10` | 1–5000 | | `q` | empty | Filter by identifier / email / mobile | **Response**: @@ -771,6 +813,14 @@ Returned items also include `test_status`, usually `ok` or `failed`. **Response**: `{"success": true, "total_accounts": 5}` +### `PUT /admin/accounts/{identifier}/proxy` + +Updates proxy binding for a specific account. + +- Request body: `{"proxy_id":"..."}`. +- Use empty `proxy_id` to unbind proxy. +- `identifier` supports email / mobile / token-only synthetic id. + ### `GET /admin/queue/status` ```json diff --git a/API.md b/API.md index 6c73b05..b87332d 100644 --- a/API.md +++ b/API.md @@ -108,6 +108,7 @@ Gemini 兼容客户端还可以使用 `x-goog-api-key`、`?key=` 或 `?api_key=` | POST | `/v1/responses` | 业务 | OpenAI Responses 接口(流式/非流式) | | GET | `/v1/responses/{response_id}` | 业务 | 查询已生成 response(内存 TTL) | | POST | `/v1/embeddings` | 业务 | OpenAI Embeddings 接口 | +| POST | `/v1/files` | 业务 | OpenAI Files 上传(multipart/form-data) | | GET | `/anthropic/v1/models` | 无 | Claude 模型列表 | | POST | `/anthropic/v1/messages` | 业务 | Claude 消息接口 | | POST | `/anthropic/v1/messages/count_tokens` | 业务 | Claude token 计数 | @@ -131,9 +132,15 @@ Gemini 兼容客户端还可以使用 `x-goog-api-key`、`?key=` 或 `?api_key=` | GET | `/admin/config/export` | Admin | 导出完整配置(含 `config`/`json`/`base64`) | | POST | `/admin/keys` | Admin | 添加 API key | | DELETE | `/admin/keys/{key}` | Admin | 删除 API key | +| GET | `/admin/proxies` | Admin | 代理列表 | +| POST | `/admin/proxies` | Admin | 添加代理 | +| PUT | `/admin/proxies/{proxyID}` | Admin | 更新代理(留空 password 表示保留原密码) | +| DELETE | `/admin/proxies/{proxyID}` | 
Admin | 删除代理(自动解绑引用该代理的账号) | +| POST | `/admin/proxies/test` | Admin | 测试代理连通性 | | GET | `/admin/accounts` | Admin | 分页账号列表 | | POST | `/admin/accounts` | Admin | 添加账号 | | DELETE | `/admin/accounts/{identifier}` | Admin | 删除账号 | +| PUT | `/admin/accounts/{identifier}/proxy` | Admin | 为账号绑定/解绑代理 | | GET | `/admin/queue/status` | Admin | 账号队列状态 | | POST | `/admin/accounts/test` | Admin | 测试单个账号 | | POST | `/admin/accounts/test-all` | Admin | 测试全部账号 | @@ -397,6 +404,21 @@ data: [DONE] > 需配置 `embeddings.provider`。当前支持:`mock` / `deterministic` / `builtin`。未配置或不支持时返回标准错误结构(HTTP 501)。 +### `POST /v1/files` + +需要业务鉴权。兼容 OpenAI Files 上传接口,当前仅支持 `multipart/form-data`。 + +| 字段 | 类型 | 必填 | 说明 | +| --- | --- | --- | --- | +| `file` | file | ✅ | 上传文件二进制 | +| `purpose` | string | ❌ | 透传到上游用途字段 | + +约束与行为: + +- 请求必须为 `multipart/form-data`,否则返回 `400`。 +- 请求体总大小上限 `100 MiB`(超限返回 `413`)。 +- 成功返回 OpenAI `file` 对象(`id/object/bytes/filename/purpose/status` 等字段),并附带 `account_id` 便于定位来源账号。 + --- ## Claude 兼容接口 @@ -729,6 +751,26 @@ data: {"type":"message_stop"} **响应**:`{"success": true, "total_keys": 2}` +### `GET /admin/proxies` + +列出代理配置(密码不回传,仅返回 `has_password` 标记)。 + +### `POST /admin/proxies` + +新增代理。请求体支持 `id`(可选,未传则自动生成)、`name`、`type`(`http` / `socks5`)、`host`、`port`、`username`、`password`。 + +### `PUT /admin/proxies/{proxyID}` + +更新指定代理。若请求中 `password` 为空字符串,则保留原密码。 + +### `DELETE /admin/proxies/{proxyID}` + +删除代理,并自动清空所有引用该代理账号的 `proxy_id`。 + +### `POST /admin/proxies/test` + +测试代理连通性:传 `proxy_id` 时测试已保存代理;不传时按请求体代理字段做临时连通性测试。 + ### `GET /admin/accounts` **查询参数**: @@ -736,7 +778,7 @@ data: {"type":"message_stop"} | 参数 | 默认 | 范围 | | --- | --- | --- | | `page` | `1` | ≥ 1 | -| `page_size` | `10` | 1–100 | +| `page_size` | `10` | 1–5000 | | `q` | 空 | 按 identifier / email / mobile 过滤 | **响应**: @@ -775,6 +817,14 @@ data: {"type":"message_stop"} **响应**:`{"success": true, "total_accounts": 5}` +### `PUT /admin/accounts/{identifier}/proxy` + +更新指定账号绑定代理。 + +- 请求体:`{"proxy_id":"..."}`; +- 
`proxy_id` 传空字符串时表示解绑代理; +- `identifier` 支持 email / mobile / token-only 合成标识。 + ### `GET /admin/queue/status` ```json diff --git a/README.MD b/README.MD index 4fda982..31ba76c 100644 --- a/README.MD +++ b/README.MD @@ -96,14 +96,14 @@ flowchart LR | 能力 | 说明 | | --- | --- | -| OpenAI 兼容 | `GET /v1/models`、`GET /v1/models/{id}`、`POST /v1/chat/completions`、`POST /v1/responses`、`GET /v1/responses/{response_id}`、`POST /v1/embeddings` | +| OpenAI 兼容 | `GET /v1/models`、`GET /v1/models/{id}`、`POST /v1/chat/completions`、`POST /v1/responses`、`GET /v1/responses/{response_id}`、`POST /v1/embeddings`、`POST /v1/files` | | Claude 兼容 | `GET /anthropic/v1/models`、`POST /anthropic/v1/messages`、`POST /anthropic/v1/messages/count_tokens`(及快捷路径 `/v1/messages`、`/messages`) | | Gemini 兼容 | `POST /v1beta/models/{model}:generateContent`、`POST /v1beta/models/{model}:streamGenerateContent`(及 `/v1/models/{model}:*` 路径) | | 多账号轮询 | 自动 token 刷新、邮箱/手机号双登录方式 | | 并发队列控制 | 每账号 in-flight 上限 + 等待队列,动态计算建议并发值 | | DeepSeek PoW | 纯 Go 高性能实现(DeepSeekHashV1),毫秒级响应 | | Tool Calling | 防泄漏处理:非代码块高置信特征识别、`delta.tool_calls` 早发、结构化增量输出 | -| Admin API | 配置管理、运行时设置热更新、账号测试 / 批量测试、会话清理、导入导出、Vercel 同步、版本检查 | +| Admin API | 配置管理、运行时设置热更新、代理管理、账号测试 / 批量测试、会话清理、导入导出、Vercel 同步、版本检查 | | WebUI 管理台 | `/admin` 单页应用(中英文双语、深色模式) | | 运维探针 | `GET /healthz`(存活)、`GET /readyz`(就绪) | diff --git a/README.en.md b/README.en.md index 6eafc67..23c2a94 100644 --- a/README.en.md +++ b/README.en.md @@ -94,14 +94,14 @@ For the full module-by-module architecture and directory responsibilities, see [ | Capability | Details | | --- | --- | -| OpenAI compatible | `GET /v1/models`, `GET /v1/models/{id}`, `POST /v1/chat/completions`, `POST /v1/responses`, `GET /v1/responses/{response_id}`, `POST /v1/embeddings` | +| OpenAI compatible | `GET /v1/models`, `GET /v1/models/{id}`, `POST /v1/chat/completions`, `POST /v1/responses`, `GET /v1/responses/{response_id}`, `POST /v1/embeddings`, `POST /v1/files` | | Claude compatible | `GET 
/anthropic/v1/models`, `POST /anthropic/v1/messages`, `POST /anthropic/v1/messages/count_tokens` (plus shortcut paths `/v1/messages`, `/messages`) | | Gemini compatible | `POST /v1beta/models/{model}:generateContent`, `POST /v1beta/models/{model}:streamGenerateContent` (plus `/v1/models/{model}:*` paths) | | Multi-account rotation | Auto token refresh, email/mobile dual login | | Concurrency control | Per-account in-flight limit + waiting queue, dynamic recommended concurrency | | DeepSeek PoW | Pure Go high-performance solver (DeepSeekHashV1), ms-level response | | Tool Calling | Anti-leak handling: non-code-block feature match, early `delta.tool_calls`, structured incremental output | -| Admin API | Config management, runtime settings hot-reload, account testing/batch test, session cleanup, import/export, Vercel sync, version check | +| Admin API | Config management, runtime settings hot-reload, proxy management, account testing/batch test, session cleanup, import/export, Vercel sync, version check | | WebUI Admin Panel | SPA at `/admin` (bilingual Chinese/English, dark mode) | | Health Probes | `GET /healthz` (liveness), `GET /readyz` (readiness) | From 0e7f5cdc86e4bcfb88290a0ae945b863e612dc52 Mon Sep 17 00:00:00 2001 From: "CJACK." 
<155826701+CJackHwang@users.noreply.github.com> Date: Sun, 19 Apr 2026 23:12:13 +0800 Subject: [PATCH 2/3] docs: sync tool-calling semantics with current implementation --- API.en.md | 9 +++- API.md | 6 +-- README.MD | 6 +-- README.en.md | 6 +-- docs/ARCHITECTURE.en.md | 2 +- docs/ARCHITECTURE.md | 2 +- docs/toolcall-semantics.md | 98 +++++++++++++++++++------------------- 7 files changed, 67 insertions(+), 62 deletions(-) diff --git a/API.en.md b/API.en.md index 7d1ef36..2150ded 100644 --- a/API.en.md +++ b/API.en.md @@ -37,7 +37,7 @@ Docs: [Overview](README.en.md) / [Architecture](docs/ARCHITECTURE.en.md) / [Depl - OpenAI / Claude / Gemini protocols are now mounted on one shared `chi` router tree assembled in `internal/server/router.go`. - Adapter responsibilities are streamlined to: **request normalization → DeepSeek invocation → protocol-shaped rendering**, reducing legacy split-logic paths. -- Tool-calling semantics are aligned between Go and Node runtime: structured parsing first (JSON/XML/invoke/markup), plus stream-time anti-leak filtering. +- Tool-calling semantics are aligned between Go and Node runtime: parsing is now centered on XML/Markup-family tool syntax (`` / `` / `` / `tool_use` / antml variants), plus stream-time anti-leak filtering. - `Admin API` separates static config from runtime policy: `/admin/config*` for configuration state, `/admin/settings*` for runtime behavior. --- @@ -319,7 +319,12 @@ When `tools` is present, DS2API performs anti-leak handling: } ``` -**Stream**: Once high-confidence toolcall features are matched, DS2API emits `delta.tool_calls` immediately (without waiting for full JSON closure), then keeps sending argument deltas; confirmed raw tool JSON is never forwarded as `delta.content`. 
+**Stream**: Once high-confidence toolcall features are matched, DS2API emits `delta.tool_calls` immediately (without waiting for full argument closure), then keeps sending argument deltas; confirmed tool-call fragments are not forwarded as `delta.content`. + +Additional notes: + +- The parser currently follows XML/Markup-family tool payloads (``, ``, ``, `tool_use`, antml variants). Standalone JSON `tool_calls` payloads are not treated as executable tool calls by default. +- `tool_calls` shown inside fenced markdown code blocks (for example, ```json ... ```) are treated as examples, not executable calls. --- diff --git a/API.md b/API.md index b87332d..bb06170 100644 --- a/API.md +++ b/API.md @@ -37,7 +37,7 @@ - OpenAI / Claude / Gemini 三套协议已统一挂在同一 `chi` 路由树上,由 `internal/server/router.go` 负责装配。 - 适配器层职责收敛为:**请求归一化 → DeepSeek 调用 → 协议形态渲染**,减少历史版本中“同能力多处实现”的分叉。 -- Tool Calling 的解析策略在 Go 与 Node Runtime 间保持一致:优先结构化解析(JSON/XML/invoke/markup),并在流式场景执行防泄漏筛分。 +- Tool Calling 的解析策略在 Go 与 Node Runtime 间保持一致:当前以 XML/Markup 家族解析为主(含 `` / `` / `` / `tool_use` / antml 变体),并在流式场景执行防泄漏筛分。 - `Admin API` 将配置与运行时策略分开:`/admin/config*` 管静态配置,`/admin/settings*` 管运行时行为。 --- @@ -319,12 +319,12 @@ data: [DONE] } ``` -**流式**:命中高置信特征后立即输出 `delta.tool_calls`(不等待完整 JSON 闭合),并持续发送 arguments 增量;已确认的 toolcall 原始 JSON 不会回流到 `delta.content`。 +**流式**:命中高置信特征后立即输出 `delta.tool_calls`(不等待完整工具参数闭合),并持续发送 arguments 增量;已确认的工具调用片段不会回流到 `delta.content`。 补充说明: - **非代码块上下文**下,工具负载即使与普通文本混合,也会按特征识别并产出可执行 tool call(前后普通文本仍可透传)。 -- 解析器以 XML/Markup 为最高优先级,并兼容 JSON、ANTML、text-kv 等格式输入;最终按客户端协议转译为对应 tool call 结构(OpenAI/Claude/Gemini)。 +- 解析器当前走 XML/Markup 家族(包含 ``、``、``、`tool_use`、antml 风格);纯 JSON `tool_calls` 片段默认不会直接作为可执行调用解析。 - Markdown fenced code block(例如 ```json ... 
```)中的 `tool_calls` 仅视为示例文本,不会被执行。 --- diff --git a/README.MD b/README.MD index 31ba76c..105b979 100644 --- a/README.MD +++ b/README.MD @@ -87,7 +87,7 @@ flowchart LR - **统一路由内核**:所有协议入口统一汇聚到 `internal/server/router.go`,并在同一路由树中注册 OpenAI / Claude / Gemini / Admin / WebUI 路由,避免多入口行为漂移。 - **统一执行链路**:Claude / Gemini 入口先经 `internal/translatorcliproxy` 做协议转换,再进入 `openai.ChatCompletions` 统一处理工具调用与流式语义,最后再转换回原协议响应。 - **适配器分层更清晰**:`internal/adapter/{claude,gemini}` 负责入口/出口协议封装,`internal/adapter/openai` 负责核心执行,DeepSeek 侧调用只保留在 OpenAI 内核中。 -- **Tool Calling 双运行时对齐**:Go 侧(`internal/toolcall`)与 Vercel Node 侧(`internal/js/helpers/stream-tool-sieve`)保持一致的解析/防泄漏语义,覆盖 JSON / XML / invoke / text-kv 多风格输入。 +- **Tool Calling 双运行时对齐**:Go 侧(`internal/toolcall`)与 Vercel Node 侧(`internal/js/helpers/stream-tool-sieve`)保持一致的解析/防泄漏语义;当前以 XML/Markup 家族为主(`` / `` / `` / `tool_use` / antml 变体)。 - **配置与运行时设置解耦**:静态配置(`config`)与运行时策略(`settings`)通过 Admin API 分离管理,支持热更新和密码轮换失效旧 JWT。 - **流式能力升级**:`/v1/responses` 与 `/v1/chat/completions` 共享更一致的工具调用增量输出策略,降低不同 SDK 下的行为差异。 - **可观测与可运维增强**:`/healthz`、`/readyz`、`/admin/version`、`/admin/dev/captures` 形成排障闭环,便于发布后验证。 @@ -155,7 +155,7 @@ flowchart LR - `ANTHROPIC_BASE_URL` 推荐直接指向 DS2API 根地址(例如 `http://127.0.0.1:5001`),Claude Code 会请求 `/v1/messages?beta=true`。 - `ANTHROPIC_API_KEY` 需要与 `config.json` 中 `keys` 一致;建议同时保留常规 key 与 `sk-ant-*` 形态 key,兼容不同客户端校验习惯。 - 若系统设置了代理,建议对 DS2API 地址配置 `NO_PROXY=127.0.0.1,localhost,<你的主机IP>`,避免本地回环请求被代理拦截。 -- 如遇“工具调用输出成文本、未执行”问题,请升级到包含 Claude 工具调用多格式解析(JSON/XML/ANTML/invoke)的版本。 +- 如遇“工具调用输出成文本、未执行”问题,请优先检查模型输出是否为受支持的 XML/Markup 工具块(例如 `` / `` / `` / `tool_use`),而不是纯 JSON `tool_calls` 片段。 ### Gemini 接口 @@ -398,7 +398,7 @@ Gemini 路由还可以使用 `x-goog-api-key`,或在没有认证头时使用 ` 当请求中带 `tools` 时,DS2API 会做防泄漏处理与结构化转译: 1. 只在**非代码块上下文**启用执行型 toolcall 识别(代码块示例默认不触发) -2. 解析层以 XML/Markup 为最高优先级,同时兼容 JSON / ANTML / invoke / text-kv,并统一归一到内部工具调用结构 +2. 解析层当前以 XML/Markup 家族为准(`` / `` / `` / `tool_use` / antml 变体);纯 JSON `tool_calls` 片段默认不作为可执行调用解析 3. 
`responses` 流式严格使用官方 item 生命周期事件(`response.output_item.*`、`response.content_part.*`、`response.function_call_arguments.*`) 4. `responses` 支持并执行 `tool_choice`(`auto`/`none`/`required`/强制函数);`required` 违规时非流式返回 `422`,流式返回 `response.failed` 5. 客户端请求哪种协议,就按该协议返回工具调用(OpenAI/Claude/Gemini 各自原生结构);模型侧优先约束输出规范 XML,再由兼容层转译 diff --git a/README.en.md b/README.en.md index 23c2a94..653ba6d 100644 --- a/README.en.md +++ b/README.en.md @@ -85,7 +85,7 @@ For the full module-by-module architecture and directory responsibilities, see [ - **Unified routing core**: all protocol entries are now centralized through `internal/server/router.go`, with OpenAI / Claude / Gemini / Admin / WebUI routes registered in one tree to avoid multi-entry drift. - **Unified execution chain**: Claude/Gemini entries are translated by `internal/translatorcliproxy`, then executed through `openai.ChatCompletions` for shared tool-calling and stream semantics, then translated back to the client protocol. - **Cleaner adapter boundaries**: `internal/adapter/{claude,gemini}` handles protocol wrappers, while `internal/adapter/openai` remains the execution core; upstream DeepSeek calls are retained only in the OpenAI core. -- **Tool-calling parity across runtimes**: Go (`internal/toolcall`) and Vercel Node (`internal/js/helpers/stream-tool-sieve`) follow aligned parsing/anti-leak semantics across JSON / XML / invoke / text-kv inputs. +- **Tool-calling parity across runtimes**: Go (`internal/toolcall`) and Vercel Node (`internal/js/helpers/stream-tool-sieve`) share aligned parsing/anti-leak semantics, now centered on XML/Markup-family payloads (`` / `` / `` / `tool_use` / antml variants). - **Config/runtime separation**: static config (`config`) and runtime policy (`settings`) are managed independently via Admin APIs, enabling hot updates and password rotation with JWT invalidation. 
- **Streaming behavior upgrade**: `/v1/responses` and `/v1/chat/completions` now share a more consistent incremental tool-call emission strategy across SDK ecosystems. - **Improved operability**: `/healthz`, `/readyz`, `/admin/version`, and `/admin/dev/captures` form a tighter post-deploy diagnostics loop. @@ -153,7 +153,7 @@ Besides the current primary aliases above, `/anthropic/v1/models` also returns C - Set `ANTHROPIC_BASE_URL` to the DS2API root URL (for example `http://127.0.0.1:5001`). Claude Code sends requests to `/v1/messages?beta=true`. - `ANTHROPIC_API_KEY` must match an entry in `keys` from `config.json`. Keeping both a regular key and an `sk-ant-*` style key improves client compatibility. - If your environment has proxy variables, set `NO_PROXY=127.0.0.1,localhost,` for DS2API to avoid proxy interception of local traffic. -- If tool calls are rendered as plain text and not executed, upgrade to a build that includes multi-format Claude tool-call parsing (JSON/XML/ANTML/invoke). +- If tool calls are rendered as plain text and not executed, first verify the model output uses supported XML/Markup tool blocks (`` / `` / `` / `tool_use`) rather than standalone JSON `tool_calls`. ### Gemini Endpoint @@ -396,7 +396,7 @@ Queue limit = DS2API_ACCOUNT_MAX_QUEUE (default = recommended concurrency) When `tools` is present in the request, DS2API performs anti-leak handling: 1. Toolcall feature matching is enabled only in **non-code-block context** (fenced examples are ignored) -2. The parser prioritizes XML/Markup, while also accepting JSON / ANTML / invoke / text-kv, and normalizes everything into the internal tool-call structure +2. The parser currently targets XML/Markup-family tool syntax (`` / `` / `` / `tool_use` / antml variants); standalone JSON `tool_calls` payloads are not treated as executable calls by default 3. 
`responses` streaming strictly uses official item lifecycle events (`response.output_item.*`, `response.content_part.*`, `response.function_call_arguments.*`) 4. `responses` supports and enforces `tool_choice` (`auto`/`none`/`required`/forced function); `required` violations return `422` for non-stream and `response.failed` for stream 5. The output protocol follows the client request (OpenAI / Claude / Gemini native shapes); model-side prompting can prefer XML, and the compatibility layer handles the protocol-specific translation diff --git a/docs/ARCHITECTURE.en.md b/docs/ARCHITECTURE.en.md index 5235183..81bb928 100644 --- a/docs/ARCHITECTURE.en.md +++ b/docs/ARCHITECTURE.en.md @@ -116,7 +116,7 @@ flowchart LR - `internal/translatorcliproxy`: structure translation between Claude/Gemini and OpenAI. - `internal/deepseek`: upstream request/session/PoW/SSE handling. - `internal/stream` + `internal/sse`: stream parsing and incremental assembly. -- `internal/toolcall`: JSON/XML/invoke/text-kv tool-call parsing + anti-leak sieve. +- `internal/toolcall`: XML/Markup-family tool-call parsing + anti-leak sieve (`` / `` / `` / `tool_use` / antml variants). - `internal/admin`: config/accounts/vercel sync/version/dev-capture endpoints. - `internal/config`: config loading/validation + runtime settings hot-reload. - `internal/account`: managed account pool, inflight slots, waiting queue. 
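Reviewer note on the fenced-example rule stated in the API and README hunks above (tool-call matching runs only in non-code-block context): the following is a minimal sketch of one way such a check could look, not DS2API's actual implementation. The names `insideFence` and `fenceMarker` are hypothetical and do not come from `internal/toolcall`, and the sketch assumes a fence is any line whose trimmed text starts with three backticks.

```go
package main

import (
	"fmt"
	"strings"
)

// fenceMarker is the three-backtick Markdown fence, built with Repeat so
// the literal sequence does not appear inside this example. Hypothetical name.
var fenceMarker = strings.Repeat("`", 3)

// insideFence reports whether byte offset pos in text lies inside an
// unclosed fenced code block. It toggles a flag for every line before pos
// whose trimmed content starts with the fence marker.
func insideFence(text string, pos int) bool {
	if pos < 0 {
		pos = 0
	}
	if pos > len(text) {
		pos = len(text)
	}
	open := false
	for _, line := range strings.Split(text[:pos], "\n") {
		if strings.HasPrefix(strings.TrimSpace(line), fenceMarker) {
			open = !open
		}
	}
	return open
}

func main() {
	// One tool-like block inside a fence (an example) and one outside it.
	doc := "intro\n" + fenceMarker + "json\n<tool_use/>\n" + fenceMarker + "\nafter <tool_use/>"
	first := strings.Index(doc, "<tool_use/>")
	last := strings.LastIndex(doc, "<tool_use/>")
	fmt.Println(insideFence(doc, first), insideFence(doc, last))
}
```

A stream-time sieve would have to keep this state incrementally across chunks instead of rescanning the whole text; the rescan is used here only to keep the sketch short.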
diff --git a/docs/ARCHITECTURE.md b/docs/ARCHITECTURE.md index 619c2e4..b439127 100644 --- a/docs/ARCHITECTURE.md +++ b/docs/ARCHITECTURE.md @@ -116,7 +116,7 @@ flowchart LR - `internal/translatorcliproxy`:Claude/Gemini 与 OpenAI 结构互转。 - `internal/deepseek`:上游请求、会话、PoW、SSE 消费。 - `internal/stream` + `internal/sse`:流式解析与增量处理。 -- `internal/toolcall`:JSON/XML/invoke/text-kv 工具调用解析及防泄漏筛分。 +- `internal/toolcall`:以 XML/Markup 家族为核心的工具调用解析与防泄漏筛分(`` / `` / `` / `tool_use` / antml 变体)。 - `internal/admin`:配置管理、账号管理、Vercel 同步、版本检查、开发抓包。 - `internal/config`:配置加载、校验、运行时 settings 热更新。 - `internal/account`:托管账号池、并发槽位、等待队列。 diff --git a/docs/toolcall-semantics.md b/docs/toolcall-semantics.md index 0c6b99d..889e3ca 100644 --- a/docs/toolcall-semantics.md +++ b/docs/toolcall-semantics.md @@ -1,74 +1,74 @@ # Tool call parsing semantics(Go/Node 统一语义) -本文档描述当前代码中 `ParseToolCallsDetailed` / `parseToolCallsDetailed` 的**实际行为**,用于对齐 Go 与 Node Runtime。 +本文档描述当前代码中工具调用解析链路的**实际行为**(以 `internal/toolcall` 与 `internal/js/helpers/stream-tool-sieve` 为准)。 文档导航:[总览](../README.MD) / [架构说明](./ARCHITECTURE.md) / [测试指南](./TESTING.md) -## 1) 输出结构(当前实现) +## 1) 当前输出结构 -- `calls`:解析得到的工具调用列表(`name` + `input`)。 -- `sawToolCallSyntax`:检测到工具调用语法特征时为 `true`(例如 `tool_calls`、``、``、``、`function.name:`)。 -- `rejectedByPolicy`:当前实现固定为 `false`(预留字段,尚未启用 allow-list 拒绝)。 +`ParseToolCallsDetailed` / `parseToolCallsDetailed` 返回: + +- `calls`:解析出的工具调用列表(`name` + `input`)。 +- `sawToolCallSyntax`:检测到工具调用语法特征时为 `true`。 +- `rejectedByPolicy`:当前实现固定为 `false`(预留字段)。 - `rejectedToolNames`:当前实现固定为空数组(预留字段)。 -> 说明:`filterToolCallsDetailed` 当前仅做结构清洗,不做工具名策略拒绝。 +> 当前 `filterToolCallsDetailed` 仅做结构清洗,不做 allow-list 工具名硬拒绝。 -## 2) 解析管线 +## 2) 解析范围(重点) -1. **示例保护**:若判定为 fenced code block 示例上下文,则跳过执行型解析。 -2. **候选片段构建**:从完整文本中构建候选(原文、围绕 `tool_calls` 的 JSON 片段、首尾大括号切片等)。 -3. 
**按序尝试解析(命中即停)**: - - 对“明显 JSON 工具载荷候选”(以 `{`/`[` 开头且包含 `tool_calls`/`\"function\"`)先走 JSON 解析,避免 JSON 字符串内偶发 XML 片段误命中; - - 其余候选优先 XML 解析(`` / `` / `` / `tool_use` / `antml:function_call` 等); - - JSON 解析(`{"tool_calls": [...]}`、列表、单对象); - - Markup 解析; - - Text-KV 回退(如 `function.name:` + `function.arguments:`)。 -4. **兜底**:候选全部失败后,再对全文做 XML / Text-KV 回退。 +当前版本的可执行解析以 **XML/Markup 家族**为主: -## 3) XML 能力边界(当前) +- `...` +- `...` +- `...`(含自闭合) +- `...` +- antml 变体(如 `antml:function_call` / `antml:argument`) -当前已支持输入端的“多 XML/标记风格”解析,包括但不限于: +并支持在这些标记块内部解析: -- `......` -- `tool...` -- `...` -- `antml:function_call` / `antml:argument` / `antml:parameters` -- `tool_use` 家族标签 +- JSON 参数字符串 +- 标签参数(`...`) +- key/value 风格子标签 -但**输出端仍统一转换为 OpenAI 兼容 JSON 事件/对象**(`message.tool_calls`、`delta.tool_calls`、`response.function_call_arguments.*`)。 +## 3) 不应再假设的行为 -## 4) 关于“是否可以封装成 XML 再喂给模型” +以下说法在当前实现中已不成立: -结论:**可以做,而且当前解析器已经能兼容 XML 作为输入格式之一**,但代码里并没有 `toolcall.prefer_xml_output` 这个开关。现有可调配置只有: +1. “纯 JSON `tool_calls` 片段会被直接当作可执行工具调用解析”。 +2. “存在 `toolcall.mode` / `toolcall.early_emit_confidence` 等可配置开关可以改变解析策略”。 -- `toolcall.mode`:`feature_match` / `off` -- `toolcall.early_emit_confidence`:`high` / `low` / `off` +当前策略在代码中固定为: -推荐思路仍然是“输入兼容层 + 输出按客户端协议渲染”: +- 特征匹配开启(feature-match on) +- 高置信度早发开启(early emit on) +- policy 拒绝字段保留但未启用 -1. **Prompt 约束层**:如果你要尝试 XML-first,可以在系统提示词里约束模型输出规范 XML tool block(例如 `...`)。 -2. **解析兼容层**:继续在 parser 中同时接受 JSON / XML / ANTML / invoke / text-kv。 -3. **协议归一层**:无论模型输出什么格式,统一落到内部 `ParsedToolCall`。 -4. 
**对外渲染层**:根据客户端请求协议渲染(OpenAI / Claude / Gemini 各自格式)。 +## 4) 流式与防泄漏语义 -这样可以同时获得: +在流式链路中(OpenAI / Claude / Gemini 统一内核): -- 减少模型端 JSON 转义/引号错误; -- 不破坏现有 SDK / 客户端生态; -- 逐步灰度(按模型、按租户、按请求开关)。 +- 工具调用片段会被优先提取为结构化增量输出; +- 已识别的工具调用原始片段不会作为普通文本再次回流; +- fenced code block 中的示例内容按文本处理,不作为可执行工具调用。 -## 5) 落地建议(低风险迭代) +## 5) 落地建议(按当前实现) -- 继续使用现有的 `toolcall.mode=feature_match` 和 `toolcall.early_emit_confidence=high` 作为默认策略。 -- 如果要试 XML-first,把它放在 prompt 层或上游模板层,不要假设代码里已有专门的 XML 输出开关。 -- 增加观测指标: - - `toolcall_parse_source`(json/xml/markup/textkv); - - `toolcall_parse_success_rate`; - - `toolcall_malformed_rate`; - - `toolcall_repair_rate`。 -- 先在 `responses` 链路灰度,再扩展 `chat.completions`。 +1. Prompt 里优先约束模型输出 XML/Markup 工具块。 +2. 执行器侧继续做工具名白名单与参数 schema 校验(不要依赖 parser 代替安全策略)。 +3. 需要兼容历史“纯 JSON tool_calls”模型输出时,请在上游模板层把输出规范化为 XML/Markup 风格再进入 DS2API。 -## 6) 兼容性提醒 +## 6) 回归验证建议 -- 上游模型若输出混合文本 + XML,仍可能出现“半结构化”噪声,需要依赖现有 sieve 增量消费策略。 -- XML 不等于安全:仍需做 tool 名、参数 schema、执行权限的服务端校验。 +可直接运行: + +```bash +go test -v -run 'TestParseToolCalls|TestRepair' ./internal/toolcall/ +node --test tests/node/stream-tool-sieve.test.js +``` + +重点覆盖: + +- `` / `` / `` / `tool_use` / antml 变体 +- 参数 JSON 修复与解析 +- 流式增量下的工具调用提取与文本防泄漏 From 2c08375b4943fcb7bbca1ffa3a892e022f94089c Mon Sep 17 00:00:00 2001 From: "CJACK." <155826701+CJackHwang@users.noreply.github.com> Date: Sun, 19 Apr 2026 23:42:34 +0800 Subject: [PATCH 3/3] docs: refresh model alias examples to current defaults --- API.en.md | 9 ++++++++- API.md | 9 ++++++++- README.MD | 8 ++++++-- README.en.md | 8 ++++++-- 4 files changed, 28 insertions(+), 6 deletions(-) diff --git a/API.en.md b/API.en.md index 2150ded..1d6fe6c 100644 --- a/API.en.md +++ b/API.en.md @@ -215,6 +215,13 @@ For `chat` / `responses` / `embeddings`, DS2API follows a wide-input/strict-outp 3. If still unmatched, fall back by known family heuristics (`o*`, `gpt-*`, `claude-*`, etc.). 4. If still unmatched, return `invalid_request_error`. 
+Current built-in default aliases (excerpt): + +- OpenAI: `gpt-4o`, `gpt-4.1`, `gpt-4.1-mini`, `gpt-4.1-nano`, `gpt-5`, `gpt-5-mini`, `gpt-5-codex` +- OpenAI reasoning: `o1`, `o1-mini`, `o3`, `o3-mini` +- Claude: `claude-sonnet-4-5`, `claude-haiku-4-5`, `claude-opus-4-6` (plus compatibility aliases `claude-3-5-sonnet` / `claude-3-5-haiku` / `claude-3-opus`) +- Gemini: `gemini-2.5-pro`, `gemini-2.5-flash` + ### `POST /v1/chat/completions` **Headers**: @@ -228,7 +235,7 @@ Content-Type: application/json | Field | Type | Required | Notes | | --- | --- | --- | --- | -| `model` | string | ✅ | DeepSeek native models + common aliases (`gpt-4o`, `gpt-5-codex`, `o3`, `claude-sonnet-4-5`, `gemini-2.5-pro`, etc.) | +| `model` | string | ✅ | DeepSeek native models + common aliases (`gpt-5`, `gpt-5-mini`, `gpt-5-codex`, `o3`, `claude-opus-4-6`, `gemini-2.5-pro`, `gemini-2.5-flash`, etc.) | | `messages` | array | ✅ | OpenAI-style messages | | `stream` | boolean | ❌ | Default `false` | | `tools` | array | ❌ | Function calling schema | diff --git a/API.md b/API.md index bb06170..1f9bcf5 100644 --- a/API.md +++ b/API.md @@ -215,6 +215,13 @@ Gemini 兼容客户端还可以使用 `x-goog-api-key`、`?key=` 或 `?api_key=` 3. 未命中时按模型家族规则回退(如 `o*`、`gpt-*`、`claude-*`)。 4. 
仍未命中则返回 `invalid_request_error`。 +当前内置默认 alias(节选): + +- OpenAI:`gpt-4o`、`gpt-4.1`、`gpt-4.1-mini`、`gpt-4.1-nano`、`gpt-5`、`gpt-5-mini`、`gpt-5-codex` +- OpenAI Reasoning:`o1`、`o1-mini`、`o3`、`o3-mini` +- Claude:`claude-sonnet-4-5`、`claude-haiku-4-5`、`claude-opus-4-6`(及 `claude-3-5-sonnet` / `claude-3-5-haiku` / `claude-3-opus` 兼容别名) +- Gemini:`gemini-2.5-pro`、`gemini-2.5-flash` + ### `POST /v1/chat/completions` **请求头**: @@ -228,7 +235,7 @@ Content-Type: application/json | 字段 | 类型 | 必填 | 说明 | | --- | --- | --- | --- | -| `model` | string | ✅ | 支持 DeepSeek 原生模型 + 常见 alias(如 `gpt-4o`、`gpt-5-codex`、`o3`、`claude-sonnet-4-5`、`gemini-2.5-pro` 等) | +| `model` | string | ✅ | 支持 DeepSeek 原生模型 + 常见 alias(如 `gpt-5`、`gpt-5-mini`、`gpt-5-codex`、`o3`、`claude-opus-4-6`、`gemini-2.5-pro`、`gemini-2.5-flash` 等) | | `messages` | array | ✅ | OpenAI 风格消息数组 | | `stream` | boolean | ❌ | 默认 `false` | | `tools` | array | ❌ | Function Calling 定义 | diff --git a/README.MD b/README.MD index 105b979..5e6823a 100644 --- a/README.MD +++ b/README.MD @@ -137,7 +137,7 @@ flowchart LR | vision | `deepseek-vision-chat-search` | ❌ | ✅ | | vision | `deepseek-vision-reasoner-search` | ✅ | ✅ | -除原生模型外,也支持常见 alias 输入(如 `gpt-4o`、`gpt-5-codex`、`o3`、`claude-sonnet-4-5`、`gemini-2.5-pro` 等),但 `/v1/models` 返回的是规范化后的 DeepSeek 原生模型 ID。 +除原生模型外,也支持常见 alias 输入(如 `gpt-5`、`gpt-5-mini`、`gpt-5-codex`、`gpt-4.1`、`o3`、`claude-opus-4-6`、`claude-sonnet-4-5`、`gemini-2.5-pro`、`gemini-2.5-flash` 等),但 `/v1/models` 返回的是规范化后的 DeepSeek 原生模型 ID。 ### Claude 接口(`GET /anthropic/v1/models`) @@ -293,8 +293,12 @@ go run ./cmd/ds2api ], "model_aliases": { "gpt-4o": "deepseek-chat", + "gpt-5": "deepseek-chat", + "gpt-5-mini": "deepseek-chat", "gpt-5-codex": "deepseek-reasoner", - "o3": "deepseek-reasoner" + "o3": "deepseek-reasoner", + "claude-opus-4-6": "deepseek-reasoner", + "gemini-2.5-flash": "deepseek-chat" }, "compat": { "wide_input_strict_output": true, diff --git a/README.en.md b/README.en.md index 653ba6d..e10c02d 100644 --- 
a/README.en.md +++ b/README.en.md @@ -135,7 +135,7 @@ For the full module-by-module architecture and directory responsibilities, see [ | vision | `deepseek-vision-chat-search` | ❌ | ✅ | | vision | `deepseek-vision-reasoner-search` | ✅ | ✅ | -Besides native IDs, DS2API also accepts common aliases as input (for example `gpt-4o`, `gpt-5-codex`, `o3`, `claude-sonnet-4-5`, `gemini-2.5-pro`), but `/v1/models` returns normalized DeepSeek native model IDs. +Besides native IDs, DS2API also accepts common aliases as input (for example `gpt-5`, `gpt-5-mini`, `gpt-5-codex`, `gpt-4.1`, `o3`, `claude-opus-4-6`, `claude-sonnet-4-5`, `gemini-2.5-pro`, `gemini-2.5-flash`), but `/v1/models` returns normalized DeepSeek native model IDs. ### Claude Endpoint (`GET /anthropic/v1/models`) @@ -291,8 +291,12 @@ The server actually binds to `0.0.0.0:5001`, so devices on the same LAN can usua ], "model_aliases": { "gpt-4o": "deepseek-chat", + "gpt-5": "deepseek-chat", + "gpt-5-mini": "deepseek-chat", "gpt-5-codex": "deepseek-reasoner", - "o3": "deepseek-reasoner" + "o3": "deepseek-reasoner", + "claude-opus-4-6": "deepseek-reasoner", + "gemini-2.5-flash": "deepseek-chat" }, "compat": { "wide_input_strict_output": true,