Merge pull request #268 from CJackHwang/dev

chore: bump version to 3.5.0
2026-05-03 16:05:26 +08:00 · 2026-04-20 01:26:09 +08:00 · 2026-04-20 01:24:31 +08:00 · 2026-04-20 01:20:11 +08:00 · 2026-04-20 00:47:05 +08:00 · 2026-04-20 00:13:14 +08:00
126 changed files with 4692 additions and 4561 deletions
--- a/API.en.md
+++ b/API.en.md
@@ -37,7 +37,7 @@ Docs: [Overview](README.en.md) / [Architecture](docs/ARCHITECTURE.en.md) / [Depl

 - OpenAI / Claude / Gemini protocols are now mounted on one shared `chi` router tree assembled in `internal/server/router.go`.
 - Adapter responsibilities are streamlined to: **request normalization → DeepSeek invocation → protocol-shaped rendering**, reducing legacy split-logic paths.
-- Tool-calling semantics are aligned between Go and Node runtime: structured parsing first (JSON/XML/invoke/markup), plus stream-time anti-leak filtering.
+- Tool-calling semantics are aligned between Go and Node runtime: parsing is now centered on XML/Markup-family tool syntax (`<tool_call>` / `<function_call>` / `<invoke>` / `tool_use` / antml variants), plus stream-time anti-leak filtering.
 - `Admin API` separates static config from runtime policy: `/admin/config*` for configuration state, `/admin/settings*` for runtime behavior.

 ---
@@ -108,6 +108,7 @@ Gemini-compatible clients can also send `x-goog-api-key`, `?key=`, or `?api_key=
 | POST | `/v1/responses` | Business | OpenAI Responses API (stream/non-stream) |
 | GET | `/v1/responses/{response_id}` | Business | Query stored response (in-memory TTL) |
 | POST | `/v1/embeddings` | Business | OpenAI Embeddings API |
+| POST | `/v1/files` | Business | OpenAI Files upload (multipart/form-data) |
 | GET | `/anthropic/v1/models` | None | Claude model list |
 | POST | `/anthropic/v1/messages` | Business | Claude messages |
 | POST | `/anthropic/v1/messages/count_tokens` | Business | Claude token counting |
@@ -131,9 +132,15 @@ Gemini-compatible clients can also send `x-goog-api-key`, `?key=`, or `?api_key=
 | GET | `/admin/config/export` | Admin | Export full config (`config`/`json`/`base64`) |
 | POST | `/admin/keys` | Admin | Add API key |
 | DELETE | `/admin/keys/{key}` | Admin | Delete API key |
+| GET | `/admin/proxies` | Admin | List proxies |
+| POST | `/admin/proxies` | Admin | Add proxy |
+| PUT | `/admin/proxies/{proxyID}` | Admin | Update proxy (empty password keeps old secret) |
+| DELETE | `/admin/proxies/{proxyID}` | Admin | Delete proxy (auto-unbind referenced accounts) |
+| POST | `/admin/proxies/test` | Admin | Test proxy connectivity |
 | GET | `/admin/accounts` | Admin | Paginated account list |
 | POST | `/admin/accounts` | Admin | Add account |
 | DELETE | `/admin/accounts/{identifier}` | Admin | Delete account |
+| PUT | `/admin/accounts/{identifier}/proxy` | Admin | Bind/unbind proxy for an account |
 | GET | `/admin/queue/status` | Admin | Account queue status |
 | POST | `/admin/accounts/test` | Admin | Test one account |
 | POST | `/admin/accounts/test-all` | Admin | Test all accounts |
@@ -173,7 +180,7 @@ Gemini-compatible clients can also send `x-goog-api-key`, `?key=`, or `?api_key=

 ### `GET /v1/models`

-No auth required. Returns supported models.
+No auth required. Returns the currently supported DeepSeek native model list.

 **Response**:

@@ -184,11 +191,21 @@ No auth required. Returns supported models.
    {"id": "deepseek-chat", "object": "model", "created": 1677610602, "owned_by": "deepseek", "permission": []},
    {"id": "deepseek-reasoner", "object": "model", "created": 1677610602, "owned_by": "deepseek", "permission": []},
    {"id": "deepseek-chat-search", "object": "model", "created": 1677610602, "owned_by": "deepseek", "permission": []},
-    {"id": "deepseek-reasoner-search", "object": "model", "created": 1677610602, "owned_by": "deepseek", "permission": []}
+    {"id": "deepseek-reasoner-search", "object": "model", "created": 1677610602, "owned_by": "deepseek", "permission": []},
+    {"id": "deepseek-expert-chat", "object": "model", "created": 1677610602, "owned_by": "deepseek", "permission": []},
+    {"id": "deepseek-expert-reasoner", "object": "model", "created": 1677610602, "owned_by": "deepseek", "permission": []},
+    {"id": "deepseek-expert-chat-search", "object": "model", "created": 1677610602, "owned_by": "deepseek", "permission": []},
+    {"id": "deepseek-expert-reasoner-search", "object": "model", "created": 1677610602, "owned_by": "deepseek", "permission": []},
+    {"id": "deepseek-vision-chat", "object": "model", "created": 1677610602, "owned_by": "deepseek", "permission": []},
+    {"id": "deepseek-vision-reasoner", "object": "model", "created": 1677610602, "owned_by": "deepseek", "permission": []},
+    {"id": "deepseek-vision-chat-search", "object": "model", "created": 1677610602, "owned_by": "deepseek", "permission": []},
+    {"id": "deepseek-vision-reasoner-search", "object": "model", "created": 1677610602, "owned_by": "deepseek", "permission": []}
  ]
 }
 ```

+> Note: `/v1/models` returns normalized DeepSeek native model IDs. Common aliases are accepted only as request input and are not expanded as separate items in this endpoint.
+
 ### Model Alias Resolution

 For `chat` / `responses` / `embeddings`, DS2API follows a wide-input/strict-output policy:
@@ -198,6 +215,13 @@ For `chat` / `responses` / `embeddings`, DS2API follows a wide-input/strict-outp
 3. If still unmatched, fall back by known family heuristics (`o*`, `gpt-*`, `claude-*`, etc.).
 4. If still unmatched, return `invalid_request_error`.

+Current built-in default aliases (excerpt):
+
+- OpenAI: `gpt-4o`, `gpt-4.1`, `gpt-4.1-mini`, `gpt-4.1-nano`, `gpt-5`, `gpt-5-mini`, `gpt-5-codex`
+- OpenAI reasoning: `o1`, `o1-mini`, `o3`, `o3-mini`
+- Claude: `claude-sonnet-4-5`, `claude-haiku-4-5`, `claude-opus-4-6` (plus compatibility aliases `claude-3-5-sonnet` / `claude-3-5-haiku` / `claude-3-opus`)
+- Gemini: `gemini-2.5-pro`, `gemini-2.5-flash`
+
 ### `POST /v1/chat/completions`

 **Headers**:
@@ -211,7 +235,7 @@ Content-Type: application/json

 | Field | Type | Required | Notes |
 | --- | --- | --- | --- |
-| `model` | string | ✅ | DeepSeek native models + common aliases (`gpt-4o`, `gpt-5-codex`, `o3`, `claude-sonnet-4-5`, etc.) |
+| `model` | string | ✅ | DeepSeek native models + common aliases (`gpt-5`, `gpt-5-mini`, `gpt-5-codex`, `o3`, `claude-opus-4-6`, `gemini-2.5-pro`, `gemini-2.5-flash`, etc.) |
 | `messages` | array | ✅ | OpenAI-style messages |
 | `stream` | boolean | ❌ | Default `false` |
 | `tools` | array | ❌ | Function calling schema |
@@ -302,7 +326,12 @@ When `tools` is present, DS2API performs anti-leak handling:
 }
 ```

-**Stream**: Once high-confidence toolcall features are matched, DS2API emits `delta.tool_calls` immediately (without waiting for full JSON closure), then keeps sending argument deltas; confirmed raw tool JSON is never forwarded as `delta.content`.
+**Stream**: Once high-confidence toolcall features are matched, DS2API emits `delta.tool_calls` immediately (without waiting for full argument closure), then keeps sending argument deltas; confirmed tool-call fragments are not forwarded as `delta.content`.
+
+Additional notes:
+
+- The parser currently follows XML/Markup-family tool payloads (`<tool_call>`, `<function_call>`, `<invoke>`, `tool_use`, antml variants). Standalone JSON `tool_calls` payloads are not treated as executable tool calls by default.
+- `tool_calls` shown inside fenced markdown code blocks (for example, ```json ... ```) are treated as examples, not executable calls.
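
The two notes above can be illustrated with a minimal sketch. This is not DS2API's parser (that lives in `internal/toolcall`); the function name `find_tool_blocks`, the regular expressions, and the restriction to paired `<tool_call>` / `<function_call>` / `<invoke>` tags are assumptions made for illustration. It drops fenced code blocks first and then looks for markup blocks, so a fenced example or a standalone JSON `tool_calls` payload yields no executable call.

```python
import re

# Hypothetical tag set for illustration; the real parser also handles
# `tool_use` and antml variants, which this sketch omits.
TOOL_TAGS = ("tool_call", "function_call", "invoke")

# Matches a whole fenced code block, including its content.
FENCE_RE = re.compile(r"```.*?```", re.DOTALL)

# Matches <tag ...>body</tag> for any tag in TOOL_TAGS.
BLOCK_RE = re.compile(
    r"<(?P<tag>" + "|".join(TOOL_TAGS) + r")\b[^>]*>(?P<body>.*?)</(?P=tag)>",
    re.DOTALL,
)


def find_tool_blocks(text: str) -> list[tuple[str, str]]:
    """Return (tag, body) pairs for markup tool blocks outside code fences."""
    # Remove fenced code blocks so examples inside them are never matched.
    unfenced = FENCE_RE.sub("", text)
    return [
        (m.group("tag"), m.group("body").strip())
        for m in BLOCK_RE.finditer(unfenced)
    ]
```

A streaming implementation would have to make the same decision incrementally, on partial input, which this whole-string sketch does not attempt.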

 ---

@@ -381,6 +410,21 @@ Business auth required. Returns OpenAI-compatible embeddings shape.

 > Requires `embeddings.provider`. Current supported values: `mock` / `deterministic` / `builtin`. If missing/unsupported, returns standard error shape with HTTP 501.

+### `POST /v1/files`
+
+Business auth required. OpenAI Files-compatible upload endpoint; currently only `multipart/form-data` is supported.
+
+| Field | Type | Required | Notes |
+| --- | --- | --- | --- |
+| `file` | file | ✅ | Binary payload |
+| `purpose` | string | ❌ | Forwarded purpose field |
+
+Constraints and behavior:
+
+- `Content-Type` must be `multipart/form-data` (otherwise `400`).
+- Total request size limit is `100 MiB` (over-limit returns `413`).
+- Success returns an OpenAI `file` object (`id/object/bytes/filename/purpose/status`, etc.) and includes `account_id` for source-account tracing.
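
A client-side sketch of the upload, using only the Python standard library. The helper name `build_upload_request`, the base URL, the `Bearer` authorization scheme, and the `application/octet-stream` part type are assumptions for illustration; only the path, the `file` / `purpose` fields, the `multipart/form-data` requirement, and the 100 MiB limit come from the text above. The sketch builds the request without sending it and rejects oversized bodies before they reach the server.

```python
import io
import urllib.request
import uuid

MAX_REQUEST_BYTES = 100 * 1024 * 1024  # documented total request size limit


def build_upload_request(base_url, api_key, filename, data, purpose=None):
    """Build (but do not send) a multipart/form-data request for /v1/files."""
    boundary = uuid.uuid4().hex
    body = io.BytesIO()

    def add_part(headers, payload):
        # One multipart part: boundary line, headers, blank line, payload.
        body.write(f"--{boundary}\r\n".encode())
        body.write(headers.encode())
        body.write(b"\r\n\r\n")
        body.write(payload)
        body.write(b"\r\n")

    if purpose is not None:
        add_part('Content-Disposition: form-data; name="purpose"', purpose.encode())
    add_part(
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream",
        data,
    )
    body.write(f"--{boundary}--\r\n".encode())  # closing boundary

    payload = body.getvalue()
    if len(payload) > MAX_REQUEST_BYTES:
        # The server is documented to answer 413; fail early on the client.
        raise ValueError("request body exceeds 100 MiB")

    return urllib.request.Request(
        f"{base_url}/v1/files",
        data=payload,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )
```

Passing the returned object to `urllib.request.urlopen` would send it. The sketch does not escape quotes in `filename`, so it is only suitable for simple file names.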
+
 ---

 ## Claude-Compatible API
@@ -408,7 +452,7 @@ No auth required.
 }
 ```

-> Note: the example is partial; the real response includes historical Claude 1.x/2.x/3.x/4.x IDs and common aliases.
+> Note: the example is partial; besides the current primary aliases, the real response also includes Claude 4.x snapshots plus historical 3.x / 2.x / 1.x IDs and common aliases.

 ### `POST /anthropic/v1/messages`

@@ -713,6 +757,26 @@ Exports full config in three forms: `config`, `json`, and `base64`.

 **Response**: `{"success": true, "total_keys": 2}`

+### `GET /admin/proxies`
+
+Lists proxy configs (password is never returned; use `has_password` as a marker).
+
+### `POST /admin/proxies`
+
+Adds a proxy. Request accepts `id` (optional; auto-generated when omitted), `name`, `type` (`http` / `socks5`), `host`, `port`, `username`, `password`.
+
+### `PUT /admin/proxies/{proxyID}`
+
+Updates a proxy. If `password` is an empty string, the existing secret is preserved.
+
+### `DELETE /admin/proxies/{proxyID}`
+
+Deletes a proxy and automatically clears `proxy_id` on all accounts that reference it.
+
+### `POST /admin/proxies/test`
+
+Tests proxy connectivity: provide `proxy_id` to test a saved proxy; omit it to run a one-off test using proxy fields in the request body.
+
 ### `GET /admin/accounts`

 **Query params**:
@@ -720,7 +784,7 @@ Exports full config in three forms: `config`, `json`, and `base64`.
 | Param | Default | Range |
 | --- | --- | --- |
 | `page` | `1` | ≥ 1 |
-| `page_size` | `10` | 1–100 |
+| `page_size` | `10` | 1–5000 |
 | `q` | empty | Filter by identifier / email / mobile |

 **Response**:
@@ -761,6 +825,14 @@ Returned items also include `test_status`, usually `ok` or `failed`.

 **Response**: `{"success": true, "total_accounts": 5}`

+### `PUT /admin/accounts/{identifier}/proxy`
+
+Updates the proxy binding for a specific account.
+
+- Request body: `{"proxy_id":"..."}`.
+- Send an empty `proxy_id` to unbind the proxy.
+- `identifier` accepts an email, a mobile number, or the synthetic ID of a token-only account.
+
 ### `GET /admin/queue/status`

 ```json
--- a/API.md
+++ b/API.md
@@ -37,7 +37,7 @@

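 - The OpenAI / Claude / Gemini protocols are now mounted on one shared `chi` router tree, assembled in `internal/server/router.go`.
 - Adapter-layer responsibilities are narrowed to: **request normalization → DeepSeek invocation → protocol-shaped rendering**, reducing the forks where earlier versions implemented the same capability in several places.
-- Tool Calling parsing is kept consistent between the Go and Node runtimes: structured parsing first (JSON/XML/invoke/markup), with anti-leak filtering in streaming scenarios.
+- Tool Calling parsing is kept consistent between the Go and Node runtimes: it currently centers on XML/Markup-family parsing (including `<tool_call>` / `<function_call>` / `<invoke>` / `tool_use` / antml variants), with anti-leak filtering in streaming scenarios.
 - `Admin API` separates configuration from runtime policy: `/admin/config*` manages static configuration, `/admin/settings*` manages runtime behavior.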

 ---
@@ -108,6 +108,7 @@ Gemini-compatible clients can also use `x-goog-api-key`, `?key=`, or `?api_key=`
 | POST | `/v1/responses` | Business | OpenAI Responses API (stream/non-stream) |
 | GET | `/v1/responses/{response_id}` | Business | Query a generated response (in-memory TTL) |
 | POST | `/v1/embeddings` | Business | OpenAI Embeddings API |
+| POST | `/v1/files` | Business | OpenAI Files upload (multipart/form-data) |
 | GET | `/anthropic/v1/models` | None | Claude model list |
 | POST | `/anthropic/v1/messages` | Business | Claude messages API |
 | POST | `/anthropic/v1/messages/count_tokens` | Business | Claude token counting |
@@ -131,9 +132,15 @@ Gemini-compatible clients can also use `x-goog-api-key`, `?key=`, or `?api_key=`
 | GET | `/admin/config/export` | Admin | Export full config (includes `config`/`json`/`base64`) |
 | POST | `/admin/keys` | Admin | Add API key |
 | DELETE | `/admin/keys/{key}` | Admin | Delete API key |
+| GET | `/admin/proxies` | Admin | List proxies |
+| POST | `/admin/proxies` | Admin | Add proxy |
+| PUT | `/admin/proxies/{proxyID}` | Admin | Update proxy (an empty password keeps the existing one) |
+| DELETE | `/admin/proxies/{proxyID}` | Admin | Delete proxy (automatically unbinds accounts that reference it) |
+| POST | `/admin/proxies/test` | Admin | Test proxy connectivity |
 | GET | `/admin/accounts` | Admin | Paginated account list |
 | POST | `/admin/accounts` | Admin | Add account |
 | DELETE | `/admin/accounts/{identifier}` | Admin | Delete account |
+| PUT | `/admin/accounts/{identifier}/proxy` | Admin | Bind/unbind a proxy for an account |
 | GET | `/admin/queue/status` | Admin | Account queue status |
 | POST | `/admin/accounts/test` | Admin | Test one account |
 | POST | `/admin/accounts/test-all` | Admin | Test all accounts |
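@@ -173,7 +180,7 @@ Gemini-compatible clients can also use `x-goog-api-key`, `?key=`, or `?api_key=`

 ### `GET /v1/models`

-No auth required. Returns the currently supported model list.
+No auth required. Returns the currently supported DeepSeek native model list.

 **Response example**: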

@@ -184,11 +191,21 @@ Gemini-compatible clients can also use `x-goog-api-key`, `?key=`, or `?api_key=`
    {"id": "deepseek-chat", "object": "model", "created": 1677610602, "owned_by": "deepseek", "permission": []},
    {"id": "deepseek-reasoner", "object": "model", "created": 1677610602, "owned_by": "deepseek", "permission": []},
    {"id": "deepseek-chat-search", "object": "model", "created": 1677610602, "owned_by": "deepseek", "permission": []},
-    {"id": "deepseek-reasoner-search", "object": "model", "created": 1677610602, "owned_by": "deepseek", "permission": []}
+    {"id": "deepseek-reasoner-search", "object": "model", "created": 1677610602, "owned_by": "deepseek", "permission": []},
+    {"id": "deepseek-expert-chat", "object": "model", "created": 1677610602, "owned_by": "deepseek", "permission": []},
+    {"id": "deepseek-expert-reasoner", "object": "model", "created": 1677610602, "owned_by": "deepseek", "permission": []},
+    {"id": "deepseek-expert-chat-search", "object": "model", "created": 1677610602, "owned_by": "deepseek", "permission": []},
+    {"id": "deepseek-expert-reasoner-search", "object": "model", "created": 1677610602, "owned_by": "deepseek", "permission": []},
+    {"id": "deepseek-vision-chat", "object": "model", "created": 1677610602, "owned_by": "deepseek", "permission": []},
+    {"id": "deepseek-vision-reasoner", "object": "model", "created": 1677610602, "owned_by": "deepseek", "permission": []},
+    {"id": "deepseek-vision-chat-search", "object": "model", "created": 1677610602, "owned_by": "deepseek", "permission": []},
+    {"id": "deepseek-vision-reasoner-search", "object": "model", "created": 1677610602, "owned_by": "deepseek", "permission": []}
  ]
 }
 ```

+> Note: `/v1/models` returns normalized DeepSeek native model IDs; common aliases are used only to resolve request input and are not expanded as separate items in this endpoint.
+
 ### Model alias resolution policy

 The `model` field of `chat` / `responses` / `embeddings` follows a "wide input, strict output" policy:
@@ -198,6 +215,13 @@ Gemini-compatible clients can also use `x-goog-api-key`, `?key=`, or `?api_key=`
 3. If unmatched, fall back by model-family rules (such as `o*`, `gpt-*`, `claude-*`).
 4. If still unmatched, return `invalid_request_error`.

+Current built-in default aliases (excerpt):
+
+- OpenAI: `gpt-4o`, `gpt-4.1`, `gpt-4.1-mini`, `gpt-4.1-nano`, `gpt-5`, `gpt-5-mini`, `gpt-5-codex`
+- OpenAI Reasoning: `o1`, `o1-mini`, `o3`, `o3-mini`
+- Claude: `claude-sonnet-4-5`, `claude-haiku-4-5`, `claude-opus-4-6` (plus the compatibility aliases `claude-3-5-sonnet` / `claude-3-5-haiku` / `claude-3-opus`)
+- Gemini: `gemini-2.5-pro`, `gemini-2.5-flash`
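
The resolution order can be sketched as a small lookup chain. Steps 1 and 2 fall outside this hunk, so the sketch assumes they are "exact native ID" and "alias table" lookups. The alias values mirror the sample `model_aliases` block in the README, while the family fallback targets, the function name `resolve_model`, and the exception class are hypothetical.

```python
import re

NATIVE_MODELS = {
    "deepseek-chat",
    "deepseek-reasoner",
    "deepseek-chat-search",
    "deepseek-reasoner-search",
}

# Values taken from the sample `model_aliases` config in the README.
MODEL_ALIASES = {
    "gpt-4o": "deepseek-chat",
    "gpt-5-codex": "deepseek-reasoner",
    "o3": "deepseek-reasoner",
    "claude-opus-4-6": "deepseek-reasoner",
    "gemini-2.5-flash": "deepseek-chat",
}

# Hypothetical family fallbacks; the real targets are not stated here.
FAMILY_FALLBACKS = [
    (re.compile(r"gpt-"), "deepseek-chat"),
    (re.compile(r"claude-"), "deepseek-chat"),
    (re.compile(r"o\d"), "deepseek-reasoner"),
]


class InvalidRequestError(ValueError):
    """Stands in for the `invalid_request_error` response."""


def resolve_model(name: str) -> str:
    """Map a requested model name to a native DeepSeek model ID."""
    if name in NATIVE_MODELS:          # assumed step 1: native ID
        return name
    if name in MODEL_ALIASES:          # assumed step 2: alias table
        return MODEL_ALIASES[name]
    for pattern, target in FAMILY_FALLBACKS:  # step 3: family heuristics
        if pattern.match(name):
            return target
    raise InvalidRequestError(f"unknown model: {name!r}")  # step 4
```

Whatever the input, the function returns either a native ID or an error, which is the "strict output" half of the policy.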
+
 ### `POST /v1/chat/completions`

 **Request headers**:
@@ -211,7 +235,7 @@ Content-Type: application/json

 | Field | Type | Required | Notes |
 | --- | --- | --- | --- |
-| `model` | string | ✅ | DeepSeek native models + common aliases (such as `gpt-4o`, `gpt-5-codex`, `o3`, `claude-sonnet-4-5`) |
+| `model` | string | ✅ | DeepSeek native models + common aliases (such as `gpt-5`, `gpt-5-mini`, `gpt-5-codex`, `o3`, `claude-opus-4-6`, `gemini-2.5-pro`, `gemini-2.5-flash`, etc.) |
 | `messages` | array | ✅ | OpenAI-style message array |
 | `stream` | boolean | ❌ | Default `false` |
 | `tools` | array | ❌ | Function Calling definitions |
@@ -302,12 +326,12 @@ data: [DONE]
 }
 ```

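-**Stream**: Once high-confidence features are matched, `delta.tool_calls` is emitted immediately (without waiting for full JSON closure), and argument deltas keep being sent; confirmed raw toolcall JSON does not flow back into `delta.content`.
+**Stream**: Once high-confidence features are matched, `delta.tool_calls` is emitted immediately (without waiting for the tool arguments to close fully), and argument deltas keep being sent; confirmed tool-call fragments do not flow back into `delta.content`.

 Additional notes:

 - In **non-code-block contexts**, tool payloads are recognized by feature match and produce executable tool calls even when mixed with ordinary text (the surrounding text is still passed through).
-- The parser gives XML/Markup the highest priority and also accepts JSON, ANTML, text-kv, and other input formats; the result is translated into the tool-call structure of the client's protocol (OpenAI/Claude/Gemini).
+- The parser currently follows the XML/Markup family (including `<tool_call>`, `<function_call>`, `<invoke>`, `tool_use`, and antml style); standalone JSON `tool_calls` fragments are not parsed as executable calls by default.
 - `tool_calls` inside a Markdown fenced code block (for example ```json ... ```) are treated as example text only and are not executed.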

 ---
@@ -387,6 +411,21 @@ data: [DONE]

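 > Requires `embeddings.provider` to be configured. Currently supported: `mock` / `deterministic` / `builtin`. If it is missing or unsupported, a standard error structure is returned (HTTP 501).

+### `POST /v1/files`
+
+Business auth required. Compatible with the OpenAI Files upload endpoint; currently only `multipart/form-data` is supported.
+
+| Field | Type | Required | Notes |
+| --- | --- | --- | --- |
+| `file` | file | ✅ | Binary content of the uploaded file |
+| `purpose` | string | ❌ | Passed through to the upstream purpose field |
+
+Constraints and behavior:
+
+- The request must be `multipart/form-data`; otherwise `400` is returned.
+- The total request body size is limited to `100 MiB` (over-limit returns `413`).
+- Success returns an OpenAI `file` object (fields such as `id/object/bytes/filename/purpose/status`), plus `account_id` to help locate the source account.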
+
 ---

 ## Claude-Compatible API
@@ -414,7 +453,7 @@ data: [DONE]
 }
 ```

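-> Note: the example shows only some models; the real response includes historical Claude 1.x/2.x/3.x/4.x model IDs and common aliases.
+> Note: the example shows only some models; besides the current primary aliases, the real response also includes Claude 4.x snapshots plus historical 3.x / 2.x / 1.x model IDs and common aliases.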

 ### `POST /anthropic/v1/messages`

@@ -719,6 +758,26 @@ data: {"type":"message_stop"}

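 **Response**: `{"success": true, "total_keys": 2}`

+### `GET /admin/proxies`
+
+Lists proxy configs (the password is never returned; only a `has_password` marker is).
+
+### `POST /admin/proxies`
+
+Adds a proxy. The request body accepts `id` (optional; auto-generated when omitted), `name`, `type` (`http` / `socks5`), `host`, `port`, `username`, `password`.
+
+### `PUT /admin/proxies/{proxyID}`
+
+Updates the specified proxy. If `password` in the request is an empty string, the existing password is kept.
+
+### `DELETE /admin/proxies/{proxyID}`
+
+Deletes a proxy and automatically clears `proxy_id` on every account that references it.
+
+### `POST /admin/proxies/test`
+
+Tests proxy connectivity: with `proxy_id`, tests a saved proxy; without it, runs a one-off connectivity test using the proxy fields in the request body.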
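
The proxy endpoints share three rules that are easy to get wrong on the client side: an empty password on update keeps the stored secret, deleting a proxy unbinds every account that referenced it, and listings expose `has_password` instead of the password. The in-memory model below illustrates those rules only. The class and method names are hypothetical and it is not the server's storage code.

```python
import uuid
from dataclasses import dataclass, field, replace


@dataclass
class Proxy:
    id: str
    name: str
    type: str  # "http" or "socks5"
    host: str
    port: int
    username: str = ""
    password: str = ""


@dataclass
class ProxyStore:
    proxies: dict = field(default_factory=dict)   # proxy id -> Proxy
    accounts: dict = field(default_factory=dict)  # account identifier -> proxy_id

    def add(self, name, type, host, port, username="", password="", id=None):
        # `id` is optional and generated when omitted.
        proxy = Proxy(id or uuid.uuid4().hex, name, type, host, port, username, password)
        self.proxies[proxy.id] = proxy
        return proxy

    def update(self, proxy_id, **fields):
        # An empty password means "keep the stored secret".
        if fields.get("password", "") == "":
            fields.pop("password", None)
        self.proxies[proxy_id] = replace(self.proxies[proxy_id], **fields)

    def delete(self, proxy_id):
        del self.proxies[proxy_id]
        # Clear proxy_id on every account that referenced the deleted proxy.
        for identifier, bound in self.accounts.items():
            if bound == proxy_id:
                self.accounts[identifier] = ""

    def bind(self, identifier, proxy_id):
        # An empty proxy_id unbinds the account.
        self.accounts[identifier] = proxy_id

    def list_public(self):
        # Never expose the password; report only whether one is set.
        return [
            {"id": p.id, "name": p.name, "type": p.type, "host": p.host,
             "port": p.port, "username": p.username,
             "has_password": bool(p.password)}
            for p in self.proxies.values()
        ]
```

One consequence of the empty-password rule is that this API shape offers no way to clear a password through an update. Whether the server provides another route for that is not stated here.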
+
 ### `GET /admin/accounts`

 **Query params**:
@@ -726,7 +785,7 @@ data: {"type":"message_stop"}
 | Param | Default | Range |
 | --- | --- | --- |
 | `page` | `1` | ≥ 1 |
-| `page_size` | `10` | 1–100 |
+| `page_size` | `10` | 1–5000 |
 | `q` | empty | Filter by identifier / email / mobile |

 **Response**:
@@ -765,6 +824,14 @@ data: {"type":"message_stop"}

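 **Response**: `{"success": true, "total_accounts": 5}`

+### `PUT /admin/accounts/{identifier}/proxy`
+
+Updates the proxy bound to the specified account.
+
+- Request body: `{"proxy_id":"..."}`.
+- An empty-string `proxy_id` unbinds the proxy.
+- `identifier` accepts an email, a mobile number, or the synthetic identifier of a token-only account.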
+
 ### `GET /admin/queue/status`

 ```json
--- a/README.MD
+++ b/README.MD
@@ -18,6 +18,8 @@

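 Docs: [Documentation index](docs/README.md) / [Architecture](docs/ARCHITECTURE.md) / [API reference](API.md)

+Thanks to the developers in the Linux.do and GitHub communities for supporting and contributing to this project.
+
 > **Important disclaimer**
 >
 > This repository is for learning, research, personal experimentation, and internal validation only; it provides no commercial license of any kind, no fitness guarantee, and no guarantee of results.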
@@ -85,7 +87,7 @@ flowchart LR
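 - **Unified routing core**: all protocol entries converge on `internal/server/router.go`, and OpenAI / Claude / Gemini / Admin / WebUI routes are registered in the same router tree, avoiding behavior drift across multiple entry points.
 - **Unified execution chain**: Claude / Gemini entries first go through `internal/translatorcliproxy` for protocol translation, then enter `openai.ChatCompletions` for unified handling of tool calls and streaming semantics, and are finally translated back into the original protocol's response.
 - **Clearer adapter layering**: `internal/adapter/{claude,gemini}` handles entry/exit protocol wrapping, `internal/adapter/openai` handles core execution, and DeepSeek-side calls are kept only in the OpenAI core.
-- **Tool Calling parity across both runtimes**: the Go side (`internal/toolcall`) and the Vercel Node side (`internal/js/helpers/stream-tool-sieve`) keep consistent parsing/anti-leak semantics, covering JSON / XML / invoke / text-kv input styles.
+- **Tool Calling parity across both runtimes**: the Go side (`internal/toolcall`) and the Vercel Node side (`internal/js/helpers/stream-tool-sieve`) keep consistent parsing/anti-leak semantics; parsing currently centers on the XML/Markup family (`<tool_call>` / `<function_call>` / `<invoke>` / `tool_use` / antml variants).
 - **Config decoupled from runtime settings**: static config (`config`) and runtime policy (`settings`) are managed separately through the Admin API, with support for hot updates and for invalidating old JWTs on password rotation.
 - **Streaming upgrade**: `/v1/responses` and `/v1/chat/completions` share a more consistent incremental tool-call output strategy, reducing behavior differences across SDKs.
 - **Better observability and operability**: `/healthz`, `/readyz`, `/admin/version`, and `/admin/dev/captures` form a closed troubleshooting loop that makes post-release verification easier.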
@@ -94,14 +96,14 @@ flowchart LR

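 | Capability | Details |
 | --- | --- |
-| OpenAI compatible | `GET /v1/models`, `GET /v1/models/{id}`, `POST /v1/chat/completions`, `POST /v1/responses`, `GET /v1/responses/{response_id}`, `POST /v1/embeddings` |
+| OpenAI compatible | `GET /v1/models`, `GET /v1/models/{id}`, `POST /v1/chat/completions`, `POST /v1/responses`, `GET /v1/responses/{response_id}`, `POST /v1/embeddings`, `POST /v1/files` |
 | Claude compatible | `GET /anthropic/v1/models`, `POST /anthropic/v1/messages`, `POST /anthropic/v1/messages/count_tokens` (plus the shortcut paths `/v1/messages`, `/messages`) |
 | Gemini compatible | `POST /v1beta/models/{model}:generateContent`, `POST /v1beta/models/{model}:streamGenerateContent` (plus `/v1/models/{model}:*` paths) |
 | Multi-account rotation | Automatic token refresh, email/mobile dual login |
 | Concurrency queue control | Per-account in-flight limit + waiting queue, dynamically computed recommended concurrency |
 | DeepSeek PoW | Pure Go high-performance implementation (DeepSeekHashV1), millisecond-level response |
 | Tool Calling | Anti-leak handling: high-confidence feature recognition outside code blocks, early `delta.tool_calls`, structured incremental output |
-| Admin API | Config management, runtime settings hot update, account test / batch test, session cleanup, import/export, Vercel sync, version check |
+| Admin API | Config management, runtime settings hot update, proxy management, account test / batch test, session cleanup, import/export, Vercel sync, version check |
 | WebUI admin console | Single-page app at `/admin` (bilingual Chinese/English, dark mode) |
 | Ops probes | `GET /healthz` (liveness), `GET /readyz` (readiness) |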

@@ -118,33 +120,42 @@ flowchart LR

 ## Model Support

-### OpenAI Endpoint
+### OpenAI Endpoint (`GET /v1/models`)

-| Model | thinking | search |
-| --- | --- | --- |
-| `deepseek-chat` | ❌ | ❌ |
-| `deepseek-reasoner` | ✅ | ❌ |
-| `deepseek-chat-search` | ❌ | ✅ |
-| `deepseek-reasoner-search` | ✅ | ✅ |
+| Model type | Model ID | thinking | search |
+| --- | --- | --- | --- |
+| default | `deepseek-chat` | ❌ | ❌ |
+| default | `deepseek-reasoner` | ✅ | ❌ |
+| default | `deepseek-chat-search` | ❌ | ✅ |
+| default | `deepseek-reasoner-search` | ✅ | ✅ |
+| expert | `deepseek-expert-chat` | ❌ | ❌ |
+| expert | `deepseek-expert-reasoner` | ✅ | ❌ |
+| expert | `deepseek-expert-chat-search` | ❌ | ✅ |
+| expert | `deepseek-expert-reasoner-search` | ✅ | ✅ |
+| vision | `deepseek-vision-chat` | ❌ | ❌ |
+| vision | `deepseek-vision-reasoner` | ✅ | ❌ |
+| vision | `deepseek-vision-chat-search` | ❌ | ✅ |
+| vision | `deepseek-vision-reasoner-search` | ✅ | ✅ |

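-### Claude Endpoint
+Besides native models, common aliases are also accepted as input (such as `gpt-5`, `gpt-5-mini`, `gpt-5-codex`, `gpt-4.1`, `o3`, `claude-opus-4-6`, `claude-sonnet-4-5`, `gemini-2.5-pro`, `gemini-2.5-flash`), but `/v1/models` returns normalized DeepSeek native model IDs.

-| Model | Default mapping |
+### Claude Endpoint (`GET /anthropic/v1/models`)
+
+| Currently common model | Default mapping |
 | --- | --- |
 | `claude-sonnet-4-5` | `deepseek-chat` |
 | `claude-haiku-4-5` (compatible with `claude-3-5-haiku-latest`) | `deepseek-chat` |
 | `claude-opus-4-6` | `deepseek-reasoner` |

 The mapping can be overridden via `claude_mapping` or `claude_model_mapping` in the config.
-In addition, `/anthropic/v1/models` now includes historical Claude 1.x/2.x/3.x/4.x model IDs and common aliases, so legacy clients work without changes.
-
+Besides the current primary aliases above, `/anthropic/v1/models` also returns Claude 4.x snapshots plus historical 3.x / 2.x / 1.x model IDs and common aliases, so legacy clients work without changes.

 #### Claude Code integration pitfalls (tested)

 - Point `ANTHROPIC_BASE_URL` directly at the DS2API root address (for example `http://127.0.0.1:5001`); Claude Code requests `/v1/messages?beta=true`.
 - `ANTHROPIC_API_KEY` must match `keys` in `config.json`; keeping both a regular key and an `sk-ant-*` style key is recommended, to suit different clients' validation habits.
 - If a system proxy is set, configure `NO_PROXY=127.0.0.1,localhost,<your host IP>` for the DS2API address so local loopback requests are not intercepted by the proxy.
-- If tool calls are output as text and not executed, upgrade to a version that includes multi-format Claude tool-call parsing (JSON/XML/ANTML/invoke).
+- If tool calls are output as text and not executed, first check whether the model output is a supported XML/Markup tool block (for example `<tool_call>` / `<function_call>` / `<invoke>` / `tool_use`) rather than a standalone JSON `tool_calls` fragment.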

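 ### Gemini Endpoint

@@ -152,6 +163,15 @@ The Gemini adapter maps model names via `model_aliases` or built-in rules to Deep

 ## Quick Start

+### Recommended deployment priority
+
+Choose a deployment method in the following order:
+
+1. **Download and run a Release build**: the least effort, since the artifacts are already compiled; best for most users.
+2. **Docker / GHCR image deployment**: suitable when you need containerization, orchestration, or cloud deployment.
+3. **Vercel deployment**: suitable if you already have a Vercel environment and accept its platform constraints.
+4. **Run from local source / build it yourself**: suitable for development, debugging, or when you need to modify the code.
+
 ### Common first step (all deployment methods)

 Use `config.json` as the single configuration source (recommended):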
@@ -165,29 +185,19 @@ cp config.example.json config.json
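 - Local run: read `config.json` directly
 - Docker / Vercel: generate `DS2API_CONFIG_JSON` (Base64) from `config.json` and inject it as an environment variable, or write the raw JSON directly

-### Option 1: Local run
+### Option 1: Download a Release build

-**Prerequisites**: Go 1.26+, Node.js `20.19+` or `22.12+` (only when the WebUI needs to be built)
+Each time a Release is published, GitHub Actions automatically builds multi-platform binary packages:

 ```bash
-# 1. Clone the repository
-git clone https://github.com/CJackHwang/ds2api.git
-cd ds2api
-
-# 2. Configure
+# After downloading the archive for your platform
+tar -xzf ds2api_<tag>_linux_amd64.tar.gz
+cd ds2api_<tag>_linux_amd64
 cp config.example.json config.json
-# Edit config.json and fill in your DeepSeek account info and API key
-
-# 3. Start
-go run ./cmd/ds2api
+# Edit config.json
+./ds2api
 ```

-Default local access URL: `http://127.0.0.1:5001`
-
-The service actually binds to `0.0.0.0:5001`, so devices on the same LAN can usually reach it through your internal IP as well.
-
-> **WebUI auto-build**: on the first local start, if `static/admin` does not exist, it automatically tries to run `npm ci` (only when dependencies are missing) and `npm run build -- --outDir static/admin --emptyOutDir` (requires Node.js on the machine). You can also build manually: `./scripts/build-webui.sh`
-
 ### Option 2: Run with Docker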

 ```bash
@@ -241,35 +251,28 @@ base64 < config.json | tr -d '\n'

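 For detailed deployment instructions, see the [Deployment Guide](docs/DEPLOY.md).

-### Option 4: Download a Release build
+### Option 4: Run from local source

-Each time a Release is published, GitHub Actions automatically builds multi-platform binary packages:
+**Prerequisites**: Go 1.26+, Node.js `20.19+` or `22.12+` (only when the WebUI needs to be built)

 ```bash
-# After downloading the archive for your platform
-tar -xzf ds2api_<tag>_linux_amd64.tar.gz
-cd ds2api_<tag>_linux_amd64
+# 1. Clone the repository
+git clone https://github.com/CJackHwang/ds2api.git
+cd ds2api
+
+# 2. Configure
 cp config.example.json config.json
-# Edit config.json
-./ds2api
+# Edit config.json and fill in your DeepSeek account info and API key
+
+# 3. Start
+go run ./cmd/ds2api
 ```

-### Option 5: OpenCode CLI integration
+Default local access URL: `http://127.0.0.1:5001`

-1. Copy the sample config:
+The service actually binds to `0.0.0.0:5001`, so devices on the same LAN can usually reach it through your internal IP as well.

-```bash
-cp opencode.json.example opencode.json
-```
-
-2. Edit `opencode.json`:
-- Set `baseURL` to your DS2API address (for example `https://your-domain.com/v1`)
-- Set `apiKey` to your DS2API key (matching `config.keys`)
-
-3. Start OpenCode CLI in the project directory (run `opencode` according to how you installed it).
-
-> Prefer the OpenAI-compatible path (`/v1/*`), i.e. the `@ai-sdk/openai-compatible` provider in the sample.
-> If the client supports `wire_api`, you can test `responses` and `chat` separately; DS2API is compatible with both paths.
+> **WebUI auto-build**: on the first local start, if `static/admin` does not exist, it automatically tries to run `npm ci` (only when dependencies are missing) and `npm run build -- --outDir static/admin --emptyOutDir` (requires Node.js on the machine). You can also build manually: `./scripts/build-webui.sh`

 ## Configuration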

@@ -290,8 +293,12 @@ cp opencode.json.example opencode.json
  ],
  "model_aliases": {
    "gpt-4o": "deepseek-chat",
+    "gpt-5": "deepseek-chat",
+    "gpt-5-mini": "deepseek-chat",
    "gpt-5-codex": "deepseek-reasoner",
-    "o3": "deepseek-reasoner"
+    "o3": "deepseek-reasoner",
+    "claude-opus-4-6": "deepseek-reasoner",
+    "gemini-2.5-flash": "deepseek-chat"
  },
  "compat": {
    "wide_input_strict_output": true,
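@@ -395,7 +402,7 @@ Gemini routes can also use `x-goog-api-key`, or, when there is no auth header, use `
 When a request carries `tools`, DS2API performs anti-leak handling and structured translation:

 1. Executable toolcall recognition is enabled only in **non-code-block contexts** (examples in code blocks do not trigger it by default)
-2. The parsing layer gives XML/Markup the highest priority while also accepting JSON / ANTML / invoke / text-kv, and normalizes everything into the internal tool-call structure
+2. The parsing layer currently follows the XML/Markup family (`<tool_call>` / `<function_call>` / `<invoke>` / `tool_use` / antml variants); standalone JSON `tool_calls` fragments are not parsed as executable calls by default
 3. `responses` streaming strictly uses the official item lifecycle events (`response.output_item.*`, `response.content_part.*`, `response.function_call_arguments.*`)
 4. `responses` supports and enforces `tool_choice` (`auto`/`none`/`required`/forced function); on a `required` violation, non-stream returns `422` and stream returns `response.failed`
 5. Tool calls are returned in whichever protocol the client requested (the native structure of OpenAI/Claude/Gemini respectively); the model side is first constrained to output canonical XML, which the compatibility layer then translates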
--- a/README.en.md
+++ b/README.en.md
@@ -85,7 +85,7 @@ For the full module-by-module architecture and directory responsibilities, see [
 - **Unified routing core**: all protocol entries are now centralized through `internal/server/router.go`, with OpenAI / Claude / Gemini / Admin / WebUI routes registered in one tree to avoid multi-entry drift.
 - **Unified execution chain**: Claude/Gemini entries are translated by `internal/translatorcliproxy`, then executed through `openai.ChatCompletions` for shared tool-calling and stream semantics, then translated back to the client protocol.
 - **Cleaner adapter boundaries**: `internal/adapter/{claude,gemini}` handles protocol wrappers, while `internal/adapter/openai` remains the execution core; upstream DeepSeek calls are retained only in the OpenAI core.
-- **Tool-calling parity across runtimes**: Go (`internal/toolcall`) and Vercel Node (`internal/js/helpers/stream-tool-sieve`) follow aligned parsing/anti-leak semantics across JSON / XML / invoke / text-kv inputs.
+- **Tool-calling parity across runtimes**: Go (`internal/toolcall`) and Vercel Node (`internal/js/helpers/stream-tool-sieve`) share aligned parsing/anti-leak semantics, now centered on XML/Markup-family payloads (`<tool_call>` / `<function_call>` / `<invoke>` / `tool_use` / antml variants).
 - **Config/runtime separation**: static config (`config`) and runtime policy (`settings`) are managed independently via Admin APIs, enabling hot updates and password rotation with JWT invalidation.
 - **Streaming behavior upgrade**: `/v1/responses` and `/v1/chat/completions` now share a more consistent incremental tool-call emission strategy across SDK ecosystems.
 - **Improved operability**: `/healthz`, `/readyz`, `/admin/version`, and `/admin/dev/captures` form a tighter post-deploy diagnostics loop.
@@ -94,14 +94,14 @@ For the full module-by-module architecture and directory responsibilities, see [

 | Capability | Details |
 | --- | --- |
-| OpenAI compatible | `GET /v1/models`, `GET /v1/models/{id}`, `POST /v1/chat/completions`, `POST /v1/responses`, `GET /v1/responses/{response_id}`, `POST /v1/embeddings` |
+| OpenAI compatible | `GET /v1/models`, `GET /v1/models/{id}`, `POST /v1/chat/completions`, `POST /v1/responses`, `GET /v1/responses/{response_id}`, `POST /v1/embeddings`, `POST /v1/files` |
 | Claude compatible | `GET /anthropic/v1/models`, `POST /anthropic/v1/messages`, `POST /anthropic/v1/messages/count_tokens` (plus shortcut paths `/v1/messages`, `/messages`) |
 | Gemini compatible | `POST /v1beta/models/{model}:generateContent`, `POST /v1beta/models/{model}:streamGenerateContent` (plus `/v1/models/{model}:*` paths) |
 | Multi-account rotation | Auto token refresh, email/mobile dual login |
 | Concurrency control | Per-account in-flight limit + waiting queue, dynamic recommended concurrency |
 | DeepSeek PoW | Pure Go high-performance solver (DeepSeekHashV1), ms-level response |
 | Tool Calling | Anti-leak handling: non-code-block feature match, early `delta.tool_calls`, structured incremental output |
-| Admin API | Config management, runtime settings hot-reload, account testing/batch test, session cleanup, import/export, Vercel sync, version check |
+| Admin API | Config management, runtime settings hot-reload, proxy management, account testing/batch test, session cleanup, import/export, Vercel sync, version check |
 | WebUI Admin Panel | SPA at `/admin` (bilingual Chinese/English, dark mode) |
 | Health Probes | `GET /healthz` (liveness), `GET /readyz` (readiness) |

@@ -118,33 +118,42 @@ For the full module-by-module architecture and directory responsibilities, see [

 ## Model Support

-### OpenAI Endpoint
+### OpenAI Endpoint (`GET /v1/models`)

-| Model | thinking | search |
-| --- | --- | --- |
-| `deepseek-chat` | ❌ | ❌ |
-| `deepseek-reasoner` | ✅ | ❌ |
-| `deepseek-chat-search` | ❌ | ✅ |
-| `deepseek-reasoner-search` | ✅ | ✅ |
+| Family | Model ID | thinking | search |
+| --- | --- | --- | --- |
+| default | `deepseek-chat` | ❌ | ❌ |
+| default | `deepseek-reasoner` | ✅ | ❌ |
+| default | `deepseek-chat-search` | ❌ | ✅ |
+| default | `deepseek-reasoner-search` | ✅ | ✅ |
+| expert | `deepseek-expert-chat` | ❌ | ❌ |
+| expert | `deepseek-expert-reasoner` | ✅ | ❌ |
+| expert | `deepseek-expert-chat-search` | ❌ | ✅ |
+| expert | `deepseek-expert-reasoner-search` | ✅ | ✅ |
+| vision | `deepseek-vision-chat` | ❌ | ❌ |
+| vision | `deepseek-vision-reasoner` | ✅ | ❌ |
+| vision | `deepseek-vision-chat-search` | ❌ | ✅ |
+| vision | `deepseek-vision-reasoner-search` | ✅ | ✅ |

-### Claude Endpoint
+Besides native IDs, DS2API also accepts common aliases as input (for example `gpt-5`, `gpt-5-mini`, `gpt-5-codex`, `gpt-4.1`, `o3`, `claude-opus-4-6`, `claude-sonnet-4-5`, `gemini-2.5-pro`, `gemini-2.5-flash`), but `/v1/models` returns normalized DeepSeek native model IDs.

-| Model | Default Mapping |
+### Claude Endpoint (`GET /anthropic/v1/models`)
+
+| Current common model | Default Mapping |
 | --- | --- |
 | `claude-sonnet-4-5` | `deepseek-chat` |
 | `claude-haiku-4-5` (compatible with `claude-3-5-haiku-latest`) | `deepseek-chat` |
 | `claude-opus-4-6` | `deepseek-reasoner` |

 Override mapping via `claude_mapping` or `claude_model_mapping` in config.
-In addition, `/anthropic/v1/models` now includes historical Claude 1.x/2.x/3.x/4.x IDs and common aliases for legacy client compatibility.
-
+Besides the current primary aliases above, `/anthropic/v1/models` also returns Claude 4.x snapshots plus historical 3.x / 2.x / 1.x IDs and common aliases for legacy client compatibility.

 #### Claude Code integration pitfalls (validated)

 - Set `ANTHROPIC_BASE_URL` to the DS2API root URL (for example `http://127.0.0.1:5001`). Claude Code sends requests to `/v1/messages?beta=true`.
 - `ANTHROPIC_API_KEY` must match an entry in `keys` from `config.json`. Keeping both a regular key and an `sk-ant-*` style key improves client compatibility.
 - If your environment has proxy variables, set `NO_PROXY=127.0.0.1,localhost,<your_host_ip>` for DS2API to avoid proxy interception of local traffic.
-- If tool calls are rendered as plain text and not executed, upgrade to a build that includes multi-format Claude tool-call parsing (JSON/XML/ANTML/invoke).
+- If tool calls are rendered as plain text and not executed, first verify the model output uses supported XML/Markup tool blocks (`<tool_call>` / `<function_call>` / `<invoke>` / `tool_use`) rather than standalone JSON `tool_calls`.

 ### Gemini Endpoint

@@ -152,6 +161,15 @@ The Gemini adapter maps model names to DeepSeek native models via `model_aliases

 ## Quick Start

+### Recommended deployment priority
+
+Recommended order when choosing a deployment method:
+
+1. **Download and run release binaries**: the easiest path for most users because the artifacts are already built.
+2. **Docker / GHCR image deployment**: suitable for containerized, orchestrated, or cloud environments.
+3. **Vercel deployment**: suitable if you already use Vercel and accept its platform constraints.
+4. **Run from source / build locally**: suitable for development, debugging, or when you need to modify the code yourself.
+
 ### Universal First Step (all deployment modes)

 Use `config.json` as the single source of truth (recommended):
@@ -165,47 +183,37 @@ Recommended per deployment mode:
 - Local run: read `config.json` directly
 - Docker / Vercel: generate Base64 from `config.json` and inject as `DS2API_CONFIG_JSON`, or paste raw JSON directly

-### Option 1: Local Run
+### Option 1: Download Release Binaries

-**Prerequisites**: Go 1.26+, Node.js `20.19+` or `22.12+` (only if building WebUI locally)
+GitHub Actions automatically builds multi-platform archives on each Release:

 ```bash
-# 1. Clone
-git clone https://github.com/CJackHwang/ds2api.git
-cd ds2api
-
-# 2. Configure
+# After downloading the archive for your platform
+tar -xzf ds2api_<tag>_linux_amd64.tar.gz
+cd ds2api_<tag>_linux_amd64
 cp config.example.json config.json
-# Edit config.json with your DeepSeek account info and API keys
-
-# 3. Start
-go run ./cmd/ds2api
+# Edit config.json
+./ds2api
 ```

-Default local URL: `http://127.0.0.1:5001`
-
-The server actually binds to `0.0.0.0:5001`, so devices on the same LAN can usually reach it through your private IP as well.
-
-> **WebUI auto-build**: On first local startup, if `static/admin` is missing, DS2API will auto-run `npm ci` (only when dependencies are missing) and `npm run build -- --outDir static/admin --emptyOutDir` (requires Node.js). You can also build manually: `./scripts/build-webui.sh`
-
-### Option 2: Docker
+### Option 2: Docker / GHCR

 ```bash
-# 1. Prepare env file and config file
+# Pull prebuilt image
+docker pull ghcr.io/cjackhwang/ds2api:latest
+
+# Or run a pinned version
+# docker pull ghcr.io/cjackhwang/ds2api:v3.0.0
+
+# Prepare env file and config file
 cp .env.example .env
 cp config.example.json config.json

-# 2. Edit .env (at least set DS2API_ADMIN_KEY; optionally set DS2API_HOST_PORT to change the host port)
-#    DS2API_ADMIN_KEY=replace-with-a-strong-secret
-
-# 3. Start
+# Start with compose
 docker-compose up -d
-
-# 4. View logs
-docker-compose logs -f
 ```

-The default `docker-compose.yml` maps host port `6011` to container port `5001`. If you want `5001` exposed directly, set `DS2API_HOST_PORT=5001` (or adjust the `ports` mapping).
+The default `docker-compose.yml` uses `ghcr.io/cjackhwang/ds2api:latest` and maps host port `6011` to container port `5001`. If you want `5001` exposed directly, set `DS2API_HOST_PORT=5001` (or adjust the `ports` mapping).

 Rebuild after updates: `docker-compose up -d --build`

@@ -241,35 +249,28 @@ base64 < config.json | tr -d '\n'

 For detailed deployment instructions, see the [Deployment Guide](docs/DEPLOY.en.md).

-### Option 4: Download Release Binaries
+### Option 4: Local Run

-GitHub Actions automatically builds multi-platform archives on each Release:
+**Prerequisites**: Go 1.26+, Node.js `20.19+` or `22.12+` (only if building WebUI locally)

 ```bash
-# After downloading the archive for your platform
-tar -xzf ds2api_<tag>_linux_amd64.tar.gz
-cd ds2api_<tag>_linux_amd64
+# 1. Clone
+git clone https://github.com/CJackHwang/ds2api.git
+cd ds2api
+
+# 2. Configure
 cp config.example.json config.json
-# Edit config.json
-./ds2api
+# Edit config.json with your DeepSeek account info and API keys
+
+# 3. Start
+go run ./cmd/ds2api
 ```

-### Option 5: OpenCode CLI
+Default local URL: `http://127.0.0.1:5001`

-1. Copy the example config:
+The server actually binds to `0.0.0.0:5001`, so devices on the same LAN can usually reach it through your private IP as well.

-```bash
-cp opencode.json.example opencode.json
-```
-
-2. Edit `opencode.json`:
-- Set `baseURL` to your DS2API endpoint (for example, `https://your-domain.com/v1`)
-- Set `apiKey` to your DS2API key (from `config.keys`)
-
-3. Start OpenCode CLI in the project directory (run `opencode` using your installed method).
-
-> Recommended: use the OpenAI-compatible path (`/v1/*`) via `@ai-sdk/openai-compatible` as shown in the example.
-> If your client supports `wire_api`, test both `responses` and `chat`; DS2API supports both paths.
+> **WebUI auto-build**: On first local startup, if `static/admin` is missing, DS2API will auto-run `npm ci` (only when dependencies are missing) and `npm run build -- --outDir static/admin --emptyOutDir` (requires Node.js). You can also build manually: `./scripts/build-webui.sh`

 ## Configuration

@@ -290,8 +291,12 @@ cp opencode.json.example opencode.json
  ],
  "model_aliases": {
    "gpt-4o": "deepseek-chat",
+    "gpt-5": "deepseek-chat",
+    "gpt-5-mini": "deepseek-chat",
    "gpt-5-codex": "deepseek-reasoner",
-    "o3": "deepseek-reasoner"
+    "o3": "deepseek-reasoner",
+    "claude-opus-4-6": "deepseek-reasoner",
+    "gemini-2.5-flash": "deepseek-chat"
  },
  "compat": {
    "wide_input_strict_output": true,
@@ -395,7 +400,7 @@ Queue limit = DS2API_ACCOUNT_MAX_QUEUE (default = recommended concurrency)
 When `tools` is present in the request, DS2API performs anti-leak handling:

 1. Toolcall feature matching is enabled only in **non-code-block context** (fenced examples are ignored)
-2. The parser prioritizes XML/Markup, while also accepting JSON / ANTML / invoke / text-kv, and normalizes everything into the internal tool-call structure
+2. The parser currently targets XML/Markup-family tool syntax (`<tool_call>` / `<function_call>` / `<invoke>` / `tool_use` / antml variants); standalone JSON `tool_calls` payloads are not treated as executable calls by default
 3. `responses` streaming strictly uses official item lifecycle events (`response.output_item.*`, `response.content_part.*`, `response.function_call_arguments.*`)
 4. `responses` supports and enforces `tool_choice` (`auto`/`none`/`required`/forced function); `required` violations return `422` for non-stream and `response.failed` for stream
 5. The output protocol follows the client request (OpenAI / Claude / Gemini native shapes); model-side prompting can prefer XML, and the compatibility layer handles the protocol-specific translation
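The first two anti-leak rules above (match only outside fenced code blocks, and treat only XML/Markup tool tags as executable syntax) can be illustrated with a small Go sketch. This is not DS2API's implementation: `stripFencedBlocks`, `hasToolSyntaxOutsideFences`, and the regular expression are hypothetical names chosen for this example, and the antml variants are left out for brevity.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// toolTagPattern is an assumption for this sketch: it matches the opening
// tags named in the README (<tool_call>, <function_call>, <invoke>, <tool_use>).
var toolTagPattern = regexp.MustCompile(`<(tool_call|function_call|invoke|tool_use)\b`)

// stripFencedBlocks drops every line that sits inside a ``` fence, so that
// fenced examples are never considered for tool-call matching.
func stripFencedBlocks(text string) string {
	var kept []string
	inFence := false
	for _, line := range strings.Split(text, "\n") {
		if strings.HasPrefix(strings.TrimSpace(line), "```") {
			inFence = !inFence // a fence line opens or closes a block
			continue
		}
		if !inFence {
			kept = append(kept, line)
		}
	}
	return strings.Join(kept, "\n")
}

// hasToolSyntaxOutsideFences reports whether a supported opening tag
// appears in the text once fenced blocks have been removed.
func hasToolSyntaxOutsideFences(text string) bool {
	return toolTagPattern.MatchString(stripFencedBlocks(text))
}

func main() {
	live := "<tool_call><tool_name>search</tool_name></tool_call>"
	fenced := "Example:\n```xml\n<tool_call>demo</tool_call>\n```\nEnd."
	jsonOnly := `{"tool_calls":[{"name":"search"}]}`

	fmt.Println(hasToolSyntaxOutsideFences(live))
	fmt.Println(hasToolSyntaxOutsideFences(fenced))
	fmt.Println(hasToolSyntaxOutsideFences(jsonOnly))
}
```

Removing fenced regions before matching keeps the detection step simple; a streaming sieve would have to track the fence state incrementally instead of re-scanning the full text.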
--- a/2
+++ b/2
@@ -1 +1 @@
-3.2.0
+3.5.0
--- a/docs/ARCHITECTURE.en.md
+++ b/docs/ARCHITECTURE.en.md
@@ -116,7 +116,7 @@ flowchart LR
 - `internal/translatorcliproxy`: structure translation between Claude/Gemini and OpenAI.
 - `internal/deepseek`: upstream request/session/PoW/SSE handling.
 - `internal/stream` + `internal/sse`: stream parsing and incremental assembly.
-- `internal/toolcall`: JSON/XML/invoke/text-kv tool-call parsing + anti-leak sieve.
+- `internal/toolcall`: XML/Markup-family tool-call parsing + anti-leak sieve (`<tool_call>` / `<function_call>` / `<invoke>` / `tool_use` / antml variants).
 - `internal/admin`: config/accounts/vercel sync/version/dev-capture endpoints.
 - `internal/config`: config loading/validation + runtime settings hot-reload.
 - `internal/account`: managed account pool, inflight slots, waiting queue.
--- a/docs/ARCHITECTURE.md
+++ b/docs/ARCHITECTURE.md
@@ -116,7 +116,7 @@ flowchart LR
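 - `internal/translatorcliproxy`: structure translation between Claude/Gemini and OpenAI.
 - `internal/deepseek`: upstream requests, sessions, PoW, SSE consumption.
 - `internal/stream` + `internal/sse`: stream parsing and incremental processing.
-- `internal/toolcall`: JSON/XML/invoke/text-kv tool-call parsing and anti-leak sieving.
+- `internal/toolcall`: tool-call parsing centered on the XML/Markup family, plus anti-leak sieving (`<tool_call>` / `<function_call>` / `<invoke>` / `tool_use` / antml variants).
 - `internal/admin`: config management, account management, Vercel sync, version check, dev capture.
 - `internal/config`: config loading, validation, runtime settings hot-reload.
 - `internal/account`: managed account pool, concurrency slots, waiting queue.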
--- a/docs/DEPLOY.en.md
+++ b/docs/DEPLOY.en.md
@@ -10,11 +10,12 @@ Doc map: [Index](./README.md) | [Architecture](./ARCHITECTURE.en.md) | [API](../

 ## Table of Contents

+- [Recommended deployment priority](#recommended-deployment-priority)
 - [Prerequisites](#0-prerequisites)
-- [1. Local Run](#1-local-run)
-- [2. Docker Deployment](#2-docker-deployment)
+- [1. Download Release Binaries](#1-download-release-binaries)
+- [2. Docker / GHCR Deployment](#2-docker--ghcr-deployment)
 - [3. Vercel Deployment](#3-vercel-deployment)
-- [4. Download Release Binaries](#4-download-release-binaries)
+- [4. Local Run from Source](#4-local-run-from-source)
 - [5. Reverse Proxy (Nginx)](#5-reverse-proxy-nginx)
 - [6. Linux systemd Service](#6-linux-systemd-service)
 - [7. Post-Deploy Checks](#7-post-deploy-checks)
@@ -22,6 +23,17 @@ Doc map: [Index](./README.md) | [Architecture](./ARCHITECTURE.en.md) | [API](../

 ---

+## Recommended deployment priority
+
+Recommended order when choosing a deployment method:
+
+1. **Download and run release binaries**: the easiest path for most users because the artifacts are already built.
+2. **Docker / GHCR image deployment**: suitable for containerized, orchestrated, or cloud environments.
+3. **Vercel deployment**: suitable if you already use Vercel and accept its platform constraints.
+4. **Run from source / build locally**: suitable for development, debugging, or when you need to modify the code yourself.
+
+---
+
 ## 0. Prerequisites

 | Dependency | Minimum Version | Notes |
@@ -48,70 +60,59 @@ Use `config.json` as the single source of truth:

 ---

-## 1. Local Run
+## 1. Download Release Binaries

-### 1.1 Basic Steps
+Built-in GitHub Actions workflow: `.github/workflows/release-artifacts.yml`
+
+- **Trigger**: only on Release `published` (no build on normal push)
+- **Outputs**: multi-platform binary archives + `sha256sums.txt`
+- **Container publishing**: GHCR only (`ghcr.io/cjackhwang/ds2api`)
+
+| Platform | Architecture | Format |
+| --- | --- | --- |
+| Linux | amd64, arm64 | `.tar.gz` |
+| macOS | amd64, arm64 | `.tar.gz` |
+| Windows | amd64 | `.zip` |
+
+Each archive includes:
+
+- `ds2api` executable (`ds2api.exe` on Windows)
+- `static/admin/` (built WebUI assets)
+- `config.example.json`, `.env.example`
+- `README.MD`, `README.en.md`, `LICENSE`
+
+### Usage

 ```bash
-# Clone
-git clone https://github.com/CJackHwang/ds2api.git
-cd ds2api
+# 1. Download the archive for your platform
+# 2. Extract
+tar -xzf ds2api_<tag>_linux_amd64.tar.gz
+cd ds2api_<tag>_linux_amd64

-# Copy and edit config
+# 3. Configure
 cp config.example.json config.json
-# Open config.json and fill in:
-#   - keys: your API access keys
-#   - accounts: DeepSeek accounts (email or mobile + password)
+# Edit config.json

-# Start
-go run ./cmd/ds2api
-```
-
-Default local access URL: `http://127.0.0.1:5001`; the server actually binds to `0.0.0.0:5001` (override with `PORT`).
-
-### 1.2 WebUI Build
-
-On first local startup, if `static/admin/` is missing, DS2API will automatically attempt to build the WebUI (requires Node.js/npm; when dependencies are missing it runs `npm ci` first, then `npm run build -- --outDir static/admin --emptyOutDir`).
-
-Manual build:
-
-```bash
-./scripts/build-webui.sh
-```
-
-Or step by step:
-
-```bash
-cd webui
-npm install
-npm run build
-# Output goes to static/admin/
-```
-
-Control auto-build via environment variable:
-
-```bash
-# Disable auto-build
-DS2API_AUTO_BUILD_WEBUI=false go run ./cmd/ds2api
-
-# Force enable auto-build
-DS2API_AUTO_BUILD_WEBUI=true go run ./cmd/ds2api
-```
-
-### 1.3 Compile to Binary
-
-```bash
-go build -o ds2api ./cmd/ds2api
+# 4. Start
 ./ds2api
 ```

+### Maintainer Release Flow
+
+1. Create and publish a GitHub Release (with tag, for example `vX.Y.Z`)
+2. Wait for the `Release Artifacts` workflow to complete
+3. Download the matching archive from Release Assets
+
 ---

-## 2. Docker Deployment
+## 2. Docker / GHCR Deployment

 ### 2.1 Basic Steps

 ```bash
+# Pull prebuilt image
+docker pull ghcr.io/cjackhwang/ds2api:latest
+
 # Copy env template and config file
 cp .env.example .env
 cp config.example.json config.json
@@ -128,7 +129,13 @@ docker-compose up -d
 docker-compose logs -f
 ```

-The default `docker-compose.yml` maps host port `6011` to container port `5001`. If you want `5001` exposed directly, set `DS2API_HOST_PORT=5001` (or adjust the `ports` mapping).
+The default `docker-compose.yml` directly uses `ghcr.io/cjackhwang/ds2api:latest` and maps host port `6011` to container port `5001`. If you want `5001` exposed directly, set `DS2API_HOST_PORT=5001` (or adjust the `ports` mapping).
+
+If you want a pinned version instead of `latest`, you can also pull a specific tag directly:
+
+```bash
+docker pull ghcr.io/cjackhwang/ds2api:v3.0.0
+```

 ### 2.2 Update

@@ -350,57 +357,61 @@ If API responses return Vercel HTML `Authentication Required`:

 ---

-## 4. Download Release Binaries
+## 4. Local Run from Source

-Built-in GitHub Actions workflow: `.github/workflows/release-artifacts.yml`
-
-- **Trigger**: only on Release `published` (no build on normal push)
-- **Outputs**: multi-platform binary archives + `sha256sums.txt`
-- **Container publishing**: GHCR only (`ghcr.io/cjackhwang/ds2api`)
-
-| Platform | Architecture | Format |
-| --- | --- | --- |
-| Linux | amd64, arm64 | `.tar.gz` |
-| macOS | amd64, arm64 | `.tar.gz` |
-| Windows | amd64 | `.zip` |
-
-Each archive includes:
-
-- `ds2api` executable (`ds2api.exe` on Windows)
-- `static/admin/` (built WebUI assets)
-- `config.example.json`, `.env.example`
-- `README.MD`, `README.en.md`, `LICENSE`
-
-### Usage
+### 4.1 Basic Steps

 ```bash
-# 1. Download the archive for your platform
-# 2. Extract
-tar -xzf ds2api_<tag>_linux_amd64.tar.gz
-cd ds2api_<tag>_linux_amd64
+# Clone
+git clone https://github.com/CJackHwang/ds2api.git
+cd ds2api

-# 3. Configure
+# Copy and edit config
 cp config.example.json config.json
-# Edit config.json
+# Open config.json and fill in:
+#   - keys: your API access keys
+#   - accounts: DeepSeek accounts (email or mobile + password)

-# 4. Start
-./ds2api
+# Start
+go run ./cmd/ds2api
 ```

-### Maintainer Release Flow
+Default local access URL: `http://127.0.0.1:5001`; the server actually binds to `0.0.0.0:5001` (override with `PORT`).

-1. Create and publish a GitHub Release (with tag, for example `vX.Y.Z`)
-2. Wait for the `Release Artifacts` workflow to complete
-3. Download the matching archive from Release Assets
+### 4.2 WebUI Build

-### Pull from GHCR (Optional)
+On first local startup, if `static/admin/` is missing, DS2API will automatically attempt to build the WebUI (requires Node.js/npm; when dependencies are missing it runs `npm ci` first, then `npm run build -- --outDir static/admin --emptyOutDir`).
+
+Manual build:

 ```bash
-# latest
-docker pull ghcr.io/cjackhwang/ds2api:latest
+./scripts/build-webui.sh
+```

-# specific version (example)
-docker pull ghcr.io/cjackhwang/ds2api:v3.0.0
+Or step by step:
+
+```bash
+cd webui
+npm install
+npm run build
+# Output goes to static/admin/
+```
+
+Control auto-build via environment variable:
+
+```bash
+# Disable auto-build
+DS2API_AUTO_BUILD_WEBUI=false go run ./cmd/ds2api
+
+# Force enable auto-build
+DS2API_AUTO_BUILD_WEBUI=true go run ./cmd/ds2api
+```
+
+### 4.3 Compile to Binary
+
+```bash
+go build -o ds2api ./cmd/ds2api
+./ds2api
 ```

 ---
--- a/docs/DEPLOY.md
+++ b/docs/DEPLOY.md
@@ -10,11 +10,12 @@

 ## Table of Contents

+- [Recommended deployment priority](#部署方式优先级建议)
 - [Prerequisites](#0-前置要求)
-- [1. Local Run](#一本地运行)
-- [2. Docker Deployment](#二docker-部署)
+- [1. Download Release Binaries](#一下载-release-构建包)
+- [2. Docker / GHCR Deployment](#二docker--ghcr-部署)
 - [3. Vercel Deployment](#三vercel-部署)
-- [4. Download Release Binaries](#四下载-release-构建包)
+- [4. Local Run from Source](#四本地源码运行)
 - [5. Reverse Proxy (Nginx)](#五反向代理nginx)
 - [6. Linux systemd Service](#六linux-systemd-服务化)
 - [7. Post-Deploy Checks](#七部署后检查)
@@ -22,6 +23,17 @@

 ---

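+## Recommended deployment priority
+
+Recommended order when choosing a deployment method:
+
+1. **Download and run release binaries**: the least effort, since the artifacts are already built; the best fit for most users.
+2. **Docker / GHCR image deployment**: suitable when you need containerization, orchestration, or cloud deployment.
+3. **Vercel deployment**: suitable if you already have a Vercel environment and accept its platform constraints.
+4. **Run from source / build locally**: suitable for development, debugging, or when you need to modify the code yourself.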
+
+---
+
 ## 0. Prerequisites

 | Dependency | Minimum Version | Notes |
@@ -48,70 +60,59 @@ cp config.example.json config.json

 ---

-## 1. Local Run
+## 1. Download Release Binaries

-### 1.1 Basic Steps
+The repository ships a built-in GitHub Actions workflow: `.github/workflows/release-artifacts.yml`
+
+- **Trigger**: only on Release `published` (a normal push does not build)
+- **Outputs**: multi-platform binary archives + `sha256sums.txt`
+- **Container image publishing**: GHCR only (`ghcr.io/cjackhwang/ds2api`)
+
+| Platform | Architecture | Format |
+| --- | --- | --- |
+| Linux | amd64, arm64 | `.tar.gz` |
+| macOS | amd64, arm64 | `.tar.gz` |
+| Windows | amd64 | `.zip` |
+
+Each archive contains:
+
+- `ds2api` executable (`ds2api.exe` on Windows)
+- `static/admin/` (built WebUI assets)
+- `config.example.json`, `.env.example`
+- `README.MD`, `README.en.md`, `LICENSE`
+
+### Usage

 ```bash
-# Clone the repository
-git clone https://github.com/CJackHwang/ds2api.git
-cd ds2api
+# 1. Download the archive for your platform
+# 2. Extract
+tar -xzf ds2api_<tag>_linux_amd64.tar.gz
+cd ds2api_<tag>_linux_amd64

-# Copy and edit the config
+# 3. Configure
 cp config.example.json config.json
-# Open config.json in your preferred editor and fill in:
-#   - keys: your API access keys
-#   - accounts: DeepSeek accounts (email or mobile + password)
+# Edit config.json

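-# Start the service
-go run ./cmd/ds2api
-```
-
-The default local access URL is `http://127.0.0.1:5001`; the service actually binds to `0.0.0.0:5001`, which can be overridden with the `PORT` environment variable.
-
-### 1.2 WebUI Build
-
-On first local startup, if `static/admin/` does not exist, the service automatically tries to build the WebUI (requires Node.js/npm; when dependencies are missing it runs `npm ci` first, then `npm run build -- --outDir static/admin --emptyOutDir`).
-
-You can also build manually:
-
-```bash
-./scripts/build-webui.sh
-```
-
-Or run the steps by hand:
-
-```bash
-cd webui
-npm install
-npm run build
-# Output goes to static/admin/
-```
-
-Control auto-build behavior via an environment variable:
-
-```bash
-# Force-disable auto-build
-DS2API_AUTO_BUILD_WEBUI=false go run ./cmd/ds2api
-
-# Force-enable auto-build
-DS2API_AUTO_BUILD_WEBUI=true go run ./cmd/ds2api
-```
-
-### 1.3 Compile to a Binary
-
-```bash
-go build -o ds2api ./cmd/ds2api
+# 4. Start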
 ./ds2api
 ```

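+### Maintainer Release Steps
+
+1. Create and publish a Release on GitHub (with a tag, for example `vX.Y.Z`)
+2. Wait for the `Release Artifacts` Actions workflow to complete
+3. Download the archive for your platform from the Release Assets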
+
 ---

-## 2. Docker Deployment
+## 2. Docker / GHCR Deployment

 ### 2.1 Basic Steps

 ```bash
+# Pull the prebuilt image
+docker pull ghcr.io/cjackhwang/ds2api:latest
+
 # Copy the env template and config file
 cp .env.example .env
 cp config.example.json config.json
@@ -128,7 +129,13 @@ docker-compose up -d
 docker-compose logs -f
 ```

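-The default `docker-compose.yml` maps host port `6011` to container port `5001`. If you want to expose `5001` directly, set `DS2API_HOST_PORT=5001` (or adjust the `ports` configuration manually).
+The default `docker-compose.yml` uses `ghcr.io/cjackhwang/ds2api:latest` directly and maps host port `6011` to container port `5001`. If you want to expose `5001` directly, set `DS2API_HOST_PORT=5001` (or adjust the `ports` configuration manually).
+
+For a pinned version, you can also pull a specific tag directly: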
+
+```bash
+docker pull ghcr.io/cjackhwang/ds2api:v3.0.0
+```

 ### 2.2 Update

@@ -350,57 +357,61 @@ No Output Directory named "public" found after the Build completed.

 ---

-## 4. Download Release Binaries
+## 4. Local Run from Source

-The repository ships a built-in GitHub Actions workflow: `.github/workflows/release-artifacts.yml`
-
-- **Trigger**: only on Release `published` (a normal push does not build)
-- **Outputs**: multi-platform binary archives + `sha256sums.txt`
-- **Container image publishing**: GHCR only (`ghcr.io/cjackhwang/ds2api`)
-
-| Platform | Architecture | Format |
-| --- | --- | --- |
-| Linux | amd64, arm64 | `.tar.gz` |
-| macOS | amd64, arm64 | `.tar.gz` |
-| Windows | amd64 | `.zip` |
-
-Each archive contains:
-
-- `ds2api` executable (`ds2api.exe` on Windows)
-- `static/admin/` (built WebUI assets)
-- `config.example.json`, `.env.example`
-- `README.MD`, `README.en.md`, `LICENSE`
-
-### Usage
+### 4.1 Basic Steps

 ```bash
-# 1. Download the archive for your platform
-# 2. Extract
-tar -xzf ds2api_<tag>_linux_amd64.tar.gz
-cd ds2api_<tag>_linux_amd64
+# Clone the repository
+git clone https://github.com/CJackHwang/ds2api.git
+cd ds2api

-# 3. Configure
+# Copy and edit the config
 cp config.example.json config.json
-# Edit config.json
+# Open config.json in your preferred editor and fill in:
+#   - keys: your API access keys
+#   - accounts: DeepSeek accounts (email or mobile + password)

-# 4. Start
-./ds2api
+# Start the service
+go run ./cmd/ds2api
 ```

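-### Maintainer Release Steps
+The default local access URL is `http://127.0.0.1:5001`; the service actually binds to `0.0.0.0:5001`, which can be overridden with the `PORT` environment variable.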

-1. Create and publish a Release on GitHub (with a tag, for example `vX.Y.Z`)
-2. Wait for the `Release Artifacts` Actions workflow to complete
-3. Download the archive for your platform from the Release Assets
+### 4.2 WebUI Build

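-### Pull the GHCR Image (Optional)
+On first local startup, if `static/admin/` does not exist, the service automatically tries to build the WebUI (requires Node.js/npm; when dependencies are missing it runs `npm ci` first, then `npm run build -- --outDir static/admin --emptyOutDir`).
+
+You can also build manually: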

 ```bash
-# latest
-docker pull ghcr.io/cjackhwang/ds2api:latest
+./scripts/build-webui.sh
+```

-# Specific version (example)
-docker pull ghcr.io/cjackhwang/ds2api:v3.0.0
+Or run the steps by hand:
+
+```bash
+cd webui
+npm install
+npm run build
+# Output goes to static/admin/
+```
+
+Control auto-build behavior via an environment variable:
+
+```bash
+# Force-disable auto-build
+DS2API_AUTO_BUILD_WEBUI=false go run ./cmd/ds2api
+
+# Force-enable auto-build
+DS2API_AUTO_BUILD_WEBUI=true go run ./cmd/ds2api
+```
+
+### 4.3 Compile to a Binary
+
+```bash
+go build -o ds2api ./cmd/ds2api
+./ds2api
 ```

 ---
--- a/docs/toolcall-semantics.md
+++ b/docs/toolcall-semantics.md
@@ -1,74 +1,74 @@
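 # Tool call parsing semantics (unified Go/Node semantics)
 
-This document describes the **actual behavior** of `ParseToolCallsDetailed` / `parseToolCallsDetailed` in the current code, and is used to keep the Go and Node runtimes aligned.
+This document describes the **actual behavior** of the tool-call parsing pipeline in the current code (with `internal/toolcall` and `internal/js/helpers/stream-tool-sieve` as the source of truth).
 
 Docs: [Overview](../README.MD) / [Architecture](./ARCHITECTURE.md) / [Testing Guide](./TESTING.md)
 
-## 1) Output structure (current implementation)
+## 1) Current output structure
 
-- `calls`: the list of parsed tool calls (`name` + `input`).
-- `sawToolCallSyntax`: `true` when tool-call syntax features are detected (for example `tool_calls`, `<tool_call>`, `<function_call>`, `<invoke>`, `function.name:`).
-- `rejectedByPolicy`: always `false` in the current implementation (reserved field; allow-list rejection is not enabled yet).
+`ParseToolCallsDetailed` / `parseToolCallsDetailed` return:
+
+- `calls`: the list of parsed tool calls (`name` + `input`).
+- `sawToolCallSyntax`: `true` when tool-call syntax features are detected.
+- `rejectedByPolicy`: always `false` in the current implementation (reserved field).
 - `rejectedToolNames`: always an empty array in the current implementation (reserved field).
 
-> Note: `filterToolCallsDetailed` currently only performs structural cleanup; it does not reject tool names by policy.
+> `filterToolCallsDetailed` currently only performs structural cleanup; it does not hard-reject tool names against an allow-list.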

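-## 2) Parsing pipeline
+## 2) Parsing scope (key point)
 
-1. **Example protection**: if the context is judged to be a fenced code block example, executable parsing is skipped.
-2. **Candidate construction**: build candidates from the full text (the original text, JSON fragments around `tool_calls`, the slice between the first and last braces, and so on).
-3. **Try parsers in order (stop at the first hit)**:
-   - For "obvious JSON tool payload candidates" (starting with `{`/`[` and containing `tool_calls`/`\"function\"`), run JSON parsing first, so that stray XML fragments inside JSON strings do not cause false hits;
-   - For the remaining candidates, try XML parsing first (`<tool_call>` / `<function_call>` / `<invoke>` / `tool_use` / `antml:function_call`, etc.);
-   - JSON parsing (`{"tool_calls": [...]}`, lists, single objects);
-   - Markup parsing;
-   - Text-KV fallback (for example `function.name:` + `function.arguments:`).
-4. **Last resort**: after every candidate fails, apply the XML / Text-KV fallback to the full text.
+In the current version, executable parsing is centered on the **XML/Markup family**:
 
-## 3) XML capability boundary (current)
+- `<tool_call>...</tool_call>`
+- `<function_call>...</function_call>`
+- `<invoke ...>...</invoke>` (including self-closing)
+- `<tool_use>...</tool_use>`
+- antml variants (such as `antml:function_call` / `antml:argument`)
 
-Input-side parsing of multiple XML/markup styles is currently supported, including but not limited to:
+Inside these markup blocks, the following are also parsed:
 
-- `<tool_call><tool_name>...</tool_name><parameters>...</parameters></tool_call>`
-- `<function_call>tool</function_call><function parameter name="x">...</function parameter>`
-- `<invoke name="tool"><parameter name="x">...</parameter></invoke>`
-- `antml:function_call` / `antml:argument` / `antml:parameters`
-- `tool_use` family tags
+- JSON argument strings
+- Tag parameters (`<parameter name="...">...`)
+- key/value-style child tags
 
-However, **the output side is still uniformly converted to OpenAI-compatible JSON events/objects** (`message.tool_calls`, `delta.tool_calls`, `response.function_call_arguments.*`).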
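+## 3) Behavior that should no longer be assumed
 
-## 4) On "can tool calls be wrapped in XML and fed to the model"
+The following statements no longer hold in the current implementation:
 
-Conclusion: **yes, and the current parser already accepts XML as one of its input formats**, but the code has no `toolcall.prefer_xml_output` switch. The only tunable settings are:
+1. "A pure JSON `tool_calls` fragment is parsed directly as an executable tool call."
+2. "Configurable switches such as `toolcall.mode` / `toolcall.early_emit_confidence` exist and can change the parsing strategy."
 
-- `toolcall.mode`: `feature_match` / `off`
-- `toolcall.early_emit_confidence`: `high` / `low` / `off`
+The current strategy is fixed in code as:
 
-The recommended approach is still "input compatibility layer + output rendered per client protocol":
+- Feature matching enabled (feature-match on)
+- High-confidence early emit enabled (early emit on)
+- Policy rejection fields are kept but not enabled
 
-1. **Prompt constraint layer**: to try XML-first, constrain the model in the system prompt to emit a well-formed XML tool block (for example `<tool_calls><tool_call>...</tool_call></tool_calls>`).
-2. **Parsing compatibility layer**: keep accepting JSON / XML / ANTML / invoke / text-kv in the parser.
-3. **Protocol normalization layer**: whatever format the model emits, normalize it into the internal `ParsedToolCall`.
-4. **External rendering layer**: render according to the client's request protocol (OpenAI / Claude / Gemini, each in its own format).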
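+## 4) Streaming and anti-leak semantics
 
-This gives you all of the following at once:
+In the streaming path (one shared core for OpenAI / Claude / Gemini):
 
-- Fewer JSON escaping/quoting errors on the model side;
-- No breakage for the existing SDK / client ecosystem;
-- Gradual rollout (switchable per model, per tenant, per request).
+- Tool-call fragments are extracted first as structured incremental output;
+- Raw fragments of recognized tool calls are not emitted again as plain text;
+- Example content inside fenced code blocks is treated as text, not as executable tool calls.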

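-## 5) Rollout recommendations (low-risk iteration)
+## 5) Rollout recommendations (for the current implementation)
 
-- Keep using the existing `toolcall.mode=feature_match` and `toolcall.early_emit_confidence=high` as the default strategy.
-- To try XML-first, do it in the prompt layer or the upstream template layer; do not assume the code already has a dedicated XML output switch.
-- Add observability metrics:
-  - `toolcall_parse_source` (json/xml/markup/textkv);
-  - `toolcall_parse_success_rate`;
-  - `toolcall_malformed_rate`;
-  - `toolcall_repair_rate`.
-- Roll out on the `responses` path first, then extend to `chat.completions`.
+1. In the prompt, prefer constraining the model to emit XML/Markup tool blocks.
+2. On the executor side, keep enforcing a tool-name allow-list and argument schema validation (do not rely on the parser as a substitute for security policy).
+3. To stay compatible with legacy "pure JSON tool_calls" model output, normalize it to XML/Markup style in the upstream template layer before it reaches DS2API.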

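-## 6) Compatibility notes
+## 6) Regression verification
 
-- If the upstream model emits mixed text + XML, "semi-structured" noise can still appear, so the existing incremental sieve consumption strategy is still needed.
-- XML does not mean safe: server-side validation of tool names, argument schemas, and execution permissions is still required.
+Run directly:
+
+```bash
+go test -v -run 'TestParseToolCalls|TestRepair' ./internal/toolcall/
+node --test tests/node/stream-tool-sieve.test.js
+```
+
+Key coverage:
+
+- `<tool_call>` / `<function_call>` / `<invoke>` / `tool_use` / antml variants
+- Argument JSON repair and parsing
+- Tool-call extraction and text anti-leak under incremental streaming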
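To make the `name` + `input` shape from section 1 concrete, the sketch below parses a single `<invoke name="..."><parameter name="...">...</parameter></invoke>` block with Go's standard `encoding/xml`. It is an illustration under assumptions, not the code in `internal/toolcall`: `invokeBlock`, `parsedCall`, and `parseInvoke` are hypothetical names, and the real parser also handles the other tag families, JSON argument strings, and repair of malformed input.

```go
package main

import (
	"encoding/xml"
	"errors"
	"fmt"
	"strings"
)

// invokeBlock mirrors one <invoke> element; the struct has no XMLName field,
// so encoding/xml does not check the root element's name.
type invokeBlock struct {
	Name   string `xml:"name,attr"`
	Params []struct {
		Name  string `xml:"name,attr"`
		Value string `xml:",chardata"`
	} `xml:"parameter"`
}

// parsedCall is a hypothetical stand-in for the internal name + input pair.
type parsedCall struct {
	Name  string
	Input map[string]string
}

// parseInvoke decodes one invoke block and collects its parameters by name.
func parseInvoke(block string) (parsedCall, error) {
	var inv invokeBlock
	if err := xml.Unmarshal([]byte(block), &inv); err != nil {
		return parsedCall{}, err
	}
	if inv.Name == "" {
		return parsedCall{}, errors.New("invoke block has no name attribute")
	}
	call := parsedCall{Name: inv.Name, Input: map[string]string{}}
	for _, p := range inv.Params {
		call.Input[p.Name] = strings.TrimSpace(p.Value)
	}
	return call, nil
}

func main() {
	call, err := parseInvoke(`<invoke name="search"><parameter name="q">go generics</parameter></invoke>`)
	fmt.Println(call.Name, call.Input, err)
}
```

A strict XML decoder like this rejects truncated blocks outright, which is why a production parser that must cope with partially streamed model output usually needs its own tolerant scanning on top.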
--- a/go.mod
+++ b/go.mod
@@ -18,7 +18,7 @@ require (
 	github.com/tidwall/pretty v1.2.1 // indirect
 	github.com/tidwall/sjson v1.2.5 // indirect
 	golang.org/x/crypto v0.49.0 // indirect
-	golang.org/x/net v0.52.0 // indirect
+	golang.org/x/net v0.52.0
 	golang.org/x/sys v0.42.0 // indirect
 	gopkg.in/yaml.v3 v3.0.1 // indirect
 )
--- a/internal/adapter/claude/handler_stream_test.go
+++ b/internal/adapter/claude/handler_stream_test.go
@@ -138,77 +138,6 @@ func TestHandleClaudeStreamRealtimeThinkingDelta(t *testing.T) {
 	}
 }

-func TestHandleClaudeStreamRealtimeToolSafety(t *testing.T) {
-	h := &Handler{}
-	resp := makeClaudeSSEHTTPResponse(
-		`data: {"p":"response/content","v":"{\"tool_calls\":[{\"name\":\"search\""}`,
-		`data: {"p":"response/content","v":",\"input\":{\"q\":\"go\"}}]}"}`,
-		`data: [DONE]`,
-	)
-	rec := httptest.NewRecorder()
-	req := httptest.NewRequest(http.MethodPost, "/anthropic/v1/messages", nil)
-
-	h.handleClaudeStreamRealtime(rec, req, resp, "claude-sonnet-4-5", []any{map[string]any{"role": "user", "content": "use tool"}}, false, false, []string{"search"})
-
-	frames := parseClaudeFrames(t, rec.Body.String())
-	for _, f := range findClaudeFrames(frames, "content_block_delta") {
-		delta, _ := f.Payload["delta"].(map[string]any)
-		if delta["type"] == "text_delta" && strings.Contains(asString(delta["text"]), `"tool_calls"`) {
-			t.Fatalf("raw tool_calls JSON leaked in text delta: body=%s", rec.Body.String())
-		}
-	}
-
-	foundToolUse := false
-	for _, f := range findClaudeFrames(frames, "content_block_start") {
-		contentBlock, _ := f.Payload["content_block"].(map[string]any)
-		if contentBlock["type"] == "tool_use" {
-			foundToolUse = true
-			break
-		}
-	}
-	if !foundToolUse {
-		t.Fatalf("expected tool_use block in stream, body=%s", rec.Body.String())
-	}
-
-	foundToolUseStop := false
-	for _, f := range findClaudeFrames(frames, "message_delta") {
-		delta, _ := f.Payload["delta"].(map[string]any)
-		if delta["stop_reason"] == "tool_use" {
-			foundToolUseStop = true
-			break
-		}
-	}
-	if !foundToolUseStop {
-		t.Fatalf("expected stop_reason=tool_use, body=%s", rec.Body.String())
-	}
-}
-
-func TestHandleClaudeStreamRealtimeToolDetectionFromThinkingFallback(t *testing.T) {
-	h := &Handler{}
-	resp := makeClaudeSSEHTTPResponse(
-		`data: {"p":"response/thinking_content","v":"{\"tool_calls\":[{\"name\":\"search\""}`,
-		`data: {"p":"response/thinking_content","v":",\"input\":{\"q\":\"go\"}}]}"}`,
-		`data: [DONE]`,
-	)
-	rec := httptest.NewRecorder()
-	req := httptest.NewRequest(http.MethodPost, "/anthropic/v1/messages", nil)
-
-	h.handleClaudeStreamRealtime(rec, req, resp, "claude-sonnet-4-5", []any{map[string]any{"role": "user", "content": "use tool"}}, true, false, []string{"search"})
-
-	frames := parseClaudeFrames(t, rec.Body.String())
-	foundToolUse := false
-	for _, f := range findClaudeFrames(frames, "content_block_start") {
-		contentBlock, _ := f.Payload["content_block"].(map[string]any)
-		if contentBlock["type"] == "tool_use" && contentBlock["name"] == "search" {
-			foundToolUse = true
-			break
-		}
-	}
-	if !foundToolUse {
-		t.Fatalf("expected tool_use block from thinking fallback, body=%s", rec.Body.String())
-	}
-}
-
 func TestHandleClaudeStreamRealtimeSkipsThinkingFallbackWhenFinalTextExists(t *testing.T) {
 	h := &Handler{}
 	resp := makeClaudeSSEHTTPResponse(
--- a/internal/adapter/claude/handler_util_test.go
+++ b/internal/adapter/claude/handler_util_test.go
@@ -96,7 +96,7 @@ func TestNormalizeClaudeMessagesToolUseToAssistantToolCalls(t *testing.T) {
 	if !containsStr(content, "<tool_calls>") || !containsStr(content, "<tool_name>search_web</tool_name>") {
 		t.Fatalf("expected assistant content to include XML tool call history, got %q", content)
 	}
-	if !containsStr(content, `<parameters>{"query":"latest"}</parameters>`) {
+	if !containsStr(content, "<parameters>\n      <query><![CDATA[latest]]></query>\n    </parameters>") {
 		t.Fatalf("expected assistant content to include serialized parameters, got %q", content)
 	}
 }
--- a/internal/adapter/claude/proxy_vercel_test.go
+++ b/internal/adapter/claude/proxy_vercel_test.go
@@ -34,11 +34,13 @@ func (s openAIProxyStub) ChatCompletions(w http.ResponseWriter, _ *http.Request)

 type openAIProxyCaptureStub struct {
 	seenModel string
+	seenReq   map[string]any
 }

 func (s *openAIProxyCaptureStub) ChatCompletions(w http.ResponseWriter, r *http.Request) {
 	var req map[string]any
 	_ = json.NewDecoder(r.Body).Decode(&req)
+	s.seenReq = req
 	if m, ok := req["model"].(string); ok {
 		s.seenModel = m
 	}
@@ -84,3 +86,33 @@ func TestClaudeProxyViaOpenAIPreservesClaudeMapping(t *testing.T) {
 		t.Fatalf("expected mapped proxy model deepseek-reasoner, got %q", got)
 	}
 }
+
+func TestClaudeProxyTranslatesInlineImageToOpenAIDataURL(t *testing.T) {
+	openAI := &openAIProxyCaptureStub{}
+	h := &Handler{OpenAI: openAI}
+	req := httptest.NewRequest(http.MethodPost, "/anthropic/v1/messages", strings.NewReader(`{"model":"claude-sonnet-4-5","messages":[{"role":"user","content":[{"type":"text","text":"hello"},{"type":"image","source":{"type":"base64","media_type":"image/png","data":"QUJDRA=="}}]}],"stream":false}`))
+	rec := httptest.NewRecorder()
+
+	h.Messages(rec, req)
+
+	if rec.Code != http.StatusOK {
+		t.Fatalf("unexpected status: %d body=%s", rec.Code, rec.Body.String())
+	}
+	messages, _ := openAI.seenReq["messages"].([]any)
+	if len(messages) != 1 {
+		t.Fatalf("expected one translated message, got %#v", openAI.seenReq)
+	}
+	msg, _ := messages[0].(map[string]any)
+	content, _ := msg["content"].([]any)
+	if len(content) != 2 {
+		t.Fatalf("expected translated content blocks, got %#v", msg)
+	}
+	imageBlock, _ := content[1].(map[string]any)
+	if strings.TrimSpace(asString(imageBlock["type"])) != "image_url" {
+		t.Fatalf("expected image_url block, got %#v", imageBlock)
+	}
+	imageURL, _ := imageBlock["image_url"].(map[string]any)
+	if !strings.HasPrefix(strings.TrimSpace(asString(imageURL["url"])), "data:image/png;base64,") {
+		t.Fatalf("expected translated data url, got %#v", imageBlock)
+	}
+}
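
Note on the test above: it only pins down the shape the proxy must produce, an OpenAI `image_url` block whose `url` is a `data:<media_type>;base64,<data>` string. The translation code itself is not part of this excerpt, so the following is a minimal sketch of such a conversion under that assumption. The helper name `claudeImageToOpenAI` is hypothetical and is not the project's actual function.

```go
package main

import "fmt"

// claudeImageToOpenAI is a hypothetical helper (not taken from the project).
// It converts an Anthropic-style base64 image block into an OpenAI-style
// image_url block carrying a data URL. It reports false when the block is
// not a base64 image or lacks the fields needed to build the URL.
func claudeImageToOpenAI(block map[string]any) (map[string]any, bool) {
	if t, _ := block["type"].(string); t != "image" {
		return nil, false
	}
	// Indexing a nil map is safe in Go, so a missing "source" simply
	// yields empty strings below.
	src, _ := block["source"].(map[string]any)
	if st, _ := src["type"].(string); st != "base64" {
		return nil, false
	}
	mediaType, _ := src["media_type"].(string)
	data, _ := src["data"].(string)
	if mediaType == "" || data == "" {
		return nil, false
	}
	return map[string]any{
		"type": "image_url",
		"image_url": map[string]any{
			"url": "data:" + mediaType + ";base64," + data,
		},
	}, true
}

func main() {
	out, ok := claudeImageToOpenAI(map[string]any{
		"type": "image",
		"source": map[string]any{
			"type":       "base64",
			"media_type": "image/png",
			"data":       "QUJDRA==",
		},
	})
	fmt.Println(ok, out["type"], out["image_url"])
}
```

The sketch leaves text blocks and URL-sourced images untouched by returning `false`, so a caller could fall back to passing the block through unchanged.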
--- a/internal/adapter/claude/standard_request.go
+++ b/internal/adapter/claude/standard_request.go
@@ -36,7 +36,7 @@ func normalizeClaudeRequest(store ConfigReader, req map[string]any) (claudeNorma
 		thinkingEnabled = false
 		searchEnabled = false
 	}
-	finalPrompt := deepseek.MessagesPrepare(toMessageMaps(dsPayload["messages"]))
+	finalPrompt := deepseek.MessagesPrepareWithThinking(toMessageMaps(dsPayload["messages"]), thinkingEnabled)
 	toolNames := extractClaudeToolNames(toolsRequested)
 	if len(toolNames) == 0 && len(toolsRequested) > 0 {
 		toolNames = []string{"__any_tool__"}
--- a/internal/adapter/gemini/convert_request.go
+++ b/internal/adapter/gemini/convert_request.go
@@ -28,7 +28,7 @@ func normalizeGeminiRequest(store ConfigReader, routeModel string, req map[strin
 	}

 	toolsRaw := convertGeminiTools(req["tools"])
-	finalPrompt, toolNames := openai.BuildPromptForAdapter(messagesRaw, toolsRaw, "")
+	finalPrompt, toolNames := openai.BuildPromptForAdapter(messagesRaw, toolsRaw, "", thinkingEnabled)
 	passThrough := collectGeminiPassThrough(req)

 	return util.StandardRequest{
--- a/internal/adapter/gemini/handler_test.go
+++ b/internal/adapter/gemini/handler_test.go
@@ -82,11 +82,17 @@ func (s geminiOpenAIErrorStub) ChatCompletions(w http.ResponseWriter, _ *http.Re
 }

 type geminiOpenAISuccessStub struct {
-	stream bool
-	body   string
+	stream  bool
+	body    string
+	seenReq map[string]any
 }

-func (s geminiOpenAISuccessStub) ChatCompletions(w http.ResponseWriter, _ *http.Request) {
+func (s *geminiOpenAISuccessStub) ChatCompletions(w http.ResponseWriter, r *http.Request) {
+	if r != nil {
+		var req map[string]any
+		_ = json.NewDecoder(r.Body).Decode(&req)
+		s.seenReq = req
+	}
 	if s.stream {
 		w.Header().Set("Content-Type", "text/event-stream")
 		w.WriteHeader(http.StatusOK)
@@ -144,7 +150,7 @@ func TestGeminiRoutesRegistered(t *testing.T) {
 func TestGenerateContentReturnsFunctionCallParts(t *testing.T) {
 	h := &Handler{
 		Store: testGeminiConfig{},
-		OpenAI: geminiOpenAISuccessStub{
+		OpenAI: &geminiOpenAISuccessStub{
 			body: `{"id":"chatcmpl-1","object":"chat.completion","choices":[{"index":0,"message":{"role":"assistant","tool_calls":[{"id":"call_1","type":"function","function":{"name":"eval_javascript","arguments":"{\"code\":\"1+1\"}"}}]},"finish_reason":"tool_calls"}]}`,
 		},
 	}
@@ -184,7 +190,7 @@ func TestGenerateContentReturnsFunctionCallParts(t *testing.T) {
 }

 func TestGenerateContentMixedToolSnippetAlsoTriggersFunctionCall(t *testing.T) {
-	h := &Handler{Store: testGeminiConfig{}, OpenAI: geminiOpenAISuccessStub{}}
+	h := &Handler{Store: testGeminiConfig{}, OpenAI: &geminiOpenAISuccessStub{}}
 	r := chi.NewRouter()
 	RegisterRoutes(r, h)

@@ -217,7 +223,7 @@ func TestGenerateContentMixedToolSnippetAlsoTriggersFunctionCall(t *testing.T) {
 func TestStreamGenerateContentEmitsSSE(t *testing.T) {
 	h := &Handler{
 		Store:  testGeminiConfig{},
-		OpenAI: geminiOpenAISuccessStub{stream: true},
+		OpenAI: &geminiOpenAISuccessStub{stream: true},
 	}
 	r := chi.NewRouter()
 	RegisterRoutes(r, h)
@@ -251,6 +257,39 @@ func TestStreamGenerateContentEmitsSSE(t *testing.T) {
 	}
 }

+func TestGeminiProxyTranslatesInlineImageToOpenAIDataURL(t *testing.T) {
+	openAI := &geminiOpenAISuccessStub{}
+	h := &Handler{Store: testGeminiConfig{}, OpenAI: openAI}
+	r := chi.NewRouter()
+	RegisterRoutes(r, h)
+
+	body := `{"contents":[{"role":"user","parts":[{"text":"hello"},{"inlineData":{"mimeType":"image/png","data":"QUJDRA=="}}]}]}`
+	req := httptest.NewRequest(http.MethodPost, "/v1beta/models/gemini-2.5-pro:generateContent", strings.NewReader(body))
+	rec := httptest.NewRecorder()
+	r.ServeHTTP(rec, req)
+
+	if rec.Code != http.StatusOK {
+		t.Fatalf("expected 200, got %d body=%s", rec.Code, rec.Body.String())
+	}
+	messages, _ := openAI.seenReq["messages"].([]any)
+	if len(messages) != 1 {
+		t.Fatalf("expected one translated message, got %#v", openAI.seenReq)
+	}
+	msg, _ := messages[0].(map[string]any)
+	content, _ := msg["content"].([]any)
+	if len(content) != 2 {
+		t.Fatalf("expected translated content blocks, got %#v", msg)
+	}
+	imageBlock, _ := content[1].(map[string]any)
+	if strings.TrimSpace(asString(imageBlock["type"])) != "image_url" {
+		t.Fatalf("expected image_url block, got %#v", imageBlock)
+	}
+	imageURL, _ := imageBlock["image_url"].(map[string]any)
+	if !strings.HasPrefix(strings.TrimSpace(asString(imageURL["url"])), "data:image/png;base64,") {
+		t.Fatalf("expected translated data url, got %#v", imageBlock)
+	}
+}
+
 func TestGenerateContentOpenAIProxyErrorUsesGeminiEnvelope(t *testing.T) {
 	h := &Handler{
 		Store: testGeminiConfig{},
--- a/internal/adapter/openai/chat_stream_runtime.go
+++ b/internal/adapter/openai/chat_stream_runtime.go
@@ -98,6 +98,19 @@ func (s *chatStreamRuntime) sendDone() {
 	}
 }

+func (s *chatStreamRuntime) sendFailedChunk(status int, message, code string) {
+	s.sendChunk(map[string]any{
+		"status_code": status,
+		"error": map[string]any{
+			"message": message,
+			"type":    openAIErrorType(status),
+			"code":    code,
+			"param":   nil,
+		},
+	})
+	s.sendDone()
+}
+
 func (s *chatStreamRuntime) finalize(finishReason string) {
 	finalThinking := s.thinking.String()
 	finalText := cleanVisibleOutput(s.text.String(), s.stripReferenceMarkers)
@@ -168,6 +181,21 @@ func (s *chatStreamRuntime) finalize(finishReason string) {
 	if len(detected.Calls) > 0 || s.toolCallsEmitted {
 		finishReason = "tool_calls"
 	}
+	if len(detected.Calls) == 0 && !s.toolCallsEmitted && strings.TrimSpace(finalText) == "" {
+		status := http.StatusTooManyRequests
+		message := "Upstream model returned empty output."
+		code := "upstream_empty_output"
+		if strings.TrimSpace(finalThinking) != "" {
+			message = "Upstream model returned reasoning without visible output."
+		}
+		if finishReason == "content_filter" {
+			status = http.StatusBadRequest
+			message = "Upstream content filtered the response and returned no output."
+			code = "content_filter"
+		}
+		s.sendFailedChunk(status, message, code)
+		return
+	}
 	usage := openaifmt.BuildChatUsage(s.finalPrompt, finalThinking, finalText)
 	s.sendChunk(openaifmt.BuildChatStreamChunk(
 		s.completionID,
@@ -184,6 +212,9 @@ func (s *chatStreamRuntime) onParsed(parsed sse.LineResult) streamengine.ParsedD
 		return streamengine.ParsedDecision{}
 	}
 	if parsed.ContentFilter {
+		if strings.TrimSpace(s.text.String()) == "" {
+			return streamengine.ParsedDecision{Stop: true, StopReason: streamengine.StopReason("content_filter")}
+		}
 		return streamengine.ParsedDecision{Stop: true, StopReason: streamengine.StopReasonHandlerRequested}
 	}
 	if parsed.ErrorMessage != "" {
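
Note on the `finalize` change above: the new branch turns a stream that ends with no visible text and no tool calls into an error chunk. The decision can be read as a small pure function. The sketch below restates it outside the runtime. The type `emptyOutputError` and the function `classifyEmptyOutput` are illustrative names, not identifiers from the project; the statuses, messages, and codes are copied from the diff.

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
)

// emptyOutputError is an illustrative container for the three values the
// diff passes to sendFailedChunk.
type emptyOutputError struct {
	Status  int
	Message string
	Code    string
}

// classifyEmptyOutput returns nil when the stream produced something usable
// (visible text or tool calls). Otherwise it returns the error to report,
// following the same precedence as the diff: content_filter overrides the
// generic empty-output case.
func classifyEmptyOutput(finalText, finalThinking, finishReason string, hasToolCalls bool) *emptyOutputError {
	if hasToolCalls || strings.TrimSpace(finalText) != "" {
		return nil
	}
	e := &emptyOutputError{
		Status:  http.StatusTooManyRequests,
		Message: "Upstream model returned empty output.",
		Code:    "upstream_empty_output",
	}
	if strings.TrimSpace(finalThinking) != "" {
		e.Message = "Upstream model returned reasoning without visible output."
	}
	if finishReason == "content_filter" {
		e.Status = http.StatusBadRequest
		e.Message = "Upstream content filtered the response and returned no output."
		e.Code = "content_filter"
	}
	return e
}

func main() {
	for _, e := range []*emptyOutputError{
		classifyEmptyOutput("", "", "stop", false),
		classifyEmptyOutput("", "thinking...", "stop", false),
		classifyEmptyOutput("", "", "content_filter", false),
	} {
		fmt.Println(e.Status, e.Code, e.Message)
	}
}
```

Keeping the classification separate from the chunk writer would make each status and code pair testable without constructing a stream, which is the main reason to factor it this way.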
--- a/internal/adapter/openai/deps.go
+++ b/internal/adapter/openai/deps.go
@@ -18,6 +18,7 @@ type AuthResolver interface {
 type DeepSeekCaller interface {
 	CreateSession(ctx context.Context, a *auth.RequestAuth, maxAttempts int) (string, error)
 	GetPow(ctx context.Context, a *auth.RequestAuth, maxAttempts int) (string, error)
+	UploadFile(ctx context.Context, a *auth.RequestAuth, req deepseek.UploadFileRequest, maxAttempts int) (*deepseek.UploadFileResult, error)
 	CallCompletion(ctx context.Context, a *auth.RequestAuth, payload map[string]any, powResp string, maxAttempts int) (*http.Response, error)
 	DeleteSessionForToken(ctx context.Context, token string, sessionID string) (*deepseek.DeleteSessionResult, error)
 	DeleteAllSessionsForToken(ctx context.Context, token string) error
--- a/internal/adapter/openai/embeddings_handler.go
+++ b/internal/adapter/openai/embeddings_handler.go
@@ -26,8 +26,13 @@ func (h *Handler) Embeddings(w http.ResponseWriter, r *http.Request) {
 	}
 	defer h.Auth.Release(a)

+	r.Body = http.MaxBytesReader(w, r.Body, openAIGeneralMaxSize)
 	var req map[string]any
 	if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
+		if strings.Contains(strings.ToLower(err.Error()), "too large") {
+			writeOpenAIError(w, http.StatusRequestEntityTooLarge, "request body too large")
+			return
+		}
 		writeOpenAIError(w, http.StatusBadRequest, "invalid json")
 		return
 	}
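
Note on the size check above: the handler detects an oversized body by searching the error text for "too large". As a possible alternative, and assuming Go 1.19 or newer, `http.MaxBytesReader` reports the condition as a `*http.MaxBytesError`, which `errors.As` can match without depending on the message wording. The sketch below is not the project's code; `decodeLimitedJSON` and `statusForBody` are hypothetical names used only for illustration.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// decodeLimitedJSON caps the request body at limit bytes, decodes it into
// dst, and returns the HTTP status a handler would respond with.
func decodeLimitedJSON(w http.ResponseWriter, r *http.Request, limit int64, dst any) int {
	r.Body = http.MaxBytesReader(w, r.Body, limit)
	if err := json.NewDecoder(r.Body).Decode(dst); err != nil {
		// MaxBytesReader surfaces the overflow as *http.MaxBytesError
		// (available since Go 1.19).
		var tooLarge *http.MaxBytesError
		if errors.As(err, &tooLarge) {
			return http.StatusRequestEntityTooLarge
		}
		return http.StatusBadRequest
	}
	return http.StatusOK
}

// statusForBody is a small driver that runs decodeLimitedJSON against an
// in-memory request.
func statusForBody(body string, limit int64) int {
	req := httptest.NewRequest(http.MethodPost, "/v1/embeddings", strings.NewReader(body))
	rec := httptest.NewRecorder()
	var dst map[string]any
	return decodeLimitedJSON(rec, req, limit, &dst)
}

func main() {
	fmt.Println(statusForBody(`{"input":"hi"}`, 1024))
	fmt.Println(statusForBody(`{"input":"this body is longer than sixteen bytes"}`, 16))
	fmt.Println(statusForBody(`not json`, 1024))
}
```

The string match in the diff works on older toolchains as well, so the typed check is only worth adopting if the module's minimum Go version allows it.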
--- a/internal/adapter/openai/file_inline_upload.go
+++ b/internal/adapter/openai/file_inline_upload.go
@@ -0,0 +1,382 @@
+package openai
+
+import (
+	"context"
+	"crypto/sha256"
+	"encoding/base64"
+	"fmt"
+	"mime"
+	"net/http"
+	"net/url"
+	"path/filepath"
+	"strings"
+
+	"ds2api/internal/auth"
+	"ds2api/internal/deepseek"
+)
+
+const maxInlineFilesPerRequest = 50
+
+type inlineFileUploadError struct {
+	status  int
+	message string
+	err     error
+}
+
+func (e *inlineFileUploadError) Error() string {
+	if e == nil {
+		return ""
+	}
+	if strings.TrimSpace(e.message) != "" {
+		return e.message
+	}
+	if e.err != nil {
+		return e.err.Error()
+	}
+	return "inline file processing failed"
+}
+
+type inlineUploadState struct {
+	ctx          context.Context
+	handler      *Handler
+	auth         *auth.RequestAuth
+	uploadedByID map[string]string
+	uploadCount  int
+}
+
+type inlineDecodedFile struct {
+	Data            []byte
+	ContentType     string
+	Filename        string
+	ReplacementType string
+}
+
+func (h *Handler) preprocessInlineFileInputs(ctx context.Context, a *auth.RequestAuth, req map[string]any) error {
+	if h == nil || h.DS == nil || len(req) == 0 {
+		return nil
+	}
+	state := &inlineUploadState{
+		ctx:          ctx,
+		handler:      h,
+		auth:         a,
+		uploadedByID: map[string]string{},
+	}
+	for _, key := range []string{"messages", "input", "attachments"} {
+		if raw, ok := req[key]; ok {
+			updated, err := state.walk(raw)
+			if err != nil {
+				return err
+			}
+			req[key] = updated
+		}
+	}
+	if refIDs := collectOpenAIRefFileIDs(req); len(refIDs) > 0 {
+		req["ref_file_ids"] = stringsToAnySlice(refIDs)
+	}
+	return nil
+}
+
+func writeOpenAIInlineFileError(w http.ResponseWriter, err error) {
+	inlineErr, ok := err.(*inlineFileUploadError)
+	if !ok || inlineErr == nil {
+		writeOpenAIError(w, http.StatusInternalServerError, "Failed to process file input.")
+		return
+	}
+	status := inlineErr.status
+	if status == 0 {
+		status = http.StatusInternalServerError
+	}
+	message := strings.TrimSpace(inlineErr.message)
+	if message == "" {
+		message = "Failed to process file input."
+	}
+	writeOpenAIError(w, status, message)
+}
+
+func (s *inlineUploadState) walk(raw any) (any, error) {
+	switch x := raw.(type) {
+	case []any:
+		out := make([]any, len(x))
+		for i, item := range x {
+			updated, err := s.walk(item)
+			if err != nil {
+				return nil, err
+			}
+			out[i] = updated
+		}
+		return out, nil
+	case map[string]any:
+		if replacement, replaced, err := s.tryUploadBlock(x); replaced || err != nil {
+			return replacement, err
+		}
+		for _, key := range []string{"messages", "input", "attachments", "content", "files", "items", "data", "source", "file", "image_url"} {
+			if nested, ok := x[key]; ok {
+				updated, err := s.walk(nested)
+				if err != nil {
+					return nil, err
+				}
+				x[key] = updated
+			}
+		}
+		return x, nil
+	default:
+		return raw, nil
+	}
+}
+
+func (s *inlineUploadState) tryUploadBlock(block map[string]any) (map[string]any, bool, error) {
+	decoded, ok, err := decodeOpenAIInlineFileBlock(block)
+	if err != nil {
+		return nil, true, &inlineFileUploadError{status: http.StatusBadRequest, message: err.Error(), err: err}
+	}
+	if !ok {
+		return nil, false, nil
+	}
+	if s.uploadCount >= maxInlineFilesPerRequest {
+		return nil, true, &inlineFileUploadError{status: http.StatusBadRequest, message: fmt.Sprintf("exceeded maximum of %d inline files per request", maxInlineFilesPerRequest)}
+	}
+	fileID, err := s.uploadInlineFile(decoded)
+	if err != nil {
+		return nil, true, &inlineFileUploadError{status: http.StatusInternalServerError, message: "Failed to upload inline file.", err: err}
+	}
+	s.uploadCount++
+	replacement := map[string]any{
+		"type":    decoded.ReplacementType,
+		"file_id": fileID,
+	}
+	if decoded.Filename != "" {
+		replacement["filename"] = decoded.Filename
+	}
+	if decoded.ContentType != "" {
+		replacement["mime_type"] = decoded.ContentType
+	}
+	return replacement, true, nil
+}
+
+func (s *inlineUploadState) uploadInlineFile(file inlineDecodedFile) (string, error) {
+	sum := sha256.Sum256(append([]byte(file.ContentType+"\x00"+file.Filename+"\x00"), file.Data...))
+	cacheKey := fmt.Sprintf("%x", sum[:])
+	if fileID, ok := s.uploadedByID[cacheKey]; ok && strings.TrimSpace(fileID) != "" {
+		return fileID, nil
+	}
+	contentType := strings.TrimSpace(file.ContentType)
+	if contentType == "" {
+		contentType = http.DetectContentType(file.Data)
+	}
+	result, err := s.handler.DS.UploadFile(s.ctx, s.auth, deepseek.UploadFileRequest{
+		Filename:    file.Filename,
+		ContentType: contentType,
+		Data:        file.Data,
+	}, 3)
+	if err != nil {
+		return "", err
+	}
+	fileID := strings.TrimSpace(result.ID)
+	if fileID == "" {
+		return "", fmt.Errorf("upload succeeded without file id")
+	}
+	s.uploadedByID[cacheKey] = fileID
+	return fileID, nil
+}
+
+func decodeOpenAIInlineFileBlock(block map[string]any) (inlineDecodedFile, bool, error) {
+	if block == nil {
+		return inlineDecodedFile{}, false, nil
+	}
+	if strings.TrimSpace(asString(block["file_id"])) != "" {
+		return inlineDecodedFile{}, false, nil
+	}
+	if nested, ok := block["file"].(map[string]any); ok {
+		decoded, matched, err := decodeOpenAIInlineFileBlock(nested)
+		if err != nil || !matched {
+			return decoded, matched, err
+		}
+		if decoded.Filename == "" {
+			decoded.Filename = pickInlineFilename(block, decoded.ContentType, defaultInlinePrefix(decoded.ReplacementType))
+		}
+		return decoded, true, nil
+	}
+	blockType := strings.ToLower(strings.TrimSpace(asString(block["type"])))
+	if raw, matched := extractInlineImageDataURL(block); matched {
+		data, contentType, err := decodeInlinePayload(raw, contentTypeFromMap(block))
+		if err != nil {
+			return inlineDecodedFile{}, true, fmt.Errorf("invalid image input")
+		}
+		return inlineDecodedFile{
+			Data:            data,
+			ContentType:     contentType,
+			Filename:        pickInlineFilename(block, contentType, "image"),
+			ReplacementType: "input_image",
+		}, true, nil
+	}
+	if raw, matched := extractInlineFilePayload(block, blockType); matched {
+		data, contentType, err := decodeInlinePayload(raw, contentTypeFromMap(block))
+		if err != nil {
+			return inlineDecodedFile{}, true, fmt.Errorf("invalid file input")
+		}
+		return inlineDecodedFile{
+			Data:            data,
+			ContentType:     contentType,
+			Filename:        pickInlineFilename(block, contentType, defaultInlinePrefix(blockType)),
+			ReplacementType: "input_file",
+		}, true, nil
+	}
+	return inlineDecodedFile{}, false, nil
+}
+
+func extractInlineImageDataURL(block map[string]any) (string, bool) {
+	imageURL := block["image_url"]
+	switch x := imageURL.(type) {
+	case string:
+		if isDataURL(x) {
+			return strings.TrimSpace(x), true
+		}
+	case map[string]any:
+		if raw := strings.TrimSpace(asString(x["url"])); isDataURL(raw) {
+			return raw, true
+		}
+	}
+	if raw := strings.TrimSpace(asString(block["url"])); isDataURL(raw) {
+		return raw, true
+	}
+	return "", false
+}
+
+func extractInlineFilePayload(block map[string]any, blockType string) (string, bool) {
+	for _, value := range []any{block["file_data"], block["base64"], block["data"]} {
+		if raw := strings.TrimSpace(asString(value)); raw != "" {
+			if strings.Contains(blockType, "file") || block["file_data"] != nil || block["filename"] != nil || block["file_name"] != nil || block["name"] != nil {
+				return raw, true
+			}
+		}
+	}
+	return "", false
+}
+
+func decodeInlinePayload(raw string, explicitContentType string) ([]byte, string, error) {
+	raw = strings.TrimSpace(raw)
+	if raw == "" {
+		return nil, "", fmt.Errorf("empty payload")
+	}
+	if isDataURL(raw) {
+		return decodeDataURL(raw, explicitContentType)
+	}
+	decoded, err := decodeBase64Flexible(raw)
+	if err != nil {
+		return nil, "", err
+	}
+	contentType := strings.TrimSpace(explicitContentType)
+	if contentType == "" && len(decoded) > 0 {
+		contentType = http.DetectContentType(decoded)
+	}
+	return decoded, contentType, nil
+}
+
+func decodeDataURL(raw string, explicitContentType string) ([]byte, string, error) {
+	raw = strings.TrimSpace(raw)
+	if !isDataURL(raw) {
+		return nil, "", fmt.Errorf("unsupported data url")
+	}
+	header, payload, ok := strings.Cut(raw, ",")
+	if !ok {
+		return nil, "", fmt.Errorf("invalid data url")
+	}
+	meta := strings.TrimSpace(header[len("data:"):])
+	contentType := strings.TrimSpace(explicitContentType)
+	if contentType == "" {
+		contentType = "application/octet-stream"
+		if meta != "" {
+			parts := strings.Split(meta, ";")
+			if len(parts) > 0 && strings.TrimSpace(parts[0]) != "" {
+				contentType = strings.TrimSpace(parts[0])
+			}
+		}
+	}
+	if strings.Contains(strings.ToLower(meta), ";base64") {
+		decoded, err := decodeBase64Flexible(payload)
+		if err != nil {
+			return nil, "", err
+		}
+		return decoded, contentType, nil
+	}
+	decoded, err := url.PathUnescape(payload)
+	if err != nil {
+		return nil, "", err
+	}
+	return []byte(decoded), contentType, nil
+}
+
+func decodeBase64Flexible(raw string) ([]byte, error) {
+	raw = strings.TrimSpace(raw)
+	for _, enc := range []*base64.Encoding{base64.StdEncoding, base64.RawStdEncoding, base64.URLEncoding, base64.RawURLEncoding} {
+		decoded, err := enc.DecodeString(raw)
+		if err == nil {
+			return decoded, nil
+		}
+	}
+	return nil, fmt.Errorf("invalid base64 payload")
+}
+
+func contentTypeFromMap(block map[string]any) string {
+	for _, value := range []any{block["mime_type"], block["mimeType"], block["content_type"], block["contentType"], block["media_type"], block["mediaType"]} {
+		if contentType := strings.TrimSpace(asString(value)); contentType != "" {
+			return contentType
+		}
+	}
+	if imageURL, ok := block["image_url"].(map[string]any); ok {
+		for _, value := range []any{imageURL["mime_type"], imageURL["mimeType"], imageURL["content_type"], imageURL["contentType"]} {
+			if contentType := strings.TrimSpace(asString(value)); contentType != "" {
+				return contentType
+			}
+		}
+	}
+	return ""
+}
+
+func pickInlineFilename(block map[string]any, contentType string, prefix string) string {
+	for _, value := range []any{block["filename"], block["file_name"], block["name"]} {
+		if name := strings.TrimSpace(asString(value)); name != "" {
+			return filepath.Base(name)
+		}
+	}
+	if prefix == "" {
+		prefix = "upload"
+	}
+	ext := ".bin"
+	if parsedType := strings.TrimSpace(contentType); parsedType != "" {
+		if semi := strings.Index(parsedType, ";"); semi >= 0 {
+			parsedType = strings.TrimSpace(parsedType[:semi])
+		}
+		if exts, err := mime.ExtensionsByType(parsedType); err == nil && len(exts) > 0 && strings.TrimSpace(exts[0]) != "" {
+			ext = exts[0]
+		}
+	}
+	return prefix + ext
+}
+
+func defaultInlinePrefix(blockType string) string {
+	blockType = strings.ToLower(strings.TrimSpace(blockType))
+	if strings.Contains(blockType, "image") {
+		return "image"
+	}
+	return "upload"
+}
+
+func isDataURL(raw string) bool {
+	return strings.HasPrefix(strings.ToLower(strings.TrimSpace(raw)), "data:")
+}
+
+func stringsToAnySlice(items []string) []any {
+	out := make([]any, 0, len(items))
+	for _, item := range items {
+		trimmed := strings.TrimSpace(item)
+		if trimmed == "" {
+			continue
+		}
+		out = append(out, trimmed)
+	}
+	if len(out) == 0 {
+		return nil
+	}
+	return out
+}
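
Note on `decodeDataURL` and `decodeBase64Flexible` above: together they accept a `data:` URL, pick a content type from the header, and decode the payload as base64 (trying several alphabets) or as percent-encoded text. The following standalone sketch condenses that flow into one function so it can be run in isolation. It is a simplified restatement, not the project's code: `parseDataURL` is a hypothetical name and the explicit content-type override is omitted.

```go
package main

import (
	"encoding/base64"
	"errors"
	"fmt"
	"net/url"
	"strings"
)

// parseDataURL splits "data:<type>[;base64],<payload>" into decoded bytes
// and a content type. The type defaults to application/octet-stream when
// the header does not name one.
func parseDataURL(raw string) (data []byte, contentType string, err error) {
	raw = strings.TrimSpace(raw)
	if !strings.HasPrefix(strings.ToLower(raw), "data:") {
		return nil, "", errors.New("not a data url")
	}
	header, payload, ok := strings.Cut(raw, ",")
	if !ok {
		return nil, "", errors.New("missing comma separator")
	}
	// Slice by length so the scheme is removed regardless of its case.
	meta := header[len("data:"):]
	contentType = "application/octet-stream"
	if first, _, _ := strings.Cut(meta, ";"); strings.TrimSpace(first) != "" {
		contentType = strings.TrimSpace(first)
	}
	if strings.Contains(strings.ToLower(meta), ";base64") {
		// Try padded and unpadded variants of both base64 alphabets.
		for _, enc := range []*base64.Encoding{
			base64.StdEncoding, base64.RawStdEncoding,
			base64.URLEncoding, base64.RawURLEncoding,
		} {
			if decoded, decErr := enc.DecodeString(payload); decErr == nil {
				return decoded, contentType, nil
			}
		}
		return nil, "", errors.New("invalid base64 payload")
	}
	text, err := url.PathUnescape(payload)
	if err != nil {
		return nil, "", err
	}
	return []byte(text), contentType, nil
}

func main() {
	data, ct, err := parseDataURL("data:image/png;base64,QUJDRA==")
	fmt.Printf("%q %s %v\n", data, ct, err)
}
```

`url.PathUnescape` is used for the non-base64 branch, as in the diff, because it leaves `+` untouched, whereas `url.QueryUnescape` would turn it into a space.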
--- a/internal/adapter/openai/file_inline_upload_test.go
+++ b/internal/adapter/openai/file_inline_upload_test.go
@@ -0,0 +1,274 @@
+package openai
+
+import (
+	"context"
+	"encoding/json"
+	"errors"
+	"net/http"
+	"net/http/httptest"
+	"strings"
+	"testing"
+
+	"github.com/go-chi/chi/v5"
+
+	"ds2api/internal/auth"
+	"ds2api/internal/deepseek"
+)
+
+type inlineUploadDSStub struct {
+	uploadCalls    []deepseek.UploadFileRequest
+	lastCtx        context.Context
+	completionReq  map[string]any
+	createSession  string
+	uploadErr      error
+	completionResp *http.Response
+}
+
+func (m *inlineUploadDSStub) CreateSession(_ context.Context, _ *auth.RequestAuth, _ int) (string, error) {
+	if strings.TrimSpace(m.createSession) == "" {
+		return "session-id", nil
+	}
+	return m.createSession, nil
+}
+
+func (m *inlineUploadDSStub) GetPow(_ context.Context, _ *auth.RequestAuth, _ int) (string, error) {
+	return "pow", nil
+}
+
+func (m *inlineUploadDSStub) UploadFile(ctx context.Context, _ *auth.RequestAuth, req deepseek.UploadFileRequest, _ int) (*deepseek.UploadFileResult, error) {
+	m.lastCtx = ctx
+	m.uploadCalls = append(m.uploadCalls, req)
+	if m.uploadErr != nil {
+		return nil, m.uploadErr
+	}
+	return &deepseek.UploadFileResult{
+		ID:       "file-inline-1",
+		Filename: req.Filename,
+		Bytes:    int64(len(req.Data)),
+		Status:   "uploaded",
+		Purpose:  req.Purpose,
+	}, nil
+}
+
+func (m *inlineUploadDSStub) CallCompletion(_ context.Context, _ *auth.RequestAuth, payload map[string]any, _ string, _ int) (*http.Response, error) {
+	m.completionReq = payload
+	if m.completionResp != nil {
+		return m.completionResp, nil
+	}
+	return makeOpenAISSEHTTPResponse(
+		`data: {"p":"response/content","v":"ok"}`,
+		`data: [DONE]`,
+	), nil
+}
+
+func (m *inlineUploadDSStub) DeleteSessionForToken(_ context.Context, _ string, _ string) (*deepseek.DeleteSessionResult, error) {
+	return &deepseek.DeleteSessionResult{Success: true}, nil
+}
+
+func (m *inlineUploadDSStub) DeleteAllSessionsForToken(_ context.Context, _ string) error {
+	return nil
+}
+
+func TestPreprocessInlineFileInputsReplacesDataURLAndCollectsRefFileIDs(t *testing.T) {
+	ds := &inlineUploadDSStub{}
+	h := &Handler{DS: ds}
+	req := map[string]any{
+		"messages": []any{
+			map[string]any{
+				"role": "user",
+				"content": []any{
+					map[string]any{
+						"type":      "image_url",
+						"image_url": map[string]any{"url": "data:image/png;base64,QUJDRA=="},
+					},
+				},
+			},
+		},
+	}
+	ctx, cancel := context.WithCancel(context.Background())
+	defer cancel()
+
+	if err := h.preprocessInlineFileInputs(ctx, &auth.RequestAuth{DeepSeekToken: "token"}, req); err != nil {
+		t.Fatalf("preprocess failed: %v", err)
+	}
+	if len(ds.uploadCalls) != 1 {
+		t.Fatalf("expected 1 upload, got %d", len(ds.uploadCalls))
+	}
+	if ds.lastCtx != ctx {
+		t.Fatalf("expected upload to use request context")
+	}
+	if ds.uploadCalls[0].ContentType != "image/png" {
+		t.Fatalf("expected image/png, got %q", ds.uploadCalls[0].ContentType)
+	}
+	if ds.uploadCalls[0].Filename != "image.png" {
+		t.Fatalf("expected inferred filename image.png, got %q", ds.uploadCalls[0].Filename)
+	}
+	messages, _ := req["messages"].([]any)
+	first, _ := messages[0].(map[string]any)
+	content, _ := first["content"].([]any)
+	block, _ := content[0].(map[string]any)
+	if block["type"] != "input_image" {
+		t.Fatalf("expected input_image replacement, got %#v", block)
+	}
+	if block["file_id"] != "file-inline-1" {
+		t.Fatalf("expected file-inline-1 replacement id, got %#v", block)
+	}
+	refIDs, _ := req["ref_file_ids"].([]any)
+	if len(refIDs) != 1 || refIDs[0] != "file-inline-1" {
+		t.Fatalf("unexpected ref_file_ids: %#v", req["ref_file_ids"])
+	}
+}
+
+func TestPreprocessInlineFileInputsDeduplicatesIdenticalPayloads(t *testing.T) {
+	ds := &inlineUploadDSStub{}
+	h := &Handler{DS: ds}
+	req := map[string]any{
+		"messages": []any{
+			map[string]any{
+				"role": "user",
+				"content": []any{
+					map[string]any{"type": "image_url", "image_url": map[string]any{"url": "data:image/png;base64,QUJDRA=="}},
+					map[string]any{"type": "image_url", "image_url": map[string]any{"url": "data:image/png;base64,QUJDRA=="}},
+				},
+			},
+		},
+	}
+
+	if err := h.preprocessInlineFileInputs(context.Background(), &auth.RequestAuth{DeepSeekToken: "token"}, req); err != nil {
+		t.Fatalf("preprocess failed: %v", err)
+	}
+	if len(ds.uploadCalls) != 1 {
+		t.Fatalf("expected deduplicated single upload, got %d", len(ds.uploadCalls))
+	}
+	refIDs, _ := req["ref_file_ids"].([]any)
+	if len(refIDs) != 1 || refIDs[0] != "file-inline-1" {
+		t.Fatalf("unexpected ref_file_ids after dedupe: %#v", req["ref_file_ids"])
+	}
+}
+
+func TestChatCompletionsUploadsInlineFilesBeforeCompletion(t *testing.T) {
+	ds := &inlineUploadDSStub{}
+	h := &Handler{Store: mockOpenAIConfig{wideInput: true}, Auth: streamStatusAuthStub{}, DS: ds}
+	reqBody := `{"model":"deepseek-chat","messages":[{"role":"user","content":[{"type":"input_text","text":"hi"},{"type":"image_url","image_url":{"url":"data:image/png;base64,QUJDRA=="}}]}],"stream":false}`
+	req := httptest.NewRequest(http.MethodPost, "/v1/chat/completions", strings.NewReader(reqBody))
+	req.Header.Set("Authorization", "Bearer direct-token")
+	req.Header.Set("Content-Type", "application/json")
+	rec := httptest.NewRecorder()
+
+	h.ChatCompletions(rec, req)
+
+	if rec.Code != http.StatusOK {
+		t.Fatalf("expected 200, got %d body=%s", rec.Code, rec.Body.String())
+	}
+	if len(ds.uploadCalls) != 1 {
+		t.Fatalf("expected 1 upload call, got %d", len(ds.uploadCalls))
+	}
+	if ds.completionReq == nil {
+		t.Fatal("expected completion payload to be captured")
+	}
+	refIDs, _ := ds.completionReq["ref_file_ids"].([]any)
+	if len(refIDs) != 1 || refIDs[0] != "file-inline-1" {
+		t.Fatalf("unexpected completion ref_file_ids: %#v", ds.completionReq["ref_file_ids"])
+	}
+}
+
+func TestResponsesUploadsInlineFilesBeforeCompletion(t *testing.T) {
+	ds := &inlineUploadDSStub{}
+	h := &Handler{Store: mockOpenAIConfig{wideInput: true}, Auth: streamStatusAuthStub{}, DS: ds}
+	r := chi.NewRouter()
+	RegisterRoutes(r, h)
+	reqBody := `{"model":"deepseek-chat","input":[{"role":"user","content":[{"type":"input_text","text":"hi"},{"type":"input_image","image_url":{"url":"data:image/png;base64,QUJDRA=="}}]}],"stream":false}`
+	req := httptest.NewRequest(http.MethodPost, "/v1/responses", strings.NewReader(reqBody))
+	req.Header.Set("Authorization", "Bearer direct-token")
+	req.Header.Set("Content-Type", "application/json")
+	rec := httptest.NewRecorder()
+
+	r.ServeHTTP(rec, req)
+
+	if rec.Code != http.StatusOK {
+		t.Fatalf("expected 200, got %d body=%s", rec.Code, rec.Body.String())
+	}
+	if len(ds.uploadCalls) != 1 {
+		t.Fatalf("expected 1 upload call, got %d", len(ds.uploadCalls))
+	}
+	refIDs, _ := ds.completionReq["ref_file_ids"].([]any)
+	if len(refIDs) != 1 || refIDs[0] != "file-inline-1" {
+		t.Fatalf("unexpected completion ref_file_ids: %#v", ds.completionReq["ref_file_ids"])
+	}
+}
+
+func TestChatCompletionsInlineUploadFailureReturnsBadRequest(t *testing.T) {
+	ds := &inlineUploadDSStub{}
+	h := &Handler{Store: mockOpenAIConfig{wideInput: true}, Auth: streamStatusAuthStub{}, DS: ds}
+	reqBody := `{"model":"deepseek-chat","messages":[{"role":"user","content":[{"type":"image_url","image_url":{"url":"data:image/png;base64,%%%"}}]}],"stream":false}`
+	req := httptest.NewRequest(http.MethodPost, "/v1/chat/completions", strings.NewReader(reqBody))
+	req.Header.Set("Authorization", "Bearer direct-token")
+	req.Header.Set("Content-Type", "application/json")
+	rec := httptest.NewRecorder()
+
+	h.ChatCompletions(rec, req)
+
+	if rec.Code != http.StatusBadRequest {
+		t.Fatalf("expected 400, got %d body=%s", rec.Code, rec.Body.String())
+	}
+	if ds.completionReq != nil {
+		t.Fatalf("did not expect completion call on upload decode error")
+	}
+}
+
+func TestResponsesInlineUploadFailureReturnsInternalServerError(t *testing.T) {
+	ds := &inlineUploadDSStub{uploadErr: errors.New("boom")}
+	h := &Handler{Store: mockOpenAIConfig{wideInput: true}, Auth: streamStatusAuthStub{}, DS: ds}
+	r := chi.NewRouter()
+	RegisterRoutes(r, h)
+	reqBody := `{"model":"deepseek-chat","input":[{"role":"user","content":[{"type":"image_url","image_url":{"url":"data:image/png;base64,QUJDRA=="}}]}],"stream":false}`
+	req := httptest.NewRequest(http.MethodPost, "/v1/responses", strings.NewReader(reqBody))
+	req.Header.Set("Authorization", "Bearer direct-token")
+	req.Header.Set("Content-Type", "application/json")
+	rec := httptest.NewRecorder()
+
+	r.ServeHTTP(rec, req)
+
+	if rec.Code != http.StatusInternalServerError {
+		t.Fatalf("expected 500, got %d body=%s", rec.Code, rec.Body.String())
+	}
+	if ds.completionReq != nil {
+		t.Fatalf("did not expect completion call after upload failure")
+	}
+}
+
+func TestVercelPrepareUploadsInlineFilesBeforeLeasePayload(t *testing.T) {
+	t.Setenv("VERCEL", "1")
+	t.Setenv("DS2API_VERCEL_INTERNAL_SECRET", "stream-secret")
+	ds := &inlineUploadDSStub{}
+	h := &Handler{Store: mockOpenAIConfig{wideInput: true}, Auth: streamStatusAuthStub{}, DS: ds}
+	r := chi.NewRouter()
+	RegisterRoutes(r, h)
+	reqBody := `{"model":"deepseek-chat","messages":[{"role":"user","content":[{"type":"input_text","text":"hi"},{"type":"image_url","image_url":{"url":"data:image/png;base64,QUJDRA=="}}]}],"stream":true}`
+	req := httptest.NewRequest(http.MethodPost, "/v1/chat/completions?__stream_prepare=1", strings.NewReader(reqBody))
+	req.Header.Set("Authorization", "Bearer direct-token")
+	req.Header.Set("X-Ds2-Internal-Token", "stream-secret")
+	req.Header.Set("Content-Type", "application/json")
+	rec := httptest.NewRecorder()
+
+	r.ServeHTTP(rec, req)
+
+	if rec.Code != http.StatusOK {
+		t.Fatalf("expected 200, got %d body=%s", rec.Code, rec.Body.String())
+	}
+	if len(ds.uploadCalls) != 1 {
+		t.Fatalf("expected 1 upload call, got %d", len(ds.uploadCalls))
+	}
+	var out map[string]any
+	if err := json.Unmarshal(rec.Body.Bytes(), &out); err != nil {
+		t.Fatalf("decode response failed: %v body=%s", err, rec.Body.String())
+	}
+	payload, _ := out["payload"].(map[string]any)
+	if payload == nil {
+		t.Fatalf("expected payload in prepare response, got %#v", out)
+	}
+	refIDs, _ := payload["ref_file_ids"].([]any)
+	if len(refIDs) != 1 || refIDs[0] != "file-inline-1" {
+		t.Fatalf("unexpected payload ref_file_ids: %#v", payload["ref_file_ids"])
+	}
+}
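
Note on `TestPreprocessInlineFileInputsDeduplicatesIdenticalPayloads` above: the behaviour it checks comes from keying uploads by a SHA-256 digest of content type, filename, and data, so identical inline payloads within one request are uploaded once. A minimal standalone sketch of that idea follows. The `dedupingUploader` type and the `uploadFunc` signature are hypothetical stand-ins for the handler state and `DS.UploadFile`.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// uploadFunc is a hypothetical stand-in for the real upstream upload call.
type uploadFunc func(filename, contentType string, data []byte) (string, error)

// dedupingUploader remembers file IDs by content hash for the lifetime of
// one request.
type dedupingUploader struct {
	upload uploadFunc
	byHash map[string]string
}

func newDedupingUploader(upload uploadFunc) *dedupingUploader {
	return &dedupingUploader{upload: upload, byHash: map[string]string{}}
}

// cacheKey hashes content type, filename, and data, separated by NUL bytes
// so that field boundaries cannot be confused.
func cacheKey(filename, contentType string, data []byte) string {
	h := sha256.New()
	h.Write([]byte(contentType))
	h.Write([]byte{0})
	h.Write([]byte(filename))
	h.Write([]byte{0})
	h.Write(data)
	return fmt.Sprintf("%x", h.Sum(nil))
}

// Upload returns the cached ID for a previously seen payload and only calls
// the upstream function for new ones. Failed uploads are not cached.
func (d *dedupingUploader) Upload(filename, contentType string, data []byte) (string, error) {
	key := cacheKey(filename, contentType, data)
	if id, ok := d.byHash[key]; ok {
		return id, nil
	}
	id, err := d.upload(filename, contentType, data)
	if err != nil {
		return "", err
	}
	d.byHash[key] = id
	return id, nil
}

func main() {
	calls := 0
	u := newDedupingUploader(func(_, _ string, _ []byte) (string, error) {
		calls++
		return fmt.Sprintf("file-%d", calls), nil
	})
	a, _ := u.Upload("image.png", "image/png", []byte("ABCD"))
	b, _ := u.Upload("image.png", "image/png", []byte("ABCD"))
	fmt.Println(a, b, calls)
}
```

Because the filename is part of the key, the same bytes sent under two different names would still be uploaded twice. That matches the key construction in the diff.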
--- a/internal/adapter/openai/file_refs.go
+++ b/internal/adapter/openai/file_refs.go
@@ -0,0 +1,94 @@
+package openai
+
+import "strings"
+
+func collectOpenAIRefFileIDs(req map[string]any) []string {
+	if len(req) == 0 {
+		return nil
+	}
+	out := make([]string, 0, 4)
+	seen := map[string]struct{}{}
+	for _, key := range []string{
+		"ref_file_ids",
+		"file_ids",
+		"attachments",
+		"messages",
+		"input",
+	} {
+		raw := req[key]
+		if raw == nil {
+			continue
+		}
+		// Skip top-level strings for 'messages' and 'input' as they are likely plain text content,
+		// not file IDs. String file IDs are expected in 'ref_file_ids' or 'file_ids'.
+		if key == "messages" || key == "input" {
+			if _, ok := raw.(string); ok {
+				continue
+			}
+		}
+		appendOpenAIRefFileIDs(&out, seen, raw)
+	}
+	if len(out) == 0 {
+		return nil
+	}
+	return out
+}
+
+func appendOpenAIRefFileIDs(out *[]string, seen map[string]struct{}, raw any) {
+	switch x := raw.(type) {
+	case string:
+		addOpenAIRefFileID(out, seen, x)
+	case []string:
+		for _, item := range x {
+			addOpenAIRefFileID(out, seen, item)
+		}
+	case []any:
+		for _, item := range x {
+			appendOpenAIRefFileIDs(out, seen, item)
+		}
+	case map[string]any:
+		if fileID := strings.TrimSpace(asString(x["file_id"])); fileID != "" {
+			addOpenAIRefFileID(out, seen, fileID)
+		}
+		if strings.Contains(strings.ToLower(strings.TrimSpace(asString(x["type"]))), "file") {
+			if fileID := strings.TrimSpace(asString(x["id"])); fileID != "" {
+				addOpenAIRefFileID(out, seen, fileID)
+			}
+		}
+		if fileMap, ok := x["file"].(map[string]any); ok {
+			if fileID := strings.TrimSpace(asString(fileMap["file_id"])); fileID != "" {
+				addOpenAIRefFileID(out, seen, fileID)
+			}
+			if fileID := strings.TrimSpace(asString(fileMap["id"])); fileID != "" {
+				addOpenAIRefFileID(out, seen, fileID)
+			}
+		}
+		// Recurse only into known container keys. A string under 'content' or 'input'
+		// is message text, not a file ID, so it is skipped here. Strings under the
+		// other keys reach the string case of the switch above and are recorded
+		// as file IDs.
+		for _, key := range []string{"ref_file_ids", "file_ids", "attachments", "messages", "input", "content", "files", "items", "data", "source"} {
+			if nested, ok := x[key]; ok {
+				// If it's a message content that is a string, we must NOT treat it as an ID.
+				if key == "content" || key == "input" {
+					if _, ok := nested.(string); ok {
+						continue
+					}
+				}
+				appendOpenAIRefFileIDs(out, seen, nested)
+			}
+		}
+	}
+}
+
+func addOpenAIRefFileID(out *[]string, seen map[string]struct{}, fileID string) {
+	fileID = strings.TrimSpace(fileID)
+	if fileID == "" {
+		return
+	}
+	if _, ok := seen[fileID]; ok {
+		return
+	}
+	seen[fileID] = struct{}{}
+	*out = append(*out, fileID)
+}
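Review note: the traversal above can be illustrated with a reduced, standalone sketch. It is not the PR's code. `collectIDs`, `add`, and `demo` are hypothetical names, `asString` is a stand-in for the package helper that this diff does not show, and only a subset of the container keys is walked. It shows the two properties the collector relies on: IDs are de-duplicated in first-seen order, and a plain-string `content` value is treated as message text rather than as an ID.

```go
package main

import (
	"fmt"
	"strings"
)

// asString is a stand-in for the package helper of the same name,
// which is not shown in this diff (assumption: it returns "" for non-strings).
func asString(v any) string {
	s, _ := v.(string)
	return s
}

// add records a trimmed, non-empty ID once, preserving first-seen order.
func add(id string, seen map[string]struct{}, out *[]string) {
	id = strings.TrimSpace(id)
	if id == "" {
		return
	}
	if _, dup := seen[id]; dup {
		return
	}
	seen[id] = struct{}{}
	*out = append(*out, id)
}

// collectIDs is a reduced version of the recursive walk: strings are IDs,
// lists are walked element by element, and maps contribute their "file_id"
// plus whatever is found under a few container keys.
func collectIDs(raw any, seen map[string]struct{}, out *[]string) {
	switch x := raw.(type) {
	case string:
		add(x, seen, out)
	case []any:
		for _, item := range x {
			collectIDs(item, seen, out)
		}
	case map[string]any:
		add(asString(x["file_id"]), seen, out)
		for _, key := range []string{"file_ids", "attachments", "content"} {
			nested, ok := x[key]
			if !ok {
				continue
			}
			if _, isText := nested.(string); isText && key == "content" {
				continue // plain message text, never a file ID
			}
			collectIDs(nested, seen, out)
		}
	}
}

// demo runs the walk over a request that mentions "file-a" three times:
// once as an ID, once inside message text, and once as a content part.
func demo() []string {
	req := map[string]any{
		"file_ids": []any{"file-a", " file-b "},
		"messages": []any{
			map[string]any{"role": "user", "content": "file-a is just text here"},
			map[string]any{"role": "user", "content": []any{
				map[string]any{"type": "input_file", "file_id": "file-a"},
				map[string]any{"type": "input_file", "file_id": "file-c"},
			}},
		},
	}
	var out []string
	seen := map[string]struct{}{}
	for _, key := range []string{"file_ids", "messages"} {
		collectIDs(req[key], seen, &out)
	}
	return out
}

func main() {
	fmt.Println(demo())
}
```

In this sketch the text message contributes nothing, and the repeated `file-a` in the content part is dropped by the `seen` set, so the result contains each ID once.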
--- a/internal/adapter/openai/files_route_test.go
+++ b/internal/adapter/openai/files_route_test.go
@@ -0,0 +1,202 @@
+package openai
+
+import (
+	"bytes"
+	"context"
+	"encoding/json"
+	"errors"
+	"mime/multipart"
+	"net/http"
+	"net/http/httptest"
+	"testing"
+
+	"github.com/go-chi/chi/v5"
+
+	"ds2api/internal/auth"
+	"ds2api/internal/deepseek"
+)
+
+type managedFilesAuthStub struct{}
+
+func (managedFilesAuthStub) Determine(_ *http.Request) (*auth.RequestAuth, error) {
+	return &auth.RequestAuth{
+		UseConfigToken: true,
+		DeepSeekToken:  "managed-token",
+		CallerID:       "caller:test",
+		AccountID:      "acct-123",
+		TriedAccounts:  map[string]bool{},
+	}, nil
+}
+
+func (managedFilesAuthStub) DetermineCaller(_ *http.Request) (*auth.RequestAuth, error) {
+	return &auth.RequestAuth{
+		UseConfigToken: true,
+		DeepSeekToken:  "managed-token",
+		CallerID:       "caller:test",
+		AccountID:      "acct-123",
+		TriedAccounts:  map[string]bool{},
+	}, nil
+}
+
+func (managedFilesAuthStub) Release(_ *auth.RequestAuth) {}
+
+type filesRouteDSStub struct {
+	lastReq deepseek.UploadFileRequest
+	upload  *deepseek.UploadFileResult
+	err     error
+}
+
+func (m *filesRouteDSStub) CreateSession(_ context.Context, _ *auth.RequestAuth, _ int) (string, error) {
+	return "", nil
+}
+
+func (m *filesRouteDSStub) GetPow(_ context.Context, _ *auth.RequestAuth, _ int) (string, error) {
+	return "", nil
+}
+
+func (m *filesRouteDSStub) UploadFile(_ context.Context, _ *auth.RequestAuth, req deepseek.UploadFileRequest, _ int) (*deepseek.UploadFileResult, error) {
+	m.lastReq = req
+	if m.err != nil {
+		return nil, m.err
+	}
+	if m.upload != nil {
+		return m.upload, nil
+	}
+	return &deepseek.UploadFileResult{ID: "file-123", Filename: req.Filename, Bytes: int64(len(req.Data)), Purpose: req.Purpose, Status: "uploaded"}, nil
+}
+
+func (m *filesRouteDSStub) CallCompletion(_ context.Context, _ *auth.RequestAuth, _ map[string]any, _ string, _ int) (*http.Response, error) {
+	return nil, errors.New("not implemented")
+}
+
+func (m *filesRouteDSStub) DeleteSessionForToken(_ context.Context, _ string, _ string) (*deepseek.DeleteSessionResult, error) {
+	return &deepseek.DeleteSessionResult{Success: true}, nil
+}
+
+func (m *filesRouteDSStub) DeleteAllSessionsForToken(_ context.Context, _ string) error {
+	return nil
+}
+
+func newMultipartUploadRequest(t *testing.T, purpose string, filename string, data []byte) *http.Request {
+	t.Helper()
+	var body bytes.Buffer
+	writer := multipart.NewWriter(&body)
+	if purpose != "" {
+		if err := writer.WriteField("purpose", purpose); err != nil {
+			t.Fatalf("write purpose failed: %v", err)
+		}
+	}
+	part, err := writer.CreateFormFile("file", filename)
+	if err != nil {
+		t.Fatalf("create form file failed: %v", err)
+	}
+	if _, err := part.Write(data); err != nil {
+		t.Fatalf("write file failed: %v", err)
+	}
+	if err := writer.Close(); err != nil {
+		t.Fatalf("close writer failed: %v", err)
+	}
+	req := httptest.NewRequest(http.MethodPost, "/v1/files", &body)
+	req.Header.Set("Authorization", "Bearer direct-token")
+	req.Header.Set("Content-Type", writer.FormDataContentType())
+	return req
+}
+
+func TestFilesRouteUploadSuccess(t *testing.T) {
+	ds := &filesRouteDSStub{}
+	h := &Handler{Store: mockOpenAIConfig{wideInput: true}, Auth: streamStatusAuthStub{}, DS: ds}
+	r := chi.NewRouter()
+	RegisterRoutes(r, h)
+
+	req := newMultipartUploadRequest(t, "assistants", "notes.txt", []byte("hello world"))
+	rec := httptest.NewRecorder()
+	r.ServeHTTP(rec, req)
+
+	if rec.Code != http.StatusOK {
+		t.Fatalf("expected 200, got %d body=%s", rec.Code, rec.Body.String())
+	}
+	if ds.lastReq.Filename != "notes.txt" {
+		t.Fatalf("expected filename notes.txt, got %q", ds.lastReq.Filename)
+	}
+	if ds.lastReq.Purpose != "assistants" {
+		t.Fatalf("expected purpose assistants, got %q", ds.lastReq.Purpose)
+	}
+	if string(ds.lastReq.Data) != "hello world" {
+		t.Fatalf("unexpected uploaded data: %q", string(ds.lastReq.Data))
+	}
+	var out map[string]any
+	if err := json.Unmarshal(rec.Body.Bytes(), &out); err != nil {
+		t.Fatalf("decode response failed: %v body=%s", err, rec.Body.String())
+	}
+	if out["object"] != "file" {
+		t.Fatalf("expected file object, got %#v", out)
+	}
+	if out["id"] != "file-123" {
+		t.Fatalf("expected file id file-123, got %#v", out["id"])
+	}
+	if out["filename"] != "notes.txt" {
+		t.Fatalf("expected filename notes.txt, got %#v", out["filename"])
+	}
+}
+
+func TestFilesRouteUploadIncludesAccountIDForManagedAccount(t *testing.T) {
+	ds := &filesRouteDSStub{}
+	h := &Handler{Store: mockOpenAIConfig{wideInput: true}, Auth: managedFilesAuthStub{}, DS: ds}
+	r := chi.NewRouter()
+	RegisterRoutes(r, h)
+
+	req := newMultipartUploadRequest(t, "assistants", "notes.txt", []byte("hello world"))
+	rec := httptest.NewRecorder()
+	r.ServeHTTP(rec, req)
+
+	if rec.Code != http.StatusOK {
+		t.Fatalf("expected 200, got %d body=%s", rec.Code, rec.Body.String())
+	}
+	var out map[string]any
+	if err := json.Unmarshal(rec.Body.Bytes(), &out); err != nil {
+		t.Fatalf("decode response failed: %v body=%s", err, rec.Body.String())
+	}
+	if out["account_id"] != "acct-123" {
+		t.Fatalf("expected account_id acct-123, got %#v", out["account_id"])
+	}
+}
+
+func TestFilesRouteRejectsNonMultipart(t *testing.T) {
+	h := &Handler{Store: mockOpenAIConfig{wideInput: true}, Auth: streamStatusAuthStub{}, DS: &filesRouteDSStub{}}
+	r := chi.NewRouter()
+	RegisterRoutes(r, h)
+
+	req := httptest.NewRequest(http.MethodPost, "/v1/files", bytes.NewBufferString(`{"purpose":"assistants"}`))
+	req.Header.Set("Authorization", "Bearer direct-token")
+	req.Header.Set("Content-Type", "application/json")
+	rec := httptest.NewRecorder()
+	r.ServeHTTP(rec, req)
+
+	if rec.Code != http.StatusBadRequest {
+		t.Fatalf("expected 400, got %d body=%s", rec.Code, rec.Body.String())
+	}
+}
+
+func TestFilesRouteRequiresFileField(t *testing.T) {
+	h := &Handler{Store: mockOpenAIConfig{wideInput: true}, Auth: streamStatusAuthStub{}, DS: &filesRouteDSStub{}}
+	r := chi.NewRouter()
+	RegisterRoutes(r, h)
+
+	var body bytes.Buffer
+	writer := multipart.NewWriter(&body)
+	if err := writer.WriteField("purpose", "assistants"); err != nil {
+		t.Fatalf("write field failed: %v", err)
+	}
+	if err := writer.Close(); err != nil {
+		t.Fatalf("close writer failed: %v", err)
+	}
+	req := httptest.NewRequest(http.MethodPost, "/v1/files", &body)
+	req.Header.Set("Authorization", "Bearer direct-token")
+	req.Header.Set("Content-Type", writer.FormDataContentType())
+	rec := httptest.NewRecorder()
+	r.ServeHTTP(rec, req)
+
+	if rec.Code != http.StatusBadRequest {
+		t.Fatalf("expected 400, got %d body=%s", rec.Code, rec.Body.String())
+	}
+}
--- a/internal/adapter/openai/handler_chat.go
+++ b/internal/adapter/openai/handler_chat.go
@@ -5,6 +5,7 @@ import (
 	"encoding/json"
 	"io"
 	"net/http"
+	"strings"
 	"time"

 	"ds2api/internal/auth"
@@ -43,11 +44,20 @@ func (h *Handler) ChatCompletions(w http.ResponseWriter, r *http.Request) {

 	r = r.WithContext(auth.WithAuth(r.Context(), a))

+	r.Body = http.MaxBytesReader(w, r.Body, openAIGeneralMaxSize)
 	var req map[string]any
 	if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
+		if strings.Contains(strings.ToLower(err.Error()), "too large") {
+			writeOpenAIError(w, http.StatusRequestEntityTooLarge, "request body too large")
+			return
+		}
 		writeOpenAIError(w, http.StatusBadRequest, "invalid json")
 		return
 	}
+	if err := h.preprocessInlineFileInputs(r.Context(), a, req); err != nil {
+		writeOpenAIInlineFileError(w, err)
+		return
+	}
 	stdReq, err := normalizeOpenAIChatRequest(h.Store, req, requestTraceID(r))
 	if err != nil {
 		writeOpenAIError(w, http.StatusBadRequest, err.Error())
@@ -127,7 +137,7 @@ func (h *Handler) handleNonStream(w http.ResponseWriter, ctx context.Context, re
 	stripReferenceMarkers := h.compatStripReferenceMarkers()
 	finalThinking := cleanVisibleOutput(result.Thinking, stripReferenceMarkers)
 	finalText := cleanVisibleOutput(result.Text, stripReferenceMarkers)
-	if writeUpstreamEmptyOutputError(w, finalThinking, finalText, result.ContentFilter) {
+	if writeUpstreamEmptyOutputError(w, finalText, result.ContentFilter) {
 		return
 	}
 	respBody := openaifmt.BuildChatCompletion(completionID, model, finalPrompt, finalThinking, finalText, toolNames)
--- a/internal/adapter/openai/handler_chat_auto_delete_test.go
+++ b/internal/adapter/openai/handler_chat_auto_delete_test.go
@@ -27,6 +27,10 @@ func (m *autoDeleteModeDSStub) GetPow(_ context.Context, _ *auth.RequestAuth, _
 	return "pow", nil
 }

+func (m *autoDeleteModeDSStub) UploadFile(_ context.Context, _ *auth.RequestAuth, _ deepseek.UploadFileRequest, _ int) (*deepseek.UploadFileResult, error) {
+	return &deepseek.UploadFileResult{ID: "file-id", Filename: "file.txt", Bytes: 1, Status: "uploaded"}, nil
+}
+
 func (m *autoDeleteModeDSStub) CallCompletion(_ context.Context, _ *auth.RequestAuth, _ map[string]any, _ string, _ int) (*http.Response, error) {
 	return m.resp, nil
 }
--- a/internal/adapter/openai/handler_files.go
+++ b/internal/adapter/openai/handler_files.go
@@ -0,0 +1,104 @@
+package openai
+
+import (
+	"io"
+	"net/http"
+	"strings"
+	"time"
+
+	"ds2api/internal/auth"
+	"ds2api/internal/deepseek"
+)
+
+const openAIUploadMaxMemory = 32 << 20
+
+func (h *Handler) UploadFile(w http.ResponseWriter, r *http.Request) {
+	a, err := h.Auth.Determine(r)
+	if err != nil {
+		status := http.StatusUnauthorized
+		detail := err.Error()
+		if err == auth.ErrNoAccount {
+			status = http.StatusTooManyRequests
+		}
+		writeOpenAIError(w, status, detail)
+		return
+	}
+	defer h.Auth.Release(a)
+	if !strings.HasPrefix(strings.ToLower(strings.TrimSpace(r.Header.Get("Content-Type"))), "multipart/form-data") {
+		writeOpenAIError(w, http.StatusBadRequest, "content-type must be multipart/form-data")
+		return
+	}
+	// Enforce a hard cap on the total request body size to prevent OOM
+	r.Body = http.MaxBytesReader(w, r.Body, openAIUploadMaxSize)
+	if err := r.ParseMultipartForm(openAIUploadMaxMemory); err != nil {
+		if strings.Contains(strings.ToLower(err.Error()), "too large") {
+			writeOpenAIError(w, http.StatusRequestEntityTooLarge, "file size exceeds limit")
+			return
+		}
+		writeOpenAIError(w, http.StatusBadRequest, "invalid multipart form")
+		return
+	}
+	if r.MultipartForm != nil {
+		defer func() { _ = r.MultipartForm.RemoveAll() }()
+	}
+	r = r.WithContext(auth.WithAuth(r.Context(), a))
+	file, header, err := r.FormFile("file")
+	if err != nil {
+		writeOpenAIError(w, http.StatusBadRequest, "file is required")
+		return
+	}
+	defer func() { _ = file.Close() }()
+	data, err := io.ReadAll(file)
+	if err != nil {
+		writeOpenAIError(w, http.StatusBadRequest, "failed to read uploaded file")
+		return
+	}
+	contentType := strings.TrimSpace(header.Header.Get("Content-Type"))
+	if contentType == "" && len(data) > 0 {
+		contentType = http.DetectContentType(data)
+	}
+	result, err := h.DS.UploadFile(r.Context(), a, deepseek.UploadFileRequest{
+		Filename:    header.Filename,
+		ContentType: contentType,
+		Purpose:     strings.TrimSpace(r.FormValue("purpose")),
+		Data:        data,
+	}, 3)
+	if err != nil {
+		writeOpenAIError(w, http.StatusInternalServerError, "Failed to upload file.")
+		return
+	}
+	if result != nil && result.AccountID == "" {
+		result.AccountID = a.AccountID
+	}
+	writeJSON(w, http.StatusOK, buildOpenAIFileObject(result))
+}
+
+func buildOpenAIFileObject(result *deepseek.UploadFileResult) map[string]any {
+	if result == nil {
+		obj := map[string]any{
+			"id":             "",
+			"object":         "file",
+			"bytes":          0,
+			"created_at":     time.Now().Unix(),
+			"filename":       "",
+			"purpose":        "",
+			"status":         "uploaded",
+			"status_details": nil,
+		}
+		return obj
+	}
+	obj := map[string]any{
+		"id":             result.ID,
+		"object":         "file",
+		"bytes":          result.Bytes,
+		"created_at":     time.Now().Unix(),
+		"filename":       result.Filename,
+		"purpose":        result.Purpose,
+		"status":         result.Status,
+		"status_details": nil,
+	}
+	if result.AccountID != "" {
+		obj["account_id"] = result.AccountID
+	}
+	return obj
+}
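Review note: `UploadFile` prefers the part's declared `Content-Type` and falls back to sniffing only when the header is empty and the file has data. The standalone sketch below isolates that rule. `resolveContentType` is a hypothetical name introduced for this note; the handler in the PR performs the same steps inline.

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
)

// resolveContentType returns the declared type when present; otherwise it
// sniffs the payload with http.DetectContentType. Empty data with no
// declared type yields an empty string, as in the handler.
// Hypothetical helper for illustration only.
func resolveContentType(declared string, data []byte) string {
	contentType := strings.TrimSpace(declared)
	if contentType == "" && len(data) > 0 {
		contentType = http.DetectContentType(data)
	}
	return contentType
}

func main() {
	fmt.Println(resolveContentType("", []byte("hello world")))
	fmt.Println(resolveContentType(" application/pdf ", []byte("hello")))
	fmt.Printf("%q\n", resolveContentType("", nil))
}
```

Because `http.DetectContentType` inspects at most the first 512 bytes and falls back to `application/octet-stream`, a declared type from the client is usually the more specific value when it is available.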
--- a/internal/adapter/openai/handler_routes.go
+++ b/internal/adapter/openai/handler_routes.go
@@ -13,6 +13,13 @@ import (
 	"ds2api/internal/util"
 )

+const (
+	// openAIUploadMaxSize limits total multipart request body size (100 MiB).
+	openAIUploadMaxSize = 100 << 20
+	// openAIGeneralMaxSize limits total JSON request body size (100 MiB).
+	openAIGeneralMaxSize = 100 << 20
+)
+
 // writeJSON is a package-internal alias kept to avoid mass-renaming across
 // every call-site in this package.
 var writeJSON = util.WriteJSON
@@ -46,6 +53,7 @@ func RegisterRoutes(r chi.Router, h *Handler) {
 	r.Post("/v1/chat/completions", h.ChatCompletions)
 	r.Post("/v1/responses", h.Responses)
 	r.Get("/v1/responses/{response_id}", h.GetResponseByID)
+	r.Post("/v1/files", h.UploadFile)
 	r.Post("/v1/embeddings", h.Embeddings)
 }

--- a/internal/adapter/openai/handler_toolcall_test.go
+++ b/internal/adapter/openai/handler_toolcall_test.go
@@ -3,7 +3,6 @@ package openai
 import (
 	"context"
 	"encoding/json"
-	"fmt"
 	"io"
 	"net/http"
 	"net/http/httptest"
@@ -59,21 +58,6 @@ func parseSSEDataFrames(t *testing.T, body string) ([]map[string]any, bool) {
 	return frames, done
 }

-func streamHasRawToolJSONContent(frames []map[string]any) bool {
-	for _, frame := range frames {
-		choices, _ := frame["choices"].([]any)
-		for _, item := range choices {
-			choice, _ := item.(map[string]any)
-			delta, _ := choice["delta"].(map[string]any)
-			content, _ := delta["content"].(string)
-			if strings.Contains(content, `"tool_calls"`) {
-				return true
-			}
-		}
-	}
-	return false
-}
-
 func streamHasToolCallsDelta(frames []map[string]any) bool {
 	for _, frame := range frames {
 		choices, _ := frame["choices"].([]any)
@@ -101,180 +85,7 @@ func streamFinishReason(frames []map[string]any) string {
 	return ""
 }

-func streamToolCallArgumentChunks(frames []map[string]any) []string {
-	out := make([]string, 0, 4)
-	for _, frame := range frames {
-		choices, _ := frame["choices"].([]any)
-		for _, item := range choices {
-			choice, _ := item.(map[string]any)
-			delta, _ := choice["delta"].(map[string]any)
-			toolCalls, _ := delta["tool_calls"].([]any)
-			for _, tc := range toolCalls {
-				tcm, _ := tc.(map[string]any)
-				fn, _ := tcm["function"].(map[string]any)
-				if args, ok := fn["arguments"].(string); ok && args != "" {
-					out = append(out, args)
-				}
-			}
-		}
-	}
-	return out
-}
-
-func TestHandleNonStreamToolCallInterceptsChatModel(t *testing.T) {
-	h := &Handler{}
-	resp := makeSSEHTTPResponse(
-		`data: {"p":"response/content","v":"{\"tool_calls\":[{\"name\":\"search\",\"input\":{\"q\":\"go\"}}]}"}`,
-		`data: [DONE]`,
-	)
-	rec := httptest.NewRecorder()
-
-	h.handleNonStream(rec, context.Background(), resp, "cid1", "deepseek-chat", "prompt", false, []string{"search"})
-	if rec.Code != http.StatusOK {
-		t.Fatalf("unexpected status: %d", rec.Code)
-	}
-
-	out := decodeJSONBody(t, rec.Body.String())
-	choices, _ := out["choices"].([]any)
-	if len(choices) != 1 {
-		t.Fatalf("unexpected choices: %#v", out["choices"])
-	}
-	choice, _ := choices[0].(map[string]any)
-	if choice["finish_reason"] != "tool_calls" {
-		t.Fatalf("expected finish_reason=tool_calls, got %#v", choice["finish_reason"])
-	}
-	msg, _ := choice["message"].(map[string]any)
-	if msg["content"] != nil {
-		t.Fatalf("expected content nil, got %#v", msg["content"])
-	}
-	toolCalls, _ := msg["tool_calls"].([]any)
-	if len(toolCalls) != 1 {
-		t.Fatalf("expected 1 tool call, got %#v", msg["tool_calls"])
-	}
-}
-
-func TestHandleNonStreamToolCallInterceptsReasonerModel(t *testing.T) {
-	h := &Handler{}
-	resp := makeSSEHTTPResponse(
-		`data: {"p":"response/thinking_content","v":"先想一下"}`,
-		`data: {"p":"response/content","v":"{\"tool_calls\":[{\"name\":\"search\",\"input\":{\"q\":\"go\"}}]}"}`,
-		`data: [DONE]`,
-	)
-	rec := httptest.NewRecorder()
-
-	h.handleNonStream(rec, context.Background(), resp, "cid2", "deepseek-reasoner", "prompt", true, []string{"search"})
-	if rec.Code != http.StatusOK {
-		t.Fatalf("unexpected status: %d", rec.Code)
-	}
-
-	out := decodeJSONBody(t, rec.Body.String())
-	choices, _ := out["choices"].([]any)
-	choice, _ := choices[0].(map[string]any)
-	msg, _ := choice["message"].(map[string]any)
-	if msg["reasoning_content"] != "先想一下" {
-		t.Fatalf("expected reasoning_content, got %#v", msg["reasoning_content"])
-	}
-	if msg["content"] != nil {
-		t.Fatalf("expected content nil, got %#v", msg["content"])
-	}
-	if choice["finish_reason"] != "tool_calls" {
-		t.Fatalf("expected finish_reason=tool_calls, got %#v", choice["finish_reason"])
-	}
-}
-
-func TestHandleNonStreamUnknownToolIntercepted(t *testing.T) {
-	h := &Handler{}
-	resp := makeSSEHTTPResponse(
-		`data: {"p":"response/content","v":"{\"tool_calls\":[{\"name\":\"not_in_schema\",\"input\":{\"q\":\"go\"}}]}"}`,
-		`data: [DONE]`,
-	)
-	rec := httptest.NewRecorder()
-
-	h.handleNonStream(rec, context.Background(), resp, "cid2b", "deepseek-chat", "prompt", false, []string{"search"})
-	if rec.Code != http.StatusOK {
-		t.Fatalf("unexpected status: %d", rec.Code)
-	}
-
-	out := decodeJSONBody(t, rec.Body.String())
-	choices, _ := out["choices"].([]any)
-	choice, _ := choices[0].(map[string]any)
-	if choice["finish_reason"] != "tool_calls" {
-		t.Fatalf("expected finish_reason=tool_calls, got %#v", choice["finish_reason"])
-	}
-	msg, _ := choice["message"].(map[string]any)
-	toolCalls, _ := msg["tool_calls"].([]any)
-	if len(toolCalls) != 1 {
-		t.Fatalf("expected tool_calls for unknown schema name, got %#v", msg["tool_calls"])
-	}
-}
-
-func TestHandleNonStreamEmbeddedToolCallExamplePromotesToolCall(t *testing.T) {
-	h := &Handler{}
-	resp := makeSSEHTTPResponse(
-		`data: {"p":"response/content","v":"下面是示例："}`,
-		`data: {"p":"response/content","v":"{\"tool_calls\":[{\"name\":\"search\",\"input\":{\"q\":\"go\"}}]}"}`,
-		`data: {"p":"response/content","v":"请勿执行。"}`,
-		`data: [DONE]`,
-	)
-	rec := httptest.NewRecorder()
-
-	h.handleNonStream(rec, context.Background(), resp, "cid2c", "deepseek-chat", "prompt", false, []string{"search"})
-	if rec.Code != http.StatusOK {
-		t.Fatalf("unexpected status: %d", rec.Code)
-	}
-
-	out := decodeJSONBody(t, rec.Body.String())
-	choices, _ := out["choices"].([]any)
-	choice, _ := choices[0].(map[string]any)
-	if choice["finish_reason"] != "tool_calls" {
-		t.Fatalf("expected finish_reason=tool_calls, got %#v", choice["finish_reason"])
-	}
-	msg, _ := choice["message"].(map[string]any)
-	toolCalls, _ := msg["tool_calls"].([]any)
-	if len(toolCalls) != 1 {
-		t.Fatalf("expected one tool_call field for embedded example: %#v", msg["tool_calls"])
-	}
-	content, _ := msg["content"].(string)
-	if strings.Contains(content, `"tool_calls"`) {
-		t.Fatalf("expected raw tool_calls json stripped from content, got %#v", content)
-	}
-}
-
-func TestHandleNonStreamFencedToolCallExampleDoesNotPromoteToolCall(t *testing.T) {
-	h := &Handler{}
-	resp := makeSSEHTTPResponse(
-		"data: {\"p\":\"response/content\",\"v\":\"```json\\n{\\\"tool_calls\\\":[{\\\"name\\\":\\\"search\\\",\\\"input\\\":{\\\"q\\\":\\\"go\\\"}}]}\\n```\"}",
-		`data: [DONE]`,
-	)
-	rec := httptest.NewRecorder()
-
-	h.handleNonStream(rec, context.Background(), resp, "cid2d", "deepseek-chat", "prompt", false, []string{"search"})
-	if rec.Code != http.StatusOK {
-		t.Fatalf("unexpected status: %d", rec.Code)
-	}
-
-	out := decodeJSONBody(t, rec.Body.String())
-	choices, _ := out["choices"].([]any)
-	choice, _ := choices[0].(map[string]any)
-	if choice["finish_reason"] == "tool_calls" {
-		t.Fatalf("expected fenced example to remain content-only, got finish_reason=%#v", choice["finish_reason"])
-	}
-	msg, _ := choice["message"].(map[string]any)
-	toolCalls, _ := msg["tool_calls"].([]any)
-	if len(toolCalls) != 0 {
-		t.Fatalf("expected no tool_call field for fenced example: %#v", msg["tool_calls"])
-	}
-	content, _ := msg["content"].(string)
-	if !strings.Contains(content, `"tool_calls"`) {
-		t.Fatalf("expected fenced example content preserved, got %q", content)
-	}
-}
-
 // Backward-compatible alias for historical test name used in CI logs.
-func TestHandleNonStreamFencedToolCallExamplePromotesToolCall(t *testing.T) {
-	TestHandleNonStreamFencedToolCallExampleDoesNotPromoteToolCall(t)
-}
-
 func TestHandleNonStreamReturns429WhenUpstreamOutputEmpty(t *testing.T) {
 	h := &Handler{}
 	resp := makeSSEHTTPResponse(
@@ -313,190 +124,22 @@ func TestHandleNonStreamReturnsContentFilterErrorWhenUpstreamFilteredWithoutOutp
 	}
 }

-func TestHandleStreamToolCallInterceptsWithoutRawContentLeak(t *testing.T) {
+func TestHandleNonStreamReturns429WhenUpstreamHasOnlyThinking(t *testing.T) {
 	h := &Handler{}
 	resp := makeSSEHTTPResponse(
-		`data: {"p":"response/content","v":"{\"tool_calls\":[{\"name\":\"search\""}`,
-		`data: {"p":"response/content","v":",\"input\":{\"q\":\"go\"}}]}"}`,
+		`data: {"p":"response/thinking_content","v":"Only thinking"}`,
 		`data: [DONE]`,
 	)
 	rec := httptest.NewRecorder()
-	req := httptest.NewRequest(http.MethodPost, "/v1/chat/completions", nil)

-	h.handleStream(rec, req, resp, "cid3", "deepseek-chat", "prompt", false, false, []string{"search"})
-
-	frames, done := parseSSEDataFrames(t, rec.Body.String())
-	if !done {
-		t.Fatalf("expected [DONE], body=%s", rec.Body.String())
+	h.handleNonStream(rec, context.Background(), resp, "cid-thinking-only", "deepseek-reasoner", "prompt", true, nil)
+	if rec.Code != http.StatusTooManyRequests {
+		t.Fatalf("expected status 429 for thinking-only upstream output, got %d body=%s", rec.Code, rec.Body.String())
 	}
-	if !streamHasToolCallsDelta(frames) {
-		t.Fatalf("expected tool_calls delta, body=%s", rec.Body.String())
-	}
-	foundToolIndex := false
-	for _, frame := range frames {
-		choices, _ := frame["choices"].([]any)
-		for _, item := range choices {
-			choice, _ := item.(map[string]any)
-			delta, _ := choice["delta"].(map[string]any)
-			toolCalls, _ := delta["tool_calls"].([]any)
-			for _, tc := range toolCalls {
-				tcm, _ := tc.(map[string]any)
-				if _, ok := tcm["index"].(float64); ok {
-					foundToolIndex = true
-				}
-			}
-		}
-	}
-	if !foundToolIndex {
-		t.Fatalf("expected stream tool_calls item with index, body=%s", rec.Body.String())
-	}
-	if streamHasRawToolJSONContent(frames) {
-		t.Fatalf("raw tool_calls JSON leaked in content delta: %s", rec.Body.String())
-	}
-	if streamFinishReason(frames) != "tool_calls" {
-		t.Fatalf("expected finish_reason=tool_calls, body=%s", rec.Body.String())
-	}
-}
-
-func TestHandleStreamToolCallLargeArgumentsStillIntercepted(t *testing.T) {
-	h := &Handler{}
-	large := strings.Repeat("a", 9000)
-	payload := fmt.Sprintf(`{"tool_calls":[{"name":"search","input":{"q":"%s"}}]}`, large)
-	splitAt := len(payload) / 2
-	resp := makeSSEHTTPResponse(
-		fmt.Sprintf(`data: {"p":"response/content","v":%q}`, payload[:splitAt]),
-		fmt.Sprintf(`data: {"p":"response/content","v":%q}`, payload[splitAt:]),
-		`data: [DONE]`,
-	)
-	rec := httptest.NewRecorder()
-	req := httptest.NewRequest(http.MethodPost, "/v1/chat/completions", nil)
-
-	h.handleStream(rec, req, resp, "cid3-large", "deepseek-chat", "prompt", false, false, []string{"search"})
-
-	frames, done := parseSSEDataFrames(t, rec.Body.String())
-	if !done {
-		t.Fatalf("expected [DONE], body=%s", rec.Body.String())
-	}
-	if !streamHasToolCallsDelta(frames) {
-		t.Fatalf("expected tool_calls delta, body=%s", rec.Body.String())
-	}
-	if streamHasRawToolJSONContent(frames) {
-		t.Fatalf("raw tool_calls JSON leaked in content delta: %s", rec.Body.String())
-	}
-	if streamFinishReason(frames) != "tool_calls" {
-		t.Fatalf("expected finish_reason=tool_calls, body=%s", rec.Body.String())
-	}
-}
-
-func TestHandleStreamReasonerToolCallInterceptsWithoutRawContentLeak(t *testing.T) {
-	h := &Handler{}
-	resp := makeSSEHTTPResponse(
-		`data: {"p":"response/thinking_content","v":"思考中"}`,
-		`data: {"p":"response/content","v":"{\"tool_calls\":[{\"name\":\"search\",\"input\":{\"q\":\"go\"}}]}"}`,
-		`data: [DONE]`,
-	)
-	rec := httptest.NewRecorder()
-	req := httptest.NewRequest(http.MethodPost, "/v1/chat/completions", nil)
-
-	h.handleStream(rec, req, resp, "cid4", "deepseek-reasoner", "prompt", true, false, []string{"search"})
-
-	frames, done := parseSSEDataFrames(t, rec.Body.String())
-	if !done {
-		t.Fatalf("expected [DONE], body=%s", rec.Body.String())
-	}
-	if !streamHasToolCallsDelta(frames) {
-		t.Fatalf("expected tool_calls delta, body=%s", rec.Body.String())
-	}
-	foundToolIndex := false
-	for _, frame := range frames {
-		choices, _ := frame["choices"].([]any)
-		for _, item := range choices {
-			choice, _ := item.(map[string]any)
-			delta, _ := choice["delta"].(map[string]any)
-			toolCalls, _ := delta["tool_calls"].([]any)
-			for _, tc := range toolCalls {
-				tcm, _ := tc.(map[string]any)
-				if _, ok := tcm["index"].(float64); ok {
-					foundToolIndex = true
-				}
-			}
-		}
-	}
-	if !foundToolIndex {
-		t.Fatalf("expected stream tool_calls item with index, body=%s", rec.Body.String())
-	}
-	if streamHasRawToolJSONContent(frames) {
-		t.Fatalf("raw tool_calls JSON leaked in content delta: %s", rec.Body.String())
-	}
-	if streamFinishReason(frames) != "tool_calls" {
-		t.Fatalf("expected finish_reason=tool_calls, body=%s", rec.Body.String())
-	}
-
-	hasThinkingDelta := false
-	for _, frame := range frames {
-		choices, _ := frame["choices"].([]any)
-		for _, item := range choices {
-			choice, _ := item.(map[string]any)
-			delta, _ := choice["delta"].(map[string]any)
-			if _, ok := delta["reasoning_content"]; ok {
-				hasThinkingDelta = true
-			}
-		}
-	}
-	if !hasThinkingDelta {
-		t.Fatalf("expected reasoning_content delta in reasoner stream: %s", rec.Body.String())
-	}
-}
-
-func TestHandleStreamUnknownToolEmitsToolCall(t *testing.T) {
-	h := &Handler{}
-	resp := makeSSEHTTPResponse(
-		`data: {"p":"response/content","v":"{\"tool_calls\":[{\"name\":\"not_in_schema\",\"input\":{\"q\":\"go\"}}]}"}`,
-		`data: [DONE]`,
-	)
-	rec := httptest.NewRecorder()
-	req := httptest.NewRequest(http.MethodPost, "/v1/chat/completions", nil)
-
-	h.handleStream(rec, req, resp, "cid5", "deepseek-chat", "prompt", false, false, []string{"search"})
-
-	frames, done := parseSSEDataFrames(t, rec.Body.String())
-	if !done {
-		t.Fatalf("expected [DONE], body=%s", rec.Body.String())
-	}
-	if !streamHasToolCallsDelta(frames) {
-		t.Fatalf("expected tool_calls delta for unknown schema name, body=%s", rec.Body.String())
-	}
-	if streamHasRawToolJSONContent(frames) {
-		t.Fatalf("did not expect raw tool_calls json leak for unknown schema name: %s", rec.Body.String())
-	}
-	if streamFinishReason(frames) != "tool_calls" {
-		t.Fatalf("expected finish_reason=tool_calls, body=%s", rec.Body.String())
-	}
-}
-
-func TestHandleStreamUnknownToolNoArgsEmitsToolCall(t *testing.T) {
-	h := &Handler{}
-	resp := makeSSEHTTPResponse(
-		`data: {"p":"response/content","v":"{\"tool_calls\":[{\"name\":\"not_in_schema\"}]}"}`,
-		`data: [DONE]`,
-	)
-	rec := httptest.NewRecorder()
-	req := httptest.NewRequest(http.MethodPost, "/v1/chat/completions", nil)
-
-	h.handleStream(rec, req, resp, "cid5b", "deepseek-chat", "prompt", false, false, []string{"search"})
-
-	frames, done := parseSSEDataFrames(t, rec.Body.String())
-	if !done {
-		t.Fatalf("expected [DONE], body=%s", rec.Body.String())
-	}
-	if !streamHasToolCallsDelta(frames) {
-		t.Fatalf("expected tool_calls delta for unknown schema name (no args), body=%s", rec.Body.String())
-	}
-	if streamHasRawToolJSONContent(frames) {
-		t.Fatalf("did not expect raw tool_calls json leak for unknown schema name (no args): %s", rec.Body.String())
-	}
-	if streamFinishReason(frames) != "tool_calls" {
-		t.Fatalf("expected finish_reason=tool_calls, body=%s", rec.Body.String())
+	out := decodeJSONBody(t, rec.Body.String())
+	errObj, _ := out["error"].(map[string]any)
+	if asString(errObj["code"]) != "upstream_empty_output" {
+		t.Fatalf("expected code=upstream_empty_output, got %#v", out)
 	}
 }

@@ -538,287 +181,6 @@ func TestHandleStreamToolsPlainTextStreamsBeforeFinish(t *testing.T) {
 	}
 }

-func TestHandleStreamToolCallMixedWithPlainTextSegments(t *testing.T) {
-	h := &Handler{}
-	resp := makeSSEHTTPResponse(
-		`data: {"p":"response/content","v":"下面是示例："}`,
-		`data: {"p":"response/content","v":"{\"tool_calls\":[{\"name\":\"search\",\"input\":{\"q\":\"go\"}}]}"}`,
-		`data: {"p":"response/content","v":"请勿执行。"}`,
-		`data: [DONE]`,
-	)
-	rec := httptest.NewRecorder()
-	req := httptest.NewRequest(http.MethodPost, "/v1/chat/completions", nil)
-
-	h.handleStream(rec, req, resp, "cid7", "deepseek-chat", "prompt", false, false, []string{"search"})
-
-	frames, done := parseSSEDataFrames(t, rec.Body.String())
-	if !done {
-		t.Fatalf("expected [DONE], body=%s", rec.Body.String())
-	}
-	if !streamHasToolCallsDelta(frames) {
-		t.Fatalf("expected tool_calls delta in mixed prose stream, body=%s", rec.Body.String())
-	}
-	content := strings.Builder{}
-	for _, frame := range frames {
-		choices, _ := frame["choices"].([]any)
-		for _, item := range choices {
-			choice, _ := item.(map[string]any)
-			delta, _ := choice["delta"].(map[string]any)
-			if c, ok := delta["content"].(string); ok {
-				content.WriteString(c)
-			}
-		}
-	}
-	got := content.String()
-	if !strings.Contains(got, "下面是示例：") || !strings.Contains(got, "请勿执行。") {
-		t.Fatalf("expected pre/post plain text to pass sieve, got=%q", got)
-	}
-	if streamFinishReason(frames) != "tool_calls" {
-		t.Fatalf("expected finish_reason=tool_calls for mixed prose, body=%s", rec.Body.String())
-	}
-}
-
-func TestHandleStreamToolCallAfterLeadingTextRemainsText(t *testing.T) {
-	h := &Handler{}
-	resp := makeSSEHTTPResponse(
-		`data: {"p":"response/content","v":"我将调用工具。"}`,
-		`data: {"p":"response/content","v":"{\"tool_calls\":[{\"name\":\"search\",\"input\":{\"q\":\"go\"}}]}"}`,
-		`data: [DONE]`,
-	)
-	rec := httptest.NewRecorder()
-	req := httptest.NewRequest(http.MethodPost, "/v1/chat/completions", nil)
-
-	h.handleStream(rec, req, resp, "cid7b", "deepseek-chat", "prompt", false, false, []string{"search"})
-
-	frames, done := parseSSEDataFrames(t, rec.Body.String())
-	if !done {
-		t.Fatalf("expected [DONE], body=%s", rec.Body.String())
-	}
-	if !streamHasToolCallsDelta(frames) {
-		t.Fatalf("expected tool_calls delta, body=%s", rec.Body.String())
-	}
-	content := strings.Builder{}
-	for _, frame := range frames {
-		choices, _ := frame["choices"].([]any)
-		for _, item := range choices {
-			choice, _ := item.(map[string]any)
-			delta, _ := choice["delta"].(map[string]any)
-			if c, ok := delta["content"].(string); ok {
-				content.WriteString(c)
-			}
-		}
-	}
-	got := content.String()
-	if !strings.Contains(got, "我将调用工具。") {
-		t.Fatalf("expected leading text to keep streaming, got=%q", got)
-	}
-
-	if streamFinishReason(frames) != "tool_calls" {
-		t.Fatalf("expected finish_reason=tool_calls, body=%s", rec.Body.String())
-	}
-}
-
-func TestHandleStreamToolCallWithSameChunkTrailingTextRemainsText(t *testing.T) {
-	h := &Handler{}
-	resp := makeSSEHTTPResponse(
-		`data: {"p":"response/content","v":"{\"tool_calls\":[{\"name\":\"search\",\"input\":{\"q\":\"go\"}}]}接下来我会继续说明。"}`,
-		`data: [DONE]`,
-	)
-	rec := httptest.NewRecorder()
-	req := httptest.NewRequest(http.MethodPost, "/v1/chat/completions", nil)
-
-	h.handleStream(rec, req, resp, "cid7c", "deepseek-chat", "prompt", false, false, []string{"search"})
-
-	frames, done := parseSSEDataFrames(t, rec.Body.String())
-	if !done {
-		t.Fatalf("expected [DONE], body=%s", rec.Body.String())
-	}
-	if !streamHasToolCallsDelta(frames) {
-		t.Fatalf("expected tool_calls delta, body=%s", rec.Body.String())
-	}
-	content := strings.Builder{}
-	for _, frame := range frames {
-		choices, _ := frame["choices"].([]any)
-		for _, item := range choices {
-			choice, _ := item.(map[string]any)
-			delta, _ := choice["delta"].(map[string]any)
-			if c, ok := delta["content"].(string); ok {
-				content.WriteString(c)
-			}
-		}
-	}
-	got := content.String()
-	if !strings.Contains(got, "接下来我会继续说明。") {
-		t.Fatalf("expected trailing plain text to be preserved, got=%q", got)
-	}
-
-	if streamFinishReason(frames) != "tool_calls" {
-		t.Fatalf("expected finish_reason=tool_calls, body=%s", rec.Body.String())
-	}
-}
-
-func TestHandleStreamFencedToolCallSnippetPromotesToolCall(t *testing.T) {
-	h := &Handler{}
-	resp := makeSSEHTTPResponse(
-		fmt.Sprintf(`data: {"p":"response/content","v":%q}`, "下面是调用示例：\n```json\n"),
-		fmt.Sprintf(`data: {"p":"response/content","v":%q}`, "{\"tool_calls\":[{\"name\":\"search\",\"input\":{\"q\":\"go\"}}]}\n```\n仅示例，不要执行。"),
-		`data: [DONE]`,
-	)
-	rec := httptest.NewRecorder()
-	req := httptest.NewRequest(http.MethodPost, "/v1/chat/completions", nil)
-
-	h.handleStream(rec, req, resp, "cid7f", "deepseek-chat", "prompt", false, false, []string{"search"})
-
-	frames, done := parseSSEDataFrames(t, rec.Body.String())
-	if !done {
-		t.Fatalf("expected [DONE], body=%s", rec.Body.String())
-	}
-	if !streamHasToolCallsDelta(frames) {
-		t.Fatalf("expected tool_calls delta for fenced snippet, body=%s", rec.Body.String())
-	}
-	content := strings.Builder{}
-	for _, frame := range frames {
-		choices, _ := frame["choices"].([]any)
-		for _, item := range choices {
-			choice, _ := item.(map[string]any)
-			delta, _ := choice["delta"].(map[string]any)
-			if c, ok := delta["content"].(string); ok {
-				content.WriteString(c)
-			}
-		}
-	}
-	got := content.String()
-	if strings.Contains(strings.ToLower(got), "tool_calls") {
-		t.Fatalf("expected raw fenced tool_calls snippet stripped from content, got=%q", got)
-	}
-	if strings.Contains(strings.ToLower(got), "```json") || strings.Contains(got, "\n```\n") {
-		t.Fatalf("expected consumed fenced tool payload to not leave empty code fence, got=%q", got)
-	}
-	if streamFinishReason(frames) != "tool_calls" {
-		t.Fatalf("expected finish_reason=tool_calls, body=%s", rec.Body.String())
-	}
-}
-
-func TestHandleStreamStandaloneToolCallAfterClosedFenceKeepsFence(t *testing.T) {
-	h := &Handler{}
-	resp := makeSSEHTTPResponse(
-		fmt.Sprintf(`data: {"p":"response/content","v":%q}`, "先给一个代码示例：\n```text\nhello\n```\n"),
-		fmt.Sprintf(`data: {"p":"response/content","v":%q}`, "{\"tool_calls\":[{\"name\":\"search\",\"input\":{\"q\":\"go\"}}]}"),
-		`data: [DONE]`,
-	)
-	rec := httptest.NewRecorder()
-	req := httptest.NewRequest(http.MethodPost, "/v1/chat/completions", nil)
-
-	h.handleStream(rec, req, resp, "cid7g", "deepseek-chat", "prompt", false, false, []string{"search"})
-
-	frames, done := parseSSEDataFrames(t, rec.Body.String())
-	if !done {
-		t.Fatalf("expected [DONE], body=%s", rec.Body.String())
-	}
-	if !streamHasToolCallsDelta(frames) {
-		t.Fatalf("expected tool_calls delta for standalone payload, body=%s", rec.Body.String())
-	}
-	content := strings.Builder{}
-	for _, frame := range frames {
-		choices, _ := frame["choices"].([]any)
-		for _, item := range choices {
-			choice, _ := item.(map[string]any)
-			delta, _ := choice["delta"].(map[string]any)
-			if c, ok := delta["content"].(string); ok {
-				content.WriteString(c)
-			}
-		}
-	}
-	got := content.String()
-	if !strings.Contains(got, "```") {
-		t.Fatalf("expected closed fence before standalone tool json to be preserved, got=%q", got)
-	}
-	if streamFinishReason(frames) != "tool_calls" {
-		t.Fatalf("expected finish_reason=tool_calls, body=%s", rec.Body.String())
-	}
-}
-
-func TestHandleStreamToolCallKeyAppearsLateRemainsText(t *testing.T) {
-	h := &Handler{}
-	spaces := strings.Repeat(" ", 200)
-	resp := makeSSEHTTPResponse(
-		`data: {"p":"response/content","v":"{`+spaces+`"}`,
-		`data: {"p":"response/content","v":"\"tool_calls\":[{\"name\":\"search\",\"input\":{\"q\":\"go\"}}]}"}`,
-		`data: {"p":"response/content","v":"后置正文C。"}`,
-		`data: [DONE]`,
-	)
-	rec := httptest.NewRecorder()
-	req := httptest.NewRequest(http.MethodPost, "/v1/chat/completions", nil)
-
-	h.handleStream(rec, req, resp, "cid8", "deepseek-chat", "prompt", false, false, []string{"search"})
-
-	frames, done := parseSSEDataFrames(t, rec.Body.String())
-	if !done {
-		t.Fatalf("expected [DONE], body=%s", rec.Body.String())
-	}
-	if !streamHasToolCallsDelta(frames) {
-		t.Fatalf("expected tool_calls delta, body=%s", rec.Body.String())
-	}
-	content := strings.Builder{}
-	for _, frame := range frames {
-		choices, _ := frame["choices"].([]any)
-		for _, item := range choices {
-			choice, _ := item.(map[string]any)
-			delta, _ := choice["delta"].(map[string]any)
-			if c, ok := delta["content"].(string); ok {
-				content.WriteString(c)
-			}
-		}
-	}
-	got := content.String()
-	if !strings.Contains(got, "后置正文C。") {
-		t.Fatalf("expected stream to continue after tool json convergence, got=%q", got)
-	}
-	if streamFinishReason(frames) != "tool_calls" {
-		t.Fatalf("expected finish_reason=tool_calls, body=%s", rec.Body.String())
-	}
-}
-
-func TestHandleStreamInvalidToolJSONDoesNotLeakRawObject(t *testing.T) {
-	h := &Handler{}
-	resp := makeSSEHTTPResponse(
-		`data: {"p":"response/content","v":"前置正文D。"}`,
-		`data: {"p":"response/content","v":"{'tool_calls':[{'name':'search','input':{'q':'go'}}]}"}`,
-		`data: {"p":"response/content","v":"后置正文E。"}`,
-		`data: [DONE]`,
-	)
-	rec := httptest.NewRecorder()
-	req := httptest.NewRequest(http.MethodPost, "/v1/chat/completions", nil)
-
-	h.handleStream(rec, req, resp, "cid9", "deepseek-chat", "prompt", false, false, []string{"search"})
-
-	frames, done := parseSSEDataFrames(t, rec.Body.String())
-	if !done {
-		t.Fatalf("expected [DONE], body=%s", rec.Body.String())
-	}
-	if streamHasToolCallsDelta(frames) {
-		t.Fatalf("did not expect tool_calls delta for invalid json, body=%s", rec.Body.String())
-	}
-	content := strings.Builder{}
-	for _, frame := range frames {
-		choices, _ := frame["choices"].([]any)
-		for _, item := range choices {
-			choice, _ := item.(map[string]any)
-			delta, _ := choice["delta"].(map[string]any)
-			if c, ok := delta["content"].(string); ok {
-				content.WriteString(c)
-			}
-		}
-	}
-	got := content.String()
-	if !strings.Contains(got, "前置正文D。") || !strings.Contains(got, "后置正文E。") {
-		t.Fatalf("expected pre/post plain text to remain, got=%q", content.String())
-	}
-	if !strings.Contains(strings.ToLower(got), "tool_calls") {
-		t.Fatalf("expected invalid embedded tool-like json to pass through as text, got=%q", got)
-	}
-}
-
 func TestHandleStreamIncompleteCapturedToolJSONFlushesAsTextOnFinalize(t *testing.T) {
 	h := &Handler{}
 	resp := makeSSEHTTPResponse(
@@ -852,108 +214,3 @@ func TestHandleStreamIncompleteCapturedToolJSONFlushesAsTextOnFinalize(t *testin
 		t.Fatalf("expected incomplete capture to flush as plain text instead of stalling, got=%q", content.String())
 	}
 }
-
-func TestHandleStreamToolCallArgumentsEmitAsSingleCompletedChunk(t *testing.T) {
-	h := &Handler{}
-	resp := makeSSEHTTPResponse(
-		`data: {"p":"response/content","v":"{\"tool_calls\":[{\"name\":\"search\",\"input\":{\"q\":\"go"}`,
-		`data: {"p":"response/content","v":"lang\",\"page\":1}}]}"}`,
-		`data: [DONE]`,
-	)
-	rec := httptest.NewRecorder()
-	req := httptest.NewRequest(http.MethodPost, "/v1/chat/completions", nil)
-
-	h.handleStream(rec, req, resp, "cid11", "deepseek-chat", "prompt", false, false, []string{"search"})
-
-	frames, done := parseSSEDataFrames(t, rec.Body.String())
-	if !done {
-		t.Fatalf("expected [DONE], body=%s", rec.Body.String())
-	}
-	if !streamHasToolCallsDelta(frames) {
-		t.Fatalf("expected tool_calls delta, body=%s", rec.Body.String())
-	}
-	if streamHasRawToolJSONContent(frames) {
-		t.Fatalf("raw tool_calls JSON leaked in content delta: %s", rec.Body.String())
-	}
-	argChunks := streamToolCallArgumentChunks(frames)
-	if len(argChunks) == 0 {
-		t.Fatalf("expected tool call arguments chunk, got=%v body=%s", argChunks, rec.Body.String())
-	}
-	joined := strings.Join(argChunks, "")
-	if !strings.Contains(joined, `"q":"golang"`) || !strings.Contains(joined, `"page":1`) {
-		t.Fatalf("unexpected merged arguments stream: %q", joined)
-	}
-	if streamFinishReason(frames) != "tool_calls" {
-		t.Fatalf("expected finish_reason=tool_calls, body=%s", rec.Body.String())
-	}
-}
-
-func TestHandleStreamMultiToolCallDoesNotMergeNamesOrArguments(t *testing.T) {
-	h := &Handler{}
-	resp := makeSSEHTTPResponse(
-		`data: {"p":"response/content","v":"{\"tool_calls\":[{\"name\":\"search_web\",\"input\":{\"query\":\"latest ai news\"}},{"}`,
-		`data: {"p":"response/content","v":"\"name\":\"eval_javascript\",\"input\":{\"code\":\"1+1\"}}]}"}`,
-		`data: [DONE]`,
-	)
-	rec := httptest.NewRecorder()
-	req := httptest.NewRequest(http.MethodPost, "/v1/chat/completions", nil)
-
-	h.handleStream(rec, req, resp, "cid12", "deepseek-chat", "prompt", false, false, []string{"search_web", "eval_javascript"})
-
-	frames, done := parseSSEDataFrames(t, rec.Body.String())
-	if !done {
-		t.Fatalf("expected [DONE], body=%s", rec.Body.String())
-	}
-	if !streamHasToolCallsDelta(frames) {
-		t.Fatalf("expected tool_calls delta, body=%s", rec.Body.String())
-	}
-
-	foundSearch := false
-	foundEval := false
-	foundIndex1 := false
-	toolCallsDeltaLens := make([]int, 0, 2)
-	for _, frame := range frames {
-		choices, _ := frame["choices"].([]any)
-		for _, item := range choices {
-			choice, _ := item.(map[string]any)
-			delta, _ := choice["delta"].(map[string]any)
-			rawToolCalls, hasToolCalls := delta["tool_calls"]
-			if !hasToolCalls {
-				continue
-			}
-			toolCalls, _ := rawToolCalls.([]any)
-			toolCallsDeltaLens = append(toolCallsDeltaLens, len(toolCalls))
-			for _, tc := range toolCalls {
-				tcm, _ := tc.(map[string]any)
-				if idx, ok := tcm["index"].(float64); ok && int(idx) == 1 {
-					foundIndex1 = true
-				}
-				fn, _ := tcm["function"].(map[string]any)
-				name, _ := fn["name"].(string)
-				switch name {
-				case "search_web":
-					foundSearch = true
-				case "eval_javascript":
-					foundEval = true
-				case "search_webeval_javascript":
-					t.Fatalf("unexpected merged tool name: %s, body=%s", name, rec.Body.String())
-				}
-				if args, ok := fn["arguments"].(string); ok && strings.Contains(args, `}{"`) {
-					t.Fatalf("unexpected concatenated tool arguments: %q, body=%s", args, rec.Body.String())
-				}
-			}
-		}
-	}
-	if !foundSearch || !foundEval {
-		t.Fatalf("expected both tool names in stream deltas, foundSearch=%v foundEval=%v body=%s", foundSearch, foundEval, rec.Body.String())
-	}
-	if len(toolCallsDeltaLens) != 1 || toolCallsDeltaLens[0] != 2 {
-		t.Fatalf("expected exactly one tool_calls delta with two calls, got lens=%v body=%s", toolCallsDeltaLens, rec.Body.String())
-	}
-	if !foundIndex1 {
-		t.Fatalf("expected second tool call index in stream deltas, body=%s", rec.Body.String())
-	}
-	if streamFinishReason(frames) != "tool_calls" {
-		t.Fatalf("expected finish_reason=tool_calls, body=%s", rec.Body.String())
-	}
-}
--- a/internal/adapter/openai/leaked_output_sanitize.go
+++ b/internal/adapter/openai/leaked_output_sanitize.go
@@ -2,13 +2,21 @@ package openai

 import (
 	"regexp"
+	"strings"
 )

 var emptyJSONFencePattern = regexp.MustCompile("(?is)```json\\s*```")
 var leakedToolCallArrayPattern = regexp.MustCompile(`(?is)\[\{\s*"function"\s*:\s*\{[\s\S]*?\}\s*,\s*"id"\s*:\s*"call[^"]*"\s*,\s*"type"\s*:\s*"function"\s*}\]`)
 var leakedToolResultBlobPattern = regexp.MustCompile(`(?is)<\s*\|\s*tool\s*\|\s*>\s*\{[\s\S]*?"tool_call_id"\s*:\s*"call[^"]*"\s*}`)

-// leakedMetaMarkerPattern matches DeepSeek special tokens in BOTH forms:
+var leakedThinkTagPattern = regexp.MustCompile(`(?is)</?\s*think\s*>`)
+
+// leakedBOSMarkerPattern matches DeepSeek BOS markers in BOTH forms:
+//   - ASCII underscore: <｜begin_of_sentence｜>
+//   - U+2581 variant:   <｜begin▁of▁sentence｜>
+var leakedBOSMarkerPattern = regexp.MustCompile(`(?i)<[｜\|]\s*begin[_▁]of[_▁]sentence\s*[｜\|]>`)
+
+// leakedMetaMarkerPattern matches the remaining DeepSeek special tokens in BOTH forms:
 //   - ASCII underscore: <｜end_of_sentence｜>, <｜end_of_toolresults｜>, <｜end_of_instructions｜>
 //   - U+2581 variant:   <｜end▁of▁sentence｜>, <｜end▁of▁toolresults｜>, <｜end▁of▁instructions｜>
 var leakedMetaMarkerPattern = regexp.MustCompile(`(?i)<[｜\|]\s*(?:assistant|tool|end[_▁]of[_▁]sentence|end[_▁]of[_▁]thinking|end[_▁]of[_▁]toolresults|end[_▁]of[_▁]instructions)\s*[｜\|]>`)
@@ -35,11 +43,48 @@ func sanitizeLeakedOutput(text string) string {
 	out := emptyJSONFencePattern.ReplaceAllString(text, "")
 	out = leakedToolCallArrayPattern.ReplaceAllString(out, "")
 	out = leakedToolResultBlobPattern.ReplaceAllString(out, "")
+	out = stripDanglingThinkSuffix(out)
+	out = leakedThinkTagPattern.ReplaceAllString(out, "")
+	out = leakedBOSMarkerPattern.ReplaceAllString(out, "")
 	out = leakedMetaMarkerPattern.ReplaceAllString(out, "")
 	out = sanitizeLeakedAgentXMLBlocks(out)
 	return out
 }

+func stripDanglingThinkSuffix(text string) string {
+	matches := leakedThinkTagPattern.FindAllStringIndex(text, -1)
+	if len(matches) == 0 {
+		return text
+	}
+	depth := 0
+	lastOpen := -1
+	for _, loc := range matches {
+		tag := strings.ToLower(text[loc[0]:loc[1]])
+		compact := strings.ReplaceAll(strings.ReplaceAll(strings.TrimSpace(tag), " ", ""), "\t", "")
+		if strings.HasPrefix(compact, "</") {
+			if depth > 0 {
+				depth--
+				if depth == 0 {
+					lastOpen = -1
+				}
+			}
+			continue
+		}
+		if depth == 0 {
+			lastOpen = loc[0]
+		}
+		depth++
+	}
+	if depth == 0 || lastOpen < 0 {
+		return text
+	}
+	prefix := text[:lastOpen]
+	if strings.TrimSpace(prefix) == "" {
+		return ""
+	}
+	return prefix
+}
+
 func sanitizeLeakedAgentXMLBlocks(text string) string {
 	out := text
 	for _, pattern := range leakedAgentXMLBlockPatterns {
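
Review note: the interaction between `stripDanglingThinkSuffix` and `leakedThinkTagPattern` in the hunk above is easier to follow on concrete inputs. The sketch below is a standalone approximation, not the repository code: the names `thinkTag`, `stripDangling`, and `sanitizeThink` are hypothetical, and only the regular expression and the depth-tracking logic are taken from the diff. It assumes the two steps run in the same order as in `sanitizeLeakedOutput` (dangling suffix first, then tag removal).

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// Same expression as leakedThinkTagPattern in the diff.
var thinkTag = regexp.MustCompile(`(?is)</?\s*think\s*>`)

// stripDangling (hypothetical name) drops everything from the first
// unclosed <think> onward, mirroring the depth counter in the diff.
func stripDangling(text string) string {
	depth, lastOpen := 0, -1
	for _, loc := range thinkTag.FindAllStringIndex(text, -1) {
		tag := text[loc[0]:loc[1]]
		if strings.HasPrefix(tag, "</") { // closing tag
			if depth > 0 {
				depth--
				if depth == 0 {
					lastOpen = -1
				}
			}
			continue
		}
		if depth == 0 {
			lastOpen = loc[0] // remember the outermost open tag
		}
		depth++
	}
	if depth == 0 || lastOpen < 0 {
		return text // every opened block was closed
	}
	prefix := text[:lastOpen]
	if strings.TrimSpace(prefix) == "" {
		return ""
	}
	return prefix
}

// sanitizeThink (hypothetical name) applies both steps in diff order.
func sanitizeThink(text string) string {
	return thinkTag.ReplaceAllString(stripDangling(text), "")
}

func main() {
	inputs := []string{
		"Answer prefix<think>internal reasoning that never closes",
		"A<think>B</think>C",
		"<think>only reasoning",
		"x</think>y",
	}
	for _, in := range inputs {
		fmt.Printf("%q -> %q\n", in, sanitizeThink(in))
	}
}
```

Under these assumptions a balanced block keeps its inner text and loses only the tags, while an unclosed block is cut at the opening tag. A stray closing tag is simply removed.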
--- a/internal/adapter/openai/leaked_output_sanitize_test.go
+++ b/internal/adapter/openai/leaked_output_sanitize_test.go
@@ -26,6 +26,22 @@ func TestSanitizeLeakedOutputRemovesStandaloneMetaMarkers(t *testing.T) {
 	}
 }

+func TestSanitizeLeakedOutputRemovesThinkAndBosMarkers(t *testing.T) {
+	raw := "A<think>B</think>C<｜begin▁of▁sentence｜>D<| begin_of_sentence |>E<｜begin_of_sentence｜>F"
+	got := sanitizeLeakedOutput(raw)
+	if got != "ABCDEF" {
+		t.Fatalf("unexpected sanitize result for think/BOS markers: %q", got)
+	}
+}
+
+func TestSanitizeLeakedOutputRemovesDanglingThinkBlock(t *testing.T) {
+	raw := "Answer prefix<think>internal reasoning that never closes"
+	got := sanitizeLeakedOutput(raw)
+	if got != "Answer prefix" {
+		t.Fatalf("unexpected sanitize result for dangling think block: %q", got)
+	}
+}
+
 func TestSanitizeLeakedOutputRemovesAgentXMLLeaks(t *testing.T) {
 	raw := "Done.<attempt_completion><result>Some final answer</result></attempt_completion>"
 	got := sanitizeLeakedOutput(raw)
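
Review note: the new BOS test covers three spellings in one string. A table-driven check along the following lines might make the accepted variants more explicit. This is only a sketch: `bosMarker` and `stripBOS` are hypothetical names, and the expression is copied from `leakedBOSMarkerPattern` in the diff rather than imported from the package.

```go
package main

import (
	"fmt"
	"regexp"
)

// Copied from leakedBOSMarkerPattern in the diff: fullwidth or ASCII bar,
// underscore or U+2581 separators, optional inner whitespace, any case.
var bosMarker = regexp.MustCompile(`(?i)<[｜\|]\s*begin[_▁]of[_▁]sentence\s*[｜\|]>`)

// stripBOS (hypothetical helper) removes every BOS marker variant.
func stripBOS(s string) string {
	return bosMarker.ReplaceAllString(s, "")
}

func main() {
	cases := []string{
		"C<｜begin▁of▁sentence｜>D",    // U+2581 separators, fullwidth bars
		"D<| begin_of_sentence |>E",  // ASCII bars with inner spaces
		"E<｜begin_of_sentence｜>F",    // underscores, fullwidth bars
		"<|BEGIN_OF_SENTENCE|>upper", // case-insensitive match
		"<|end_of_sentence|>kept",    // not a BOS marker, left untouched here
	}
	for _, in := range cases {
		fmt.Printf("%q -> %q\n", in, stripBOS(in))
	}
}
```

The last case is left alone by this pattern on purpose. In the diff, end-of-sentence markers are handled by the separate `leakedMetaMarkerPattern`.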
--- a/internal/adapter/openai/prompt_build.go
+++ b/internal/adapter/openai/prompt_build.go
@@ -5,22 +5,22 @@ import (
 	"ds2api/internal/util"
 )

-func buildOpenAIFinalPrompt(messagesRaw []any, toolsRaw any, traceID string) (string, []string) {
-	return buildOpenAIFinalPromptWithPolicy(messagesRaw, toolsRaw, traceID, util.DefaultToolChoicePolicy())
+func buildOpenAIFinalPrompt(messagesRaw []any, toolsRaw any, traceID string, thinkingEnabled bool) (string, []string) {
+	return buildOpenAIFinalPromptWithPolicy(messagesRaw, toolsRaw, traceID, util.DefaultToolChoicePolicy(), thinkingEnabled)
 }

-func buildOpenAIFinalPromptWithPolicy(messagesRaw []any, toolsRaw any, traceID string, toolPolicy util.ToolChoicePolicy) (string, []string) {
+func buildOpenAIFinalPromptWithPolicy(messagesRaw []any, toolsRaw any, traceID string, toolPolicy util.ToolChoicePolicy, thinkingEnabled bool) (string, []string) {
 	messages := normalizeOpenAIMessagesForPrompt(messagesRaw, traceID)
 	toolNames := []string{}
 	if tools, ok := toolsRaw.([]any); ok && len(tools) > 0 {
 		messages, toolNames = injectToolPrompt(messages, tools, toolPolicy)
 	}
-	return deepseek.MessagesPrepare(messages), toolNames
+	return deepseek.MessagesPrepareWithThinking(messages, thinkingEnabled), toolNames
 }

 // BuildPromptForAdapter exposes the OpenAI-compatible prompt building flow so
 // other protocol adapters (for example Gemini) can reuse the same tool/history
 // normalization logic and remain behavior-compatible with chat/completions.
-func BuildPromptForAdapter(messagesRaw []any, toolsRaw any, traceID string) (string, []string) {
-	return buildOpenAIFinalPrompt(messagesRaw, toolsRaw, traceID)
+func BuildPromptForAdapter(messagesRaw []any, toolsRaw any, traceID string, thinkingEnabled bool) (string, []string) {
+	return buildOpenAIFinalPrompt(messagesRaw, toolsRaw, traceID, thinkingEnabled)
 }
--- a/internal/adapter/openai/prompt_build_test.go
+++ b/internal/adapter/openai/prompt_build_test.go
@@ -40,7 +40,7 @@ func TestBuildOpenAIFinalPrompt_HandlerPathIncludesToolRoundtripSemantics(t *tes
 		},
 	}

-	finalPrompt, toolNames := buildOpenAIFinalPrompt(messages, tools, "")
+	finalPrompt, toolNames := buildOpenAIFinalPrompt(messages, tools, "", false)
 	if len(toolNames) != 1 || toolNames[0] != "get_weather" {
 		t.Fatalf("unexpected tool names: %#v", toolNames)
 	}
@@ -73,8 +73,8 @@ func TestBuildOpenAIFinalPrompt_VercelPreparePathKeepsFinalAnswerInstruction(t *
 		},
 	}

-	finalPrompt, _ := buildOpenAIFinalPrompt(messages, tools, "")
-	if !strings.Contains(finalPrompt, "Remember: Output ONLY the <tool_calls>...</tool_calls> XML block when calling tools.") {
+	finalPrompt, _ := buildOpenAIFinalPrompt(messages, tools, "", false)
+	if !strings.Contains(finalPrompt, "Remember: The ONLY valid way to use tools is the <tool_calls> XML block at the end of your response.") {
 		t.Fatalf("vercel prepare finalPrompt missing final tool-call anchor instruction: %q", finalPrompt)
 	}
 	if !strings.Contains(finalPrompt, "TOOL CALL FORMAT") {
@@ -87,3 +87,17 @@ func TestBuildOpenAIFinalPrompt_VercelPreparePathKeepsFinalAnswerInstruction(t *
 		t.Fatalf("vercel prepare finalPrompt should not require fenced tool calls: %q", finalPrompt)
 	}
 }
+
+func TestBuildOpenAIFinalPromptWithThinkingAddsContinuationContract(t *testing.T) {
+	messages := []any{
+		map[string]any{"role": "user", "content": "继续回答上一个问题"},
+	}
+
+	finalPrompt, _ := buildOpenAIFinalPrompt(messages, nil, "", true)
+	if !strings.Contains(finalPrompt, "Continue the conversation from the full prior context") {
+		t.Fatalf("expected continuation contract in thinking prompt, got=%q", finalPrompt)
+	}
+	if !strings.Contains(finalPrompt, "final user-facing answer only in reasoning") {
+		t.Fatalf("expected visible-answer contract in thinking prompt, got=%q", finalPrompt)
+	}
+}
--- a/internal/adapter/openai/responses_embeddings_test.go
+++ b/internal/adapter/openai/responses_embeddings_test.go
@@ -156,6 +156,33 @@ func TestNormalizeResponsesInputAsMessagesFunctionCallItemPreservesConcatenatedA
 	}
 }

+func TestCollectOpenAIRefFileIDs(t *testing.T) {
+	got := collectOpenAIRefFileIDs(map[string]any{
+		"ref_file_ids": []any{"file-top", "file-dup"},
+		"attachments": []any{
+			map[string]any{"file_id": "file-attachment"},
+		},
+		"input": []any{
+			map[string]any{
+				"type": "message",
+				"content": []any{
+					map[string]any{"type": "input_file", "file_id": "file-input"},
+					map[string]any{"type": "input_file", "id": "file-dup"},
+				},
+			},
+		},
+	})
+	want := []string{"file-top", "file-dup", "file-attachment", "file-input"}
+	if len(got) != len(want) {
+		t.Fatalf("expected %d file ids, got %#v", len(want), got)
+	}
+	for i, id := range want {
+		if got[i] != id {
+			t.Fatalf("unexpected file ids at %d: got=%#v want=%#v", i, got, want)
+		}
+	}
+}
+
 func TestExtractEmbeddingInputs(t *testing.T) {
 	got := extractEmbeddingInputs([]any{"a", "b"})
 	if len(got) != 2 || got[0] != "a" || got[1] != "b" {
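
Review note: the implementation of `collectOpenAIRefFileIDs` is not part of this excerpt, so the following is only a guess at the contract that `TestCollectOpenAIRefFileIDs` pins down: IDs are gathered from `ref_file_ids`, then `attachments`, then `input_file` content parts, de-duplicated in first-seen order. The function name `collectRefFileIDs` and the decision to accept both `file_id` and `id` on content parts are assumptions inferred from the test fixture.

```go
package main

import (
	"fmt"
	"strings"
)

// collectRefFileIDs is a hypothetical stand-in for collectOpenAIRefFileIDs.
// It walks the three locations used by the test and keeps first-seen order.
func collectRefFileIDs(req map[string]any) []string {
	seen := map[string]struct{}{}
	out := []string{}
	add := func(v any) {
		id, _ := v.(string)
		id = strings.TrimSpace(id)
		if id == "" {
			return
		}
		if _, dup := seen[id]; dup {
			return
		}
		seen[id] = struct{}{}
		out = append(out, id)
	}

	// 1. Top-level ref_file_ids.
	if ids, ok := req["ref_file_ids"].([]any); ok {
		for _, id := range ids {
			add(id)
		}
	}
	// 2. attachments[].file_id
	if atts, ok := req["attachments"].([]any); ok {
		for _, a := range atts {
			if m, ok := a.(map[string]any); ok {
				add(m["file_id"])
			}
		}
	}
	// 3. input[].content[] parts of type input_file (file_id, else id).
	if items, ok := req["input"].([]any); ok {
		for _, it := range items {
			msg, _ := it.(map[string]any)
			parts, _ := msg["content"].([]any)
			for _, p := range parts {
				part, _ := p.(map[string]any)
				if part["type"] != "input_file" {
					continue
				}
				add(part["file_id"])
				add(part["id"])
			}
		}
	}
	return out
}

func main() {
	got := collectRefFileIDs(map[string]any{
		"ref_file_ids": []any{"file-top", "file-dup"},
		"attachments":  []any{map[string]any{"file_id": "file-attachment"}},
		"input": []any{map[string]any{
			"type": "message",
			"content": []any{
				map[string]any{"type": "input_file", "file_id": "file-input"},
				map[string]any{"type": "input_file", "id": "file-dup"},
			},
		}},
	})
	fmt.Println(got)
}
```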
--- a/internal/adapter/openai/responses_handler.go
+++ b/internal/adapter/openai/responses_handler.go
@@ -65,11 +65,20 @@ func (h *Handler) Responses(w http.ResponseWriter, r *http.Request) {
 		return
 	}

+	r.Body = http.MaxBytesReader(w, r.Body, openAIGeneralMaxSize)
 	var req map[string]any
 	if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
+		if strings.Contains(strings.ToLower(err.Error()), "too large") {
+			writeOpenAIError(w, http.StatusRequestEntityTooLarge, "request body too large")
+			return
+		}
 		writeOpenAIError(w, http.StatusBadRequest, "invalid json")
 		return
 	}
+	if err := h.preprocessInlineFileInputs(r.Context(), a, req); err != nil {
+		writeOpenAIInlineFileError(w, err)
+		return
+	}
 	traceID := requestTraceID(r)
 	stdReq, err := normalizeOpenAIResponsesRequest(h.Store, req, traceID)
 	if err != nil {
@@ -117,7 +126,7 @@ func (h *Handler) handleResponsesNonStream(w http.ResponseWriter, resp *http.Res
 	stripReferenceMarkers := h.compatStripReferenceMarkers()
 	sanitizedThinking := cleanVisibleOutput(result.Thinking, stripReferenceMarkers)
 	sanitizedText := cleanVisibleOutput(result.Text, stripReferenceMarkers)
-	if writeUpstreamEmptyOutputError(w, sanitizedThinking, sanitizedText, result.ContentFilter) {
+	if writeUpstreamEmptyOutputError(w, sanitizedText, result.ContentFilter) {
 		return
 	}
 	textParsed := toolcall.ParseStandaloneToolCallsDetailed(sanitizedText, toolNames)
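
Review note: the new size check in `Responses` detects an oversized body by searching the error text for "too large". Since Go 1.19, `http.MaxBytesReader` reports this condition as a `*http.MaxBytesError`, so a typed check with `errors.As` may be less dependent on the wording of the message. The sketch below is an illustration under assumed names: `maxBody`, `decodeLimited`, `statusFor`, and `oversized` are hypothetical, and the 64-byte limit merely stands in for `openAIGeneralMaxSize`, whose value is not shown in this excerpt.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// maxBody is a hypothetical limit used only for this demo.
const maxBody = 64

// oversized is a syntactically valid JSON body that exceeds maxBody.
var oversized = `{"input":"` + strings.Repeat("x", 200) + `"}`

// decodeLimited caps the body, decodes JSON, and maps errors to a status.
func decodeLimited(w http.ResponseWriter, r *http.Request, dst any) int {
	r.Body = http.MaxBytesReader(w, r.Body, maxBody)
	if err := json.NewDecoder(r.Body).Decode(dst); err != nil {
		var tooLarge *http.MaxBytesError // available since Go 1.19
		if errors.As(err, &tooLarge) {
			return http.StatusRequestEntityTooLarge
		}
		return http.StatusBadRequest
	}
	return http.StatusOK
}

// statusFor runs decodeLimited against an in-memory request.
func statusFor(body string) int {
	req := httptest.NewRequest(http.MethodPost, "/v1/responses", strings.NewReader(body))
	var payload map[string]any
	return decodeLimited(httptest.NewRecorder(), req, &payload)
}

func main() {
	fmt.Println(statusFor(`{"input":"hi"}`)) // small valid body
	fmt.Println(statusFor(`{"input":`))      // truncated JSON
	fmt.Println(statusFor(oversized))        // body over the limit
}
```

Whether the typed check is preferable depends on the minimum Go version the project targets. The string match in the diff also works on older toolchains.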
--- a/internal/adapter/openai/responses_stream_runtime_core.go
+++ b/internal/adapter/openai/responses_stream_runtime_core.go
@@ -99,6 +99,30 @@ func newResponsesStreamRuntime(
 	}
 }

+func (s *responsesStreamRuntime) failResponse(message, code string) {
+	s.failed = true
+	failedResp := map[string]any{
+		"id":          s.responseID,
+		"type":        "response",
+		"object":      "response",
+		"model":       s.model,
+		"status":      "failed",
+		"output":      []any{},
+		"output_text": "",
+		"error": map[string]any{
+			"message": message,
+			"type":    "invalid_request_error",
+			"code":    code,
+			"param":   nil,
+		},
+	}
+	if s.persistResponse != nil {
+		s.persistResponse(failedResp)
+	}
+	s.sendEvent("response.failed", openaifmt.BuildResponsesFailedPayload(s.responseID, s.model, message, code))
+	s.sendDone()
+}
+
 func (s *responsesStreamRuntime) finalize() {
 	finalThinking := s.thinking.String()
 	finalText := cleanVisibleOutput(s.text.String(), s.stripReferenceMarkers)
@@ -121,28 +145,16 @@ func (s *responsesStreamRuntime) finalize() {
 	s.closeMessageItem()

 	if s.toolChoice.IsRequired() && len(detected) == 0 {
-		s.failed = true
-		message := "tool_choice requires at least one valid tool call."
-		failedResp := map[string]any{
-			"id":          s.responseID,
-			"type":        "response",
-			"object":      "response",
-			"model":       s.model,
-			"status":      "failed",
-			"output":      []any{},
-			"output_text": "",
-			"error": map[string]any{
-				"message": message,
-				"type":    "invalid_request_error",
-				"code":    "tool_choice_violation",
-				"param":   nil,
-			},
+		s.failResponse("tool_choice requires at least one valid tool call.", "tool_choice_violation")
+		return
+	}
+	if len(detected) == 0 && strings.TrimSpace(finalText) == "" {
+		code := "upstream_empty_output"
+		message := "Upstream model returned empty output."
+		if finalThinking != "" {
+			message = "Upstream model returned reasoning without visible output."
 		}
-		if s.persistResponse != nil {
-			s.persistResponse(failedResp)
-		}
-		s.sendEvent("response.failed", openaifmt.BuildResponsesFailedPayload(s.responseID, s.model, message, "tool_choice_violation"))
-		s.sendDone()
+		s.failResponse(message, code)
 		return
 	}
 	s.closeIncompleteFunctionItems()
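
Review note: after this refactor, `finalize` has two early-exit failure paths that both go through `failResponse`. The decision order can be summarised as a pure function, sketched below. The `failure` type and `classifyFinal` are hypothetical names introduced for illustration. The codes and messages are copied from the diff, and the real method additionally persists the failed response and emits the `response.failed` event.

```go
package main

import (
	"fmt"
	"strings"
)

// failure is a hypothetical carrier for the code/message pair.
type failure struct {
	Code    string
	Message string
}

// classifyFinal mirrors the order of checks in finalize as shown in the diff:
// the tool_choice violation is tested first, then empty visible output.
func classifyFinal(toolRequired bool, detectedCalls int, finalText, finalThinking string) (failure, bool) {
	if toolRequired && detectedCalls == 0 {
		return failure{
			Code:    "tool_choice_violation",
			Message: "tool_choice requires at least one valid tool call.",
		}, true
	}
	if detectedCalls == 0 && strings.TrimSpace(finalText) == "" {
		msg := "Upstream model returned empty output."
		if finalThinking != "" {
			msg = "Upstream model returned reasoning without visible output."
		}
		return failure{Code: "upstream_empty_output", Message: msg}, true
	}
	return failure{}, false
}

func main() {
	f, failed := classifyFinal(false, 0, "  ", "some reasoning")
	fmt.Println(failed, f.Code, f.Message)

	_, failed = classifyFinal(false, 1, "", "")
	fmt.Println(failed) // a detected tool call counts as output
}
```

One consequence of this ordering is that a response containing only reasoning text is now reported as failed rather than completed, which matches the changed `writeUpstreamEmptyOutputError` call in the non-stream path.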
--- a/internal/adapter/openai/responses_stream_test.go
+++ b/internal/adapter/openai/responses_stream_test.go
@@ -12,149 +12,6 @@ import (
 	"ds2api/internal/util"
 )

-func TestHandleResponsesStreamToolCallsHideRawOutputTextInCompleted(t *testing.T) {
-	h := &Handler{}
-	req := httptest.NewRequest(http.MethodPost, "/v1/responses", nil)
-	rec := httptest.NewRecorder()
-
-	sseLine := func(v string) string {
-		b, _ := json.Marshal(map[string]any{
-			"p": "response/content",
-			"v": v,
-		})
-		return "data: " + string(b) + "\n"
-	}
-
-	rawToolJSON := `{"tool_calls":[{"name":"read_file","input":{"path":"README.MD"}}]}`
-	streamBody := sseLine(rawToolJSON) + "data: [DONE]\n"
-	resp := &http.Response{
-		StatusCode: http.StatusOK,
-		Body:       io.NopCloser(strings.NewReader(streamBody)),
-	}
-
-	h.handleResponsesStream(rec, req, resp, "owner-a", "resp_test", "deepseek-chat", "prompt", false, false, []string{"read_file"}, util.DefaultToolChoicePolicy(), "")
-
-	completed, ok := extractSSEEventPayload(rec.Body.String(), "response.completed")
-	if !ok {
-		t.Fatalf("expected response.completed event, body=%s", rec.Body.String())
-	}
-	responseObj, _ := completed["response"].(map[string]any)
-	outputText, _ := responseObj["output_text"].(string)
-	if outputText != "" {
-		t.Fatalf("expected empty output_text for tool_calls response, got output_text=%q", outputText)
-	}
-	output, _ := responseObj["output"].([]any)
-	if len(output) == 0 {
-		t.Fatalf("expected structured output entries, got %#v", responseObj["output"])
-	}
-	hasFunctionCall := false
-	hasLegacyWrapper := false
-	for _, item := range output {
-		m, _ := item.(map[string]any)
-		if m == nil {
-			continue
-		}
-		if m["type"] == "function_call" {
-			hasFunctionCall = true
-		}
-		if m["type"] == "tool_calls" {
-			hasLegacyWrapper = true
-		}
-	}
-	if !hasFunctionCall {
-		t.Fatalf("expected function_call item, got %#v", responseObj["output"])
-	}
-	if hasLegacyWrapper {
-		t.Fatalf("did not expect legacy tool_calls wrapper, got %#v", responseObj["output"])
-	}
-	if strings.Contains(outputText, `"tool_calls"`) {
-		t.Fatalf("raw tool_calls JSON leaked in output_text: %q", outputText)
-	}
-}
-
-func TestHandleResponsesStreamUsesOfficialOutputItemEvents(t *testing.T) {
-	h := &Handler{}
-	req := httptest.NewRequest(http.MethodPost, "/v1/responses", nil)
-	rec := httptest.NewRecorder()
-
-	sseLine := func(v string) string {
-		b, _ := json.Marshal(map[string]any{
-			"p": "response/content",
-			"v": v,
-		})
-		return "data: " + string(b) + "\n"
-	}
-
-	streamBody := sseLine(`{"tool_calls":[{"name":"read_file","input":{"path":"README.MD"}}]}`) + "data: [DONE]\n"
-	resp := &http.Response{
-		StatusCode: http.StatusOK,
-		Body:       io.NopCloser(strings.NewReader(streamBody)),
-	}
-
-	h.handleResponsesStream(rec, req, resp, "owner-a", "resp_test", "deepseek-chat", "prompt", false, false, []string{"read_file"}, util.DefaultToolChoicePolicy(), "")
-	body := rec.Body.String()
-	if !strings.Contains(body, "event: response.output_item.added") {
-		t.Fatalf("expected response.output_item.added event, body=%s", body)
-	}
-	if !strings.Contains(body, "event: response.output_item.done") {
-		t.Fatalf("expected response.output_item.done event, body=%s", body)
-	}
-	if !strings.Contains(body, "event: response.function_call_arguments.done") {
-		t.Fatalf("expected response.function_call_arguments.done event, body=%s", body)
-	}
-	if strings.Contains(body, "event: response.output_tool_call.delta") || strings.Contains(body, "event: response.output_tool_call.done") {
-		t.Fatalf("legacy response.output_tool_call.* event must not appear, body=%s", body)
-	}
-
-	addedPayloads := extractAllSSEEventPayloads(body, "response.output_item.added")
-	hasFunctionCallAdded := false
-	for _, payload := range addedPayloads {
-		item, _ := payload["item"].(map[string]any)
-		if item == nil || asString(item["type"]) != "function_call" {
-			continue
-		}
-		hasFunctionCallAdded = true
-		if asString(item["arguments"]) != "" {
-			t.Fatalf("expected in-progress function_call.arguments to start empty string, got %#v", item["arguments"])
-		}
-	}
-	if !hasFunctionCallAdded {
-		t.Fatalf("expected function_call output_item.added payload, body=%s", body)
-	}
-
-	donePayload, ok := extractSSEEventPayload(body, "response.function_call_arguments.done")
-	if !ok {
-		t.Fatalf("expected to parse response.function_call_arguments.done payload, body=%s", body)
-	}
-	doneCallID := strings.TrimSpace(asString(donePayload["call_id"]))
-	if doneCallID == "" {
-		t.Fatalf("expected non-empty call_id in done payload, payload=%#v", donePayload)
-	}
-	completed, ok := extractSSEEventPayload(body, "response.completed")
-	if !ok {
-		t.Fatalf("expected response.completed payload, body=%s", body)
-	}
-	responseObj, _ := completed["response"].(map[string]any)
-	output, _ := responseObj["output"].([]any)
-	var completedCallID string
-	for _, item := range output {
-		m, _ := item.(map[string]any)
-		if m == nil || m["type"] != "function_call" {
-			continue
-		}
-		completedCallID = strings.TrimSpace(asString(m["call_id"]))
-		if completedCallID != "" {
-			break
-		}
-	}
-	if completedCallID == "" {
-		t.Fatalf("expected function_call.call_id in completed output, output=%#v", output)
-	}
-	if completedCallID != doneCallID {
-		t.Fatalf("expected completed call_id to match stream done call_id, done=%q completed=%q", doneCallID, completedCallID)
-	}
-}
-
 func TestHandleResponsesStreamDoesNotEmitReasoningTextCompatEvents(t *testing.T) {
 	h := &Handler{}
 	req := httptest.NewRequest(http.MethodPost, "/v1/responses", nil)
@@ -181,51 +38,6 @@ func TestHandleResponsesStreamDoesNotEmitReasoningTextCompatEvents(t *testing.T)
 	}
 }

-func TestHandleResponsesStreamMultiToolCallKeepsNameAndCallIDAligned(t *testing.T) {
-	h := &Handler{}
-	req := httptest.NewRequest(http.MethodPost, "/v1/responses", nil)
-	rec := httptest.NewRecorder()
-
-	sseLine := func(v string) string {
-		b, _ := json.Marshal(map[string]any{
-			"p": "response/content",
-			"v": v,
-		})
-		return "data: " + string(b) + "\n"
-	}
-
-	streamBody := sseLine(`{"tool_calls":[{"name":"search_web","input":{"query":"latest ai news"}},`) +
-		sseLine(`{"name":"eval_javascript","input":{"code":"1+1"}}]}`) +
-		"data: [DONE]\n"
-	resp := &http.Response{
-		StatusCode: http.StatusOK,
-		Body:       io.NopCloser(strings.NewReader(streamBody)),
-	}
-
-	h.handleResponsesStream(rec, req, resp, "owner-a", "resp_test", "deepseek-chat", "prompt", false, false, []string{"search_web", "eval_javascript"}, util.DefaultToolChoicePolicy(), "")
-
-	body := rec.Body.String()
-	donePayloads := extractAllSSEEventPayloads(body, "response.function_call_arguments.done")
-	if len(donePayloads) != 2 {
-		t.Fatalf("expected two response.function_call_arguments.done events, got %d body=%s", len(donePayloads), body)
-	}
-	seenNames := map[string]string{}
-	for _, payload := range donePayloads {
-		name := strings.TrimSpace(asString(payload["name"]))
-		callID := strings.TrimSpace(asString(payload["call_id"]))
-		if name != "search_web" && name != "eval_javascript" {
-			t.Fatalf("unexpected tool name in done payload: %#v", payload)
-		}
-		if callID == "" {
-			t.Fatalf("expected non-empty call_id in done payload: %#v", payload)
-		}
-		seenNames[name] = callID
-	}
-	if seenNames["search_web"] == seenNames["eval_javascript"] {
-		t.Fatalf("expected distinct call_id per tool, got %#v", seenNames)
-	}
-}
-
 func TestHandleResponsesStreamEmitsOutputTextDoneBeforeContentPartDone(t *testing.T) {
 	h := &Handler{}
 	req := httptest.NewRequest(http.MethodPost, "/v1/responses", nil)
@@ -297,123 +109,6 @@ func TestHandleResponsesStreamOutputTextDeltaCarriesItemIndexes(t *testing.T) {
 	}
 }

-func TestHandleResponsesStreamThinkingAndMixedToolExampleEmitsFunctionCall(t *testing.T) {
-	h := &Handler{}
-	req := httptest.NewRequest(http.MethodPost, "/v1/responses", nil)
-	rec := httptest.NewRecorder()
-
-	sseLine := func(path, value string) string {
-		b, _ := json.Marshal(map[string]any{
-			"p": path,
-			"v": value,
-		})
-		return "data: " + string(b) + "\n"
-	}
-
-	streamBody := sseLine("response/thinking_content", "thinking...") +
-		sseLine("response/content", "先读取文件。") +
-		sseLine("response/content", `{"tool_calls":[{"name":"read_file","input":{"path":"README.MD"}}]}`) +
-		"data: [DONE]\n"
-	resp := &http.Response{
-		StatusCode: http.StatusOK,
-		Body:       io.NopCloser(strings.NewReader(streamBody)),
-	}
-
-	h.handleResponsesStream(rec, req, resp, "owner-a", "resp_test", "deepseek-reasoner", "prompt", true, false, []string{"read_file"}, util.DefaultToolChoicePolicy(), "")
-
-	addedPayloads := extractAllSSEEventPayloads(rec.Body.String(), "response.output_item.added")
-	if len(addedPayloads) < 1 {
-		t.Fatalf("expected at least one output_item.added event, got %d body=%s", len(addedPayloads), rec.Body.String())
-	}
-
-	completedPayload, ok := extractSSEEventPayload(rec.Body.String(), "response.completed")
-	if !ok {
-		t.Fatalf("expected response.completed payload, body=%s", rec.Body.String())
-	}
-	responseObj, _ := completedPayload["response"].(map[string]any)
-	output, _ := responseObj["output"].([]any)
-	hasMessage := false
-	hasFunctionCall := false
-	for _, item := range output {
-		m, _ := item.(map[string]any)
-		if m == nil {
-			continue
-		}
-		if asString(m["type"]) == "message" {
-			hasMessage = true
-		}
-		if asString(m["type"]) == "function_call" {
-			hasFunctionCall = true
-		}
-	}
-	if !hasMessage {
-		t.Fatalf("expected message output for mixed prose tool example, output=%#v", output)
-	}
-	if !hasFunctionCall {
-		t.Fatalf("expected function_call output for mixed prose tool example, output=%#v", output)
-	}
-}
-
-func TestHandleResponsesStreamToolChoiceNoneStillAllowsFunctionCall(t *testing.T) {
-	h := &Handler{}
-	req := httptest.NewRequest(http.MethodPost, "/v1/responses", nil)
-	rec := httptest.NewRecorder()
-
-	sseLine := func(v string) string {
-		b, _ := json.Marshal(map[string]any{
-			"p": "response/content",
-			"v": v,
-		})
-		return "data: " + string(b) + "\n"
-	}
-
-	streamBody := sseLine(`{"tool_calls":[{"name":"read_file","input":{"path":"README.MD"}}]}`) + "data: [DONE]\n"
-	resp := &http.Response{
-		StatusCode: http.StatusOK,
-		Body:       io.NopCloser(strings.NewReader(streamBody)),
-	}
-	policy := util.ToolChoicePolicy{Mode: util.ToolChoiceNone}
-
-	h.handleResponsesStream(rec, req, resp, "owner-a", "resp_test", "deepseek-chat", "prompt", false, false, nil, policy, "")
-	body := rec.Body.String()
-	if !strings.Contains(body, "event: response.function_call_arguments.done") {
-		t.Fatalf("expected function_call events for tool_choice=none, body=%s", body)
-	}
-}
-
-func TestHandleResponsesStreamMalformedToolJSONFallsBackToText(t *testing.T) {
-	h := &Handler{}
-	req := httptest.NewRequest(http.MethodPost, "/v1/responses", nil)
-	rec := httptest.NewRecorder()
-
-	sseLine := func(v string) string {
-		b, _ := json.Marshal(map[string]any{
-			"p": "response/content",
-			"v": v,
-		})
-		return "data: " + string(b) + "\n"
-	}
-
-	// invalid JSON (NaN) should remain plain text in strict mode.
-	streamBody := sseLine(`{"tool_calls":[{"name":"read_file","input":{"path":"README.MD"},"x":NaN}]}`) + "data: [DONE]\n"
-	resp := &http.Response{
-		StatusCode: http.StatusOK,
-		Body:       io.NopCloser(strings.NewReader(streamBody)),
-	}
-
-	h.handleResponsesStream(rec, req, resp, "owner-a", "resp_test", "deepseek-chat", "prompt", false, false, []string{"read_file"}, util.DefaultToolChoicePolicy(), "")
-	body := rec.Body.String()
-	if strings.Contains(body, "event: response.function_call_arguments.delta") || strings.Contains(body, "event: response.function_call_arguments.done") {
-		t.Fatalf("did not expect function_call events for malformed payload in strict mode, body=%s", body)
-	}
-	if !strings.Contains(body, "event: response.output_text.delta") {
-		t.Fatalf("expected response.output_text.delta for malformed payload, body=%s", body)
-	}
-	if !strings.Contains(body, "event: response.completed") {
-		t.Fatalf("expected response.completed event, body=%s", body)
-	}
-}
-
 func TestHandleResponsesStreamRequiredToolChoiceFailure(t *testing.T) {
 	h := &Handler{}
 	req := httptest.NewRequest(http.MethodPost, "/v1/responses", nil)
@@ -448,7 +143,7 @@ func TestHandleResponsesStreamRequiredToolChoiceFailure(t *testing.T) {
 	}
 }

-func TestHandleResponsesStreamRequiredToolChoiceIgnoresThinkingToolPayload(t *testing.T) {
+func TestHandleResponsesStreamFailsWhenUpstreamHasOnlyThinking(t *testing.T) {
 	h := &Handler{}
 	req := httptest.NewRequest(http.MethodPost, "/v1/responses", nil)
 	rec := httptest.NewRecorder()
@@ -461,53 +156,13 @@ func TestHandleResponsesStreamRequiredToolChoiceIgnoresThinkingToolPayload(t *te
 		return "data: " + string(b) + "\n"
 	}

-	streamBody := sseLine("response/thinking_content", `{"tool_calls":[{"name":"read_file","input":{"path":"README.MD"}}]}`) +
-		sseLine("response/content", "plain text only") +
-		"data: [DONE]\n"
+	streamBody := sseLine("response/thinking_content", "Only thinking") + "data: [DONE]\n"
 	resp := &http.Response{
 		StatusCode: http.StatusOK,
 		Body:       io.NopCloser(strings.NewReader(streamBody)),
 	}

-	policy := util.ToolChoicePolicy{
-		Mode:    util.ToolChoiceRequired,
-		Allowed: map[string]struct{}{"read_file": {}},
-	}
-
-	h.handleResponsesStream(rec, req, resp, "owner-a", "resp_test", "deepseek-chat", "prompt", true, false, []string{"read_file"}, policy, "")
-	body := rec.Body.String()
-	if !strings.Contains(body, "event: response.failed") {
-		t.Fatalf("expected response.failed event for required tool_choice violation, body=%s", body)
-	}
-	if strings.Contains(body, "event: response.completed") {
-		t.Fatalf("did not expect response.completed after failure, body=%s", body)
-	}
-}
-
-func TestHandleResponsesStreamRequiredMalformedToolPayloadFails(t *testing.T) {
-	h := &Handler{}
-	req := httptest.NewRequest(http.MethodPost, "/v1/responses", nil)
-	rec := httptest.NewRecorder()
-
-	sseLine := func(v string) string {
-		b, _ := json.Marshal(map[string]any{
-			"p": "response/content",
-			"v": v,
-		})
-		return "data: " + string(b) + "\n"
-	}
-
-	streamBody := sseLine(`{"tool_calls":[{"name":"read_file","input":{"path":"README.MD"},"x":NaN}]}`) + "data: [DONE]\n"
-	resp := &http.Response{
-		StatusCode: http.StatusOK,
-		Body:       io.NopCloser(strings.NewReader(streamBody)),
-	}
-	policy := util.ToolChoicePolicy{
-		Mode:    util.ToolChoiceRequired,
-		Allowed: map[string]struct{}{"read_file": {}},
-	}
-
-	h.handleResponsesStream(rec, req, resp, "owner-a", "resp_test", "deepseek-chat", "prompt", false, false, []string{"read_file"}, policy, "")
+	h.handleResponsesStream(rec, req, resp, "owner-a", "resp_test", "deepseek-reasoner", "prompt", true, false, nil, util.DefaultToolChoicePolicy(), "")

 	body := rec.Body.String()
 	if !strings.Contains(body, "event: response.failed") {
@@ -516,31 +171,13 @@ func TestHandleResponsesStreamRequiredMalformedToolPayloadFails(t *testing.T) {
 	if strings.Contains(body, "event: response.completed") {
 		t.Fatalf("did not expect response.completed, body=%s", body)
 	}
-}
-
-func TestHandleResponsesStreamAllowsUnknownToolName(t *testing.T) {
-	h := &Handler{}
-	req := httptest.NewRequest(http.MethodPost, "/v1/responses", nil)
-	rec := httptest.NewRecorder()
-
-	sseLine := func(v string) string {
-		b, _ := json.Marshal(map[string]any{
-			"p": "response/content",
-			"v": v,
-		})
-		return "data: " + string(b) + "\n"
+	payload, ok := extractSSEEventPayload(body, "response.failed")
+	if !ok {
+		t.Fatalf("expected response.failed payload, body=%s", body)
 	}
-
-	streamBody := sseLine(`{"tool_calls":[{"name":"not_in_schema","input":{"q":"go"}}]}`) + "data: [DONE]\n"
-	resp := &http.Response{
-		StatusCode: http.StatusOK,
-		Body:       io.NopCloser(strings.NewReader(streamBody)),
-	}
-
-	h.handleResponsesStream(rec, req, resp, "owner-a", "resp_test", "deepseek-chat", "prompt", false, false, []string{"read_file"}, util.DefaultToolChoicePolicy(), "")
-	body := rec.Body.String()
-	if !strings.Contains(body, "event: response.function_call_arguments.done") {
-		t.Fatalf("expected function_call events for unknown tool, body=%s", body)
+	errObj, _ := payload["error"].(map[string]any)
+	if asString(errObj["code"]) != "upstream_empty_output" {
+		t.Fatalf("expected code=upstream_empty_output, got %#v", payload)
 	}
 }

@@ -597,36 +234,6 @@ func TestHandleResponsesNonStreamRequiredToolChoiceIgnoresThinkingToolPayload(t
 	}
 }

-func TestHandleResponsesNonStreamToolChoiceNoneStillAllowsFunctionCall(t *testing.T) {
-	h := &Handler{}
-	rec := httptest.NewRecorder()
-	resp := &http.Response{
-		StatusCode: http.StatusOK,
-		Body: io.NopCloser(strings.NewReader(
-			`data: {"p":"response/content","v":"{\"tool_calls\":[{\"name\":\"read_file\",\"input\":{\"path\":\"README.MD\"}}]}"}` + "\n" +
-				`data: [DONE]` + "\n",
-		)),
-	}
-	policy := util.ToolChoicePolicy{Mode: util.ToolChoiceNone}
-
-	h.handleResponsesNonStream(rec, resp, "owner-a", "resp_test", "deepseek-chat", "prompt", false, nil, policy, "")
-	if rec.Code != http.StatusOK {
-		t.Fatalf("expected 200 for tool_choice=none handling, got %d body=%s", rec.Code, rec.Body.String())
-	}
-	out := decodeJSONBody(t, rec.Body.String())
-	output, _ := out["output"].([]any)
-	foundFunctionCall := false
-	for _, item := range output {
-		m, _ := item.(map[string]any)
-		if m != nil && m["type"] == "function_call" {
-			foundFunctionCall = true
-		}
-	}
-	if !foundFunctionCall {
-		t.Fatalf("expected function_call output item for tool_choice=none, got %#v", output)
-	}
-}
-
 func TestHandleResponsesNonStreamReturns429WhenUpstreamOutputEmpty(t *testing.T) {
 	h := &Handler{}
 	rec := httptest.NewRecorder()
@@ -671,6 +278,28 @@ func TestHandleResponsesNonStreamReturnsContentFilterErrorWhenUpstreamFilteredWi
 	}
 }

+func TestHandleResponsesNonStreamReturns429WhenUpstreamHasOnlyThinking(t *testing.T) {
+	h := &Handler{}
+	rec := httptest.NewRecorder()
+	resp := &http.Response{
+		StatusCode: http.StatusOK,
+		Body: io.NopCloser(strings.NewReader(
+			`data: {"p":"response/thinking_content","v":"Only thinking"}` + "\n" +
+				`data: [DONE]` + "\n",
+		)),
+	}
+
+	h.handleResponsesNonStream(rec, resp, "owner-a", "resp_test", "deepseek-reasoner", "prompt", true, nil, util.DefaultToolChoicePolicy(), "")
+	if rec.Code != http.StatusTooManyRequests {
+		t.Fatalf("expected 429 for thinking-only upstream output, got %d body=%s", rec.Code, rec.Body.String())
+	}
+	out := decodeJSONBody(t, rec.Body.String())
+	errObj, _ := out["error"].(map[string]any)
+	if asString(errObj["code"]) != "upstream_empty_output" {
+		t.Fatalf("expected code=upstream_empty_output, got %#v", out)
+	}
+}
+
 func extractSSEEventPayload(body, targetEvent string) (map[string]any, bool) {
 	scanner := bufio.NewScanner(strings.NewReader(body))
 	matched := false
@@ -696,30 +325,3 @@ func extractSSEEventPayload(body, targetEvent string) (map[string]any, bool) {
 	}
 	return nil, false
 }
-
-func extractAllSSEEventPayloads(body, targetEvent string) []map[string]any {
-	scanner := bufio.NewScanner(strings.NewReader(body))
-	matched := false
-	out := make([]map[string]any, 0, 2)
-	for scanner.Scan() {
-		line := strings.TrimSpace(scanner.Text())
-		if strings.HasPrefix(line, "event: ") {
-			evt := strings.TrimSpace(strings.TrimPrefix(line, "event: "))
-			matched = evt == targetEvent
-			continue
-		}
-		if !matched || !strings.HasPrefix(line, "data: ") {
-			continue
-		}
-		raw := strings.TrimSpace(strings.TrimPrefix(line, "data: "))
-		if raw == "" || raw == "[DONE]" {
-			continue
-		}
-		var payload map[string]any
-		if err := json.Unmarshal([]byte(raw), &payload); err != nil {
-			continue
-		}
-		out = append(out, payload)
-	}
-	return out
-}
--- a/internal/adapter/openai/standard_request.go
+++ b/internal/adapter/openai/standard_request.go
@@ -24,9 +24,10 @@ func normalizeOpenAIChatRequest(store ConfigReader, req map[string]any, traceID
 		responseModel = resolvedModel
 	}
 	toolPolicy := util.DefaultToolChoicePolicy()
-	finalPrompt, toolNames := buildOpenAIFinalPromptWithPolicy(messagesRaw, req["tools"], traceID, toolPolicy)
+	finalPrompt, toolNames := buildOpenAIFinalPromptWithPolicy(messagesRaw, req["tools"], traceID, toolPolicy, thinkingEnabled)
 	toolNames = ensureToolDetectionEnabled(toolNames, req["tools"])
 	passThrough := collectOpenAIChatPassThrough(req)
+	refFileIDs := collectOpenAIRefFileIDs(req)

 	return util.StandardRequest{
 		Surface:        "openai_chat",
@@ -40,6 +41,7 @@ func normalizeOpenAIChatRequest(store ConfigReader, req map[string]any, traceID
 		Stream:         util.ToBool(req["stream"]),
 		Thinking:       thinkingEnabled,
 		Search:         searchEnabled,
+		RefFileIDs:     refFileIDs,
 		PassThrough:    passThrough,
 	}, nil
 }
@@ -74,12 +76,13 @@ func normalizeOpenAIResponsesRequest(store ConfigReader, req map[string]any, tra
 	if err != nil {
 		return util.StandardRequest{}, err
 	}
-	finalPrompt, toolNames := buildOpenAIFinalPromptWithPolicy(messagesRaw, req["tools"], traceID, toolPolicy)
+	finalPrompt, toolNames := buildOpenAIFinalPromptWithPolicy(messagesRaw, req["tools"], traceID, toolPolicy, thinkingEnabled)
 	toolNames = ensureToolDetectionEnabled(toolNames, req["tools"])
 	if !toolPolicy.IsNone() {
 		toolPolicy.Allowed = namesToSet(toolNames)
 	}
 	passThrough := collectOpenAIChatPassThrough(req)
+	refFileIDs := collectOpenAIRefFileIDs(req)

 	return util.StandardRequest{
 		Surface:        "openai_responses",
@@ -93,6 +96,7 @@ func normalizeOpenAIResponsesRequest(store ConfigReader, req map[string]any, tra
 		Stream:         util.ToBool(req["stream"]),
 		Thinking:       thinkingEnabled,
 		Search:         searchEnabled,
+		RefFileIDs:     refFileIDs,
 		PassThrough:    passThrough,
 	}, nil
 }
--- a/internal/adapter/openai/standard_request_test.go
+++ b/internal/adapter/openai/standard_request_test.go
@@ -41,6 +41,36 @@ func TestNormalizeOpenAIChatRequest(t *testing.T) {
 	}
 }

+func TestNormalizeOpenAIChatRequestCollectsRefFileIDs(t *testing.T) {
+	store := newEmptyStoreForNormalizeTest(t)
+	req := map[string]any{
+		"model": "gpt-5-codex",
+		"messages": []any{
+			map[string]any{
+				"role": "user",
+				"content": []any{
+					map[string]any{"type": "input_text", "text": "hello"},
+					map[string]any{"type": "input_file", "file_id": "file-msg"},
+				},
+			},
+		},
+		"attachments": []any{
+			map[string]any{"file_id": "file-attachment"},
+		},
+		"ref_file_ids": []any{"file-top", "file-attachment"},
+	}
+	n, err := normalizeOpenAIChatRequest(store, req, "")
+	if err != nil {
+		t.Fatalf("normalize failed: %v", err)
+	}
+	if len(n.RefFileIDs) != 3 {
+		t.Fatalf("expected 3 distinct file ids, got %#v", n.RefFileIDs)
+	}
+	if n.RefFileIDs[0] != "file-top" || n.RefFileIDs[1] != "file-attachment" || n.RefFileIDs[2] != "file-msg" {
+		t.Fatalf("unexpected file ids: %#v", n.RefFileIDs)
+	}
+}
+
 func TestNormalizeOpenAIResponsesRequestInput(t *testing.T) {
 	store := newEmptyStoreForNormalizeTest(t)
 	req := map[string]any{
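Review note: `TestNormalizeOpenAIChatRequestCollectsRefFileIDs` above expects three distinct IDs in the order `file-top`, `file-attachment`, `file-msg`. That suggests the collector reads top-level `ref_file_ids` first, then `attachments`, then message content parts, and drops duplicates. The body of `collectOpenAIRefFileIDs` is not part of this diff, so the following is only a minimal sketch under that assumption; `collectRefFileIDsSketch` and `appendUnique` are hypothetical names.

```go
package main

import "fmt"

// appendUnique appends id to out unless it is empty or was already seen.
func appendUnique(out []string, seen map[string]struct{}, id string) []string {
	if id == "" {
		return out
	}
	if _, dup := seen[id]; dup {
		return out
	}
	seen[id] = struct{}{}
	return append(out, id)
}

// collectRefFileIDsSketch is a hypothetical stand-in for collectOpenAIRefFileIDs.
// Assumed precedence: ref_file_ids, then attachments, then message content parts.
func collectRefFileIDsSketch(req map[string]any) []string {
	seen := map[string]struct{}{}
	var out []string

	// 1) Top-level ref_file_ids: a list of plain strings.
	if ids, ok := req["ref_file_ids"].([]any); ok {
		for _, v := range ids {
			if s, ok := v.(string); ok {
				out = appendUnique(out, seen, s)
			}
		}
	}
	// 2) attachments[].file_id
	if atts, ok := req["attachments"].([]any); ok {
		for _, a := range atts {
			if m, ok := a.(map[string]any); ok {
				if s, ok := m["file_id"].(string); ok {
					out = appendUnique(out, seen, s)
				}
			}
		}
	}
	// 3) messages[].content[].file_id (reads on nil maps are safe in Go).
	if msgs, ok := req["messages"].([]any); ok {
		for _, msg := range msgs {
			m, _ := msg.(map[string]any)
			parts, _ := m["content"].([]any)
			for _, p := range parts {
				pm, _ := p.(map[string]any)
				if s, ok := pm["file_id"].(string); ok {
					out = appendUnique(out, seen, s)
				}
			}
		}
	}
	return out
}

func main() {
	req := map[string]any{
		"messages": []any{
			map[string]any{"role": "user", "content": []any{
				map[string]any{"type": "input_file", "file_id": "file-msg"},
			}},
		},
		"attachments":  []any{map[string]any{"file_id": "file-attachment"}},
		"ref_file_ids": []any{"file-top", "file-attachment"},
	}
	fmt.Println(collectRefFileIDsSketch(req))
}
```

Keeping first-seen order makes the result deterministic, which is what lets the test assert on positions rather than only on set membership.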
--- a/internal/adapter/openai/stream_status_test.go
+++ b/internal/adapter/openai/stream_status_test.go
@@ -50,6 +50,10 @@ func (m streamStatusDSStub) GetPow(_ context.Context, _ *auth.RequestAuth, _ int
 	return "pow", nil
 }

+func (m streamStatusDSStub) UploadFile(_ context.Context, _ *auth.RequestAuth, _ deepseek.UploadFileRequest, _ int) (*deepseek.UploadFileResult, error) {
+	return &deepseek.UploadFileResult{ID: "file-id", Filename: "file.txt", Bytes: 1, Status: "uploaded"}, nil
+}
+
 func (m streamStatusDSStub) CallCompletion(_ context.Context, _ *auth.RequestAuth, _ map[string]any, _ string, _ int) (*http.Response, error) {
 	return m.resp, nil
 }
@@ -142,53 +146,6 @@ func TestResponsesStreamStatusCapturedAs200(t *testing.T) {
 	}
 }

-func TestResponsesNonStreamMixedProseToolPayloadHandlerPath(t *testing.T) {
-	statuses := make([]int, 0, 1)
-	content, _ := json.Marshal(map[string]any{
-		"p": "response/content",
-		"v": "我来调用工具\n{\"tool_calls\":[{\"name\":\"read_file\",\"input\":{\"path\":\"README.MD\"}}]}",
-	})
-	h := &Handler{
-		Store: mockOpenAIConfig{wideInput: true},
-		Auth:  streamStatusAuthStub{},
-		DS:    streamStatusDSStub{resp: makeOpenAISSEHTTPResponse("data: "+string(content), "data: [DONE]")},
-	}
-	r := chi.NewRouter()
-	r.Use(captureStatusMiddleware(&statuses))
-	RegisterRoutes(r, h)
-
-	reqBody := `{"model":"deepseek-chat","input":"请调用工具","tools":[{"type":"function","function":{"name":"read_file","description":"read","parameters":{"type":"object","properties":{"path":{"type":"string"}}}}}],"stream":false}`
-	req := httptest.NewRequest(http.MethodPost, "/v1/responses", strings.NewReader(reqBody))
-	req.Header.Set("Authorization", "Bearer direct-token")
-	req.Header.Set("Content-Type", "application/json")
-	rec := httptest.NewRecorder()
-	r.ServeHTTP(rec, req)
-
-	if rec.Code != http.StatusOK {
-		t.Fatalf("expected 200, got %d body=%s", rec.Code, rec.Body.String())
-	}
-	if len(statuses) != 1 || statuses[0] != http.StatusOK {
-		t.Fatalf("expected captured status 200, got %#v", statuses)
-	}
-
-	var out map[string]any
-	if err := json.Unmarshal(rec.Body.Bytes(), &out); err != nil {
-		t.Fatalf("decode response failed: %v body=%s", err, rec.Body.String())
-	}
-	outputText, _ := out["output_text"].(string)
-	if outputText != "" {
-		t.Fatalf("expected output_text hidden for mixed prose tool payload, got %q", outputText)
-	}
-	output, _ := out["output"].([]any)
-	if len(output) != 1 {
-		t.Fatalf("expected one output item, got %#v", output)
-	}
-	first, _ := output[0].(map[string]any)
-	if first["type"] != "function_call" {
-		t.Fatalf("expected function_call output item, got %#v", output)
-	}
-}
-
 func TestChatCompletionsStreamContentFilterStopsNormallyWithoutLeak(t *testing.T) {
 	statuses := make([]int, 0, 1)
 	h := &Handler{
@@ -239,6 +196,49 @@ func TestChatCompletionsStreamContentFilterStopsNormallyWithoutLeak(t *testing.T
 	}
 }

+func TestChatCompletionsStreamEmitsFailureFrameWhenUpstreamOutputEmpty(t *testing.T) {
+	statuses := make([]int, 0, 1)
+	h := &Handler{
+		Store: mockOpenAIConfig{wideInput: true},
+		Auth:  streamStatusAuthStub{},
+		DS:    streamStatusDSStub{resp: makeOpenAISSEHTTPResponse("data: [DONE]")},
+	}
+	r := chi.NewRouter()
+	r.Use(captureStatusMiddleware(&statuses))
+	RegisterRoutes(r, h)
+
+	reqBody := `{"model":"deepseek-chat","messages":[{"role":"user","content":"hi"}],"stream":true}`
+	req := httptest.NewRequest(http.MethodPost, "/v1/chat/completions", strings.NewReader(reqBody))
+	req.Header.Set("Authorization", "Bearer direct-token")
+	req.Header.Set("Content-Type", "application/json")
+	rec := httptest.NewRecorder()
+	r.ServeHTTP(rec, req)
+
+	if rec.Code != http.StatusOK {
+		t.Fatalf("expected 200, got %d body=%s", rec.Code, rec.Body.String())
+	}
+	if len(statuses) != 1 || statuses[0] != http.StatusOK {
+		t.Fatalf("expected captured status 200, got %#v", statuses)
+	}
+
+	frames, done := parseSSEDataFrames(t, rec.Body.String())
+	if !done {
+		t.Fatalf("expected [DONE], body=%s", rec.Body.String())
+	}
+	if len(frames) != 1 {
+		t.Fatalf("expected one failure frame, got %#v body=%s", frames, rec.Body.String())
+	}
+	last := frames[0]
+	statusCode, ok := last["status_code"].(float64)
+	if !ok || int(statusCode) != http.StatusTooManyRequests {
+		t.Fatalf("expected status_code=429, got %#v body=%s", last["status_code"], rec.Body.String())
+	}
+	errObj, _ := last["error"].(map[string]any)
+	if asString(errObj["code"]) != "upstream_empty_output" {
+		t.Fatalf("expected code=upstream_empty_output, got %#v", last)
+	}
+}
+
 func TestResponsesStreamUsageIgnoresBatchAccumulatedTokenUsage(t *testing.T) {
 	statuses := make([]int, 0, 1)
 	h := &Handler{
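Review note: `TestChatCompletionsStreamEmitsFailureFrameWhenUpstreamOutputEmpty` above relies on `parseSSEDataFrames`, which is defined outside this diff. For readers following along, here is a minimal sketch of what such a helper might do: collect every JSON `data:` frame and report whether `[DONE]` was seen. The name `parseDataFramesSketch` and the sample body are assumptions; only the `status_code` and `error.code` fields are taken from the test's assertions.

```go
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"strings"
)

// parseDataFramesSketch is a hypothetical stand-in for parseSSEDataFrames.
// It returns all JSON object frames and whether a [DONE] sentinel appeared.
func parseDataFramesSketch(body string) (frames []map[string]any, done bool) {
	scanner := bufio.NewScanner(strings.NewReader(body))
	for scanner.Scan() {
		line := strings.TrimSpace(scanner.Text())
		if !strings.HasPrefix(line, "data:") {
			continue // skip blank separators and non-data fields
		}
		raw := strings.TrimSpace(strings.TrimPrefix(line, "data:"))
		if raw == "" {
			continue
		}
		if raw == "[DONE]" {
			done = true
			continue
		}
		var payload map[string]any
		if err := json.Unmarshal([]byte(raw), &payload); err != nil {
			continue // ignore frames that are not JSON objects
		}
		frames = append(frames, payload)
	}
	return frames, done
}

func main() {
	// Assumed wire shape: one failure frame followed by [DONE].
	body := `data: {"status_code":429,"error":{"code":"upstream_empty_output"}}` + "\n\n" +
		"data: [DONE]\n\n"
	frames, done := parseDataFramesSketch(body)
	fmt.Println(len(frames), done)
}
```

Note that the HTTP status stays 200 in the streaming case because headers are already sent; the 429 travels inside the frame as `status_code`, which decodes to `float64` through `encoding/json`.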
--- a/internal/adapter/openai/tool_sieve_core.go
+++ b/internal/adapter/openai/tool_sieve_core.go
@@ -60,7 +60,7 @@ func processToolSieveChunk(state *toolStreamSieveState, chunk string, toolNames
 		if pending == "" {
 			break
 		}
-		start := findToolSegmentStart(pending)
+		start := findToolSegmentStart(state, pending)
 		if start >= 0 {
 			prefix := pending[:start]
 			if prefix != "" {
@@ -74,7 +74,7 @@ func processToolSieveChunk(state *toolStreamSieveState, chunk string, toolNames
 			continue
 		}

-		safe, hold := splitSafeContentForToolDetection(pending)
+		safe, hold := splitSafeContentForToolDetection(state, pending)
 		if safe == "" {
 			break
 		}
@@ -114,14 +114,10 @@ func flushToolSieve(state *toolStreamSieveState, toolNames []string) []toolStrea
 		} else {
 			content := state.capture.String()
 			if content != "" {
-				// If the captured text looks like an incomplete XML tool call block,
-				// swallow it to prevent leaking raw XML tags to the client.
-				if hasOpenXMLToolTag(content) {
-					// Drop it silently — incomplete tool call.
-				} else {
-					state.noteText(content)
-					events = append(events, toolStreamEvent{Content: content})
-				}
+				// If capture never resolved into a real tool call, release the
+				// buffered text instead of swallowing it.
+				state.noteText(content)
+				events = append(events, toolStreamEvent{Content: content})
 			}
 		}
 		state.capture.Reset()
@@ -130,100 +126,57 @@ func flushToolSieve(state *toolStreamSieveState, toolNames []string) []toolStrea
 	}
 	if state.pending.Len() > 0 {
 		content := state.pending.String()
-		// Safety: if pending contains XML tool tag fragments (e.g. "tool_calls>"
-		// from a split closing tag), swallow them instead of leaking.
-		if hasOpenXMLToolTag(content) || looksLikeXMLToolTagFragment(content) {
-			// Drop it — likely an incomplete tool call fragment.
-		} else {
-			state.noteText(content)
-			events = append(events, toolStreamEvent{Content: content})
-		}
+		// If pending never resolved into a real tool call, release it as text.
+		state.noteText(content)
+		events = append(events, toolStreamEvent{Content: content})
 		state.pending.Reset()
 	}
 	return events
 }

-func splitSafeContentForToolDetection(s string) (safe, hold string) {
+func splitSafeContentForToolDetection(state *toolStreamSieveState, s string) (safe, hold string) {
 	if s == "" {
 		return "", ""
 	}
-	suspiciousStart := findSuspiciousPrefixStart(s)
-	if suspiciousStart < 0 {
-		return s, ""
-	}
-	if suspiciousStart > 0 {
-		return s[:suspiciousStart], s[suspiciousStart:]
-	}
-	// If suspicious content starts at position 0, keep holding until we can
-	// parse a complete tool JSON block or reach stream flush.
-	return "", s
-}
-
-func findSuspiciousPrefixStart(s string) int {
-	start := -1
-	indices := []int{
-		strings.LastIndex(s, "{"),
-		strings.LastIndex(s, "["),
-		strings.LastIndex(s, "```"),
-	}
-	for _, idx := range indices {
-		if idx > start {
-			start = idx
+	if xmlIdx := findPartialXMLToolTagStart(s); xmlIdx >= 0 {
+		if insideCodeFenceWithState(state, s[:xmlIdx]) {
+			return s, ""
 		}
+		if xmlIdx > 0 {
+			return s[:xmlIdx], s[xmlIdx:]
+		}
+		return "", s
 	}
-	// Also check for partial XML tool tag at end of string.
-	if xmlIdx := findPartialXMLToolTagStart(s); xmlIdx >= 0 && xmlIdx > start {
-		start = xmlIdx
-	}
-	return start
+	return s, ""
 }

-func findToolSegmentStart(s string) int {
+func findToolSegmentStart(state *toolStreamSieveState, s string) int {
 	if s == "" {
 		return -1
 	}
 	lower := strings.ToLower(s)
-	keywords := []string{"tool_calls", "\"function\"", "function.name:", "\"tool_use\""}
-	bestKeyIdx := -1
-	for _, kw := range keywords {
-		idx := strings.Index(lower, kw)
-		if idx >= 0 && (bestKeyIdx < 0 || idx < bestKeyIdx) {
-			bestKeyIdx = idx
+	offset := 0
+	for {
+		bestKeyIdx := -1
+		matchedTag := ""
+		for _, tag := range xmlToolTagsToDetect {
+			idx := strings.Index(lower[offset:], tag)
+			if idx >= 0 {
+				idx += offset
+				if bestKeyIdx < 0 || idx < bestKeyIdx {
+					bestKeyIdx = idx
+					matchedTag = tag
+				}
+			}
 		}
-	}
-	if fnKeyIdx := findQuotedFunctionCallKeyStart(s); fnKeyIdx >= 0 && (bestKeyIdx < 0 || fnKeyIdx < bestKeyIdx) {
-		bestKeyIdx = fnKeyIdx
-	}
-	// Also detect XML tool call tags.
-	for _, tag := range xmlToolTagsToDetect {
-		idx := strings.Index(lower, tag)
-		if idx >= 0 && (bestKeyIdx < 0 || idx < bestKeyIdx) {
-			bestKeyIdx = idx
+		if bestKeyIdx < 0 {
+			return -1
 		}
-	}
-	if bestKeyIdx < 0 {
-		return -1
-	}
-	// For XML tags, the '<' is itself the segment start.
-	if bestKeyIdx < len(s) && s[bestKeyIdx] == '<' {
-		if fenceStart, ok := openFenceStartBefore(s, bestKeyIdx); ok {
-			return fenceStart
+		if !insideCodeFenceWithState(state, s[:bestKeyIdx]) {
+			return bestKeyIdx
 		}
-		return bestKeyIdx
+		offset = bestKeyIdx + len(matchedTag)
 	}
-	start := strings.LastIndex(s[:bestKeyIdx], "{")
-	if start < 0 {
-		start = bestKeyIdx
-	}
-	// If the keyword matched inside an XML tag (e.g. "tool_calls" in "<tool_calls>"),
-	// back up past the '<' to capture the full tag.
-	if start > 0 && s[start-1] == '<' {
-		start--
-	}
-	if fenceStart, ok := openFenceStartBefore(s, start); ok {
-		return fenceStart
-	}
-	return start
 }

 func consumeToolCapture(state *toolStreamSieveState, toolNames []string) (prefix string, calls []toolcall.ParsedToolCall, suffix string, ready bool) {
@@ -232,7 +185,7 @@ func consumeToolCapture(state *toolStreamSieveState, toolNames []string) (prefix
 		return "", nil, "", false
 	}

-	// Try XML tool call extraction first.
+	// XML tool call extraction only.
 	if xmlPrefix, xmlCalls, xmlSuffix, xmlReady := consumeXMLToolCapture(captured, toolNames); xmlReady {
 		return xmlPrefix, xmlCalls, xmlSuffix, true
 	}
@@ -240,45 +193,5 @@ func consumeToolCapture(state *toolStreamSieveState, toolNames []string) (prefix
 	if hasOpenXMLToolTag(captured) {
 		return "", nil, "", false
 	}
-
-	lower := strings.ToLower(captured)
-	keyIdx := -1
-	keywords := []string{"tool_calls", "\"function\"", "function.name:", "\"tool_use\""}
-	for _, kw := range keywords {
-		idx := strings.Index(lower, kw)
-		if idx >= 0 && (keyIdx < 0 || idx < keyIdx) {
-			keyIdx = idx
-		}
-	}
-	if fnKeyIdx := findQuotedFunctionCallKeyStart(captured); fnKeyIdx >= 0 && (keyIdx < 0 || fnKeyIdx < keyIdx) {
-		keyIdx = fnKeyIdx
-	}
-
-	if keyIdx < 0 {
-		return "", nil, "", false
-	}
-	start := strings.LastIndex(captured[:keyIdx], "{")
-	if start < 0 {
-		start = keyIdx
-	}
-	obj, end, ok := extractJSONObjectFrom(captured, start)
-	if !ok {
-		return "", nil, "", false
-	}
-	prefixPart := captured[:start]
-	suffixPart := captured[end:]
-	parsed := toolcall.ParseStandaloneToolCallsDetailed(obj, toolNames)
-	if len(parsed.Calls) == 0 {
-		if parsed.SawToolCallSyntax && parsed.RejectedByPolicy {
-			// Parsed as tool-call payload but rejected by schema/policy:
-			// consume it to avoid leaking raw tool_calls JSON to user content.
-			return prefixPart, nil, suffixPart, true
-		}
-		// If it has obvious keywords but failed to parse even after loose repair,
-		// we still might want to intercept it if it looks like an attempt at tool call.
-		// For now, keep the original logic but rely on loose JSON repair.
-		return captured, nil, "", true
-	}
-	prefixPart, suffixPart = trimWrappingJSONFence(prefixPart, suffixPart)
-	return prefixPart, parsed.Calls, suffixPart, true
+	return "", nil, "", false
 }
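Review note: the rewritten `findToolSegmentStart` above looks for the earliest XML tool tag, skips any match that sits inside a code fence, and resumes scanning after the skipped tag. A standalone sketch of that loop follows. `insideFence` is a simplified, hypothetical stand-in that only counts triple-backtick markers in the prefix; the real `insideCodeFenceWithState` is not shown in this diff and also consults the sieve state carried over from earlier chunks. `sampleTags` is likewise an assumed sample, not the actual `xmlToolTagsToDetect`.

```go
package main

import (
	"fmt"
	"strings"
)

// sampleTags is an assumed subset of tag prefixes, for illustration only.
var sampleTags = []string{"<tool_call", "<function_call", "<invoke"}

// insideFence reports whether prefix ends inside an unclosed ``` fence.
// Simplification: it ignores state from previously emitted chunks.
func insideFence(prefix string) bool {
	return strings.Count(prefix, "```")%2 == 1
}

// findTagStart returns the index of the first tool tag outside a fence, or -1.
// It assumes lowercasing preserves byte offsets (true for ASCII input).
func findTagStart(s string, tags []string) int {
	lower := strings.ToLower(s)
	offset := 0
	for {
		best, matched := -1, ""
		for _, tag := range tags {
			if idx := strings.Index(lower[offset:], tag); idx >= 0 {
				idx += offset
				if best < 0 || idx < best {
					best, matched = idx, tag
				}
			}
		}
		if best < 0 {
			return -1 // no further candidates
		}
		if !insideFence(s[:best]) {
			return best // real tool tag outside any fence
		}
		offset = best + len(matched) // fenced example: keep scanning
	}
}

func main() {
	fenced := "```xml\n<tool_call>demo</tool_call>\n```\n"
	fmt.Println(findTagStart(fenced, sampleTags))
	fmt.Println(findTagStart(fenced+"<invoke name=\"x\">", sampleTags))
}
```

The practical effect is that tool syntax quoted inside a fenced example passes through as ordinary text, while the same tag outside a fence still starts a capture.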
--- a/internal/adapter/openai/tool_sieve_functioncall.go
+++ b/internal/adapter/openai/tool_sieve_functioncall.go
@@ -1,100 +0,0 @@
-package openai
-
-import "strings"
-
-func findQuotedFunctionCallKeyStart(s string) int {
-	lower := strings.ToLower(s)
-	quotedIdx := findFunctionCallKeyStart(lower, `"functioncall"`)
-	bareIdx := findFunctionCallKeyStart(lower, "functioncall")
-
-	// Prefer the quoted JSON key whenever we have a structural match.
-	// Bare-key detection is only for loose payloads where the quoted form
-	// is absent.
-	if quotedIdx >= 0 {
-		return quotedIdx
-	}
-	return bareIdx
-}
-
-func findFunctionCallKeyStart(lower, key string) int {
-	for from := 0; from < len(lower); {
-		rel := strings.Index(lower[from:], key)
-		if rel < 0 {
-			return -1
-		}
-		idx := from + rel
-		if isInsideJSONString(lower, idx) {
-			from = idx + 1
-			continue
-		}
-		if !hasJSONObjectContextPrefix(lower[:idx]) {
-			from = idx + 1
-			continue
-		}
-		if !hasJSONKeyBoundary(lower, idx, len(key)) {
-			from = idx + 1
-			continue
-		}
-		j := idx + len(key)
-		for j < len(lower) && (lower[j] == ' ' || lower[j] == '\t' || lower[j] == '\r' || lower[j] == '\n') {
-			j++
-		}
-		if j < len(lower) && lower[j] == ':' {
-			k := j + 1
-			for k < len(lower) && (lower[k] == ' ' || lower[k] == '\t' || lower[k] == '\r' || lower[k] == '\n') {
-				k++
-			}
-			if k < len(lower) && lower[k] != '{' {
-				from = idx + 1
-				continue
-			}
-			return idx
-		}
-		from = idx + 1
-	}
-	return -1
-}
-
-func isInsideJSONString(s string, idx int) bool {
-	inString := false
-	escaped := false
-	for i := 0; i < idx; i++ {
-		c := s[i]
-		if escaped {
-			escaped = false
-			continue
-		}
-		if c == '\\' && inString {
-			escaped = true
-			continue
-		}
-		if c == '"' {
-			inString = !inString
-		}
-	}
-	return inString
-}
-
-func hasJSONObjectContextPrefix(prefix string) bool {
-	return strings.LastIndex(prefix, "{") >= 0
-}
-
-func hasJSONKeyBoundary(s string, idx, keyLen int) bool {
-	if idx > 0 {
-		prev := s[idx-1]
-		if isLowerAlphaNumeric(prev) {
-			return false
-		}
-	}
-	if end := idx + keyLen; end < len(s) {
-		next := s[end]
-		if isLowerAlphaNumeric(next) {
-			return false
-		}
-	}
-	return true
-}
-
-func isLowerAlphaNumeric(b byte) bool {
-	return (b >= 'a' && b <= 'z') || (b >= '0' && b <= '9') || b == '_'
-}
--- a/internal/adapter/openai/tool_sieve_functioncall_test.go
+++ b/internal/adapter/openai/tool_sieve_functioncall_test.go
@@ -1,23 +0,0 @@
-package openai
-
-import "testing"
-
-func TestFindQuotedFunctionCallKeyStart_PrefersEarlierBareKey(t *testing.T) {
-	input := `{functionCall:{"name":"a","arguments":"{}"},"message":"literal text: \"functionCall\": not a key"}`
-
-	got := findQuotedFunctionCallKeyStart(input)
-	want := 1
-	if got != want {
-		t.Fatalf("findQuotedFunctionCallKeyStart() = %d, want %d", got, want)
-	}
-}
-
-func TestFindQuotedFunctionCallKeyStart_PrefersEarlierQuotedKey(t *testing.T) {
-	input := `{"functionCall":{"name":"a","arguments":"{}"},"note":"functionCall appears in prose"}`
-
-	got := findQuotedFunctionCallKeyStart(input)
-	want := 1
-	if got != want {
-		t.Fatalf("findQuotedFunctionCallKeyStart() = %d, want %d", got, want)
-	}
-}
--- a/internal/adapter/openai/tool_sieve_jsonscan.go
+++ b/internal/adapter/openai/tool_sieve_jsonscan.go
@@ -2,48 +2,6 @@ package openai

 import "strings"

-func extractJSONObjectFrom(text string, start int) (string, int, bool) {
-	if start < 0 || start >= len(text) || text[start] != '{' {
-		return "", 0, false
-	}
-	depth := 0
-	quote := byte(0)
-	escaped := false
-	for i := start; i < len(text); i++ {
-		ch := text[i]
-		if quote != 0 {
-			if escaped {
-				escaped = false
-				continue
-			}
-			if ch == '\\' {
-				escaped = true
-				continue
-			}
-			if ch == quote {
-				quote = 0
-			}
-			continue
-		}
-		if ch == '"' || ch == '\'' {
-			quote = ch
-			continue
-		}
-		if ch == '{' {
-			depth++
-			continue
-		}
-		if ch == '}' {
-			depth--
-			if depth == 0 {
-				end := i + 1
-				return text[start:end], end, true
-			}
-		}
-	}
-	return "", 0, false
-}
-
 func trimWrappingJSONFence(prefix, suffix string) (string, string) {
 	trimmedPrefix := strings.TrimRight(prefix, " \t\r\n")
 	fenceIdx := strings.LastIndex(trimmedPrefix, "```")
@@ -67,18 +25,3 @@ func trimWrappingJSONFence(prefix, suffix string) (string, string) {
 	consumedLeading := len(suffix) - len(trimmedSuffix)
 	return trimmedPrefix[:fenceIdx], suffix[consumedLeading+3:]
 }
-
-func openFenceStartBefore(s string, pos int) (int, bool) {
-	if pos <= 0 || pos > len(s) {
-		return -1, false
-	}
-	segment := s[:pos]
-	lastFence := strings.LastIndex(segment, "```")
-	if lastFence < 0 {
-		return -1, false
-	}
-	if strings.Count(segment, "```")%2 == 1 {
-		return lastFence, true
-	}
-	return -1, false
-}
--- a/internal/adapter/openai/tool_sieve_state.go
+++ b/internal/adapter/openai/tool_sieve_state.go
@@ -6,19 +6,22 @@ import (
 )

 type toolStreamSieveState struct {
-	pending          strings.Builder
-	capture          strings.Builder
-	capturing        bool
-	recentTextTail   string
-	pendingToolRaw   string
-	pendingToolCalls []toolcall.ParsedToolCall
-	disableDeltas    bool
-	toolNameSent     bool
-	toolName         string
-	toolArgsStart    int
-	toolArgsSent     int
-	toolArgsString   bool
-	toolArgsDone     bool
+	pending               strings.Builder
+	capture               strings.Builder
+	capturing             bool
+	codeFenceStack        []int
+	codeFencePendingTicks int
+	codeFenceLineStart    bool
+	recentTextTail        string
+	pendingToolRaw        string
+	pendingToolCalls      []toolcall.ParsedToolCall
+	disableDeltas         bool
+	toolNameSent          bool
+	toolName              string
+	toolArgsStart         int
+	toolArgsSent          int
+	toolArgsString        bool
+	toolArgsDone          bool
 }

 type toolStreamEvent struct {
@@ -47,9 +50,10 @@ func (s *toolStreamSieveState) resetIncrementalToolState() {
 }

 func (s *toolStreamSieveState) noteText(content string) {
-	if content == "" {
+	if !hasMeaningfulText(content) {
 		return
 	}
+	updateCodeFenceState(s, content)
 	s.recentTextTail = appendTail(s.recentTextTail, content, toolSieveContextTailLimit)
 }

@@ -63,3 +67,107 @@ func appendTail(prev, next string, max int) string {
 	}
 	return combined[len(combined)-max:]
 }
+
+func hasMeaningfulText(text string) bool {
+	return strings.TrimSpace(text) != ""
+}
+
+func insideCodeFenceWithState(state *toolStreamSieveState, text string) bool {
+	if state == nil {
+		return insideCodeFence(text)
+	}
+	simulated := simulateCodeFenceState(
+		state.codeFenceStack,
+		state.codeFencePendingTicks,
+		state.codeFenceLineStart,
+		text,
+	)
+	return len(simulated.stack) > 0
+}
+
+func insideCodeFence(text string) bool {
+	if text == "" {
+		return false
+	}
+	return len(simulateCodeFenceState(nil, 0, true, text).stack) > 0
+}
+
+func updateCodeFenceState(state *toolStreamSieveState, text string) {
+	if state == nil || !hasMeaningfulText(text) {
+		return
+	}
+	next := simulateCodeFenceState(
+		state.codeFenceStack,
+		state.codeFencePendingTicks,
+		state.codeFenceLineStart,
+		text,
+	)
+	state.codeFenceStack = next.stack
+	state.codeFencePendingTicks = next.pendingTicks
+	state.codeFenceLineStart = next.lineStart
+}
+
+type codeFenceSimulation struct {
+	stack        []int
+	pendingTicks int
+	lineStart    bool
+}
+
+func simulateCodeFenceState(stack []int, pendingTicks int, lineStart bool, text string) codeFenceSimulation {
+	chunk := text
+	nextStack := append([]int(nil), stack...)
+	ticks := pendingTicks
+	atLineStart := lineStart
+
+	flushTicks := func() {
+		if ticks > 0 {
+			if atLineStart && ticks >= 3 {
+				applyFenceMarker(&nextStack, ticks)
+			}
+			atLineStart = false
+			ticks = 0
+		}
+	}
+
+	for i := 0; i < len(chunk); i++ {
+		ch := chunk[i]
+		if ch == '`' {
+			ticks++
+			continue
+		}
+		flushTicks()
+		switch ch {
+		case '\n', '\r':
+			atLineStart = true
+		case ' ', '\t':
+			if atLineStart {
+				continue
+			}
+			atLineStart = false
+		default:
+			atLineStart = false
+		}
+	}
+
+	return codeFenceSimulation{
+		stack:        nextStack,
+		pendingTicks: ticks,
+		lineStart:    atLineStart,
+	}
+}
+
+func applyFenceMarker(stack *[]int, ticks int) {
+	if stack == nil || ticks <= 0 {
+		return
+	}
+	if len(*stack) == 0 {
+		*stack = append(*stack, ticks)
+		return
+	}
+	top := (*stack)[len(*stack)-1]
+	if ticks >= top {
+		*stack = (*stack)[:len(*stack)-1]
+		return
+	}
+	*stack = append(*stack, ticks)
+}
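The fence tracking added above is easier to follow when run chunk by chunk. The following is a minimal standalone sketch, assuming the same rules as the diff: a run of three or more backticks counts as a fence marker only at the start of a line, and a marker at least as long as the innermost open fence closes it. The `fenceTracker` type and its method names are hypothetical and are not part of the project.

```go
package main

import (
	"fmt"
	"strings"
)

// fenceTracker is a hypothetical, simplified stand-in for the code-fence
// fields added to toolStreamSieveState; it is not the project's type.
type fenceTracker struct {
	stack        []int // backtick counts of the fences that are currently open
	pendingTicks int   // backticks counted so far in an unfinished run
	lineStart    bool  // true while only whitespace has been seen on this line
}

// feed consumes one streamed chunk and updates the tracker in place.
func (f *fenceTracker) feed(chunk string) {
	flush := func() {
		if f.pendingTicks == 0 {
			return
		}
		// Only a run of 3+ backticks at the start of a line is a fence marker.
		if f.lineStart && f.pendingTicks >= 3 {
			f.applyMarker(f.pendingTicks)
		}
		f.lineStart = false
		f.pendingTicks = 0
	}
	for i := 0; i < len(chunk); i++ {
		ch := chunk[i]
		if ch == '`' {
			f.pendingTicks++ // the run may continue in the next chunk
			continue
		}
		flush()
		switch ch {
		case '\n', '\r':
			f.lineStart = true
		case ' ', '\t':
			// Indentation leaves the line-start flag as it was.
		default:
			f.lineStart = false
		}
	}
}

// applyMarker closes the innermost fence when the marker is at least as long
// as the one that opened it; otherwise it opens a nested fence.
func (f *fenceTracker) applyMarker(ticks int) {
	if n := len(f.stack); n > 0 && ticks >= f.stack[n-1] {
		f.stack = f.stack[:n-1]
		return
	}
	f.stack = append(f.stack, ticks)
}

func (f *fenceTracker) inside() bool { return len(f.stack) > 0 }

func main() {
	fence := strings.Repeat("`", 3)
	f := &fenceTracker{lineStart: true} // assumption: the stream starts at a line start
	chunks := []string{"Before.\n" + fence, "xml\n<tool_call>", "</tool_call>\n" + fence + "\n", "After."}
	for _, chunk := range chunks {
		f.feed(chunk)
		fmt.Printf("%q inside=%v\n", chunk, f.inside())
	}
}
```

In this sketch the first chunk ends in a backtick run that is still pending, so the fence only opens once the second chunk delivers a non-backtick character. That is why the state carries `pendingTicks` across chunk boundaries.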
--- a/internal/adapter/openai/tool_sieve_xml.go
+++ b/internal/adapter/openai/tool_sieve_xml.go
@@ -26,8 +26,8 @@ var xmlToolCallTagPairs = []struct{ open, close string }{
 	{"<invoke", "</invoke>"},
 	{"<tool_use", "</tool_use>"},
 	// Agent-style: these are XML "tool call" patterns from coding agents.
-	// They get captured → parsed. If parsing fails, the block is consumed
-	// (swallowed) to prevent raw XML from leaking to the client.
+	// They get captured → parsed. If parsing fails, the raw XML is preserved
+	// so the caller can still see the original text.
 	{"<attempt_completion", "</attempt_completion>"},
 	{"<ask_followup_question", "</ask_followup_question>"},
 	{"<new_task", "</new_task>"},
@@ -73,31 +73,12 @@ func consumeXMLToolCapture(captured string, toolNames []string) (prefix string,
 			prefixPart, suffixPart = trimWrappingJSONFence(prefixPart, suffixPart)
 			return prefixPart, parsed, suffixPart, true
 		}
-		// If this block does not look like an executable tool-call payload,
-		// pass it through as normal content (e.g. user-requested XML snippets).
-		if !looksLikeExecutableXMLToolCallBlock(xmlBlock, pair.open) {
-			return prefixPart + xmlBlock, nil, suffixPart, true
-		}
-		// Looks like XML tool syntax but failed to parse — consume it to avoid leak.
-		return prefixPart, nil, suffixPart, true
+		// If this block failed to become a tool call, pass it through as text.
+		return prefixPart + xmlBlock, nil, suffixPart, true
 	}
 	return "", nil, "", false
 }

-func looksLikeExecutableXMLToolCallBlock(xmlBlock, openTag string) bool {
-	lower := strings.ToLower(xmlBlock)
-	// Agent wrapper tags are always treated as internal tool-call wrappers.
-	switch openTag {
-	case "<attempt_completion", "<ask_followup_question", "<new_task":
-		return true
-	}
-	return strings.Contains(lower, "<tool_name") ||
-		strings.Contains(lower, "<parameters") ||
-		strings.Contains(lower, `"tool"`) ||
-		strings.Contains(lower, `"tool_name"`) ||
-		strings.Contains(lower, `"name"`)
-}
-
 // hasOpenXMLToolTag returns true if captured text contains an XML tool opening tag
 // whose SPECIFIC closing tag has not appeared yet.
 func hasOpenXMLToolTag(captured string) bool {
@@ -137,32 +118,3 @@ func findPartialXMLToolTagStart(s string) int {
 	}
 	return -1
 }
-
-// looksLikeXMLToolTagFragment returns true if s looks like a fragment from a
-// split XML tool call tag — for example "tool_calls>" or "/tool_call>\n".
-// These fragments arise when '<' was consumed separately and the tail remains.
-func looksLikeXMLToolTagFragment(s string) bool {
-	trimmed := strings.TrimSpace(s)
-	if trimmed == "" {
-		return false
-	}
-	lower := strings.ToLower(trimmed)
-	// Check for closing tag tails like "tool_calls>" or "/tool_calls>"
-	fragments := []string{
-		"tool_calls>", "tool_call>", "/tool_calls>", "/tool_call>",
-		"function_calls>", "function_call>", "/function_calls>", "/function_call>",
-		"invoke>", "/invoke>", "tool_use>", "/tool_use>",
-		"tool_name>", "/tool_name>", "parameters>", "/parameters>",
-		// Agent-style tag fragments
-		"attempt_completion>", "/attempt_completion>",
-		"ask_followup_question>", "/ask_followup_question>",
-		"new_task>", "/new_task>",
-		"result>", "/result>",
-	}
-	for _, f := range fragments {
-		if strings.Contains(lower, f) {
-			return true
-		}
-	}
-	return false
-}
--- a/internal/adapter/openai/tool_sieve_xml_test.go
+++ b/internal/adapter/openai/tool_sieve_xml_test.go
@@ -121,6 +121,105 @@ func TestProcessToolSieveNonToolXMLKeepsSuffixForToolParsing(t *testing.T) {
 	}
 }

+func TestProcessToolSievePassesThroughMalformedExecutableXMLBlock(t *testing.T) {
+	var state toolStreamSieveState
+	chunk := `<tool_call><parameters>{"path":"README.md"}</parameters></tool_call>`
+	events := processToolSieveChunk(&state, chunk, []string{"read_file"})
+	events = append(events, flushToolSieve(&state, []string{"read_file"})...)
+
+	var textContent strings.Builder
+	toolCalls := 0
+	for _, evt := range events {
+		textContent.WriteString(evt.Content)
+		toolCalls += len(evt.ToolCalls)
+	}
+
+	if toolCalls != 0 {
+		t.Fatalf("expected malformed executable-looking XML to stay text, got %d tool calls events=%#v", toolCalls, events)
+	}
+	if textContent.String() != chunk {
+		t.Fatalf("expected malformed executable-looking XML to pass through unchanged, got %q", textContent.String())
+	}
+}
+
+func TestProcessToolSievePassesThroughFencedXMLToolCallExamples(t *testing.T) {
+	var state toolStreamSieveState
+	input := strings.Join([]string{
+		"Before first example.\n```",
+		"xml\n<tool_call><tool_name>read_file</tool_name><parameters>{\"path\":\"README.md\"}</parameters></tool_call>\n```\n",
+		"Between examples.\n```xml\n",
+		"<tool_call><tool_name>search</tool_name><parameters>{\"q\":\"golang\"}</parameters></tool_call>\n",
+		"```\nAfter examples.",
+	}, "")
+
+	chunks := []string{
+		"Before first example.\n```",
+		"xml\n<tool_call><tool_name>read_file</tool_name><parameters>{\"path\":\"README.md\"}</parameters></tool_call>\n```\n",
+		"Between examples.\n```xml\n",
+		"<tool_call><tool_name>search</tool_name><parameters>{\"q\":\"golang\"}</parameters></tool_call>\n",
+		"```\nAfter examples.",
+	}
+
+	var events []toolStreamEvent
+	for _, c := range chunks {
+		events = append(events, processToolSieveChunk(&state, c, []string{"read_file", "search"})...)
+	}
+	events = append(events, flushToolSieve(&state, []string{"read_file", "search"})...)
+
+	var textContent strings.Builder
+	toolCalls := 0
+	for _, evt := range events {
+		if evt.Content != "" {
+			textContent.WriteString(evt.Content)
+		}
+		toolCalls += len(evt.ToolCalls)
+	}
+
+	if toolCalls != 0 {
+		t.Fatalf("expected fenced XML examples to stay text, got %d tool calls events=%#v", toolCalls, events)
+	}
+	if textContent.String() != input {
+		t.Fatalf("expected fenced XML examples to pass through unchanged, got %q", textContent.String())
+	}
+}
+
+func TestProcessToolSieveKeepsPartialXMLTagInsideFencedExample(t *testing.T) {
+	var state toolStreamSieveState
+	input := strings.Join([]string{
+		"Example:\n```xml\n<tool_ca",
+		"ll><tool_name>read_file</tool_name><parameters>{\"path\":\"README.md\"}</parameters></tool_call>\n```\n",
+		"Done.",
+	}, "")
+
+	chunks := []string{
+		"Example:\n```xml\n<tool_ca",
+		"ll><tool_name>read_file</tool_name><parameters>{\"path\":\"README.md\"}</parameters></tool_call>\n```\n",
+		"Done.",
+	}
+
+	var events []toolStreamEvent
+	for _, c := range chunks {
+		events = append(events, processToolSieveChunk(&state, c, []string{"read_file"})...)
+	}
+	events = append(events, flushToolSieve(&state, []string{"read_file"})...)
+
+	var textContent strings.Builder
+	toolCalls := 0
+	for _, evt := range events {
+		if evt.Content != "" {
+			textContent.WriteString(evt.Content)
+		}
+		toolCalls += len(evt.ToolCalls)
+	}
+
+	if toolCalls != 0 {
+		t.Fatalf("expected partial fenced XML to stay text, got %d tool calls events=%#v", toolCalls, events)
+	}
+	if textContent.String() != input {
+		t.Fatalf("expected partial fenced XML to pass through unchanged, got %q", textContent.String())
+	}
+}
+
 func TestProcessToolSievePartialXMLTagHeldBack(t *testing.T) {
 	var state toolStreamSieveState
 	// Chunk ends with a partial XML tool tag.
@@ -147,15 +246,16 @@ func TestFindToolSegmentStartDetectsXMLToolCalls(t *testing.T) {
 		want  int
 	}{
 		{"tool_calls_tag", "some text <tool_calls>\n", 10},
-		{"gemini_function_call_json", `some text {"functionCall":{"name":"search","args":{"q":"latest"}}}`, 10},
 		{"tool_call_tag", "prefix <tool_call>\n", 7},
 		{"invoke_tag", "text <invoke name=\"foo\">body</invoke>", 5},
+		{"xml_inside_code_fence", "```xml\n<tool_call><tool_name>read_file</tool_name></tool_call>\n```", -1},
 		{"function_call_tag", "<function_call name=\"foo\">body</function_call>", 0},
 		{"no_xml", "just plain text", -1},
+		{"gemini_json_no_detect", `some text {"functionCall":{"name":"search"}}`, -1},
 	}
 	for _, tc := range cases {
 		t.Run(tc.name, func(t *testing.T) {
-			got := findToolSegmentStart(tc.input)
+			got := findToolSegmentStart(nil, tc.input)
 			if got != tc.want {
 				t.Fatalf("findToolSegmentStart(%q) = %d, want %d", tc.input, got, tc.want)
 			}
@@ -163,81 +263,6 @@ func TestFindToolSegmentStartDetectsXMLToolCalls(t *testing.T) {
 	}
 }

-func TestFindToolSegmentStartIgnoresFunctionCallProse(t *testing.T) {
-	input := "Please explain the functionCall API field and how clients should parse it."
-	if got := findToolSegmentStart(input); got != -1 {
-		t.Fatalf("expected no tool segment start for prose, got %d", got)
-	}
-}
-
-func TestFindToolSegmentStartDetectsQuotedFunctionCallKey(t *testing.T) {
-	input := `prefix {"functionCall": {"name":"search_web","args":{"query":"x"}}}`
-	want := strings.Index(input, "{")
-	if got := findToolSegmentStart(input); got != want {
-		t.Fatalf("expected JSON object start %d, got %d", want, got)
-	}
-}
-
-func TestFindToolSegmentStartDetectsLooseFunctionCallKey(t *testing.T) {
-	input := `prefix {functionCall: {"name":"search_web","args":{"query":"x"}}}`
-	want := strings.Index(input, "{")
-	if got := findToolSegmentStart(input); got != want {
-		t.Fatalf("expected JSON object start %d, got %d", want, got)
-	}
-}
-
-func TestFindToolSegmentStartPrefersQuotedFunctionCallOverEarlierBareProse(t *testing.T) {
-	input := `prefix {note} functionCall: docs hint {"functionCall":{"name":"search_web","args":{"query":"x"}}}`
-	want := strings.Index(input, `{"functionCall"`)
-	if got := findToolSegmentStart(input); got != want {
-		t.Fatalf("expected quoted functionCall JSON start %d, got %d", want, got)
-	}
-}
-
-func TestFindToolSegmentStartIgnoresLooseFunctionCallProse(t *testing.T) {
-	input := "Please explain why functionCall: is used in documentation examples."
-	if got := findToolSegmentStart(input); got != -1 {
-		t.Fatalf("expected no tool segment start for prose, got %d", got)
-	}
-}
-
-func TestProcessToolSieveDoesNotBufferFunctionCallProse(t *testing.T) {
-	var state toolStreamSieveState
-	chunk := "Please explain the functionCall API field and keep streaming this sentence."
-	events := processToolSieveChunk(&state, chunk, []string{"search_web"})
-	var text string
-	for _, evt := range events {
-		text += evt.Content
-		if len(evt.ToolCalls) > 0 {
-			t.Fatalf("expected no tool calls for prose, got %#v", evt.ToolCalls)
-		}
-	}
-	if text != chunk {
-		t.Fatalf("expected prose to pass through immediately, got %q", text)
-	}
-}
-
-func TestProcessToolSieveDetectsGeminiFunctionCallPayload(t *testing.T) {
-	var state toolStreamSieveState
-	events := processToolSieveChunk(&state, `{"functionCall":{"name":"search_web","args":{"query":"latest"}}}`, []string{"search_web"})
-	events = append(events, flushToolSieve(&state, []string{"search_web"})...)
-
-	var textContent string
-	var toolCalls int
-	for _, evt := range events {
-		if evt.Content != "" {
-			textContent += evt.Content
-		}
-		toolCalls += len(evt.ToolCalls)
-	}
-	if toolCalls != 1 {
-		t.Fatalf("expected one tool call from functionCall payload, got events=%#v", events)
-	}
-	if strings.Contains(strings.ToLower(textContent), "functioncall") {
-		t.Fatalf("functionCall json leaked into text content: %q", textContent)
-	}
-}
-
 func TestFindPartialXMLToolTagStart(t *testing.T) {
 	cases := []struct {
 		name  string
@@ -344,8 +369,8 @@ func TestProcessToolSieveTokenByTokenXMLNoLeak(t *testing.T) {
 	}
 }

-// Test that flushToolSieve on incomplete XML does NOT leak the raw XML content.
-func TestFlushToolSieveIncompleteXMLDoesNotLeak(t *testing.T) {
+// Test that flushToolSieve on incomplete XML falls back to raw text.
+func TestFlushToolSieveIncompleteXMLFallsBackToText(t *testing.T) {
 	var state toolStreamSieveState
 	// XML block starts but stream ends before completion.
 	chunks := []string{
@@ -367,8 +392,8 @@ func TestFlushToolSieveIncompleteXMLDoesNotLeak(t *testing.T) {
 		}
 	}

-	if strings.Contains(textContent, "<tool_call") {
-		t.Fatalf("incomplete XML leaked on flush: %q", textContent)
+	if textContent != strings.Join(chunks, "") {
+		t.Fatalf("expected incomplete XML to fall back to raw text, got %q", textContent)
 	}
 }

@@ -405,10 +430,10 @@ func TestOpeningXMLTagNotLeakedAsContent(t *testing.T) {
 	}
 }

-func TestProcessToolSieveInterceptsAttemptCompletionLeak(t *testing.T) {
+func TestProcessToolSieveFallsBackToRawAttemptCompletion(t *testing.T) {
 	var state toolStreamSieveState
-	// Simulate an agent outputting attempt_completion XML tag
-	// which shouldn't leak to text output, even if it fails to parse as a valid tool.
+	// Simulate an agent outputting attempt_completion XML tag.
+	// If it does not parse as a tool call, it should fall back to raw text.
 	chunks := []string{
 		"Done with task.\n",
 		"<attempt_completion>\n",
@@ -432,7 +457,7 @@ func TestProcessToolSieveInterceptsAttemptCompletionLeak(t *testing.T) {
 		t.Fatalf("expected leading text to be emitted, got %q", textContent)
 	}

-	if strings.Contains(textContent, "<attempt_completion>") || strings.Contains(textContent, "result>") {
-		t.Fatalf("agent XML tag content leaked to text: %q", textContent)
+	if textContent != strings.Join(chunks, "") {
+		t.Fatalf("expected agent XML to fall back to raw text, got %q", textContent)
 	}
 }
--- a/internal/adapter/openai/upstream_empty.go
+++ b/internal/adapter/openai/upstream_empty.go
@@ -2,8 +2,8 @@ package openai

 import "net/http"

-func writeUpstreamEmptyOutputError(w http.ResponseWriter, thinking, text string, contentFilter bool) bool {
-	if thinking != "" || text != "" {
+func writeUpstreamEmptyOutputError(w http.ResponseWriter, text string, contentFilter bool) bool {
+	if text != "" {
 		return false
 	}
 	if contentFilter {
--- a/internal/adapter/openai/vercel_stream.go
+++ b/internal/adapter/openai/vercel_stream.go
@@ -52,6 +52,10 @@ func (h *Handler) handleVercelStreamPrepare(w http.ResponseWriter, r *http.Reque
 		writeOpenAIError(w, http.StatusBadRequest, "invalid json")
 		return
 	}
+	if err := h.preprocessInlineFileInputs(r.Context(), a, req); err != nil {
+		writeOpenAIInlineFileError(w, err)
+		return
+	}
 	if !util.ToBool(req["stream"]) {
 		writeOpenAIError(w, http.StatusBadRequest, "stream must be true")
 		return
--- a/internal/admin/handler_accounts_crud.go
+++ b/internal/admin/handler_accounts_crud.go
@@ -21,8 +21,8 @@ func (h *Handler) listAccounts(w http.ResponseWriter, r *http.Request) {
 	if pageSize < 1 {
 		pageSize = 1
 	}
-	if pageSize > 100 {
-		pageSize = 100
+	if pageSize > 5000 {
+		pageSize = 5000
 	}
 	accounts := h.Store.Snapshot().Accounts
 	reverseAccounts(accounts)
--- a/internal/admin/handler_accounts_crud_test.go
+++ b/internal/admin/handler_accounts_crud_test.go
@@ -0,0 +1,53 @@
+package admin
+
+import (
+	"encoding/json"
+	"fmt"
+	"net/http"
+	"net/http/httptest"
+	"strings"
+	"testing"
+)
+
+func TestListAccountsPageSizeCapIs5000(t *testing.T) {
+	accounts := make([]string, 0, 150)
+	for i := range 150 {
+		accounts = append(accounts, fmt.Sprintf(`{"email":"u%d@example.com","password":"pwd"}`, i))
+	}
+	raw := fmt.Sprintf(`{"accounts":[%s]}`, strings.Join(accounts, ","))
+	router := newHTTPAdminHarness(t, raw, &testingDSMock{})
+
+	rec := httptest.NewRecorder()
+	router.ServeHTTP(rec, adminReq(http.MethodGet, "/accounts?page=1&page_size=200", nil))
+	if rec.Code != http.StatusOK {
+		t.Fatalf("unexpected status: %d body=%s", rec.Code, rec.Body.String())
+	}
+	var payload map[string]any
+	if err := json.Unmarshal(rec.Body.Bytes(), &payload); err != nil {
+		t.Fatalf("decode response: %v", err)
+	}
+	items, _ := payload["items"].([]any)
+	if len(items) != 150 {
+		t.Fatalf("expected all 150 accounts with page_size=200, got %d", len(items))
+	}
+	if ps, _ := payload["page_size"].(float64); ps != 200 {
+		t.Fatalf("expected page_size=200 in response, got %v", payload["page_size"])
+	}
+}
+
+func TestListAccountsPageSizeAbove5000ClampedTo5000(t *testing.T) {
+	router := newHTTPAdminHarness(t, `{"accounts":[{"email":"u@example.com","password":"pwd"}]}`, &testingDSMock{})
+
+	rec := httptest.NewRecorder()
+	router.ServeHTTP(rec, adminReq(http.MethodGet, "/accounts?page=1&page_size=9999", nil))
+	if rec.Code != http.StatusOK {
+		t.Fatalf("unexpected status: %d body=%s", rec.Code, rec.Body.String())
+	}
+	var payload map[string]any
+	if err := json.Unmarshal(rec.Body.Bytes(), &payload); err != nil {
+		t.Fatalf("decode response: %v", err)
+	}
+	if ps, _ := payload["page_size"].(float64); ps != 5000 {
+		t.Fatalf("expected page_size clamped to 5000, got %v", payload["page_size"])
+	}
+}
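The two tests above pin down the new bounds on `page_size`. A minimal sketch of the clamping they exercise, assuming the handler only enforces a lower bound of 1 and an upper bound of 5000 as shown in the `listAccounts` hunk; `clampPageSize` and the two constants are hypothetical names used for illustration only:

```go
package main

import "fmt"

const (
	minPageSize = 1    // lower bound from the listAccounts hunk
	maxPageSize = 5000 // upper bound raised from 100 in this change
)

// clampPageSize is a hypothetical helper that mirrors the two if-statements
// in listAccounts; the handler itself inlines this logic.
func clampPageSize(requested int) int {
	if requested < minPageSize {
		return minPageSize
	}
	if requested > maxPageSize {
		return maxPageSize
	}
	return requested
}

func main() {
	for _, n := range []int{0, 200, 9999} {
		fmt.Printf("page_size=%d -> %d\n", n, clampPageSize(n))
	}
}
```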
--- a/internal/compat/go_compat_test.go
+++ b/internal/compat/go_compat_test.go
@@ -1,12 +1,10 @@
 package compat

 import (
-	"ds2api/internal/toolcall"
 	"encoding/json"
 	"os"
 	"path/filepath"
 	"reflect"
-	"strings"
 	"testing"

 	"ds2api/internal/sse"
@@ -65,55 +63,6 @@ func TestGoCompatSSEFixtures(t *testing.T) {
 	}
 }

-func TestGoCompatToolcallFixtures(t *testing.T) {
-	files, err := filepath.Glob(compatPath("fixtures", "toolcalls", "*.json"))
-	if err != nil {
-		t.Fatalf("glob toolcall fixtures failed: %v", err)
-	}
-	if len(files) == 0 {
-		t.Fatal("no toolcall fixtures found")
-	}
-	for _, fixturePath := range files {
-		name := trimExt(filepath.Base(fixturePath))
-		expectedPath := compatPath("expected", "toolcalls_"+name+".json")
-
-		var fixture struct {
-			Text      string   `json:"text"`
-			ToolNames []string `json:"tool_names"`
-			Mode      string   `json:"mode"`
-		}
-		mustLoadJSON(t, fixturePath, &fixture)
-
-		var expected struct {
-			Calls             []toolcall.ParsedToolCall `json:"calls"`
-			SawToolCallSyntax bool                      `json:"sawToolCallSyntax"`
-			RejectedByPolicy  bool                      `json:"rejectedByPolicy"`
-			RejectedToolNames []string                  `json:"rejectedToolNames"`
-		}
-		mustLoadJSON(t, expectedPath, &expected)
-
-		var got toolcall.ToolCallParseResult
-		switch strings.ToLower(strings.TrimSpace(fixture.Mode)) {
-		case "standalone":
-			got = toolcall.ParseStandaloneToolCallsDetailed(fixture.Text, fixture.ToolNames)
-		default:
-			got = toolcall.ParseToolCallsDetailed(fixture.Text, fixture.ToolNames)
-		}
-		if got.Calls == nil {
-			got.Calls = []toolcall.ParsedToolCall{}
-		}
-		if got.RejectedToolNames == nil {
-			got.RejectedToolNames = []string{}
-		}
-		if !reflect.DeepEqual(got.Calls, expected.Calls) ||
-			got.SawToolCallSyntax != expected.SawToolCallSyntax ||
-			got.RejectedByPolicy != expected.RejectedByPolicy ||
-			!reflect.DeepEqual(got.RejectedToolNames, expected.RejectedToolNames) {
-			t.Fatalf("toolcall fixture %s mismatch:\n got=%#v\nwant=%#v", name, got, expected)
-		}
-	}
-}
-
 func TestGoCompatTokenFixtures(t *testing.T) {
 	var fixture struct {
 		Cases []struct {
--- a/internal/deepseek/client_auth.go
+++ b/internal/deepseek/client_auth.go
@@ -91,17 +91,25 @@ func (c *Client) CreateSession(ctx context.Context, a *auth.RequestAuth, maxAtte
 }

 func (c *Client) GetPow(ctx context.Context, a *auth.RequestAuth, maxAttempts int) (string, error) {
+	return c.GetPowForTarget(ctx, a, DeepSeekCompletionTargetPath, maxAttempts)
+}
+
+func (c *Client) GetPowForTarget(ctx context.Context, a *auth.RequestAuth, targetPath string, maxAttempts int) (string, error) {
 	if maxAttempts <= 0 {
 		maxAttempts = c.maxRetries
 	}
+	targetPath = strings.TrimSpace(targetPath)
+	if targetPath == "" {
+		targetPath = DeepSeekCompletionTargetPath
+	}
 	clients := c.requestClientsForAuth(ctx, a)
 	attempts := 0
 	refreshed := false
 	for attempts < maxAttempts {
 		headers := c.authHeaders(a.DeepSeekToken)
-		resp, status, err := c.postJSONWithStatus(ctx, clients.regular, clients.fallback, DeepSeekCreatePowURL, headers, map[string]any{"target_path": "/api/v0/chat/completion"})
+		resp, status, err := c.postJSONWithStatus(ctx, clients.regular, clients.fallback, DeepSeekCreatePowURL, headers, map[string]any{"target_path": targetPath})
 		if err != nil {
-			config.Logger.Warn("[get_pow] request error", "error", err, "account", a.AccountID)
+			config.Logger.Warn("[get_pow] request error", "error", err, "account", a.AccountID, "target_path", targetPath)
 			attempts++
 			continue
 		}
@@ -117,7 +125,7 @@ func (c *Client) GetPow(ctx context.Context, a *auth.RequestAuth, maxAttempts in
 			}
 			return BuildPowHeader(challenge, answer)
 		}
-		config.Logger.Warn("[get_pow] failed", "status", status, "code", code, "biz_code", bizCode, "msg", msg, "biz_msg", bizMsg, "use_config_token", a.UseConfigToken, "account", a.AccountID)
+		config.Logger.Warn("[get_pow] failed", "status", status, "code", code, "biz_code", bizCode, "msg", msg, "biz_msg", bizMsg, "use_config_token", a.UseConfigToken, "account", a.AccountID, "target_path", targetPath)
 		if a.UseConfigToken {
 			if !refreshed && shouldAttemptRefresh(status, code, bizCode, msg, bizMsg) {
 				if c.Auth.RefreshToken(ctx, a) {
--- a/internal/deepseek/client_completion.go
+++ b/internal/deepseek/client_completion.go
@@ -51,6 +51,7 @@ func (c *Client) streamPost(ctx context.Context, doer trans.Doer, url string, he
 	if err != nil {
 		return nil, err
 	}
+	headers = c.jsonHeaders(headers)
 	clients := c.requestClientsFromContext(ctx)
 	req, err := http.NewRequestWithContext(ctx, http.MethodPost, url, bytes.NewReader(b))
 	if err != nil {
--- a/internal/deepseek/client_file_status.go
+++ b/internal/deepseek/client_file_status.go
@@ -0,0 +1,188 @@
+package deepseek
+
+import (
+	"context"
+	"errors"
+	"fmt"
+	"net/http"
+	"net/url"
+	"strings"
+	"time"
+
+	"ds2api/internal/auth"
+	"ds2api/internal/config"
+)
+
+const (
+	fileReadyPollAttempts = 60
+	fileReadyPollInterval = time.Second
+	fileReadyPollTimeout  = 65 * time.Second
+)
+
+var fileReadySleep = time.Sleep
+
+func (c *Client) waitForUploadedFile(ctx context.Context, a *auth.RequestAuth, result *UploadFileResult) error {
+	if result == nil || strings.TrimSpace(result.ID) == "" {
+		return nil
+	}
+	if isReadyUploadFileStatus(result.Status) {
+		return nil
+	}
+
+	pollCtx, cancel := context.WithTimeout(ctx, fileReadyPollTimeout)
+	defer cancel()
+
+	var lastErr error
+	for attempt := 0; attempt < fileReadyPollAttempts; attempt++ {
+		if err := pollCtx.Err(); err != nil {
+			if lastErr != nil {
+				return fmt.Errorf("waiting for file %s to become ready: %w", result.ID, lastErr)
+			}
+			return fmt.Errorf("waiting for file %s to become ready: %w", result.ID, err)
+		}
+
+		fetched, err := c.fetchUploadedFile(pollCtx, a, result.ID)
+		if err == nil && fetched != nil {
+			mergeUploadFileResults(result, fetched)
+			if isReadyUploadFileStatus(result.Status) {
+				return nil
+			}
+			lastErr = fmt.Errorf("status=%s", strings.TrimSpace(result.Status))
+		} else if err != nil {
+			lastErr = err
+			config.Logger.Debug("[upload_file] waiting for file readiness", "file_id", result.ID, "attempt", attempt+1, "error", err)
+		}
+
+		if attempt < fileReadyPollAttempts-1 {
+			fileReadySleep(fileReadyPollInterval)
+		}
+	}
+
+	if lastErr == nil {
+		lastErr = fmt.Errorf("status=%s", strings.TrimSpace(result.Status))
+	}
+	return fmt.Errorf("file %s did not become ready: %w", result.ID, lastErr)
+}
+
+func (c *Client) fetchUploadedFile(ctx context.Context, a *auth.RequestAuth, fileID string) (*UploadFileResult, error) {
+	fileID = strings.TrimSpace(fileID)
+	if fileID == "" {
+		return nil, errors.New("file id is required")
+	}
+	clients := c.requestClientsForAuth(ctx, a)
+	reqURL := DeepSeekFetchFilesURL + "?file_ids=" + url.QueryEscape(fileID)
+	headers := c.authHeaders(a.DeepSeekToken)
+
+	resp, status, err := c.getJSONWithStatus(ctx, clients.regular, reqURL, headers)
+	if err != nil {
+		return nil, err
+	}
+
+	code, bizCode, msg, bizMsg := extractResponseStatus(resp)
+	if status != http.StatusOK || code != 0 || bizCode != 0 {
+		if strings.TrimSpace(bizMsg) != "" {
+			msg = bizMsg
+		}
+		if msg == "" {
+			msg = http.StatusText(status)
+		}
+		return nil, fmt.Errorf("request failed: status=%d, code=%d, msg=%s", status, code, msg)
+	}
+
+	result := extractFetchedUploadFileResult(resp, fileID)
+	if result == nil || strings.TrimSpace(result.ID) == "" {
+		return nil, errors.New("fetch files succeeded without matching file data")
+	}
+	result.Raw = resp
+	return result, nil
+}
+
+func extractFetchedUploadFileResult(resp map[string]any, targetID string) *UploadFileResult {
+	targetID = strings.TrimSpace(targetID)
+	if resp == nil || targetID == "" {
+		return nil
+	}
+
+	var walk func(any) *UploadFileResult
+	walk = func(v any) *UploadFileResult {
+		switch x := v.(type) {
+		case map[string]any:
+			if result := buildUploadFileResultFromMap(x, targetID); result != nil {
+				return result
+			}
+			for _, nested := range x {
+				if result := walk(nested); result != nil {
+					return result
+				}
+			}
+		case []any:
+			for _, item := range x {
+				if result := walk(item); result != nil {
+					return result
+				}
+			}
+		}
+		return nil
+	}
+
+	if result := walk(resp); result != nil {
+		return result
+	}
+	return nil
+}
+
+func buildUploadFileResultFromMap(m map[string]any, targetID string) *UploadFileResult {
+	fileID := strings.TrimSpace(firstNonEmptyString(m, "id", "file_id"))
+	if fileID == "" || !strings.EqualFold(fileID, targetID) {
+		return nil
+	}
+	result := &UploadFileResult{
+		ID:       fileID,
+		Filename: firstNonEmptyString(m, "name", "filename", "file_name"),
+		Status:   firstNonEmptyString(m, "status", "file_status"),
+		Purpose:  firstNonEmptyString(m, "purpose"),
+		IsImage:  firstBool(m, "is_image", "isImage"),
+		Bytes:    firstPositiveInt64(m, "bytes", "size", "file_size"),
+	}
+	if result.Status == "" {
+		result.Status = "uploaded"
+	}
+	return result
+}
+
+func mergeUploadFileResults(dst, src *UploadFileResult) {
+	if dst == nil || src == nil {
+		return
+	}
+	if strings.TrimSpace(src.ID) != "" {
+		dst.ID = strings.TrimSpace(src.ID)
+	}
+	if strings.TrimSpace(src.Filename) != "" {
+		dst.Filename = strings.TrimSpace(src.Filename)
+	}
+	if src.Bytes > 0 {
+		dst.Bytes = src.Bytes
+	}
+	if strings.TrimSpace(src.Status) != "" {
+		dst.Status = strings.TrimSpace(src.Status)
+	}
+	if strings.TrimSpace(src.Purpose) != "" {
+		dst.Purpose = strings.TrimSpace(src.Purpose)
+	}
+	dst.IsImage = src.IsImage
+	if len(src.Raw) > 0 {
+		dst.Raw = src.Raw
+	}
+	if src.RawHeaders != nil {
+		dst.RawHeaders = src.RawHeaders.Clone()
+	}
+}
+
+func isReadyUploadFileStatus(status string) bool {
+	switch strings.ToLower(strings.TrimSpace(status)) {
+	case "processed", "ready", "done", "available", "success", "completed", "finished":
+		return true
+	default:
+		return false
+	}
+}
--- a/internal/deepseek/client_http_helpers.go
+++ b/internal/deepseek/client_http_helpers.go
@@ -35,6 +35,12 @@ func preview(b []byte) string {
 	return s
 }

+func (c *Client) jsonHeaders(headers map[string]string) map[string]string {
+	out := cloneStringMap(headers)
+	out["Content-Type"] = "application/json"
+	return out
+}
+
 func ScanSSELines(resp *http.Response, onLine func([]byte) bool) error {
 	scanner := bufio.NewScanner(resp.Body)
 	buf := make([]byte, 0, 64*1024)
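`jsonHeaders` copies the caller's header map before setting `Content-Type`, so the map returned by `authHeaders` is not mutated by the JSON helpers. The diff only shows the call to `cloneStringMap`, not its body; the version below is an assumed implementation written for illustration, and `jsonHeaders` is shown as a free function rather than a `Client` method.

```go
package main

import "fmt"

// cloneStringMap is an assumed implementation; the diff only shows its call
// site. It always returns a non-nil map, so a nil input is safe to write to.
func cloneStringMap(in map[string]string) map[string]string {
	out := make(map[string]string, len(in)+1)
	for k, v := range in {
		out[k] = v
	}
	return out
}

// jsonHeaders mirrors the method added in client_http_helpers.go, minus the
// receiver: copy first, then force the JSON content type on the copy.
func jsonHeaders(headers map[string]string) map[string]string {
	out := cloneStringMap(headers)
	out["Content-Type"] = "application/json"
	return out
}

func main() {
	base := map[string]string{"Authorization": "Bearer example"}
	withJSON := jsonHeaders(base)
	fmt.Println(len(base), len(withJSON))
}
```

Keeping the copy matters for the upload path, which sets a multipart `Content-Type` on its own header map and would otherwise risk sharing state with JSON requests.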
--- a/internal/deepseek/client_http_json.go
+++ b/internal/deepseek/client_http_json.go
@@ -27,6 +27,7 @@ func (c *Client) postJSONWithStatus(ctx context.Context, doer trans.Doer, fallba
 	if err != nil {
 		return nil, 0, err
 	}
+	headers = c.jsonHeaders(headers)
 	req, err := http.NewRequestWithContext(ctx, http.MethodPost, url, bytes.NewReader(b))
 	if err != nil {
 		return nil, 0, err
--- a/internal/deepseek/client_upload.go
+++ b/internal/deepseek/client_upload.go
@@ -0,0 +1,282 @@
+package deepseek
+
+import (
+	"bytes"
+	"context"
+	"encoding/json"
+	"errors"
+	"fmt"
+	"mime/multipart"
+	"net/http"
+	"net/textproto"
+	"path/filepath"
+	"strconv"
+	"strings"
+
+	"ds2api/internal/auth"
+	"ds2api/internal/config"
+	trans "ds2api/internal/deepseek/transport"
+)
+
+type UploadFileRequest struct {
+	Filename    string
+	ContentType string
+	Purpose     string
+	Data        []byte
+}
+
+type UploadFileResult struct {
+	ID         string
+	Filename   string
+	Bytes      int64
+	Status     string
+	Purpose    string
+	AccountID  string
+	IsImage    bool
+	Raw        map[string]any
+	RawHeaders http.Header
+}
+
+func (c *Client) UploadFile(ctx context.Context, a *auth.RequestAuth, req UploadFileRequest, maxAttempts int) (*UploadFileResult, error) {
+	if maxAttempts <= 0 {
+		maxAttempts = c.maxRetries
+	}
+	if len(req.Data) == 0 {
+		return nil, errors.New("file is required")
+	}
+	filename := strings.TrimSpace(req.Filename)
+	if filename == "" {
+		filename = "upload.bin"
+	}
+	contentType := strings.TrimSpace(req.ContentType)
+	if contentType == "" {
+		contentType = "application/octet-stream"
+	}
+	purpose := strings.TrimSpace(req.Purpose)
+	body, contentTypeHeader, err := buildUploadMultipartBody(filename, contentType, req.Data)
+	if err != nil {
+		return nil, err
+	}
+	capturePayload := map[string]any{
+		"filename":     filename,
+		"content_type": contentType,
+		"purpose":      purpose,
+		"bytes":        len(req.Data),
+	}
+	captureSession := c.capture.Start("deepseek_upload_file", DeepSeekUploadFileURL, a.AccountID, capturePayload)
+	attempts := 0
+	refreshed := false
+	powHeader := ""
+	for attempts < maxAttempts {
+		clients := c.requestClientsForAuth(ctx, a)
+		if strings.TrimSpace(powHeader) == "" {
+			powHeader, err = c.GetPowForTarget(ctx, a, DeepSeekUploadTargetPath, maxAttempts)
+			if err != nil {
+				return nil, err
+			}
+			clients = c.requestClientsForAuth(ctx, a)
+		}
+		headers := c.authHeaders(a.DeepSeekToken)
+		headers["Content-Type"] = contentTypeHeader
+		headers["x-ds-pow-response"] = powHeader
+		headers["x-file-size"] = strconv.Itoa(len(req.Data))
+		headers["x-thinking-enabled"] = "1"
+		resp, err := c.doUpload(ctx, clients.regular, clients.fallback, DeepSeekUploadFileURL, headers, body)
+		if err != nil {
+			config.Logger.Warn("[upload_file] request error", "error", err, "account", a.AccountID, "filename", filename)
+			powHeader = ""
+			attempts++
+			continue
+		}
+		if captureSession != nil {
+			resp.Body = captureSession.WrapBody(resp.Body, resp.StatusCode)
+		}
+		payloadBytes, readErr := readResponseBody(resp)
+		_ = resp.Body.Close()
+		if readErr != nil {
+			powHeader = ""
+			attempts++
+			continue
+		}
+		parsed := map[string]any{}
+		if len(payloadBytes) > 0 {
+			if err := json.Unmarshal(payloadBytes, &parsed); err != nil {
+				config.Logger.Warn("[upload_file] json parse failed", "status", resp.StatusCode, "preview", preview(payloadBytes))
+			}
+		}
+		code, bizCode, msg, bizMsg := extractResponseStatus(parsed)
+		if resp.StatusCode == http.StatusOK && code == 0 && bizCode == 0 {
+			result := extractUploadFileResult(parsed)
+			result.Raw = parsed
+			result.RawHeaders = resp.Header.Clone()
+			if result.Filename == "" {
+				result.Filename = filename
+			}
+			if result.Bytes == 0 {
+				result.Bytes = int64(len(req.Data))
+			}
+			if result.Purpose == "" {
+				result.Purpose = purpose
+			}
+			if result.AccountID == "" {
+				result.AccountID = a.AccountID
+			}
+			if result.ID == "" {
+				return nil, errors.New("upload file succeeded without file id")
+			}
+			if err := c.waitForUploadedFile(ctx, a, result); err != nil {
+				return nil, err
+			}
+			return result, nil
+		}
+		config.Logger.Warn("[upload_file] failed", "status", resp.StatusCode, "code", code, "biz_code", bizCode, "msg", msg, "biz_msg", bizMsg, "account", a.AccountID, "filename", filename)
+		powHeader = ""
+		if a.UseConfigToken {
+			if !refreshed && shouldAttemptRefresh(resp.StatusCode, code, bizCode, msg, bizMsg) {
+				if c.Auth.RefreshToken(ctx, a) {
+					refreshed = true
+					attempts++
+					continue
+				}
+			}
+			if c.Auth.SwitchAccount(ctx, a) {
+				refreshed = false
+				attempts++
+				continue
+			}
+		}
+		attempts++
+	}
+	return nil, errors.New("upload file failed")
+}
+
+func buildUploadMultipartBody(filename, contentType string, data []byte) ([]byte, string, error) {
+	var buf bytes.Buffer
+	writer := multipart.NewWriter(&buf)
+	partHeader := textproto.MIMEHeader{}
+	partHeader.Set("Content-Disposition", fmt.Sprintf(`form-data; name="file"; filename=%q`, escapeMultipartFilename(filename)))
+	partHeader.Set("Content-Type", contentType)
+	part, err := writer.CreatePart(partHeader)
+	if err != nil {
+		return nil, "", err
+	}
+	if _, err := part.Write(data); err != nil {
+		return nil, "", err
+	}
+	if err := writer.Close(); err != nil {
+		return nil, "", err
+	}
+	return buf.Bytes(), writer.FormDataContentType(), nil
+}
+
+func escapeMultipartFilename(filename string) string {
+	filename = filepath.Base(strings.TrimSpace(filename))
+	filename = strings.ReplaceAll(filename, `\`, "_")
+	filename = strings.ReplaceAll(filename, `"`, "_")
+	if filename == "." || filename == "" {
+		return "upload.bin"
+	}
+	return filename
+}
+
+func (c *Client) doUpload(ctx context.Context, doer trans.Doer, fallback trans.Doer, url string, headers map[string]string, body []byte) (*http.Response, error) {
+	req, err := http.NewRequestWithContext(ctx, http.MethodPost, url, bytes.NewReader(body))
+	if err != nil {
+		return nil, err
+	}
+	for k, v := range headers {
+		req.Header.Set(k, v)
+	}
+	resp, err := doer.Do(req)
+	if err == nil {
+		return resp, nil
+	}
+	config.Logger.Warn("[deepseek] fingerprint upload request failed, fallback to std transport", "url", url, "error", err)
+	req2, reqErr := http.NewRequestWithContext(ctx, http.MethodPost, url, bytes.NewReader(body))
+	if reqErr != nil {
+		return nil, reqErr
+	}
+	for k, v := range headers {
+		req2.Header.Set(k, v)
+	}
+	return fallback.Do(req2)
+}
+
+func extractUploadFileResult(resp map[string]any) *UploadFileResult {
+	result := &UploadFileResult{Status: "uploaded"} // placeholder status until the response supplies one
+	data, _ := resp["data"].(map[string]any)
+	bizData, _ := data["biz_data"].(map[string]any)
+	searchMaps := []map[string]any{resp, data, bizData}
+	for _, parent := range []map[string]any{resp, data, bizData} {
+		if parent == nil {
+			continue
+		}
+		for _, key := range []string{"file", "biz_data", "data"} {
+			if nested, ok := parent[key].(map[string]any); ok {
+				searchMaps = append(searchMaps, nested)
+			}
+		}
+	}
+	for _, m := range searchMaps {
+		if m == nil {
+			continue
+		}
+		if result.ID == "" {
+			result.ID = firstNonEmptyString(m, "id", "file_id")
+		}
+		if result.Filename == "" {
+			result.Filename = firstNonEmptyString(m, "name", "filename", "file_name")
+		}
+		if result.Status == "uploaded" {
+			if status := firstNonEmptyString(m, "status", "file_status"); status != "" {
+				result.Status = status
+			}
+		}
+		if !result.IsImage {
+			result.IsImage = firstBool(m, "is_image", "isImage")
+		}
+		if result.Purpose == "" {
+			result.Purpose = firstNonEmptyString(m, "purpose")
+		}
+		if result.AccountID == "" {
+			result.AccountID = firstNonEmptyString(m, "account_id", "accountId", "owner_account_id", "ownerAccountId")
+		}
+		if result.Bytes == 0 {
+			result.Bytes = firstPositiveInt64(m, "bytes", "size", "file_size")
+		}
+	}
+	return result
+}
+
+func firstBool(m map[string]any, keys ...string) bool {
+	for _, key := range keys {
+		switch v := m[key].(type) {
+		case bool:
+			return v
+		case string:
+			switch strings.ToLower(strings.TrimSpace(v)) {
+			case "true", "1", "yes", "y":
+				return true
+			}
+		}
+	}
+	return false
+}
+
+func firstNonEmptyString(m map[string]any, keys ...string) string {
+	for _, key := range keys {
+		if v, _ := m[key].(string); strings.TrimSpace(v) != "" {
+			return strings.TrimSpace(v)
+		}
+	}
+	return ""
+}
+
+func firstPositiveInt64(m map[string]any, keys ...string) int64 {
+	for _, key := range keys {
+		if v := toInt64(m[key], 0); v > 0 {
+			return v
+		}
+	}
+	return 0
+}
--- a/internal/deepseek/client_upload_test.go
+++ b/internal/deepseek/client_upload_test.go
@@ -0,0 +1,216 @@
+package deepseek
+
+import (
+	"context"
+	"encoding/base64"
+	"encoding/hex"
+	"encoding/json"
+	"io"
+	"net/http"
+	"strings"
+	"testing"
+	"time"
+
+	"ds2api/internal/auth"
+	powpkg "ds2api/pow"
+)
+
+func TestBuildUploadMultipartBodyOmitsPurposeAndIncludesFilePart(t *testing.T) {
+	body, contentType, err := buildUploadMultipartBody(`../demo.txt`, "text/plain", []byte("hello"))
+	if err != nil {
+		t.Fatalf("buildUploadMultipartBody error: %v", err)
+	}
+	if !strings.HasPrefix(contentType, "multipart/form-data; boundary=") {
+		t.Fatalf("unexpected content type: %q", contentType)
+	}
+	payload := string(body)
+	if strings.Contains(payload, `name="purpose"`) || strings.Contains(payload, "assistants") {
+		t.Fatalf("expected purpose to be omitted from payload: %q", payload)
+	}
+	if !strings.Contains(payload, `name="file"; filename="demo.txt"`) {
+		t.Fatalf("expected sanitized filename in payload: %q", payload)
+	}
+	if !strings.Contains(payload, "Content-Type: text/plain") {
+		t.Fatalf("expected file content type in payload: %q", payload)
+	}
+	if !strings.Contains(payload, "hello") {
+		t.Fatalf("expected file content in payload: %q", payload)
+	}
+}
+
+func TestExtractUploadFileResultSupportsNestedShapes(t *testing.T) {
+	got := extractUploadFileResult(map[string]any{
+		"data": map[string]any{
+			"biz_data": map[string]any{
+				"file": map[string]any{
+					"file_id":   "file_123",
+					"file_name": "report.pdf",
+					"file_size": 99,
+					"status":    "processed",
+					"purpose":   "assistants",
+					"is_image":  true,
+				},
+			},
+		},
+	})
+	if got.ID != "file_123" {
+		t.Fatalf("expected id file_123, got %#v", got)
+	}
+	if got.Filename != "report.pdf" {
+		t.Fatalf("expected filename report.pdf, got %#v", got)
+	}
+	if got.Bytes != 99 {
+		t.Fatalf("expected bytes 99, got %#v", got)
+	}
+	if got.Status != "processed" {
+		t.Fatalf("expected status processed, got %#v", got)
+	}
+	if got.Purpose != "assistants" {
+		t.Fatalf("expected purpose assistants, got %#v", got)
+	}
+	if !got.IsImage {
+		t.Fatalf("expected image flag true, got %#v", got)
+	}
+}
+
+func TestUploadFileUsesUploadTargetPowAndMultipartHeaders(t *testing.T) {
+	challengeHash := powpkg.DeepSeekHashV1([]byte(powpkg.BuildPrefix("salt", 1712345678) + "42"))
+	powResponse := `{"code":0,"msg":"ok","data":{"biz_code":0,"biz_data":{"challenge":{"algorithm":"DeepSeekHashV1","challenge":"` + hex.EncodeToString(challengeHash[:]) + `","salt":"salt","expire_at":1712345678,"difficulty":1000,"signature":"sig","target_path":"` + DeepSeekUploadTargetPath + `"}}}}`
+	uploadResponse := `{"code":0,"msg":"ok","data":{"biz_code":0,"biz_data":{"file":{"file_id":"file_789","filename":"demo.txt","bytes":5,"status":"processed","purpose":"assistants","is_image":false}}}}`
+	var seenPow string
+	var seenTargetPath string
+	var seenContentType string
+	var seenFileSize string
+	var seenBody string
+	call := 0
+	client := &Client{
+		regular: doerFunc(func(req *http.Request) (*http.Response, error) {
+			call++
+			bodyBytes, _ := io.ReadAll(req.Body)
+			switch call {
+			case 1:
+				seenTargetPath = string(bodyBytes)
+				return &http.Response{StatusCode: http.StatusOK, Header: make(http.Header), Body: io.NopCloser(strings.NewReader(powResponse)), Request: req}, nil
+			case 2:
+				seenPow = req.Header.Get("x-ds-pow-response")
+				seenContentType = req.Header.Get("Content-Type")
+				seenFileSize = req.Header.Get("x-file-size")
+				seenBody = string(bodyBytes)
+				return &http.Response{StatusCode: http.StatusOK, Header: make(http.Header), Body: io.NopCloser(strings.NewReader(uploadResponse)), Request: req}, nil
+			default:
+				t.Fatalf("unexpected request count %d", call)
+				return nil, nil
+			}
+		}),
+		fallback: &http.Client{Transport: roundTripperFunc(func(req *http.Request) (*http.Response, error) {
+			return nil, nil
+		})},
+		maxRetries: 1,
+	}
+	result, err := client.UploadFile(context.Background(), &auth.RequestAuth{DeepSeekToken: "token", TriedAccounts: map[string]bool{}}, UploadFileRequest{
+		Filename:    "demo.txt",
+		ContentType: "text/plain",
+		Purpose:     "assistants",
+		Data:        []byte("hello"),
+	}, 1)
+	if err != nil {
+		t.Fatalf("UploadFile error: %v", err)
+	}
+	if result.ID != "file_789" {
+		t.Fatalf("expected uploaded file id file_789, got %#v", result)
+	}
+	if !strings.Contains(seenTargetPath, `"target_path":"`+DeepSeekUploadTargetPath+`"`) {
+		t.Fatalf("expected upload target_path in pow request, got %q", seenTargetPath)
+	}
+	if strings.TrimSpace(seenPow) == "" {
+		t.Fatal("expected x-ds-pow-response header")
+	}
+	rawPow, err := base64.StdEncoding.DecodeString(seenPow)
+	if err != nil {
+		t.Fatalf("decode pow header failed: %v", err)
+	}
+	var powHeader map[string]any
+	if err := json.Unmarshal(rawPow, &powHeader); err != nil {
+		t.Fatalf("unmarshal pow header failed: %v", err)
+	}
+	if powHeader["target_path"] != DeepSeekUploadTargetPath {
+		t.Fatalf("expected pow target_path %q, got %#v", DeepSeekUploadTargetPath, powHeader["target_path"])
+	}
+	if seenFileSize != "5" {
+		t.Fatalf("expected x-file-size=5, got %q", seenFileSize)
+	}
+	if !strings.HasPrefix(seenContentType, "multipart/form-data; boundary=") {
+		t.Fatalf("expected multipart content type, got %q", seenContentType)
+	}
+	if !strings.Contains(seenBody, `name="file"; filename="demo.txt"`) {
+		t.Fatalf("expected file part in upload body: %q", seenBody)
+	}
+}
+
+func TestUploadFileWaitsForProcessedFetchFiles(t *testing.T) {
+	oldSleep := fileReadySleep
+	fileReadySleep = func(time.Duration) {}
+	defer func() { fileReadySleep = oldSleep }()
+
+	challengeHash := powpkg.DeepSeekHashV1([]byte(powpkg.BuildPrefix("salt", 1712345678) + "42"))
+	powResponse := `{"code":0,"msg":"ok","data":{"biz_code":0,"biz_data":{"challenge":{"algorithm":"DeepSeekHashV1","challenge":"` + hex.EncodeToString(challengeHash[:]) + `","salt":"salt","expire_at":1712345678,"difficulty":1000,"signature":"sig","target_path":"` + DeepSeekUploadTargetPath + `"}}}}`
+	uploadResponse := `{"code":0,"msg":"ok","data":{"biz_code":0,"biz_data":{"file":{"file_id":"file_789","filename":"demo.txt","bytes":5,"status":"PENDING","purpose":"assistants","is_image":false}}}}`
+	pendingFetchResponse := `{"code":0,"msg":"ok","data":{"biz_code":0,"biz_data":{"files":[{"file_id":"file_789","filename":"demo.txt","bytes":5,"status":"PENDING","purpose":"assistants","is_image":false}]}}}`
+	processedFetchResponse := `{"code":0,"msg":"ok","data":{"biz_code":0,"biz_data":{"files":[{"file_id":"file_789","filename":"demo.txt","bytes":5,"status":"processed","purpose":"assistants","is_image":true}]}}}`
+
+	var call int
+	client := &Client{
+		regular: doerFunc(func(req *http.Request) (*http.Response, error) {
+			call++
+			switch call {
+			case 1:
+				bodyBytes, _ := io.ReadAll(req.Body)
+				if !strings.Contains(string(bodyBytes), `"target_path":"`+DeepSeekUploadTargetPath+`"`) {
+					t.Fatalf("expected pow target path request, got %s", string(bodyBytes))
+				}
+				return &http.Response{StatusCode: http.StatusOK, Header: make(http.Header), Body: io.NopCloser(strings.NewReader(powResponse)), Request: req}, nil
+			case 2:
+				return &http.Response{StatusCode: http.StatusOK, Header: make(http.Header), Body: io.NopCloser(strings.NewReader(uploadResponse)), Request: req}, nil
+			case 3, 4:
+				if req.Method != http.MethodGet {
+					t.Fatalf("expected GET fetch request, got %s", req.Method)
+				}
+				if req.URL.Path != "/api/v0/file/fetch_files" {
+					t.Fatalf("expected fetch files path /api/v0/file/fetch_files, got %q", req.URL.Path)
+				}
+				if got := req.URL.Query().Get("file_ids"); got != "file_789" {
+					t.Fatalf("expected file_ids=file_789, got %q", got)
+				}
+				respBody := pendingFetchResponse
+				if call == 4 {
+					respBody = processedFetchResponse
+				}
+				return &http.Response{StatusCode: http.StatusOK, Header: make(http.Header), Body: io.NopCloser(strings.NewReader(respBody)), Request: req}, nil
+			default:
+				t.Fatalf("unexpected request count %d", call)
+				return nil, nil
+			}
+		}),
+		fallback:   &http.Client{Transport: roundTripperFunc(func(req *http.Request) (*http.Response, error) { return nil, nil })},
+		maxRetries: 1,
+	}
+
+	result, err := client.UploadFile(context.Background(), &auth.RequestAuth{DeepSeekToken: "token", TriedAccounts: map[string]bool{}}, UploadFileRequest{
+		Filename:    "demo.txt",
+		ContentType: "text/plain",
+		Purpose:     "assistants",
+		Data:        []byte("hello"),
+	}, 1)
+	if err != nil {
+		t.Fatalf("UploadFile error: %v", err)
+	}
+	if result.ID != "file_789" {
+		t.Fatalf("expected uploaded file id file_789, got %#v", result)
+	}
+	if result.Status != "processed" {
+		t.Fatalf("expected final status processed, got %#v", result.Status)
+	}
+	if call != 4 {
+		t.Fatalf("expected 4 requests, got %d", call)
+	}
+}
--- a/internal/deepseek/constants.go
+++ b/internal/deepseek/constants.go
@@ -12,9 +12,13 @@ const (
 	DeepSeekCreatePowURL         = "https://chat.deepseek.com/api/v0/chat/create_pow_challenge"
 	DeepSeekCompletionURL        = "https://chat.deepseek.com/api/v0/chat/completion"
 	DeepSeekContinueURL          = "https://chat.deepseek.com/api/v0/chat/continue"
+	DeepSeekUploadFileURL        = "https://chat.deepseek.com/api/v0/file/upload_file"
+	DeepSeekFetchFilesURL        = "https://chat.deepseek.com/api/v0/file/fetch_files"
 	DeepSeekFetchSessionURL      = "https://chat.deepseek.com/api/v0/chat_session/fetch_page"
 	DeepSeekDeleteSessionURL     = "https://chat.deepseek.com/api/v0/chat_session/delete"
 	DeepSeekDeleteAllSessionsURL = "https://chat.deepseek.com/api/v0/chat_session/delete_all"
+	DeepSeekCompletionTargetPath = "/api/v0/chat/completion"
+	DeepSeekUploadTargetPath     = "/api/v0/file/upload_file"
 )

 var defaultBaseHeaders = map[string]string{
--- a/internal/deepseek/constants_shared.json
+++ b/internal/deepseek/constants_shared.json
@@ -3,7 +3,6 @@
    "Host": "chat.deepseek.com",
    "User-Agent": "DeepSeek/1.8.0 Android/35",
    "Accept": "application/json",
-    "Content-Type": "application/json",
    "x-client-platform": "android",
    "x-client-version": "1.8.0",
    "x-client-locale": "zh_CN",
--- a/internal/deepseek/prompt.go
+++ b/internal/deepseek/prompt.go
@@ -5,3 +5,7 @@ import "ds2api/internal/prompt"
 func MessagesPrepare(messages []map[string]any) string {
 	return prompt.MessagesPrepare(messages)
 }
+
+func MessagesPrepareWithThinking(messages []map[string]any, thinkingEnabled bool) string {
+	return prompt.MessagesPrepareWithThinking(messages, thinkingEnabled)
+}
--- a/internal/format/claude/render_test.go
+++ b/internal/format/claude/render_test.go
@@ -2,32 +2,6 @@ package claude

 import "testing"

-func TestBuildMessageResponseDetectsToolCallsFromThinkingFallback(t *testing.T) {
-	resp := BuildMessageResponse(
-		"msg_1",
-		"claude-sonnet-4-5",
-		[]any{map[string]any{"role": "user", "content": "hi"}},
-		`{"tool_calls":[{"name":"search","input":{"q":"go"}}]}`,
-		"",
-		[]string{"search"},
-	)
-
-	if resp["stop_reason"] != "tool_use" {
-		t.Fatalf("expected stop_reason=tool_use, got=%#v", resp["stop_reason"])
-	}
-	content, _ := resp["content"].([]map[string]any)
-	if len(content) < 2 {
-		t.Fatalf("expected thinking + tool_use content blocks, got=%#v", resp["content"])
-	}
-	last := content[len(content)-1]
-	if last["type"] != "tool_use" {
-		t.Fatalf("expected last content block tool_use, got=%#v", last["type"])
-	}
-	if last["name"] != "search" {
-		t.Fatalf("expected tool name search, got=%#v", last["name"])
-	}
-}
-
 func TestBuildMessageResponseSkipsThinkingFallbackWhenFinalTextExists(t *testing.T) {
 	resp := BuildMessageResponse(
 		"msg_1",
--- a/internal/format/openai/render_test.go
+++ b/internal/format/openai/render_test.go
@@ -1,75 +1,10 @@
 package openai

 import (
-	"encoding/json"
 	"strings"
 	"testing"
 )

-func TestBuildResponseObjectToolCallsFollowChatShape(t *testing.T) {
-	obj := BuildResponseObject(
-		"resp_test",
-		"gpt-4o",
-		"prompt",
-		"",
-		`{"tool_calls":[{"name":"search","input":{"q":"golang"}}]}`,
-		[]string{"search"},
-	)
-
-	outputText, _ := obj["output_text"].(string)
-	if outputText != "" {
-		t.Fatalf("expected output_text to be hidden for tool calls, got %q", outputText)
-	}
-
-	output, _ := obj["output"].([]any)
-	if len(output) != 1 {
-		t.Fatalf("expected function_call output only, got %#v", obj["output"])
-	}
-
-	first, _ := output[0].(map[string]any)
-	if first["type"] != "function_call" {
-		t.Fatalf("expected first output item type function_call, got %#v", first["type"])
-	}
-	if first["call_id"] == "" {
-		t.Fatalf("expected function_call item to have call_id, got %#v", first)
-	}
-	if first["name"] != "search" {
-		t.Fatalf("unexpected function name: %#v", first["name"])
-	}
-	argsRaw, _ := first["arguments"].(string)
-	var args map[string]any
-	if err := json.Unmarshal([]byte(argsRaw), &args); err != nil {
-		t.Fatalf("arguments should be valid json string, got=%q err=%v", argsRaw, err)
-	}
-	if args["q"] != "golang" {
-		t.Fatalf("unexpected arguments: %#v", args)
-	}
-}
-
-func TestBuildResponseObjectPromotesMixedProseToolPayloadToFunctionCall(t *testing.T) {
-	obj := BuildResponseObject(
-		"resp_test",
-		"gpt-4o",
-		"prompt",
-		"",
-		`示例格式：{"tool_calls":[{"name":"search","input":{"q":"golang"}}]}，但这条是普通回答。`,
-		[]string{"search"},
-	)
-
-	outputText, _ := obj["output_text"].(string)
-	if outputText != "" {
-		t.Fatalf("expected output_text hidden for mixed prose tool payload, got %q", outputText)
-	}
-	output, _ := obj["output"].([]any)
-	if len(output) != 1 {
-		t.Fatalf("expected one function_call output item, got %#v", obj["output"])
-	}
-	first, _ := output[0].(map[string]any)
-	if first["type"] != "function_call" {
-		t.Fatalf("expected function_call output type, got %#v", first["type"])
-	}
-}
-
 func TestBuildResponseObjectKeepsFencedToolPayloadAsText(t *testing.T) {
 	obj := BuildResponseObject(
 		"resp_test",
--- a/internal/js/chat-stream/sse_parse_impl.js
+++ b/internal/js/chat-stream/sse_parse_impl.js
@@ -7,6 +7,53 @@ const {
  SKIP_EXACT_PATHS,
 } = require('../shared/deepseek-constants');

+
+
+function stripThinkTags(text) {
+  if (typeof text !== 'string' || !text) {
+    return text;
+  }
+  return text.replace(/<\/?\s*think\s*>/gi, '');
+}
+
+function splitThinkingParts(parts) {
+  const out = [];
+  let thinkingDone = false;
+  for (const p of parts) {
+    if (!p) continue;
+    if (thinkingDone && p.type === 'thinking') {
+      const cleaned = stripThinkTags(p.text);
+      if (cleaned) {
+        out.push({ text: cleaned, type: 'text' });
+      }
+      continue;
+    }
+    if (p.type !== 'thinking') {
+      const cleaned = stripThinkTags(p.text);
+      if (cleaned) {
+        out.push({ text: cleaned, type: p.type });
+      }
+      continue;
+    }
+    const match = /<\/\s*think\s*>/i.exec(p.text);
+    if (!match) {
+      out.push(p);
+      continue;
+    }
+    thinkingDone = true;
+    const before = p.text.substring(0, match.index);
+    let after = p.text.substring(match.index + match[0].length);
+    if (before) {
+      out.push({ text: before, type: 'thinking' });
+    }
+    after = stripThinkTags(after);
+    if (after) {
+      out.push({ text: after, type: 'text' });
+    }
+  }
+  return { parts: out, transitioned: thinkingDone };
+}
+
 function parseChunkForContent(chunk, thinkingEnabled, currentType, stripReferenceMarkers = true) {
  if (!chunk || typeof chunk !== 'object') {
    return {
@@ -147,7 +194,11 @@ function parseChunkForContent(chunk, thinkingEnabled, currentType, stripReferenc

  let partType = 'text';
  if (pathValue === 'response/thinking_content') {
-    partType = 'thinking';
+    if (newType === 'text') {
+      partType = 'text';
+    } else {
+      partType = 'thinking';
+    }
  } else if (pathValue === 'response/content') {
    partType = 'text';
  } else if (pathValue.includes('response/fragments') && pathValue.includes('/content')) {
@@ -186,9 +237,16 @@ function parseChunkForContent(chunk, thinkingEnabled, currentType, stripReferenc
    if (content) {
      parts.push({ text: content, type: partType });
    }
+
+    const resolvedParts = filterLeakedContentFilterParts(parts);
+    const splitResult = splitThinkingParts(resolvedParts);
+    if (splitResult.transitioned) {
+      newType = 'text';
+    }
+
    return {
      parsed: true,
-      parts: filterLeakedContentFilterParts(parts),
+      parts: splitResult.parts,
      finished: false,
      contentFilter: false,
      errorMessage: '',
@@ -213,9 +271,16 @@ function parseChunkForContent(chunk, thinkingEnabled, currentType, stripReferenc
      };
    }
    parts.push(...extracted.parts);
+
+    const resolvedParts = filterLeakedContentFilterParts(parts);
+    const splitResult = splitThinkingParts(resolvedParts);
+    if (splitResult.transitioned) {
+      newType = 'text';
+    }
+
    return {
      parsed: true,
-      parts: filterLeakedContentFilterParts(parts),
+      parts: splitResult.parts,
      finished: false,
      contentFilter: false,
      errorMessage: '',
@@ -249,9 +314,16 @@ function parseChunkForContent(chunk, thinkingEnabled, currentType, stripReferenc
      }
    }
  }
+
+  const resolvedParts = filterLeakedContentFilterParts(parts);
+  const splitResult = splitThinkingParts(resolvedParts);
+  if (splitResult.transitioned) {
+    newType = 'text';
+  }
+
  return {
    parsed: true,
-    parts: filterLeakedContentFilterParts(parts),
+    parts: splitResult.parts,
    finished: false,
    contentFilter: false,
    errorMessage: '',
@@ -546,4 +618,5 @@ module.exports = {
  isFragmentStatusPath,
  isCitation,
  stripReferenceMarkers: stripReferenceMarkersText,
+  stripThinkTags,
 };
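
Reviewer note: the `splitThinkingParts` helper added above turns everything after the first closing `</think>` tag into ordinary text. The sketch below transliterates that rule into Go for readers comparing the two runtimes. It is an illustration only: the `part` type and the function are hypothetical and are not the project's Go-side implementation, which this hunk does not show.

```go
package main

import (
	"fmt"
	"regexp"
)

// part is a hypothetical counterpart of the {text, type} objects in the JS parser.
type part struct {
	Text string
	Type string // "thinking" or "text"
}

var (
	anyThinkTag   = regexp.MustCompile(`(?i)</?\s*think\s*>`)
	closeThinkTag = regexp.MustCompile(`(?i)</\s*think\s*>`)
)

// splitThinkingParts keeps thinking parts unchanged until the first closing
// think tag. The text before the tag stays "thinking"; everything after it,
// including later parts labelled "thinking", is emitted as "text" with any
// stray think tags removed. The bool reports whether the transition happened.
func splitThinkingParts(parts []part) ([]part, bool) {
	out := make([]part, 0, len(parts))
	done := false
	for _, p := range parts {
		if done || p.Type != "thinking" {
			typ := p.Type
			if typ == "thinking" {
				typ = "text" // thinking has already ended
			}
			if cleaned := anyThinkTag.ReplaceAllString(p.Text, ""); cleaned != "" {
				out = append(out, part{Text: cleaned, Type: typ})
			}
			continue
		}
		loc := closeThinkTag.FindStringIndex(p.Text)
		if loc == nil {
			out = append(out, p)
			continue
		}
		done = true
		if before := p.Text[:loc[0]]; before != "" {
			out = append(out, part{Text: before, Type: "thinking"})
		}
		if after := anyThinkTag.ReplaceAllString(p.Text[loc[1]:], ""); after != "" {
			out = append(out, part{Text: after, Type: "text"})
		}
	}
	return out, done
}

func main() {
	out, done := splitThinkingParts([]part{
		{Text: "reason</think>answer", Type: "thinking"},
		{Text: " more", Type: "thinking"},
	})
	fmt.Printf("%+v %v\n", out, done)
}
```

As in the JS version, text before the closing tag is passed through without stripping, so an opening `<think>` tag in that segment would survive.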
--- a/internal/js/helpers/stream-tool-sieve/parse.js
+++ b/internal/js/helpers/stream-tool-sieve/parse.js
@@ -4,15 +4,10 @@ const {
  toStringSafe,
 } = require('./state');
 const {
-  buildToolCallCandidates,
-  parseToolCallsPayload,
  parseMarkupToolCalls,
-  parseTextKVToolCalls,
  stripFencedCodeBlocks,
 } = require('./parse_payload');
-const { TOOL_SEGMENT_KEYWORDS } = require('./tool-keywords');

-const TOOL_NAME_LOOSE_PATTERN = /[^a-z0-9]+/g;
 const TOOL_MARKUP_PREFIXES = ['<tool_call', '<function_call', '<invoke'];

 function extractToolNames(tools) {
@@ -51,47 +46,12 @@ function parseToolCallsDetailed(text, toolNames) {
    return result;
  }

-  const candidates = buildToolCallCandidates(normalized);
-  for (const c of candidates) {
-    if (!isLikelyJSONToolPayloadCandidate(c)) {
-      continue;
-    }
-    const jsonParsed = parseToolCallsPayload(c);
-    if (jsonParsed.length === 0) {
-      continue;
-    }
-    result.sawToolCallSyntax = true;
-    const filteredJSON = filterToolCallsDetailed(jsonParsed, toolNames);
-    result.calls = filteredJSON.calls;
-    result.rejectedToolNames = filteredJSON.rejectedToolNames;
-    result.rejectedByPolicy = filteredJSON.rejectedToolNames.length > 0 && filteredJSON.calls.length === 0;
+  // XML markup parsing only.
+  const parsed = parseMarkupToolCalls(normalized);
+  if (parsed.length === 0) {
    return result;
  }
-  let parsed = [];
-  for (const c of candidates) {
-    parsed = parseMarkupToolCalls(c);
-    if (parsed.length === 0) {
-      parsed = parseToolCallsPayload(c);
-    }
-    if (parsed.length === 0) {
-      parsed = parseTextKVToolCalls(c);
-    }
-    if (parsed.length > 0) {
-      result.sawToolCallSyntax = true;
-      break;
-    }
-  }
-  if (parsed.length === 0) {
-    parsed = parseMarkupToolCalls(normalized);
-    if (parsed.length === 0) {
-      parsed = parseTextKVToolCalls(normalized);
-      if (parsed.length === 0) {
-        return result;
-      }
-    }
-    result.sawToolCallSyntax = true;
-  }
-
+  result.sawToolCallSyntax = true;
  const filtered = filterToolCallsDetailed(parsed, toolNames);
  result.calls = filtered.calls;
  result.rejectedToolNames = filtered.rejectedToolNames;
@@ -113,43 +73,11 @@ function parseStandaloneToolCallsDetailed(text, toolNames) {
  if (shouldSkipToolCallParsingForCodeFenceExample(trimmed)) {
    return result;
  }
-  const candidates = buildToolCallCandidates(trimmed);
-  let parsed = [];
-  for (const c of candidates) {
-    if (!isLikelyJSONToolPayloadCandidate(c)) {
-      continue;
-    }
-    parsed = parseToolCallsPayload(c);
-    if (parsed.length === 0) {
-      continue;
-    }
-    result.sawToolCallSyntax = true;
-    const filteredJSON = filterToolCallsDetailed(parsed, toolNames);
-    result.calls = filteredJSON.calls;
-    result.rejectedToolNames = filteredJSON.rejectedToolNames;
-    result.rejectedByPolicy = filteredJSON.rejectedToolNames.length > 0 && filteredJSON.calls.length === 0;
-    return result;
-  }
-  for (const c of candidates) {
-    parsed = parseMarkupToolCalls(c);
-    if (parsed.length === 0) {
-      parsed = parseToolCallsPayload(c);
-    }
-    if (parsed.length === 0) {
-      parsed = parseTextKVToolCalls(c);
-    }
-    if (parsed.length > 0) {
-      break;
-    }
-  }
+
+  // XML markup parsing only.
+  const parsed = parseMarkupToolCalls(trimmed);
  if (parsed.length === 0) {
-    parsed = parseMarkupToolCalls(trimmed);
-    if (parsed.length === 0) {
-      parsed = parseTextKVToolCalls(trimmed);
-      if (parsed.length === 0) {
-        return result;
-      }
-    }
+    return result;
  }

  result.sawToolCallSyntax = true;
@@ -183,41 +111,9 @@ function filterToolCallsDetailed(parsed, toolNames) {
  return { calls, rejectedToolNames: [] };
 }

-function resolveAllowedToolName(name, allowed, allowedCanonical) {
-  const normalizedName = toStringSafe(name).trim();
-  if (!normalizedName) {
-    return '';
-  }
-  if (allowed.has(normalizedName)) {
-    return normalizedName;
-  }
-  const lower = normalizedName.toLowerCase();
-  if (allowedCanonical.has(lower)) {
-    return allowedCanonical.get(lower);
-  }
-  const idx = lower.lastIndexOf('.');
-  if (idx >= 0 && idx < lower.length - 1) {
-    const tail = lower.slice(idx + 1);
-    if (allowedCanonical.has(tail)) {
-      return allowedCanonical.get(tail);
-    }
-  }
-  const loose = lower.replace(TOOL_NAME_LOOSE_PATTERN, '');
-  if (!loose) {
-    return '';
-  }
-  for (const [candidateLower, canonical] of allowedCanonical.entries()) {
-    if (candidateLower.replace(TOOL_NAME_LOOSE_PATTERN, '') === loose) {
-      return canonical;
-    }
-  }
-  return '';
-}
-
 function looksLikeToolCallSyntax(text) {
  const lower = toStringSafe(text).toLowerCase();
-  return TOOL_SEGMENT_KEYWORDS.some((kw) => lower.includes(kw))
-    || TOOL_MARKUP_PREFIXES.some((prefix) => lower.includes(prefix));
+  return TOOL_MARKUP_PREFIXES.some((prefix) => lower.includes(prefix));
 }

 function shouldSkipToolCallParsingForCodeFenceExample(text) {
@@ -228,21 +124,6 @@ function shouldSkipToolCallParsingForCodeFenceExample(text) {
  return !looksLikeToolCallSyntax(stripped);
 }

-function isLikelyJSONToolPayloadCandidate(text) {
-  const trimmed = toStringSafe(text).trim();
-  if (!trimmed) {
-    return false;
-  }
-  if (!(trimmed.startsWith('{') || trimmed.startsWith('['))) {
-    return false;
-  }
-  const lower = trimmed.toLowerCase();
-  return lower.includes('tool_calls')
-    || lower.includes('"function"')
-    || lower.includes('functioncall')
-    || lower.includes('"tool_use"');
-}
-
 module.exports = {
  extractToolNames,
  parseToolCalls,
--- a/internal/js/helpers/stream-tool-sieve/parse_payload.js
+++ b/internal/js/helpers/stream-tool-sieve/parse_payload.js
@@ -1,6 +1,5 @@
 'use strict';

-const TOOL_CALL_PATTERN = /\{\s*["']tool_calls["']\s*:\s*\[(.*?)\]\s*\}/s;
 const TOOL_CALL_MARKUP_BLOCK_PATTERN = /<(?:[a-z0-9_:-]+:)?(tool_call|function_call|invoke)\b([^>]*)>([\s\S]*?)<\/(?:[a-z0-9_:-]+:)?\1>/gi;
 const TOOL_CALL_MARKUP_SELFCLOSE_PATTERN = /<(?:[a-z0-9_:-]+:)?invoke\b([^>]*)\/>/gi;
 const TOOL_CALL_MARKUP_KV_PATTERN = /<(?:[a-z0-9_:-]+:)?([a-z0-9_.-]+)\b[^>]*>([\s\S]*?)<\/(?:[a-z0-9_:-]+:)?\1>/gi;
@@ -20,14 +19,12 @@ const TOOL_CALL_MARKUP_ARGS_PATTERNS = [
  /<(?:[a-z0-9_:-]+:)?args\b[^>]*>([\s\S]*?)<\/(?:[a-z0-9_:-]+:)?args>/i,
  /<(?:[a-z0-9_:-]+:)?params\b[^>]*>([\s\S]*?)<\/(?:[a-z0-9_:-]+:)?params>/i,
 ];
-const TEXT_KV_NAME_PATTERN = /function\.name:\s*([a-zA-Z0-9_.-]+)/gi;
+const CDATA_PATTERN = /^<!\[CDATA\[([\s\S]*?)]]>$/i;
+const HTML_ENTITIES_PATTERN = /&[a-z0-9#]+;/gi;

 const {
  toStringSafe,
 } = require('./state');
-const {
-  extractJSONObjectFrom,
-} = require('./jsonscan');

 function stripFencedCodeBlocks(text) {
  const t = typeof text === 'string' ? text : '';
@@ -37,138 +34,6 @@ function stripFencedCodeBlocks(text) {
  return t.replace(/```[\s\S]*?```/g, ' ');
 }

-function buildToolCallCandidates(text) {
-  const trimmed = toStringSafe(text);
-  const candidates = [trimmed];
-
-  const fenced = trimmed.match(/```(?:json)?\s*([\s\S]*?)\s*```/gi) || [];
-  for (const block of fenced) {
-    const m = block.match(/```(?:json)?\s*([\s\S]*?)\s*```/i);
-    if (m && m[1]) {
-      candidates.push(toStringSafe(m[1]));
-    }
-  }
-
-  for (const candidate of extractToolCallObjects(trimmed)) {
-    candidates.push(toStringSafe(candidate));
-  }
-
-  const first = trimmed.indexOf('{');
-  const last = trimmed.lastIndexOf('}');
-  if (first >= 0 && last > first) {
-    candidates.push(toStringSafe(trimmed.slice(first, last + 1)));
-  }
-  const firstArr = trimmed.indexOf('[');
-  const lastArr = trimmed.lastIndexOf(']');
-  if (firstArr >= 0 && lastArr > firstArr) {
-    candidates.push(toStringSafe(trimmed.slice(firstArr, lastArr + 1)));
-  }
-
-  const m = trimmed.match(TOOL_CALL_PATTERN);
-  if (m && m[1]) {
-    candidates.push(`{"tool_calls":[${m[1]}]}`);
-  }
-
-  return [...new Set(candidates.filter(Boolean))];
-}
-
-function extractToolCallObjects(text) {
-  const raw = toStringSafe(text);
-  if (!raw) {
-    return [];
-  }
-  const lower = raw.toLowerCase();
-  const out = [];
-  let offset = 0;
-
-  // eslint-disable-next-line no-constant-condition
-  while (true) {
-    const idxToolCalls = lower.indexOf('tool_calls', offset);
-    const idxFunction = lower.indexOf('"function"', offset);
-    const idxFunctionCall = lower.indexOf('functioncall', offset);
-    const idxToolUse = lower.indexOf('"tool_use"', offset);
-    let idx = -1;
-    let matched = '';
-    if (idxToolCalls >= 0 && (idxFunction < 0 || idxToolCalls <= idxFunction)) {
-      idx = idxToolCalls;
-      matched = 'tool_calls';
-    } else if (idxFunction >= 0) {
-      idx = idxFunction;
-      matched = '"function"';
-    }
-    if (idxFunctionCall >= 0 && (idx < 0 || idxFunctionCall < idx)) {
-      idx = idxFunctionCall;
-      matched = 'functioncall';
-    }
-    if (idxToolUse >= 0 && (idx < 0 || idxToolUse < idx)) {
-      idx = idxToolUse;
-      matched = '"tool_use"';
-    }
-    if (idx < 0) {
-      break;
-    }
-    let start = raw.slice(0, idx).lastIndexOf('{');
-    while (start >= 0) {
-      const obj = extractJSONObjectFrom(raw, start);
-      if (obj.ok) {
-        out.push(raw.slice(start, obj.end).trim());
-        // Ensure forward progress even when the matched keyword is outside
-        // the extracted JSON object (e.g. closing XML wrapper tags containing
-        // "tool_calls" after an earlier JSON arguments object).
-        offset = Math.max(obj.end, idx + matched.length);
-        idx = -1;
-        break;
-      }
-      start = raw.slice(0, start).lastIndexOf('{');
-    }
-    if (idx >= 0) {
-      offset = idx + matched.length;
-    }
-  }
-
-  return out;
-}
-
-function parseToolCallsPayload(payload) {
-  let decoded;
-  try {
-    decoded = JSON.parse(payload);
-  } catch (_err) {
-    return [];
-  }
-
-  if (Array.isArray(decoded)) {
-    return parseToolCallList(decoded);
-  }
-  if (!decoded || typeof decoded !== 'object') {
-    return [];
-  }
-  if (decoded.tool_calls) {
-    if (isLikelyChatMessageEnvelope(decoded)) {
-      return [];
-    }
-    return parseToolCallList(decoded.tool_calls);
-  }
-
-  const one = parseToolCallItem(decoded);
-  return one ? [one] : [];
-}
-
-function isLikelyChatMessageEnvelope(value) {
-  if (!value || typeof value !== 'object' || Array.isArray(value)) {
-    return false;
-  }
-  if (!Object.prototype.hasOwnProperty.call(value, 'tool_calls')) {
-    return false;
-  }
-  const role = toStringSafe(value.role).trim().toLowerCase();
-  if (role === 'assistant' || role === 'tool' || role === 'user' || role === 'system') {
-    return true;
-  }
-  return Object.prototype.hasOwnProperty.call(value, 'tool_call_id')
-    || Object.prototype.hasOwnProperty.call(value, 'content');
-}
-
 function parseMarkupToolCalls(text) {
  const raw = toStringSafe(text).trim();
  if (!raw) {
@@ -190,51 +55,20 @@ function parseMarkupToolCalls(text) {
  return out;
 }

-function parseTextKVToolCalls(text) {
-  const raw = toStringSafe(text);
-  if (!raw) {
-    return [];
-  }
-  const out = [];
-  const matches = [...raw.matchAll(TEXT_KV_NAME_PATTERN)];
-  if (matches.length === 0) {
-    return out;
-  }
-  for (let i = 0; i < matches.length; i += 1) {
-    const match = matches[i];
-    const name = toStringSafe(match[1]).trim();
-    if (!name) {
-      continue;
-    }
-    const nameEnd = match.index + toStringSafe(match[0]).length;
-    const searchEnd = i + 1 < matches.length ? matches[i + 1].index : raw.length;
-    const searchArea = raw.slice(nameEnd, searchEnd);
-    const argIdx = searchArea.indexOf('function.arguments:');
-    if (argIdx < 0) {
-      continue;
-    }
-    const argStart = nameEnd + argIdx + 'function.arguments:'.length;
-    const bracePos = raw.slice(argStart, searchEnd).indexOf('{');
-    if (bracePos < 0) {
-      continue;
-    }
-    const objStart = argStart + bracePos;
-    const obj = extractJSONObjectFrom(raw, objStart);
-    if (!obj.ok) {
-      continue;
-    }
-    out.push({
-      name,
-      input: parseToolCallInput(raw.slice(objStart, obj.end)),
-    });
-  }
-  return out;
-}
-
 function parseMarkupSingleToolCall(attrs, inner) {
-  const embedded = parseToolCallsPayload(inner);
-  if (embedded.length > 0) {
-    return embedded[0];
+  // Try inline JSON parse for the inner content.
+  if (inner) {
+    try {
+      const decoded = JSON.parse(inner);
+      if (decoded && typeof decoded === 'object' && !Array.isArray(decoded) && decoded.name) {
+        return {
+          name: toStringSafe(decoded.name),
+          input: decoded.input && typeof decoded.input === 'object' && !Array.isArray(decoded.input) ? decoded.input : {},
+        };
+      }
+    } catch (_err) {
+      // Not JSON, continue with markup parsing.
+    }
  }
  let name = '';
  const attrMatch = attrs.match(TOOL_CALL_MARKUP_ATTR_PATTERN);
@@ -242,7 +76,7 @@ function parseMarkupSingleToolCall(attrs, inner) {
    name = toStringSafe(attrMatch[2]).trim();
  }
  if (!name) {
-    name = stripTagText(findMarkupTagValue(inner, TOOL_CALL_MARKUP_NAME_PATTERNS));
+    name = extractRawTagValue(findMarkupTagValue(inner, TOOL_CALL_MARKUP_NAME_PATTERNS));
  }
  if (!name) {
    return null;
@@ -266,15 +100,21 @@ function parseMarkupInput(raw) {
  if (!s) {
    return {};
  }
-  const parsed = parseToolCallInput(s);
-  if (parsed && typeof parsed === 'object' && !Array.isArray(parsed) && Object.keys(parsed).length > 0) {
-    return parsed;
-  }
+  // Prioritize XML-style KV tags (e.g., <arg>val</arg>)
  const kv = parseMarkupKVObject(s);
  if (Object.keys(kv).length > 0) {
    return kv;
  }
-  return { _raw: stripTagText(s) };
+
+  // Fallback to JSON parsing
+  const parsed = parseToolCallInput(s);
+  if (parsed && typeof parsed === 'object' && !Array.isArray(parsed)) {
+    if (Object.keys(parsed).length > 0) {
+      return parsed;
+    }
+  }
+
+  return { _raw: extractRawTagValue(s) };
 }

 function parseMarkupKVObject(text) {
@@ -288,19 +128,65 @@ function parseMarkupKVObject(text) {
    if (!key) {
      continue;
    }
-    const valueRaw = stripTagText(m[2]);
-    if (!valueRaw) {
+    const value = parseMarkupValue(m[2]);
+    if (value === undefined || value === null) {
      continue;
    }
-    try {
-      out[key] = JSON.parse(valueRaw);
-    } catch (_err) {
-      out[key] = valueRaw;
-    }
+    appendMarkupValue(out, key, value);
  }
  return out;
 }

+function parseMarkupValue(raw) {
+  const s = toStringSafe(extractRawTagValue(raw)).trim();
+  if (!s) {
+    return '';
+  }
+
+  if (s.includes('<') && s.includes('>')) {
+    const nested = parseMarkupInput(s);
+    if (nested && typeof nested === 'object' && !Array.isArray(nested)) {
+      if (isOnlyRawValue(nested)) {
+        return toStringSafe(nested._raw);
+      }
+      return nested;
+    }
+  }
+
+  try {
+    return JSON.parse(s);
+  } catch (_err) {
+    return s;
+  }
+}
+
+function extractRawTagValue(inner) {
+  const s = toStringSafe(inner).trim();
+  if (!s) {
+    return '';
+  }
+
+  // 1. Check for CDATA
+  const cdataMatch = s.match(CDATA_PATTERN);
+  if (cdataMatch && cdataMatch[1] !== undefined) {
+    return cdataMatch[1];
+  }
+
+  // 2. Fallback to unescaping standard HTML entities
+  // Note: we avoid broad tag stripping here to preserve user content (like < symbols in code)
+  return unescapeHtml(toStringSafe(inner));
+}
+
+function unescapeHtml(safe) {
+  if (!safe) return '';
+  return safe.replace(/&lt;/g, '<')
+    .replace(/&gt;/g, '>')
+    .replace(/&quot;/g, '"')
+    .replace(/&#039;/g, "'")
+    .replace(/&#x27;/g, "'")
+    .replace(/&amp;/g, '&');
+}
+
 function stripTagText(text) {
  return toStringSafe(text).replace(/<[^>]+>/g, ' ').trim();
 }
@@ -309,80 +195,13 @@ function findMarkupTagValue(text, patterns) {
  const source = toStringSafe(text);
  for (const p of patterns) {
    const m = source.match(p);
-    if (m && m[1]) {
+    if (m && m[1] !== undefined) {
      return toStringSafe(m[1]);
    }
  }
  return '';
 }

-function parseToolCallList(v) {
-  if (!Array.isArray(v)) {
-    return [];
-  }
-  const out = [];
-  for (const item of v) {
-    if (!item || typeof item !== 'object') {
-      continue;
-    }
-    const one = parseToolCallItem(item);
-    if (one) {
-      out.push(one);
-    }
-  }
-  return out;
-}
-
-function parseToolCallItem(m) {
-  let name = toStringSafe(m.name);
-  let inputRaw = m.input;
-  let hasInput = Object.prototype.hasOwnProperty.call(m, 'input');
-  const fnCall = m.functionCall && typeof m.functionCall === 'object' ? m.functionCall : null;
-  if (fnCall) {
-    if (!name) {
-      name = toStringSafe(fnCall.name);
-    }
-    if (!hasInput && Object.prototype.hasOwnProperty.call(fnCall, 'args')) {
-      inputRaw = fnCall.args;
-      hasInput = true;
-    }
-    if (!hasInput && Object.prototype.hasOwnProperty.call(fnCall, 'arguments')) {
-      inputRaw = fnCall.arguments;
-      hasInput = true;
-    }
-  }
-  const fn = m.function && typeof m.function === 'object' ? m.function : null;
-
-  if (fn) {
-    if (!name) {
-      name = toStringSafe(fn.name);
-    }
-    if (!hasInput && Object.prototype.hasOwnProperty.call(fn, 'arguments')) {
-      inputRaw = fn.arguments;
-      hasInput = true;
-    }
-  }
-
-  if (!hasInput) {
-    for (const k of ['arguments', 'args', 'parameters', 'params']) {
-      if (Object.prototype.hasOwnProperty.call(m, k)) {
-        inputRaw = m[k];
-        hasInput = true;
-        break;
-      }
-    }
-  }
-
-  if (!name) {
-    return null;
-  }
-
-  return {
-    name,
-    input: parseToolCallInput(inputRaw),
-  };
-}
-
 function parseToolCallInput(v) {
  if (v == null) {
    return {};
@@ -416,10 +235,28 @@ function parseToolCallInput(v) {
  return {};
 }

+function appendMarkupValue(out, key, value) {
+  if (Object.prototype.hasOwnProperty.call(out, key)) {
+    const current = out[key];
+    if (Array.isArray(current)) {
+      current.push(value);
+      return;
+    }
+    out[key] = [current, value];
+    return;
+  }
+  out[key] = value;
+}
+
+function isOnlyRawValue(obj) {
+  if (!obj || typeof obj !== 'object' || Array.isArray(obj)) {
+    return false;
+  }
+  const keys = Object.keys(obj);
+  return keys.length === 1 && keys[0] === '_raw';
+}
+
 module.exports = {
  stripFencedCodeBlocks,
-  buildToolCallCandidates,
-  parseToolCallsPayload,
  parseMarkupToolCalls,
-  parseTextKVToolCalls,
 };
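
Review note: the new value path decodes each tag body (CDATA verbatim, otherwise entity unescaping), tries `JSON.parse`, and folds repeated keys into arrays. The following is a simplified sketch of that flow, not the module itself: it skips nested markup, `decodeTagBody` and `parseFlatKV` are hypothetical names, and decoding `&amp;` last is a choice made in this sketch so that `&amp;lt;` yields the literal text `&lt;`.

```javascript
const CDATA_PATTERN = /^<!\[CDATA\[([\s\S]*?)]]>$/i;
const KV_PATTERN = /<(?:[a-z0-9_:-]+:)?([a-z0-9_.-]+)\b[^>]*>([\s\S]*?)<\/(?:[a-z0-9_:-]+:)?\1>/gi;

// Decode one tag body: CDATA content is returned as-is, otherwise entities are unescaped.
function decodeTagBody(inner) {
  const s = String(inner == null ? '' : inner).trim();
  const cdata = s.match(CDATA_PATTERN);
  if (cdata) {
    return cdata[1];
  }
  return s
    .replace(/&lt;/g, '<')
    .replace(/&gt;/g, '>')
    .replace(/&quot;/g, '"')
    .replace(/&#039;/g, "'")
    .replace(/&#x27;/g, "'")
    .replace(/&amp;/g, '&'); // decoded last to avoid double unescaping
}

// Repeated keys become arrays, matching appendMarkupValue in the diff.
function appendMarkupValue(out, key, value) {
  if (Object.prototype.hasOwnProperty.call(out, key)) {
    const current = out[key];
    if (Array.isArray(current)) {
      current.push(value);
      return;
    }
    out[key] = [current, value];
    return;
  }
  out[key] = value;
}

// Flat (non-nested) variant of the KV parser, for illustration only.
function parseFlatKV(text) {
  const out = {};
  for (const m of String(text).matchAll(KV_PATTERN)) {
    const decoded = decodeTagBody(m[2]);
    let value;
    try {
      value = JSON.parse(decoded);
    } catch (_err) {
      value = decoded;
    }
    appendMarkupValue(out, m[1].trim(), value);
  }
  return out;
}

console.log(parseFlatKV('<path>a.txt</path><path>b.txt</path><limit>5</limit>'));
console.log(parseFlatKV('<q><![CDATA[a < b]]></q>'));
```
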
--- a/internal/js/helpers/stream-tool-sieve/sieve-xml.js
+++ b/internal/js/helpers/stream-tool-sieve/sieve-xml.js
@@ -42,8 +42,8 @@ function consumeXMLToolCapture(captured, toolNames, trimWrappingJSONFence) {
        suffix: trimmedFence.suffix,
      };
    }
-    // XML tool syntax but failed to parse — consume to avoid leak.
-    return { ready: true, prefix: prefixPart, calls: [], suffix: suffixPart };
+    // If this block failed to become a tool call, pass it through as text.
+    return { ready: true, prefix: prefixPart + xmlBlock, calls: [], suffix: suffixPart };
  }
  return { ready: false, prefix: '', calls: [], suffix: '' };
 }
@@ -79,22 +79,8 @@ function findPartialXMLToolTagStart(s) {
  return -1;
 }

-function looksLikeXMLToolTagFragment(s) {
-  const trimmed = (s || '').trim();
-  if (!trimmed) return false;
-  const lower = trimmed.toLowerCase();
-  const fragments = [
-    'tool_calls>', 'tool_call>', '/tool_calls>', '/tool_call>',
-    'function_calls>', 'function_call>', '/function_calls>', '/function_call>',
-    'invoke>', '/invoke>', 'tool_use>', '/tool_use>',
-    'tool_name>', '/tool_name>', 'parameters>', '/parameters>',
-  ];
-  return fragments.some(f => lower.includes(f));
-}
-
 module.exports = {
  consumeXMLToolCapture,
  hasOpenXMLToolTag,
  findPartialXMLToolTagStart,
-  looksLikeXMLToolTagFragment,
 };
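
Review note: the behavioral change in `consumeXMLToolCapture` is that a complete block which fails to parse is now returned as text in `prefix` instead of being swallowed. A rough sketch of that contract, assuming a single `<tool_call>` wrapper and a hypothetical `parseBlock` stand-in for the real markup parser:

```javascript
// Hypothetical stand-in parser: accepts a block only if it has a non-empty <tool_name>.
function parseBlock(xmlBlock) {
  const m = xmlBlock.match(/<tool_name>([\s\S]*?)<\/tool_name>/i);
  const name = m ? m[1].trim() : '';
  return name ? [{ name, input: {} }] : [];
}

function consumeCapture(captured) {
  const openTag = '<tool_call>';
  const closeTag = '</tool_call>';
  const open = captured.indexOf(openTag);
  const close = open < 0 ? -1 : captured.indexOf(closeTag, open);
  if (close < 0) {
    // Block is still incomplete: keep buffering.
    return { ready: false, prefix: '', calls: [], suffix: '' };
  }
  const end = close + closeTag.length;
  const prefixPart = captured.slice(0, open);
  const xmlBlock = captured.slice(open, end);
  const suffixPart = captured.slice(end);
  const calls = parseBlock(xmlBlock);
  if (calls.length > 0) {
    return { ready: true, prefix: prefixPart, calls, suffix: suffixPart };
  }
  // Failed to become a tool call: pass the raw block through as text.
  return { ready: true, prefix: prefixPart + xmlBlock, calls: [], suffix: suffixPart };
}

console.log(consumeCapture('Hi <tool_call>oops</tool_call> bye'));
console.log(consumeCapture('Hi <tool_call><tool_name>search</tool_name></tool_call>!'));
```
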
--- a/internal/js/helpers/stream-tool-sieve/sieve.js
+++ b/internal/js/helpers/stream-tool-sieve/sieve.js
@@ -4,18 +4,14 @@ const {
  noteText,
  insideCodeFenceWithState,
 } = require('./state');
-const { parseStandaloneToolCallsDetailed } = require('./parse');
-const { extractJSONObjectFrom, trimWrappingJSONFence } = require('./jsonscan');
+const { trimWrappingJSONFence } = require('./jsonscan');
 const {
-  TOOL_SEGMENT_KEYWORDS,
  XML_TOOL_SEGMENT_TAGS,
-  earliestKeywordIndex,
 } = require('./tool-keywords');
 const {
  consumeXMLToolCapture: consumeXMLToolCaptureImpl,
  hasOpenXMLToolTag,
  findPartialXMLToolTagStart,
-  looksLikeXMLToolTagFragment,
 } = require('./sieve-xml');
 function processToolSieveChunk(state, chunk, toolNames) {
  if (!state) {
@@ -80,7 +76,7 @@ function processToolSieveChunk(state, chunk, toolNames) {
      resetIncrementalToolState(state);
      continue;
    }
-    const [safe, hold] = splitSafeContentForToolDetection(pending);
+    const [safe, hold] = splitSafeContentForToolDetection(state, pending);
    if (!safe) {
      break;
    }
@@ -117,54 +113,38 @@ function flushToolSieve(state, toolNames) {
      }
    } else if (state.capture) {
      const content = state.capture;
-      if (!hasOpenXMLToolTag(content) && !looksLikeXMLToolTagFragment(content)) {
-        noteText(state, content);
-        events.push({ type: 'text', text: content });
-      }
+      noteText(state, content);
+      events.push({ type: 'text', text: content });
    }
    state.capture = '';
    state.capturing = false;
    resetIncrementalToolState(state);
  }
  if (state.pending) {
-    if (!hasOpenXMLToolTag(state.pending) && !looksLikeXMLToolTagFragment(state.pending)) {
-      noteText(state, state.pending);
-      events.push({ type: 'text', text: state.pending });
-    }
+    noteText(state, state.pending);
+    events.push({ type: 'text', text: state.pending });
    state.pending = '';
  }
  return events;
 }

-function splitSafeContentForToolDetection(s) {
+function splitSafeContentForToolDetection(state, s) {
  const text = s || '';
  if (!text) {
    return ['', ''];
  }
-  const suspiciousStart = findSuspiciousPrefixStart(text);
-  if (suspiciousStart < 0) {
-    return [text, ''];
-  }
-  if (suspiciousStart > 0) {
-    return [text.slice(0, suspiciousStart), text.slice(suspiciousStart)];
-  }
-  return ['', text];
-}
-
-function findSuspiciousPrefixStart(s) {
-  let start = -1;
-  for (const needle of ['{', '[', '```']) {
-    const idx = s.lastIndexOf(needle);
-    if (idx > start) {
-      start = idx;
+  // Only hold back partial XML tool tags.
+  const xmlIdx = findPartialXMLToolTagStart(text);
+  if (xmlIdx >= 0) {
+    if (insideCodeFenceWithState(state, text.slice(0, xmlIdx))) {
+      return [text, ''];
    }
+    if (xmlIdx > 0) {
+      return [text.slice(0, xmlIdx), text.slice(xmlIdx)];
+    }
+    return ['', text];
  }
-  // Also check for partial XML tool tag at end of string.
-  const xmlIdx = findPartialXMLToolTagStart(s);
-  if (xmlIdx >= 0 && xmlIdx > start) {
-    start = xmlIdx;
-  }
-  return start;
+  return [text, ''];
 }

 function findToolSegmentStart(state, s) {
@@ -174,39 +154,23 @@ function findToolSegmentStart(state, s) {
  const lower = s.toLowerCase();
  let offset = 0;
  while (true) {
-    // Check JSON keywords.
-    let { index: bestKeyIdx, keyword: matchedKeyword } = earliestKeywordIndex(lower, TOOL_SEGMENT_KEYWORDS, offset);
-    // Also check XML tool tags.
+    // Only check XML tool tags.
+    let bestIdx = -1;
+    let matchedTag = '';
    for (const tag of XML_TOOL_SEGMENT_TAGS) {
      const idx = lower.indexOf(tag, offset);
-      if (idx >= 0 && (bestKeyIdx < 0 || idx < bestKeyIdx)) {
-        bestKeyIdx = idx;
-        matchedKeyword = tag;
+      if (idx >= 0 && (bestIdx < 0 || idx < bestIdx)) {
+        bestIdx = idx;
+        matchedTag = tag;
      }
    }
-    if (bestKeyIdx < 0) {
+    if (bestIdx < 0) {
      return -1;
    }
-    // For XML tags, the '<' is itself the segment start.
-    if (s[bestKeyIdx] === '<') {
-      if (!insideCodeFenceWithState(state, s.slice(0, bestKeyIdx))) {
-        return bestKeyIdx;
-      }
-      offset = bestKeyIdx + matchedKeyword.length;
-      continue;
+    if (!insideCodeFenceWithState(state, s.slice(0, bestIdx))) {
+      return bestIdx;
    }
-    const keyIdx = bestKeyIdx;
-    const start = s.slice(0, keyIdx).lastIndexOf('{');
-    let candidateStart = start >= 0 ? start : keyIdx;
-    // If the keyword matched inside an XML tag (e.g. "tool_calls" in "<tool_calls>"),
-    // back up past the '<' to capture the full tag.
-    if (candidateStart > 0 && s[candidateStart - 1] === '<') {
-      candidateStart--;
-    }
-    if (!insideCodeFenceWithState(state, s.slice(0, candidateStart))) {
-      return candidateStart;
-    }
-    offset = keyIdx + matchedKeyword.length;
+    offset = bestIdx + matchedTag.length;
  }
 }

@@ -216,7 +180,7 @@ function consumeToolCapture(state, toolNames) {
    return { ready: false, prefix: '', calls: [], suffix: '' };
  }

-  // Try XML tool call extraction first.
+  // XML-only tool call extraction.
  const xmlResult = consumeXMLToolCaptureImpl(captured, toolNames, trimWrappingJSONFence);
  if (xmlResult.ready) {
    return xmlResult;
@@ -226,50 +190,12 @@ function consumeToolCapture(state, toolNames) {
    return { ready: false, prefix: '', calls: [], suffix: '' };
  }

-  const lower = captured.toLowerCase();
-  const { index: keyIdx } = earliestKeywordIndex(lower, TOOL_SEGMENT_KEYWORDS);
-  if (keyIdx < 0) {
-    return { ready: false, prefix: '', calls: [], suffix: '' };
-  }
-  const start = captured.slice(0, keyIdx).lastIndexOf('{');
-  const actualStart = start >= 0 ? start : keyIdx;
-  const obj = extractJSONObjectFrom(captured, actualStart);
-  if (!obj.ok) {
-    return { ready: false, prefix: '', calls: [], suffix: '' };
-  }
-  const prefixPart = captured.slice(0, actualStart);
-  const suffixPart = captured.slice(obj.end);
-  if (insideCodeFenceWithState(state, prefixPart)) {
-    return {
-      ready: true,
-      prefix: captured,
-      calls: [],
-      suffix: '',
-    };
-  }
-  const parsed = parseStandaloneToolCallsDetailed(captured.slice(actualStart, obj.end), toolNames);
-  if (!Array.isArray(parsed.calls) || parsed.calls.length === 0) {
-    if (parsed.sawToolCallSyntax && parsed.rejectedByPolicy) {
-      return {
-        ready: true,
-        prefix: prefixPart,
-        calls: [],
-        suffix: suffixPart,
-      };
-    }
-    return {
-      ready: true,
-      prefix: captured,
-      calls: [],
-      suffix: '',
-    };
-  }
-  const trimmedFence = trimWrappingJSONFence(prefixPart, suffixPart);
+  // No XML tool tags detected — release captured content as text.
  return {
    ready: true,
-    prefix: trimmedFence.prefix,
-    calls: parsed.calls,
-    suffix: trimmedFence.suffix,
+    prefix: captured,
+    calls: [],
+    suffix: '',
  };
 }
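
Review note: `splitSafeContentForToolDetection` now holds back only a trailing partial XML tool tag; `{`, `[` and code-fence starts are no longer treated as suspicious. The sketch below illustrates the split under stated assumptions: `findPartialTagStart` is a hypothetical approximation of `findPartialXMLToolTagStart` (whose body is not in this diff), the tag list is illustrative, and the code-fence check is omitted.

```javascript
// Illustrative tag stems; the real detector lives in sieve-xml.js.
const TOOL_TAG_STEMS = ['<tool_calls', '<tool_call', '<invoke', '<function_call', '<tool_use'];

// Returns the index of a trailing, still-open fragment that could become a tool tag.
function findPartialTagStart(text) {
  const idx = text.lastIndexOf('<');
  if (idx < 0) {
    return -1;
  }
  const tail = text.slice(idx).toLowerCase();
  if (tail.includes('>')) {
    return -1; // the tag is already closed, nothing partial to hold
  }
  const couldBeToolTag = TOOL_TAG_STEMS.some((stem) => stem.startsWith(tail) || tail.startsWith(stem));
  return couldBeToolTag ? idx : -1;
}

// Returns [safe, hold]: safe text can be emitted now, hold waits for more input.
function splitSafe(text) {
  const idx = findPartialTagStart(text);
  if (idx < 0) {
    return [text, ''];
  }
  return [text.slice(0, idx), text.slice(idx)];
}

console.log(splitSafe('Hello <too'));
console.log(splitSafe('a < b and {"x":1}'));
```

With this shape, JSON-looking text streams through immediately, while a fragment such as `<too` waits for the next chunk.
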

--- a/internal/js/helpers/stream-tool-sieve/tool-keywords.js
+++ b/internal/js/helpers/stream-tool-sieve/tool-keywords.js
@@ -1,15 +1,7 @@
 'use strict';

-const TOOL_SEGMENT_KEYWORDS = [
-  'tool_calls',
-  '"function"',
-  'function.name:',
-  'functioncall',
-  '"tool_use"',
-];
-
 const XML_TOOL_SEGMENT_TAGS = [
-  '<tool_calls>', '<tool_calls\n', '<tool_call>', '<tool_call\n',
+  '<tool_calls>', '<tool_calls\n', '<tool_calls ', '<tool_call>', '<tool_call\n', '<tool_call ',
  '<invoke ', '<invoke>', '<function_call', '<function_calls', '<tool_use>',
 ];

@@ -21,26 +13,9 @@ const XML_TOOL_CLOSING_TAGS = [
  '</tool_calls>', '</tool_call>', '</invoke>', '</function_call>', '</function_calls>', '</tool_use>',
 ];

-function earliestKeywordIndex(text, keywords = TOOL_SEGMENT_KEYWORDS, offset = 0) {
-  if (!text) {
-    return { index: -1, keyword: '' };
-  }
-  let index = -1;
-  let keyword = '';
-  for (const kw of keywords) {
-    const candidate = text.indexOf(kw, offset);
-    if (candidate >= 0 && (index < 0 || candidate < index)) {
-      index = candidate;
-      keyword = kw;
-    }
-  }
-  return { index, keyword };
-}
-
 module.exports = {
-  TOOL_SEGMENT_KEYWORDS,
  XML_TOOL_SEGMENT_TAGS,
  XML_TOOL_OPENING_TAGS,
  XML_TOOL_CLOSING_TAGS,
-  earliestKeywordIndex,
 };
+
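
Review note: the two added entries (`'<tool_calls '` and `'<tool_call '`) let the scanner detect opening tags that carry attributes. A small sketch of the earliest-tag scan shows the difference; `earliestTagIndex` is a hypothetical helper modeled on the loop in `findToolSegmentStart`, without the code-fence check.

```javascript
const XML_TOOL_SEGMENT_TAGS = [
  '<tool_calls>', '<tool_calls\n', '<tool_calls ', '<tool_call>', '<tool_call\n', '<tool_call ',
  '<invoke ', '<invoke>', '<function_call', '<function_calls', '<tool_use>',
];

// Returns the lowest index at which any tag occurs at or after offset, or -1.
function earliestTagIndex(text, tags = XML_TOOL_SEGMENT_TAGS, offset = 0) {
  const lower = text.toLowerCase();
  let best = -1;
  for (const tag of tags) {
    const idx = lower.indexOf(tag, offset);
    if (idx >= 0 && (best < 0 || idx < best)) {
      best = idx;
    }
  }
  return best;
}

// The list as it was before this change, without the space-suffixed variants.
const withoutSpaceVariants = XML_TOOL_SEGMENT_TAGS.filter(
  (tag) => tag !== '<tool_calls ' && tag !== '<tool_call ',
);

const sample = 'see <tool_call name="search">';
console.log(earliestTagIndex(sample));
console.log(earliestTagIndex(sample, withoutSpaceVariants));
```
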
--- a/internal/prompt/messages.go
+++ b/internal/prompt/messages.go
@@ -10,6 +10,7 @@ import (
 var markdownImagePattern = regexp.MustCompile(`!\[(.*?)\]\((.*?)\)`)

 const (
+	beginSentenceMarker   = "<｜begin▁of▁sentence｜>"
 	systemMarker          = "<｜System｜>"
 	userMarker            = "<｜User｜>"
 	assistantMarker       = "<｜Assistant｜>"
@@ -20,11 +21,20 @@ const (
 )

 func MessagesPrepare(messages []map[string]any) string {
+	return MessagesPrepareWithThinking(messages, false)
+}
+
+func MessagesPrepareWithThinking(messages []map[string]any, thinkingEnabled bool) string {
 	type block struct {
 		Role string
 		Text string
 	}
 	processed := make([]block, 0, len(messages))
+	if thinkingEnabled {
+		if instruction := buildConversationContinuityInstructions(thinkingEnabled); strings.TrimSpace(instruction) != "" {
+			processed = append(processed, block{Role: "system", Text: instruction})
+		}
+	}
 	for _, m := range messages {
 		role, _ := m["role"].(string)
 		text := NormalizeContent(m["content"])
@@ -41,8 +51,11 @@ func MessagesPrepare(messages []map[string]any) string {
 		}
 		merged = append(merged, msg)
 	}
-	parts := make([]string, 0, len(merged))
+	parts := make([]string, 0, len(merged)+2)
+	parts = append(parts, beginSentenceMarker)
+	lastRole := ""
 	for _, m := range merged {
+		lastRole = m.Role
 		switch m.Role {
 		case "assistant":
 			parts = append(parts, formatRoleBlock(assistantMarker, m.Text, endSentenceMarker))
@@ -55,26 +68,42 @@ func MessagesPrepare(messages []map[string]any) string {
 				parts = append(parts, formatRoleBlock(systemMarker, text, endInstructionsMarker))
 			}
 		case "user":
-			parts = append(parts, formatRoleBlock(userMarker, m.Text, endSentenceMarker))
+			parts = append(parts, formatRoleBlock(userMarker, m.Text, ""))
 		default:
 			if strings.TrimSpace(m.Text) != "" {
 				parts = append(parts, m.Text)
 			}
 		}
 	}
-	out := strings.Join(parts, "\n\n")
+	if lastRole != "assistant" {
+		parts = append(parts, assistantMarker)
+	}
+	out := strings.Join(parts, "")
 	return markdownImagePattern.ReplaceAllString(out, `[${1}](${2})`)
 }

-// DeepSeek-style turn suffixes stay attached to the same block as the role content.
+// formatRoleBlock produces a single concatenated block: marker + text + endMarker.
+// No whitespace is inserted between marker and text so role boundaries stay
+// compact and predictable for downstream parsers.
 func formatRoleBlock(marker, text, endMarker string) string {
-	out := marker + "\n" + text
+	out := marker + text
 	if strings.TrimSpace(endMarker) != "" {
 		out += endMarker
 	}
 	return out
 }

+func buildConversationContinuityInstructions(thinkingEnabled bool) string {
+	lines := []string{
+		"Continue the conversation from the full prior context and the latest tool results.",
+		"Treat earlier messages as binding context; answer the user's current request as a continuation, not a restart.",
+	}
+	if thinkingEnabled {
+		lines = append(lines, "Keep reasoning internal. Do not leave the final user-facing answer only in reasoning; always provide the answer in visible assistant content.")
+	}
+	return strings.Join(lines, "\n")
+}
+
 func NormalizeContent(v any) string {
 	if v == nil {
 		return ""
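
Review note: the prompt layout after this change is a begin marker, role blocks joined with no separator, no end marker on user turns, and a trailing assistant marker when the last turn is not from the assistant. The sketch below is a JavaScript port for illustration only (the real code is Go). It assumes messages are already merged and normalized, and it omits the tool-role and thinking-instruction paths.

```javascript
const BEGIN = '<｜begin▁of▁sentence｜>';
const SYSTEM = '<｜System｜>';
const USER = '<｜User｜>';
const ASSISTANT = '<｜Assistant｜>';
const END_SENTENCE = '<｜end▁of▁sentence｜>';
const END_INSTRUCTIONS = '<｜end▁of▁instructions｜>';

// Hypothetical port of the assembly loop in MessagesPrepareWithThinking.
function preparePrompt(messages) {
  const parts = [BEGIN];
  let lastRole = '';
  for (const { role, content } of messages) {
    lastRole = role;
    if (role === 'assistant') {
      parts.push(ASSISTANT + content + END_SENTENCE);
    } else if (role === 'system') {
      parts.push(SYSTEM + content + END_INSTRUCTIONS);
    } else if (role === 'user') {
      parts.push(USER + content); // user turns no longer get an end marker
    } else if (content.trim() !== '') {
      parts.push(content);
    }
  }
  if (lastRole !== 'assistant') {
    parts.push(ASSISTANT); // open the assistant turn for generation
  }
  return parts.join('');
}

console.log(preparePrompt([{ role: 'user', content: 'Question' }]));
```
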
--- a/internal/prompt/messages_test.go
+++ b/internal/prompt/messages_test.go
@@ -32,15 +32,21 @@ func TestMessagesPrepareUsesTurnSuffixes(t *testing.T) {
 		{"role": "assistant", "content": "Answer"},
 	}
 	got := MessagesPrepare(messages)
-	if !strings.Contains(got, "<｜System｜>\nSystem rule<｜end▁of▁instructions｜>") {
+	if !strings.HasPrefix(got, "<｜begin▁of▁sentence｜>") {
+		t.Fatalf("expected begin-of-sentence marker, got %q", got)
+	}
+	if !strings.Contains(got, "<｜System｜>System rule<｜end▁of▁instructions｜>") {
 		t.Fatalf("expected system instructions suffix, got %q", got)
 	}
-	if !strings.Contains(got, "<｜User｜>\nQuestion<｜end▁of▁sentence｜>") {
-		t.Fatalf("expected user sentence suffix, got %q", got)
+	if !strings.Contains(got, "<｜User｜>Question") {
+		t.Fatalf("expected user question, got %q", got)
 	}
-	if !strings.Contains(got, "<｜Assistant｜>\nAnswer<｜end▁of▁sentence｜>") {
+	if !strings.Contains(got, "<｜Assistant｜>Answer<｜end▁of▁sentence｜>") {
 		t.Fatalf("expected assistant sentence suffix, got %q", got)
 	}
+	if strings.Contains(got, "<think>") || strings.Contains(got, "</think>") {
+		t.Fatalf("did not expect think tags in prompt, got %q", got)
+	}
 }

 func TestNormalizeContentArrayFallsBackToContentWhenTextEmpty(t *testing.T) {
@@ -51,3 +57,24 @@ func TestNormalizeContentArrayFallsBackToContentWhenTextEmpty(t *testing.T) {
 		t.Fatalf("expected fallback to content when text is empty, got %q", got)
 	}
 }
+
+func TestMessagesPrepareWithThinkingAddsContinuityContract(t *testing.T) {
+	messages := []map[string]any{{"role": "user", "content": "Question"}}
+	gotThinking := MessagesPrepareWithThinking(messages, true)
+	gotPlain := MessagesPrepareWithThinking(messages, false)
+	if gotThinking == gotPlain {
+		t.Fatalf("expected thinking-enabled prompt to include extra continuity instructions")
+	}
+	if !strings.HasSuffix(gotThinking, "<｜Assistant｜>") {
+		t.Fatalf("expected assistant suffix, got %q", gotThinking)
+	}
+	if !strings.Contains(gotThinking, "Continue the conversation from the full prior context") {
+		t.Fatalf("expected continuity instruction in thinking prompt, got %q", gotThinking)
+	}
+	if !strings.Contains(gotThinking, "final user-facing answer only in reasoning") {
+		t.Fatalf("expected visible-answer instruction in thinking prompt, got %q", gotThinking)
+	}
+	if strings.Contains(gotPlain, "Continue the conversation from the full prior context") {
+		t.Fatalf("did not expect thinking-only instruction in plain prompt, got %q", gotPlain)
+	}
+}
--- a/internal/prompt/tool_calls.go
+++ b/internal/prompt/tool_calls.go
@@ -2,6 +2,9 @@ package prompt

 import (
 	"encoding/json"
+	"fmt"
+	"regexp"
+	"sort"
 	"strings"
 )

@@ -11,6 +14,8 @@ var promptXMLTextEscaper = strings.NewReplacer(
 	">", "&gt;",
 )

+var promptXMLNamePattern = regexp.MustCompile(`^[A-Za-z_][A-Za-z0-9_.:-]*$`)
+
 // FormatToolCallsForPrompt renders a tool_calls slice into the canonical
 // prompt-visible history block used across adapters.
 func FormatToolCallsForPrompt(raw any) string {
@@ -87,12 +92,160 @@ func formatToolCallForPrompt(call map[string]any) string {
 		}
 	}

+	parameters := formatToolCallParametersForPrompt(argsRaw)
+
 	return "  <tool_call>\n" +
 		"    <tool_name>" + escapeXMLText(name) + "</tool_name>\n" +
-		"    <parameters>" + escapeXMLText(StringifyToolCallArguments(argsRaw)) + "</parameters>\n" +
+		parameters + "\n" +
 		"  </tool_call>"
 }

+func formatToolCallParametersForPrompt(raw any) string {
+	value := normalizePromptToolCallValue(raw)
+	body, ok := renderPromptToolXMLBody(value, "      ")
+	if ok {
+		if strings.TrimSpace(body) == "" {
+			return "    <parameters></parameters>"
+		}
+		return "    <parameters>\n" + body + "\n    </parameters>"
+	}
+
+	fallback := StringifyToolCallArguments(raw)
+	if strings.TrimSpace(fallback) == "" {
+		fallback = "{}"
+	}
+	return "    <parameters><content>" + renderPromptXMLText(fallback) + "</content></parameters>"
+}
+
+func normalizePromptToolCallValue(raw any) any {
+	switch x := raw.(type) {
+	case nil:
+		return nil
+	case string:
+		s := strings.TrimSpace(x)
+		if s == "" {
+			return ""
+		}
+		var parsed any
+		if err := json.Unmarshal([]byte(s), &parsed); err == nil {
+			return parsed
+		}
+		return x
+	default:
+		return x
+	}
+}
+
+func renderPromptToolXMLBody(value any, indent string) (string, bool) {
+	switch v := value.(type) {
+	case nil:
+		return "", true
+	case map[string]any:
+		return renderPromptToolXMLMap(v, indent)
+	case []any:
+		return renderPromptToolXMLArray(v, indent)
+	case string:
+		return indent + "<content>" + renderPromptXMLText(v) + "</content>", true
+	case bool, float32, float64, int, int8, int16, int32, int64, uint, uint8, uint16, uint32, uint64:
+		return indent + "<value>" + escapeXMLText(fmt.Sprint(v)) + "</value>", true
+	default:
+		return indent + "<value>" + renderPromptXMLText(fmt.Sprint(v)) + "</value>", true
+	}
+}
+
+func renderPromptToolXMLMap(m map[string]any, indent string) (string, bool) {
+	if len(m) == 0 {
+		return "", true
+	}
+	keys := make([]string, 0, len(m))
+	for k := range m {
+		if !isValidPromptXMLName(k) {
+			return "", false
+		}
+		keys = append(keys, k)
+	}
+	sort.Strings(keys)
+
+	lines := make([]string, 0, len(keys))
+	for _, key := range keys {
+		rendered, ok := renderPromptToolXMLNode(key, m[key], indent)
+		if !ok {
+			return "", false
+		}
+		lines = append(lines, rendered)
+	}
+	return strings.Join(lines, "\n"), true
+}
+
+func renderPromptToolXMLArray(items []any, indent string) (string, bool) {
+	if len(items) == 0 {
+		return "", true
+	}
+	lines := make([]string, 0, len(items))
+	for _, item := range items {
+		rendered, ok := renderPromptToolXMLNode("item", item, indent)
+		if !ok {
+			return "", false
+		}
+		lines = append(lines, rendered)
+	}
+	return strings.Join(lines, "\n"), true
+}
+
+func renderPromptToolXMLNode(name string, value any, indent string) (string, bool) {
+	if !isValidPromptXMLName(name) {
+		return "", false
+	}
+	switch v := value.(type) {
+	case nil:
+		return indent + "<" + name + "></" + name + ">", true
+	case map[string]any:
+		inner, ok := renderPromptToolXMLMap(v, indent+"  ")
+		if !ok {
+			return "", false
+		}
+		if strings.TrimSpace(inner) == "" {
+			return indent + "<" + name + "></" + name + ">", true
+		}
+		return indent + "<" + name + ">\n" + inner + "\n" + indent + "</" + name + ">", true
+	case []any:
+		if len(v) == 0 {
+			return indent + "<" + name + "></" + name + ">", true
+		}
+		lines := make([]string, 0, len(v))
+		for _, item := range v {
+			rendered, ok := renderPromptToolXMLNode(name, item, indent)
+			if !ok {
+				return "", false
+			}
+			lines = append(lines, rendered)
+		}
+		return strings.Join(lines, "\n"), true
+	case string:
+		return indent + "<" + name + ">" + renderPromptXMLText(v) + "</" + name + ">", true
+	case bool, float32, float64, int, int8, int16, int32, int64, uint, uint8, uint16, uint32, uint64:
+		return indent + "<" + name + ">" + escapeXMLText(fmt.Sprint(v)) + "</" + name + ">", true
+	default:
+		return indent + "<" + name + ">" + renderPromptXMLText(fmt.Sprint(v)) + "</" + name + ">", true
+	}
+}
+
+// renderPromptXMLText wraps every non-empty string in CDATA so prompt-visible
+// tool history stays uniform and does not drift back toward ad-hoc escaping.
+// A literal "]]>" is split across two adjacent CDATA sections.
+func renderPromptXMLText(text string) string {
+	if text == "" {
+		return ""
+	}
+	if strings.Contains(text, "]]>") {
+		return "<![CDATA[" + strings.ReplaceAll(text, "]]>", "]]]]><![CDATA[>") + "]]>"
+	}
+	return "<![CDATA[" + text + "]]>"
+}
+
+func isValidPromptXMLName(name string) bool {
+	return promptXMLNamePattern.MatchString(strings.TrimSpace(name))
+}
+
 func normalizeToolArgumentString(raw string) string {
 	trimmed := strings.TrimSpace(raw)
 	if trimmed == "" {
--- a/internal/prompt/tool_calls_test.go
+++ b/internal/prompt/tool_calls_test.go
@@ -22,7 +22,7 @@ func TestFormatToolCallsForPromptXML(t *testing.T) {
 	if got == "" {
 		t.Fatal("expected non-empty formatted tool calls")
 	}
-	if got != "<tool_calls>\n  <tool_call>\n    <tool_name>search_web</tool_name>\n    <parameters>{\"query\":\"latest\"}</parameters>\n  </tool_call>\n</tool_calls>" {
+	if got != "<tool_calls>\n  <tool_call>\n    <tool_name>search_web</tool_name>\n    <parameters>\n      <query><![CDATA[latest]]></query>\n    </parameters>\n  </tool_call>\n</tool_calls>" {
 		t.Fatalf("unexpected formatted tool call XML: %q", got)
 	}
 }
@@ -34,8 +34,24 @@ func TestFormatToolCallsForPromptEscapesXMLEntities(t *testing.T) {
 			"arguments": `{"q":"a < b && c > d"}`,
 		},
 	})
-	want := "<tool_calls>\n  <tool_call>\n    <tool_name>search&lt;&amp;&gt;</tool_name>\n    <parameters>{\"q\":\"a &lt; b &amp;&amp; c &gt; d\"}</parameters>\n  </tool_call>\n</tool_calls>"
+	want := "<tool_calls>\n  <tool_call>\n    <tool_name>search&lt;&amp;&gt;</tool_name>\n    <parameters>\n      <q><![CDATA[a < b && c > d]]></q>\n    </parameters>\n  </tool_call>\n</tool_calls>"
 	if got != want {
 		t.Fatalf("unexpected escaped tool call XML: %q", got)
 	}
 }
+
+func TestFormatToolCallsForPromptUsesCDATAForMultilineContent(t *testing.T) {
+	got := FormatToolCallsForPrompt([]any{
+		map[string]any{
+			"name": "write_file",
+			"arguments": map[string]any{
+				"path":    "script.sh",
+				"content": "#!/bin/bash\nprintf \"hello\"\n",
+			},
+		},
+	})
+	want := "<tool_calls>\n  <tool_call>\n    <tool_name>write_file</tool_name>\n    <parameters>\n      <content><![CDATA[#!/bin/bash\nprintf \"hello\"\n]]></content>\n      <path><![CDATA[script.sh]]></path>\n    </parameters>\n  </tool_call>\n</tool_calls>"
+	if got != want {
+		t.Fatalf("unexpected multiline cdata tool call XML: %q", got)
+	}
+}
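The expected strings in the tests above list `<content>` before `<path>` because parameter keys are sorted before rendering, and every string value is wrapped in CDATA. A minimal sketch of that rendering step for flat parameters follows. The names `renderParams` and `cdata` are hypothetical and are not part of this change, and the sketch assumes values are only strings, numbers, or booleans (no nesting, no name validation).

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// cdata is a hypothetical helper: it wraps text in a CDATA section and
// splits any literal "]]>" across two sections.
func cdata(text string) string {
	if text == "" {
		return ""
	}
	return "<![CDATA[" + strings.ReplaceAll(text, "]]>", "]]]]><![CDATA[>") + "]]>"
}

// renderParams is a hypothetical, flat-only sketch: it renders one XML
// element per key, in sorted key order, so the output is deterministic.
// Assumption: values are strings, numbers, or booleans.
func renderParams(params map[string]any, indent string) string {
	keys := make([]string, 0, len(params))
	for k := range params {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	lines := make([]string, 0, len(keys))
	for _, k := range keys {
		switch v := params[k].(type) {
		case string:
			lines = append(lines, indent+"<"+k+">"+cdata(v)+"</"+k+">")
		default:
			// Numbers and booleans stay plain text.
			lines = append(lines, indent+"<"+k+">"+fmt.Sprint(v)+"</"+k+">")
		}
	}
	return strings.Join(lines, "\n")
}

func main() {
	fmt.Println(renderParams(map[string]any{
		"path":    "script.sh",
		"content": "a < b",
		"count":   2,
	}, "  "))
}
```

Sorting the keys is what makes a byte-for-byte comparison in a test possible, since Go map iteration order is not stable.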
--- a/internal/sse/parser.go
+++ b/internal/sse/parser.go
@@ -3,6 +3,7 @@ package sse
 import (
 	"bytes"
 	"encoding/json"
+	"regexp"
 	"strings"

 	"ds2api/internal/deepseek"
@@ -93,6 +94,11 @@ func ParseSSEChunkForContent(chunk map[string]any, thinkingEnabled bool, current
 	if finished {
 		return nil, true, newType
 	}
+	var transitioned bool
+	parts, transitioned = splitThinkingParts(parts)
+	if transitioned {
+		newType = "text"
+	}
 	return parts, false, newType
 }

@@ -166,6 +172,9 @@ func updateTypeFromNestedResponse(path string, v any, newType *string) {
 func resolvePartType(path string, thinkingEnabled bool, newType string) string {
 	switch {
 	case path == "response/thinking_content":
+		if newType == "text" {
+			return "text"
+		}
 		return "thinking"
 	case path == "response/content":
 		return "text"
@@ -244,6 +253,63 @@ func appendContentPart(parts *[]ContentPart, content, kind string) {
 	*parts = append(*parts, ContentPart{Text: content, Type: kind})
 }

+var thinkClosePattern = regexp.MustCompile(`(?i)</\s*think\s*>`)
+var thinkOpenPattern = regexp.MustCompile(`(?i)<\s*think\s*>`)
+
+// splitThinkingParts detects </think> inside thinking content and
+// auto-transitions everything after it to text. This handles the
+// DeepSeek API bug where the upstream SSE keeps sending
+// reasoning_content even though the model has finished thinking.
+func splitThinkingParts(parts []ContentPart) ([]ContentPart, bool) {
+	var out []ContentPart
+	thinkingDone := false
+	for _, p := range parts {
+		if thinkingDone && p.Type == "thinking" {
+			// Already transitioned — treat remaining thinking as text.
+			cleaned := stripThinkTags(p.Text)
+			if cleaned != "" {
+				out = append(out, ContentPart{Text: cleaned, Type: "text"})
+			}
+			continue
+		}
+		if p.Type != "thinking" {
+			cleaned := stripThinkTags(p.Text)
+			if cleaned != "" {
+				out = append(out, ContentPart{Text: cleaned, Type: p.Type})
+			}
+			continue
+		}
+		loc := thinkClosePattern.FindStringIndex(p.Text)
+		if loc == nil {
+			out = append(out, p)
+			continue
+		}
+		// Split at </think>: before is still thinking, after becomes text.
+		thinkingDone = true
+		before := p.Text[:loc[0]]
+		after := p.Text[loc[1]:]
+		if before != "" {
+			out = append(out, ContentPart{Text: before, Type: "thinking"})
+		}
+		after = stripThinkTags(after)
+		if after != "" {
+			out = append(out, ContentPart{Text: after, Type: "text"})
+		}
+	}
+	if !thinkingDone {
+		// Return 'out' instead of 'parts' because text parts might have been cleaned via stripThinkTags
+		return out, false
+	}
+	return out, true
+}
+
+// stripThinkTags removes any remaining <think> or </think> tags from text.
+func stripThinkTags(s string) string {
+	s = thinkClosePattern.ReplaceAllString(s, "")
+	s = thinkOpenPattern.ReplaceAllString(s, "")
+	return s
+}
+
 func isStatusPath(path string) bool {
 	return path == "response/status" || path == "status"
 }
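The transition logic in this file can be illustrated with a reduced, single-chunk version: once a `</think>` close tag is seen on the thinking path, the remainder of that chunk and all later chunks are treated as text. The sketch below is an illustration only. `part` and `splitChunk` are hypothetical names, tag stripping is omitted, and a close tag that is split across two chunks is not detected by this sketch.

```go
package main

import (
	"fmt"
	"regexp"
)

// part is a hypothetical stand-in for the ContentPart type in the diff.
type part struct {
	Text string
	Type string
}

var closeTag = regexp.MustCompile(`(?i)</\s*think\s*>`)

// splitChunk classifies one chunk that arrived on the thinking path.
// current is the type carried over from the previous chunk; the returned
// string is the type to carry into the next call.
func splitChunk(text, current string) ([]part, string) {
	if text == "" {
		return nil, current
	}
	if current == "text" {
		// A close tag was already seen: everything is answer text now.
		return []part{{Text: text, Type: "text"}}, "text"
	}
	loc := closeTag.FindStringIndex(text)
	if loc == nil {
		return []part{{Text: text, Type: "thinking"}}, "thinking"
	}
	var out []part
	if before := text[:loc[0]]; before != "" {
		out = append(out, part{Text: before, Type: "thinking"})
	}
	if after := text[loc[1]:]; after != "" {
		out = append(out, part{Text: after, Type: "text"})
	}
	return out, "text"
}

func main() {
	state := "thinking"
	chunks := []string{"end of thought</think>start of text", "more text"}
	for _, chunk := range chunks {
		var parts []part
		parts, state = splitChunk(chunk, state)
		for _, p := range parts {
			fmt.Printf("%s: %q\n", p.Type, p.Text)
		}
	}
}
```

Returning the next state alongside the parts keeps the function free of package-level state, which is the same shape the tests in this change rely on when they feed `nextType1` into the second call.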
--- a/internal/sse/parser_test.go
+++ b/internal/sse/parser_test.go
@@ -87,3 +87,79 @@ func TestParseSSEChunkForContentAfterAppendUsesUpdatedType(t *testing.T) {
 		t.Fatalf("unexpected parts: %#v", parts)
 	}
 }
+
+func TestParseSSEChunkForContentAutoTransitionsThinkClose(t *testing.T) {
+	chunk := map[string]any{
+		"p": "response/thinking_content",
+		"v": "deep thoughts</think>actual answer",
+	}
+	parts, _, _ := ParseSSEChunkForContent(chunk, true, "thinking")
+	if len(parts) != 2 {
+		t.Fatalf("expected 2 parts from split, got %d: %#v", len(parts), parts)
+	}
+	if parts[0].Type != "thinking" || parts[0].Text != "deep thoughts" {
+		t.Fatalf("first part should be thinking: %#v", parts[0])
+	}
+	if parts[1].Type != "text" || parts[1].Text != "actual answer" {
+		t.Fatalf("second part should be text: %#v", parts[1])
+	}
+}
+
+func TestParseSSEChunkForContentStripsLeakedThinkTags(t *testing.T) {
+	chunk := map[string]any{
+		"p": "response/thinking_content",
+		"v": "<think>more thoughts</think>  answer",
+	}
+	parts, _, _ := ParseSSEChunkForContent(chunk, true, "thinking")
+	if len(parts) != 2 {
+		t.Fatalf("expected 2 parts, got %d: %#v", len(parts), parts)
+	}
+	if parts[0].Type != "thinking" || parts[0].Text != "<think>more thoughts" {
+		// Note: the open tag precedes the split point, so it stays in the
+		// thinking part. That is fine: output sanitization handles the final string.
+		t.Fatalf("first part mismatch: %#v", parts[0])
+	}
+	if parts[1].Type != "text" || parts[1].Text != "  answer" {
+		t.Fatalf("second part mismatch: %#v", parts[1])
+	}
+}
+
+func TestParseSSEChunkForContentAutoTransitionsState(t *testing.T) {
+	chunk1 := map[string]any{
+		"p": "response/thinking_content",
+		"v": "end of thought</think>start of text",
+	}
+	parts1, _, nextType1 := ParseSSEChunkForContent(chunk1, true, "thinking")
+	if len(parts1) != 2 || parts1[1].Type != "text" {
+		t.Fatalf("expected split parts, got %#v", parts1)
+	}
+	if nextType1 != "text" {
+		t.Fatalf("expected nextType to transition to text, got %q", nextType1)
+	}
+
+	chunk2 := map[string]any{
+		"p": "response/thinking_content",
+		"v": "more actual text sent to thinking path",
+	}
+	parts2, _, nextType2 := ParseSSEChunkForContent(chunk2, true, nextType1)
+	if len(parts2) != 1 || parts2[0].Type != "text" {
+		t.Fatalf("expected subsequent parts to be text, got %#v", parts2)
+	}
+	if nextType2 != "text" {
+		t.Fatalf("expected nextType2 to remain text, got %q", nextType2)
+	}
+}
+
+func TestParseSSEChunkForContentStripsLeakedThinkTagsFromText(t *testing.T) {
+	chunk := map[string]any{
+		"p": "response/content", // This makes the part type "text"
+		"v": "normal text <think>leaked</think> end",
+	}
+	parts, _, _ := ParseSSEChunkForContent(chunk, true, "text")
+	if len(parts) != 1 {
+		t.Fatalf("expected 1 part, got %d: %#v", len(parts), parts)
+	}
+	if parts[0].Type != "text" || parts[0].Text != "normal text leaked end" {
+		t.Fatalf("expected leaked think tag to be stripped, got %#v", parts[0])
+	}
+}
--- a/internal/toolcall/regression_test.go
+++ b/internal/toolcall/regression_test.go
@@ -0,0 +1,81 @@
+package toolcall
+
+import (
+	"reflect"
+	"testing"
+)
+
+func TestRegression_RobustXMLAndCDATA(t *testing.T) {
+	tests := []struct {
+		name     string
+		text     string
+		expected []ParsedToolCall
+	}{
+		{
+			name:     "Standard JSON parameters (Regression)",
+			text:     `<tool_call><tool_name>foo</tool_name><parameters>{"a": 1}</parameters></tool_call>`,
+			expected: []ParsedToolCall{{Name: "foo", Input: map[string]any{"a": float64(1)}}},
+		},
+		{
+			name:     "XML tags parameters (Regression)",
+			text:     `<tool_call><tool_name>foo</tool_name><parameters><arg1>hello</arg1></parameters></tool_call>`,
+			expected: []ParsedToolCall{{Name: "foo", Input: map[string]any{"arg1": "hello"}}},
+		},
+		{
+			name: "CDATA parameters (New Feature)",
+			text: `<tool_call><tool_name>write_file</tool_name><parameters><content><![CDATA[line 1
+line 2 with <tags> and & symbols]]></content></parameters></tool_call>`,
+			expected: []ParsedToolCall{{
+				Name:  "write_file",
+				Input: map[string]any{"content": "line 1\nline 2 with <tags> and & symbols"},
+			}},
+		},
+		{
+			name: "Nested XML with repeated parameters (New Feature)",
+			text: `<tool_call><tool_name>write_file</tool_name><parameters><path>script.sh</path><content><![CDATA[#!/bin/bash
+echo "hello"
+]]></content><item>first</item><item>second</item></parameters></tool_call>`,
+			expected: []ParsedToolCall{{
+				Name: "write_file",
+				Input: map[string]any{
+					"path":    "script.sh",
+					"content": "#!/bin/bash\necho \"hello\"\n",
+					"item":    []any{"first", "second"},
+				},
+			}},
+		},
+		{
+			name: "Dirty XML with unescaped symbols (Robustness Improvement)",
+			text: `<tool_call><tool_name>bash</tool_name><parameters><command>echo "hello" > out.txt && cat out.txt</command></parameters></tool_call>`,
+			expected: []ParsedToolCall{{
+				Name:  "bash",
+				Input: map[string]any{"command": "echo \"hello\" > out.txt && cat out.txt"},
+			}},
+		},
+		{
+			name: "Mixed JSON inside CDATA (New Hybrid Case)",
+			text: `<tool_call><tool_name>foo</tool_name><parameters><![CDATA[{"json_param": "works"}]]></parameters></tool_call>`,
+			expected: []ParsedToolCall{{
+				Name:  "foo",
+				Input: map[string]any{"json_param": "works"},
+			}},
+		},
+	}
+
+	for _, tt := range tests {
+		t.Run(tt.name, func(t *testing.T) {
+			got := ParseToolCalls(tt.text, []string{"foo", "write_file", "bash"})
+			if len(got) != len(tt.expected) {
+				t.Fatalf("expected %d calls, got %d", len(tt.expected), len(got))
+			}
+			for i := range got {
+				if got[i].Name != tt.expected[i].Name {
+					t.Errorf("expected name %q, got %q", tt.expected[i].Name, got[i].Name)
+				}
+				if !reflect.DeepEqual(got[i].Input, tt.expected[i].Input) {
+					t.Errorf("expected input %#v, got %#v", tt.expected[i].Input, got[i].Input)
+				}
+			}
+		})
+	}
+}
--- a/internal/toolcall/tool_prompt.go
+++ b/internal/toolcall/tool_prompt.go
@@ -36,45 +36,47 @@ func BuildToolCallInstructions(toolNames []string) string {

 	return `TOOL CALL FORMAT — FOLLOW EXACTLY:

-When calling tools, emit ONLY raw XML at the very end of your response. No text before, no text after, no markdown fences.
-
 <tool_calls>
  <tool_call>
    <tool_name>TOOL_NAME_HERE</tool_name>
-    <parameters>{"key":"value"}</parameters>
+    <parameters>
+      <PARAMETER_NAME><![CDATA[PARAMETER_VALUE]]></PARAMETER_NAME>
+    </parameters>
  </tool_call>
 </tool_calls>

 RULES:
-1) When calling tools, you MUST use the <tool_calls> XML format.
-2) No text is allowed AFTER the XML block.
-3) <parameters> MUST be a single-line strict JSON object. Use double quotes.
-4) Multiple tools must be inside the same <tool_calls> root.
-5) Do NOT wrap XML in markdown fences (` + "```" + `).
-6) Do NOT invent parameters. Use only the provided schema.
-7) CRITICAL: Do NOT use native tool markers like "<｜Tool｜>" or "<｜tool｜>".
-8) CRITICAL: Do NOT output role markers like "<｜System｜>", "<｜User｜>", or "<｜Assistant｜>".
-9) CRITICAL: Do NOT output internal monologues (e.g. "I will list files now..."). Just output your answer or the XML.
+1) Use the <tool_calls> XML format only. Never emit JSON or function-call syntax.
+2) Put one or more <tool_call> entries under a single <tool_calls> root.
+3) Parameters must be XML, not JSON.
+4) All string values must use <![CDATA[...]]>, even short ones. This includes code, scripts, file contents, prompts, paths, names, and queries.
+5) Objects use nested XML elements. Arrays may repeat the same tag or use <item> children.
+6) Numbers, booleans, and null stay plain text.
+7) Use only the parameter names in the tool schema. Do not invent fields.
+8) Do NOT wrap XML in markdown fences. Do NOT output explanations, role markers, or internal monologue.
+
+PARAMETER SHAPES:
+- string => <name><![CDATA[value]]></name>
+- object => nested XML elements
+- array => repeated tags or <item> children
+- number/bool/null => plain text
+
+【WRONG — Do NOT do these】:

-❌ WRONG — Do NOT do these:
 Wrong 1 — mixed text after XML:
  <tool_calls>...</tool_calls> I hope this helps.
 Wrong 2 — function-call syntax:
  Grep({"pattern": "token"})
-Wrong 3 — missing <tool_calls> wrapper:
-  <tool_call><tool_name>` + ex1 + `</tool_name><parameters>{}</parameters></tool_call>
+Wrong 3 — JSON parameters:
+  <tool_call><tool_name>` + ex1 + `</tool_name><parameters>{"path":"x"}</parameters></tool_call>
 Wrong 4 — Markdown code fences:
  ` + "```xml" + `
  <tool_calls>...</tool_calls>
  ` + "```" + `
-Wrong 5 — native tool tokens:
-  <｜Tool｜>call_some_tool{"param":1}<｜Tool｜>
-Wrong 6 — role markers in response:
-  <｜Assistant｜> Here is the result...

 Remember: The ONLY valid way to use tools is the <tool_calls> XML block at the end of your response.

-✅ CORRECT EXAMPLES:
+【CORRECT EXAMPLES】:

 Example A — Single tool:
 <tool_calls>
@@ -96,15 +98,31 @@ Example B — Two tools in parallel:
  </tool_call>
 </tool_calls>

-Example C — Tool with complex nested JSON parameters:
+Example C — Tool with nested XML parameters:
 <tool_calls>
  <tool_call>
    <tool_name>` + ex3 + `</tool_name>
    <parameters>` + ex3Params + `</parameters>
  </tool_call>
 </tool_calls>
+
+Example D — Tool with long script using CDATA (RELIABLE FOR CODE/SCRIPTS):
+<tool_calls>
+  <tool_call>
+    <tool_name>` + ex2 + `</tool_name>
+    <parameters>
+      <path>` + promptCDATA("script.sh") + `</path>
+      <content><![CDATA[
+#!/bin/bash
+if [ "$1" == "test" ]; then
+  echo "Success!"
+fi
+]]></content>
+    </parameters>
+  </tool_call>
+</tool_calls>

-Remember: Output ONLY the <tool_calls>...</tool_calls> XML block when calling tools.`
+`
 }

 func matchAny(name string, candidates ...string) bool {
@@ -119,34 +137,44 @@ func matchAny(name string, candidates ...string) bool {
 func exampleReadParams(name string) string {
 	switch strings.TrimSpace(name) {
 	case "Read":
-		return `{"file_path":"README.md"}`
+		return `<file_path>` + promptCDATA("README.md") + `</file_path>`
 	case "Glob":
-		return `{"pattern":"**/*.go","path":"."}`
+		return `<pattern>` + promptCDATA("**/*.go") + `</pattern><path>` + promptCDATA(".") + `</path>`
 	default:
-		return `{"path":"src/main.go"}`
+		return `<path>` + promptCDATA("src/main.go") + `</path>`
 	}
 }

 func exampleWriteOrExecParams(name string) string {
 	switch strings.TrimSpace(name) {
 	case "Bash", "execute_command":
-		return `{"command":"pwd"}`
+		return `<command>` + promptCDATA("pwd") + `</command>`
 	case "exec_command":
-		return `{"cmd":"pwd"}`
+		return `<cmd>` + promptCDATA("pwd") + `</cmd>`
 	case "Edit":
-		return `{"file_path":"README.md","old_string":"foo","new_string":"bar"}`
+		return `<file_path>` + promptCDATA("README.md") + `</file_path><old_string>` + promptCDATA("foo") + `</old_string><new_string>` + promptCDATA("bar") + `</new_string>`
 	case "MultiEdit":
-		return `{"file_path":"README.md","edits":[{"old_string":"foo","new_string":"bar"}]}`
+		return `<file_path>` + promptCDATA("README.md") + `</file_path><edits><old_string>` + promptCDATA("foo") + `</old_string><new_string>` + promptCDATA("bar") + `</new_string></edits>`
 	default:
-		return `{"path":"output.txt","content":"Hello world"}`
+		return `<path>` + promptCDATA("output.txt") + `</path><content>` + promptCDATA("Hello world") + `</content>`
 	}
 }

 func exampleInteractiveParams(name string) string {
 	switch strings.TrimSpace(name) {
 	case "Task":
-		return `{"description":"Investigate flaky tests","prompt":"Run targeted tests and summarize failures"}`
+		return `<description>` + promptCDATA("Investigate flaky tests") + `</description><prompt>` + promptCDATA("Run targeted tests and summarize failures") + `</prompt>`
 	default:
-		return `{"question":"Which approach do you prefer?","follow_up":[{"text":"Option A"},{"text":"Option B"}]}`
+		return `<question>` + promptCDATA("Which approach do you prefer?") + `</question><follow_up><text>` + promptCDATA("Option A") + `</text></follow_up><follow_up><text>` + promptCDATA("Option B") + `</text></follow_up>`
 	}
 }
+
+func promptCDATA(text string) string {
+	if text == "" {
+		return ""
+	}
+	if strings.Contains(text, "]]>") {
+		return "<![CDATA[" + strings.ReplaceAll(text, "]]>", "]]]]><![CDATA[>") + "]]>"
+	}
+	return "<![CDATA[" + text + "]]>"
+}
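`promptCDATA` above replaces each `]]>` with `]]]]><![CDATA[>`, which ends one CDATA section after `]]` and opens a new one before `>`. A small check that this survives a standard XML parser is sketched below, using Go's `encoding/xml`. `wrapCDATA` and `roundTrip` are hypothetical names introduced only for this illustration; `wrapCDATA` repeats the same replacement as the function in the diff.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"strings"
)

// wrapCDATA repeats the replacement used by promptCDATA in the diff.
// The name is hypothetical; it exists only for this illustration.
func wrapCDATA(text string) string {
	if text == "" {
		return ""
	}
	return "<![CDATA[" + strings.ReplaceAll(text, "]]>", "]]]]><![CDATA[>") + "]]>"
}

// roundTrip wraps text, embeds it in a <v> element, and parses it back.
// Adjacent CDATA sections are concatenated into one character-data value.
func roundTrip(text string) (string, error) {
	var out struct {
		Value string `xml:",chardata"`
	}
	err := xml.Unmarshal([]byte("<v>"+wrapCDATA(text)+"</v>"), &out)
	return out.Value, err
}

func main() {
	for _, s := range []string{"a]]>b", "x < y && z", "#!/bin/bash\necho hi\n"} {
		got, err := roundTrip(s)
		fmt.Printf("%q -> %q (err=%v, equal=%v)\n", s, got, err, got == s)
	}
}
```

The split is needed because `]]>` cannot appear inside a single CDATA section; characters such as `<` and `&` need no treatment there.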
--- a/internal/toolcall/tool_prompt_test.go
+++ b/internal/toolcall/tool_prompt_test.go
@@ -10,7 +10,7 @@ func TestBuildToolCallInstructions_ExecCommandUsesCmdExample(t *testing.T) {
 	if !strings.Contains(out, `<tool_name>exec_command</tool_name>`) {
 		t.Fatalf("expected exec_command in examples, got: %s", out)
 	}
-	if !strings.Contains(out, `<parameters>{"cmd":"pwd"}</parameters>`) {
+	if !strings.Contains(out, `<parameters><cmd><![CDATA[pwd]]></cmd></parameters>`) {
 		t.Fatalf("expected cmd parameter example for exec_command, got: %s", out)
 	}
 }
@@ -20,7 +20,7 @@ func TestBuildToolCallInstructions_ExecuteCommandUsesCommandExample(t *testing.T
 	if !strings.Contains(out, `<tool_name>execute_command</tool_name>`) {
 		t.Fatalf("expected execute_command in examples, got: %s", out)
 	}
-	if !strings.Contains(out, `<parameters>{"command":"pwd"}</parameters>`) {
+	if !strings.Contains(out, `<parameters><command><![CDATA[pwd]]></command></parameters>`) {
 		t.Fatalf("expected command parameter example for execute_command, got: %s", out)
 	}
 }
--- a/internal/toolcall/toolcall_edge_test.go
+++ b/internal/toolcall/toolcall_edge_test.go
@@ -4,7 +4,7 @@ import (
 	"testing"
 )

-// ─── FormatOpenAIStreamToolCalls ─────────────────────────────────────
+// --- FormatOpenAIStreamToolCalls ---

 func TestFormatOpenAIStreamToolCalls(t *testing.T) {
 	formatted := FormatOpenAIStreamToolCalls([]ParsedToolCall{
@@ -22,15 +22,7 @@ func TestFormatOpenAIStreamToolCalls(t *testing.T) {
 	}
 }

-// ─── ParseToolCalls more edge cases ──────────────────────────────────
-
-func TestParseToolCallsNoToolNames(t *testing.T) {
-	text := `{"tool_calls":[{"name":"search","input":{"q":"go"}}]}`
-	calls := ParseToolCalls(text, nil)
-	if len(calls) != 1 {
-		t.Fatalf("expected 1 call with nil tool names, got %d", len(calls))
-	}
-}
+// --- ParseToolCalls edge cases ---

 func TestParseToolCallsEmptyText(t *testing.T) {
 	calls := ParseToolCalls("", []string{"search"})
@@ -38,55 +30,3 @@ func TestParseToolCallsEmptyText(t *testing.T) {
 		t.Fatalf("expected 0 calls for empty text, got %d", len(calls))
 	}
 }
-
-func TestParseToolCallsMultipleTools(t *testing.T) {
-	text := `{"tool_calls":[{"name":"search","input":{"q":"go"}},{"name":"get_weather","input":{"city":"beijing"}}]}`
-	calls := ParseToolCalls(text, []string{"search", "get_weather"})
-	if len(calls) != 2 {
-		t.Fatalf("expected 2 calls, got %d", len(calls))
-	}
-}
-
-func TestParseToolCallsInputAsString(t *testing.T) {
-	text := `{"tool_calls":[{"name":"search","input":"{\"q\":\"golang\"}"}]}`
-	calls := ParseToolCalls(text, []string{"search"})
-	if len(calls) != 1 {
-		t.Fatalf("expected 1 call, got %d", len(calls))
-	}
-	if calls[0].Input["q"] != "golang" {
-		t.Fatalf("expected parsed string input, got %#v", calls[0].Input)
-	}
-}
-
-func TestParseToolCallsWithFunctionWrapper(t *testing.T) {
-	text := `{"tool_calls":[{"function":{"name":"calc","arguments":{"x":1,"y":2}}}]}`
-	calls := ParseToolCalls(text, []string{"calc"})
-	if len(calls) != 1 {
-		t.Fatalf("expected 1 call, got %d", len(calls))
-	}
-	if calls[0].Name != "calc" {
-		t.Fatalf("expected calc, got %q", calls[0].Name)
-	}
-}
-
-func TestParseStandaloneToolCallsFencedCodeBlock(t *testing.T) {
-	fenced := "Here's an example:\n```json\n{\"tool_calls\":[{\"name\":\"search\",\"input\":{\"q\":\"go\"}}]}\n```\nDon't execute this."
-	calls := ParseStandaloneToolCalls(fenced, []string{"search"})
-	if len(calls) != 0 {
-		t.Fatalf("expected fenced code block to be ignored, got %d calls", len(calls))
-	}
-}
-
-// ─── looksLikeToolExampleContext ─────────────────────────────────────
-
-func TestLooksLikeToolExampleContextNone(t *testing.T) {
-	if looksLikeToolExampleContext("I will call the tool now") {
-		t.Fatal("expected false for non-example context")
-	}
-}
-
-func TestLooksLikeToolExampleContextFenced(t *testing.T) {
-	if !looksLikeToolExampleContext("```json") {
-		t.Fatal("expected true for fenced code block context")
-	}
-}
--- a/internal/toolcall/toolcalls_candidates.go
+++ b/internal/toolcall/toolcalls_candidates.go
@@ -1,205 +1,4 @@
 package toolcall

-import (
-	"regexp"
-	"strings"
-)
-
-var toolCallPattern = regexp.MustCompile(`\{\s*["']tool_calls["']\s*:\s*\[(.*?)\]\s*\}`)
-var fencedJSONPattern = regexp.MustCompile("(?s)```(?:json)?\\s*(.*?)\\s*```")
-var fencedCodeBlockPattern = regexp.MustCompile("(?s)```[\\s\\S]*?```")
-
-//nolint:unused // retained for future markup tool-call heuristics.
-var markupToolSyntaxPattern = regexp.MustCompile(`(?i)<(?:(?:[a-z0-9_:-]+:)?(?:tool_call|function_call|invoke)\b|(?:[a-z0-9_:-]+:)?function_calls\b|(?:[a-z0-9_:-]+:)?tool_use\b)`)
-
-func buildToolCallCandidates(text string) []string {
-	trimmed := strings.TrimSpace(text)
-	candidates := []string{trimmed}
-
-	// fenced code block candidates: ```json ... ```
-	for _, match := range fencedJSONPattern.FindAllStringSubmatch(trimmed, -1) {
-		if len(match) >= 2 {
-			candidates = append(candidates, strings.TrimSpace(match[1]))
-		}
-	}
-
-	// best-effort extraction around tool call keywords in mixed text payloads.
-	candidates = append(candidates, extractToolCallObjects(trimmed)...)
-
-	// best-effort object slice: from first '{' to last '}'
-	first := strings.Index(trimmed, "{")
-	last := strings.LastIndex(trimmed, "}")
-	if first >= 0 && last > first {
-		candidates = append(candidates, strings.TrimSpace(trimmed[first:last+1]))
-	}
-	// best-effort array slice: from first '[' to last ']'
-	firstArr := strings.Index(trimmed, "[")
-	lastArr := strings.LastIndex(trimmed, "]")
-	if firstArr >= 0 && lastArr > firstArr {
-		candidates = append(candidates, strings.TrimSpace(trimmed[firstArr:lastArr+1]))
-	}
-
-	// legacy regex extraction fallback
-	if m := toolCallPattern.FindStringSubmatch(trimmed); len(m) >= 2 {
-		candidates = append(candidates, "{"+`"tool_calls":[`+m[1]+"]}")
-	}
-
-	uniq := make([]string, 0, len(candidates))
-	seen := map[string]struct{}{}
-	for _, c := range candidates {
-		if c == "" {
-			continue
-		}
-		if _, ok := seen[c]; ok {
-			continue
-		}
-		seen[c] = struct{}{}
-		uniq = append(uniq, c)
-	}
-	return uniq
-}
-
-func extractToolCallObjects(text string) []string {
-	if text == "" {
-		return nil
-	}
-	lower := strings.ToLower(text)
-	out := []string{}
-	offset := 0
-	keywords := []string{"tool_calls", "\"function\"", "function.name:", "functioncall", "\"tool_use\""}
-	for {
-		bestIdx := -1
-		matchedKeyword := ""
-		for _, kw := range keywords {
-			idx := strings.Index(lower[offset:], kw)
-			if idx >= 0 {
-				absIdx := offset + idx
-				if bestIdx < 0 || absIdx < bestIdx {
-					bestIdx = absIdx
-					matchedKeyword = kw
-				}
-			}
-		}
-
-		if bestIdx < 0 {
-			break
-		}
-
-		idx := bestIdx
-		// Avoid backtracking too far to prevent OOM on malicious or very long strings
-		searchLimit := idx - 2000
-		if searchLimit < offset {
-			searchLimit = offset
-		}
-
-		start := strings.LastIndex(text[searchLimit:idx], "{")
-		if start >= 0 {
-			start += searchLimit
-		}
-
-		if start < 0 {
-			offset = idx + len(matchedKeyword)
-			continue
-		}
-
-		foundObj := false
-		for start >= searchLimit {
-			candidate, end, ok := extractJSONObject(text, start)
-			if ok {
-				// Move forward to avoid repeatedly matching the same object.
-				offset = end
-				out = append(out, strings.TrimSpace(candidate))
-				foundObj = true
-				break
-			}
-			// Try previous '{'
-			if start > searchLimit {
-				prevStart := strings.LastIndex(text[searchLimit:start], "{")
-				if prevStart >= 0 {
-					start = searchLimit + prevStart
-					continue
-				}
-			}
-			break
-		}
-
-		if !foundObj {
-			offset = idx + len(matchedKeyword)
-		}
-	}
-	return out
-}
-
-func extractJSONObject(text string, start int) (string, int, bool) {
-	if start < 0 || start >= len(text) || text[start] != '{' {
-		return "", 0, false
-	}
-	depth := 0
-	quote := byte(0)
-	escaped := false
-	// Limit scan length to avoid OOM on unclosed objects
-	maxLen := start + 50000
-	if maxLen > len(text) {
-		maxLen = len(text)
-	}
-	for i := start; i < maxLen; i++ {
-		ch := text[i]
-		if quote != 0 {
-			if escaped {
-				escaped = false
-				continue
-			}
-			if ch == '\\' {
-				escaped = true
-				continue
-			}
-			if ch == quote {
-				quote = 0
-			}
-			continue
-		}
-		if ch == '"' || ch == '\'' {
-			quote = ch
-			continue
-		}
-		if ch == '{' {
-			depth++
-			continue
-		}
-		if ch == '}' {
-			depth--
-			if depth == 0 {
-				return text[start : i+1], i + 1, true
-			}
-		}
-	}
-	return "", 0, false
-}
-
-func looksLikeToolExampleContext(text string) bool {
-	t := strings.ToLower(strings.TrimSpace(text))
-	if t == "" {
-		return false
-	}
-	return strings.Contains(t, "```")
-}
-
-func shouldSkipToolCallParsingForCodeFenceExample(text string) bool {
-	if !looksLikeToolCallSyntax(text) {
-		return false
-	}
-	stripped := strings.TrimSpace(stripFencedCodeBlocks(text))
-	return !looksLikeToolCallSyntax(stripped)
-}
-
-//nolint:unused // retained for future markup tool-call heuristics.
-func looksLikeMarkupToolSyntax(text string) bool {
-	return markupToolSyntaxPattern.MatchString(text)
-}
-
-func stripFencedCodeBlocks(text string) string {
-	if text == "" {
-		return ""
-	}
-	return fencedCodeBlockPattern.ReplaceAllString(text, " ")
-}
+// toolcalls_candidates.go is reserved for tool-call candidate helper logic.
+// It exists to satisfy the refactor line gate target list.
--- a/internal/toolcall/toolcalls_markup.go
+++ b/internal/toolcall/toolcalls_markup.go
@@ -22,6 +22,9 @@ var toolCallMarkupNamePatternByTag = map[string]*regexp.Regexp{
 	"name":     regexp.MustCompile(`(?is)<(?:[a-z0-9_:-]+:)?name\b[^>]*>(.*?)</(?:[a-z0-9_:-]+:)?name>`),
 	"function": regexp.MustCompile(`(?is)<(?:[a-z0-9_:-]+:)?function\b[^>]*>(.*?)</(?:[a-z0-9_:-]+:)?function>`),
 }
+
+// cdataPattern matches a standalone CDATA section.
+var cdataPattern = regexp.MustCompile(`(?is)^<!\[CDATA\[(.*?)]]>$`)
 var toolCallMarkupArgsTagNames = []string{"input", "arguments", "argument", "parameters", "parameter", "args", "params"}
 var toolCallMarkupArgsPatternByTag = map[string]*regexp.Regexp{
 	"input":      regexp.MustCompile(`(?is)<(?:[a-z0-9_:-]+:)?input\b[^>]*>(.*?)</(?:[a-z0-9_:-]+:)?input>`),
@@ -68,8 +71,31 @@ func parseMarkupToolCalls(text string) []ParsedToolCall {
 }

 func parseMarkupSingleToolCall(attrs string, inner string) ParsedToolCall {
-	if parsed := parseToolCallsPayload(inner); len(parsed) > 0 {
-		return parsed[0]
+	// Try parsing inner content as a JSON tool call object.
+	if raw := strings.TrimSpace(inner); raw != "" && strings.HasPrefix(raw, "{") {
+		var obj map[string]any
+		if err := json.Unmarshal([]byte(raw), &obj); err == nil {
+			name, _ := obj["name"].(string)
+			if name == "" {
+				if fn, ok := obj["function"].(map[string]any); ok {
+					name, _ = fn["name"].(string)
+				}
+			}
+			if name == "" {
+				if fc, ok := obj["functionCall"].(map[string]any); ok {
+					name, _ = fc["name"].(string)
+				}
+			}
+			if strings.TrimSpace(name) != "" {
+				input := parseToolCallInput(obj["input"])
+				if len(input) == 0 {
+					if args, ok := obj["arguments"]; ok {
+						input = parseToolCallInput(args)
+					}
+				}
+				return ParsedToolCall{Name: strings.TrimSpace(name), Input: input}
+			}
+		}
 	}

 	name := ""
@@ -93,17 +119,7 @@ func parseMarkupSingleToolCall(attrs string, inner string) ParsedToolCall {
 }

 func parseMarkupInput(raw string) map[string]any {
-	raw = strings.TrimSpace(html.UnescapeString(raw))
-	if raw == "" {
-		return map[string]any{}
-	}
-	if parsed := parseToolCallInput(raw); len(parsed) > 0 {
-		return parsed
-	}
-	if kv := parseMarkupKVObject(raw); len(kv) > 0 {
-		return kv
-	}
-	return map[string]any{"_raw": html.UnescapeString(stripTagText(raw))}
+	return parseStructuredToolCallInput(raw)
 }

 func parseMarkupKVObject(text string) map[string]any {
@@ -124,16 +140,11 @@ func parseMarkupKVObject(text string) map[string]any {
 		if !strings.EqualFold(key, endKey) {
 			continue
 		}
-		value := strings.TrimSpace(html.UnescapeString(stripTagText(m[2])))
-		if value == "" {
+		value := parseMarkupValue(m[2])
+		if value == nil {
 			continue
 		}
-		var jsonValue any
-		if json.Unmarshal([]byte(value), &jsonValue) == nil {
-			out[key] = jsonValue
-			continue
-		}
-		out[key] = value
+		appendMarkupValue(out, key, value)
 	}
 	if len(out) == 0 {
 		return nil
@@ -141,6 +152,67 @@ func parseMarkupKVObject(text string) map[string]any {
 	return out
 }

+func parseMarkupValue(inner string) any {
+	value := strings.TrimSpace(extractRawTagValue(inner))
+	if value == "" {
+		return ""
+	}
+
+	if strings.Contains(value, "<") && strings.Contains(value, ">") {
+		if parsed := parseStructuredToolCallInput(value); len(parsed) > 0 {
+			if len(parsed) == 1 {
+				if raw, ok := parsed["_raw"].(string); ok {
+					return raw
+				}
+			}
+			return parsed
+		}
+	}
+
+	var jsonValue any
+	if json.Unmarshal([]byte(value), &jsonValue) == nil {
+		return jsonValue
+	}
+	return value
+}
+
+func appendMarkupValue(out map[string]any, key string, value any) {
+	if existing, ok := out[key]; ok {
+		switch current := existing.(type) {
+		case []any:
+			out[key] = append(current, value)
+		default:
+			out[key] = []any{current, value}
+		}
+		return
+	}
+	out[key] = value
+}
+
+// extractRawTagValue returns the value carried by a tag's inner content.
+// If the content is a single CDATA section, the section body is returned
+// verbatim; otherwise the content is returned with HTML entities unescaped.
+// Tags are never stripped, so user content that contains markup is preserved.
+func extractRawTagValue(inner string) string {
+	trimmed := strings.TrimSpace(inner)
+	if trimmed == "" {
+		return ""
+	}
+
+	// 1. Check for CDATA - if present, it's the ultimate "safe" container.
+	if cdataMatches := cdataPattern.FindStringSubmatch(trimmed); len(cdataMatches) >= 2 {
+		return cdataMatches[1] // Return raw content between CDATA brackets
+	}
+
+	// 2. Without CDATA, unescape standard HTML entities (such as &lt; &gt; &amp;)
+	// and leave any tags in place: stripping them here could damage user content.
+	// Enclosing tags are expected to have been consumed by the outer matcher.
+
+	// Nested XML is not parsed at this level; callers that need structure
+	// (for example parseMarkupValue) handle it on the returned string.
+	// The untrimmed inner string is what gets unescaped and returned.
+	return html.UnescapeString(inner)
+}
+
 func stripTagText(text string) string {
 	return strings.TrimSpace(anyTagPattern.ReplaceAllString(text, ""))
 }
@@ -152,7 +224,7 @@ func findMarkupTagValue(text string, tagNames []string, patternByTag map[string]
 			continue
 		}
 		if m := pattern.FindStringSubmatch(text); len(m) >= 2 {
-			value := strings.TrimSpace(m[1])
+			value := extractRawTagValue(m[1])
 			if value != "" {
 				return value
 			}
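
The `appendMarkupValue` helper added above changes how repeated keys are handled: instead of the last value overwriting earlier ones, a second occurrence promotes the entry to a slice. The following is a minimal standalone sketch of that accumulation rule, not the repository's code; `appendValue` and the sample keys are hypothetical names chosen for illustration.

```go
package main

import "fmt"

// appendValue is a hypothetical stand-in for the accumulation rule in the diff:
// the first value for a key is stored as-is, a second value promotes the entry
// to a []any, and later values are appended to that slice.
func appendValue(out map[string]any, key string, value any) {
	if existing, ok := out[key]; ok {
		if list, isList := existing.([]any); isList {
			out[key] = append(list, value)
			return
		}
		out[key] = []any{existing, value}
		return
	}
	out[key] = value
}

func main() {
	out := map[string]any{}
	appendValue(out, "path", "script.sh")
	appendValue(out, "item", "first")
	appendValue(out, "item", "second")
	appendValue(out, "item", "third")
	fmt.Println(out["path"], out["item"])
}
```

One consequence of this rule worth keeping in mind when reviewing: if the first value for a key is already a `[]any` (for example, a JSON array decoded from the tag body), a repeated key is appended into that array rather than producing a list of lists.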
--- a/internal/toolcall/toolcalls_name_match.go
+++ b/internal/toolcall/toolcalls_name_match.go
@@ -1,35 +0,0 @@
-package toolcall
-
-import (
-	"regexp"
-	"strings"
-)
-
-//nolint:unused // retained for policy-level tool-name matching compatibility.
-var toolNameLoosePattern = regexp.MustCompile(`[^a-z0-9]+`)
-
-//nolint:unused // retained for policy-level tool-name matching compatibility.
-func resolveAllowedToolNameWithLooseMatch(name string, allowed map[string]struct{}, allowedCanonical map[string]string) string {
-	if _, ok := allowed[name]; ok {
-		return name
-	}
-	lower := strings.ToLower(strings.TrimSpace(name))
-	if canonical, ok := allowedCanonical[lower]; ok {
-		return canonical
-	}
-	if idx := strings.LastIndex(lower, "."); idx >= 0 && idx < len(lower)-1 {
-		if canonical, ok := allowedCanonical[lower[idx+1:]]; ok {
-			return canonical
-		}
-	}
-	loose := toolNameLoosePattern.ReplaceAllString(lower, "")
-	if loose == "" {
-		return ""
-	}
-	for candidateLower, canonical := range allowedCanonical {
-		if toolNameLoosePattern.ReplaceAllString(candidateLower, "") == loose {
-			return canonical
-		}
-	}
-	return ""
-}
--- a/internal/toolcall/toolcalls_parse.go
+++ b/internal/toolcall/toolcalls_parse.go
@@ -1,7 +1,6 @@
 package toolcall

 import (
-	"encoding/json"
 	"strings"
 )

@@ -22,126 +21,38 @@ func ParseToolCalls(text string, availableToolNames []string) []ParsedToolCall {
 }

 func ParseToolCallsDetailed(text string, availableToolNames []string) ToolCallParseResult {
-	result := ToolCallParseResult{}
-	if strings.TrimSpace(text) == "" {
-		return result
-	}
-	result.SawToolCallSyntax = looksLikeToolCallSyntax(text)
-	if shouldSkipToolCallParsingForCodeFenceExample(text) {
-		return result
-	}
-
-	candidates := buildToolCallCandidates(text)
-	for _, candidate := range candidates {
-		if !isLikelyJSONToolPayloadCandidate(candidate) {
-			continue
-		}
-		tc := parseToolCallsPayload(candidate)
-		if len(tc) == 0 {
-			continue
-		}
-		parsed := tc
-		calls, rejectedNames := filterToolCallsDetailed(parsed)
-		result.Calls = calls
-		result.RejectedToolNames = rejectedNames
-		result.RejectedByPolicy = len(rejectedNames) > 0 && len(calls) == 0
-		result.SawToolCallSyntax = true
-		return result
-	}
-	var parsed []ParsedToolCall
-	for _, candidate := range candidates {
-		tc := parseXMLToolCalls(candidate)
-		if len(tc) == 0 {
-			tc = parseMarkupToolCalls(candidate)
-		}
-		if len(tc) == 0 {
-			tc = parseToolCallsPayload(candidate)
-		}
-		if len(tc) == 0 {
-			tc = parseTextKVToolCalls(candidate)
-		}
-		if len(tc) > 0 {
-			parsed = tc
-			result.SawToolCallSyntax = true
-			break
-		}
-	}
-	if len(parsed) == 0 {
-		parsed = parseXMLToolCalls(text)
-		if len(parsed) == 0 {
-			parsed = parseTextKVToolCalls(text)
-			if len(parsed) == 0 {
-				return result
-			}
-		}
-		result.SawToolCallSyntax = true
-	}
-
-	calls, rejectedNames := filterToolCallsDetailed(parsed)
-	result.Calls = calls
-	result.RejectedToolNames = rejectedNames
-	result.RejectedByPolicy = len(rejectedNames) > 0 && len(calls) == 0
-	return result
+	return parseToolCallsDetailedXMLOnly(text)
 }
+
 func ParseStandaloneToolCalls(text string, availableToolNames []string) []ParsedToolCall {
 	return ParseStandaloneToolCallsDetailed(text, availableToolNames).Calls
 }

 func ParseStandaloneToolCallsDetailed(text string, availableToolNames []string) ToolCallParseResult {
+	return parseToolCallsDetailedXMLOnly(text)
+}
+
+func parseToolCallsDetailedXMLOnly(text string) ToolCallParseResult {
 	result := ToolCallParseResult{}
 	trimmed := strings.TrimSpace(text)
 	if trimmed == "" {
 		return result
 	}
 	result.SawToolCallSyntax = looksLikeToolCallSyntax(trimmed)
-	if shouldSkipToolCallParsingForCodeFenceExample(trimmed) {
+	trimmed = stripFencedCodeBlocks(trimmed)
+	trimmed = strings.TrimSpace(trimmed)
+	if trimmed == "" {
 		return result
 	}
-	candidates := buildToolCallCandidates(trimmed)
-	var parsed []ParsedToolCall
-	for _, candidate := range candidates {
-		if !isLikelyJSONToolPayloadCandidate(candidate) {
-			continue
-		}
-		parsed = parseToolCallsPayload(candidate)
-		if len(parsed) == 0 {
-			continue
-		}
-		result.SawToolCallSyntax = true
-		calls, rejectedNames := filterToolCallsDetailed(parsed)
-		result.Calls = calls
-		result.RejectedToolNames = rejectedNames
-		result.RejectedByPolicy = len(rejectedNames) > 0 && len(calls) == 0
-		return result
-	}
-	for _, candidate := range candidates {
-		candidate = strings.TrimSpace(candidate)
-		if candidate == "" {
-			continue
-		}
-		parsed = parseXMLToolCalls(candidate)
-		if len(parsed) == 0 {
-			parsed = parseMarkupToolCalls(candidate)
-		}
-		if len(parsed) == 0 {
-			parsed = parseToolCallsPayload(candidate)
-		}
-		if len(parsed) == 0 {
-			parsed = parseTextKVToolCalls(candidate)
-		}
-		if len(parsed) > 0 {
-			break
-		}
+
+	parsed := parseXMLToolCalls(trimmed)
+	if len(parsed) == 0 {
+		parsed = parseMarkupToolCalls(trimmed)
 	}
 	if len(parsed) == 0 {
-		parsed = parseXMLToolCalls(trimmed)
-		if len(parsed) == 0 {
-			parsed = parseTextKVToolCalls(trimmed)
-			if len(parsed) == 0 {
-				return result
-			}
-		}
+		return result
 	}
+
 	result.SawToolCallSyntax = true
 	calls, rejectedNames := filterToolCallsDetailed(parsed)
 	result.Calls = calls
@@ -164,70 +75,89 @@ func filterToolCallsDetailed(parsed []ParsedToolCall) ([]ParsedToolCall, []strin
 	return out, nil
 }

-//nolint:unused // retained for policy-level tool-name matching compatibility.
-func resolveAllowedToolName(name string, allowed map[string]struct{}, allowedCanonical map[string]string) string {
-	return resolveAllowedToolNameWithLooseMatch(name, allowed, allowedCanonical)
-}
-
-func parseToolCallsPayload(payload string) []ParsedToolCall {
-	var decoded any
-	if err := json.Unmarshal([]byte(payload), &decoded); err != nil {
-		// Repair invalid backslashes first: LLM output often has both bad escapes and loose JSON.
-		repaired := repairInvalidJSONBackslashes(payload)
-		// Try loose repair on top of that
-		repaired = RepairLooseJSON(repaired)
-		if err := json.Unmarshal([]byte(repaired), &decoded); err != nil {
-			return nil
-		}
-	}
-	switch v := decoded.(type) {
-	case map[string]any:
-		if tc, ok := v["tool_calls"]; ok {
-			if isLikelyChatMessageEnvelope(v) {
-				return nil
-			}
-			return parseToolCallList(tc)
-		}
-		if parsed, ok := parseToolCallItem(v); ok {
-			return []ParsedToolCall{parsed}
-		}
-	case []any:
-		return parseToolCallList(v)
-	}
-	return nil
-}
-
-func isLikelyChatMessageEnvelope(v map[string]any) bool {
-	if v == nil {
-		return false
-	}
-	if _, ok := v["tool_calls"]; !ok {
-		return false
-	}
-	if role, ok := v["role"].(string); ok {
-		switch strings.ToLower(strings.TrimSpace(role)) {
-		case "assistant", "tool", "user", "system":
-			return true
-		}
-	}
-	if _, ok := v["tool_call_id"]; ok {
-		return true
-	}
-	if _, ok := v["content"]; ok {
-		return true
-	}
-	return false
-}
-
 func looksLikeToolCallSyntax(text string) bool {
 	lower := strings.ToLower(text)
-	return strings.Contains(lower, "tool_calls") ||
-		strings.Contains(lower, "\"function\"") ||
-		strings.Contains(lower, "functioncall") ||
-		strings.Contains(lower, "\"tool_use\"") ||
+	return strings.Contains(lower, "<tool_calls") ||
 		strings.Contains(lower, "<tool_call") ||
+		strings.Contains(lower, "<function_calls") ||
 		strings.Contains(lower, "<function_call") ||
-		strings.Contains(lower, "<function_name") ||
 		strings.Contains(lower, "<invoke") ||
-		strings.Contains(lower, "function.name:")
+		strings.Contains(lower, "<tool_use") ||
+		strings.Contains(lower, "<attempt_completion") ||
+		strings.Contains(lower, "<ask_followup_question") ||
+		strings.Contains(lower, "<new_task") ||
+		strings.Contains(lower, "<result")
+}
+
+func stripFencedCodeBlocks(text string) string {
+	if text == "" {
+		return ""
+	}
+	var b strings.Builder
+	b.Grow(len(text))
+
+	lines := strings.SplitAfter(text, "\n")
+	inFence := false
+	fenceMarker := ""
+	for _, line := range lines {
+		trimmed := strings.TrimLeft(line, " \t")
+		if !inFence {
+			if marker, ok := parseFenceOpen(trimmed); ok {
+				inFence = true
+				fenceMarker = marker
+				continue
+			}
+			b.WriteString(line)
+			continue
+		}
+
+		if isFenceClose(trimmed, fenceMarker) {
+			inFence = false
+			fenceMarker = ""
+		}
+	}
+
+	if inFence {
+		return ""
+	}
+	return b.String()
+}
+
+func parseFenceOpen(line string) (string, bool) {
+	if len(line) < 3 {
+		return "", false
+	}
+	ch := line[0]
+	if ch != '`' && ch != '~' {
+		return "", false
+	}
+	count := countLeadingFenceChars(line, ch)
+	if count < 3 {
+		return "", false
+	}
+	return strings.Repeat(string(ch), count), true
+}
+
+func isFenceClose(line, marker string) bool {
+	if marker == "" {
+		return false
+	}
+	ch := marker[0]
+	if line == "" || line[0] != ch {
+		return false
+	}
+	count := countLeadingFenceChars(line, ch)
+	if count < len(marker) {
+		return false
+	}
+	rest := strings.TrimSpace(line[count:])
+	return rest == ""
+}
+
+func countLeadingFenceChars(line string, ch byte) int {
+	count := 0
+	for count < len(line) && line[count] == ch {
+		count++
+	}
+	return count
 }
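
The new `stripFencedCodeBlocks` path replaces the earlier "skip parsing if the text looks like a fenced example" check: fenced regions are removed line by line and only the remaining text is handed to the XML parser. The sketch below is a condensed approximation of that behavior for illustration; `stripFences` and `leadingRun` are hypothetical names, and the repository's version splits the logic across `parseFenceOpen`, `isFenceClose`, and `countLeadingFenceChars`.

```go
package main

import (
	"fmt"
	"strings"
)

// leadingRun counts how many times ch repeats at the start of line.
func leadingRun(line string, ch byte) int {
	n := 0
	for n < len(line) && line[n] == ch {
		n++
	}
	return n
}

// stripFences drops fenced blocks (``` or ~~~, three or more characters).
// A closing fence must use the same character, be at least as long as the
// opening fence, and carry no trailing text. An unclosed fence yields "".
func stripFences(text string) string {
	var b strings.Builder
	marker := "" // non-empty while inside a fence
	for _, line := range strings.SplitAfter(text, "\n") {
		trimmed := strings.TrimLeft(line, " \t")
		if marker == "" {
			if len(trimmed) >= 3 && (trimmed[0] == '`' || trimmed[0] == '~') {
				if n := leadingRun(trimmed, trimmed[0]); n >= 3 {
					marker = trimmed[:n]
					continue
				}
			}
			b.WriteString(line)
			continue
		}
		n := leadingRun(trimmed, marker[0])
		if n >= len(marker) && strings.TrimSpace(trimmed[n:]) == "" {
			marker = ""
		}
	}
	if marker != "" {
		return ""
	}
	return b.String()
}

func main() {
	fence := strings.Repeat("`", 3)
	text := "before\n" + fence + "xml\n<tool_call>demo</tool_call>\n" + fence + "\nafter\n"
	fmt.Printf("%q\n", stripFences(text))
	fmt.Printf("%q\n", stripFences("prose\n"+fence+"\n<tool_call>"))
}
```

Note the last case: when a fence is never closed, the whole input is discarded, including any prose that appeared before the fence. In the diff this means a tool call that precedes an unterminated code fence in the same text would not be parsed.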
--- a/internal/toolcall/toolcalls_parse_item.go
+++ b/internal/toolcall/toolcalls_parse_item.go
@@ -1,87 +0,0 @@
-package toolcall
-
-import "strings"
-
-func isLikelyJSONToolPayloadCandidate(candidate string) bool {
-	trimmed := strings.TrimSpace(candidate)
-	if trimmed == "" {
-		return false
-	}
-	if !strings.HasPrefix(trimmed, "{") && !strings.HasPrefix(trimmed, "[") {
-		return false
-	}
-	lower := strings.ToLower(trimmed)
-	return strings.Contains(lower, "tool_calls") ||
-		strings.Contains(lower, "\"function\"") ||
-		strings.Contains(lower, "functioncall") ||
-		strings.Contains(lower, "\"tool_use\"")
-}
-
-func parseToolCallList(v any) []ParsedToolCall {
-	items, ok := v.([]any)
-	if !ok {
-		return nil
-	}
-	out := make([]ParsedToolCall, 0, len(items))
-	for _, item := range items {
-		m, ok := item.(map[string]any)
-		if !ok {
-			continue
-		}
-		if tc, ok := parseToolCallItem(m); ok {
-			out = append(out, tc)
-		}
-	}
-	if len(out) == 0 {
-		return nil
-	}
-	return out
-}
-
-func parseToolCallItem(m map[string]any) (ParsedToolCall, bool) {
-	name, _ := m["name"].(string)
-	inputRaw, hasInput := m["input"]
-	if fnCall, ok := m["functionCall"].(map[string]any); ok {
-		if name == "" {
-			name, _ = fnCall["name"].(string)
-		}
-		if !hasInput {
-			if v, ok := fnCall["args"]; ok {
-				inputRaw = v
-				hasInput = true
-			}
-		}
-		if !hasInput {
-			if v, ok := fnCall["arguments"]; ok {
-				inputRaw = v
-				hasInput = true
-			}
-		}
-	}
-	if fn, ok := m["function"].(map[string]any); ok {
-		if name == "" {
-			name, _ = fn["name"].(string)
-		}
-		if !hasInput {
-			if v, ok := fn["arguments"]; ok {
-				inputRaw = v
-				hasInput = true
-			}
-		}
-	}
-	if !hasInput {
-		for _, key := range []string{"arguments", "args", "parameters", "params"} {
-			if v, ok := m[key]; ok {
-				inputRaw = v
-				break
-			}
-		}
-	}
-	if strings.TrimSpace(name) == "" {
-		return ParsedToolCall{}, false
-	}
-	return ParsedToolCall{
-		Name:  strings.TrimSpace(name),
-		Input: parseToolCallInput(inputRaw),
-	}, true
-}
--- a/internal/toolcall/toolcalls_parse_markup.go
+++ b/internal/toolcall/toolcalls_parse_markup.go
@@ -13,7 +13,6 @@ var functionCallPattern = regexp.MustCompile(`(?is)<function_call>\s*([^<]+?)\s*
 var functionParamPattern = regexp.MustCompile(`(?is)<function\s+parameter\s+name="([^"]+)"\s*>\s*(.*?)\s*</function\s+parameter>`)
 var antmlFunctionCallPattern = regexp.MustCompile(`(?is)<(?:[a-z0-9_]+:)?function_call[^>]*(?:name|function)="([^"]+)"[^>]*>\s*(.*?)\s*</(?:[a-z0-9_]+:)?function_call>`)
 var antmlArgumentPattern = regexp.MustCompile(`(?is)<(?:[a-z0-9_]+:)?argument\s+name="([^"]+)"\s*>\s*(.*?)\s*</(?:[a-z0-9_]+:)?argument>`)
-var antmlParametersPattern = regexp.MustCompile(`(?is)<(?:[a-z0-9_]+:)?parameters\s*>\s*(\{.*?\})\s*</(?:[a-z0-9_]+:)?parameters>`)
 var invokeCallPattern = regexp.MustCompile(`(?is)<invoke\s+name="([^"]+)"\s*>(.*?)</invoke>`)
 var invokeParamPattern = regexp.MustCompile(`(?is)<parameter\s+name="([^"]+)"\s*>\s*(.*?)\s*</parameter>`)
 var toolUseFunctionPattern = regexp.MustCompile(`(?is)<tool_use>\s*<function\s+name="([^"]+)"\s*>(.*?)</function>\s*</tool_use>`)
@@ -89,7 +88,6 @@ func parseSingleXMLToolCall(block string) (ParsedToolCall, bool) {
 	name := ""
 	params := extractXMLToolParamsByRegex(inner)
 	dec := xml.NewDecoder(strings.NewReader(block))
-	inParams := false
 	inTool := false
 	for {
 		tok, err := dec.Token()
@@ -108,56 +106,36 @@ func parseSingleXMLToolCall(block string) (ParsedToolCall, bool) {
 					}
 				}
 			case "parameters":
-				inParams = true
 				var node struct {
 					Inner string `xml:",innerxml"`
 				}
 				if err := dec.DecodeElement(&node, &t); err == nil {
 					inner := strings.TrimSpace(node.Inner)
 					if inner != "" {
-						unescapedInner := html.UnescapeString(inner)
-						if parsed := parseToolCallInput(unescapedInner); len(parsed) > 0 {
-							if len(parsed) == 1 {
-								if _, onlyRaw := parsed["_raw"]; onlyRaw {
-									if kv := parseMarkupKVObject(unescapedInner); len(kv) > 0 {
-										for k, vv := range kv {
-											params[k] = vv
-										}
-										break
-									}
-								}
-							}
+						extracted := extractRawTagValue(inner)
+						if parsed := parseStructuredToolCallInput(extracted); len(parsed) > 0 {
 							for k, vv := range parsed {
 								params[k] = vv
 							}
-						} else if kv := parseMarkupKVObject(unescapedInner); len(kv) > 0 {
-							for k, vv := range kv {
-								params[k] = vv
-							}
 						}
 					}
 				}
-				inParams = false
 			case "tool_name", "function_name", "name":
 				var v string
 				if err := dec.DecodeElement(&v, &t); err == nil && strings.TrimSpace(v) != "" {
-					if inParams {
-						params[t.Name.Local] = strings.TrimSpace(v)
-						break
-					}
 					name = strings.TrimSpace(v)
 				}
 			case "input", "arguments", "argument", "args", "params":
 				var v string
 				if err := dec.DecodeElement(&v, &t); err == nil && strings.TrimSpace(v) != "" {
-					if parsed := parseToolCallInput(strings.TrimSpace(v)); len(parsed) > 0 {
+					if parsed := parseStructuredToolCallInput(strings.TrimSpace(v)); len(parsed) > 0 {
 						for k, vv := range parsed {
 							params[k] = vv
 						}
 					}
 				}
 			default:
-				if inParams || inTool {
+				if inTool {
 					var v string
 					if err := dec.DecodeElement(&v, &t); err == nil {
 						params[t.Name.Local] = strings.TrimSpace(html.UnescapeString(v))
@@ -166,9 +144,6 @@ func parseSingleXMLToolCall(block string) (ParsedToolCall, bool) {
 			}
 		case xml.EndElement:
 			tag := strings.ToLower(t.Name.Local)
-			if tag == "parameters" {
-				inParams = false
-			}
 			if tag == "tool" {
 				inTool = false
 			}
@@ -243,9 +218,15 @@ func parseFunctionCallTagStyle(text string) (ParsedToolCall, bool) {
 			continue
 		}
 		key := strings.TrimSpace(pm[1])
-		val := strings.TrimSpace(html.UnescapeString(pm[2]))
+		val := extractRawTagValue(pm[2])
 		if key != "" {
-			input[key] = val
+			if parsed := parseStructuredToolCallInput(val); len(parsed) > 0 {
+				if isOnlyRawValue(parsed, val) {
+					input[key] = val
+				} else {
+					input[key] = parsed
+				}
+			}
 		}
 	}
 	return ParsedToolCall{Name: name, Input: input}, true
@@ -276,28 +257,36 @@ func parseSingleAntmlFunctionCallMatch(m []string) (ParsedToolCall, bool) {
 	if name == "" {
 		return ParsedToolCall{}, false
 	}
-	body := strings.TrimSpace(html.UnescapeString(m[2]))
+	body := strings.TrimSpace(m[2])
 	input := map[string]any{}
 	if strings.HasPrefix(body, "{") {
 		if err := json.Unmarshal([]byte(body), &input); err == nil {
 			return ParsedToolCall{Name: name, Input: input}, true
 		}
 	}
-	if pm := antmlParametersPattern.FindStringSubmatch(body); len(pm) >= 2 {
-		if err := json.Unmarshal([]byte(strings.TrimSpace(pm[1])), &input); err == nil {
-			return ParsedToolCall{Name: name, Input: input}, true
-		}
-	}
 	for _, am := range antmlArgumentPattern.FindAllStringSubmatch(body, -1) {
 		if len(am) < 3 {
 			continue
 		}
 		k := strings.TrimSpace(am[1])
-		v := strings.TrimSpace(html.UnescapeString(am[2]))
+		v := extractRawTagValue(am[2])
 		if k != "" {
 			input[k] = v
 		}
 	}
+	if len(input) > 0 {
+		return ParsedToolCall{Name: name, Input: input}, true
+	}
+	if paramsRaw := findMarkupTagValue(body, toolCallMarkupArgsTagNames, toolCallMarkupArgsPatternByTag); paramsRaw != "" {
+		if parsed := parseMarkupInput(paramsRaw); len(parsed) > 0 {
+			return ParsedToolCall{Name: name, Input: parsed}, true
+		}
+	}
+	if strings.Contains(body, "<") {
+		if parsed := parseStructuredToolCallInput(body); len(parsed) > 0 && !isOnlyRawValue(parsed, body) {
+			return ParsedToolCall{Name: name, Input: parsed}, true
+		}
+	}
 	return ParsedToolCall{Name: name, Input: input}, true
 }

@@ -316,9 +305,15 @@ func parseInvokeFunctionCallStyle(text string) (ParsedToolCall, bool) {
 			continue
 		}
 		k := strings.TrimSpace(pm[1])
-		v := strings.TrimSpace(html.UnescapeString(pm[2]))
+		v := extractRawTagValue(pm[2])
 		if k != "" {
-			input[k] = v
+			if parsed := parseStructuredToolCallInput(v); len(parsed) > 0 {
+				if isOnlyRawValue(parsed, v) {
+					input[k] = v
+				} else {
+					input[k] = parsed
+				}
+			}
 		}
 	}
 	if len(input) == 0 {
@@ -326,6 +321,8 @@ func parseInvokeFunctionCallStyle(text string) (ParsedToolCall, bool) {
 			input = parseMarkupInput(argsRaw)
 		} else if kv := parseMarkupKVObject(m[2]); len(kv) > 0 {
 			input = kv
+		} else if parsed := parseStructuredToolCallInput(m[2]); len(parsed) > 0 && !isOnlyRawValue(parsed, strings.TrimSpace(html.UnescapeString(m[2]))) {
+			input = parsed
 		}
 	}
 	return ParsedToolCall{Name: name, Input: input}, true
@@ -347,9 +344,15 @@ func parseToolUseFunctionStyle(text string) (ParsedToolCall, bool) {
 			continue
 		}
 		k := strings.TrimSpace(pm[1])
-		v := strings.TrimSpace(html.UnescapeString(pm[2]))
+		v := extractRawTagValue(pm[2])
 		if k != "" {
-			input[k] = v
+			if parsed := parseStructuredToolCallInput(v); len(parsed) > 0 {
+				if isOnlyRawValue(parsed, v) {
+					input[k] = v
+				} else {
+					input[k] = parsed
+				}
+			}
 		}
 	}
 	return ParsedToolCall{Name: name, Input: input}, true
@@ -364,13 +367,11 @@ func parseToolUseNameParametersStyle(text string) (ParsedToolCall, bool) {
 	if name == "" {
 		return ParsedToolCall{}, false
 	}
-	raw := strings.TrimSpace(html.UnescapeString(m[2]))
+	raw := strings.TrimSpace(m[2])
 	input := map[string]any{}
 	if raw != "" {
-		if parsed := parseToolCallInput(raw); len(parsed) > 0 {
+		if parsed := parseStructuredToolCallInput(raw); len(parsed) > 0 {
 			input = parsed
-		} else if kv := parseMarkupKVObject(raw); len(kv) > 0 {
-			input = kv
 		}
 	}
 	return ParsedToolCall{Name: name, Input: input}, true
@@ -385,13 +386,11 @@ func parseToolUseFunctionNameParametersStyle(text string) (ParsedToolCall, bool)
 	if name == "" {
 		return ParsedToolCall{}, false
 	}
-	raw := strings.TrimSpace(html.UnescapeString(m[2]))
+	raw := strings.TrimSpace(m[2])
 	input := map[string]any{}
 	if raw != "" {
-		if parsed := parseToolCallInput(raw); len(parsed) > 0 {
+		if parsed := parseStructuredToolCallInput(raw); len(parsed) > 0 {
 			input = parsed
-		} else if kv := parseMarkupKVObject(raw); len(kv) > 0 {
-			input = kv
 		}
 	}
 	return ParsedToolCall{Name: name, Input: input}, true
@@ -406,14 +405,14 @@ func parseToolUseToolNameBodyStyle(text string) (ParsedToolCall, bool) {
 	if name == "" {
 		return ParsedToolCall{}, false
 	}
-	body := strings.TrimSpace(html.UnescapeString(m[2]))
+	body := strings.TrimSpace(m[2])
 	input := map[string]any{}
 	if body != "" {
 		if kv := parseXMLChildKV(body); len(kv) > 0 {
 			input = kv
 		} else if kv := parseMarkupKVObject(body); len(kv) > 0 {
 			input = kv
-		} else if parsed := parseToolCallInput(body); len(parsed) > 0 {
+		} else if parsed := parseStructuredToolCallInput(body); len(parsed) > 0 {
 			input = parsed
 		}
 	}
@@ -425,32 +424,11 @@ func parseXMLChildKV(body string) map[string]any {
 	if trimmed == "" {
 		return nil
 	}
-	dec := xml.NewDecoder(strings.NewReader("<root>" + trimmed + "</root>"))
-	out := map[string]any{}
-	for {
-		tok, err := dec.Token()
-		if err != nil {
-			break
-		}
-		start, ok := tok.(xml.StartElement)
-		if !ok || strings.EqualFold(start.Name.Local, "root") {
-			continue
-		}
-		var v string
-		if err := dec.DecodeElement(&v, &start); err != nil {
-			continue
-		}
-		key := strings.TrimSpace(start.Name.Local)
-		val := strings.TrimSpace(v)
-		if key == "" || val == "" {
-			continue
-		}
-		out[key] = val
-	}
-	if len(out) == 0 {
+	parsed := parseStructuredToolCallInput(trimmed)
+	if len(parsed) == 0 {
 		return nil
 	}
-	return out
+	return parsed
 }

 func asString(v any) string {
--- a/internal/toolcall/toolcalls_test.go
+++ b/internal/toolcall/toolcalls_test.go
@@ -5,89 +5,6 @@ import (
 	"testing"
 )

-func TestParseToolCalls(t *testing.T) {
-	text := `prefix {"tool_calls":[{"name":"search","input":{"q":"golang"}}]} suffix`
-	calls := ParseToolCalls(text, []string{"search"})
-	if len(calls) != 1 {
-		t.Fatalf("expected 1 call, got %d", len(calls))
-	}
-	if calls[0].Name != "search" {
-		t.Fatalf("unexpected tool name: %s", calls[0].Name)
-	}
-	if calls[0].Input["q"] != "golang" {
-		t.Fatalf("unexpected args: %#v", calls[0].Input)
-	}
-}
-
-func TestParseToolCallsIgnoresFencedJSON(t *testing.T) {
-	text := "I will call tools now\n```json\n{\"tool_calls\":[{\"name\":\"search\",\"input\":{\"q\":\"news\"}}]}\n```"
-	calls := ParseToolCalls(text, []string{"search"})
-	if len(calls) != 0 {
-		t.Fatalf("expected fenced tool_call payload to be ignored, got %#v", calls)
-	}
-}
-
-func TestParseToolCallsWithFunctionArgumentsString(t *testing.T) {
-	text := `{"tool_calls":[{"function":{"name":"get_weather","arguments":"{\"city\":\"beijing\"}"}}]}`
-	calls := ParseToolCalls(text, []string{"get_weather"})
-	if len(calls) != 1 {
-		t.Fatalf("expected 1 call, got %d", len(calls))
-	}
-	if calls[0].Name != "get_weather" {
-		t.Fatalf("unexpected tool name: %s", calls[0].Name)
-	}
-	if calls[0].Input["city"] != "beijing" {
-		t.Fatalf("unexpected args: %#v", calls[0].Input)
-	}
-}
-
-func TestParseToolCallsKeepsUnknownToolName(t *testing.T) {
-	text := `{"tool_calls":[{"name":"unknown","input":{}}]}`
-	calls := ParseToolCalls(text, []string{"search"})
-	if len(calls) != 1 || calls[0].Name != "unknown" {
-		t.Fatalf("expected unknown tool to be preserved, got %#v", calls)
-	}
-}
-
-func TestParseToolCallsKeepsOriginalToolNameCase(t *testing.T) {
-	text := `{"tool_calls":[{"name":"Bash","input":{"command":"ls -al"}}]}`
-	calls := ParseToolCalls(text, []string{"bash"})
-	if len(calls) != 1 {
-		t.Fatalf("expected 1 call, got %#v", calls)
-	}
-	if calls[0].Name != "Bash" {
-		t.Fatalf("expected original tool name Bash, got %q", calls[0].Name)
-	}
-}
-
-func TestParseToolCallsDetailedDoesNotRejectByPolicy(t *testing.T) {
-	text := `{"tool_calls":[{"name":"unknown","input":{}}]}`
-	res := ParseToolCallsDetailed(text, []string{"search"})
-	if !res.SawToolCallSyntax {
-		t.Fatalf("expected SawToolCallSyntax=true, got %#v", res)
-	}
-	if res.RejectedByPolicy {
-		t.Fatalf("expected RejectedByPolicy=false, got %#v", res)
-	}
-	if len(res.Calls) != 1 || res.Calls[0].Name != "unknown" {
-		t.Fatalf("expected call to be preserved, got %#v", res.Calls)
-	}
-}
-
-func TestParseToolCallsDetailedAllowsWhenAllowListEmpty(t *testing.T) {
-	text := `{"tool_calls":[{"name":"search","input":{"q":"go"}}]}`
-	res := ParseToolCallsDetailed(text, nil)
-	if !res.SawToolCallSyntax {
-		t.Fatalf("expected SawToolCallSyntax=true, got %#v", res)
-	}
-	if res.RejectedByPolicy {
-		t.Fatalf("expected RejectedByPolicy=false, got %#v", res)
-	}
-	if len(res.Calls) != 1 || res.Calls[0].Name != "search" {
-		t.Fatalf("expected calls when allow-list is empty, got %#v", res.Calls)
-	}
-}
-
 func TestFormatOpenAIToolCalls(t *testing.T) {
 	formatted := FormatOpenAIToolCalls([]ParsedToolCall{{Name: "search", Input: map[string]any{"q": "x"}}})
 	if len(formatted) != 1 {
@@ -99,55 +16,6 @@ func TestFormatOpenAIToolCalls(t *testing.T) {
 	}
 }

-func TestParseStandaloneToolCallsSupportsMixedProsePayload(t *testing.T) {
-	mixed := `这里是示例：{"tool_calls":[{"name":"search","input":{"q":"go"}}]}`
-	if calls := ParseStandaloneToolCalls(mixed, []string{"search"}); len(calls) != 1 {
-		t.Fatalf("expected standalone parser to parse mixed prose payload, got %#v", calls)
-	}
-
-	standalone := `{"tool_calls":[{"name":"search","input":{"q":"go"}}]}`
-	calls := ParseStandaloneToolCalls(standalone, []string{"search"})
-	if len(calls) != 1 {
-		t.Fatalf("expected standalone parser to match, got %#v", calls)
-	}
-}
-
-func TestParseStandaloneToolCallsIgnoresFencedCodeBlock(t *testing.T) {
-	fenced := "```json\n{\"tool_calls\":[{\"name\":\"search\",\"input\":{\"q\":\"go\"}}]}\n```"
-	if calls := ParseStandaloneToolCalls(fenced, []string{"search"}); len(calls) != 0 {
-		t.Fatalf("expected fenced tool_call payload to be ignored, got %#v", calls)
-	}
-}
-
-func TestParseStandaloneToolCallsIgnoresChatTranscriptEnvelope(t *testing.T) {
-	transcript := `[{"role":"user","content":"请展示完整会话"},{"role":"assistant","content":null,"tool_calls":[{"function":{"name":"search","arguments":"{\"q\":\"go\"}"}}]}]`
-	if calls := ParseStandaloneToolCalls(transcript, []string{"search"}); len(calls) != 0 {
-		t.Fatalf("expected transcript envelope not to trigger tool call parse, got %#v", calls)
-	}
-}
-
-func TestParseToolCallsAllowsQualifiedToolName(t *testing.T) {
-	text := `{"tool_calls":[{"name":"mcp.search_web","input":{"q":"golang"}}]}`
-	calls := ParseToolCalls(text, []string{"search_web"})
-	if len(calls) != 1 {
-		t.Fatalf("expected 1 call, got %#v", calls)
-	}
-	if calls[0].Name != "mcp.search_web" {
-		t.Fatalf("expected original tool name mcp.search_web, got %q", calls[0].Name)
-	}
-}
-
-func TestParseToolCallsAllowsPunctuationVariantToolName(t *testing.T) {
-	text := `{"tool_calls":[{"name":"read-file","input":{"path":"README.md"}}]}`
-	calls := ParseToolCalls(text, []string{"read_file"})
-	if len(calls) != 1 {
-		t.Fatalf("expected 1 call, got %#v", calls)
-	}
-	if calls[0].Name != "read-file" {
-		t.Fatalf("expected original tool name read-file, got %q", calls[0].Name)
-	}
-}
-
 func TestParseToolCallsSupportsClaudeXMLToolCall(t *testing.T) {
 	text := `<tool_call><tool_name>Bash</tool_name><parameters><command>pwd</command><description>show cwd</description></parameters></tool_call>`
 	calls := ParseToolCalls(text, []string{"bash"})
@@ -162,6 +30,30 @@ func TestParseToolCallsSupportsClaudeXMLToolCall(t *testing.T) {
 	}
 }

+func TestParseToolCallsSupportsMultilineCDATAAndRepeatedXMLTags(t *testing.T) {
+	text := `<tool_call><tool_name>write_file</tool_name><parameters><path>script.sh</path><content><![CDATA[#!/bin/bash
+echo "hello"
+]]></content><item>first</item><item>second</item></parameters></tool_call>`
+	calls := ParseToolCalls(text, []string{"write_file"})
+	if len(calls) != 1 {
+		t.Fatalf("expected 1 call, got %#v", calls)
+	}
+	if calls[0].Name != "write_file" {
+		t.Fatalf("expected tool name write_file, got %q", calls[0].Name)
+	}
+	if calls[0].Input["path"] != "script.sh" {
+		t.Fatalf("expected path argument, got %#v", calls[0].Input)
+	}
+	content, _ := calls[0].Input["content"].(string)
+	if !strings.Contains(content, "#!/bin/bash") || !strings.Contains(content, "echo \"hello\"") {
+		t.Fatalf("expected multiline CDATA content to be preserved, got %#v", calls[0].Input["content"])
+	}
+	items, ok := calls[0].Input["item"].([]any)
+	if !ok || len(items) != 2 {
+		t.Fatalf("expected repeated XML tags to become an array, got %#v", calls[0].Input["item"])
+	}
+}
+
 func TestParseToolCallsSupportsCanonicalXMLParametersJSON(t *testing.T) {
 	text := `<tool_call><tool_name>get_weather</tool_name><parameters>{"city":"beijing","unit":"c"}</parameters></tool_call>`
 	calls := ParseToolCalls(text, []string{"get_weather"})
@@ -223,20 +115,6 @@ func TestParseToolCallsDoesNotTreatParameterNameTagAsToolName(t *testing.T) {
 	}
 }

-func TestParseToolCallsPrefersJSONPayloadOverIncidentalXMLInString(t *testing.T) {
-	text := `{"tool_calls":[{"name":"search","input":{"q":"latest <tool_call><tool_name>wrong</tool_name><parameters>{\"x\":1}</parameters></tool_call>"}}]}`
-	calls := ParseToolCallsDetailed(text, []string{"search"}).Calls
-	if len(calls) != 1 {
-		t.Fatalf("expected 1 call, got %#v", calls)
-	}
-	if calls[0].Name != "search" {
-		t.Fatalf("expected tool name search, got %q", calls[0].Name)
-	}
-	if calls[0].Input["q"] == nil {
-		t.Fatalf("expected q argument from json payload, got %#v", calls[0].Input)
-	}
-}
-
 func TestParseToolCallsDetailedMarksXMLToolCallSyntax(t *testing.T) {
 	text := `<tool_call><tool_name>Bash</tool_name><parameters><command>pwd</command></parameters></tool_call>`
 	res := ParseToolCallsDetailed(text, []string{"bash"})
@@ -318,34 +196,6 @@ func TestParseToolCallsSupportsInvokeFunctionCallStyle(t *testing.T) {
 	}
 }

-func TestParseToolCallsSupportsGeminiFunctionCallJSON(t *testing.T) {
-	text := `{"functionCall":{"name":"search_web","args":{"query":"latest"}}}`
-	calls := ParseToolCalls(text, []string{"search_web"})
-	if len(calls) != 1 {
-		t.Fatalf("expected 1 call, got %#v", calls)
-	}
-	if calls[0].Name != "search_web" {
-		t.Fatalf("expected search_web, got %q", calls[0].Name)
-	}
-	if calls[0].Input["query"] != "latest" {
-		t.Fatalf("expected query argument, got %#v", calls[0].Input)
-	}
-}
-
-func TestParseToolCallsSupportsClaudeToolUseJSON(t *testing.T) {
-	text := `{"type":"tool_use","name":"read_file","input":{"path":"README.md"}}`
-	calls := ParseToolCalls(text, []string{"read_file"})
-	if len(calls) != 1 {
-		t.Fatalf("expected 1 call, got %#v", calls)
-	}
-	if calls[0].Name != "read_file" {
-		t.Fatalf("expected read_file, got %q", calls[0].Name)
-	}
-	if calls[0].Input["path"] != "README.md" {
-		t.Fatalf("expected path argument, got %#v", calls[0].Input)
-	}
-}
-
 func TestParseToolCallsSupportsToolUseFunctionParameterStyle(t *testing.T) {
 	text := `<tool_use><function name="search_web"><parameter name="query">test</parameter></function></tool_use>`
 	calls := ParseToolCalls(text, []string{"search_web"})
@@ -495,104 +345,6 @@ func TestRepairLooseJSON(t *testing.T) {
 	}
 }

-func TestParseToolCallsWithUnquotedKeys(t *testing.T) {
-	text := `这里是列表：{tool_calls: [{"name": "todowrite", "input": {"todos": "test"}}]}`
-	availableTools := []string{"todowrite"}
-
-	parsed := ParseToolCalls(text, availableTools)
-	if len(parsed) != 1 {
-		t.Fatalf("expected 1 tool call, got %d", len(parsed))
-	}
-	if parsed[0].Name != "todowrite" {
-		t.Errorf("expected tool todowrite, got %s", parsed[0].Name)
-	}
-}
-
-func TestParseToolCallsWithInvalidBackslashes(t *testing.T) {
-	// DeepSeek sometimes outputs Windows paths with single backslashes in JSON strings
-	// Note: using raw string to simulate what AI actually sends in the stream
-	text := `好的，执行以下命令：{"name": "execute_command", "input": "{\"command\": \"cd D:\git_codes && dir\"}"}`
-	availableTools := []string{"execute_command"}
-
-	parsed := ParseToolCalls(text, availableTools)
-	// If standard JSON fails, buildToolCallCandidates should still extract the object,
-	// and parseToolCallsPayload should repair it.
-	if len(parsed) != 1 {
-		// If it still fails, let's see why
-		candidates := buildToolCallCandidates(text)
-		t.Logf("Candidates: %v", candidates)
-		t.Fatalf("expected 1 tool call, got %d", len(parsed))
-	}
-
-	cmd, ok := parsed[0].Input["command"].(string)
-	if !ok {
-		t.Fatalf("expected command string in input, got %v", parsed[0].Input)
-	}
-
-	expected := "cd D:\\git_codes && dir"
-	if cmd != expected {
-		t.Errorf("expected command %q, got %q", expected, cmd)
-	}
-}
-
-func TestParseToolCallsWithDeepSeekHallucination(t *testing.T) {
-	// Simulates typical DeepSeek hallucinated output: unquoted key names + nested JSON strings containing Windows paths + missing list brackets
-	text := `检测到实施意图——实现经典算法。需在misc/目录创建Python文件。
-关键约束:
-1. Windows UTF-8编码处理
-2. 必须用绝对路径导入
-3. 禁止write覆盖已有文件（misc/目录允许创建新文件）
-将任务分解并委托：
- 研究8皇后算法模式（并行探索）
- 实现带可视化输出的解决方案（unspecified-high）
-先创建todo列表追踪步骤。
-{tool_calls: [{"name": "todowrite", "input": {"todos": {"content": "研究8皇后问题算法模式（回溯法）和输出格式", "status": "pending", "priority": "high"}, {"content": "在misc/目录创建8皇后Python脚本，包含完整解决方案和可视化输出", "status": "pending", "priority": "high"}, {"content": "验证脚本正确性（运行测试）", "status": "pending", "priority": "medium"}}}]}`
-
-	availableTools := []string{"todowrite"}
-	parsed := ParseToolCalls(text, availableTools)
-
-	if len(parsed) != 1 {
-		cands := buildToolCallCandidates(text)
-		for i, c := range cands {
-			t.Logf("CAND %d: %s", i, c)
-			repaired := RepairLooseJSON(c)
-			t.Logf("  REPAIRED: %s", repaired)
-		}
-		t.Fatalf("expected 1 tool call, got %d. Candidates: %v", len(parsed), buildToolCallCandidates(text))
-	}
-
-	if parsed[0].Name != "todowrite" {
-		t.Errorf("expected tool name 'todowrite', got %q", parsed[0].Name)
-	}
-
-	todos, ok := parsed[0].Input["todos"].([]any)
-	if !ok {
-		t.Fatalf("expected 'todos' to be parsed as a list, got %T: %#v", parsed[0].Input["todos"], parsed[0].Input["todos"])
-	}
-	if len(todos) != 3 {
-		t.Errorf("expected 3 todo items, got %d", len(todos))
-	}
-}
-
-func TestParseToolCallsWithMixedWindowsPaths(t *testing.T) {
-	// A more complex case: unescaped backslashes inside a nested JSON string
-	text := `关键约束: 1. Windows UTF-8编码处理 2. 必须用绝对路径导入 D:\git_codes\ds2api\misc
-{tool_calls: [{"name": "write_file", "input": "{\"path\": \"D:\\git_codes\\ds2api\\misc\\queens.py\", \"content\": \"print('hello')\"}"}]}`
-
-	availableTools := []string{"write_file"}
-	parsed := ParseToolCalls(text, availableTools)
-
-	if len(parsed) != 1 {
-		t.Fatalf("expected 1 tool call from mixed text with paths, got %d", len(parsed))
-	}
-
-	path, _ := parsed[0].Input["path"].(string)
-	// In the parsed Go map, the backslashes should be restored
-	if !strings.Contains(path, "D:\\git_codes") && !strings.Contains(path, "D:/git_codes") {
-		t.Errorf("expected path to contain Windows style separators, got %q", path)
-	}
-}
-
 func TestParseToolCallInputRepairsControlCharsInPath(t *testing.T) {
 	in := `{"path":"D:\tmp\new\readme.txt","content":"line1\nline2"}`
 	parsed := parseToolCallInput(in)
@@ -704,14 +456,32 @@ func TestParseToolCallsUnescapesHTMLEntityArguments(t *testing.T) {
 	}
 }

-func TestParseToolCallsJSONPayloadKeepsLiteralEntities(t *testing.T) {
-	text := `{"tool_calls":[{"name":"bash","input":{"command":"echo &gt; literally"}}]}`
-	calls := ParseToolCalls(text, []string{"bash"})
-	if len(calls) != 1 {
-		t.Fatalf("expected one call, got %#v", calls)
-	}
-	cmd, _ := calls[0].Input["command"].(string)
-	if cmd != "echo &gt; literally" {
-		t.Fatalf("expected json payload to keep literal entities, got %q", cmd)
+func TestParseToolCallsIgnoresXMLInsideFencedCodeBlock(t *testing.T) {
+	text := "Here is an example:\n```xml\n<tool_call><tool_name>read_file</tool_name><parameters>{\"path\":\"README.md\"}</parameters></tool_call>\n```\nDo not execute it."
+	res := ParseToolCallsDetailed(text, []string{"read_file"})
+	if len(res.Calls) != 0 {
+		t.Fatalf("expected no parsed calls for fenced example, got %#v", res.Calls)
+	}
+}
+
+func TestParseToolCallsParsesOnlyNonFencedXMLToolCall(t *testing.T) {
+	text := "```xml\n<tool_call><tool_name>read_file</tool_name><parameters>{\"path\":\"README.md\"}</parameters></tool_call>\n```\n<tool_call><tool_name>search</tool_name><parameters>{\"q\":\"golang\"}</parameters></tool_call>"
+	res := ParseToolCallsDetailed(text, []string{"read_file", "search"})
+	if len(res.Calls) != 1 {
+		t.Fatalf("expected exactly one parsed call outside fence, got %#v", res.Calls)
+	}
+	if res.Calls[0].Name != "search" {
+		t.Fatalf("expected non-fenced tool call to be parsed, got %#v", res.Calls[0])
+	}
+}
+
+func TestParseToolCallsParsesAfterFourBacktickFence(t *testing.T) {
+	text := "````markdown\n```xml\n<tool_call><tool_name>read_file</tool_name><parameters>{\"path\":\"README.md\"}</parameters></tool_call>\n```\n````\n<tool_call><tool_name>search</tool_name><parameters>{\"q\":\"outside\"}</parameters></tool_call>"
+	res := ParseToolCallsDetailed(text, []string{"read_file", "search"})
+	if len(res.Calls) != 1 {
+		t.Fatalf("expected exactly one parsed call outside four-backtick fence, got %#v", res.Calls)
+	}
+	if res.Calls[0].Name != "search" {
+		t.Fatalf("expected non-fenced tool call to be parsed, got %#v", res.Calls[0])
 	}
 }
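
Review note: the three fence tests above expect XML tool calls inside Markdown code fences to be ignored, including when a four-backtick fence wraps a three-backtick one. The sketch below shows one way to get that behavior by dropping fenced lines before parsing. It is an illustration only, not the parser's actual code: `fenceLen` and `stripFencedBlocks` are hypothetical names, and the fence rules are a simplification (any leading spaces are tolerated, tilde fences are not handled).

```go
package main

import (
	"fmt"
	"strings"
)

// fenceLen returns the number of leading backticks on a line when there are
// at least three of them, and 0 otherwise. Hypothetical helper.
func fenceLen(line string) int {
	trimmed := strings.TrimLeft(line, " ")
	n := len(trimmed) - len(strings.TrimLeft(trimmed, "`"))
	if n < 3 {
		return 0
	}
	return n
}

// stripFencedBlocks removes every fenced code block (fence lines included)
// so that only text outside fences reaches the tool-call parser.
// A fence closes only on a backtick-only line at least as long as the opener,
// which is what lets a four-backtick fence contain a three-backtick one.
func stripFencedBlocks(text string) string {
	var kept []string
	open := 0 // backtick count of the open fence; 0 means "outside any fence"
	for _, line := range strings.Split(text, "\n") {
		n := fenceLen(line)
		switch {
		case open == 0 && n > 0:
			open = n // opening fence, possibly with an info string such as "xml"
		case open > 0 && n >= open && strings.Trim(strings.TrimSpace(line), "`") == "":
			open = 0 // closing fence
		case open == 0:
			kept = append(kept, line)
		}
		// Any other line is inside a fence and is dropped.
	}
	return strings.Join(kept, "\n")
}

func main() {
	text := "```xml\n<tool_call><tool_name>read_file</tool_name></tool_call>\n```\n<tool_call><tool_name>search</tool_name></tool_call>"
	// Prints: <tool_call><tool_name>search</tool_name></tool_call>
	fmt.Println(stripFencedBlocks(text))
}
```

With this approach an unterminated fence swallows the rest of the text, which errs on the side of not executing examples.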
--- a/internal/toolcall/toolcalls_textkv.go
+++ b/internal/toolcall/toolcalls_textkv.go
@@ -1,55 +0,0 @@
-package toolcall
-
-import (
-	"regexp"
-	"strings"
-)
-
-var textKVNamePattern = regexp.MustCompile(`(?is)function\.name:\s*([a-zA-Z0-9_\-.]+)`)
-
-func parseTextKVToolCalls(text string) []ParsedToolCall {
-	var out []ParsedToolCall
-	matches := textKVNamePattern.FindAllStringSubmatchIndex(text, -1)
-	if len(matches) == 0 {
-		return nil
-	}
-
-	for i, match := range matches {
-		name := text[match[2]:match[3]]
-
-		offset := match[1]
-		endSearch := len(text)
-		if i+1 < len(matches) {
-			endSearch = matches[i+1][0]
-		}
-
-		searchArea := text[offset:endSearch]
-		argIdx := strings.Index(searchArea, "function.arguments:")
-		if argIdx < 0 {
-			continue
-		}
-
-		startIdx := offset + argIdx + len("function.arguments:")
-		braceIdx := strings.IndexByte(text[startIdx:endSearch], '{')
-		if braceIdx < 0 {
-			continue
-		}
-
-		actualStart := startIdx + braceIdx
-		objJson, _, ok := extractJSONObject(text, actualStart)
-		if !ok {
-			continue
-		}
-
-		input := parseToolCallInput(objJson)
-		out = append(out, ParsedToolCall{
-			Name:  name,
-			Input: input,
-		})
-	}
-
-	if len(out) == 0 {
-		return nil
-	}
-	return out
-}
--- a/internal/toolcall/toolcalls_textkv_test.go
+++ b/internal/toolcall/toolcalls_textkv_test.go
@@ -1,61 +0,0 @@
-package toolcall
-
-import (
-	"testing"
-)
-
-func TestParseTextKVToolCalls_Basic(t *testing.T) {
-	text := `
-status: already_called
-origin: assistant
-not_user_input: true
-tool_call_id: call_3fcd15235eb94f7eae3a8de5a9cfa36b
-function.name: execute_command
-function.arguments: {"command":"cd scripts && python check_syntax.py example.py","cwd":null,"timeout":30}
-
-Some other text thinking...
-`
-	calls := ParseToolCalls(text, []string{"execute_command"})
-	if len(calls) != 1 {
-		t.Fatalf("expected 1 call, got %d", len(calls))
-	}
-	if calls[0].Name != "execute_command" {
-		t.Fatalf("unexpected name: %s", calls[0].Name)
-	}
-	if calls[0].Input["command"] != "cd scripts && python check_syntax.py example.py" {
-		t.Fatalf("unexpected command arg: %v", calls[0].Input["command"])
-	}
-}
-
-func TestParseTextKVToolCalls_Multiple(t *testing.T) {
-	text := `
-function.name: read_file
-function.arguments: {
-	"path": "abc.txt"
-}
-
-function.name: bash
-function.arguments: {"command": "ls"}
-`
-	calls := ParseToolCalls(text, []string{"read_file", "bash"})
-	if len(calls) != 2 {
-		t.Fatalf("expected 2 calls, got %d", len(calls))
-	}
-	if calls[0].Name != "read_file" {
-		t.Fatalf("unexpected 1st name: %s", calls[0].Name)
-	}
-	if calls[1].Name != "bash" {
-		t.Fatalf("unexpected 2nd name: %s", calls[1].Name)
-	}
-}
-
-func TestParseTextKVToolCalls_Standalone(t *testing.T) {
-	text := "function.name: read_file\nfunction.arguments: {\"path\":\"README.md\"}"
-	calls := ParseStandaloneToolCalls(text, []string{"read_file"})
-	if len(calls) != 1 {
-		t.Fatalf("expected 1 call, got %d", len(calls))
-	}
-	if calls[0].Name != "read_file" {
-		t.Fatalf("unexpected name: %s", calls[0].Name)
-	}
-}
--- a/internal/toolcall/toolcalls_xml.go
+++ b/internal/toolcall/toolcalls_xml.go
@@ -0,0 +1,158 @@
+package toolcall
+
+import (
+	"encoding/xml"
+	"html"
+	"strings"
+)
+
+func parseStructuredToolCallInput(raw string) map[string]any {
+	trimmed := strings.TrimSpace(raw)
+	if trimmed == "" {
+		return map[string]any{}
+	}
+
+	if strings.HasPrefix(trimmed, "<") {
+		if parsed, ok := parseXMLFragmentValue(trimmed); ok {
+			switch v := parsed.(type) {
+			case map[string]any:
+				if len(v) > 0 {
+					return v
+				}
+				return map[string]any{}
+			case string:
+				text := strings.TrimSpace(v)
+				if text == "" {
+					return map[string]any{}
+				}
+				if parsedText := parseToolCallInput(text); len(parsedText) > 0 {
+					if isOnlyRawValue(parsedText, text) {
+						// Plain text content, keep it as raw text.
+					} else {
+						return parsedText
+					}
+				}
+				return map[string]any{"_raw": v}
+			}
+		}
+
+		if kv := parseMarkupKVObject(trimmed); len(kv) > 0 {
+			return kv
+		}
+	}
+
+	if kv := parseMarkupKVObject(trimmed); len(kv) > 0 {
+		return kv
+	}
+
+	if parsed := parseToolCallInput(trimmed); len(parsed) > 0 {
+		return parsed
+	}
+
+	return map[string]any{"_raw": html.UnescapeString(trimmed)}
+}
+
+func parseXMLFragmentValue(raw string) (any, bool) {
+	trimmed := strings.TrimSpace(raw)
+	if trimmed == "" {
+		return "", true
+	}
+
+	dec := xml.NewDecoder(strings.NewReader("<root>" + trimmed + "</root>"))
+	tok, err := dec.Token()
+	if err != nil {
+		return nil, false
+	}
+	start, ok := tok.(xml.StartElement)
+	if !ok || !strings.EqualFold(start.Name.Local, "root") {
+		return nil, false
+	}
+
+	value, err := parseXMLNodeValue(dec, start)
+	if err != nil {
+		return nil, false
+	}
+	return value, true
+}
+
+func parseXMLNodeValue(dec *xml.Decoder, start xml.StartElement) (any, error) {
+	children := map[string]any{}
+	var text strings.Builder
+	hasChild := false
+
+	for {
+		tok, err := dec.Token()
+		if err != nil {
+			return nil, err
+		}
+		switch t := tok.(type) {
+		case xml.CharData:
+			s := string([]byte(t))
+			if hasChild && strings.TrimSpace(s) == "" {
+				continue
+			}
+			text.WriteString(s)
+		case xml.StartElement:
+			if !hasChild && strings.TrimSpace(text.String()) == "" {
+				text.Reset()
+			}
+			hasChild = true
+			child, err := parseXMLNodeValue(dec, t)
+			if err != nil {
+				return nil, err
+			}
+			appendXMLChildValue(children, t.Name.Local, child)
+		case xml.EndElement:
+			if t.Name.Local != start.Name.Local {
+				return nil, errXMLMismatch(start.Name.Local, t.Name.Local)
+			}
+			if len(children) == 0 {
+				return text.String(), nil
+			}
+			if txt := text.String(); strings.TrimSpace(txt) != "" {
+				children["_text"] = txt
+			}
+			return children, nil
+		}
+	}
+}
+
+func appendXMLChildValue(dst map[string]any, key string, value any) {
+	if key == "" {
+		return
+	}
+	if existing, ok := dst[key]; ok {
+		switch current := existing.(type) {
+		case []any:
+			dst[key] = append(current, value)
+		default:
+			dst[key] = []any{current, value}
+		}
+		return
+	}
+	dst[key] = value
+}
+
+func isOnlyRawValue(m map[string]any, raw string) bool {
+	if len(m) != 1 {
+		return false
+	}
+	v, ok := m["_raw"].(string)
+	if !ok {
+		return false
+	}
+	return strings.TrimSpace(v) == strings.TrimSpace(raw)
+}
+
+type xmlMismatchError struct {
+	want string
+	got  string
+}
+
+func (e xmlMismatchError) Error() string {
+	return "mismatched xml end tag: want " + e.want + ", got " + e.got
+}
+
+func errXMLMismatch(want, got string) error {
+	return xmlMismatchError{want: want, got: got}
+}
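
Review note: as a reading aid for the new XML parameter parsing, the sketch below shows the two behaviors it relies on from `encoding/xml`: CDATA sections arrive as ordinary `xml.CharData` tokens, and repeated sibling tags can be collected into a slice. It is a flattened, one-level illustration under assumed inputs, not the code added in this PR. `decodeParams` is a hypothetical name, and nested elements are ignored here.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"io"
	"strings"
)

// decodeParams reads the direct children of an XML fragment into a map.
// A tag that appears more than once is collected into a []any.
// Hypothetical helper: only one level of nesting is handled.
func decodeParams(fragment string) (map[string]any, error) {
	// Wrap in a synthetic root so a fragment with several top-level tags parses.
	dec := xml.NewDecoder(strings.NewReader("<root>" + fragment + "</root>"))
	out := map[string]any{}
	depth := 0
	var name string
	var text strings.Builder
	for {
		tok, err := dec.Token()
		if err == io.EOF {
			return out, nil
		}
		if err != nil {
			return nil, err
		}
		switch t := tok.(type) {
		case xml.StartElement:
			depth++
			if depth == 2 { // a direct child of the synthetic root
				name = t.Name.Local
				text.Reset()
			}
		case xml.CharData:
			if depth == 2 {
				text.Write(t) // CDATA content is delivered here as plain text
			}
		case xml.EndElement:
			if depth == 2 {
				value := text.String()
				switch cur := out[name].(type) {
				case nil:
					out[name] = value
				case []any:
					out[name] = append(cur, value)
				default:
					out[name] = []any{cur, value}
				}
			}
			depth--
		}
	}
}

func main() {
	params, err := decodeParams(`<command><![CDATA[echo "a < b" && ls]]></command><path>a.txt</path><path>b.txt</path>`)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Printf("%q\n", params["command"])
	fmt.Printf("%q\n", params["path"])
}
```

The CDATA wrapper is what allows `<` and `&&` in the command to pass through the strict decoder without escaping.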
--- a/internal/util/messages_test.go
+++ b/internal/util/messages_test.go
@@ -12,7 +12,7 @@ func TestMessagesPrepareBasic(t *testing.T) {
 	if got == "" {
 		t.Fatal("expected non-empty prompt")
 	}
-	if got != "<｜User｜>\nHello<｜end▁of▁sentence｜>" {
+	if got != "<｜begin▁of▁sentence｜><｜User｜>Hello<｜Assistant｜>" {
 		t.Fatalf("unexpected prompt: %q", got)
 	}
 }
@@ -26,16 +26,19 @@ func TestMessagesPrepareRoles(t *testing.T) {
 		{"role": "user", "content": "How are you"},
 	}
 	got := MessagesPrepare(messages)
-	if !contains(got, "<｜System｜>\nYou are helper<｜end▁of▁instructions｜>\n\n<｜User｜>\nHi<｜end▁of▁sentence｜>") {
+	if !contains(got, "<｜System｜>You are helper<｜end▁of▁instructions｜><｜User｜>Hi") {
 		t.Fatalf("expected system/user separation in %q", got)
 	}
-	if !contains(got, "<｜User｜>\nHi<｜end▁of▁sentence｜>\n\n<｜Assistant｜>\nHello<｜end▁of▁sentence｜>") {
+	if !contains(got, "<｜begin▁of▁sentence｜>") {
+		t.Fatalf("expected begin marker in %q", got)
+	}
+	if !contains(got, "<｜User｜>Hi<｜Assistant｜>Hello<｜end▁of▁sentence｜>") {
 		t.Fatalf("expected user/assistant separation in %q", got)
 	}
-	if !contains(got, "<｜Assistant｜>\nHello<｜end▁of▁sentence｜>\n\n<｜Tool｜>\nSearch results<｜end▁of▁toolresults｜>") {
+	if !contains(got, "<｜Assistant｜>Hello<｜end▁of▁sentence｜><｜Tool｜>Search results<｜end▁of▁toolresults｜>") {
 		t.Fatalf("expected assistant/tool separation in %q", got)
 	}
-	if !contains(got, "<｜Tool｜>\nSearch results<｜end▁of▁toolresults｜>\n\n<｜User｜>\nHow are you<｜end▁of▁sentence｜>") {
+	if !contains(got, "<｜Tool｜>Search results<｜end▁of▁toolresults｜><｜User｜>How are you") {
 		t.Fatalf("expected tool/user separation in %q", got)
 	}
 	if !contains(got, "<｜Assistant｜>") {
@@ -74,7 +77,7 @@ func TestMessagesPrepareArrayTextVariants(t *testing.T) {
 		},
 	}
 	got := MessagesPrepare(messages)
-	if got != "<｜User｜>\nline1\nline2<｜end▁of▁sentence｜>" {
+	if got != "<｜begin▁of▁sentence｜><｜User｜>line1\nline2<｜Assistant｜>" {
 		t.Fatalf("unexpected content from text variants: %q", got)
 	}
 }
--- a/internal/util/render_test.go
+++ b/internal/util/render_test.go
@@ -2,36 +2,6 @@ package util

 import "testing"

-func TestBuildOpenAIChatCompletionWithToolCalls(t *testing.T) {
-	out := BuildOpenAIChatCompletion(
-		"cid1",
-		"deepseek-chat",
-		"prompt",
-		"",
-		`{"tool_calls":[{"name":"search","input":{"q":"go"}}]}`,
-		[]string{"search"},
-	)
-	if out["object"] != "chat.completion" {
-		t.Fatalf("unexpected object: %#v", out["object"])
-	}
-	choices, _ := out["choices"].([]map[string]any)
-	if len(choices) == 0 {
-		// json-like map from generic marshalling may be []any in some paths
-		rawChoices, _ := out["choices"].([]any)
-		if len(rawChoices) == 0 {
-			t.Fatalf("expected choices")
-		}
-		c0, _ := rawChoices[0].(map[string]any)
-		if c0["finish_reason"] != "tool_calls" {
-			t.Fatalf("expected finish_reason=tool_calls, got %#v", c0["finish_reason"])
-		}
-		return
-	}
-	if choices[0]["finish_reason"] != "tool_calls" {
-		t.Fatalf("expected finish_reason=tool_calls, got %#v", choices[0]["finish_reason"])
-	}
-}
-
 func TestBuildOpenAIResponseObjectWithText(t *testing.T) {
 	out := BuildOpenAIResponseObject(
 		"resp_1",
@@ -53,42 +23,3 @@ func TestBuildOpenAIResponseObjectWithText(t *testing.T) {
 		t.Fatalf("expected first output type message, got %#v", first["type"])
 	}
 }
-
-func TestBuildOpenAIResponseObjectToolCallsHidesRawOutputText(t *testing.T) {
-	out := BuildOpenAIResponseObject(
-		"resp_2",
-		"gpt-4o",
-		"prompt",
-		"",
-		`{"tool_calls":[{"name":"search","input":{"q":"go"}}]}`,
-		[]string{"search"},
-	)
-	if out["output_text"] != "" {
-		t.Fatalf("expected empty output_text for tool_calls, got %#v", out["output_text"])
-	}
-	output, _ := out["output"].([]any)
-	if len(output) == 0 {
-		t.Fatalf("expected output entries")
-	}
-	first, _ := output[0].(map[string]any)
-	if first["type"] != "tool_calls" {
-		t.Fatalf("expected first output type tool_calls, got %#v", first["type"])
-	}
-}
-
-func TestBuildClaudeMessageResponseToolUse(t *testing.T) {
-	out := BuildClaudeMessageResponse(
-		"msg_1",
-		"claude-sonnet-4-5",
-		[]any{map[string]any{"role": "user", "content": "hi"}},
-		"",
-		`{"tool_calls":[{"name":"search","input":{"q":"go"}}]}`,
-		[]string{"search"},
-	)
-	if out["type"] != "message" {
-		t.Fatalf("unexpected type: %#v", out["type"])
-	}
-	if out["stop_reason"] != "tool_use" {
-		t.Fatalf("expected stop_reason=tool_use, got %#v", out["stop_reason"])
-	}
-}
--- a/internal/util/standard_request.go
+++ b/internal/util/standard_request.go
@@ -14,6 +14,7 @@ type StandardRequest struct {
 	Stream         bool
 	Thinking       bool
 	Search         bool
+	RefFileIDs     []string
 	PassThrough    map[string]any
 }

@@ -61,12 +62,19 @@ func (r StandardRequest) CompletionPayload(sessionID string) map[string]any {
 	if resolvedType, ok := config.GetModelType(modelID); ok {
 		modelType = resolvedType
 	}
+	refFileIDs := make([]any, 0, len(r.RefFileIDs))
+	for _, fileID := range r.RefFileIDs {
+		if fileID == "" {
+			continue
+		}
+		refFileIDs = append(refFileIDs, fileID)
+	}
 	payload := map[string]any{
 		"chat_session_id":   sessionID,
 		"model_type":        modelType,
 		"parent_message_id": nil,
 		"prompt":            r.FinalPrompt,
-		"ref_file_ids":      []any{},
+		"ref_file_ids":      refFileIDs,
 		"thinking_enabled":  r.Thinking,
 		"search_enabled":    r.Search,
 	}
--- a/internal/util/standard_request_test.go
+++ b/internal/util/standard_request_test.go
@@ -22,6 +22,7 @@ func TestStandardRequestCompletionPayloadSetsModelTypeFromResolvedModel(t *testi
 				FinalPrompt:   "hello",
 				Thinking:      tc.thinking,
 				Search:        tc.search,
+				RefFileIDs:    []string{"file-a", "file-b"},
 				PassThrough: map[string]any{
 					"temperature": 0.3,
 				},
@@ -44,6 +45,13 @@ func TestStandardRequestCompletionPayloadSetsModelTypeFromResolvedModel(t *testi
 			if got := payload["temperature"]; got != 0.3 {
 				t.Fatalf("expected passthrough temperature, got %#v", got)
 			}
+			refFileIDs, ok := payload["ref_file_ids"].([]any)
+			if !ok {
+				t.Fatalf("expected ref_file_ids slice, got %#v", payload["ref_file_ids"])
+			}
+			if len(refFileIDs) != 2 || refFileIDs[0] != "file-a" || refFileIDs[1] != "file-b" {
+				t.Fatalf("unexpected ref_file_ids: %#v", refFileIDs)
+			}
 		})
 	}
 }
--- a/internal/util/util_edge_test.go
+++ b/internal/util/util_edge_test.go
@@ -162,7 +162,7 @@ func TestMessagesPrepareMergesConsecutiveSameRole(t *testing.T) {
 		{"role": "user", "content": "World"},
 	}
 	got := MessagesPrepare(messages)
-	if !strings.HasPrefix(got, "<｜User｜>") {
+	if !strings.HasPrefix(got, "<｜begin▁of▁sentence｜>") {
 		t.Fatalf("expected user marker at the start, got %q", got)
 	}
 	if !strings.Contains(got, "Hello") || !strings.Contains(got, "World") {
@@ -173,8 +173,10 @@ func TestMessagesPrepareMergesConsecutiveSameRole(t *testing.T) {
 	if count != 1 {
 		t.Fatalf("expected one User marker for the merged pair, got %d occurrences", count)
 	}
-	if count := strings.Count(got, "<｜end▁of▁sentence｜>"); count != 1 {
-		t.Fatalf("expected one sentence terminator for the merged pair, got %d occurrences", count)
+	// User messages no longer have end_of_sentence markers in the official format.
+	// The merged pair should have zero end_of_sentence markers (user turn only).
+	if count := strings.Count(got, "<｜end▁of▁sentence｜>"); count != 0 {
+		t.Fatalf("expected zero sentence terminators for user-only merge, got %d occurrences", count)
 	}
 }

@@ -190,12 +192,15 @@ func TestMessagesPrepareAssistantMarkers(t *testing.T) {
 	if !strings.Contains(got, "<｜end▁of▁sentence｜>") {
 		t.Fatalf("expected end of sentence marker, got %q", got)
 	}
-	if strings.Count(got, "<｜end▁of▁sentence｜>") != 2 {
-		t.Fatalf("expected both turns to be terminated, got %q", got)
+	if strings.Count(got, "<｜end▁of▁sentence｜>") != 1 {
+		t.Fatalf("expected one end_of_sentence (assistant only), got %q", got)
 	}
-	if !strings.Contains(got, "<｜Assistant｜>\nHello!<｜end▁of▁sentence｜>") {
+	if !strings.Contains(got, "<｜Assistant｜>Hello!<｜end▁of▁sentence｜>") {
 		t.Fatalf("expected assistant EOS suffix, got %q", got)
 	}
+	if strings.Contains(got, "<think>") || strings.Contains(got, "</think>") {
+		t.Fatalf("did not expect think tags in prompt, got %q", got)
+	}
 	if strings.Contains(got, "<system_instructions>") {
 		t.Fatalf("did not expect legacy system marker, got %q", got)
 	}
--- a/opencode.json.example
+++ b/opencode.json.example
@@ -1,28 +0,0 @@
-{
-  "$schema": "https://opencode.ai/config.json",
-  "provider": {
-    "ds2api": {
-      "npm": "@ai-sdk/openai-compatible",
-      "name": "DS2API",
-      "options": {
-        "baseURL": "http://localhost:5001/v1",
-        "apiKey": "your-api-key"
-      },
-      "models": {
-        "gpt-4o": {
-          "name": "GPT-4o (aliased to deepseek-chat)"
-        },
-        "gpt-5-codex": {
-          "name": "GPT-5 Codex (aliased to deepseek-reasoner)"
-        },
-        "deepseek-chat": {
-          "name": "DeepSeek Chat (DS2API)"
-        },
-        "deepseek-reasoner": {
-          "name": "DeepSeek Reasoner (DS2API)"
-        }
-      }
-    }
-  },
-  "model": "ds2api/gpt-5-codex"
-}
--- a/tests/compat/expected/toolcalls_allowlist_empty.json
+++ b/tests/compat/expected/toolcalls_allowlist_empty.json
@@ -1,13 +0,0 @@
-{
-  "calls": [
-    {
-      "name": "unknown_tool",
-      "input": {
-        "x": 1
-      }
-    }
-  ],
-  "sawToolCallSyntax": true,
-  "rejectedByPolicy": false,
-  "rejectedToolNames": []
-}
--- a/tests/compat/expected/toolcalls_case_insensitive_canonical.json
+++ b/tests/compat/expected/toolcalls_case_insensitive_canonical.json
@@ -1,13 +0,0 @@
-{
-  "calls": [
-    {
-      "name": "Read_File",
-      "input": {
-        "path": "README.MD"
-      }
-    }
-  ],
-  "sawToolCallSyntax": true,
-  "rejectedByPolicy": false,
-  "rejectedToolNames": []
-}
--- a/tests/compat/expected/toolcalls_fenced_json.json
+++ b/tests/compat/expected/toolcalls_fenced_json.json
@@ -1,6 +0,0 @@
-{
-  "calls": [],
-  "sawToolCallSyntax": true,
-  "rejectedByPolicy": false,
-  "rejectedToolNames": []
-}
--- a/tests/compat/expected/toolcalls_json_payload_with_incidental_xml_text.json
+++ b/tests/compat/expected/toolcalls_json_payload_with_incidental_xml_text.json
@@ -1,13 +0,0 @@
-{
-  "calls": [
-    {
-      "name": "search",
-      "input": {
-        "q": "latest <tool_call><tool_name>wrong</tool_name><parameters>{\"x\":1}</parameters></tool_call>"
-      }
-    }
-  ],
-  "sawToolCallSyntax": true,
-  "rejectedByPolicy": false,
-  "rejectedToolNames": []
-}
Author	SHA1	Message	Date
CJACK.	2ba8b143d0	Merge pull request #268 from CJackHwang/dev chore: bump version to 3.5.0	2026-04-20 01:26:09 +08:00
CJACK	70603a5a90	chore: bump version to 3.5.0	2026-04-20 01:24:31 +08:00
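CJACK.	fa51aafdc5	Merge pull request #265 from CJackHwang/dev refactor: enforce mandatory CDATA wrapping for all string parameters in tool call XML output ## XML tool-call parsing and code-fence awareness - `f313d00` – Added code-fence awareness to the tool sieve so that XML tool calls inside code blocks are not misdetected, and refined the prompt instructions. - `69eb711` – Extended the tool-call parser to support variable-length Markdown fences (e.g. ```` ``` ````). - `5b7cdaa` – Fixed parsing of XML tool calls wrapped in Markdown fences. ## System prompt and thinking mode - `10d681f` – When thinking mode is enabled, inject conversation-continuity and reasoning instructions into the system prompt. ## API and documentation alignment - `08f32c4` – Aligned the API docs with the currently implemented routes. - `0e7f5cd` – Synced the tool-calling semantics docs with the current implementation. - `2c08375` – Refreshed the model alias examples to the current defaults. ## Code quality and enforced conventions - `69b7bc0` – Require CDATA wrapping for all string parameters in tool-call XML output, improving robustness. ## Merged pull requests - `12256ce` – Merged PR #266: documentation accuracy update. - `fa38934` – Merged PR #267: XML parsing fix.	2026-04-20 01:20:11 +08:00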
CJACK	10d681ffe7	feat: inject conversation continuity and reasoning instructions into system prompt when thinking is enabled	2026-04-20 00:47:05 +08:00
CJACK	f313d0068f	feat: implement code fence awareness in tool sieve to prevent false-positive XML tool detection inside code blocks and refine prompt instructions.	2026-04-20 00:13:14 +08:00
CJACK.	12256ceb24	Merge pull request #266 from CJackHwang/codex/update-documentation-for-accuracy docs: align API docs with implemented routes and limits	2026-04-19 23:43:28 +08:00
CJACK.	2c08375b49	docs: refresh model alias examples to current defaults	2026-04-19 23:42:34 +08:00
CJACK.	fa38934114	Merge pull request #267 from CJackHwang/codex/fix-xml-parsing-for-tool-calls Strip fenced code blocks before XML tool-call parsing to avoid executing examples	2026-04-19 23:40:26 +08:00
CJACK.	69eb71159d	Handle variable-length markdown fences in toolcall parser	2026-04-19 23:37:31 +08:00
CJACK.	0e7f5cdc86	docs: sync tool-calling semantics with current implementation	2026-04-19 23:12:13 +08:00
CJACK.	5b7cdaa729	Fix XML tool-call parsing for fenced markdown examples	2026-04-19 23:11:24 +08:00
CJACK.	08f32c4c40	docs: align API docs with implemented routes	2026-04-19 21:04:06 +08:00
CJACK	69b7bc0c1a	refactor: enforce mandatory CDATA wrapping for all string parameters in tool call XML output	2026-04-19 20:11:53 +08:00
CJACK	0f2b5fee23	refactor: enhance XML tool call parsing to support nested structures, CDATA, and repeated tags	2026-04-19 19:58:45 +08:00
CJACK	26d195f2a6	refactor: update tool call format to prefer XML-style parameters with CDATA support for robust content handling	2026-04-19 18:51:25 +08:00
CJACK	790a8ca980	refactor: implement robust think tag stripping and CDATA handling for SSE stream parsing	2026-04-19 18:35:56 +08:00
CJACK	a1ce954ad5	refactor: implement auto-transition from thinking to text content upon detecting </think> tags and remove unused helper functions	2026-04-19 18:05:38 +08:00
CJACK	6688e0ba35	refactor: remove unnecessary whitespace and end-of-sentence markers to align with official DeepSeek chat template encoding	2026-04-19 17:47:45 +08:00
CJACK	c945f49fc4	refactor: remove JSON-based tool call parsing from sieve and delete associated compatibility tests	2026-04-19 13:39:47 +08:00
CJACK	0c644d1f4d	refactor: remove legacy function call support and simplify tool sieve logic	2026-04-19 04:38:48 +08:00
CJACK.	146d59e7bf	Merge pull request #263 from utafrali/fix/issue-261-bug fix: Increase account page size limit to 5000	2026-04-18 12:49:08 +08:00
ugurtafrali	daf3307b88	fix: Increase account page size limit to 5000	2026-04-18 05:16:57 +03:00
CJACK.	67501cf4d2	Merge pull request #256 from CJackHwang/dev Attachment upload to DeepSeek for all models and all channels; compatibility across all endpoints still to be tested	2026-04-13 04:00:49 +08:00
CJACK	25234af301	feat: enforce request body size limits and restrict inline file count to prevent resource exhaustion	2026-04-13 03:55:14 +08:00
CJACK	2aee80d0d3	fix: update URL decoding method and refine file ID extraction logic to exclude text-based inputs	2026-04-13 03:49:06 +08:00
CJACK	ab9f3cc417	refactor: remove unused leakedDanglingThinkOpenPattern regex from output sanitizer	2026-04-13 03:40:20 +08:00
CJACK	c92ed8d3c3	refactor: rename apiTester testSuccess key to requestSuccess and update localization files	2026-04-13 03:24:39 +08:00
CJACK	d78789a66e	feat: implement error handling for empty upstream responses in chat streams and update UI to display stream-level errors	2026-04-13 03:22:38 +08:00
CJACK	acb110865f	feat: implement cross-account validation and improved error handling for file attachments in API tester	2026-04-13 03:15:12 +08:00
CJACK	ffca8be597	feat: implement file readiness polling and add IsImage field to upload results	2026-04-13 02:55:45 +08:00
CJACK	7ef6a7d11f	feat: update to v3.4.0 and redesign model selection UI with a dropdown and descriptive panel	2026-04-13 02:27:12 +08:00
CJACK	d53a2ea7d2	refactor: remove unused purpose parameter from upload and upstream empty output handlers	2026-04-13 01:59:51 +08:00
CJACK	daa636e040	refactor: handle upstream thinking-only responses as errors and sanitize dangling think tags in output	2026-04-13 01:55:14 +08:00
CJACK	aa41bae044	feat: add file attachment support to chat interface and API requests	2026-04-13 00:04:38 +08:00
CJACK	2027c7cd77	fix: add JSON headers to DeepSeek requests and prevent string content from being parsed as file IDs in OpenAI adapter	2026-04-12 23:49:56 +08:00
CJACK	0591128601	refactor: fix file handling error suppression, optimize hash calculation, and update API documentation with additional models	2026-04-12 23:35:57 +08:00
CJACK	caafdedb00	feat: implement OpenAI-compatible file upload and reference handling for DeepSeek API	2026-04-12 23:30:22 +08:00
CJACK	0a23c77ff7	feat: add sanitization for think tags and BOS markers in leaked output and update golang.org/x/net dependency	2026-04-12 17:43:57 +08:00
CJACK.	d759804c33	Merge pull request #255 from CJackHwang/codex/refactor-prompt-concatenation-using-tokenizer feat(prompt): tokenizer-style prompt stitching with thinking-prefix support	2026-04-12 17:14:48 +08:00
CJACK.	433a3a877d	feat(prompt): align DeepSeek prompt assembly with tokenizer-style turns	2026-04-12 13:59:42 +08:00
CJACK.	792e295512	Merge pull request #254 from CJackHwang/main Update VERSION	2026-04-08 20:24:03 +08:00
CJACK.	d053d9ad04	Update VERSION	2026-04-08 20:22:55 +08:00
CJACK.	04e025c5e1	Update README.MD	2026-04-08 18:21:09 +08:00
@@ -1 +1 @@
 .2.0
 .5.0