Merge pull request #256 from CJackHwang/dev

Attachment upload for all models and all channels; DeepSeek feature compatibility across all endpoints is pending testing
feat: enforce request body size limits and restrict inline file count to prevent resource exhaustion
2026-05-05 00:45:29 +08:00 · 2026-04-13 04:00:49 +08:00 · 2026-04-13 03:55:14 +08:00 · 2026-04-13 03:49:06 +08:00 · 2026-04-13 03:40:20 +08:00 · 2026-04-13 03:24:39 +08:00
208 changed files with 74782 additions and 1673 deletions
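The commit title above mentions enforcing request body size limits and restricting the inline file count. The sketch below shows one way such a guard could look in Go, using the standard `http.MaxBytesReader`. It is not taken from this PR: the limits `maxBodyBytes` and `maxInlineFiles`, the `inline_files` field, and the `chatRequest` type are all hypothetical.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// Hypothetical limits; the values used by the PR are not shown in this excerpt.
const (
	maxBodyBytes   = 1 << 20 // 1 MiB
	maxInlineFiles = 4
)

// chatRequest is an illustrative payload shape, not the project's real type.
type chatRequest struct {
	InlineFiles []string `json:"inline_files"`
}

// decodeLimited caps the body at maxBodyBytes and rejects requests that carry
// more than maxInlineFiles inline files. It returns the HTTP status to use.
func decodeLimited(w http.ResponseWriter, r *http.Request) (*chatRequest, int, error) {
	r.Body = http.MaxBytesReader(w, r.Body, maxBodyBytes)
	var req chatRequest
	if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
		var tooBig *http.MaxBytesError
		if errors.As(err, &tooBig) {
			return nil, http.StatusRequestEntityTooLarge, err
		}
		return nil, http.StatusBadRequest, err
	}
	if len(req.InlineFiles) > maxInlineFiles {
		err := fmt.Errorf("too many inline files: %d > %d", len(req.InlineFiles), maxInlineFiles)
		return nil, http.StatusBadRequest, err
	}
	return &req, http.StatusOK, nil
}

// statusFor runs decodeLimited against an in-memory request and reports the status.
func statusFor(body string) int {
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodPost, "/v1/chat/completions", strings.NewReader(body))
	_, status, _ := decodeLimited(rec, req)
	return status
}

func main() {
	fmt.Println(statusFor(`{"inline_files":["a","b"]}`))
	fmt.Println(statusFor(`{"inline_files":["a","b","c","d","e"]}`))
}
```

Rejecting oversized bodies before decoding keeps memory use bounded per request, which is the resource-exhaustion concern the commit title names.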
--- a/.github/workflows/quality-gates.yml
+++ b/.github/workflows/quality-gates.yml
@@ -28,6 +28,16 @@ jobs:
          cache: "npm"
          cache-dependency-path: webui/package-lock.json
      - name: Setup golangci-lint
        uses: golangci/golangci-lint-action@v8
        with:
          version: v2.11.4
          install-mode: binary
          verify: true
      - name: Go Format & Lint Gates
        run: ./scripts/lint.sh
      - name: Refactor Line Gate
        run: ./tests/scripts/check-refactor-line-gate.sh
--- a/.github/workflows/release-artifacts.yml
+++ b/.github/workflows/release-artifacts.yml
@@ -79,7 +79,7 @@ jobs:
            CGO_ENABLED=0 GOOS="${GOOS}" GOARCH="${GOARCH}" \
              go build -trimpath -ldflags="-s -w -X ds2api/internal/version.BuildVersion=${BUILD_VERSION}" -o "${STAGE}/${BIN}" ./cmd/ds2api
-            cp config.example.json .env.example internal/deepseek/assets/sha3_wasm_bg.7b9ca65ddd.wasm LICENSE README.MD README.en.md "${STAGE}/"
+            cp config.example.json .env.example LICENSE README.MD README.en.md "${STAGE}/"
            cp -R static/admin "${STAGE}/static/admin"
            if [ "${GOOS}" = "windows" ]; then
--- a/.gitignore
+++ b/.gitignore
@@ -59,3 +59,6 @@ Thumbs.db
 # Claude Code
 .claude/
 CLAUDE.local.md
 # Local tool bootstrap cache
 .tmp/
--- a/.golangci.yml
+++ b/.golangci.yml
@@ -0,0 +1,73 @@
 version: "2"
 run:
  tests: true
 linters:
  default: standard
  enable:
    - errcheck
    - govet
    - ineffassign
    - staticcheck
    - unused
  settings:
    dupl:
      threshold: 100
    goconst:
      min-len: 2
      min-occurrences: 2
    gocritic:
      enabled-tags:
        - diagnostic
        - experimental
        - opinionated
        - performance
        - style
      disabled-checks:
        - wrapperFunc
        - rangeValCopy
        - hugeParam
    gocyclo:
      min-complexity: 15
    lll:
      line-length: 140
    misspell:
      locale: US
    nakedret:
      max-func-lines: 30
    prealloc:
      simple: true
      range-loops: true
      for-loops: false
  exclusions:
    generated: lax
    rules:
      - path: (.+)\.go$
        text: "ST1000: at least one file in a package should have a package comment"
    paths:
      - third_party$
      - builtin$
      - examples$
      - vendor$
      - webui/node_modules$
 issues:
  max-issues-per-linter: 0
  max-same-issues: 0
 formatters:
  enable:
    - gofmt
  settings:
    goimports:
      local-prefixes:
        - ds2api
  exclusions:
    generated: lax
    paths:
      - third_party$
      - builtin$
      - examples$
      - vendor$
      - webui/node_modules$
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -0,0 +1,23 @@
 # AGENTS.md
 These rules apply to all agent-made changes in this repository.
 ## PR Gate
 - Before opening or updating a PR, run the same local gates as `.github/workflows/quality-gates.yml`.
 - Required commands:
  - `./scripts/lint.sh`
  - `./tests/scripts/check-refactor-line-gate.sh`
  - `./tests/scripts/run-unit-all.sh`
  - `npm run build --prefix webui`
 ## Go Lint Rules
 - Run `gofmt -w` on every changed Go file before commit or push.
 - Do not ignore error returns from I/O-style cleanup calls such as `Close`, `Flush`, `Sync`, or similar methods.
 - If a cleanup error cannot be returned, log it explicitly.
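The two cleanup rules above can be sketched as follows. This is a minimal illustration, not code from the repository: `failingCloser`, `process`, `errDiskFull`, and `errWork` are hypothetical names. The `Close` error is returned through a named result when the work succeeded, and logged when an earlier error already takes priority.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"log"
)

// Illustrative errors used by the example only.
var (
	errDiskFull = errors.New("disk full")
	errWork     = errors.New("work failed")
)

// failingCloser simulates a resource whose Close may report an error.
type failingCloser struct{ closeErr error }

func (f failingCloser) Close() error { return f.closeErr }

// process runs work and always closes c. A Close error is returned when the
// work itself succeeded; otherwise it is logged so it is never silently lost.
func process(c io.Closer, work func() error) (err error) {
	defer func() {
		cerr := c.Close()
		if cerr == nil {
			return
		}
		if err == nil {
			err = fmt.Errorf("close: %w", cerr)
			return
		}
		// The primary error wins, so the cleanup error is logged explicitly.
		log.Printf("close after failure: %v", cerr)
	}()
	return work()
}

func main() {
	err := process(failingCloser{errDiskFull}, func() error { return nil })
	fmt.Println(err)
}
```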
 ## Change Scope
 - Keep changes additive and tightly scoped to the requested feature or bugfix.
 - Do not mix unrelated refactors into feature PRs unless they are required to make the change pass gates.
--- a/API.en.md
+++ b/API.en.md
@@ -4,6 +4,8 @@ Language: [中文](API.md) | [English](API.en.md)
 This document describes the actual behavior of the current Go codebase.
 Docs: [Overview](README.en.md) / [Architecture](docs/ARCHITECTURE.en.md) / [Deployment](docs/DEPLOY.en.md) / [Testing](docs/TESTING.md)
 ---
 ## Table of Contents
@@ -138,6 +140,9 @@ Gemini-compatible clients can also send `x-goog-api-key`, `?key=`, or `?api_key=
 | POST | `/admin/accounts/sessions/delete-all` | Admin | Delete all sessions for one account |
 | POST | `/admin/import` | Admin | Batch import keys/accounts |
 | POST | `/admin/test` | Admin | Test API through service |
 | POST | `/admin/dev/raw-samples/capture` | Admin | Fire one request and persist it as a raw sample |
 | GET | `/admin/dev/raw-samples/query` | Admin | Search current in-memory capture chains by prompt keyword |
 | POST | `/admin/dev/raw-samples/save` | Admin | Persist a selected in-memory capture chain as a raw sample |
 | POST | `/admin/vercel/sync` | Admin | Sync config to Vercel |
 | GET | `/admin/vercel/status` | Admin | Vercel sync status |
 | POST | `/admin/vercel/status` | Admin | Vercel sync status / draft compare |
@@ -168,7 +173,7 @@ Gemini-compatible clients can also send `x-goog-api-key`, `?key=`, or `?api_key=
 ### `GET /v1/models`
-No auth required. Returns supported models.
+No auth required. Returns the currently supported DeepSeek native model list.
 **Response**:
@@ -179,11 +184,21 @@ No auth required. Returns supported models.
    {"id": "deepseek-chat", "object": "model", "created": 1677610602, "owned_by": "deepseek", "permission": []},
    {"id": "deepseek-reasoner", "object": "model", "created": 1677610602, "owned_by": "deepseek", "permission": []},
    {"id": "deepseek-chat-search", "object": "model", "created": 1677610602, "owned_by": "deepseek", "permission": []},
-    {"id": "deepseek-reasoner-search", "object": "model", "created": 1677610602, "owned_by": "deepseek", "permission": []}
+    {"id": "deepseek-reasoner-search", "object": "model", "created": 1677610602, "owned_by": "deepseek", "permission": []},
    {"id": "deepseek-expert-chat", "object": "model", "created": 1677610602, "owned_by": "deepseek", "permission": []},
    {"id": "deepseek-expert-reasoner", "object": "model", "created": 1677610602, "owned_by": "deepseek", "permission": []},
    {"id": "deepseek-expert-chat-search", "object": "model", "created": 1677610602, "owned_by": "deepseek", "permission": []},
    {"id": "deepseek-expert-reasoner-search", "object": "model", "created": 1677610602, "owned_by": "deepseek", "permission": []},
    {"id": "deepseek-vision-chat", "object": "model", "created": 1677610602, "owned_by": "deepseek", "permission": []},
    {"id": "deepseek-vision-reasoner", "object": "model", "created": 1677610602, "owned_by": "deepseek", "permission": []},
    {"id": "deepseek-vision-chat-search", "object": "model", "created": 1677610602, "owned_by": "deepseek", "permission": []},
    {"id": "deepseek-vision-reasoner-search", "object": "model", "created": 1677610602, "owned_by": "deepseek", "permission": []}
  ]
 }
 ```
 > Note: `/v1/models` returns normalized DeepSeek native model IDs. Common aliases are accepted only as request input and are not expanded as separate items in this endpoint.
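The normalization described in this note can be sketched as a lookup that accepts either a native ID or an alias and always returns a native ID. This is an assumption-laden illustration, not the project's resolver: the two Claude entries follow the default mapping listed in README.MD, while the `gpt-4o` and `o3` targets, the trimming and lower-casing step, and the names `resolveModel`, `nativeModels`, and `aliases` are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// nativeModels holds a shortened subset of the IDs returned by /v1/models.
var nativeModels = map[string]bool{
	"deepseek-chat":            true,
	"deepseek-reasoner":        true,
	"deepseek-chat-search":     true,
	"deepseek-reasoner-search": true,
}

// aliases maps request-side names to native IDs.
var aliases = map[string]string{
	"claude-sonnet-4-5": "deepseek-chat",     // README default mapping
	"claude-opus-4-6":   "deepseek-reasoner", // README default mapping
	"gpt-4o":            "deepseek-chat",     // assumed target
	"o3":                "deepseek-reasoner", // assumed target
}

// resolveModel accepts wide input (native ID or alias) and returns a strict,
// normalized native ID. The boolean is false for unknown names.
func resolveModel(input string) (string, bool) {
	id := strings.ToLower(strings.TrimSpace(input)) // normalization is an assumption
	if nativeModels[id] {
		return id, true
	}
	if target, ok := aliases[id]; ok {
		return target, true
	}
	return "", false
}

func main() {
	fmt.Println(resolveModel("claude-sonnet-4-5"))
	fmt.Println(resolveModel("unknown-model"))
}
```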
 ### Model Alias Resolution
 For `chat` / `responses` / `embeddings`, DS2API follows a wide-input/strict-output policy:
@@ -206,7 +221,7 @@ Content-Type: application/json
 | Field | Type | Required | Notes |
 | --- | --- | --- | --- |
-| `model` | string | ✅ | DeepSeek native models + common aliases (`gpt-4o`, `gpt-5-codex`, `o3`, `claude-sonnet-4-5`, etc.) |
+| `model` | string | ✅ | DeepSeek native models + common aliases (`gpt-4o`, `gpt-5-codex`, `o3`, `claude-sonnet-4-5`, `gemini-2.5-pro`, etc.) |
 | `messages` | array | ✅ | OpenAI-style messages |
 | `stream` | boolean | ❌ | Default `false` |
 | `tools` | array | ❌ | Function calling schema |
@@ -264,6 +279,7 @@ data: [DONE]
 - `deepseek-reasoner` / `deepseek-reasoner-search` models emit `delta.reasoning_content`
 - Text emits `delta.content`
 - Last chunk includes `finish_reason` and `usage`
 - Token counting prefers pass-through from upstream DeepSeek SSE (`accumulated_token_usage` / `token_usage`), and only falls back to local estimation when upstream usage is absent
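The token-counting preference in the last bullet can be sketched as below. This is a minimal illustration under stated assumptions: the `Usage` type, `resolveUsage`, and the rough four-bytes-per-token `estimateTokens` heuristic are hypothetical stand-ins, since the actual local estimator is not shown in this diff.

```go
package main

import "fmt"

// Usage mirrors the OpenAI-style usage object (illustrative type).
type Usage struct {
	PromptTokens     int
	CompletionTokens int
	TotalTokens      int
}

// estimateTokens is a crude stand-in: roughly one token per four bytes.
func estimateTokens(text string) int {
	if text == "" {
		return 0
	}
	return (len(text) + 3) / 4
}

// resolveUsage passes upstream usage through when it is present and only
// falls back to the local estimate when upstream usage is absent.
func resolveUsage(upstream *Usage, prompt, completion string) Usage {
	if upstream != nil && upstream.TotalTokens > 0 {
		return *upstream
	}
	p, c := estimateTokens(prompt), estimateTokens(completion)
	return Usage{PromptTokens: p, CompletionTokens: c, TotalTokens: p + c}
}

func main() {
	fmt.Println(resolveUsage(&Usage{10, 5, 15}, "ignored", "ignored"))
	fmt.Println(resolveUsage(nil, "abcdefgh", "abcd"))
}
```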
 #### Tool Calls
@@ -380,6 +396,7 @@ Business auth required. Returns OpenAI-compatible embeddings shape.
 ## Claude-Compatible API
 Besides `/anthropic/v1/*`, DS2API also supports shortcut paths: `/v1/messages`, `/messages`, `/v1/messages/count_tokens`, `/messages/count_tokens`.
 Implementation-wise this path is unified on the OpenAI Chat Completions parse-and-translate pipeline to avoid maintaining divergent parsing chains.
 ### `GET /anthropic/v1/models`
@@ -401,7 +418,7 @@ No auth required.
 }
 ```
-> Note: the example is partial; the real response includes historical Claude 1.x/2.x/3.x/4.x IDs and common aliases.
+> Note: the example is partial; besides the current primary aliases, the real response also includes Claude 4.x snapshots plus historical 3.x / 2.x / 1.x IDs and common aliases.
 ### `POST /anthropic/v1/messages`
@@ -514,6 +531,7 @@ Supported paths:
 - `/v1/models/{model}:streamGenerateContent` (compat path)
 Authentication is the same as other business routes (`Authorization: Bearer <token>` or `x-api-key`).
 Implementation-wise this path is unified on the OpenAI Chat Completions parse-and-translate pipeline to avoid maintaining divergent parsing chains.
 ### `POST /v1beta/models/{model}:generateContent`
@@ -532,6 +550,7 @@ Returns SSE (`text/event-stream`), each chunk as `data: <json>`:
 - regular text: incremental text chunks
 - `tools` mode: buffered and emitted as `functionCall` at finalize phase
 - final chunk: includes `finishReason: "STOP"` and `usageMetadata`
 - Token counting prefers pass-through from upstream DeepSeek SSE (`accumulated_token_usage` / `token_usage`), and only falls back to local estimation when upstream usage is absent
 ---
@@ -883,6 +902,74 @@ Test API availability through the service itself.
 }
 ```
 ### `POST /admin/dev/raw-samples/capture`
 Internally issues one `/v1/chat/completions` request through the service, then persists the request metadata and raw upstream SSE into `tests/raw_stream_samples/<sample-id>/`.
 Common request fields:
 | Field | Required | Default | Notes |
 | --- | --- | --- | --- |
 | `message` | No | `你好` | Convenience single-turn user message |
 | `messages` | No | Auto-derived from `message` | OpenAI-style message array |
 | `model` | No | `deepseek-chat` | Target model |
 | `stream` | No | `true` | Recommended to keep streaming enabled so raw SSE is recorded |
 | `api_key` | No | First configured key | Business API key to use |
 | `sample_id` | No | Auto-generated | Sample directory name |
 On success, the response headers include:
 - `X-Ds2-Sample-Id`
 - `X-Ds2-Sample-Dir`
 - `X-Ds2-Sample-Meta`
 - `X-Ds2-Sample-Upstream`
 If the request itself succeeds but the process did not record a new upstream capture, the endpoint returns:
 ```json
 {"detail":"no upstream capture was recorded"}
 ```
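A client call to this endpoint might look like the sketch below, which sends a subset of the documented fields and reads `X-Ds2-Sample-Id` from the response headers. Several parts are assumptions: the `Authorization: Bearer` scheme for admin auth, the `captureRequest` and `captureSample` names, and `fakeAdmin`, which is only a local stand-in server so the example runs without a real DS2API instance.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"log"
	"net/http"
	"net/http/httptest"
)

// captureRequest carries a subset of the documented request fields.
type captureRequest struct {
	Message  string `json:"message,omitempty"`
	Model    string `json:"model,omitempty"`
	Stream   bool   `json:"stream"`
	SampleID string `json:"sample_id,omitempty"`
}

// captureSample posts to the capture endpoint and returns the sample ID header.
func captureSample(baseURL, adminToken string, req captureRequest) (string, error) {
	body, err := json.Marshal(req)
	if err != nil {
		return "", err
	}
	url := baseURL + "/admin/dev/raw-samples/capture"
	httpReq, err := http.NewRequest(http.MethodPost, url, bytes.NewReader(body))
	if err != nil {
		return "", err
	}
	httpReq.Header.Set("Content-Type", "application/json")
	httpReq.Header.Set("Authorization", "Bearer "+adminToken) // assumed admin auth scheme
	resp, err := http.DefaultClient.Do(httpReq)
	if err != nil {
		return "", err
	}
	defer func() {
		if cerr := resp.Body.Close(); cerr != nil {
			log.Printf("close response body: %v", cerr)
		}
	}()
	id := resp.Header.Get("X-Ds2-Sample-Id")
	if id == "" {
		return "", fmt.Errorf("no sample recorded (status %d)", resp.StatusCode)
	}
	return id, nil
}

// fakeAdmin is a stand-in server that echoes sample_id back in the header.
func fakeAdmin() *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		var req captureRequest
		if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
			http.Error(w, err.Error(), http.StatusBadRequest)
			return
		}
		w.Header().Set("X-Ds2-Sample-Id", req.SampleID)
		w.WriteHeader(http.StatusOK)
	}))
}

func main() {
	srv := fakeAdmin()
	defer srv.Close()
	req := captureRequest{Message: "hello", Stream: true, SampleID: "demo-sample"}
	fmt.Println(captureSample(srv.URL, "admin-token", req))
}
```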
 ### `GET /admin/dev/raw-samples/query`
 Searches the current process's in-memory capture entries and groups `completion + continue` rounds by `chat_session_id`.
 **Query parameters**:
 | Param | Default | Notes |
 | --- | --- | --- |
 | `q` | empty | Fuzzy match against request/response text |
 | `limit` | `20` | Max number of chains returned |
 **Response fields** include:
 - `items[].chain_key`
 - `items[].capture_ids`
 - `items[].round_count`
 - `items[].initial_label`
 - `items[].request_preview`
 - `items[].response_preview`
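The grouping behind `chain_key`, `capture_ids`, and `round_count` can be sketched as follows. This is an illustration only: the `capture` and `chain` types and `groupChains` are hypothetical, and the `session:` key prefix is inferred from the `chain_key` example shown for the save endpoint rather than confirmed by the source.

```go
package main

import "fmt"

// capture is a hypothetical minimal view of one in-memory capture entry.
type capture struct {
	ID            string
	ChatSessionID string
}

// chain mirrors a few of the documented response fields.
type chain struct {
	ChainKey   string   `json:"chain_key"`
	CaptureIDs []string `json:"capture_ids"`
	RoundCount int      `json:"round_count"`
}

// groupChains merges captures that share a chat_session_id into one chain,
// keeps first-seen order, and returns at most limit chains.
func groupChains(entries []capture, limit int) []chain {
	index := map[string]int{}
	var chains []chain
	for _, e := range entries {
		key := "session:" + e.ChatSessionID // prefix is an assumption
		i, ok := index[key]
		if !ok {
			if len(chains) >= limit {
				continue // extra chains beyond the limit are dropped
			}
			i = len(chains)
			index[key] = i
			chains = append(chains, chain{ChainKey: key})
		}
		chains[i].CaptureIDs = append(chains[i].CaptureIDs, e.ID)
		chains[i].RoundCount++
	}
	return chains
}

func main() {
	entries := []capture{{"cap_1", "a"}, {"cap_2", "b"}, {"cap_3", "a"}}
	for _, c := range groupChains(entries, 20) {
		fmt.Println(c.ChainKey, c.CaptureIDs, c.RoundCount)
	}
}
```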
 ### `POST /admin/dev/raw-samples/save`
 Persists one selected in-memory capture chain into `tests/raw_stream_samples/<sample-id>/`.
 Any one of these selectors is accepted:
 ```json
 {"chain_key":"session:xxxx","sample_id":"tmp-from-memory"}
 ```
 ```json
 {"capture_id":"cap_xxx","sample_id":"tmp-from-memory"}
 ```
 ```json
 {"query":"Guangzhou weather","sample_id":"tmp-from-memory"}
 ```
 The success payload includes `sample_id`, `dir`, `meta_path`, and `upstream_path`.
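A request body for the save endpoint could be built as in the sketch below. The JSON field names come from the examples above; the `saveRequest` type, the `encode` method, and the client-side check that at least one selector is set are assumptions, since the server-side validation rules are not shown here.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// saveRequest covers the three documented selectors plus sample_id.
type saveRequest struct {
	ChainKey  string `json:"chain_key,omitempty"`
	CaptureID string `json:"capture_id,omitempty"`
	Query     string `json:"query,omitempty"`
	SampleID  string `json:"sample_id,omitempty"`
}

var errNoSelector = errors.New("one of chain_key, capture_id, or query is required")

// encode rejects a request without any selector before it is sent
// (a client-side convenience check, not documented server behavior).
func (r saveRequest) encode() ([]byte, error) {
	if r.ChainKey == "" && r.CaptureID == "" && r.Query == "" {
		return nil, errNoSelector
	}
	return json.Marshal(r)
}

func main() {
	body, err := saveRequest{CaptureID: "cap_xxx", SampleID: "tmp-from-memory"}.encode()
	fmt.Println(string(body), err)
}
```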
 ### `POST /admin/vercel/sync`
 | Field | Required | Notes |
--- a/API.md
+++ b/API.md
@@ -4,6 +4,8 @@
 本文档描述当前 Go 代码库的实际 API 行为。
 文档导航：[总览](README.MD) / [架构说明](docs/ARCHITECTURE.md) / [部署指南](docs/DEPLOY.md) / [测试指南](docs/TESTING.md)
 ---
 ## 目录
@@ -138,6 +140,9 @@ Gemini 兼容客户端还可以使用 `x-goog-api-key`、`?key=` 或 `?api_key=`
 | POST | `/admin/accounts/sessions/delete-all` | Admin | 删除某账号的全部会话 |
 | POST | `/admin/import` | Admin | 批量导入 keys/accounts |
 | POST | `/admin/test` | Admin | 测试当前 API 可用性 |
 | POST | `/admin/dev/raw-samples/capture` | Admin | 直接发起一次请求并保存为 raw sample |
 | GET | `/admin/dev/raw-samples/query` | Admin | 按问题关键词查询当前内存抓包链 |
 | POST | `/admin/dev/raw-samples/save` | Admin | 把命中的内存抓包链保存为 raw sample |
 | POST | `/admin/vercel/sync` | Admin | 同步配置到 Vercel |
 | GET | `/admin/vercel/status` | Admin | Vercel 同步状态 |
 | POST | `/admin/vercel/status` | Admin | Vercel 同步状态 / 草稿对比 |
@@ -168,7 +173,7 @@ Gemini 兼容客户端还可以使用 `x-goog-api-key`、`?key=` 或 `?api_key=`
 ### `GET /v1/models`
-无需鉴权。返回当前支持的模型列表。
+无需鉴权。返回当前支持的 DeepSeek 原生模型列表。
 **响应示例**：
@@ -179,11 +184,21 @@ Gemini 兼容客户端还可以使用 `x-goog-api-key`、`?key=` 或 `?api_key=`
    {"id": "deepseek-chat", "object": "model", "created": 1677610602, "owned_by": "deepseek", "permission": []},
    {"id": "deepseek-reasoner", "object": "model", "created": 1677610602, "owned_by": "deepseek", "permission": []},
    {"id": "deepseek-chat-search", "object": "model", "created": 1677610602, "owned_by": "deepseek", "permission": []},
-    {"id": "deepseek-reasoner-search", "object": "model", "created": 1677610602, "owned_by": "deepseek", "permission": []}
+    {"id": "deepseek-reasoner-search", "object": "model", "created": 1677610602, "owned_by": "deepseek", "permission": []},
    {"id": "deepseek-expert-chat", "object": "model", "created": 1677610602, "owned_by": "deepseek", "permission": []},
    {"id": "deepseek-expert-reasoner", "object": "model", "created": 1677610602, "owned_by": "deepseek", "permission": []},
    {"id": "deepseek-expert-chat-search", "object": "model", "created": 1677610602, "owned_by": "deepseek", "permission": []},
    {"id": "deepseek-expert-reasoner-search", "object": "model", "created": 1677610602, "owned_by": "deepseek", "permission": []},
    {"id": "deepseek-vision-chat", "object": "model", "created": 1677610602, "owned_by": "deepseek", "permission": []},
    {"id": "deepseek-vision-reasoner", "object": "model", "created": 1677610602, "owned_by": "deepseek", "permission": []},
    {"id": "deepseek-vision-chat-search", "object": "model", "created": 1677610602, "owned_by": "deepseek", "permission": []},
    {"id": "deepseek-vision-reasoner-search", "object": "model", "created": 1677610602, "owned_by": "deepseek", "permission": []}
  ]
 }
 ```
 > 说明：`/v1/models` 返回的是规范化后的 DeepSeek 原生模型 ID；常见 alias 仅用于请求入参解析，不会在该接口中单独展开返回。
 ### 模型 alias 解析策略
 对 `chat` / `responses` / `embeddings` 的 `model` 字段采用“宽进严出”：
@@ -206,7 +221,7 @@ Content-Type: application/json
 | 字段 | 类型 | 必填 | 说明 |
 | --- | --- | --- | --- |
-| `model` | string | ✅ | 支持 DeepSeek 原生模型 + 常见 alias（如 `gpt-4o`、`gpt-5-codex`、`o3`、`claude-sonnet-4-5`） |
+| `model` | string | ✅ | 支持 DeepSeek 原生模型 + 常见 alias（如 `gpt-4o`、`gpt-5-codex`、`o3`、`claude-sonnet-4-5`、`gemini-2.5-pro` 等） |
 | `messages` | array | ✅ | OpenAI 风格消息数组 |
 | `stream` | boolean | ❌ | 默认 `false` |
 | `tools` | array | ❌ | Function Calling 定义 |
@@ -264,6 +279,7 @@ data: [DONE]
 - `deepseek-reasoner` / `deepseek-reasoner-search` 模型输出 `delta.reasoning_content`
 - 普通文本输出 `delta.content`
 - 最后一段包含 `finish_reason` 和 `usage`
 - token 计数优先透传上游 DeepSeek SSE（如 `accumulated_token_usage` / `token_usage`）；仅在上游缺失时回退本地估算
 #### Tool Calls
@@ -386,6 +402,7 @@ data: [DONE]
 ## Claude 兼容接口
 除标准路径 `/anthropic/v1/*` 外，还支持快捷路径 `/v1/messages`、`/messages`、`/v1/messages/count_tokens`、`/messages/count_tokens`。
 实现上统一走 OpenAI Chat Completions 解析与回译链路，避免多套解析逻辑分叉维护。
 ### `GET /anthropic/v1/models`
@@ -407,7 +424,7 @@ data: [DONE]
 }
 ```
-> 说明：示例仅展示部分模型；实际返回包含 Claude 1.x/2.x/3.x/4.x 历史模型 ID 与常见别名。
+> 说明：示例仅展示部分模型；实际返回除当前主别名外，还包含 Claude 4.x snapshots，以及 3.x / 2.x / 1.x 历史模型 ID 与常见别名。
 ### `POST /anthropic/v1/messages`
@@ -520,6 +537,7 @@ data: {"type":"message_stop"}
 - `/v1/models/{model}:streamGenerateContent`（兼容路径）
 鉴权方式同业务接口（`Authorization: Bearer <token>` 或 `x-api-key`）。
 实现上统一走 OpenAI Chat Completions 解析与回译链路，避免多套解析逻辑分叉维护。
 ### `POST /v1beta/models/{model}:generateContent`
@@ -538,6 +556,7 @@ data: {"type":"message_stop"}
 - 常规文本：持续返回增量文本 chunk
 - `tools` 场景：会缓冲并在结束时输出 `functionCall` 结构
 - 结束 chunk：包含 `finishReason: "STOP"` 与 `usageMetadata`
 - token 计数优先透传上游 DeepSeek SSE（如 `accumulated_token_usage` / `token_usage`）；仅在上游缺失时回退本地估算
 ---
@@ -886,6 +905,74 @@ data: {"type":"message_stop"}
 }
 ```
 ### `POST /admin/dev/raw-samples/capture`
 直接通过服务自身发起一次 `/v1/chat/completions` 请求，并把请求元信息和上游原始 SSE 保存到 `tests/raw_stream_samples/<sample-id>/`。
 常用请求字段：
 | 字段 | 必填 | 默认值 | 说明 |
 | --- | --- | --- | --- |
 | `message` | 否 | `你好` | 便捷单轮用户消息 |
 | `messages` | 否 | 自动由 `message` 生成 | OpenAI 风格消息数组 |
 | `model` | 否 | `deepseek-chat` | 目标模型 |
 | `stream` | 否 | `true` | 建议保留流式，以记录原始 SSE |
 | `api_key` | 否 | 配置中第一个 key | 调用业务接口使用的 key |
 | `sample_id` | 否 | 自动生成 | 样本目录名 |
 成功时会在响应头里附带：
 - `X-Ds2-Sample-Id`
 - `X-Ds2-Sample-Dir`
 - `X-Ds2-Sample-Meta`
 - `X-Ds2-Sample-Upstream`
 如果请求本身成功，但当前进程没有记录到新的上游抓包，会返回：
 ```json
 {"detail":"no upstream capture was recorded"}
 ```
 ### `GET /admin/dev/raw-samples/query`
 按关键词查询当前进程内存里的抓包记录，并按 `chat_session_id` 归并 `completion + continue` 链。
 **查询参数**：
 | 参数 | 默认值 | 说明 |
 | --- | --- | --- |
 | `q` | 空 | 按请求体/响应体关键词模糊匹配 |
 | `limit` | `20` | 返回链条数上限 |
 **响应字段**包含：
 - `items[].chain_key`
 - `items[].capture_ids`
 - `items[].round_count`
 - `items[].initial_label`
 - `items[].request_preview`
 - `items[].response_preview`
 ### `POST /admin/dev/raw-samples/save`
 把当前内存中的某条抓包链落盘为 `tests/raw_stream_samples/<sample-id>/`。
 支持以下任一种选中方式：
 ```json
 {"chain_key":"session:xxxx","sample_id":"tmp-from-memory"}
 ```
 ```json
 {"capture_id":"cap_xxx","sample_id":"tmp-from-memory"}
 ```
 ```json
 {"query":"广州天气","sample_id":"tmp-from-memory"}
 ```
 成功响应会返回 `sample_id`、`dir`、`meta_path`、`upstream_path`。
 ### `POST /admin/vercel/sync`
 | 字段 | 必填 | 说明 |
--- a/Dockerfile
+++ b/Dockerfile
@@ -34,7 +34,7 @@ CMD ["/usr/local/bin/ds2api"]
 FROM runtime-base AS runtime-from-source
 COPY --from=go-builder /out/ds2api /usr/local/bin/ds2api
-COPY --from=go-builder /app/internal/deepseek/assets/sha3_wasm_bg.7b9ca65ddd.wasm /app/sha3_wasm_bg.7b9ca65ddd.wasm
+
 COPY --from=go-builder /app/config.example.json /app/config.example.json
 COPY --from=webui-builder /app/static/admin /app/static/admin
@@ -53,13 +53,13 @@ RUN set -eux; \
    test -n "${PKG_DIR}"; \
    mkdir -p /out/static; \
    cp "${PKG_DIR}/ds2api" /out/ds2api; \
-    cp "${PKG_DIR}/sha3_wasm_bg.7b9ca65ddd.wasm" /out/sha3_wasm_bg.7b9ca65ddd.wasm; \
+
    cp "${PKG_DIR}/config.example.json" /out/config.example.json; \
    cp -R "${PKG_DIR}/static/admin" /out/static/admin
 FROM runtime-base AS runtime-from-dist
 COPY --from=dist-extract /out/ds2api /usr/local/bin/ds2api
-COPY --from=dist-extract /out/sha3_wasm_bg.7b9ca65ddd.wasm /app/sha3_wasm_bg.7b9ca65ddd.wasm
+
 COPY --from=dist-extract /out/config.example.json /app/config.example.json
 COPY --from=dist-extract /out/static/admin /app/static/admin
--- a/README.MD
+++ b/README.MD
@@ -16,6 +16,10 @@
 将 DeepSeek Web 对话能力转换为 OpenAI、Claude 与 Gemini 兼容 API。后端为 **Go 全量实现**，前端为 React WebUI 管理台（源码在 `webui/`，部署时自动构建到 `static/admin`）。
 文档入口：[文档导航](docs/README.md) / [架构说明](docs/ARCHITECTURE.md) / [接口文档](API.md)
 【感谢Linux.do社区及GitHub社区各位开发者对项目的支持与贡献】
 > **重要免责声明**
 >
 > 本仓库仅供学习、研究、个人实验和内部验证使用，不提供任何形式的商业授权、适用性保证或结果保证。
@@ -24,7 +28,7 @@
 >
 > 请勿将本项目用于违反服务条款、协议、法律法规或平台规则的场景。商业使用前请自行确认 `LICENSE`、相关协议以及你是否获得了作者的书面许可。
-## 架构概览
+## 架构概览（摘要）
 ```mermaid
 flowchart LR
@@ -48,7 +52,7 @@ flowchart LR
            Auth["Auth Resolver\n(API key / bearer / x-goog-api-key)"]
            Pool["Account Pool + Queue\n(并发槽位 + 等待队列)"]
            DSClient["DeepSeek Client\n(Session / Auth / HTTP)"]
-            Pow["PoW WASM\n(wazero 预加载)"]
+            Pow["PoW 实现\n(纯 Go 毫秒级)"]
            Tool["Tool Sieve\n(Go/Node 语义对齐)"]
        end
    end
@@ -72,6 +76,8 @@ flowchart LR
    Bridge --> Client
 ```
 详细架构拆分与目录职责见 [docs/ARCHITECTURE.md](docs/ARCHITECTURE.md)。
 - **后端**：Go（`cmd/ds2api/`、`api/`、`internal/`），不依赖 Python 运行时
 - **前端**：React 管理台（`webui/`），运行时托管静态构建产物
 - **部署**：本地运行、Docker、Vercel Serverless、Linux systemd
@@ -81,7 +87,7 @@ flowchart LR
 - **统一路由内核**：所有协议入口统一汇聚到 `internal/server/router.go`，并在同一路由树中注册 OpenAI / Claude / Gemini / Admin / WebUI 路由，避免多入口行为漂移。
 - **统一执行链路**：Claude / Gemini 入口先经 `internal/translatorcliproxy` 做协议转换，再进入 `openai.ChatCompletions` 统一处理工具调用与流式语义，最后再转换回原协议响应。
 - **适配器分层更清晰**：`internal/adapter/{claude,gemini}` 负责入口/出口协议封装，`internal/adapter/openai` 负责核心执行，DeepSeek 侧调用只保留在 OpenAI 内核中。
-- **Tool Calling 双运行时对齐**：Go 侧（`internal/util`）与 Vercel Node 侧（`internal/js/helpers/stream-tool-sieve`）保持一致的解析/防泄漏语义，覆盖 JSON / XML / invoke / text-kv 多风格输入。
+- **Tool Calling 双运行时对齐**：Go 侧（`internal/toolcall`）与 Vercel Node 侧（`internal/js/helpers/stream-tool-sieve`）保持一致的解析/防泄漏语义，覆盖 JSON / XML / invoke / text-kv 多风格输入。
 - **配置与运行时设置解耦**：静态配置（`config`）与运行时策略（`settings`）通过 Admin API 分离管理，支持热更新和密码轮换失效旧 JWT。
 - **流式能力升级**：`/v1/responses` 与 `/v1/chat/completions` 共享更一致的工具调用增量输出策略，降低不同 SDK 下的行为差异。
 - **可观测与可运维增强**：`/healthz`、`/readyz`、`/admin/version`、`/admin/dev/captures` 形成排障闭环，便于发布后验证。
@@ -95,7 +101,7 @@ flowchart LR
 | Gemini 兼容 | `POST /v1beta/models/{model}:generateContent`、`POST /v1beta/models/{model}:streamGenerateContent`（及 `/v1/models/{model}:*` 路径） |
 | 多账号轮询 | 自动 token 刷新、邮箱/手机号双登录方式 |
 | 并发队列控制 | 每账号 in-flight 上限 + 等待队列，动态计算建议并发值 |
-| DeepSeek PoW | WASM 计算（`wazero`），无需外部 Node.js 依赖 |
+| DeepSeek PoW | 纯 Go 高性能实现（DeepSeekHashV1），毫秒级响应 |
 | Tool Calling | 防泄漏处理：非代码块高置信特征识别、`delta.tool_calls` 早发、结构化增量输出 |
 | Admin API | 配置管理、运行时设置热更新、账号测试 / 批量测试、会话清理、导入导出、Vercel 同步、版本检查 |
 | WebUI 管理台 | `/admin` 单页应用（中英文双语、深色模式） |
@@ -114,26 +120,35 @@ flowchart LR
 ## 模型支持
-### OpenAI 接口
+### OpenAI 接口（`GET /v1/models`）
-| 模型 | thinking | search |
+| 模型类型 | 模型 ID | thinking | search |
-| --- | --- | --- |
+| --- | --- | --- | --- |
-| `deepseek-chat` | ❌ | ❌ |
+| default | `deepseek-chat` | ❌ | ❌ |
-| `deepseek-reasoner` | ✅ | ❌ |
+| default | `deepseek-reasoner` | ✅ | ❌ |
-| `deepseek-chat-search` | ❌ | ✅ |
+| default | `deepseek-chat-search` | ❌ | ✅ |
-| `deepseek-reasoner-search` | ✅ | ✅ |
+| default | `deepseek-reasoner-search` | ✅ | ✅ |
 | expert | `deepseek-expert-chat` | ❌ | ❌ |
 | expert | `deepseek-expert-reasoner` | ✅ | ❌ |
 | expert | `deepseek-expert-chat-search` | ❌ | ✅ |
 | expert | `deepseek-expert-reasoner-search` | ✅ | ✅ |
 | vision | `deepseek-vision-chat` | ❌ | ❌ |
 | vision | `deepseek-vision-reasoner` | ✅ | ❌ |
 | vision | `deepseek-vision-chat-search` | ❌ | ✅ |
 | vision | `deepseek-vision-reasoner-search` | ✅ | ✅ |
-### Claude 接口
+除原生模型外，也支持常见 alias 输入（如 `gpt-4o`、`gpt-5-codex`、`o3`、`claude-sonnet-4-5`、`gemini-2.5-pro` 等），但 `/v1/models` 返回的是规范化后的 DeepSeek 原生模型 ID。
-| 模型 | 默认映射 |
+### Claude 接口（`GET /anthropic/v1/models`）
 | 当前常用模型 | 默认映射 |
 | --- | --- |
 | `claude-sonnet-4-5` | `deepseek-chat` |
 | `claude-haiku-4-5`（兼容 `claude-3-5-haiku-latest`） | `deepseek-chat` |
 | `claude-opus-4-6` | `deepseek-reasoner` |
 可通过配置中的 `claude_mapping` 或 `claude_model_mapping` 覆盖映射关系。
-另外，`/anthropic/v1/models` 现已包含 Claude 1.x/2.x/3.x/4.x 历史模型 ID 与常见别名，便于旧客户端直接兼容。
+`/anthropic/v1/models` 除上述当前主别名外，还会返回 Claude 4.x snapshots，以及 3.x / 2.x / 1.x 历史模型 ID 与常见 alias，便于旧客户端直接兼容。
 #### Claude Code 接入避坑（实测）
@@ -148,6 +163,15 @@ Gemini 适配器将模型名通过 `model_aliases` 或内置规则映射到 Deep
 ## 快速开始
 ### 部署方式优先级建议
 推荐按以下顺序选择部署方式：
 1. **下载 Release 构建包运行**：最省事，产物已编译完成，最适合大多数用户。
 2. **Docker / GHCR 镜像部署**：适合需要容器化、编排或云环境部署。
 3. **Vercel 部署**：适合已有 Vercel 环境且接受其平台约束的场景。
 4. **本地源码运行 / 自行编译**：适合开发、调试或需要自行修改代码的场景。
 ### 通用第一步（所有部署方式）
 把 `config.json` 作为唯一配置源（推荐做法）：
@@ -161,29 +185,19 @@ cp config.example.json config.json
 - 本地运行：直接读取 `config.json`
 - Docker / Vercel：由 `config.json` 生成 `DS2API_CONFIG_JSON`（Base64）注入环境变量，也可以直接写原始 JSON
-### 方式一：本地运行
+### 方式一：下载 Release 构建包
-**前置要求**：Go 1.26+，Node.js `20.19+` 或 `22.12+`（仅在需要构建 WebUI 时）
+每次发布 Release 时，GitHub Actions 会自动构建多平台二进制包：
 ```bash
-# 1. 克隆仓库
+# 下载对应平台的压缩包后
-git clone https://github.com/CJackHwang/ds2api.git
+tar -xzf ds2api_<tag>_linux_amd64.tar.gz
-cd ds2api
+cd ds2api_<tag>_linux_amd64
 # 2. 配置
 cp config.example.json config.json
-# 编辑 config.json，填入你的 DeepSeek 账号信息和 API key
+# 编辑 config.json
-
+./ds2api
 # 3. 启动
 go run ./cmd/ds2api
 ```
 默认本地访问地址：`http://127.0.0.1:5001`
 服务实际绑定：`0.0.0.0:5001`，因此同一局域网设备通常也可以通过你的内网 IP 访问。
 > **WebUI 自动构建**：本地首次启动时，若 `static/admin` 不存在，会自动尝试执行 `npm ci`（仅在缺少依赖时）和 `npm run build -- --outDir static/admin --emptyOutDir`（需要本机有 Node.js）。你也可以手动构建：`./scripts/build-webui.sh`
 ### 方式二：Docker 运行
 ```bash
@@ -237,35 +251,28 @@ base64 < config.json | tr -d '\n'
 详细部署说明请参阅 [部署指南](docs/DEPLOY.md)。
-### 方式四：下载 Release 构建包
+### 方式四：本地源码运行
-每次发布 Release 时，GitHub Actions 会自动构建多平台二进制包：
+**前置要求**：Go 1.26+，Node.js `20.19+` 或 `22.12+`（仅在需要构建 WebUI 时）
 ```bash
-# 下载对应平台的压缩包后
+# 1. 克隆仓库
-tar -xzf ds2api_<tag>_linux_amd64.tar.gz
+git clone https://github.com/CJackHwang/ds2api.git
-cd ds2api_<tag>_linux_amd64
+cd ds2api
 # 2. 配置
 cp config.example.json config.json
-# 编辑 config.json
+# 编辑 config.json，填入你的 DeepSeek 账号信息和 API key
-./ds2api
+
 # 3. 启动
 go run ./cmd/ds2api
 ```
-### 方式五：OpenCode CLI 接入
+默认本地访问地址：`http://127.0.0.1:5001`
-1. 复制示例配置：
+服务实际绑定：`0.0.0.0:5001`，因此同一局域网设备通常也可以通过你的内网 IP 访问。
-```bash
+> **WebUI 自动构建**：本地首次启动时，若 `static/admin` 不存在，会自动尝试执行 `npm ci`（仅在缺少依赖时）和 `npm run build -- --outDir static/admin --emptyOutDir`（需要本机有 Node.js）。你也可以手动构建：`./scripts/build-webui.sh`
 cp opencode.json.example opencode.json
 ```
 2. 编辑 `opencode.json`：
 - 将 `baseURL` 改为你的 DS2API 地址（例如 `https://your-domain.com/v1`）
 - 将 `apiKey` 改为你的 DS2API key（对应 `config.keys`）
 3. 在项目目录启动 OpenCode CLI（按你的安装方式运行 `opencode`）。
 > 建议优先使用 OpenAI 兼容路径（`/v1/*`），即示例里的 `@ai-sdk/openai-compatible` provider。
 > 若客户端支持 `wire_api`，可分别测试 `responses` 与 `chat`，DS2API 两条链路都兼容。
 ## 配置说明
@@ -344,12 +351,11 @@ cp opencode.json.example opencode.json
 | `DS2API_CONFIG_PATH` | 配置文件路径 | `config.json` |
 | `DS2API_CONFIG_JSON` | 直接注入配置（JSON 或 Base64） | — |
 | `DS2API_ENV_WRITEBACK` | 环境变量模式下自动写回配置文件并切换文件模式（`1/true/yes/on`） | 关闭 |
-| `DS2API_WASM_PATH` | PoW WASM 文件路径 | 自动查找 |
 | `DS2API_STATIC_ADMIN_DIR` | 管理台静态文件目录 | `static/admin` |
 | `DS2API_AUTO_BUILD_WEBUI` | 启动时自动构建 WebUI | 本地开启，Vercel 关闭 |
 | `DS2API_DEV_PACKET_CAPTURE` | 本地开发抓包开关（记录最近会话请求/响应体） | 本地非 Vercel 默认开启 |
-| `DS2API_DEV_PACKET_CAPTURE_LIMIT` | 本地抓包保留条数（超出自动淘汰） | `5` |
+| `DS2API_DEV_PACKET_CAPTURE_LIMIT` | 本地抓包保留条数（超出自动淘汰） | `20` |
-| `DS2API_DEV_PACKET_CAPTURE_MAX_BODY_BYTES` | 单条响应体最大记录字节数 | `2097152` |
+| `DS2API_DEV_PACKET_CAPTURE_MAX_BODY_BYTES` | 单条响应体最大记录字节数 | `5242880` |
 | `DS2API_ACCOUNT_MAX_INFLIGHT` | 每账号最大并发 in-flight 请求数 | `2` |
 | `DS2API_ACCOUNT_MAX_QUEUE` | 等待队列上限 | `recommended_concurrency` |
 | `DS2API_GLOBAL_MAX_INFLIGHT` | 全局最大 in-flight 请求数 | `recommended_concurrency` |
@@ -403,13 +409,13 @@ Gemini 路由还可以使用 `x-goog-api-key`，或在没有认证头时使用 `
 ## 本地开发抓包工具
-用于定位「responses 思考流/工具调用」等问题。开启后会自动记录最近 N 条 DeepSeek 对话上游请求体与响应体（默认 5 条，超出自动淘汰）。
+用于定位「responses 思考流/工具调用」等问题。开启后会自动记录最近 N 条 DeepSeek 对话上游请求体与响应体（默认 20 条，超出自动淘汰；单条响应体默认最多记录 5 MB）。
 启用示例：
 ```bash
 DS2API_DEV_PACKET_CAPTURE=true \
-DS2API_DEV_PACKET_CAPTURE_LIMIT=5 \
+DS2API_DEV_PACKET_CAPTURE_LIMIT=20 \
 go run ./cmd/ds2api
 ```
@@ -417,6 +423,8 @@ go run ./cmd/ds2api
 - `GET /admin/dev/captures`：查看抓包列表（最新在前）
 - `DELETE /admin/dev/captures`：清空抓包
 - `GET /admin/dev/raw-samples/query?q=关键词&limit=20`：按问题关键词查询当前内存抓包，并按 `chat_session_id` 归并 `completion + continue` 链
 - `POST /admin/dev/raw-samples/save`：把命中的某条抓包链保存为 `tests/raw_stream_samples/<sample-id>/` 回放样本
 返回字段包含：
@@ -424,69 +432,10 @@ go run ./cmd/ds2api
 - `response_body`：上游返回的原始流式内容拼接文本
 - `response_truncated`：是否触发单条大小截断
-## 项目结构
+保存接口支持用 `query`、`chain_key` 或 `capture_id` 选中目标。例如：
-```text
+```json
-ds2api/
+{"query":"广州天气","sample_id":"gz-weather-from-memory"}
 ├── app/                     # 统一 HTTP Handler 组装层（供本地与 Serverless 复用）
 ├── cmd/
 │   ├── ds2api/              # 本地 / 容器启动入口
 │   └── ds2api-tests/        # 端到端测试集入口
 ├── api/
 │   ├── index.go             # Vercel Serverless Go 入口
 │   ├── chat-stream.js       # Vercel Node.js 流式转发
 │   └── (rewrite targets in vercel.json)
 ├── internal/
 │   ├── account/             # 账号池与并发队列
 │   ├── adapter/
 │   │   ├── openai/          # OpenAI 兼容适配器（含 Tool Call 解析、Vercel 流式 prepare/release）
 │   │   ├── claude/          # Claude 兼容适配器
 │   │   └── gemini/          # Gemini 兼容适配器（generateContent / streamGenerateContent）
 │   ├── admin/               # Admin API handlers（含 Settings 热更新）
 │   ├── auth/                # 鉴权与 JWT
 │   ├── claudeconv/          # Claude 消息格式转换
 │   ├── compat/              # Go 版本兼容与回归测试辅助
 │   ├── config/              # 配置加载、校验与热更新
 │   ├── deepseek/            # DeepSeek API 客户端、PoW WASM
 │   ├── js/                  # Node 运行时流式处理与兼容逻辑
 │   ├── devcapture/          # 开发抓包模块
 │   ├── rawsample/           # 原始流样本可见文本提取与回放辅助
 │   ├── format/              # 输出格式化
 │   ├── prompt/              # Prompt 构建
 │   ├── server/              # HTTP 路由与中间件（chi router）
 │   ├── sse/                 # SSE 解析工具
 │   ├── stream/              # 统一流式消费引擎
 │   ├── testsuite/           # 端到端测试框架与用例编排
 │   ├── translatorcliproxy/  # CLIProxy 桥接与流写入组件
 │   ├── util/                # 通用工具函数
 │   ├── version/             # 版本解析 / 比较与 tag 规范化
 │   └── webui/               # WebUI 静态文件托管与自动构建
 ├── webui/                   # React WebUI 源码（Vite + Tailwind）
 │   └── src/
 │       ├── app/             # 路由、鉴权、配置状态管理
 │       ├── features/        # 业务功能模块（account/settings/vercel/apiTester）
 │       ├── components/      # 登录/落地页等通用组件
 │       └── locales/         # 中英文语言包（zh.json / en.json）
 ├── scripts/
 │   └── build-webui.sh       # WebUI 手动构建脚本
 ├── tests/
 │   ├── compat/              # 兼容性测试夹具与期望输出
 │   ├── node/                # Node 侧单元测试（chat-stream / tool-sieve）
 │   ├── raw_stream_samples/  # 原始 SSE 样本与回放元数据
 │   └── scripts/             # 统一测试脚本入口（unit/e2e）
 ├── docs/                    # 部署 / 贡献 / 测试等辅助文档
 ├── static/admin/            # WebUI 构建产物（不提交到 Git）
 ├── .github/
 │   ├── workflows/           # GitHub Actions（质量门禁 + Release 自动构建）
 │   ├── ISSUE_TEMPLATE/      # Issue 模板
 │   └── PULL_REQUEST_TEMPLATE.md
 ├── config.example.json      # 配置文件示例
 ├── .env.example             # 环境变量示例
 ├── Dockerfile               # 多阶段构建（WebUI + Go）
 ├── docker-compose.yml       # 生产环境 Docker Compose
 ├── docker-compose.dev.yml   # 开发环境 Docker Compose
 ├── vercel.json              # Vercel 路由与构建配置
 └── go.mod / go.sum          # Go 模块依赖
 ```
 ## 文档索引
@@ -535,7 +484,7 @@ npm ci --prefix webui && npm run build --prefix webui
 go test ./...
 # 运行 tool calls 相关测试（调试工具调用问题）
-go test -v -run 'TestParseToolCalls|TestRepair' ./internal/util/
+go test -v -run 'TestParseToolCalls|TestRepair' ./internal/toolcall/
 # 运行端到端测试
 ./tests/scripts/run-live.sh
--- a/README.en.md
+++ b/README.en.md
@@ -16,6 +16,8 @@ Language: [中文](README.MD) | [English](README.en.md)
 DS2API converts DeepSeek Web chat capability into OpenAI-compatible, Claude-compatible, and Gemini-compatible APIs. The backend is a **pure Go implementation**, with a React WebUI admin panel (source in `webui/`, build output auto-generated to `static/admin` during deployment).
 Documentation entry: [Docs Index](docs/README.md) / [Architecture](docs/ARCHITECTURE.en.md) / [API Reference](API.en.md)
 > **Important Disclaimer**
 >
 > This repository is provided for learning, research, personal experimentation, and internal validation only. It does not grant any commercial authorization and comes with no warranty of fitness, stability, or results.
@@ -24,7 +26,7 @@ DS2API converts DeepSeek Web chat capability into OpenAI-compatible, Claude-comp
 >
 > Do not use this project in ways that violate service terms, agreements, laws, or platform rules. Before any commercial use, review the `LICENSE`, the relevant terms, and confirm that you have the author's written permission.
-## Architecture Overview
+## Architecture Overview (Summary)
 ```mermaid
 flowchart LR
@@ -48,7 +50,7 @@ flowchart LR
            Auth["Auth Resolver\n(API key / bearer / x-goog-api-key)"]
            Pool["Account Pool + Queue\n(in-flight slots + wait queue)"]
            DSClient["DeepSeek Client\n(session / auth / HTTP)"]
-            Pow["PoW WASM\n(wazero preload)"]
+            Pow["PoW Solver\n(Pure Go ms-level)"]
            Tool["Tool Sieve\n(Go/Node semantic parity)"]
        end
    end
@@ -72,6 +74,8 @@ flowchart LR
    Bridge --> Client
 ```
+For the full module-by-module architecture and directory responsibilities, see [docs/ARCHITECTURE.en.md](docs/ARCHITECTURE.en.md).
 - **Backend**: Go (`cmd/ds2api/`, `api/`, `internal/`), no Python runtime
 - **Frontend**: React admin panel (`webui/`), served as static build at runtime
 - **Deployment**: local run, Docker, Vercel serverless, Linux systemd
@@ -81,7 +85,7 @@ flowchart LR
 - **Unified routing core**: all protocol entries are now centralized through `internal/server/router.go`, with OpenAI / Claude / Gemini / Admin / WebUI routes registered in one tree to avoid multi-entry drift.
 - **Unified execution chain**: Claude/Gemini entries are translated by `internal/translatorcliproxy`, then executed through `openai.ChatCompletions` for shared tool-calling and stream semantics, then translated back to the client protocol.
 - **Cleaner adapter boundaries**: `internal/adapter/{claude,gemini}` handles protocol wrappers, while `internal/adapter/openai` remains the execution core; upstream DeepSeek calls are retained only in the OpenAI core.
- **Tool-calling parity across runtimes**: Go (`internal/util`) and Vercel Node (`internal/js/helpers/stream-tool-sieve`) follow aligned parsing/anti-leak semantics across JSON / XML / invoke / text-kv inputs.
+- **Tool-calling parity across runtimes**: Go (`internal/toolcall`) and Vercel Node (`internal/js/helpers/stream-tool-sieve`) follow aligned parsing/anti-leak semantics across JSON / XML / invoke / text-kv inputs.
 - **Config/runtime separation**: static config (`config`) and runtime policy (`settings`) are managed independently via Admin APIs, enabling hot updates and password rotation with JWT invalidation.
 - **Streaming behavior upgrade**: `/v1/responses` and `/v1/chat/completions` now share a more consistent incremental tool-call emission strategy across SDK ecosystems.
 - **Improved operability**: `/healthz`, `/readyz`, `/admin/version`, and `/admin/dev/captures` form a tighter post-deploy diagnostics loop.
@@ -95,7 +99,7 @@ flowchart LR
 | Gemini compatible | `POST /v1beta/models/{model}:generateContent`, `POST /v1beta/models/{model}:streamGenerateContent` (plus `/v1/models/{model}:*` paths) |
 | Multi-account rotation | Auto token refresh, email/mobile dual login |
 | Concurrency control | Per-account in-flight limit + waiting queue, dynamic recommended concurrency |
-| DeepSeek PoW | WASM solving via `wazero`, no external Node.js dependency |
+| DeepSeek PoW | Pure Go high-performance solver (DeepSeekHashV1), ms-level response |
 | Tool Calling | Anti-leak handling: non-code-block feature match, early `delta.tool_calls`, structured incremental output |
 | Admin API | Config management, runtime settings hot-reload, account testing/batch test, session cleanup, import/export, Vercel sync, version check |
 | WebUI Admin Panel | SPA at `/admin` (bilingual Chinese/English, dark mode) |
@@ -114,26 +118,35 @@ flowchart LR
 ## Model Support
-### OpenAI Endpoint
+### OpenAI Endpoint (`GET /v1/models`)
-| Model | thinking | search |
-| --- | --- | --- |
-| `deepseek-chat` | ❌ | ❌ |
-| `deepseek-reasoner` | ✅ | ❌ |
-| `deepseek-chat-search` | ❌ | ✅ |
-| `deepseek-reasoner-search` | ✅ | ✅ |
+| Family | Model ID | thinking | search |
+| --- | --- | --- | --- |
+| default | `deepseek-chat` | ❌ | ❌ |
+| default | `deepseek-reasoner` | ✅ | ❌ |
+| default | `deepseek-chat-search` | ❌ | ✅ |
+| default | `deepseek-reasoner-search` | ✅ | ✅ |
+| expert | `deepseek-expert-chat` | ❌ | ❌ |
+| expert | `deepseek-expert-reasoner` | ✅ | ❌ |
+| expert | `deepseek-expert-chat-search` | ❌ | ✅ |
+| expert | `deepseek-expert-reasoner-search` | ✅ | ✅ |
+| vision | `deepseek-vision-chat` | ❌ | ❌ |
+| vision | `deepseek-vision-reasoner` | ✅ | ❌ |
+| vision | `deepseek-vision-chat-search` | ❌ | ✅ |
+| vision | `deepseek-vision-reasoner-search` | ✅ | ✅ |
+Besides native IDs, DS2API also accepts common aliases as input (for example `gpt-4o`, `gpt-5-codex`, `o3`, `claude-sonnet-4-5`, `gemini-2.5-pro`), but `/v1/models` returns normalized DeepSeek native model IDs.
-### Claude Endpoint
+### Claude Endpoint (`GET /anthropic/v1/models`)
-| Model | Default Mapping |
+| Current common model | Default Mapping |
 | --- | --- |
 | `claude-sonnet-4-5` | `deepseek-chat` |
 | `claude-haiku-4-5` (compatible with `claude-3-5-haiku-latest`) | `deepseek-chat` |
 | `claude-opus-4-6` | `deepseek-reasoner` |
 Override mapping via `claude_mapping` or `claude_model_mapping` in config.
-In addition, `/anthropic/v1/models` now includes historical Claude 1.x/2.x/3.x/4.x IDs and common aliases for legacy client compatibility.
+Besides the current primary aliases above, `/anthropic/v1/models` also returns Claude 4.x snapshots plus historical 3.x / 2.x / 1.x IDs and common aliases for legacy client compatibility.
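The default Claude mapping with a config override, as described in the table above, could be sketched roughly as follows. This is an illustration only, not DS2API's actual code: `resolveClaudeModel`, `defaultClaudeMapping`, and the fallback for unknown IDs are hypothetical names and behavior; only the three table rows and the idea of an override map come from the README.

```go
package main

import "fmt"

// defaultClaudeMapping mirrors the three rows of the README table.
var defaultClaudeMapping = map[string]string{
	"claude-sonnet-4-5": "deepseek-chat",
	"claude-haiku-4-5":  "deepseek-chat",
	"claude-opus-4-6":   "deepseek-reasoner",
}

// resolveClaudeModel is a hypothetical helper: entries in overrides (standing in
// for `claude_mapping`) win over the defaults. Falling back to a fixed model for
// unknown IDs is an assumption, not documented behavior.
func resolveClaudeModel(model string, overrides map[string]string, fallback string) string {
	if target, ok := overrides[model]; ok && target != "" {
		return target
	}
	if target, ok := defaultClaudeMapping[model]; ok {
		return target
	}
	return fallback
}

func main() {
	overrides := map[string]string{"claude-sonnet-4-5": "deepseek-reasoner"}
	fmt.Println(resolveClaudeModel("claude-sonnet-4-5", overrides, "deepseek-chat"))
	fmt.Println(resolveClaudeModel("claude-opus-4-6", nil, "deepseek-chat"))
}
```

Checking the override map first keeps the defaults untouched, so removing an override restores the documented mapping.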
 #### Claude Code integration pitfalls (validated)
@@ -148,6 +161,15 @@ The Gemini adapter maps model names to DeepSeek native models via `model_aliases
 ## Quick Start
+### Recommended deployment priority
+Recommended order when choosing a deployment method:
+1. **Download and run release binaries**: the easiest path for most users because the artifacts are already built.
+2. **Docker / GHCR image deployment**: suitable for containerized, orchestrated, or cloud environments.
+3. **Vercel deployment**: suitable if you already use Vercel and accept its platform constraints.
+4. **Run from source / build locally**: suitable for development, debugging, or when you need to modify the code yourself.
 ### Universal First Step (all deployment modes)
 Use `config.json` as the single source of truth (recommended):
@@ -161,47 +183,37 @@ Recommended per deployment mode:
 - Local run: read `config.json` directly
 - Docker / Vercel: generate Base64 from `config.json` and inject as `DS2API_CONFIG_JSON`, or paste raw JSON directly
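Since `DS2API_CONFIG_JSON` is documented as accepting either raw JSON or Base64, a loader for it might look like the sketch below. It is a minimal illustration under stated assumptions: `decodeInlineConfig` is a hypothetical name, the "starts with `{`" check is one possible way to tell the two forms apart, and the config is decoded into a generic map rather than DS2API's real config type.

```go
package main

import (
	"encoding/base64"
	"encoding/json"
	"errors"
	"fmt"
	"strings"
)

// decodeInlineConfig (hypothetical) accepts raw JSON or standard Base64-encoded JSON.
func decodeInlineConfig(raw string) (map[string]any, error) {
	raw = strings.TrimSpace(raw)
	if raw == "" {
		return nil, errors.New("empty inline config")
	}
	var cfg map[string]any
	// A JSON object starts with '{'; treat anything else as Base64 (assumption).
	if strings.HasPrefix(raw, "{") {
		if err := json.Unmarshal([]byte(raw), &cfg); err != nil {
			return nil, fmt.Errorf("parse raw JSON: %w", err)
		}
		return cfg, nil
	}
	decoded, err := base64.StdEncoding.DecodeString(raw)
	if err != nil {
		return nil, fmt.Errorf("decode base64: %w", err)
	}
	if err := json.Unmarshal(decoded, &cfg); err != nil {
		return nil, fmt.Errorf("parse decoded JSON: %w", err)
	}
	return cfg, nil
}

func main() {
	plain := `{"keys":["sk-demo"]}`
	encoded := base64.StdEncoding.EncodeToString([]byte(plain))
	for _, input := range []string{plain, encoded} {
		cfg, err := decodeInlineConfig(input)
		fmt.Println(cfg["keys"], err)
	}
}
```

The Base64 form is the safer choice when a platform's environment-variable editor mangles quotes or newlines.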
-### Option 1: Local Run
+### Option 1: Download Release Binaries
-**Prerequisites**: Go 1.26+, Node.js `20.19+` or `22.12+` (only if building WebUI locally)
+GitHub Actions automatically builds multi-platform archives on each Release:
 ```bash
-# 1. Clone
+# After downloading the archive for your platform
-git clone https://github.com/CJackHwang/ds2api.git
+tar -xzf ds2api_<tag>_linux_amd64.tar.gz
-cd ds2api
+cd ds2api_<tag>_linux_amd64
 # 2. Configure
 cp config.example.json config.json
-# Edit config.json with your DeepSeek account info and API keys
+# Edit config.json
-
+./ds2api
-# 3. Start
-go run ./cmd/ds2api
 ```
-Default local URL: `http://127.0.0.1:5001`
+### Option 2: Docker / GHCR
-The server actually binds to `0.0.0.0:5001`, so devices on the same LAN can usually reach it through your private IP as well.
-> **WebUI auto-build**: On first local startup, if `static/admin` is missing, DS2API will auto-run `npm ci` (only when dependencies are missing) and `npm run build -- --outDir static/admin --emptyOutDir` (requires Node.js). You can also build manually: `./scripts/build-webui.sh`
-### Option 2: Docker
 ```bash
-# 1. Prepare env file and config file
+# Pull prebuilt image
+docker pull ghcr.io/cjackhwang/ds2api:latest
+# Or run a pinned version
+# docker pull ghcr.io/cjackhwang/ds2api:v3.0.0
+# Prepare env file and config file
 cp .env.example .env
 cp config.example.json config.json
-# 2. Edit .env (at least set DS2API_ADMIN_KEY; optionally set DS2API_HOST_PORT to change the host port)
+# Start with compose
-#    DS2API_ADMIN_KEY=replace-with-a-strong-secret
-# 3. Start
 docker-compose up -d
 # 4. View logs
 docker-compose logs -f
 ```
-The default `docker-compose.yml` maps host port `6011` to container port `5001`. If you want `5001` exposed directly, set `DS2API_HOST_PORT=5001` (or adjust the `ports` mapping).
+The default `docker-compose.yml` uses `ghcr.io/cjackhwang/ds2api:latest` and maps host port `6011` to container port `5001`. If you want `5001` exposed directly, set `DS2API_HOST_PORT=5001` (or adjust the `ports` mapping).
 Rebuild after updates: `docker-compose up -d --build`
@@ -237,35 +249,28 @@ base64 < config.json | tr -d '\n'
 For detailed deployment instructions, see the [Deployment Guide](docs/DEPLOY.en.md).
-### Option 4: Download Release Binaries
+### Option 4: Local Run
-GitHub Actions automatically builds multi-platform archives on each Release:
+**Prerequisites**: Go 1.26+, Node.js `20.19+` or `22.12+` (only if building WebUI locally)
 ```bash
-# After downloading the archive for your platform
+# 1. Clone
-tar -xzf ds2api_<tag>_linux_amd64.tar.gz
+git clone https://github.com/CJackHwang/ds2api.git
-cd ds2api_<tag>_linux_amd64
+cd ds2api
 # 2. Configure
 cp config.example.json config.json
-# Edit config.json
+# Edit config.json with your DeepSeek account info and API keys
-./ds2api
+
+# 3. Start
+go run ./cmd/ds2api
 ```
-### Option 5: OpenCode CLI
+Default local URL: `http://127.0.0.1:5001`
-1. Copy the example config:
+The server actually binds to `0.0.0.0:5001`, so devices on the same LAN can usually reach it through your private IP as well.
-```bash
+> **WebUI auto-build**: On first local startup, if `static/admin` is missing, DS2API will auto-run `npm ci` (only when dependencies are missing) and `npm run build -- --outDir static/admin --emptyOutDir` (requires Node.js). You can also build manually: `./scripts/build-webui.sh`
-cp opencode.json.example opencode.json
-```
-2. Edit `opencode.json`:
-- Set `baseURL` to your DS2API endpoint (for example, `https://your-domain.com/v1`)
-- Set `apiKey` to your DS2API key (from `config.keys`)
-3. Start OpenCode CLI in the project directory (run `opencode` using your installed method).
-> Recommended: use the OpenAI-compatible path (`/v1/*`) via `@ai-sdk/openai-compatible` as shown in the example.
-> If your client supports `wire_api`, test both `responses` and `chat`; DS2API supports both paths.
 ## Configuration
@@ -344,7 +349,6 @@ cp opencode.json.example opencode.json
 | `DS2API_CONFIG_PATH` | Config file path | `config.json` |
 | `DS2API_CONFIG_JSON` | Inline config (JSON or Base64) | — |
 | `DS2API_ENV_WRITEBACK` | Auto-write env-backed config to file and transition to file mode (`1/true/yes/on`) | Disabled |
-| `DS2API_WASM_PATH` | PoW WASM file path | Auto-detect |
 | `DS2API_STATIC_ADMIN_DIR` | Admin static assets dir | `static/admin` |
 | `DS2API_AUTO_BUILD_WEBUI` | Auto-build WebUI on startup | Enabled locally, disabled on Vercel |
 | `DS2API_ACCOUNT_MAX_INFLIGHT` | Max in-flight requests per account | `2` |
@@ -353,8 +357,8 @@ cp opencode.json.example opencode.json
 | `DS2API_VERCEL_INTERNAL_SECRET` | Vercel hybrid streaming internal auth | Falls back to `DS2API_ADMIN_KEY` |
 | `DS2API_VERCEL_STREAM_LEASE_TTL_SECONDS` | Stream lease TTL seconds | `900` |
 | `DS2API_DEV_PACKET_CAPTURE` | Local dev packet capture switch (record recent request/response bodies) | Enabled by default on non-Vercel local runtime |
-| `DS2API_DEV_PACKET_CAPTURE_LIMIT` | Number of captured sessions to retain (auto-evict overflow) | `5` |
+| `DS2API_DEV_PACKET_CAPTURE_LIMIT` | Number of captured sessions to retain (auto-evict overflow) | `20` |
-| `DS2API_DEV_PACKET_CAPTURE_MAX_BODY_BYTES` | Max recorded bytes per captured response body | `2097152` |
+| `DS2API_DEV_PACKET_CAPTURE_MAX_BODY_BYTES` | Max recorded bytes per captured response body | `5242880` |
 | `VERCEL_TOKEN` | Vercel sync token | — |
 | `VERCEL_PROJECT_ID` | Vercel project ID | — |
 | `VERCEL_TEAM_ID` | Vercel team ID | — |
@@ -392,21 +396,22 @@ Queue limit = DS2API_ACCOUNT_MAX_QUEUE (default = recommended concurrency)
 When `tools` is present in the request, DS2API performs anti-leak handling:
 1. Toolcall feature matching is enabled only in **non-code-block context** (fenced examples are ignored)
-   - In non-code-block context, tool JSON may still be recognized even when mixed with normal prose; surrounding prose can remain as text output.
+2. The parser prioritizes XML/Markup, while also accepting JSON / ANTML / invoke / text-kv, and normalizes everything into the internal tool-call structure
-2. `responses` streaming strictly uses official item lifecycle events (`response.output_item.*`, `response.content_part.*`, `response.function_call_arguments.*`)
+3. `responses` streaming strictly uses official item lifecycle events (`response.output_item.*`, `response.content_part.*`, `response.function_call_arguments.*`)
 3. Tool names not declared in the `tools` schema are strictly rejected and will not be emitted as valid tool calls
 4. `responses` supports and enforces `tool_choice` (`auto`/`none`/`required`/forced function); `required` violations return `422` for non-stream and `response.failed` for stream
-5. Valid tool call events are only emitted after passing policy validation, preventing invalid tool names from entering the client execution chain
+5. The output protocol follows the client request (OpenAI / Claude / Gemini native shapes); model-side prompting can prefer XML, and the compatibility layer handles the protocol-specific translation
 > Note: the current parser still prioritizes “parse successfully whenever possible”; hard allow-list rejection for undeclared tool names is not enabled yet.
 ## Local Dev Packet Capture
-This is for debugging issues such as Responses reasoning streaming and tool-call handoff. When enabled, DS2API stores the latest N DeepSeek conversation payload pairs (request body + upstream response body), defaulting to 5 entries with auto-eviction.
+This is for debugging issues such as Responses reasoning streaming and tool-call handoff. When enabled, DS2API stores the latest N DeepSeek conversation payload pairs (request body + upstream response body), defaulting to 20 entries with auto-eviction; each response body is capped at 5 MB by default.
 Enable example:
 ```bash
 DS2API_DEV_PACKET_CAPTURE=true \
-DS2API_DEV_PACKET_CAPTURE_LIMIT=5 \
+DS2API_DEV_PACKET_CAPTURE_LIMIT=20 \
 go run ./cmd/ds2api
 ```
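The retention behavior described above (keep the latest N captures, evict the overflow, cap each response body) can be sketched as a small bounded store. This is not the `internal/devcapture` implementation: `captureStore`, `capture`, and `Add` are hypothetical names, the sketch is not goroutine-safe, and it truncates by bytes without regard to UTF-8 boundaries.

```go
package main

import "fmt"

// capture is a hypothetical record; the two response fields mirror the
// `response_body` and `response_truncated` fields named in the README.
type capture struct {
	ID                string
	ResponseBody      string
	ResponseTruncated bool
}

// captureStore keeps at most limit entries, newest first.
type captureStore struct {
	limit        int
	maxBodyBytes int
	items        []capture
}

func newCaptureStore(limit, maxBodyBytes int) *captureStore {
	return &captureStore{limit: limit, maxBodyBytes: maxBodyBytes}
}

// Add truncates an oversized body, prepends the entry, and evicts the oldest
// entries once the store exceeds its limit.
func (s *captureStore) Add(id, body string) {
	truncated := false
	if len(body) > s.maxBodyBytes {
		body = body[:s.maxBodyBytes]
		truncated = true
	}
	entry := capture{ID: id, ResponseBody: body, ResponseTruncated: truncated}
	s.items = append([]capture{entry}, s.items...)
	if len(s.items) > s.limit {
		s.items = s.items[:s.limit]
	}
}

func main() {
	// Tiny limits so the eviction and truncation are visible; the README
	// defaults are 20 entries and 5242880 bytes.
	store := newCaptureStore(2, 8)
	store.Add("a", "short")
	store.Add("b", "a body longer than eight bytes")
	store.Add("c", "ok")
	for _, c := range store.items {
		fmt.Printf("%s truncated=%v body=%q\n", c.ID, c.ResponseTruncated, c.ResponseBody)
	}
}
```

Keeping the slice newest-first matches the documented ordering of `GET /admin/dev/captures`.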
@@ -414,6 +419,8 @@ Inspect/clear (Admin JWT required):
 - `GET /admin/dev/captures`: list captured items (newest first)
 - `DELETE /admin/dev/captures`: clear captured items
+- `GET /admin/dev/raw-samples/query?q=keyword&limit=20`: search current in-memory captures by prompt keyword and group `completion + continue` by `chat_session_id`
+- `POST /admin/dev/raw-samples/save`: persist a selected capture chain as `tests/raw_stream_samples/<sample-id>/`
 Response fields include:
@@ -421,69 +428,10 @@ Response fields include:
 - `response_body`: concatenated raw upstream stream body text
 - `response_truncated`: whether body-size truncation happened
-## Project Structure
+The save endpoint can target a chain by `query`, `chain_key`, or `capture_id`. Example:
-```text
+```json
-ds2api/
+{"query":"Guangzhou weather","sample_id":"gz-weather-from-memory"}
-├── app/                     # Unified HTTP handler assembly (shared by local + serverless)
-├── cmd/
-│   ├── ds2api/              # Local / container entrypoint
-│   └── ds2api-tests/        # End-to-end testsuite entrypoint
-├── api/
-│   ├── index.go             # Vercel Serverless Go entry
-│   ├── chat-stream.js       # Vercel Node.js stream relay
-│   └── (rewrite targets in vercel.json)
-├── internal/
-│   ├── account/             # Account pool and concurrency queue
-│   ├── adapter/
-│   │   ├── openai/          # OpenAI adapter (incl. tool call parsing, Vercel stream prepare/release)
-│   │   ├── claude/          # Claude adapter
-│   │   └── gemini/          # Gemini adapter (generateContent / streamGenerateContent)
-│   ├── admin/               # Admin API handlers (incl. Settings hot-reload)
-│   ├── auth/                # Auth and JWT
-│   ├── claudeconv/          # Claude message format conversion
-│   ├── compat/              # Go-version compatibility and regression helpers
-│   ├── config/              # Config loading, validation, and hot-reload
-│   ├── deepseek/            # DeepSeek API client, PoW WASM
-│   ├── js/                  # Node runtime stream/compat logic
-│   ├── devcapture/          # Dev packet capture module
-│   ├── rawsample/           # Visible-text extraction and replay helpers for raw stream samples
-│   ├── format/              # Output formatting
-│   ├── prompt/              # Prompt construction
-│   ├── server/              # HTTP routing and middleware (chi router)
-│   ├── sse/                 # SSE parsing utilities
-│   ├── stream/              # Unified stream consumption engine
-│   ├── testsuite/           # End-to-end testsuite framework and case orchestration
-│   ├── translatorcliproxy/  # CLIProxy bridge and stream writer components
-│   ├── util/                # Common utilities
-│   ├── version/             # Version parsing/comparison and tag normalization
-│   └── webui/               # WebUI static file serving and auto-build
-├── webui/                   # React WebUI source (Vite + Tailwind)
-│   └── src/
-│       ├── app/             # Routing, auth, config state
-│       ├── features/        # Feature modules (account/settings/vercel/apiTester)
-│       ├── components/      # Shared UI pieces (login/landing, etc.)
-│       └── locales/         # Language packs (zh.json / en.json)
-├── scripts/
-│   └── build-webui.sh       # Manual WebUI build script
-├── tests/
-│   ├── compat/              # Compatibility fixtures and expected outputs
-│   ├── node/                # Node-side unit tests (chat-stream / tool-sieve)
-│   ├── raw_stream_samples/  # Raw SSE samples and replay metadata
-│   └── scripts/             # Unified test script entrypoints (unit/e2e)
-├── docs/                    # Deployment / contributing / testing docs
-├── static/admin/            # WebUI build output (not committed to Git)
-├── .github/
-│   ├── workflows/           # GitHub Actions (quality gates + release automation)
-│   ├── ISSUE_TEMPLATE/      # Issue templates
-│   └── PULL_REQUEST_TEMPLATE.md
-├── config.example.json      # Config file template
-├── .env.example             # Environment variable template
-├── Dockerfile               # Multi-stage build (WebUI + Go)
-├── docker-compose.yml       # Production Docker Compose
-├── docker-compose.dev.yml   # Development Docker Compose
-├── vercel.json              # Vercel routing and build config
-└── go.mod / go.sum          # Go module dependencies
 ```
 ## Documentation Index
--- a/2
+++ b/2
@@ -1 +1 @@
-3.1.0
+3.4.0
--- a/cmd/ds2api-tests/main.go
+++ b/cmd/ds2api-tests/main.go
@@ -30,8 +30,8 @@ func main() {
 	opts.Timeout = time.Duration(timeoutSeconds) * time.Second
 	if err := testsuite.Run(context.Background(), opts); err != nil {
-		fmt.Fprintln(os.Stderr, err.Error())
+		_, _ = fmt.Fprintln(os.Stderr, err.Error())
 		os.Exit(1)
 	}
-	fmt.Fprintln(os.Stdout, "testsuite completed successfully")
+	_, _ = fmt.Fprintln(os.Stdout, "testsuite completed successfully")
 }
--- a/docs/ARCHITECTURE.en.md
+++ b/docs/ARCHITECTURE.en.md
@@ -0,0 +1,136 @@
 # DS2API Architecture & Project Layout
 Language: [中文](ARCHITECTURE.md) | [English](ARCHITECTURE.en.md)
 > This file is the single architecture source for directory layout, module boundaries, and execution flow.
 ## 1. Top-level Layout (expanded)
 > Notes: this is the **fully expanded** project directory list (excluding metadata/dependency dirs such as `.git/` and `webui/node_modules/`), with each folder annotated by purpose.
 ```text
 ds2api/
 ├── .github/                              # GitHub collaboration and CI config
 │   ├── ISSUE_TEMPLATE/                   # Issue templates
 │   └── workflows/                        # GitHub Actions workflows
 ├── api/                                  # Serverless entrypoints (Vercel Go/Node)
 ├── app/                                  # Application-level handler assembly
 ├── cmd/                                  # Executable entrypoints
 │   ├── ds2api/                           # Main service bootstrap
 │   └── ds2api-tests/                     # E2E testsuite CLI bootstrap
 ├── docs/                                 # Project documentation
 ├── internal/                             # Core implementation (non-public packages)
 │   ├── account/                          # Account pool, inflight slots, waiting queue
 │   ├── adapter/                          # Multi-protocol adapters
 │   │   ├── claude/                       # Claude protocol adapter
 │   │   ├── gemini/                       # Gemini protocol adapter
 │   │   └── openai/                       # OpenAI adapter and shared execution core
 │   ├── admin/                            # Admin API (config/accounts/ops)
 │   ├── auth/                             # Auth/JWT/credential resolution
 │   ├── claudeconv/                       # Claude message conversion helpers
 │   ├── compat/                           # Compatibility and regression helpers
 │   ├── config/                           # Config loading/validation/hot reload
 │   ├── deepseek/                         # DeepSeek upstream client capabilities
 │   │   └── transport/                    # DeepSeek transport details
 │   ├── devcapture/                       # Dev capture and troubleshooting
 │   ├── format/                           # Response formatting layer
 │   │   ├── claude/                       # Claude output formatting
 │   │   └── openai/                       # OpenAI output formatting
 │   ├── js/                               # Node runtime related logic
 │   │   ├── chat-stream/                  # Node streaming bridge
 │   │   ├── helpers/                      # JS helper modules
 │   │   │   └── stream-tool-sieve/        # JS implementation of tool sieve
 │   │   └── shared/                       # Shared semantics between Go/Node
 │   ├── prompt/                           # Prompt composition
 │   ├── rawsample/                        # Raw sample read/write and management
 │   ├── server/                           # Router and middleware assembly
 │   ├── sse/                              # SSE parsing utilities
 │   ├── stream/                           # Unified stream consumption engine
 │   ├── testsuite/                        # Testsuite execution framework
 │   ├── textclean/                        # Text cleanup
 │   ├── toolcall/                         # Tool-call parsing and repair
 │   ├── translatorcliproxy/               # Cross-protocol translation bridge
 │   ├── util/                             # Shared utility helpers
 │   ├── version/                          # Version query/compare
 │   └── webui/                            # WebUI static hosting logic
 ├── plans/                                # Stage plans and manual QA records
 ├── pow/                                  # PoW standalone implementation + benchmarks
 ├── scripts/                              # Build/release helper scripts
 ├── tests/                                # Test assets and scripts
 │   ├── compat/                           # Compatibility fixtures + expected outputs
 │   │   ├── expected/                     # Expected output samples
 │   │   └── fixtures/                     # Fixture inputs
 │   │       ├── sse_chunks/               # SSE chunk fixtures
 │   │       └── toolcalls/                # Tool-call fixtures
 │   ├── node/                             # Node unit tests
 │   ├── raw_stream_samples/               # Upstream raw SSE samples
 │   │   ├── content-filter-trigger-20260405-jwt3/          # Content-filter terminal sample
 │   │   ├── continue-thinking-snapshot-replay-20260405/    # Continue-thinking sample
 │   │   ├── guangzhou-weather-reasoner-search-20260404/    # Search/reference sample
 │   │   ├── markdown-format-example-20260405/              # Markdown sample
 │   │   └── markdown-format-example-20260405-spacefix/     # Space-fix sample
 │   ├── scripts/                          # Test entry scripts
 │   └── tools/                            # Testing helper tools
 └── webui/                                # React admin console source
    ├── public/                           # Static assets
    └── src/                              # Frontend source code
        ├── app/                          # Routing/state scaffolding
        ├── components/                   # Shared UI components
        ├── features/                     # Feature modules
        │   ├── account/                  # Account management page
        │   ├── apiTester/                # API tester page
        │   ├── settings/                 # Settings page
        │   └── vercel/                   # Vercel sync page
        ├── layout/                       # Layout components
        ├── locales/                      # i18n strings
        └── utils/                        # Frontend utilities
 ```
 ## 2. Primary Request Flow
 ```mermaid
 flowchart LR
    C[Client/SDK] --> R[internal/server/router.go]
    R --> OA[OpenAI Adapter]
    R --> CA[Claude Adapter]
    R --> GA[Gemini Adapter]
    R --> AD[Admin API]
    CA --> BR[translatorcliproxy]
    GA --> BR
    BR --> CORE[internal/adapter/openai ChatCompletions]
    OA --> CORE
    CORE --> AUTH[internal/auth + config key/account resolver]
    CORE --> POOL[internal/account queue + concurrency]
    CORE --> TOOL[internal/toolcall parser + sieve]
    CORE --> DS[internal/deepseek client]
    DS --> U[DeepSeek upstream]
 ```
 ## 3. Responsibilities in `internal/`
 - `internal/server`: router tree + middlewares (health, protocol routes, Admin/WebUI).
 - `internal/adapter/openai`: shared execution core (chat/responses/embeddings + tool semantics).
 - `internal/adapter/{claude,gemini}`: protocol wrappers only (no duplicated upstream execution).
 - `internal/translatorcliproxy`: structure translation between Claude/Gemini and OpenAI.
 - `internal/deepseek`: upstream request/session/PoW/SSE handling.
 - `internal/stream` + `internal/sse`: stream parsing and incremental assembly.
 - `internal/toolcall`: JSON/XML/invoke/text-kv tool-call parsing + anti-leak sieve.
 - `internal/admin`: config/accounts/vercel sync/version/dev-capture endpoints.
 - `internal/config`: config loading/validation + runtime settings hot-reload.
 - `internal/account`: managed account pool, inflight slots, waiting queue.
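The "inflight slots + waiting queue" responsibility of `internal/account` can be illustrated with a per-account limiter. This is a sketch under assumptions, not the package's real API: `accountSlots`, `Acquire`, `Release`, and `errQueueFull` are hypothetical names, and a real implementation would presumably also handle request cancellation and account selection.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
	"time"
)

var errQueueFull = errors.New("wait queue full")

// accountSlots limits in-flight requests for one account and bounds the
// number of callers allowed to wait for a slot.
type accountSlots struct {
	mu          sync.Mutex
	maxInflight int
	maxQueue    int
	inflight    int
	waiters     []chan struct{}
}

// Acquire takes a free slot, waits in the queue, or fails if the queue is full.
func (a *accountSlots) Acquire() error {
	a.mu.Lock()
	if a.inflight < a.maxInflight {
		a.inflight++
		a.mu.Unlock()
		return nil
	}
	if len(a.waiters) >= a.maxQueue {
		a.mu.Unlock()
		return errQueueFull
	}
	ch := make(chan struct{})
	a.waiters = append(a.waiters, ch)
	a.mu.Unlock()
	<-ch // Release hands the slot over by closing this channel
	return nil
}

// Release passes the slot to the oldest waiter, or frees it if nobody waits.
func (a *accountSlots) Release() {
	a.mu.Lock()
	defer a.mu.Unlock()
	if len(a.waiters) > 0 {
		ch := a.waiters[0]
		a.waiters = a.waiters[1:]
		close(ch) // inflight stays unchanged: the slot changes owner
		return
	}
	a.inflight--
}

// Queued reports how many callers are currently waiting.
func (a *accountSlots) Queued() int {
	a.mu.Lock()
	defer a.mu.Unlock()
	return len(a.waiters)
}

func main() {
	// 2 matches the documented default of DS2API_ACCOUNT_MAX_INFLIGHT.
	slots := &accountSlots{maxInflight: 2, maxQueue: 1}
	_ = slots.Acquire()
	_ = slots.Acquire()

	done := make(chan struct{})
	go func() {
		_ = slots.Acquire() // blocks until a slot is released
		close(done)
	}()
	for slots.Queued() == 0 {
		time.Sleep(time.Millisecond)
	}

	fmt.Println("overflow:", slots.Acquire())
	slots.Release()
	<-done
	fmt.Println("waiter got the slot")
}
```

Handing the slot directly to the oldest waiter keeps ordering first-in, first-out and avoids a window in which a new caller could jump the queue.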
 ## 4. WebUI Runtime Relation
 - `webui/` stores frontend source (Vite + React).
 - Runtime serves static output from `static/admin`.
 - On first local startup, if `static/admin` is missing, DS2API may auto-build it (Node.js required).
 ## 5. Documentation Split Strategy
 - Onboarding & quick start: `README.MD` / `README.en.md`
 - Architecture & layout: `docs/ARCHITECTURE*.md` (this file)
 - API contracts: `API.md` / `API.en.md`
 - Deployment/testing/contributing: `docs/DEPLOY*`, `docs/TESTING.md`, `docs/CONTRIBUTING*`
 - Deep topics: `docs/toolcall-semantics.md`, `docs/DeepSeekSSE行为结构说明-2026-04-05.md`
--- a/docs/ARCHITECTURE.md
+++ b/docs/ARCHITECTURE.md
@@ -0,0 +1,136 @@
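 # DS2API Architecture and Project Structure
 Language: [中文](ARCHITECTURE.md) | [English](ARCHITECTURE.en.md)
 > This document is the central place for maintaining the code directory layout, module boundaries, and main request-flow call relationships.
 ## 1. Top-level Directory Structure (expanded)
 > Note: the following is the **full expansion** of the business-relevant directories in the repository (excluding dependency/metadata directories such as `.git/` and `webui/node_modules/`), with the purpose of each folder annotated.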
 ```text
 ds2api/
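 ├── .github/                              # GitHub collaboration and CI config
 │   ├── ISSUE_TEMPLATE/                   # Issue templates
 │   └── workflows/                        # GitHub Actions workflows
 ├── api/                                  # Serverless entrypoints (Vercel Go/Node)
 ├── app/                                  # Application-level handler assembly layer
 ├── cmd/                                  # Executable entrypoints
 │   ├── ds2api/                           # Main service startup entrypoint
 │   └── ds2api-tests/                     # E2E testsuite CLI entrypoint
 ├── docs/                                 # Project documentation
 ├── internal/                             # Core business implementation (not exposed externally)
 │   ├── account/                          # Account pool, concurrency slots, waiting queue
 │   ├── adapter/                          # Multi-protocol adapter layer
 │   │   ├── claude/                       # Claude protocol adapter
 │   │   ├── gemini/                       # Gemini protocol adapter
 │   │   └── openai/                       # OpenAI protocol and unified execution core
 │   ├── admin/                            # Admin API (config/accounts/ops)
 │   ├── auth/                             # Auth/JWT/credential resolution
 │   ├── claudeconv/                       # Claude message format conversion helpers
 │   ├── compat/                           # Compatibility helpers and regression support
 │   ├── config/                           # Config loading, validation, hot updates
 │   ├── deepseek/                         # DeepSeek upstream client capabilities
 │   │   └── transport/                    # DeepSeek transport-layer details
 │   ├── devcapture/                       # Dev packet capture and debug collection
 │   ├── format/                           # Response formatting layer
 │   │   ├── claude/                       # Claude output formatting
 │   │   └── openai/                       # OpenAI output formatting
 │   ├── js/                               # Node runtime related logic
 │   │   ├── chat-stream/                  # Node streaming output bridge
 │   │   ├── helpers/                      # JS helper functions
 │   │   │   └── stream-tool-sieve/        # JS implementation of the tool sieve
 │   │   └── shared/                       # Semantics shared between Go/Node
 │   ├── prompt/                           # Prompt composition
 │   ├── rawsample/                        # Raw sample read/write and management
 │   ├── server/                           # Router and middleware assembly
 │   ├── sse/                              # SSE parsing utilities
 │   ├── stream/                           # Unified stream consumption engine
 │   ├── testsuite/                        # Testsuite execution framework
 │   ├── textclean/                        # Text cleanup
 │   ├── toolcall/                         # Tool-call parsing and repair
 │   ├── translatorcliproxy/               # Cross-protocol translation bridge
 │   ├── util/                             # Common utility functions
 │   ├── version/                          # Version query/compare
 │   └── webui/                            # WebUI static hosting logic
 ├── plans/                                # Stage plans and manual acceptance records
 ├── pow/                                  # Standalone PoW implementation and benchmarks
 ├── scripts/                              # Build/release/helper scripts
 ├── tests/                                # Test assets and scripts
 │   ├── compat/                           # Compatibility fixtures and expected outputs
 │   │   ├── expected/                     # Expected result samples
 │   │   └── fixtures/                     # Test input fixtures
 │   │       ├── sse_chunks/               # SSE chunk fixtures
 │   │       └── toolcalls/                # Tool-call fixtures
 │   ├── node/                             # Node unit tests
 │   ├── raw_stream_samples/               # Upstream raw SSE samples
 │   │   ├── content-filter-trigger-20260405-jwt3/          # Content-filter (risk control) terminal-state sample
 │   │   ├── continue-thinking-snapshot-replay-20260405/    # Continue sample
 │   │   ├── guangzhou-weather-reasoner-search-20260404/    # Search + reference sample
 │   │   ├── markdown-format-example-20260405/              # Markdown sample
 │   │   └── markdown-format-example-20260405-spacefix/     # Space-fix sample
 │   ├── scripts/                          # Test script entrypoints
 │   └── tools/                            # Testing helper tools
 └── webui/                                # React admin console source
    ├── public/                           # Static assets
    └── src/                              # Frontend source code
        ├── app/                          # Routing/state scaffolding
        ├── components/                   # Shared components
        ├── features/                     # Feature modules
        │   ├── account/                  # Account management page
        │   ├── apiTester/                # API tester page
        │   ├── settings/                 # Settings page
        │   └── vercel/                   # Vercel sync page
        ├── layout/                       # Layout components
        ├── locales/                      # i18n strings
        └── utils/                        # Frontend utility functions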
 ```
 ## 2. Main Request Flow
 ```mermaid
 flowchart LR
    C[Client/SDK] --> R[internal/server/router.go]
    R --> OA[OpenAI Adapter]
    R --> CA[Claude Adapter]
    R --> GA[Gemini Adapter]
    R --> AD[Admin API]
    CA --> BR[translatorcliproxy]
    GA --> BR
    BR --> CORE[internal/adapter/openai ChatCompletions]
    OA --> CORE
    CORE --> AUTH[internal/auth + config key/account resolver]
    CORE --> POOL[internal/account queue + concurrency]
    CORE --> TOOL[internal/toolcall parser + sieve]
    CORE --> DS[internal/deepseek client]
    DS --> U[DeepSeek upstream]
 ```
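The delegation edge in the flowchart (protocol adapter to the shared OpenAI-style core) can be sketched as follows. This is a minimal illustration, not the repository's code: `coreHandler`, `adapterHandler`, and `relay` are hypothetical names, and the "translation" step is a placeholder string prefix. The only detail taken from this PR is the pattern of calling the core through an in-memory recorder and copying its response back, which the Claude handler diff shows with `httptest.NewRecorder()`.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
)

// coreHandler stands in for the shared execution core (hypothetical).
// It simply echoes the request body it received.
func coreHandler(w http.ResponseWriter, r *http.Request) {
	body, _ := io.ReadAll(r.Body)
	w.Header().Set("Content-Type", "application/json")
	_, _ = fmt.Fprintf(w, `{"echo":%q}`, string(body))
}

// adapterHandler wraps a core handler: it rewrites the incoming body
// (placeholder translation), invokes the core against an in-memory
// recorder, then copies status, headers, and body to the real writer.
func adapterHandler(core http.HandlerFunc) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		in, _ := io.ReadAll(r.Body)
		translated := "translated:" + string(in) // assumption: real code converts protocol structures here
		proxyReq := httptest.NewRequest(http.MethodPost, "/v1/chat/completions", strings.NewReader(translated))
		rec := httptest.NewRecorder()
		core(rec, proxyReq)
		res := rec.Result()
		defer func() { _ = res.Body.Close() }()
		out, _ := io.ReadAll(res.Body)
		for k, vv := range res.Header {
			for _, v := range vv {
				w.Header().Add(k, v)
			}
		}
		w.WriteHeader(res.StatusCode)
		_, _ = w.Write(out)
	}
}

// relay drives the adapter once and returns what a client would see.
func relay(body string) (int, string) {
	req := httptest.NewRequest(http.MethodPost, "/anthropic/v1/messages", strings.NewReader(body))
	rec := httptest.NewRecorder()
	adapterHandler(coreHandler)(rec, req)
	return rec.Code, rec.Body.String()
}

func main() {
	code, out := relay("hi")
	fmt.Println(code, out)
}
```

Keeping the adapters thin in this way means upstream calls, account selection, and tool-call handling live in one place, which is the property the diagram is meant to convey.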
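 ## 3. internal/ Submodule Responsibilities
 - `internal/server`: route tree and middleware mounting (health checks, protocol entrypoints, Admin/WebUI).
 - `internal/adapter/openai`: the unified execution core (chat/responses/embeddings and tool-calling semantics).
 - `internal/adapter/{claude,gemini}`: protocol input/output adaptation; they do not reimplement the upstream call logic.
 - `internal/translatorcliproxy`: structure translation between Claude/Gemini and OpenAI.
 - `internal/deepseek`: upstream requests, sessions, PoW, and SSE consumption.
 - `internal/stream` + `internal/sse`: stream parsing and incremental processing.
 - `internal/toolcall`: JSON/XML/invoke/text-kv tool-call parsing and leak-prevention sieving.
 - `internal/admin`: config management, account management, Vercel sync, version checks, and dev packet capture.
 - `internal/config`: config loading, validation, and runtime settings hot-reload.
 - `internal/account`: managed account pool, concurrency slots, and wait queue.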
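To make the "concurrency slots plus wait queue" idea in the last bullet concrete, here is a minimal sketch using a buffered channel as a counting semaphore. It is an assumption about the general technique, not a copy of `internal/account`; `slotPool`, `Acquire`, `TryAcquire`, and `Release` are hypothetical names.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// errNoSlot is returned when the caller's context ends before a slot frees up.
var errNoSlot = errors.New("no free slot before the context ended")

// slotPool limits in-flight requests. The buffered channel holds one token
// per busy slot; goroutines blocked in Acquire form the wait queue.
type slotPool struct {
	slots chan struct{}
}

func newSlotPool(maxInflight int) *slotPool {
	return &slotPool{slots: make(chan struct{}, maxInflight)}
}

// Acquire waits for a free slot until ctx is cancelled or times out.
func (p *slotPool) Acquire(ctx context.Context) error {
	select {
	case p.slots <- struct{}{}:
		return nil
	case <-ctx.Done():
		return errNoSlot
	}
}

// TryAcquire takes a slot only if one is free right now.
func (p *slotPool) TryAcquire() bool {
	select {
	case p.slots <- struct{}{}:
		return true
	default:
		return false
	}
}

// Release frees one previously acquired slot.
func (p *slotPool) Release() { <-p.slots }

func main() {
	pool := newSlotPool(1)
	_ = pool.Acquire(context.Background()) // takes the only slot

	ctx, cancel := context.WithTimeout(context.Background(), 10*time.Millisecond)
	defer cancel()
	fmt.Println("second caller:", pool.Acquire(ctx)) // queue wait times out

	pool.Release()
	fmt.Println("after release:", pool.TryAcquire())
}
```

A channel-based limiter does not guarantee strict FIFO ordering among waiters; a pool that needs ordered queuing or per-account limits would need an explicit queue on top of this.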
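 ## 4. WebUI and Runtime Relationship
 - `webui/` is the frontend source (Vite + React).
 - The directory served at runtime is `static/admin` (the build output).
 - On first local startup, if `static/admin` is missing, the service attempts an automatic build (requires Node.js).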
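The startup check described in the last bullet might look roughly like the sketch below. It is illustrative only: `shouldAutoBuild` and `dirExists` are hypothetical helpers, and the handling of `DS2API_AUTO_BUILD_WEBUI` (the variable named in the deploy guide) is an assumption. In particular, the docs do not say how `true` differs from leaving the variable unset, so this sketch treats them the same.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// dirExists reports whether path exists and is a directory.
func dirExists(path string) bool {
	info, err := os.Stat(path)
	return err == nil && info.IsDir()
}

// shouldAutoBuild decides whether to attempt a WebUI build at startup.
// Assumption: an explicit "false" always disables the build; otherwise
// a build is attempted only when the admin directory is missing.
func shouldAutoBuild(envValue string, adminDirExists bool) bool {
	if strings.EqualFold(strings.TrimSpace(envValue), "false") {
		return false
	}
	return !adminDirExists
}

func main() {
	env := os.Getenv("DS2API_AUTO_BUILD_WEBUI")
	if shouldAutoBuild(env, dirExists("static/admin")) {
		fmt.Println("static/admin missing: would run the WebUI build (needs Node.js/npm)")
		return
	}
	fmt.Println("skipping WebUI auto-build")
}
```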
 ## 5. Documentation Split Strategy
 - Overview and quick start: `README.MD` / `README.en.md`
 - Architecture and directory layout: `docs/ARCHITECTURE*.md` (this file)
 - API protocol: `API.md` / `API.en.md`
 - Deployment, testing, contributing: `docs/DEPLOY*`, `docs/TESTING.md`, `docs/CONTRIBUTING*`
 - Topical docs: `docs/toolcall-semantics.md`, `docs/DeepSeekSSE行为结构说明-2026-04-05.md`
--- a/docs/CONTRIBUTING.en.md
+++ b/docs/CONTRIBUTING.en.md
@@ -41,6 +41,7 @@ npm install
 # 3. Start dev server (hot reload)
 npm run dev
 # Default: http://localhost:5173, auto-proxies API to backend
 # host: 0.0.0.0 is not configured, so LAN access is not enabled by default
 ```
 WebUI tech stack:
@@ -58,7 +59,7 @@ docker-compose -f docker-compose.dev.yml up
 | Language | Standards |
 | --- | --- |
-| **Go** | Run `gofmt` and ensure `go test ./...` passes before committing |
+| **Go** | Run `./scripts/lint.sh` (gofmt + golangci-lint) and ensure `go test ./...` passes before committing |
 | **JavaScript/React** | Follow existing project style (functional components) |
 | **Commit messages** | Use semantic prefixes: `feat:`, `fix:`, `docs:`, `refactor:`, `style:`, `perf:`, `chore:` |
@@ -93,58 +94,12 @@ Manually build WebUI to `static/admin/`:
 ## Project Structure
-```text
-ds2api/
-├── app/                     # Shared HTTP handler assembly (local + serverless)
-├── cmd/
-│   ├── ds2api/              # Local/container entrypoint
-│   └── ds2api-tests/        # End-to-end testsuite entrypoint
+To avoid documentation drift, directory layout and module responsibilities were moved to:
+
+- [docs/ARCHITECTURE.en.md](./ARCHITECTURE.en.md)
+- [docs/README.md](./README.md)
+
+Before contributing, review the architecture doc sections for request flow and `internal/` module boundaries.
 ├── api/
 │   ├── index.go             # Vercel Serverless Go entry
 │   ├── chat-stream.js       # Vercel Node.js stream relay
 │   └── (rewrite targets in vercel.json)
 ├── internal/
 │   ├── account/             # Account pool and concurrency queue
 │   ├── adapter/
 │   │   ├── openai/          # OpenAI adapter
 │   │   ├── claude/          # Claude adapter
 │   │   └── gemini/          # Gemini adapter
 │   ├── admin/               # Admin API handlers
 │   ├── auth/                # Auth and JWT
 │   ├── claudeconv/          # Claude message conversion
 │   ├── compat/              # Go-version compatibility and regression helpers
 │   ├── config/              # Config loading, validation, and hot-reload
 │   ├── deepseek/            # DeepSeek client, PoW WASM
 │   ├── js/                  # Node runtime stream/compat logic
 │   ├── devcapture/          # Dev packet capture
 │   ├── format/              # Output formatting
 │   ├── prompt/              # Prompt building
 │   ├── server/              # HTTP routing (chi router)
 │   ├── sse/                 # SSE parsing utilities
 │   ├── stream/              # Unified stream consumption engine
 │   ├── testsuite/           # Testsuite framework and scenario orchestration
 │   ├── translatorcliproxy/  # CLIProxy bridge and stream writer
 │   ├── util/                # Common utilities
 │   ├── version/             # Version parsing and comparison
 │   └── webui/               # WebUI static hosting
 ├── webui/                   # React WebUI source
 │   └── src/
 │       ├── app/             # Routing, auth, config state
 │       ├── features/        # Feature modules
 │       ├── components/      # Shared components
 │       └── locales/         # Language packs
 ├── scripts/                 # Build and test scripts
 ├── tests/
 │   ├── compat/              # Compatibility fixtures and expected outputs
 │   ├── node/                # Node-side unit tests
 │   └── scripts/             # Test script entrypoints (unit/e2e)
 ├── plans/                   # Plans, gates, and manual smoke-test records
 ├── static/admin/            # WebUI build output (not committed)
 ├── Dockerfile               # Multi-stage build
 ├── docker-compose.yml       # Production
 ├── docker-compose.dev.yml   # Development
 └── vercel.json              # Vercel config
 ```
 ## Reporting Issues
--- a/docs/CONTRIBUTING.md
+++ b/docs/CONTRIBUTING.md
@@ -41,6 +41,7 @@ npm install
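 # 3. Start dev server (hot reload)
 npm run dev
 # Listens on http://localhost:5173 by default and auto-proxies API requests to the backend
 # host: 0.0.0.0 is not currently configured, so LAN access is not enabled by default
 ```
 WebUI tech stack: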
@@ -58,7 +59,7 @@ docker-compose -f docker-compose.dev.yml up
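 | Language | Standards |
 | --- | --- |
-| **Go** | Run `gofmt` before committing and ensure `go test ./...` passes |
+| **Go** | Run `./scripts/lint.sh` (includes gofmt + golangci-lint) before committing and ensure `go test ./...` passes |
 | **JavaScript/React** | Keep the existing code style (functional components) |
 | **Commit messages** | Use semantic prefixes: `feat:`, `fix:`, `docs:`, `refactor:`, `style:`, `perf:`, `chore:` |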
@@ -93,58 +94,12 @@ docker-compose -f docker-compose.dev.yml up
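 ## Project Structure
-```text
-ds2api/
-├── app/                     # Unified HTTP handler assembly (local + Serverless)
-├── cmd/
-│   ├── ds2api/              # Local/container entrypoint
-│   └── ds2api-tests/        # End-to-end testsuite entrypoint
+To avoid maintaining duplicate content across documents, the directory layout and module responsibilities have moved to:
+
+- [docs/ARCHITECTURE.md](./ARCHITECTURE.md)
+- [docs/README.md](./README.md)
+
+Before contributing, read the main request flow and `internal/` module responsibilities sections of the architecture doc first, then scope your change.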
 ├── api/
 │   ├── index.go             # Vercel Serverless Go entry
 │   ├── chat-stream.js       # Vercel Node.js stream relay
 │   └── (rewrite targets in vercel.json)
 ├── internal/
 │   ├── account/             # Account pool and concurrency queue
 │   ├── adapter/
 │   │   ├── openai/          # OpenAI-compatible adapter
 │   │   ├── claude/          # Claude-compatible adapter
 │   │   └── gemini/          # Gemini-compatible adapter
 │   ├── admin/               # Admin API handlers
 │   ├── auth/                # Auth and JWT
 │   ├── claudeconv/          # Claude message format conversion
 │   ├── compat/              # Go-version compatibility and regression-test helpers
 │   ├── config/              # Config loading, validation, and hot-reload
 │   ├── deepseek/            # DeepSeek client, PoW WASM
 │   ├── js/                  # Node runtime stream/compat logic
 │   ├── devcapture/          # Dev packet capture
 │   ├── format/              # Output formatting
 │   ├── prompt/              # Prompt building
 │   ├── server/              # HTTP routing (chi router)
 │   ├── sse/                 # SSE parsing utilities
 │   ├── stream/              # Unified stream consumption engine
 │   ├── testsuite/           # Testsuite framework and scenario orchestration
 │   ├── translatorcliproxy/  # CLIProxy bridge and stream writing
 │   ├── util/                # Common utilities
 │   ├── version/             # Version parsing and comparison
 │   └── webui/               # WebUI static hosting
 ├── webui/                   # React WebUI source
 │   └── src/
 │       ├── app/             # Routing, auth, config state
 │       ├── features/        # Business feature modules
 │       ├── components/      # Shared components
 │       └── locales/         # Language packs
 ├── scripts/                 # Build and test scripts
 ├── tests/
 │   ├── compat/              # Compatibility fixtures and expected outputs
 │   ├── node/                # Node-side unit tests
 │   └── scripts/             # Test script entrypoints (unit/e2e)
 ├── plans/                   # Plans, gates, and manual smoke-test records
 ├── static/admin/            # WebUI build output (not committed)
 ├── Dockerfile               # Multi-stage build
 ├── docker-compose.yml       # Production
 ├── docker-compose.dev.yml   # Development
 └── vercel.json              # Vercel config
 ```
 ## Reporting Issues
--- a/docs/DEPLOY.en.md
+++ b/docs/DEPLOY.en.md
@@ -4,15 +4,18 @@ Language: [中文](DEPLOY.md) | [English](DEPLOY.en.md)
 This guide covers all deployment methods for the current Go-based codebase.
 Doc map: [Index](./README.md) | [Architecture](./ARCHITECTURE.en.md) | [API](../API.en.md) | [Testing](./TESTING.md)
 ---
 ## Table of Contents
 - [Recommended deployment priority](#recommended-deployment-priority)
 - [Prerequisites](#0-prerequisites)
- [1. Local Run](#1-local-run)
+- [1. Download Release Binaries](#1-download-release-binaries)
- [2. Docker Deployment](#2-docker-deployment)
+- [2. Docker / GHCR Deployment](#2-docker--ghcr-deployment)
 - [3. Vercel Deployment](#3-vercel-deployment)
- [4. Download Release Binaries](#4-download-release-binaries)
+- [4. Local Run from Source](#4-local-run-from-source)
 - [5. Reverse Proxy (Nginx)](#5-reverse-proxy-nginx)
 - [6. Linux systemd Service](#6-linux-systemd-service)
 - [7. Post-Deploy Checks](#7-post-deploy-checks)
@@ -20,6 +23,17 @@ This guide covers all deployment methods for the current Go-based codebase.
 ---
 ## Recommended deployment priority
 Recommended order when choosing a deployment method:
 1. **Download and run release binaries**: the easiest path for most users because the artifacts are already built.
 2. **Docker / GHCR image deployment**: suitable for containerized, orchestrated, or cloud environments.
 3. **Vercel deployment**: suitable if you already use Vercel and accept its platform constraints.
 4. **Run from source / build locally**: suitable for development, debugging, or when you need to modify the code yourself.
 ---
 ## 0. Prerequisites
 | Dependency | Minimum Version | Notes |
@@ -46,70 +60,59 @@ Use `config.json` as the single source of truth:
 ---
-## 1. Local Run
+## 1. Download Release Binaries
-### 1.1 Basic Steps
+Built-in GitHub Actions workflow: `.github/workflows/release-artifacts.yml`
 - **Trigger**: only on Release `published` (no build on normal push)
 - **Outputs**: multi-platform binary archives + `sha256sums.txt`
 - **Container publishing**: GHCR only (`ghcr.io/cjackhwang/ds2api`)
 | Platform | Architecture | Format |
 | --- | --- | --- |
 | Linux | amd64, arm64 | `.tar.gz` |
 | macOS | amd64, arm64 | `.tar.gz` |
 | Windows | amd64 | `.zip` |
 Each archive includes:
 - `ds2api` executable (`ds2api.exe` on Windows)
 - `static/admin/` (built WebUI assets)
 - `config.example.json`, `.env.example`
 - `README.MD`, `README.en.md`, `LICENSE`
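Since each release ships a `sha256sums.txt` next to the archives, a download can be checked before it is extracted. The sketch below shows the idea in Go; it assumes the file uses the common `sha256sum` layout (`<hex digest>  <filename>` per line), which this guide does not confirm, and `verifyChecksum` is a hypothetical helper rather than part of DS2API.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strings"
)

// verifyChecksum finds the entry for name in sums (assumed to be in
// "<hex digest>  <filename>" form) and compares it with the SHA-256 of data.
// It returns false when the entry is missing or the digest differs.
func verifyChecksum(sums, name string, data []byte) bool {
	digest := sha256.Sum256(data)
	got := hex.EncodeToString(digest[:])
	for _, line := range strings.Split(sums, "\n") {
		fields := strings.Fields(line)
		if len(fields) != 2 {
			continue
		}
		// sha256sum prefixes the name with '*' in binary mode.
		if strings.TrimPrefix(fields[1], "*") == name {
			return strings.EqualFold(fields[0], got)
		}
	}
	return false
}

func main() {
	// The digest below is the SHA-256 of empty input, used only as demo data.
	sums := "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855  empty.tar.gz\n"
	fmt.Println(verifyChecksum(sums, "empty.tar.gz", nil))
	fmt.Println(verifyChecksum(sums, "empty.tar.gz", []byte("tampered")))
}
```

In real use the archive bytes and the checksum file would come from `os.ReadFile`; on Linux or macOS, `sha256sum -c` or `shasum -a 256 -c` typically does the same job without any code.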
 ### Usage
 ```bash
-# Clone
+# 1. Download the archive for your platform
-git clone https://github.com/CJackHwang/ds2api.git
+# 2. Extract
-cd ds2api
+tar -xzf ds2api_<tag>_linux_amd64.tar.gz
 cd ds2api_<tag>_linux_amd64
-# Copy and edit config
+# 3. Configure
 cp config.example.json config.json
-# Open config.json and fill in:
+# Edit config.json
 #   - keys: your API access keys
 #   - accounts: DeepSeek accounts (email or mobile + password)
-# Start
+# 4. Start
 go run ./cmd/ds2api
 ```
 Default address: `http://0.0.0.0:5001` (override with `PORT`).
 ### 1.2 WebUI Build
 On first local startup, if `static/admin/` is missing, DS2API will automatically attempt to build the WebUI (requires Node.js/npm; when dependencies are missing it runs `npm ci` first, then `npm run build -- --outDir static/admin --emptyOutDir`).
 Manual build:
 ```bash
 ./scripts/build-webui.sh
 ```
 Or step by step:
 ```bash
 cd webui
 npm install
 npm run build
 # Output goes to static/admin/
 ```
 Control auto-build via environment variable:
 ```bash
 # Disable auto-build
 DS2API_AUTO_BUILD_WEBUI=false go run ./cmd/ds2api
 # Force enable auto-build
 DS2API_AUTO_BUILD_WEBUI=true go run ./cmd/ds2api
 ```
 ### 1.3 Compile to Binary
 ```bash
 go build -o ds2api ./cmd/ds2api
 ./ds2api
 ```
 ### Maintainer Release Flow
 1. Create and publish a GitHub Release (with tag, for example `vX.Y.Z`)
 2. Wait for the `Release Artifacts` workflow to complete
 3. Download the matching archive from Release Assets
 ---
-## 2. Docker Deployment
+## 2. Docker / GHCR Deployment
 ### 2.1 Basic Steps
 ```bash
 # Pull prebuilt image
 docker pull ghcr.io/cjackhwang/ds2api:latest
 # Copy env template and config file
 cp .env.example .env
 cp config.example.json config.json
@@ -126,7 +129,13 @@ docker-compose up -d
 docker-compose logs -f
 ```
-The default `docker-compose.yml` maps host port `6011` to container port `5001`. If you want `5001` exposed directly, set `DS2API_HOST_PORT=5001` (or adjust the `ports` mapping).
+The default `docker-compose.yml` directly uses `ghcr.io/cjackhwang/ds2api:latest` and maps host port `6011` to container port `5001`. If you want `5001` exposed directly, set `DS2API_HOST_PORT=5001` (or adjust the `ports` mapping).
 If you want a pinned version instead of `latest`, you can also pull a specific tag directly:
 ```bash
 docker pull ghcr.io/cjackhwang/ds2api:v3.0.0
 ```
 ### 2.2 Update
@@ -348,58 +357,61 @@ If API responses return Vercel HTML `Authentication Required`:
 ---
-## 4. Download Release Binaries
+## 4. Local Run from Source
-Built-in GitHub Actions workflow: `.github/workflows/release-artifacts.yml`
+### 4.1 Basic Steps
 - **Trigger**: only on Release `published` (no build on normal push)
 - **Outputs**: multi-platform binary archives + `sha256sums.txt`
 - **Container publishing**: GHCR only (`ghcr.io/cjackhwang/ds2api`)
 | Platform | Architecture | Format |
 | --- | --- | --- |
 | Linux | amd64, arm64 | `.tar.gz` |
 | macOS | amd64, arm64 | `.tar.gz` |
 | Windows | amd64 | `.zip` |
 Each archive includes:
 - `ds2api` executable (`ds2api.exe` on Windows)
 - `static/admin/` (built WebUI assets)
 - `sha3_wasm_bg.7b9ca65ddd.wasm` (optional; binary has embedded fallback)
 - `config.example.json`, `.env.example`
 - `README.MD`, `README.en.md`, `LICENSE`
 ### Usage
 ```bash
-# 1. Download the archive for your platform
+# Clone
-# 2. Extract
+git clone https://github.com/CJackHwang/ds2api.git
-tar -xzf ds2api_<tag>_linux_amd64.tar.gz
+cd ds2api
 cd ds2api_<tag>_linux_amd64
-# 3. Configure
+# Copy and edit config
 cp config.example.json config.json
-# Edit config.json
+# Open config.json and fill in:
 #   - keys: your API access keys
 #   - accounts: DeepSeek accounts (email or mobile + password)
-# 4. Start
+# Start
-./ds2api
+go run ./cmd/ds2api
 ```
-### Maintainer Release Flow
+Default local access URL: `http://127.0.0.1:5001`; the server actually binds to `0.0.0.0:5001` (override with `PORT`).
-1. Create and publish a GitHub Release (with tag, for example `vX.Y.Z`)
+### 4.2 WebUI Build
 2. Wait for the `Release Artifacts` workflow to complete
 3. Download the matching archive from Release Assets
-### Pull from GHCR (Optional)
+On first local startup, if `static/admin/` is missing, DS2API will automatically attempt to build the WebUI (requires Node.js/npm; when dependencies are missing it runs `npm ci` first, then `npm run build -- --outDir static/admin --emptyOutDir`).
 Manual build:
 ```bash
-# latest
+./scripts/build-webui.sh
-docker pull ghcr.io/cjackhwang/ds2api:latest
+```
-# specific version (example)
+Or step by step:
-docker pull ghcr.io/cjackhwang/ds2api:v3.0.0
+
 ```bash
 cd webui
 npm install
 npm run build
 # Output goes to static/admin/
 ```
 Control auto-build via environment variable:
 ```bash
 # Disable auto-build
 DS2API_AUTO_BUILD_WEBUI=false go run ./cmd/ds2api
 # Force enable auto-build
 DS2API_AUTO_BUILD_WEBUI=true go run ./cmd/ds2api
 ```
 ### 4.3 Compile to Binary
 ```bash
 go build -o ds2api ./cmd/ds2api
 ./ds2api
 ```
 ---
@@ -456,8 +468,6 @@ server {
 # Copy compiled binary and related files to target directory
 sudo mkdir -p /opt/ds2api
 sudo cp ds2api config.json /opt/ds2api/
 # Optional: if you want to use an external WASM file (override the embedded one, from a release package or build output)
 # sudo cp /path/to/sha3_wasm_bg.7b9ca65ddd.wasm /opt/ds2api/
 sudo cp -r static/admin /opt/ds2api/static/admin
 ```
--- a/docs/DEPLOY.md
+++ b/docs/DEPLOY.md
@@ -4,15 +4,18 @@
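 This guide is based on the current Go codebase and describes each deployment method in detail.
 Page navigation: [Documentation index](./README.md) | [Architecture](./ARCHITECTURE.md) | [API reference](../API.md) | [Testing guide](./TESTING.md)
 ---
 ## Table of Contents
 - [Recommended deployment priority](#部署方式优先级建议)
 - [Prerequisites](#0-前置要求)
- [1. Local Run](#一本地运行)
+- [1. Download Release Build Packages](#一下载-release-构建包)
- [2. Docker Deployment](#二docker-部署)
+- [2. Docker / GHCR Deployment](#二docker--ghcr-部署)
 - [3. Vercel Deployment](#三vercel-部署)
- [4. Download Release Build Packages](#四下载-release-构建包)
+- [4. Local Run from Source](#四本地源码运行)
 - [5. Reverse Proxy (Nginx)](#五反向代理nginx)
 - [6. Linux systemd Service](#六linux-systemd-服务化)
 - [7. Post-Deploy Checks](#七部署后检查)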
@@ -20,6 +23,17 @@
 ---
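 ## Recommended deployment priority
 Choose a deployment method in the following order:
 1. **Download and run a Release build package**: the least effort; the artifacts are already compiled, so this suits most users.
 2. **Docker / GHCR image deployment**: suitable when you need containerization, orchestration, or cloud deployment.
 3. **Vercel deployment**: suitable if you already have a Vercel environment and accept its platform constraints.
 4. **Run from source / build it yourself**: suitable for development, debugging, or when you need to modify the code.
 ---
 ## 0. Prerequisites
 | Dependency | Minimum Version | Notes |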
@@ -46,70 +60,59 @@ cp config.example.json config.json
 ---
-## 1. Local Run
+## 1. Download Release Build Packages
-### 1.1 Basic Steps
+The repository ships a built-in GitHub Actions workflow: `.github/workflows/release-artifacts.yml`
 - **Trigger**: only when a Release is `published` (a normal push does not build)
 - **Outputs**: multi-platform binary archives + `sha256sums.txt`
 - **Container image publishing**: GHCR only (`ghcr.io/cjackhwang/ds2api`)
 | Platform | Architecture | Format |
 | --- | --- | --- |
 | Linux | amd64, arm64 | `.tar.gz` |
 | macOS | amd64, arm64 | `.tar.gz` |
 | Windows | amd64 | `.zip` |
 Each archive includes:
 - `ds2api` executable (`ds2api.exe` on Windows)
 - `static/admin/` (WebUI build output)
 - `config.example.json`, `.env.example`
 - `README.MD`, `README.en.md`, `LICENSE`
 ### Usage
 ```bash
-# Clone the repository
+# 1. Download the archive for your platform
-git clone https://github.com/CJackHwang/ds2api.git
+# 2. Extract
-cd ds2api
+tar -xzf ds2api_<tag>_linux_amd64.tar.gz
 cd ds2api_<tag>_linux_amd64
-# Copy and edit config
+# 3. Configure
 cp config.example.json config.json
-# Open config.json in your preferred editor and fill in:
+# Edit config.json
 #   - keys: your API access keys
 #   - accounts: DeepSeek accounts (email or mobile + password)
-# Start the service
+# 4. Start
 go run ./cmd/ds2api
 ```
 Listens on `http://0.0.0.0:5001` by default; override with the `PORT` environment variable.
 ### 1.2 WebUI Build
 On first local startup, if `static/admin/` does not exist, the service automatically attempts to build the WebUI (requires Node.js/npm; when dependencies are missing it first runs `npm ci`, then `npm run build -- --outDir static/admin --emptyOutDir`).
 You can also build manually:
 ```bash
 ./scripts/build-webui.sh
 ```
 Or run the steps by hand:
 ```bash
 cd webui
 npm install
 npm run build
 # Output goes to static/admin/
 ```
 Control auto-build behavior via environment variable:
 ```bash
 # Force-disable auto-build
 DS2API_AUTO_BUILD_WEBUI=false go run ./cmd/ds2api
 # Force-enable auto-build
 DS2API_AUTO_BUILD_WEBUI=true go run ./cmd/ds2api
 ```
 ### 1.3 Compile to Binary
 ```bash
 go build -o ds2api ./cmd/ds2api
 ./ds2api
 ```
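 ### Maintainer Release Steps
 1. Create and publish a Release on GitHub (with a tag, e.g. `vX.Y.Z`)
 2. Wait for the `Release Artifacts` Actions workflow to complete
 3. Download the archive for your platform from the Release Assets
 ---
-## 2. Docker Deployment
+## 2. Docker / GHCR Deployment
 ### 2.1 Basic Steps
 ```bash
 # Pull the prebuilt image
 docker pull ghcr.io/cjackhwang/ds2api:latest
 # Copy the env template and config file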
 cp .env.example .env
 cp config.example.json config.json
@@ -126,7 +129,13 @@ docker-compose up -d
 docker-compose logs -f
 ```
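-The default `docker-compose.yml` maps host port `6011` to container port `5001`. If you want to expose `5001` directly, set `DS2API_HOST_PORT=5001` (or adjust the `ports` configuration manually).
+The default `docker-compose.yml` uses `ghcr.io/cjackhwang/ds2api:latest` directly and maps host port `6011` to container port `5001`. If you want to expose `5001` directly, set `DS2API_HOST_PORT=5001` (or adjust the `ports` configuration manually).
 To pin a version, you can also pull a specific tag directly:
 ```bash
 docker pull ghcr.io/cjackhwang/ds2api:v3.0.0
 ```
 ### 2.2 Update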
@@ -348,58 +357,61 @@ No Output Directory named "public" found after the Build completed.
 ---
-## 4. Download Release Build Packages
+## 4. Local Run from Source
-The repository ships a built-in GitHub Actions workflow: `.github/workflows/release-artifacts.yml`
+### 4.1 Basic Steps
 - **Trigger**: only when a Release is `published` (a normal push does not build)
 - **Outputs**: multi-platform binary archives + `sha256sums.txt`
 - **Container image publishing**: GHCR only (`ghcr.io/cjackhwang/ds2api`)
 | Platform | Architecture | Format |
 | --- | --- | --- |
 | Linux | amd64, arm64 | `.tar.gz` |
 | macOS | amd64, arm64 | `.tar.gz` |
 | Windows | amd64 | `.zip` |
 Each archive includes:
 - `ds2api` executable (`ds2api.exe` on Windows)
 - `static/admin/` (WebUI build output)
 - `sha3_wasm_bg.7b9ca65ddd.wasm` (optional; the program has an embedded fallback)
 - `config.example.json`, `.env.example`
 - `README.MD`, `README.en.md`, `LICENSE`
 ### Usage
 ```bash
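-# 1. Download the archive for your platform
+# Clone the repository
-# 2. Extract
+git clone https://github.com/CJackHwang/ds2api.git
-tar -xzf ds2api_<tag>_linux_amd64.tar.gz
+cd ds2api
 cd ds2api_<tag>_linux_amd64
-# 3. Configure
+# Copy and edit config
 cp config.example.json config.json
-# Edit config.json
+# Open config.json in your preferred editor and fill in:
 #   - keys: your API access keys
 #   - accounts: DeepSeek accounts (email or mobile + password)
-# 4. Start
+# Start the service
-./ds2api
+go run ./cmd/ds2api
 ```
-### Maintainer Release Steps
+The default local access URL is `http://127.0.0.1:5001`; the service actually binds to `0.0.0.0:5001`, which can be overridden with the `PORT` environment variable.
-1. Create and publish a Release on GitHub (with a tag, e.g. `vX.Y.Z`)
+### 4.2 WebUI Build
 2. Wait for the `Release Artifacts` Actions workflow to complete
 3. Download the archive for your platform from the Release Assets
-### Pull the GHCR Image (Optional)
+On first local startup, if `static/admin/` does not exist, the service automatically attempts to build the WebUI (requires Node.js/npm; when dependencies are missing it first runs `npm ci`, then `npm run build -- --outDir static/admin --emptyOutDir`).
 You can also build manually:
 ```bash
-# latest
+./scripts/build-webui.sh
-docker pull ghcr.io/cjackhwang/ds2api:latest
+```
-# specific version (example)
+Or run the steps by hand:
-docker pull ghcr.io/cjackhwang/ds2api:v3.0.0
+
 ```bash
 cd webui
 npm install
 npm run build
 # Output goes to static/admin/
 ```
 Control auto-build behavior via environment variable:
 ```bash
 # Force-disable auto-build
 DS2API_AUTO_BUILD_WEBUI=false go run ./cmd/ds2api
 # Force-enable auto-build
 DS2API_AUTO_BUILD_WEBUI=true go run ./cmd/ds2api
 ```
 ### 4.3 Compile to Binary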
 ```bash
 go build -o ds2api ./cmd/ds2api
 ./ds2api
 ```
 ---
@@ -456,8 +468,6 @@ server {
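 # Copy the compiled binary and related files to the target directory
 sudo mkdir -p /opt/ds2api
 sudo cp ds2api config.json /opt/ds2api/
 # Optional: if you want to use an external WASM file (overrides the embedded version; taken from a release package or build output)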
 # sudo cp /path/to/sha3_wasm_bg.7b9ca65ddd.wasm /opt/ds2api/
 sudo cp -r static/admin /opt/ds2api/static/admin
 ```
--- a/docs/DeepSeekSSE行为结构说明-2026-04-05.md
+++ b/docs/DeepSeekSSE行为结构说明-2026-04-05.md
@@ -4,6 +4,8 @@
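 > The current corpus consists of 4 raw streams covering search + citation, content-filter terminal state, Markdown output, and whitespace-sensitive output.
 > Addendum: the end of this document also notes a few behaviors that are confirmed in the current implementation but not yet fully covered by the corpus, such as the auto-continue state in long-thinking scenarios.
 Doc navigation: [Documentation index](./README.md) / [Testing guide](./TESTING.md) / [Sample directory notes](../tests/raw_stream_samples/README.md)
 ## 1. Sample Coverage
 The following samples together form the observational basis of this document: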
--- a/docs/README.md
+++ b/docs/README.md
@@ -0,0 +1,53 @@
 # DS2API 文档导航 | Documentation Index
 语言 / Language: [中文](README.md) | [English](README.md#english)
 ## 中文
 为减少重复维护，本仓库文档按“入口文档 + 专题文档”拆分。建议从下列顺序阅读：
 1. [项目总览（README）](../README.MD)
 2. [架构与目录说明](./ARCHITECTURE.md)
 3. [接口文档（API）](../API.md)
 4. [部署指南](./DEPLOY.md)
 5. [测试指南](./TESTING.md)
 6. [贡献指南](./CONTRIBUTING.md)
 ### 专题文档
 - [Tool Calling 统一语义](./toolcall-semantics.md)
 - [DeepSeek SSE 行为结构说明（逆向观察）](./DeepSeekSSE行为结构说明-2026-04-05.md)
 ### 文档维护约定
 - `README.MD` / `README.en.md`：面向首次接触用户，保留“是什么 + 怎么快速跑起来”。
 - `docs/ARCHITECTURE*.md`：面向开发者，集中维护项目结构、模块职责与调用链。
 - `API*.md`：面向客户端接入者，聚焦接口行为、鉴权和示例。
 - 其他 `docs/*.md`：主题化说明，避免在多个文档重复粘贴同一段内容。
 ---
 ## English
 To reduce maintenance drift, docs are split into an “entry doc + topical docs” layout.
 Recommended reading order:
 1. [Project overview (README)](../README.en.md)
 2. [Architecture and project layout](./ARCHITECTURE.en.md)
 3. [API reference](../API.en.md)
 4. [Deployment guide](./DEPLOY.en.md)
 5. [Testing guide](./TESTING.md)
 6. [Contributing guide](./CONTRIBUTING.en.md)
 ### Topical docs
 - [Tool-calling unified semantics](./toolcall-semantics.md)
 - [DeepSeek SSE behavior notes (reverse-engineered)](./DeepSeekSSE行为结构说明-2026-04-05.md)
 ### Maintenance conventions
 - `README.MD` / `README.en.md`: onboarding-oriented (“what + quick start”).
 - `docs/ARCHITECTURE*.md`: developer-oriented source of truth for module boundaries and execution flow.
 - `API*.md`: integration-oriented behavior/contracts.
 - Other `docs/*.md`: focused topics, avoid copy-pasting the same section into multiple files.
--- a/docs/TESTING.md
+++ b/docs/TESTING.md
@@ -2,6 +2,8 @@
 Language: Chinese + English (same page)
 Doc navigation: [Overview](../README.MD) / [Architecture](./ARCHITECTURE.md) / [Deployment guide](./DEPLOY.md) / [API reference](../API.md)
 ## Overview
 DS2API provides two levels of tests:
@@ -180,10 +182,10 @@ go test ./...
 ```bash
 # Run the tool-call tests (recommended when debugging tool-call parsing issues)
-go test -v -run 'TestParseToolCalls|TestRepair' ./internal/util/
+go test -v -run 'TestParseToolCalls|TestRepair' ./internal/toolcall/
 # Run a single test case
-go test -v -run TestParseToolCallsWithDeepSeekHallucination ./internal/util/
+go test -v -run TestParseToolCallsWithDeepSeekHallucination ./internal/toolcall/
 # Run the format tests
 go test -v ./internal/format/...
@@ -198,13 +200,13 @@ go test -v ./internal/adapter/openai/...
 ```bash
 # 1. Run all tool-call tests
-go test -v -run 'TestParseToolCalls|TestRepair' ./internal/util/
+go test -v -run 'TestParseToolCalls|TestRepair' ./internal/toolcall/
 # 2. Inspect the detailed debug information in the test output
-go test -v -run TestParseToolCallsWithDeepSeekHallucination ./internal/util/ 2>&1
+go test -v -run TestParseToolCallsWithDeepSeekHallucination ./internal/toolcall/ 2>&1
 # 3. Check the repair results of specific test cases
-# Test cases live in internal/util/toolcalls_test.go and include:
+# Test cases live in internal/toolcall/toolcalls_test.go and include:
 # - TestParseToolCallsWithDeepSeekHallucination: typical DeepSeek hallucinated output
 # - TestRepairLooseJSONWithNestedObjects: square-bracket repair for nested objects
 # - TestParseToolCallsWithMixedWindowsPaths: Windows path handling
@@ -235,6 +237,7 @@ go run ./cmd/ds2api-tests --no-preflight
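 Notes:
 - By default the tool replays the canonical samples declared in `tests/raw_stream_samples/manifest.json` and performs a 1:1 simulated parse in upstream SSE order.
 - By default it verifies that no `FINISHED` text leaks and requires an end signal to be present.
 - By default it does **not** strictly compare `raw accumulated_token_usage` with the locally parsed token count (the current implementation relies on content-based estimation); pass `--fail-on-token-mismatch` explicitly for a strict check.
 - Each run writes the locally derived result to `artifacts/raw-stream-sim/<run-id>/<sample-id>/replay.output.txt` and emits a structured report.
 - If you have a historical baseline directory, pass `--baseline-root` to have the tool do a direct text comparison.
 - For a fuller protocol-level description of the behavior, see [DeepSeekSSE行为结构说明-2026-04-05.md](./DeepSeekSSE行为结构说明-2026-04-05.md).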
@@ -260,6 +263,21 @@ POST /admin/dev/raw-samples/capture
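 This endpoint writes the request metadata and the raw upstream stream to `tests/raw_stream_samples/<sample-id>/`, so they can later be used directly for replay and field analysis. Derived output is regenerated during local replay and is no longer stored in the sample directory.
 ### Query in-memory captures and save a sample
 If the problem was just reproduced locally, you can first query the captures held in the current process's memory and then selectively write them to disk:
 ```bash
 GET /admin/dev/raw-samples/query?q=广州&limit=10
 POST /admin/dev/raw-samples/save
 {"chain_key":"session:xxxx","sample_id":"tmp-from-memory"}
 ```
 Notes:
 - `query` groups `completion + continue` into a single chain by `chat_session_id`, which helps when diagnosing continued-thinking issues.
 - `save` accepts `query`, `chain_key`, or `capture_id` to select the target.
 - The generated sample directory is still `tests/raw_stream_samples/<sample-id>/` and can be fed straight to the replay script.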
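For scripted use of the two endpoints above, the request URL and JSON body can be built as in the sketch below. The paths and field names come from the example; the base URL, the helper names `buildQueryURL` and `buildSaveBody`, and the omission of admin authentication are assumptions made for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/url"
	"strconv"
)

// saveRequest mirrors the JSON body shown in the save example.
type saveRequest struct {
	ChainKey string `json:"chain_key"`
	SampleID string `json:"sample_id"`
}

// buildQueryURL returns the query endpoint URL with q and limit
// percent-encoded; base is an assumed admin base URL.
func buildQueryURL(base, q string, limit int) string {
	v := url.Values{}
	v.Set("q", q)
	v.Set("limit", strconv.Itoa(limit))
	return base + "/admin/dev/raw-samples/query?" + v.Encode()
}

// buildSaveBody serializes the save request selecting a capture chain.
func buildSaveBody(chainKey, sampleID string) (string, error) {
	b, err := json.Marshal(saveRequest{ChainKey: chainKey, SampleID: sampleID})
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	fmt.Println(buildQueryURL("http://127.0.0.1:5001", "广州", 10))
	body, err := buildSaveBody("session:xxxx", "tmp-from-memory")
	if err != nil {
		fmt.Println("marshal failed:", err)
		return
	}
	fmt.Println(body)
}
```

Encoding the query through `url.Values` matters here because the search term in the example is non-ASCII and must be percent-encoded before it is sent.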
 ### Specify output directory and timeout
 ```bash
--- a/docs/toolcall-semantics.md
+++ b/docs/toolcall-semantics.md
@@ -2,6 +2,8 @@
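 This document describes the **actual behavior** of `ParseToolCallsDetailed` / `parseToolCallsDetailed` in the current code, and is used to keep the Go and Node runtimes aligned.
 Doc navigation: [Overview](../README.MD) / [Architecture](./ARCHITECTURE.md) / [Testing guide](./TESTING.md)
 ## 1) Output structure (current implementation)
 - `calls`: the list of parsed tool calls (`name` + `input`).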
--- a/go.mod
+++ b/go.mod
@@ -8,7 +8,6 @@ require (
 	github.com/google/uuid v1.6.0
 	github.com/refraction-networking/utls v1.8.2
 	github.com/router-for-me/CLIProxyAPI/v6 v6.9.14
 	github.com/tetratelabs/wazero v1.11.0
 )
 require (
@@ -19,7 +18,7 @@ require (
 	github.com/tidwall/pretty v1.2.1 // indirect
 	github.com/tidwall/sjson v1.2.5 // indirect
 	golang.org/x/crypto v0.49.0 // indirect
-	golang.org/x/net v0.52.0 // indirect
+	golang.org/x/net v0.52.0
 	golang.org/x/sys v0.42.0 // indirect
 	gopkg.in/yaml.v3 v3.0.1 // indirect
 )
--- a/go.sum
+++ b/go.sum
@@ -18,8 +18,6 @@ github.com/sirupsen/logrus v1.9.4 h1:TsZE7l11zFCLZnZ+teH4Umoq5BhEIfIzfRDZ1Uzql2w
 github.com/sirupsen/logrus v1.9.4/go.mod h1:ftWc9WdOfJ0a92nsE2jF5u5ZwH8Bv2zdeOC42RjbV2g=
 github.com/stretchr/testify v1.10.0 h1:Xv5erBjTwe/5IxqUQTdXv5kgmIvbHo3QQyRwhJsOfJA=
 github.com/stretchr/testify v1.10.0/go.mod h1:r2ic/lqez/lEtzL7wO/rwa5dbSLXVDPFyf8C91i36aY=
 github.com/tetratelabs/wazero v1.11.0 h1:+gKemEuKCTevU4d7ZTzlsvgd1uaToIDtlQlmNbwqYhA=
 github.com/tetratelabs/wazero v1.11.0/go.mod h1:eV28rsN8Q+xwjogd7f4/Pp4xFxO7uOGbLcD/LzB1wiU=
 github.com/tidwall/gjson v1.14.2/go.mod h1:/wbyibRr2FHMks5tjHJ5F8dMZh3AcwJEMf5vlfC0lxk=
 github.com/tidwall/gjson v1.18.0 h1:FIDeeyB800efLX89e5a8Y0BNH+LOngJyGrIWxG2FKQY=
 github.com/tidwall/gjson v1.18.0/go.mod h1:/wbyibRr2FHMks5tjHJ5F8dMZh3AcwJEMf5vlfC0lxk=
--- a/internal/adapter/claude/handler_messages.go
+++ b/internal/adapter/claude/handler_messages.go
@@ -64,7 +64,7 @@ func (h *Handler) proxyViaOpenAI(w http.ResponseWriter, r *http.Request, store C
 		rec := httptest.NewRecorder()
 		h.OpenAI.ChatCompletions(rec, proxyReq)
 		res := rec.Result()
-		defer res.Body.Close()
+		defer func() { _ = res.Body.Close() }()
 		body, _ := io.ReadAll(res.Body)
 		for k, vv := range res.Header {
 			for _, v := range vv {
@@ -94,7 +94,7 @@ func (h *Handler) proxyViaOpenAI(w http.ResponseWriter, r *http.Request, store C
 	rec := httptest.NewRecorder()
 	h.OpenAI.ChatCompletions(rec, proxyReq)
 	res := rec.Result()
-	defer res.Body.Close()
+	defer func() { _ = res.Body.Close() }()
 	body, _ := io.ReadAll(res.Body)
 	if res.StatusCode < 200 || res.StatusCode >= 300 {
 		for k, vv := range res.Header {
@@ -124,7 +124,7 @@ func (h *Handler) proxyViaOpenAI(w http.ResponseWriter, r *http.Request, store C
 }
 func (h *Handler) handleClaudeStreamRealtime(w http.ResponseWriter, r *http.Request, resp *http.Response, model string, messages []any, thinkingEnabled, searchEnabled bool, toolNames []string) {
-	defer resp.Body.Close()
+	defer func() { _ = resp.Body.Close() }()
 	if resp.StatusCode != http.StatusOK {
 		body, _ := io.ReadAll(resp.Body)
 		writeClaudeError(w, http.StatusInternalServerError, string(body))
--- a/internal/adapter/claude/handler_utils.go
+++ b/internal/adapter/claude/handler_utils.go
@@ -1,12 +1,12 @@
 package claude
 import (
 	"ds2api/internal/toolcall"
 	"encoding/json"
 	"fmt"
 	"strings"
 	"ds2api/internal/prompt"
 	"ds2api/internal/util"
 )
 func normalizeClaudeMessages(messages []any) []any {
@@ -98,9 +98,10 @@ func buildClaudeToolPrompt(tools []any) string {
 	}
 	return "You have access to these tools:\n\n" +
 		strings.Join(toolSchemas, "\n\n") + "\n\n" +
-		util.BuildToolCallInstructions(names)
+		toolcall.BuildToolCallInstructions(names)
 }
 //nolint:unused // retained for compatibility with pending Claude tool-result prompt flow.
 func formatClaudeToolResultForPrompt(block map[string]any) string {
 	if block == nil {
 		return ""
--- a/internal/adapter/claude/handler_utils_sanitize.go
+++ b/internal/adapter/claude/handler_utils_sanitize.go
@@ -96,6 +96,7 @@ func looksLikeBase64Payload(v string) bool {
 	return true
 }
 //nolint:unused // helper kept for compatibility with upcoming sanitize pipeline.
 func marshalCompactJSON(v any) string {
 	b, err := json.Marshal(v)
 	if err != nil {
--- a/internal/adapter/claude/proxy_vercel_test.go
+++ b/internal/adapter/claude/proxy_vercel_test.go
@@ -34,11 +34,13 @@ func (s openAIProxyStub) ChatCompletions(w http.ResponseWriter, _ *http.Request)
 type openAIProxyCaptureStub struct {
 	seenModel string
 	seenReq   map[string]any
 }
 func (s *openAIProxyCaptureStub) ChatCompletions(w http.ResponseWriter, r *http.Request) {
 	var req map[string]any
 	_ = json.NewDecoder(r.Body).Decode(&req)
 	s.seenReq = req
 	if m, ok := req["model"].(string); ok {
 		s.seenModel = m
 	}
@@ -84,3 +86,33 @@ func TestClaudeProxyViaOpenAIPreservesClaudeMapping(t *testing.T) {
 		t.Fatalf("expected mapped proxy model deepseek-reasoner, got %q", got)
 	}
 }
+func TestClaudeProxyTranslatesInlineImageToOpenAIDataURL(t *testing.T) {
+	openAI := &openAIProxyCaptureStub{}
+	h := &Handler{OpenAI: openAI}
+	req := httptest.NewRequest(http.MethodPost, "/anthropic/v1/messages", strings.NewReader(`{"model":"claude-sonnet-4-5","messages":[{"role":"user","content":[{"type":"text","text":"hello"},{"type":"image","source":{"type":"base64","media_type":"image/png","data":"QUJDRA=="}}]}],"stream":false}`))
+	rec := httptest.NewRecorder()
+	h.Messages(rec, req)
+	if rec.Code != http.StatusOK {
+		t.Fatalf("unexpected status: %d body=%s", rec.Code, rec.Body.String())
+	}
+	messages, _ := openAI.seenReq["messages"].([]any)
+	if len(messages) != 1 {
+		t.Fatalf("expected one translated message, got %#v", openAI.seenReq)
+	}
+	msg, _ := messages[0].(map[string]any)
+	content, _ := msg["content"].([]any)
+	if len(content) != 2 {
+		t.Fatalf("expected translated content blocks, got %#v", msg)
+	}
+	imageBlock, _ := content[1].(map[string]any)
+	if strings.TrimSpace(asString(imageBlock["type"])) != "image_url" {
+		t.Fatalf("expected image_url block, got %#v", imageBlock)
+	}
+	imageURL, _ := imageBlock["image_url"].(map[string]any)
+	if !strings.HasPrefix(strings.TrimSpace(asString(imageURL["url"])), "data:image/png;base64,") {
+		t.Fatalf("expected translated data url, got %#v", imageBlock)
+	}
+}
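Note on the test above: it only pins the observable result, namely that a Claude `image` block with a base64 `source` reaches the OpenAI runner as an `image_url` block whose URL starts with `data:image/png;base64,`. The translation code itself is not part of this excerpt. A minimal sketch of such a mapping, assuming a hypothetical helper named `claudeImageToOpenAI` (not a function from this repository), might look like this:

```go
package main

import (
	"fmt"
	"strings"
)

// claudeImageToOpenAI is a hypothetical helper, not the project's real code.
// It maps a Claude-style base64 image block to an OpenAI-style image_url
// block carrying a data URL. The second result reports whether the block
// matched the expected shape.
func claudeImageToOpenAI(block map[string]any) (map[string]any, bool) {
	if t, _ := block["type"].(string); t != "image" {
		return nil, false
	}
	src, _ := block["source"].(map[string]any) // nil map is safe to index
	if st, _ := src["type"].(string); st != "base64" {
		return nil, false
	}
	mediaType, _ := src["media_type"].(string)
	data, _ := src["data"].(string)
	mediaType = strings.TrimSpace(mediaType)
	data = strings.TrimSpace(data)
	if mediaType == "" || data == "" {
		return nil, false
	}
	return map[string]any{
		"type": "image_url",
		"image_url": map[string]any{
			"url": "data:" + mediaType + ";base64," + data,
		},
	}, true
}

func main() {
	block := map[string]any{
		"type": "image",
		"source": map[string]any{
			"type":       "base64",
			"media_type": "image/png",
			"data":       "QUJDRA==",
		},
	}
	if out, ok := claudeImageToOpenAI(block); ok {
		fmt.Println(out["image_url"].(map[string]any)["url"])
	}
}
```

Blocks that do not match (text blocks, URL-sourced images) are reported as unmatched so a caller can pass them through unchanged.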
--- a/internal/adapter/claude/standard_request.go
+++ b/internal/adapter/claude/standard_request.go
@@ -18,7 +18,7 @@ func normalizeClaudeRequest(store ConfigReader, req map[string]any) (claudeNorma
 	model, _ := req["model"].(string)
 	messagesRaw, _ := req["messages"].([]any)
 	if strings.TrimSpace(model) == "" || len(messagesRaw) == 0 {
-		return claudeNormalizedRequest{}, fmt.Errorf("Request must include 'model' and 'messages'.")
+		return claudeNormalizedRequest{}, fmt.Errorf("request must include 'model' and 'messages'")
 	}
 	if _, ok := req["max_tokens"]; !ok {
 		req["max_tokens"] = 8192
@@ -36,7 +36,7 @@ func normalizeClaudeRequest(store ConfigReader, req map[string]any) (claudeNorma
 		thinkingEnabled = false
 		searchEnabled = false
 	}
-	finalPrompt := deepseek.MessagesPrepare(toMessageMaps(dsPayload["messages"]))
+	finalPrompt := deepseek.MessagesPrepareWithThinking(toMessageMaps(dsPayload["messages"]), thinkingEnabled)
 	toolNames := extractClaudeToolNames(toolsRequested)
 	if len(toolNames) == 0 && len(toolsRequested) > 0 {
 		toolNames = []string{"__any_tool__"}
--- a/internal/adapter/claude/stream_runtime_core.go
+++ b/internal/adapter/claude/stream_runtime_core.go
@@ -24,10 +24,9 @@ type claudeStreamRuntime struct {
 	bufferToolContent     bool
 	stripReferenceMarkers bool
-	messageID    string
+	messageID string
-	thinking     strings.Builder
+	thinking  strings.Builder
-	text         strings.Builder
+	text      strings.Builder
-	outputTokens int
 	nextBlockIndex     int
 	thinkingBlockOpen  bool
@@ -70,9 +69,6 @@ func (s *claudeStreamRuntime) onParsed(parsed sse.LineResult) streamengine.Parse
 	if !parsed.Parsed {
 		return streamengine.ParsedDecision{}
 	}
-	if parsed.OutputTokens > 0 {
-		s.outputTokens = parsed.OutputTokens
-	}
 	if parsed.ErrorMessage != "" {
 		s.upstreamErr = parsed.ErrorMessage
 		return streamengine.ParsedDecision{Stop: true, StopReason: streamengine.StopReason("upstream_error")}
@@ -96,7 +92,11 @@ func (s *claudeStreamRuntime) onParsed(parsed sse.LineResult) streamengine.Parse
 			if !s.thinkingEnabled {
 				continue
 			}
-			s.thinking.WriteString(cleanedText)
+			trimmed := sse.TrimContinuationOverlap(s.thinking.String(), cleanedText)
+			if trimmed == "" {
+				continue
+			}
+			s.thinking.WriteString(trimmed)
 			s.closeTextBlock()
 			if !s.thinkingBlockOpen {
 				s.thinkingBlockIndex = s.nextBlockIndex
@@ -116,13 +116,17 @@ func (s *claudeStreamRuntime) onParsed(parsed sse.LineResult) streamengine.Parse
 				"index": s.thinkingBlockIndex,
 				"delta": map[string]any{
 					"type":     "thinking_delta",
-					"thinking": cleanedText,
+					"thinking": trimmed,
 				},
 			})
 			continue
 		}
-		s.text.WriteString(cleanedText)
+		trimmed := sse.TrimContinuationOverlap(s.text.String(), cleanedText)
+		if trimmed == "" {
+			continue
+		}
+		s.text.WriteString(trimmed)
 		if s.bufferToolContent {
 			if hasUnclosedCodeFence(s.text.String()) {
 				continue
@@ -148,7 +152,7 @@ func (s *claudeStreamRuntime) onParsed(parsed sse.LineResult) streamengine.Parse
 			"index": s.textBlockIndex,
 			"delta": map[string]any{
 				"type": "text_delta",
-				"text": cleanedText,
+				"text": trimmed,
 			},
 		})
 	}
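Note on the hunks above: both the thinking and text paths now pass each chunk through `sse.TrimContinuationOverlap(accumulated, chunk)` and skip the chunk when nothing is left. The implementation of that function is not shown in this excerpt. The sketch below is an assumption about the general idea (drop the prefix of the new chunk that repeats the tail of what was already emitted); the name `trimContinuationOverlap` and the `minOverlap` parameter are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// trimContinuationOverlap is a hypothetical stand-in for
// sse.TrimContinuationOverlap. It removes the longest prefix of chunk that
// duplicates a suffix of prev. Overlaps shorter than minOverlap bytes are
// ignored so that ordinary repeated characters are not swallowed.
func trimContinuationOverlap(prev, chunk string, minOverlap int) string {
	if minOverlap < 1 {
		minOverlap = 1
	}
	limit := len(chunk)
	if len(prev) < limit {
		limit = len(prev)
	}
	// Try the longest candidate overlap first.
	for n := limit; n >= minOverlap; n-- {
		if strings.HasSuffix(prev, chunk[:n]) {
			return chunk[n:]
		}
	}
	return chunk
}

func main() {
	var acc strings.Builder
	for _, chunk := range []string{"The quick brown", "quick brown fox", " jumps"} {
		trimmed := trimContinuationOverlap(acc.String(), chunk, 4)
		if trimmed == "" {
			continue // whole chunk was a replay of earlier output
		}
		acc.WriteString(trimmed)
	}
	fmt.Println(acc.String())
}
```

The minimum-overlap guard matters: without it, a chunk such as `lo` following `hel` would lose its first letter. Whatever heuristic the real function uses to avoid that is not visible in this diff.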
--- a/internal/adapter/claude/stream_runtime_finalize.go
+++ b/internal/adapter/claude/stream_runtime_finalize.go
@@ -1,6 +1,7 @@
 package claude
 import (
+	"ds2api/internal/toolcall"
 	"encoding/json"
 	"fmt"
 	"time"
@@ -46,9 +47,9 @@ func (s *claudeStreamRuntime) finalize(stopReason string) {
 	finalText := cleanVisibleOutput(s.text.String(), s.stripReferenceMarkers)
 	if s.bufferToolContent {
-		detected := util.ParseStandaloneToolCalls(finalText, s.toolNames)
+		detected := toolcall.ParseStandaloneToolCalls(finalText, s.toolNames)
 		if len(detected) == 0 && finalText == "" && finalThinking != "" {
-			detected = util.ParseStandaloneToolCalls(finalThinking, s.toolNames)
+			detected = toolcall.ParseStandaloneToolCalls(finalThinking, s.toolNames)
 		}
 		if len(detected) > 0 {
 			stopReason = "tool_use"
@@ -108,9 +109,6 @@ func (s *claudeStreamRuntime) finalize(stopReason string) {
 	}
 	outputTokens := util.EstimateTokens(finalThinking) + util.EstimateTokens(finalText)
-	if s.outputTokens > 0 {
-		outputTokens = s.outputTokens
-	}
 	s.send("message_delta", map[string]any{
 		"type": "message_delta",
 		"delta": map[string]any{
--- a/internal/adapter/gemini/convert_passthrough.go
+++ b/internal/adapter/gemini/convert_passthrough.go
@@ -5,6 +5,7 @@ import (
 	"strings"
 )
+//nolint:unused // compatibility hook for native Gemini request normalization path.
 func collectGeminiPassThrough(req map[string]any) map[string]any {
 	cfg, _ := req["generationConfig"].(map[string]any)
 	if len(cfg) == 0 {
--- a/internal/adapter/gemini/convert_request.go
+++ b/internal/adapter/gemini/convert_request.go
@@ -9,6 +9,7 @@ import (
 	"ds2api/internal/util"
 )
+//nolint:unused // kept for native Gemini adapter route compatibility.
 func normalizeGeminiRequest(store ConfigReader, routeModel string, req map[string]any, stream bool) (util.StandardRequest, error) {
 	requestedModel := strings.TrimSpace(routeModel)
 	if requestedModel == "" {
@@ -17,17 +18,17 @@ func normalizeGeminiRequest(store ConfigReader, routeModel string, req map[strin
 	resolvedModel, ok := config.ResolveModel(store, requestedModel)
 	if !ok {
-		return util.StandardRequest{}, fmt.Errorf("Model '%s' is not available.", requestedModel)
+		return util.StandardRequest{}, fmt.Errorf("model %q is not available", requestedModel)
 	}
 	thinkingEnabled, searchEnabled, _ := config.GetModelConfig(resolvedModel)
 	messagesRaw := geminiMessagesFromRequest(req)
 	if len(messagesRaw) == 0 {
-		return util.StandardRequest{}, fmt.Errorf("Request must include non-empty contents.")
+		return util.StandardRequest{}, fmt.Errorf("request must include non-empty contents")
 	}
 	toolsRaw := convertGeminiTools(req["tools"])
-	finalPrompt, toolNames := openai.BuildPromptForAdapter(messagesRaw, toolsRaw, "")
+	finalPrompt, toolNames := openai.BuildPromptForAdapter(messagesRaw, toolsRaw, "", thinkingEnabled)
 	passThrough := collectGeminiPassThrough(req)
 	return util.StandardRequest{
--- a/internal/adapter/gemini/convert_tools.go
+++ b/internal/adapter/gemini/convert_tools.go
@@ -2,6 +2,7 @@ package gemini
 import "strings"
+//nolint:unused // kept for native Gemini adapter route compatibility.
 func convertGeminiTools(raw any) []any {
 	tools, _ := raw.([]any)
 	if len(tools) == 0 {
--- a/internal/adapter/gemini/handler_generate.go
+++ b/internal/adapter/gemini/handler_generate.go
@@ -2,6 +2,7 @@ package gemini
 import (
 	"bytes"
+	"ds2api/internal/toolcall"
 	"encoding/json"
 	"io"
 	"net/http"
@@ -57,7 +58,7 @@ func (h *Handler) proxyViaOpenAI(w http.ResponseWriter, r *http.Request, stream
 		rec := httptest.NewRecorder()
 		h.OpenAI.ChatCompletions(rec, proxyReq)
 		res := rec.Result()
-		defer res.Body.Close()
+		defer func() { _ = res.Body.Close() }()
 		body, _ := io.ReadAll(res.Body)
 		for k, vv := range res.Header {
 			for _, v := range vv {
@@ -87,7 +88,7 @@ func (h *Handler) proxyViaOpenAI(w http.ResponseWriter, r *http.Request, stream
 	rec := httptest.NewRecorder()
 	h.OpenAI.ChatCompletions(rec, proxyReq)
 	res := rec.Result()
-	defer res.Body.Close()
+	defer func() { _ = res.Body.Close() }()
 	body, _ := io.ReadAll(res.Body)
 	if res.StatusCode < 200 || res.StatusCode >= 300 {
 		for k, vv := range res.Header {
@@ -131,8 +132,9 @@ func writeGeminiErrorFromOpenAI(w http.ResponseWriter, status int, raw []byte) {
 	writeGeminiError(w, status, message)
 }
+//nolint:unused // retained for native Gemini non-stream handling path.
 func (h *Handler) handleNonStreamGenerateContent(w http.ResponseWriter, resp *http.Response, model, finalPrompt string, thinkingEnabled bool, toolNames []string) {
-	defer resp.Body.Close()
+	defer func() { _ = resp.Body.Close() }()
 	if resp.StatusCode != http.StatusOK {
 		body, _ := io.ReadAll(resp.Body)
 		writeGeminiError(w, resp.StatusCode, strings.TrimSpace(string(body)))
@@ -147,13 +149,13 @@ func (h *Handler) handleNonStreamGenerateContent(w http.ResponseWriter, resp *ht
 		cleanVisibleOutput(result.Thinking, stripReferenceMarkers),
 		cleanVisibleOutput(result.Text, stripReferenceMarkers),
 		toolNames,
-		result.OutputTokens,
 	))
 }
-func buildGeminiGenerateContentResponse(model, finalPrompt, finalThinking, finalText string, toolNames []string, outputTokens int) map[string]any {
+//nolint:unused // retained for native Gemini non-stream handling path.
+func buildGeminiGenerateContentResponse(model, finalPrompt, finalThinking, finalText string, toolNames []string) map[string]any {
 	parts := buildGeminiPartsFromFinal(finalText, finalThinking, toolNames)
-	usage := buildGeminiUsage(finalPrompt, finalThinking, finalText, outputTokens)
+	usage := buildGeminiUsage(finalPrompt, finalThinking, finalText)
 	return map[string]any{
 		"candidates": []map[string]any{
 			{
@@ -170,14 +172,11 @@ func buildGeminiGenerateContentResponse(model, finalPrompt, finalThinking, final
 	}
 }
-func buildGeminiUsage(finalPrompt, finalThinking, finalText string, outputTokens int) map[string]any {
+//nolint:unused // retained for native Gemini non-stream handling path.
+func buildGeminiUsage(finalPrompt, finalThinking, finalText string) map[string]any {
 	promptTokens := util.EstimateTokens(finalPrompt)
 	reasoningTokens := util.EstimateTokens(finalThinking)
 	completionTokens := util.EstimateTokens(finalText)
-	if outputTokens > 0 {
-		completionTokens = outputTokens
-		reasoningTokens = 0
-	}
 	return map[string]any{
 		"promptTokenCount":     promptTokens,
 		"candidatesTokenCount": reasoningTokens + completionTokens,
@@ -185,10 +184,11 @@ func buildGeminiUsage(finalPrompt, finalThinking, finalText string, outputTokens
 	}
 }
+//nolint:unused // retained for native Gemini non-stream handling path.
 func buildGeminiPartsFromFinal(finalText, finalThinking string, toolNames []string) []map[string]any {
-	detected := util.ParseToolCalls(finalText, toolNames)
+	detected := toolcall.ParseToolCalls(finalText, toolNames)
 	if len(detected) == 0 && finalThinking != "" {
-		detected = util.ParseToolCalls(finalThinking, toolNames)
+		detected = toolcall.ParseToolCalls(finalThinking, toolNames)
 	}
 	if len(detected) > 0 {
 		parts := make([]map[string]any, 0, len(detected))
--- a/internal/adapter/gemini/handler_routes.go
+++ b/internal/adapter/gemini/handler_routes.go
@@ -17,6 +17,7 @@ type Handler struct {
 	OpenAI OpenAIChatRunner
 }
+//nolint:unused // used by native Gemini stream/non-stream runtime helpers.
 func (h *Handler) compatStripReferenceMarkers() bool {
 	if h == nil || h.Store == nil {
 		return true
--- a/internal/adapter/gemini/handler_stream_runtime.go
+++ b/internal/adapter/gemini/handler_stream_runtime.go
@@ -12,8 +12,9 @@ import (
 	streamengine "ds2api/internal/stream"
 )
+//nolint:unused // retained for native Gemini stream handling path.
 func (h *Handler) handleStreamGenerateContent(w http.ResponseWriter, r *http.Request, resp *http.Response, model, finalPrompt string, thinkingEnabled, searchEnabled bool, toolNames []string) {
-	defer resp.Body.Close()
+	defer func() { _ = resp.Body.Close() }()
 	if resp.StatusCode != http.StatusOK {
 		body, _ := io.ReadAll(resp.Body)
 		writeGeminiError(w, resp.StatusCode, strings.TrimSpace(string(body)))
@@ -49,6 +50,7 @@ func (h *Handler) handleStreamGenerateContent(w http.ResponseWriter, r *http.Req
 	})
 }
+//nolint:unused // retained for native Gemini stream handling path.
 type geminiStreamRuntime struct {
 	w        http.ResponseWriter
 	rc       *http.ResponseController
@@ -63,11 +65,11 @@ type geminiStreamRuntime struct {
 	stripReferenceMarkers bool
 	toolNames             []string
-	thinking     strings.Builder
+	thinking strings.Builder
-	text         strings.Builder
+	text     strings.Builder
-	outputTokens int
 }
+//nolint:unused // retained for native Gemini stream handling path.
 func newGeminiStreamRuntime(
 	w http.ResponseWriter,
 	rc *http.ResponseController,
@@ -93,6 +95,7 @@ func newGeminiStreamRuntime(
 	}
 }
+//nolint:unused // retained for native Gemini stream handling path.
 func (s *geminiStreamRuntime) sendChunk(payload map[string]any) {
 	b, _ := json.Marshal(payload)
 	_, _ = s.w.Write([]byte("data: "))
@@ -103,13 +106,11 @@ func (s *geminiStreamRuntime) sendChunk(payload map[string]any) {
 	}
 }
+//nolint:unused // retained for native Gemini stream handling path.
 func (s *geminiStreamRuntime) onParsed(parsed sse.LineResult) streamengine.ParsedDecision {
 	if !parsed.Parsed {
 		return streamengine.ParsedDecision{}
 	}
-	if parsed.OutputTokens > 0 {
-		s.outputTokens = parsed.OutputTokens
-	}
 	if parsed.ContentFilter || parsed.ErrorMessage != "" || parsed.Stop {
 		return streamengine.ParsedDecision{Stop: true}
 	}
@@ -126,11 +127,19 @@ func (s *geminiStreamRuntime) onParsed(parsed sse.LineResult) streamengine.Parse
 		contentSeen = true
 		if p.Type == "thinking" {
 			if s.thinkingEnabled {
-				s.thinking.WriteString(cleanedText)
+				trimmed := sse.TrimContinuationOverlap(s.thinking.String(), cleanedText)
+				if trimmed == "" {
+					continue
+				}
+				s.thinking.WriteString(trimmed)
 			}
 			continue
 		}
-		s.text.WriteString(cleanedText)
+		trimmed := sse.TrimContinuationOverlap(s.text.String(), cleanedText)
+		if trimmed == "" {
+			continue
+		}
+		s.text.WriteString(trimmed)
 		if s.bufferContent {
 			continue
 		}
@@ -140,7 +149,7 @@ func (s *geminiStreamRuntime) onParsed(parsed sse.LineResult) streamengine.Parse
 					"index": 0,
 					"content": map[string]any{
 						"role":  "model",
-						"parts": []map[string]any{{"text": cleanedText}},
+						"parts": []map[string]any{{"text": trimmed}},
 					},
 				},
 			},
@@ -150,6 +159,7 @@ func (s *geminiStreamRuntime) onParsed(parsed sse.LineResult) streamengine.Parse
 	return streamengine.ParsedDecision{ContentSeen: contentSeen}
 }
+//nolint:unused // retained for native Gemini stream handling path.
 func (s *geminiStreamRuntime) finalize() {
 	finalThinking := s.thinking.String()
 	finalText := cleanVisibleOutput(s.text.String(), s.stripReferenceMarkers)
@@ -184,6 +194,6 @@ func (s *geminiStreamRuntime) finalize() {
 			},
 		},
 		"modelVersion":  s.model,
-		"usageMetadata": buildGeminiUsage(s.finalPrompt, finalThinking, finalText, s.outputTokens),
+		"usageMetadata": buildGeminiUsage(s.finalPrompt, finalThinking, finalText),
 	})
 }
--- a/internal/adapter/gemini/handler_test.go
+++ b/internal/adapter/gemini/handler_test.go
@@ -42,19 +42,23 @@ func (m testGeminiAuth) Determine(_ *http.Request) (*auth.RequestAuth, error) {
 func (testGeminiAuth) Release(_ *auth.RequestAuth) {}
+//nolint:unused // reserved test double for native Gemini DS-call path coverage.
 type testGeminiDS struct {
 	resp *http.Response
 	err  error
 }
+//nolint:unused // reserved test double for native Gemini DS-call path coverage.
 func (m testGeminiDS) CreateSession(_ context.Context, _ *auth.RequestAuth, _ int) (string, error) {
 	return "session-id", nil
 }
+//nolint:unused // reserved test double for native Gemini DS-call path coverage.
 func (m testGeminiDS) GetPow(_ context.Context, _ *auth.RequestAuth, _ int) (string, error) {
 	return "pow", nil
 }
+//nolint:unused // reserved test double for native Gemini DS-call path coverage.
 func (m testGeminiDS) CallCompletion(_ context.Context, _ *auth.RequestAuth, _ map[string]any, _ string, _ int) (*http.Response, error) {
 	if m.err != nil {
 		return nil, m.err
@@ -78,11 +82,17 @@ func (s geminiOpenAIErrorStub) ChatCompletions(w http.ResponseWriter, _ *http.Re
 }
 type geminiOpenAISuccessStub struct {
-	stream bool
+	stream  bool
-	body   string
+	body    string
+	seenReq map[string]any
 }
-func (s geminiOpenAISuccessStub) ChatCompletions(w http.ResponseWriter, _ *http.Request) {
+func (s *geminiOpenAISuccessStub) ChatCompletions(w http.ResponseWriter, r *http.Request) {
+	if r != nil {
+		var req map[string]any
+		_ = json.NewDecoder(r.Body).Decode(&req)
+		s.seenReq = req
+	}
 	if s.stream {
 		w.Header().Set("Content-Type", "text/event-stream")
 		w.WriteHeader(http.StatusOK)
@@ -100,6 +110,7 @@ func (s geminiOpenAISuccessStub) ChatCompletions(w http.ResponseWriter, _ *http.
 	_, _ = w.Write([]byte(out))
 }
+//nolint:unused // helper retained for native Gemini stream fixture tests.
 func makeGeminiUpstreamResponse(lines ...string) *http.Response {
 	body := strings.Join(lines, "\n")
 	if !strings.HasSuffix(body, "\n") {
@@ -139,7 +150,7 @@ func TestGeminiRoutesRegistered(t *testing.T) {
 func TestGenerateContentReturnsFunctionCallParts(t *testing.T) {
 	h := &Handler{
 		Store: testGeminiConfig{},
-		OpenAI: geminiOpenAISuccessStub{
+		OpenAI: &geminiOpenAISuccessStub{
 			body: `{"id":"chatcmpl-1","object":"chat.completion","choices":[{"index":0,"message":{"role":"assistant","tool_calls":[{"id":"call_1","type":"function","function":{"name":"eval_javascript","arguments":"{\"code\":\"1+1\"}"}}]},"finish_reason":"tool_calls"}]}`,
 		},
 	}
@@ -179,7 +190,7 @@ func TestGenerateContentReturnsFunctionCallParts(t *testing.T) {
 }
 func TestGenerateContentMixedToolSnippetAlsoTriggersFunctionCall(t *testing.T) {
-	h := &Handler{Store: testGeminiConfig{}, OpenAI: geminiOpenAISuccessStub{}}
+	h := &Handler{Store: testGeminiConfig{}, OpenAI: &geminiOpenAISuccessStub{}}
 	r := chi.NewRouter()
 	RegisterRoutes(r, h)
@@ -212,7 +223,7 @@ func TestGenerateContentMixedToolSnippetAlsoTriggersFunctionCall(t *testing.T) {
 func TestStreamGenerateContentEmitsSSE(t *testing.T) {
 	h := &Handler{
 		Store:  testGeminiConfig{},
-		OpenAI: geminiOpenAISuccessStub{stream: true},
+		OpenAI: &geminiOpenAISuccessStub{stream: true},
 	}
 	r := chi.NewRouter()
 	RegisterRoutes(r, h)
@@ -246,6 +257,39 @@ func TestStreamGenerateContentEmitsSSE(t *testing.T) {
 	}
 }
+func TestGeminiProxyTranslatesInlineImageToOpenAIDataURL(t *testing.T) {
+	openAI := &geminiOpenAISuccessStub{}
+	h := &Handler{Store: testGeminiConfig{}, OpenAI: openAI}
+	r := chi.NewRouter()
+	RegisterRoutes(r, h)
+	body := `{"contents":[{"role":"user","parts":[{"text":"hello"},{"inlineData":{"mimeType":"image/png","data":"QUJDRA=="}}]}]}`
+	req := httptest.NewRequest(http.MethodPost, "/v1beta/models/gemini-2.5-pro:generateContent", strings.NewReader(body))
+	rec := httptest.NewRecorder()
+	r.ServeHTTP(rec, req)
+	if rec.Code != http.StatusOK {
+		t.Fatalf("expected 200, got %d body=%s", rec.Code, rec.Body.String())
+	}
+	messages, _ := openAI.seenReq["messages"].([]any)
+	if len(messages) != 1 {
+		t.Fatalf("expected one translated message, got %#v", openAI.seenReq)
+	}
+	msg, _ := messages[0].(map[string]any)
+	content, _ := msg["content"].([]any)
+	if len(content) != 2 {
+		t.Fatalf("expected translated content blocks, got %#v", msg)
+	}
+	imageBlock, _ := content[1].(map[string]any)
+	if strings.TrimSpace(asString(imageBlock["type"])) != "image_url" {
+		t.Fatalf("expected image_url block, got %#v", imageBlock)
+	}
+	imageURL, _ := imageBlock["image_url"].(map[string]any)
+	if !strings.HasPrefix(strings.TrimSpace(asString(imageURL["url"])), "data:image/png;base64,") {
+		t.Fatalf("expected translated data url, got %#v", imageBlock)
+	}
+}
 func TestGenerateContentOpenAIProxyErrorUsesGeminiEnvelope(t *testing.T) {
 	h := &Handler{
 		Store: testGeminiConfig{},
--- a/internal/adapter/gemini/output_clean.go
+++ b/internal/adapter/gemini/output_clean.go
@@ -2,6 +2,7 @@ package gemini
 import textclean "ds2api/internal/textclean"
+//nolint:unused // retained for native Gemini output post-processing path.
 func cleanVisibleOutput(text string, stripReferenceMarkers bool) string {
 	if text == "" {
 		return text
--- a/internal/adapter/openai/chat_stream_runtime.go
+++ b/internal/adapter/openai/chat_stream_runtime.go
@@ -1,6 +1,7 @@
 package openai
 import (
+	"ds2api/internal/toolcall"
 	"encoding/json"
 	"net/http"
 	"strings"
@@ -8,7 +9,6 @@ import (
 	openaifmt "ds2api/internal/format/openai"
 	"ds2api/internal/sse"
 	streamengine "ds2api/internal/stream"
-	"ds2api/internal/util"
 )
 type chatStreamRuntime struct {
@@ -37,7 +37,6 @@ type chatStreamRuntime struct {
 	streamToolNames   map[int]string
 	thinking          strings.Builder
 	text              strings.Builder
-	outputTokens      int
 }
 func newChatStreamRuntime(
@@ -99,10 +98,23 @@ func (s *chatStreamRuntime) sendDone() {
 	}
 }
+func (s *chatStreamRuntime) sendFailedChunk(status int, message, code string) {
+	s.sendChunk(map[string]any{
+		"status_code": status,
+		"error": map[string]any{
+			"message": message,
+			"type":    openAIErrorType(status),
+			"code":    code,
+			"param":   nil,
+		},
+	})
+	s.sendDone()
+}
 func (s *chatStreamRuntime) finalize(finishReason string) {
 	finalThinking := s.thinking.String()
 	finalText := cleanVisibleOutput(s.text.String(), s.stripReferenceMarkers)
-	detected := util.ParseStandaloneToolCallsDetailed(finalText, s.toolNames)
+	detected := toolcall.ParseStandaloneToolCallsDetailed(finalText, s.toolNames)
 	if len(detected.Calls) > 0 && !s.toolCallsDoneEmitted {
 		finishReason = "tool_calls"
 		delta := map[string]any{
@@ -169,13 +181,22 @@ func (s *chatStreamRuntime) finalize(finishReason string) {
 	if len(detected.Calls) > 0 || s.toolCallsEmitted {
 		finishReason = "tool_calls"
 	}
-	usage := openaifmt.BuildChatUsage(s.finalPrompt, finalThinking, finalText)
-	if s.outputTokens > 0 {
-		usage["completion_tokens"] = s.outputTokens
-		if prompt, ok := usage["prompt_tokens"].(int); ok {
-			usage["total_tokens"] = prompt + s.outputTokens
+	if len(detected.Calls) == 0 && !s.toolCallsEmitted && strings.TrimSpace(finalText) == "" {
+		status := http.StatusTooManyRequests
+		message := "Upstream model returned empty output."
+		code := "upstream_empty_output"
+		if strings.TrimSpace(finalThinking) != "" {
+			message = "Upstream model returned reasoning without visible output."
 		}
+		if finishReason == "content_filter" {
+			status = http.StatusBadRequest
+			message = "Upstream content filtered the response and returned no output."
+			code = "content_filter"
+		}
+		s.sendFailedChunk(status, message, code)
+		return
 	}
+	usage := openaifmt.BuildChatUsage(s.finalPrompt, finalThinking, finalText)
 	s.sendChunk(openaifmt.BuildChatStreamChunk(
 		s.completionID,
 		s.created,
@@ -190,10 +211,10 @@ func (s *chatStreamRuntime) onParsed(parsed sse.LineResult) streamengine.ParsedD
 	if !parsed.Parsed {
 		return streamengine.ParsedDecision{}
 	}
-	if parsed.OutputTokens > 0 {
-		s.outputTokens = parsed.OutputTokens
-	}
 	if parsed.ContentFilter {
+		if strings.TrimSpace(s.text.String()) == "" {
+			return streamengine.ParsedDecision{Stop: true, StopReason: streamengine.StopReason("content_filter")}
+		}
 		return streamengine.ParsedDecision{Stop: true, StopReason: streamengine.StopReasonHandlerRequested}
 	}
 	if parsed.ErrorMessage != "" {
@@ -221,21 +242,29 @@ func (s *chatStreamRuntime) onParsed(parsed sse.LineResult) streamengine.ParsedD
 		}
 		if p.Type == "thinking" {
 			if s.thinkingEnabled {
-				s.thinking.WriteString(cleanedText)
+				trimmed := sse.TrimContinuationOverlap(s.thinking.String(), cleanedText)
-				delta["reasoning_content"] = cleanedText
+				if trimmed == "" {
+					continue
+				}
+				s.thinking.WriteString(trimmed)
+				delta["reasoning_content"] = trimmed
 			}
 		} else {
-			s.text.WriteString(cleanedText)
+			trimmed := sse.TrimContinuationOverlap(s.text.String(), cleanedText)
+			if trimmed == "" {
+				continue
+			}
+			s.text.WriteString(trimmed)
 			if !s.bufferToolContent {
-				delta["content"] = cleanedText
+				delta["content"] = trimmed
 			} else {
-				events := processToolSieveChunk(&s.toolSieve, cleanedText, s.toolNames)
+				events := processToolSieveChunk(&s.toolSieve, trimmed, s.toolNames)
 				for _, evt := range events {
 					if len(evt.ToolCallDeltas) > 0 {
 						if !s.emitEarlyToolDeltas {
 							continue
 						}
-						filtered := filterIncrementalToolCallDeltasByAllowed(evt.ToolCallDeltas, s.toolNames, s.streamToolNames)
+						filtered := filterIncrementalToolCallDeltasByAllowed(evt.ToolCallDeltas, s.streamToolNames)
 						if len(filtered) == 0 {
 							continue
 						}
--- a/internal/adapter/openai/deps.go
+++ b/internal/adapter/openai/deps.go
@@ -18,6 +18,7 @@ type AuthResolver interface {
 type DeepSeekCaller interface {
 	CreateSession(ctx context.Context, a *auth.RequestAuth, maxAttempts int) (string, error)
 	GetPow(ctx context.Context, a *auth.RequestAuth, maxAttempts int) (string, error)
+	UploadFile(ctx context.Context, a *auth.RequestAuth, req deepseek.UploadFileRequest, maxAttempts int) (*deepseek.UploadFileResult, error)
 	CallCompletion(ctx context.Context, a *auth.RequestAuth, payload map[string]any, powResp string, maxAttempts int) (*http.Response, error)
 	DeleteSessionForToken(ctx context.Context, token string, sessionID string) (*deepseek.DeleteSessionResult, error)
 	DeleteAllSessionsForToken(ctx context.Context, token string) error
--- a/internal/adapter/openai/embeddings_handler.go
+++ b/internal/adapter/openai/embeddings_handler.go
@@ -26,8 +26,13 @@ func (h *Handler) Embeddings(w http.ResponseWriter, r *http.Request) {
 	}
 	defer h.Auth.Release(a)
+	r.Body = http.MaxBytesReader(w, r.Body, openAIGeneralMaxSize)
 	var req map[string]any
 	if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
+		if strings.Contains(strings.ToLower(err.Error()), "too large") {
+			writeOpenAIError(w, http.StatusRequestEntityTooLarge, "request body too large")
+			return
+		}
 		writeOpenAIError(w, http.StatusBadRequest, "invalid json")
 		return
 	}
--- a/internal/adapter/openai/file_inline_upload.go
+++ b/internal/adapter/openai/file_inline_upload.go
@@ -0,0 +1,382 @@
 package openai
 import (
 	"context"
 	"crypto/sha256"
 	"encoding/base64"
 	"fmt"
 	"mime"
 	"net/http"
 	"net/url"
 	"path/filepath"
 	"strings"
 	"ds2api/internal/auth"
 	"ds2api/internal/deepseek"
 )
 const maxInlineFilesPerRequest = 50
 type inlineFileUploadError struct {
 	status  int
 	message string
 	err     error
 }
 func (e *inlineFileUploadError) Error() string {
 	if e == nil {
 		return ""
 	}
 	if strings.TrimSpace(e.message) != "" {
 		return e.message
 	}
 	if e.err != nil {
 		return e.err.Error()
 	}
 	return "inline file processing failed"
 }
 type inlineUploadState struct {
 	ctx          context.Context
 	handler      *Handler
 	auth         *auth.RequestAuth
 	uploadedByID map[string]string
 	uploadCount  int
 }
 type inlineDecodedFile struct {
 	Data            []byte
 	ContentType     string
 	Filename        string
 	ReplacementType string
 }
 func (h *Handler) preprocessInlineFileInputs(ctx context.Context, a *auth.RequestAuth, req map[string]any) error {
 	if h == nil || h.DS == nil || len(req) == 0 {
 		return nil
 	}
 	state := &inlineUploadState{
 		ctx:          ctx,
 		handler:      h,
 		auth:         a,
 		uploadedByID: map[string]string{},
 	}
 	for _, key := range []string{"messages", "input", "attachments"} {
 		if raw, ok := req[key]; ok {
 			updated, err := state.walk(raw)
 			if err != nil {
 				return err
 			}
 			req[key] = updated
 		}
 	}
 	if refIDs := collectOpenAIRefFileIDs(req); len(refIDs) > 0 {
 		req["ref_file_ids"] = stringsToAnySlice(refIDs)
 	}
 	return nil
 }
 func writeOpenAIInlineFileError(w http.ResponseWriter, err error) {
 	inlineErr, ok := err.(*inlineFileUploadError)
 	if !ok || inlineErr == nil {
 		writeOpenAIError(w, http.StatusInternalServerError, "Failed to process file input.")
 		return
 	}
 	status := inlineErr.status
 	if status == 0 {
 		status = http.StatusInternalServerError
 	}
 	message := strings.TrimSpace(inlineErr.message)
 	if message == "" {
 		message = "Failed to process file input."
 	}
 	writeOpenAIError(w, status, message)
 }
 func (s *inlineUploadState) walk(raw any) (any, error) {
 	switch x := raw.(type) {
 	case []any:
 		out := make([]any, len(x))
 		for i, item := range x {
 			updated, err := s.walk(item)
 			if err != nil {
 				return nil, err
 			}
 			out[i] = updated
 		}
 		return out, nil
 	case map[string]any:
 		if replacement, replaced, err := s.tryUploadBlock(x); replaced || err != nil {
 			return replacement, err
 		}
 		for _, key := range []string{"messages", "input", "attachments", "content", "files", "items", "data", "source", "file", "image_url"} {
 			if nested, ok := x[key]; ok {
 				updated, err := s.walk(nested)
 				if err != nil {
 					return nil, err
 				}
 				x[key] = updated
 			}
 		}
 		return x, nil
 	default:
 		return raw, nil
 	}
 }
 func (s *inlineUploadState) tryUploadBlock(block map[string]any) (map[string]any, bool, error) {
 	decoded, ok, err := decodeOpenAIInlineFileBlock(block)
 	if err != nil {
 		return nil, true, &inlineFileUploadError{status: http.StatusBadRequest, message: err.Error(), err: err}
 	}
 	if !ok {
 		return nil, false, nil
 	}
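 	if s.uploadCount >= maxInlineFilesPerRequest {
 		// Return the typed error so writeOpenAIInlineFileError reports a 400 with this message instead of a generic 500.
 		limitErr := fmt.Errorf("exceeded maximum of %d inline files per request", maxInlineFilesPerRequest)
 		return nil, true, &inlineFileUploadError{status: http.StatusBadRequest, message: limitErr.Error(), err: limitErr}
 	}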
 	fileID, err := s.uploadInlineFile(decoded)
 	if err != nil {
 		return nil, true, &inlineFileUploadError{status: http.StatusInternalServerError, message: "Failed to upload inline file.", err: err}
 	}
 	s.uploadCount++
 	replacement := map[string]any{
 		"type":    decoded.ReplacementType,
 		"file_id": fileID,
 	}
 	if decoded.Filename != "" {
 		replacement["filename"] = decoded.Filename
 	}
 	if decoded.ContentType != "" {
 		replacement["mime_type"] = decoded.ContentType
 	}
 	return replacement, true, nil
 }
 func (s *inlineUploadState) uploadInlineFile(file inlineDecodedFile) (string, error) {
 	sum := sha256.Sum256(append([]byte(file.ContentType+"\x00"+file.Filename+"\x00"), file.Data...))
 	cacheKey := fmt.Sprintf("%x", sum[:])
 	if fileID, ok := s.uploadedByID[cacheKey]; ok && strings.TrimSpace(fileID) != "" {
 		return fileID, nil
 	}
 	contentType := strings.TrimSpace(file.ContentType)
 	if contentType == "" {
 		contentType = http.DetectContentType(file.Data)
 	}
 	result, err := s.handler.DS.UploadFile(s.ctx, s.auth, deepseek.UploadFileRequest{
 		Filename:    file.Filename,
 		ContentType: contentType,
 		Data:        file.Data,
 	}, 3)
 	if err != nil {
 		return "", err
 	}
 	fileID := strings.TrimSpace(result.ID)
 	if fileID == "" {
 		return "", fmt.Errorf("upload succeeded without file id")
 	}
 	s.uploadedByID[cacheKey] = fileID
 	return fileID, nil
 }
 func decodeOpenAIInlineFileBlock(block map[string]any) (inlineDecodedFile, bool, error) {
 	if block == nil {
 		return inlineDecodedFile{}, false, nil
 	}
 	if strings.TrimSpace(asString(block["file_id"])) != "" {
 		return inlineDecodedFile{}, false, nil
 	}
 	if nested, ok := block["file"].(map[string]any); ok {
 		decoded, matched, err := decodeOpenAIInlineFileBlock(nested)
 		if err != nil || !matched {
 			return decoded, matched, err
 		}
 		if decoded.Filename == "" {
 			decoded.Filename = pickInlineFilename(block, decoded.ContentType, defaultInlinePrefix(decoded.ReplacementType))
 		}
 		return decoded, true, nil
 	}
 	blockType := strings.ToLower(strings.TrimSpace(asString(block["type"])))
 	if raw, matched := extractInlineImageDataURL(block); matched {
 		data, contentType, err := decodeInlinePayload(raw, contentTypeFromMap(block))
 		if err != nil {
 			return inlineDecodedFile{}, true, fmt.Errorf("invalid image input")
 		}
 		return inlineDecodedFile{
 			Data:            data,
 			ContentType:     contentType,
 			Filename:        pickInlineFilename(block, contentType, "image"),
 			ReplacementType: "input_image",
 		}, true, nil
 	}
 	if raw, matched := extractInlineFilePayload(block, blockType); matched {
 		data, contentType, err := decodeInlinePayload(raw, contentTypeFromMap(block))
 		if err != nil {
 			return inlineDecodedFile{}, true, fmt.Errorf("invalid file input")
 		}
 		return inlineDecodedFile{
 			Data:            data,
 			ContentType:     contentType,
 			Filename:        pickInlineFilename(block, contentType, defaultInlinePrefix(blockType)),
 			ReplacementType: "input_file",
 		}, true, nil
 	}
 	return inlineDecodedFile{}, false, nil
 }
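 // extractInlineImageDataURL returns the data URL carried by an image block. It
 // checks image_url (string or object form) first, then the block's own url
 // field, and ignores any value that is not a data: URL.
 func extractInlineImageDataURL(block map[string]any) (string, bool) {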
 	imageURL := block["image_url"]
 	switch x := imageURL.(type) {
 	case string:
 		if isDataURL(x) {
 			return strings.TrimSpace(x), true
 		}
 	case map[string]any:
 		if raw := strings.TrimSpace(asString(x["url"])); isDataURL(raw) {
 			return raw, true
 		}
 	}
 	if raw := strings.TrimSpace(asString(block["url"])); isDataURL(raw) {
 		return raw, true
 	}
 	return "", false
 }
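 // extractInlineFilePayload returns the first non-empty file_data, base64, or
 // data value, but only when the block looks like a file: its type contains
 // "file", or it carries file_data or one of the filename fields.
 func extractInlineFilePayload(block map[string]any, blockType string) (string, bool) {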
 	for _, value := range []any{block["file_data"], block["base64"], block["data"]} {
 		if raw := strings.TrimSpace(asString(value)); raw != "" {
 			if strings.Contains(blockType, "file") || block["file_data"] != nil || block["filename"] != nil || block["file_name"] != nil || block["name"] != nil {
 				return raw, true
 			}
 		}
 	}
 	return "", false
 }
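 // decodeInlinePayload accepts either a data URL or bare base64. For bare
 // base64 an explicit content type wins; otherwise the type is sniffed from the
 // decoded bytes.
 func decodeInlinePayload(raw string, explicitContentType string) ([]byte, string, error) {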
 	raw = strings.TrimSpace(raw)
 	if raw == "" {
 		return nil, "", fmt.Errorf("empty payload")
 	}
 	if isDataURL(raw) {
 		return decodeDataURL(raw, explicitContentType)
 	}
 	decoded, err := decodeBase64Flexible(raw)
 	if err != nil {
 		return nil, "", err
 	}
 	contentType := strings.TrimSpace(explicitContentType)
 	if contentType == "" && len(decoded) > 0 {
 		contentType = http.DetectContentType(decoded)
 	}
 	return decoded, contentType, nil
 }
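 // decodeDataURL splits a data URL into header and payload. The content type is
 // the explicit one if given, else the media type from the header, else
 // application/octet-stream. The payload is base64-decoded when the header
 // carries ";base64" and percent-decoded otherwise.
 func decodeDataURL(raw string, explicitContentType string) ([]byte, string, error) {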
 	raw = strings.TrimSpace(raw)
 	if !isDataURL(raw) {
 		return nil, "", fmt.Errorf("unsupported data url")
 	}
 	header, payload, ok := strings.Cut(raw, ",")
 	if !ok {
 		return nil, "", fmt.Errorf("invalid data url")
 	}
 	meta := strings.TrimSpace(strings.TrimPrefix(header, "data:"))
 	contentType := strings.TrimSpace(explicitContentType)
 	if contentType == "" {
 		contentType = "application/octet-stream"
 		if meta != "" {
 			parts := strings.Split(meta, ";")
 			if len(parts) > 0 && strings.TrimSpace(parts[0]) != "" {
 				contentType = strings.TrimSpace(parts[0])
 			}
 		}
 	}
 	if strings.Contains(strings.ToLower(meta), ";base64") {
 		decoded, err := decodeBase64Flexible(payload)
 		if err != nil {
 			return nil, "", err
 		}
 		return decoded, contentType, nil
 	}
 	decoded, err := url.PathUnescape(payload)
 	if err != nil {
 		return nil, "", err
 	}
 	return []byte(decoded), contentType, nil
 }
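 // decodeBase64Flexible tries the standard and URL-safe alphabets, each with and
 // without padding, and returns the first successful decode.
 func decodeBase64Flexible(raw string) ([]byte, error) {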
 	raw = strings.TrimSpace(raw)
 	for _, enc := range []*base64.Encoding{base64.StdEncoding, base64.RawStdEncoding, base64.URLEncoding, base64.RawURLEncoding} {
 		decoded, err := enc.DecodeString(raw)
 		if err == nil {
 			return decoded, nil
 		}
 	}
 	return nil, fmt.Errorf("invalid base64 payload")
 }
 func contentTypeFromMap(block map[string]any) string {
 	for _, value := range []any{block["mime_type"], block["mimeType"], block["content_type"], block["contentType"], block["media_type"], block["mediaType"]} {
 		if contentType := strings.TrimSpace(asString(value)); contentType != "" {
 			return contentType
 		}
 	}
 	if imageURL, ok := block["image_url"].(map[string]any); ok {
 		for _, value := range []any{imageURL["mime_type"], imageURL["mimeType"], imageURL["content_type"], imageURL["contentType"]} {
 			if contentType := strings.TrimSpace(asString(value)); contentType != "" {
 				return contentType
 			}
 		}
 	}
 	return ""
 }
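 // pickInlineFilename returns the base name of an explicit filename field when
 // one is present. Otherwise it builds prefix plus an extension derived from
 // contentType, falling back to ".bin" when the type is unknown.
 func pickInlineFilename(block map[string]any, contentType string, prefix string) string {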
 	for _, value := range []any{block["filename"], block["file_name"], block["name"]} {
 		if name := strings.TrimSpace(asString(value)); name != "" {
 			return filepath.Base(name)
 		}
 	}
 	if prefix == "" {
 		prefix = "upload"
 	}
 	ext := ".bin"
 	if parsedType := strings.TrimSpace(contentType); parsedType != "" {
 		if semi := strings.Index(parsedType, ";"); semi >= 0 {
 			parsedType = strings.TrimSpace(parsedType[:semi])
 		}
 		if exts, err := mime.ExtensionsByType(parsedType); err == nil && len(exts) > 0 && strings.TrimSpace(exts[0]) != "" {
 			ext = exts[0]
 		}
 	}
 	return prefix + ext
 }
 func defaultInlinePrefix(blockType string) string {
 	blockType = strings.ToLower(strings.TrimSpace(blockType))
 	if strings.Contains(blockType, "image") {
 		return "image"
 	}
 	return "upload"
 }
 func isDataURL(raw string) bool {
 	return strings.HasPrefix(strings.ToLower(strings.TrimSpace(raw)), "data:")
 }
 // stringsToAnySlice trims each item, drops empty ones, and returns nil when
 // nothing remains.
 func stringsToAnySlice(items []string) []any {
 	out := make([]any, 0, len(items))
 	for _, item := range items {
 		trimmed := strings.TrimSpace(item)
 		if trimmed == "" {
 			continue
 		}
 		out = append(out, trimmed)
 	}
 	if len(out) == 0 {
 		return nil
 	}
 	return out
 }
--- a/internal/adapter/openai/file_inline_upload_test.go
+++ b/internal/adapter/openai/file_inline_upload_test.go
@@ -0,0 +1,274 @@
 package openai
 import (
 	"context"
 	"encoding/json"
 	"errors"
 	"net/http"
 	"net/http/httptest"
 	"strings"
 	"testing"
 	"github.com/go-chi/chi/v5"
 	"ds2api/internal/auth"
 	"ds2api/internal/deepseek"
 )
 type inlineUploadDSStub struct {
 	uploadCalls    []deepseek.UploadFileRequest
 	lastCtx        context.Context
 	completionReq  map[string]any
 	createSession  string
 	uploadErr      error
 	completionResp *http.Response
 }
 func (m *inlineUploadDSStub) CreateSession(_ context.Context, _ *auth.RequestAuth, _ int) (string, error) {
 	if strings.TrimSpace(m.createSession) == "" {
 		return "session-id", nil
 	}
 	return m.createSession, nil
 }
 func (m *inlineUploadDSStub) GetPow(_ context.Context, _ *auth.RequestAuth, _ int) (string, error) {
 	return "pow", nil
 }
 func (m *inlineUploadDSStub) UploadFile(ctx context.Context, _ *auth.RequestAuth, req deepseek.UploadFileRequest, _ int) (*deepseek.UploadFileResult, error) {
 	m.lastCtx = ctx
 	m.uploadCalls = append(m.uploadCalls, req)
 	if m.uploadErr != nil {
 		return nil, m.uploadErr
 	}
 	return &deepseek.UploadFileResult{
 		ID:       "file-inline-1",
 		Filename: req.Filename,
 		Bytes:    int64(len(req.Data)),
 		Status:   "uploaded",
 		Purpose:  req.Purpose,
 	}, nil
 }
 func (m *inlineUploadDSStub) CallCompletion(_ context.Context, _ *auth.RequestAuth, payload map[string]any, _ string, _ int) (*http.Response, error) {
 	m.completionReq = payload
 	if m.completionResp != nil {
 		return m.completionResp, nil
 	}
 	return makeOpenAISSEHTTPResponse(
 		`data: {"p":"response/content","v":"ok"}`,
 		`data: [DONE]`,
 	), nil
 }
 func (m *inlineUploadDSStub) DeleteSessionForToken(_ context.Context, _ string, _ string) (*deepseek.DeleteSessionResult, error) {
 	return &deepseek.DeleteSessionResult{Success: true}, nil
 }
 func (m *inlineUploadDSStub) DeleteAllSessionsForToken(_ context.Context, _ string) error {
 	return nil
 }
 func TestPreprocessInlineFileInputsReplacesDataURLAndCollectsRefFileIDs(t *testing.T) {
 	ds := &inlineUploadDSStub{}
 	h := &Handler{DS: ds}
 	req := map[string]any{
 		"messages": []any{
 			map[string]any{
 				"role": "user",
 				"content": []any{
 					map[string]any{
 						"type":      "image_url",
 						"image_url": map[string]any{"url": "data:image/png;base64,QUJDRA=="},
 					},
 				},
 			},
 		},
 	}
 	ctx, cancel := context.WithCancel(context.Background())
 	defer cancel()
 	if err := h.preprocessInlineFileInputs(ctx, &auth.RequestAuth{DeepSeekToken: "token"}, req); err != nil {
 		t.Fatalf("preprocess failed: %v", err)
 	}
 	if len(ds.uploadCalls) != 1 {
 		t.Fatalf("expected 1 upload, got %d", len(ds.uploadCalls))
 	}
 	if ds.lastCtx != ctx {
 		t.Fatalf("expected upload to use request context")
 	}
 	if ds.uploadCalls[0].ContentType != "image/png" {
 		t.Fatalf("expected image/png, got %q", ds.uploadCalls[0].ContentType)
 	}
 	if ds.uploadCalls[0].Filename != "image.png" {
 		t.Fatalf("expected inferred filename image.png, got %q", ds.uploadCalls[0].Filename)
 	}
 	messages, _ := req["messages"].([]any)
 	first, _ := messages[0].(map[string]any)
 	content, _ := first["content"].([]any)
 	block, _ := content[0].(map[string]any)
 	if block["type"] != "input_image" {
 		t.Fatalf("expected input_image replacement, got %#v", block)
 	}
 	if block["file_id"] != "file-inline-1" {
 		t.Fatalf("expected file-inline-1 replacement id, got %#v", block)
 	}
 	refIDs, _ := req["ref_file_ids"].([]any)
 	if len(refIDs) != 1 || refIDs[0] != "file-inline-1" {
 		t.Fatalf("unexpected ref_file_ids: %#v", req["ref_file_ids"])
 	}
 }
 func TestPreprocessInlineFileInputsDeduplicatesIdenticalPayloads(t *testing.T) {
 	ds := &inlineUploadDSStub{}
 	h := &Handler{DS: ds}
 	req := map[string]any{
 		"messages": []any{
 			map[string]any{
 				"role": "user",
 				"content": []any{
 					map[string]any{"type": "image_url", "image_url": map[string]any{"url": "data:image/png;base64,QUJDRA=="}},
 					map[string]any{"type": "image_url", "image_url": map[string]any{"url": "data:image/png;base64,QUJDRA=="}},
 				},
 			},
 		},
 	}
 	if err := h.preprocessInlineFileInputs(context.Background(), &auth.RequestAuth{DeepSeekToken: "token"}, req); err != nil {
 		t.Fatalf("preprocess failed: %v", err)
 	}
 	if len(ds.uploadCalls) != 1 {
 		t.Fatalf("expected deduplicated single upload, got %d", len(ds.uploadCalls))
 	}
 	refIDs, _ := req["ref_file_ids"].([]any)
 	if len(refIDs) != 1 || refIDs[0] != "file-inline-1" {
 		t.Fatalf("unexpected ref_file_ids after dedupe: %#v", req["ref_file_ids"])
 	}
 }
 func TestChatCompletionsUploadsInlineFilesBeforeCompletion(t *testing.T) {
 	ds := &inlineUploadDSStub{}
 	h := &Handler{Store: mockOpenAIConfig{wideInput: true}, Auth: streamStatusAuthStub{}, DS: ds}
 	reqBody := `{"model":"deepseek-chat","messages":[{"role":"user","content":[{"type":"input_text","text":"hi"},{"type":"image_url","image_url":{"url":"data:image/png;base64,QUJDRA=="}}]}],"stream":false}`
 	req := httptest.NewRequest(http.MethodPost, "/v1/chat/completions", strings.NewReader(reqBody))
 	req.Header.Set("Authorization", "Bearer direct-token")
 	req.Header.Set("Content-Type", "application/json")
 	rec := httptest.NewRecorder()
 	h.ChatCompletions(rec, req)
 	if rec.Code != http.StatusOK {
 		t.Fatalf("expected 200, got %d body=%s", rec.Code, rec.Body.String())
 	}
 	if len(ds.uploadCalls) != 1 {
 		t.Fatalf("expected 1 upload call, got %d", len(ds.uploadCalls))
 	}
 	if ds.completionReq == nil {
 		t.Fatal("expected completion payload to be captured")
 	}
 	refIDs, _ := ds.completionReq["ref_file_ids"].([]any)
 	if len(refIDs) != 1 || refIDs[0] != "file-inline-1" {
 		t.Fatalf("unexpected completion ref_file_ids: %#v", ds.completionReq["ref_file_ids"])
 	}
 }
 func TestResponsesUploadsInlineFilesBeforeCompletion(t *testing.T) {
 	ds := &inlineUploadDSStub{}
 	h := &Handler{Store: mockOpenAIConfig{wideInput: true}, Auth: streamStatusAuthStub{}, DS: ds}
 	r := chi.NewRouter()
 	RegisterRoutes(r, h)
 	reqBody := `{"model":"deepseek-chat","input":[{"role":"user","content":[{"type":"input_text","text":"hi"},{"type":"input_image","image_url":{"url":"data:image/png;base64,QUJDRA=="}}]}],"stream":false}`
 	req := httptest.NewRequest(http.MethodPost, "/v1/responses", strings.NewReader(reqBody))
 	req.Header.Set("Authorization", "Bearer direct-token")
 	req.Header.Set("Content-Type", "application/json")
 	rec := httptest.NewRecorder()
 	r.ServeHTTP(rec, req)
 	if rec.Code != http.StatusOK {
 		t.Fatalf("expected 200, got %d body=%s", rec.Code, rec.Body.String())
 	}
 	if len(ds.uploadCalls) != 1 {
 		t.Fatalf("expected 1 upload call, got %d", len(ds.uploadCalls))
 	}
 	refIDs, _ := ds.completionReq["ref_file_ids"].([]any)
 	if len(refIDs) != 1 || refIDs[0] != "file-inline-1" {
 		t.Fatalf("unexpected completion ref_file_ids: %#v", ds.completionReq["ref_file_ids"])
 	}
 }
 func TestChatCompletionsInlineUploadFailureReturnsBadRequest(t *testing.T) {
 	ds := &inlineUploadDSStub{}
 	h := &Handler{Store: mockOpenAIConfig{wideInput: true}, Auth: streamStatusAuthStub{}, DS: ds}
 	reqBody := `{"model":"deepseek-chat","messages":[{"role":"user","content":[{"type":"image_url","image_url":{"url":"data:image/png;base64,%%%"}}]}],"stream":false}`
 	req := httptest.NewRequest(http.MethodPost, "/v1/chat/completions", strings.NewReader(reqBody))
 	req.Header.Set("Authorization", "Bearer direct-token")
 	req.Header.Set("Content-Type", "application/json")
 	rec := httptest.NewRecorder()
 	h.ChatCompletions(rec, req)
 	if rec.Code != http.StatusBadRequest {
 		t.Fatalf("expected 400, got %d body=%s", rec.Code, rec.Body.String())
 	}
 	if ds.completionReq != nil {
 		t.Fatalf("did not expect completion call on upload decode error")
 	}
 }
 func TestResponsesInlineUploadFailureReturnsInternalServerError(t *testing.T) {
 	ds := &inlineUploadDSStub{uploadErr: errors.New("boom")}
 	h := &Handler{Store: mockOpenAIConfig{wideInput: true}, Auth: streamStatusAuthStub{}, DS: ds}
 	r := chi.NewRouter()
 	RegisterRoutes(r, h)
 	reqBody := `{"model":"deepseek-chat","input":[{"role":"user","content":[{"type":"image_url","image_url":{"url":"data:image/png;base64,QUJDRA=="}}]}],"stream":false}`
 	req := httptest.NewRequest(http.MethodPost, "/v1/responses", strings.NewReader(reqBody))
 	req.Header.Set("Authorization", "Bearer direct-token")
 	req.Header.Set("Content-Type", "application/json")
 	rec := httptest.NewRecorder()
 	r.ServeHTTP(rec, req)
 	if rec.Code != http.StatusInternalServerError {
 		t.Fatalf("expected 500, got %d body=%s", rec.Code, rec.Body.String())
 	}
 	if ds.completionReq != nil {
 		t.Fatalf("did not expect completion call after upload failure")
 	}
 }
 func TestVercelPrepareUploadsInlineFilesBeforeLeasePayload(t *testing.T) {
 	t.Setenv("VERCEL", "1")
 	t.Setenv("DS2API_VERCEL_INTERNAL_SECRET", "stream-secret")
 	ds := &inlineUploadDSStub{}
 	h := &Handler{Store: mockOpenAIConfig{wideInput: true}, Auth: streamStatusAuthStub{}, DS: ds}
 	r := chi.NewRouter()
 	RegisterRoutes(r, h)
 	reqBody := `{"model":"deepseek-chat","messages":[{"role":"user","content":[{"type":"input_text","text":"hi"},{"type":"image_url","image_url":{"url":"data:image/png;base64,QUJDRA=="}}]}],"stream":true}`
 	req := httptest.NewRequest(http.MethodPost, "/v1/chat/completions?__stream_prepare=1", strings.NewReader(reqBody))
 	req.Header.Set("Authorization", "Bearer direct-token")
 	req.Header.Set("X-Ds2-Internal-Token", "stream-secret")
 	req.Header.Set("Content-Type", "application/json")
 	rec := httptest.NewRecorder()
 	r.ServeHTTP(rec, req)
 	if rec.Code != http.StatusOK {
 		t.Fatalf("expected 200, got %d body=%s", rec.Code, rec.Body.String())
 	}
 	if len(ds.uploadCalls) != 1 {
 		t.Fatalf("expected 1 upload call, got %d", len(ds.uploadCalls))
 	}
 	var out map[string]any
 	if err := json.Unmarshal(rec.Body.Bytes(), &out); err != nil {
 		t.Fatalf("decode response failed: %v body=%s", err, rec.Body.String())
 	}
 	payload, _ := out["payload"].(map[string]any)
 	if payload == nil {
 		t.Fatalf("expected payload in prepare response, got %#v", out)
 	}
 	refIDs, _ := payload["ref_file_ids"].([]any)
 	if len(refIDs) != 1 || refIDs[0] != "file-inline-1" {
 		t.Fatalf("unexpected payload ref_file_ids: %#v", payload["ref_file_ids"])
 	}
 }
--- a/internal/adapter/openai/file_refs.go
+++ b/internal/adapter/openai/file_refs.go
@@ -0,0 +1,94 @@
 package openai
 import "strings"
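 // collectOpenAIRefFileIDs gathers file IDs from the known request keys and
 // returns them deduplicated in first-seen order, or nil when none are found.
 func collectOpenAIRefFileIDs(req map[string]any) []string {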
 	if len(req) == 0 {
 		return nil
 	}
 	out := make([]string, 0, 4)
 	seen := map[string]struct{}{}
 	for _, key := range []string{
 		"ref_file_ids",
 		"file_ids",
 		"attachments",
 		"messages",
 		"input",
 	} {
 		raw := req[key]
 		if raw == nil {
 			continue
 		}
 		// Skip top-level strings for 'messages' and 'input' as they are likely plain text content,
 		// not file IDs. String file IDs are expected in 'ref_file_ids' or 'file_ids'.
 		if key == "messages" || key == "input" {
 			if _, ok := raw.(string); ok {
 				continue
 			}
 		}
 		appendOpenAIRefFileIDs(&out, seen, raw)
 	}
 	if len(out) == 0 {
 		return nil
 	}
 	return out
 }
 func appendOpenAIRefFileIDs(out *[]string, seen map[string]struct{}, raw any) {
 	switch x := raw.(type) {
 	case string:
 		addOpenAIRefFileID(out, seen, x)
 	case []string:
 		for _, item := range x {
 			addOpenAIRefFileID(out, seen, item)
 		}
 	case []any:
 		for _, item := range x {
 			appendOpenAIRefFileIDs(out, seen, item)
 		}
 	case map[string]any:
 		if fileID := strings.TrimSpace(asString(x["file_id"])); fileID != "" {
 			addOpenAIRefFileID(out, seen, fileID)
 		}
 		if strings.Contains(strings.ToLower(strings.TrimSpace(asString(x["type"]))), "file") {
 			if fileID := strings.TrimSpace(asString(x["id"])); fileID != "" {
 				addOpenAIRefFileID(out, seen, fileID)
 			}
 		}
 		if fileMap, ok := x["file"].(map[string]any); ok {
 			if fileID := strings.TrimSpace(asString(fileMap["file_id"])); fileID != "" {
 				addOpenAIRefFileID(out, seen, fileID)
 			}
 			if fileID := strings.TrimSpace(asString(fileMap["id"])); fileID != "" {
 				addOpenAIRefFileID(out, seen, fileID)
 			}
 		}
 		// Recurse only into known container keys so that unrelated fields are
 		// never scanned for file IDs.
 		for _, key := range []string{"ref_file_ids", "file_ids", "attachments", "messages", "input", "content", "files", "items", "data", "source"} {
 			if nested, ok := x[key]; ok {
 				// If it's a message content that is a string, we must NOT treat it as an ID.
 				if key == "content" || key == "input" {
 					if _, ok := nested.(string); ok {
 						continue
 					}
 				}
 				appendOpenAIRefFileIDs(out, seen, nested)
 			}
 		}
 	}
 }
 func addOpenAIRefFileID(out *[]string, seen map[string]struct{}, fileID string) {
 	fileID = strings.TrimSpace(fileID)
 	if fileID == "" {
 		return
 	}
 	if _, ok := seen[fileID]; ok {
 		return
 	}
 	seen[fileID] = struct{}{}
 	*out = append(*out, fileID)
 }
--- a/internal/adapter/openai/files_route_test.go
+++ b/internal/adapter/openai/files_route_test.go
@@ -0,0 +1,202 @@
 package openai
 import (
 	"bytes"
 	"context"
 	"encoding/json"
 	"errors"
 	"mime/multipart"
 	"net/http"
 	"net/http/httptest"
 	"testing"
 	"github.com/go-chi/chi/v5"
 	"ds2api/internal/auth"
 	"ds2api/internal/deepseek"
 )
 type managedFilesAuthStub struct{}
 func (managedFilesAuthStub) Determine(_ *http.Request) (*auth.RequestAuth, error) {
 	return &auth.RequestAuth{
 		UseConfigToken: true,
 		DeepSeekToken:  "managed-token",
 		CallerID:       "caller:test",
 		AccountID:      "acct-123",
 		TriedAccounts:  map[string]bool{},
 	}, nil
 }
 func (managedFilesAuthStub) DetermineCaller(_ *http.Request) (*auth.RequestAuth, error) {
 	return &auth.RequestAuth{
 		UseConfigToken: true,
 		DeepSeekToken:  "managed-token",
 		CallerID:       "caller:test",
 		AccountID:      "acct-123",
 		TriedAccounts:  map[string]bool{},
 	}, nil
 }
 func (managedFilesAuthStub) Release(_ *auth.RequestAuth) {}
 type filesRouteDSStub struct {
 	lastReq deepseek.UploadFileRequest
 	upload  *deepseek.UploadFileResult
 	err     error
 }
 func (m *filesRouteDSStub) CreateSession(_ context.Context, _ *auth.RequestAuth, _ int) (string, error) {
 	return "", nil
 }
 func (m *filesRouteDSStub) GetPow(_ context.Context, _ *auth.RequestAuth, _ int) (string, error) {
 	return "", nil
 }
 func (m *filesRouteDSStub) UploadFile(_ context.Context, _ *auth.RequestAuth, req deepseek.UploadFileRequest, _ int) (*deepseek.UploadFileResult, error) {
 	m.lastReq = req
 	if m.err != nil {
 		return nil, m.err
 	}
 	if m.upload != nil {
 		return m.upload, nil
 	}
 	return &deepseek.UploadFileResult{ID: "file-123", Filename: req.Filename, Bytes: int64(len(req.Data)), Purpose: req.Purpose, Status: "uploaded"}, nil
 }
 func (m *filesRouteDSStub) CallCompletion(_ context.Context, _ *auth.RequestAuth, _ map[string]any, _ string, _ int) (*http.Response, error) {
 	return nil, errors.New("not implemented")
 }
 func (m *filesRouteDSStub) DeleteSessionForToken(_ context.Context, _ string, _ string) (*deepseek.DeleteSessionResult, error) {
 	return &deepseek.DeleteSessionResult{Success: true}, nil
 }
 func (m *filesRouteDSStub) DeleteAllSessionsForToken(_ context.Context, _ string) error {
 	return nil
 }
 func newMultipartUploadRequest(t *testing.T, purpose string, filename string, data []byte) *http.Request {
 	t.Helper()
 	var body bytes.Buffer
 	writer := multipart.NewWriter(&body)
 	if purpose != "" {
 		if err := writer.WriteField("purpose", purpose); err != nil {
 			t.Fatalf("write purpose failed: %v", err)
 		}
 	}
 	part, err := writer.CreateFormFile("file", filename)
 	if err != nil {
 		t.Fatalf("create form file failed: %v", err)
 	}
 	if _, err := part.Write(data); err != nil {
 		t.Fatalf("write file failed: %v", err)
 	}
 	if err := writer.Close(); err != nil {
 		t.Fatalf("close writer failed: %v", err)
 	}
 	req := httptest.NewRequest(http.MethodPost, "/v1/files", &body)
 	req.Header.Set("Authorization", "Bearer direct-token")
 	req.Header.Set("Content-Type", writer.FormDataContentType())
 	return req
 }
 func TestFilesRouteUploadSuccess(t *testing.T) {
 	ds := &filesRouteDSStub{}
 	h := &Handler{Store: mockOpenAIConfig{wideInput: true}, Auth: streamStatusAuthStub{}, DS: ds}
 	r := chi.NewRouter()
 	RegisterRoutes(r, h)
 	req := newMultipartUploadRequest(t, "assistants", "notes.txt", []byte("hello world"))
 	rec := httptest.NewRecorder()
 	r.ServeHTTP(rec, req)
 	if rec.Code != http.StatusOK {
 		t.Fatalf("expected 200, got %d body=%s", rec.Code, rec.Body.String())
 	}
 	if ds.lastReq.Filename != "notes.txt" {
 		t.Fatalf("expected filename notes.txt, got %q", ds.lastReq.Filename)
 	}
 	if ds.lastReq.Purpose != "assistants" {
 		t.Fatalf("expected purpose assistants, got %q", ds.lastReq.Purpose)
 	}
 	if string(ds.lastReq.Data) != "hello world" {
 		t.Fatalf("unexpected uploaded data: %q", string(ds.lastReq.Data))
 	}
 	var out map[string]any
 	if err := json.Unmarshal(rec.Body.Bytes(), &out); err != nil {
 		t.Fatalf("decode response failed: %v body=%s", err, rec.Body.String())
 	}
 	if out["object"] != "file" {
 		t.Fatalf("expected file object, got %#v", out)
 	}
 	if out["id"] != "file-123" {
 		t.Fatalf("expected file id file-123, got %#v", out["id"])
 	}
 	if out["filename"] != "notes.txt" {
 		t.Fatalf("expected filename notes.txt, got %#v", out["filename"])
 	}
 }
 func TestFilesRouteUploadIncludesAccountIDForManagedAccount(t *testing.T) {
 	ds := &filesRouteDSStub{}
 	h := &Handler{Store: mockOpenAIConfig{wideInput: true}, Auth: managedFilesAuthStub{}, DS: ds}
 	r := chi.NewRouter()
 	RegisterRoutes(r, h)
 	req := newMultipartUploadRequest(t, "assistants", "notes.txt", []byte("hello world"))
 	rec := httptest.NewRecorder()
 	r.ServeHTTP(rec, req)
 	if rec.Code != http.StatusOK {
 		t.Fatalf("expected 200, got %d body=%s", rec.Code, rec.Body.String())
 	}
 	var out map[string]any
 	if err := json.Unmarshal(rec.Body.Bytes(), &out); err != nil {
 		t.Fatalf("decode response failed: %v body=%s", err, rec.Body.String())
 	}
 	if out["account_id"] != "acct-123" {
 		t.Fatalf("expected account_id acct-123, got %#v", out["account_id"])
 	}
 }
 func TestFilesRouteRejectsNonMultipart(t *testing.T) {
 	h := &Handler{Store: mockOpenAIConfig{wideInput: true}, Auth: streamStatusAuthStub{}, DS: &filesRouteDSStub{}}
 	r := chi.NewRouter()
 	RegisterRoutes(r, h)
 	req := httptest.NewRequest(http.MethodPost, "/v1/files", bytes.NewBufferString(`{"purpose":"assistants"}`))
 	req.Header.Set("Authorization", "Bearer direct-token")
 	req.Header.Set("Content-Type", "application/json")
 	rec := httptest.NewRecorder()
 	r.ServeHTTP(rec, req)
 	if rec.Code != http.StatusBadRequest {
 		t.Fatalf("expected 400, got %d body=%s", rec.Code, rec.Body.String())
 	}
 }
 func TestFilesRouteRequiresFileField(t *testing.T) {
 	h := &Handler{Store: mockOpenAIConfig{wideInput: true}, Auth: streamStatusAuthStub{}, DS: &filesRouteDSStub{}}
 	r := chi.NewRouter()
 	RegisterRoutes(r, h)
 	var body bytes.Buffer
 	writer := multipart.NewWriter(&body)
 	if err := writer.WriteField("purpose", "assistants"); err != nil {
 		t.Fatalf("write field failed: %v", err)
 	}
 	if err := writer.Close(); err != nil {
 		t.Fatalf("close writer failed: %v", err)
 	}
 	req := httptest.NewRequest(http.MethodPost, "/v1/files", &body)
 	req.Header.Set("Authorization", "Bearer direct-token")
 	req.Header.Set("Content-Type", writer.FormDataContentType())
 	rec := httptest.NewRecorder()
 	r.ServeHTTP(rec, req)
 	if rec.Code != http.StatusBadRequest {
 		t.Fatalf("expected 400, got %d body=%s", rec.Code, rec.Body.String())
 	}
 }
--- a/internal/adapter/openai/handler_chat.go
+++ b/internal/adapter/openai/handler_chat.go
@@ -5,6 +5,7 @@ import (
 	"encoding/json"
 	"io"
 	"net/http"
 	"strings"
 	"time"
 	"ds2api/internal/auth"
@@ -43,11 +44,20 @@ func (h *Handler) ChatCompletions(w http.ResponseWriter, r *http.Request) {
 	r = r.WithContext(auth.WithAuth(r.Context(), a))
 	r.Body = http.MaxBytesReader(w, r.Body, openAIGeneralMaxSize)
 	var req map[string]any
 	if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
 		if strings.Contains(strings.ToLower(err.Error()), "too large") {
 			writeOpenAIError(w, http.StatusRequestEntityTooLarge, "request body too large")
 			return
 		}
 		writeOpenAIError(w, http.StatusBadRequest, "invalid json")
 		return
 	}
 	if err := h.preprocessInlineFileInputs(r.Context(), a, req); err != nil {
 		writeOpenAIInlineFileError(w, err)
 		return
 	}
 	stdReq, err := normalizeOpenAIChatRequest(h.Store, req, requestTraceID(r))
 	if err != nil {
 		writeOpenAIError(w, http.StatusBadRequest, err.Error())
@@ -116,7 +126,7 @@ func (h *Handler) autoDeleteRemoteSession(ctx context.Context, a *auth.RequestAu
 func (h *Handler) handleNonStream(w http.ResponseWriter, ctx context.Context, resp *http.Response, completionID, model, finalPrompt string, thinkingEnabled bool, toolNames []string) {
 	if resp.StatusCode != http.StatusOK {
-		defer resp.Body.Close()
+		defer func() { _ = resp.Body.Close() }()
 		body, _ := io.ReadAll(resp.Body)
 		writeOpenAIError(w, resp.StatusCode, string(body))
 		return
@@ -127,23 +137,15 @@ func (h *Handler) handleNonStream(w http.ResponseWriter, ctx context.Context, re
 	stripReferenceMarkers := h.compatStripReferenceMarkers()
 	finalThinking := cleanVisibleOutput(result.Thinking, stripReferenceMarkers)
 	finalText := cleanVisibleOutput(result.Text, stripReferenceMarkers)
-	if writeUpstreamEmptyOutputError(w, finalThinking, finalText, result.ContentFilter) {
+	if writeUpstreamEmptyOutputError(w, finalText, result.ContentFilter) {
 		return
 	}
 	respBody := openaifmt.BuildChatCompletion(completionID, model, finalPrompt, finalThinking, finalText, toolNames)
 	if result.OutputTokens > 0 {
 		if usage, ok := respBody["usage"].(map[string]any); ok {
 			usage["completion_tokens"] = result.OutputTokens
 			if prompt, ok := usage["prompt_tokens"].(int); ok {
 				usage["total_tokens"] = prompt + result.OutputTokens
 			}
 		}
 	}
 	writeJSON(w, http.StatusOK, respBody)
 }
 func (h *Handler) handleStream(w http.ResponseWriter, r *http.Request, resp *http.Response, completionID, model, finalPrompt string, thinkingEnabled, searchEnabled bool, toolNames []string) {
-	defer resp.Body.Close()
+	defer func() { _ = resp.Body.Close() }()
 	if resp.StatusCode != http.StatusOK {
 		body, _ := io.ReadAll(resp.Body)
 		writeOpenAIError(w, resp.StatusCode, string(body))
--- a/internal/adapter/openai/handler_chat_auto_delete_test.go
+++ b/internal/adapter/openai/handler_chat_auto_delete_test.go
@@ -27,6 +27,10 @@ func (m *autoDeleteModeDSStub) GetPow(_ context.Context, _ *auth.RequestAuth, _
 	return "pow", nil
 }
 func (m *autoDeleteModeDSStub) UploadFile(_ context.Context, _ *auth.RequestAuth, _ deepseek.UploadFileRequest, _ int) (*deepseek.UploadFileResult, error) {
 	return &deepseek.UploadFileResult{ID: "file-id", Filename: "file.txt", Bytes: 1, Status: "uploaded"}, nil
 }
 func (m *autoDeleteModeDSStub) CallCompletion(_ context.Context, _ *auth.RequestAuth, _ map[string]any, _ string, _ int) (*http.Response, error) {
 	return m.resp, nil
 }
@@ -107,7 +111,7 @@ type autoDeleteCtxDSStub struct {
 }
 func (m *autoDeleteCtxDSStub) DeleteSessionForToken(ctx context.Context, token string, sessionID string) (*deepseek.DeleteSessionResult, error) {
-	return m.autoDeleteModeDSStub.DeleteSessionForTokenCtx(ctx, token, sessionID)
+	return m.DeleteSessionForTokenCtx(ctx, token, sessionID)
 }
 func (m *autoDeleteCtxDSStub) DeleteAllSessionsForToken(_ context.Context, _ string) error {
--- a/internal/adapter/openai/handler_files.go
+++ b/internal/adapter/openai/handler_files.go
@@ -0,0 +1,104 @@
 package openai
 import (
 	"io"
 	"net/http"
 	"strings"
 	"time"
 	"ds2api/internal/auth"
 	"ds2api/internal/deepseek"
 )
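 // openAIUploadMaxMemory is the portion of a multipart form (32 MiB) that
 // ParseMultipartForm keeps in memory; the remainder spills to temporary files.
 const openAIUploadMaxMemory = 32 << 20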
 func (h *Handler) UploadFile(w http.ResponseWriter, r *http.Request) {
 	a, err := h.Auth.Determine(r)
 	if err != nil {
 		status := http.StatusUnauthorized
 		detail := err.Error()
 		if err == auth.ErrNoAccount {
 			status = http.StatusTooManyRequests
 		}
 		writeOpenAIError(w, status, detail)
 		return
 	}
 	defer h.Auth.Release(a)
 	if !strings.HasPrefix(strings.ToLower(strings.TrimSpace(r.Header.Get("Content-Type"))), "multipart/form-data") {
 		writeOpenAIError(w, http.StatusBadRequest, "content-type must be multipart/form-data")
 		return
 	}
 	// Enforce a hard cap on the total request body size to prevent OOM
 	r.Body = http.MaxBytesReader(w, r.Body, openAIUploadMaxSize)
 	if err := r.ParseMultipartForm(openAIUploadMaxMemory); err != nil {
 		if strings.Contains(strings.ToLower(err.Error()), "too large") {
 			writeOpenAIError(w, http.StatusRequestEntityTooLarge, "file size exceeds limit")
 			return
 		}
 		writeOpenAIError(w, http.StatusBadRequest, "invalid multipart form")
 		return
 	}
 	if r.MultipartForm != nil {
 		defer func() { _ = r.MultipartForm.RemoveAll() }()
 	}
 	r = r.WithContext(auth.WithAuth(r.Context(), a))
 	file, header, err := r.FormFile("file")
 	if err != nil {
 		writeOpenAIError(w, http.StatusBadRequest, "file is required")
 		return
 	}
 	defer func() { _ = file.Close() }()
 	data, err := io.ReadAll(file)
 	if err != nil {
 		writeOpenAIError(w, http.StatusBadRequest, "failed to read uploaded file")
 		return
 	}
 	contentType := strings.TrimSpace(header.Header.Get("Content-Type"))
 	if contentType == "" && len(data) > 0 {
 		contentType = http.DetectContentType(data)
 	}
 	result, err := h.DS.UploadFile(r.Context(), a, deepseek.UploadFileRequest{
 		Filename:    header.Filename,
 		ContentType: contentType,
 		Purpose:     strings.TrimSpace(r.FormValue("purpose")),
 		Data:        data,
 	}, 3)
 	if err != nil {
 		writeOpenAIError(w, http.StatusInternalServerError, "Failed to upload file.")
 		return
 	}
 	if result != nil && result.AccountID == "" {
 		result.AccountID = a.AccountID
 	}
 	writeJSON(w, http.StatusOK, buildOpenAIFileObject(result))
 }
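 // buildOpenAIFileObject converts an upload result into an OpenAI-style file
 // object. created_at is the time of this call, not an upstream timestamp.
 func buildOpenAIFileObject(result *deepseek.UploadFileResult) map[string]any {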
 	if result == nil {
 		obj := map[string]any{
 			"id":             "",
 			"object":         "file",
 			"bytes":          0,
 			"created_at":     time.Now().Unix(),
 			"filename":       "",
 			"purpose":        "",
 			"status":         "uploaded",
 			"status_details": nil,
 		}
 		return obj
 	}
 	obj := map[string]any{
 		"id":             result.ID,
 		"object":         "file",
 		"bytes":          result.Bytes,
 		"created_at":     time.Now().Unix(),
 		"filename":       result.Filename,
 		"purpose":        result.Purpose,
 		"status":         result.Status,
 		"status_details": nil,
 	}
 	if result.AccountID != "" {
 		obj["account_id"] = result.AccountID
 	}
 	return obj
 }
--- a/internal/adapter/openai/handler_routes.go
+++ b/internal/adapter/openai/handler_routes.go
@@ -13,6 +13,13 @@ import (
 	"ds2api/internal/util"
 )
 const (
 	// openAIUploadMaxSize limits total multipart request body size (100 MiB).
 	openAIUploadMaxSize = 100 << 20
 	// openAIGeneralMaxSize limits total JSON request body size (100 MiB).
 	openAIGeneralMaxSize = 100 << 20
 )
 // writeJSON is a package-internal alias kept to avoid mass-renaming across
 // every call-site in this package.
 var writeJSON = util.WriteJSON
@@ -46,6 +53,7 @@ func RegisterRoutes(r chi.Router, h *Handler) {
 	r.Post("/v1/chat/completions", h.ChatCompletions)
 	r.Post("/v1/responses", h.Responses)
 	r.Get("/v1/responses/{response_id}", h.GetResponseByID)
 	r.Post("/v1/files", h.UploadFile)
 	r.Post("/v1/embeddings", h.Embeddings)
 }
--- a/internal/adapter/openai/handler_toolcall_format.go
+++ b/internal/adapter/openai/handler_toolcall_format.go
@@ -1,6 +1,7 @@
 package openai
 import (
 	"ds2api/internal/toolcall"
 	"encoding/json"
 	"fmt"
 	"strings"
@@ -75,7 +76,7 @@ func injectToolPrompt(messages []map[string]any, tools []any, policy util.ToolCh
 // buildToolCallInstructions delegates to the shared util implementation.
 func buildToolCallInstructions(toolNames []string) string {
-	return util.BuildToolCallInstructions(toolNames)
+	return toolcall.BuildToolCallInstructions(toolNames)
 }
 func formatIncrementalStreamToolCallDeltas(deltas []toolCallDelta, ids map[int]string) []map[string]any {
@@ -112,7 +113,7 @@ func formatIncrementalStreamToolCallDeltas(deltas []toolCallDelta, ids map[int]s
 	return out
 }
-func filterIncrementalToolCallDeltasByAllowed(deltas []toolCallDelta, allowedNames []string, seenNames map[int]string) []toolCallDelta {
+func filterIncrementalToolCallDeltasByAllowed(deltas []toolCallDelta, seenNames map[int]string) []toolCallDelta {
 	if len(deltas) == 0 {
 		return nil
 	}
@@ -138,7 +139,7 @@ func filterIncrementalToolCallDeltasByAllowed(deltas []toolCallDelta, allowedNam
 	return out
 }
-func formatFinalStreamToolCallsWithStableIDs(calls []util.ParsedToolCall, ids map[int]string) []map[string]any {
+func formatFinalStreamToolCallsWithStableIDs(calls []toolcall.ParsedToolCall, ids map[int]string) []map[string]any {
 	if len(calls) == 0 {
 		return nil
 	}
--- a/internal/adapter/openai/handler_toolcall_test.go
+++ b/internal/adapter/openai/handler_toolcall_test.go
@@ -275,7 +275,7 @@ func TestHandleNonStreamFencedToolCallExamplePromotesToolCall(t *testing.T) {
 	TestHandleNonStreamFencedToolCallExampleDoesNotPromoteToolCall(t)
 }
-func TestHandleNonStreamReturns502WhenUpstreamOutputEmpty(t *testing.T) {
+func TestHandleNonStreamReturns429WhenUpstreamOutputEmpty(t *testing.T) {
 	h := &Handler{}
 	resp := makeSSEHTTPResponse(
 		`data: {"p":"response/content","v":""}`,
@@ -284,8 +284,8 @@ func TestHandleNonStreamReturns502WhenUpstreamOutputEmpty(t *testing.T) {
 	rec := httptest.NewRecorder()
 	h.handleNonStream(rec, context.Background(), resp, "cid-empty", "deepseek-chat", "prompt", false, nil)
-	if rec.Code != http.StatusBadGateway {
+	if rec.Code != http.StatusTooManyRequests {
-		t.Fatalf("expected status 502 for empty upstream output, got %d body=%s", rec.Code, rec.Body.String())
+		t.Fatalf("expected status 429 for empty upstream output, got %d body=%s", rec.Code, rec.Body.String())
 	}
 	out := decodeJSONBody(t, rec.Body.String())
 	errObj, _ := out["error"].(map[string]any)
@@ -313,6 +313,25 @@ func TestHandleNonStreamReturnsContentFilterErrorWhenUpstreamFilteredWithoutOutp
 	}
 }
 func TestHandleNonStreamReturns429WhenUpstreamHasOnlyThinking(t *testing.T) {
 	h := &Handler{}
 	resp := makeSSEHTTPResponse(
 		`data: {"p":"response/thinking_content","v":"Only thinking"}`,
 		`data: [DONE]`,
 	)
 	rec := httptest.NewRecorder()
 	h.handleNonStream(rec, context.Background(), resp, "cid-thinking-only", "deepseek-reasoner", "prompt", true, nil)
 	if rec.Code != http.StatusTooManyRequests {
 		t.Fatalf("expected status 429 for thinking-only upstream output, got %d body=%s", rec.Code, rec.Body.String())
 	}
 	out := decodeJSONBody(t, rec.Body.String())
 	errObj, _ := out["error"].(map[string]any)
 	if asString(errObj["code"]) != "upstream_empty_output" {
 		t.Fatalf("expected code=upstream_empty_output, got %#v", out)
 	}
 }
 func TestHandleStreamToolCallInterceptsWithoutRawContentLeak(t *testing.T) {
 	h := &Handler{}
 	resp := makeSSEHTTPResponse(
--- a/internal/adapter/openai/leaked_output_sanitize.go
+++ b/internal/adapter/openai/leaked_output_sanitize.go
@@ -2,13 +2,21 @@ package openai
 import (
 	"regexp"
 	"strings"
 )
 var emptyJSONFencePattern = regexp.MustCompile("(?is)```json\\s*```")
 var leakedToolCallArrayPattern = regexp.MustCompile(`(?is)\[\{\s*"function"\s*:\s*\{[\s\S]*?\}\s*,\s*"id"\s*:\s*"call[^"]*"\s*,\s*"type"\s*:\s*"function"\s*}\]`)
 var leakedToolResultBlobPattern = regexp.MustCompile(`(?is)<\s*\|\s*tool\s*\|\s*>\s*\{[\s\S]*?"tool_call_id"\s*:\s*"call[^"]*"\s*}`)
-// leakedMetaMarkerPattern matches DeepSeek special tokens in BOTH forms:
+var leakedThinkTagPattern = regexp.MustCompile(`(?is)</?\s*think\s*>`)
 // leakedBOSMarkerPattern matches DeepSeek BOS markers in BOTH forms:
 //   - ASCII underscore: <｜begin_of_sentence｜>
 //   - U+2581 variant:   <｜begin▁of▁sentence｜>
 var leakedBOSMarkerPattern = regexp.MustCompile(`(?i)<[｜\|]\s*begin[_▁]of[_▁]sentence\s*[｜\|]>`)
 // leakedMetaMarkerPattern matches the remaining DeepSeek special tokens in BOTH forms:
 //   - ASCII underscore: <｜end_of_sentence｜>, <｜end_of_toolresults｜>, <｜end_of_instructions｜>
 //   - U+2581 variant:   <｜end▁of▁sentence｜>, <｜end▁of▁toolresults｜>, <｜end▁of▁instructions｜>
 var leakedMetaMarkerPattern = regexp.MustCompile(`(?i)<[｜\|]\s*(?:assistant|tool|end[_▁]of[_▁]sentence|end[_▁]of[_▁]thinking|end[_▁]of[_▁]toolresults|end[_▁]of[_▁]instructions)\s*[｜\|]>`)
@@ -35,11 +43,48 @@ func sanitizeLeakedOutput(text string) string {
 	out := emptyJSONFencePattern.ReplaceAllString(text, "")
 	out = leakedToolCallArrayPattern.ReplaceAllString(out, "")
 	out = leakedToolResultBlobPattern.ReplaceAllString(out, "")
 	out = stripDanglingThinkSuffix(out)
 	out = leakedThinkTagPattern.ReplaceAllString(out, "")
 	out = leakedBOSMarkerPattern.ReplaceAllString(out, "")
 	out = leakedMetaMarkerPattern.ReplaceAllString(out, "")
 	out = sanitizeLeakedAgentXMLBlocks(out)
 	return out
 }
 func stripDanglingThinkSuffix(text string) string {
 	matches := leakedThinkTagPattern.FindAllStringIndex(text, -1)
 	if len(matches) == 0 {
 		return text
 	}
 	depth := 0
 	lastOpen := -1
 	for _, loc := range matches {
 		tag := strings.ToLower(text[loc[0]:loc[1]])
 		compact := strings.ReplaceAll(strings.ReplaceAll(strings.TrimSpace(tag), " ", ""), "\t", "")
 		if strings.HasPrefix(compact, "</") {
 			if depth > 0 {
 				depth--
 				if depth == 0 {
 					lastOpen = -1
 				}
 			}
 			continue
 		}
 		if depth == 0 {
 			lastOpen = loc[0]
 		}
 		depth++
 	}
 	if depth == 0 || lastOpen < 0 {
 		return text
 	}
 	prefix := text[:lastOpen]
 	if strings.TrimSpace(prefix) == "" {
 		return ""
 	}
 	return prefix
 }
 func sanitizeLeakedAgentXMLBlocks(text string) string {
 	out := text
 	for _, pattern := range leakedAgentXMLBlockPatterns {
--- a/internal/adapter/openai/leaked_output_sanitize_test.go
+++ b/internal/adapter/openai/leaked_output_sanitize_test.go
@@ -26,6 +26,22 @@ func TestSanitizeLeakedOutputRemovesStandaloneMetaMarkers(t *testing.T) {
 	}
 }
 func TestSanitizeLeakedOutputRemovesThinkAndBosMarkers(t *testing.T) {
 	raw := "A<think>B</think>C<｜begin▁of▁sentence｜>D<| begin_of_sentence |>E<｜begin_of_sentence｜>F"
 	got := sanitizeLeakedOutput(raw)
 	if got != "ABCDEF" {
 		t.Fatalf("unexpected sanitize result for think/BOS markers: %q", got)
 	}
 }
 func TestSanitizeLeakedOutputRemovesDanglingThinkBlock(t *testing.T) {
 	raw := "Answer prefix<think>internal reasoning that never closes"
 	got := sanitizeLeakedOutput(raw)
 	if got != "Answer prefix" {
 		t.Fatalf("unexpected sanitize result for dangling think block: %q", got)
 	}
 }
 func TestSanitizeLeakedOutputRemovesAgentXMLLeaks(t *testing.T) {
 	raw := "Done.<attempt_completion><result>Some final answer</result></attempt_completion>"
 	got := sanitizeLeakedOutput(raw)
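Review note: the dangling-think handling added in leaked_output_sanitize.go can be exercised outside the package. The sketch below is a minimal standalone copy of that logic, not the package code itself; the names `thinkTag` and `stripDanglingThink` are illustrative. It relies on the fact that the pattern only ever matches closing tags that begin with `</`, so a prefix check is enough to tell open from close.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// thinkTag mirrors leakedThinkTagPattern from the diff: it matches
// <think> and </think>, case-insensitively, with optional inner spaces.
var thinkTag = regexp.MustCompile(`(?is)</?\s*think\s*>`)

// stripDanglingThink drops everything from the outermost unclosed
// <think> tag to the end of the text. Balanced tags are left alone.
func stripDanglingThink(text string) string {
	depth, lastOpen := 0, -1
	for _, loc := range thinkTag.FindAllStringIndex(text, -1) {
		if strings.HasPrefix(text[loc[0]:loc[1]], "</") {
			// Closing tag: unwind one level, ignoring stray closers.
			if depth > 0 {
				depth--
				if depth == 0 {
					lastOpen = -1
				}
			}
			continue
		}
		// Opening tag: remember where the outermost open block starts.
		if depth == 0 {
			lastOpen = loc[0]
		}
		depth++
	}
	if depth == 0 || lastOpen < 0 {
		return text
	}
	prefix := text[:lastOpen]
	if strings.TrimSpace(prefix) == "" {
		return ""
	}
	return prefix
}

func main() {
	fmt.Printf("%q\n", stripDanglingThink("Answer prefix<think>internal reasoning that never closes"))
	fmt.Printf("%q\n", stripDanglingThink("A<think>B</think>C"))
}
```

Balanced tags survive this step on purpose: in the diff, `sanitizeLeakedOutput` removes them afterwards with a separate `ReplaceAllString` call.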
--- a/internal/adapter/openai/models_route_test.go
+++ b/internal/adapter/openai/models_route_test.go
@@ -22,6 +22,24 @@ func TestGetModelRouteDirectAndAlias(t *testing.T) {
 		}
 	})
 	t.Run("direct_expert", func(t *testing.T) {
 		req := httptest.NewRequest(http.MethodGet, "/v1/models/deepseek-expert-chat", nil)
 		rec := httptest.NewRecorder()
 		r.ServeHTTP(rec, req)
 		if rec.Code != http.StatusOK {
 			t.Fatalf("expected 200, got %d body=%s", rec.Code, rec.Body.String())
 		}
 	})
 	t.Run("direct_vision", func(t *testing.T) {
 		req := httptest.NewRequest(http.MethodGet, "/v1/models/deepseek-vision-chat", nil)
 		rec := httptest.NewRecorder()
 		r.ServeHTTP(rec, req)
 		if rec.Code != http.StatusOK {
 			t.Fatalf("expected 200, got %d body=%s", rec.Code, rec.Body.String())
 		}
 	})
 	t.Run("alias", func(t *testing.T) {
 		req := httptest.NewRequest(http.MethodGet, "/v1/models/gpt-4.1", nil)
 		rec := httptest.NewRecorder()
--- a/internal/adapter/openai/prompt_build.go
+++ b/internal/adapter/openai/prompt_build.go
@@ -5,22 +5,22 @@ import (
 	"ds2api/internal/util"
 )
-func buildOpenAIFinalPrompt(messagesRaw []any, toolsRaw any, traceID string) (string, []string) {
+func buildOpenAIFinalPrompt(messagesRaw []any, toolsRaw any, traceID string, thinkingEnabled bool) (string, []string) {
-	return buildOpenAIFinalPromptWithPolicy(messagesRaw, toolsRaw, traceID, util.DefaultToolChoicePolicy())
+	return buildOpenAIFinalPromptWithPolicy(messagesRaw, toolsRaw, traceID, util.DefaultToolChoicePolicy(), thinkingEnabled)
 }
-func buildOpenAIFinalPromptWithPolicy(messagesRaw []any, toolsRaw any, traceID string, toolPolicy util.ToolChoicePolicy) (string, []string) {
+func buildOpenAIFinalPromptWithPolicy(messagesRaw []any, toolsRaw any, traceID string, toolPolicy util.ToolChoicePolicy, thinkingEnabled bool) (string, []string) {
 	messages := normalizeOpenAIMessagesForPrompt(messagesRaw, traceID)
 	toolNames := []string{}
 	if tools, ok := toolsRaw.([]any); ok && len(tools) > 0 {
 		messages, toolNames = injectToolPrompt(messages, tools, toolPolicy)
 	}
-	return deepseek.MessagesPrepare(messages), toolNames
+	return deepseek.MessagesPrepareWithThinking(messages, thinkingEnabled), toolNames
 }
 // BuildPromptForAdapter exposes the OpenAI-compatible prompt building flow so
 // other protocol adapters (for example Gemini) can reuse the same tool/history
 // normalization logic and remain behavior-compatible with chat/completions.
-func BuildPromptForAdapter(messagesRaw []any, toolsRaw any, traceID string) (string, []string) {
+func BuildPromptForAdapter(messagesRaw []any, toolsRaw any, traceID string, thinkingEnabled bool) (string, []string) {
-	return buildOpenAIFinalPrompt(messagesRaw, toolsRaw, traceID)
+	return buildOpenAIFinalPrompt(messagesRaw, toolsRaw, traceID, thinkingEnabled)
 }
--- a/internal/adapter/openai/prompt_build_test.go
+++ b/internal/adapter/openai/prompt_build_test.go
@@ -40,7 +40,7 @@ func TestBuildOpenAIFinalPrompt_HandlerPathIncludesToolRoundtripSemantics(t *tes
 		},
 	}
-	finalPrompt, toolNames := buildOpenAIFinalPrompt(messages, tools, "")
+	finalPrompt, toolNames := buildOpenAIFinalPrompt(messages, tools, "", false)
 	if len(toolNames) != 1 || toolNames[0] != "get_weather" {
 		t.Fatalf("unexpected tool names: %#v", toolNames)
 	}
@@ -73,17 +73,14 @@ func TestBuildOpenAIFinalPrompt_VercelPreparePathKeepsFinalAnswerInstruction(t *
 		},
 	}
-	finalPrompt, _ := buildOpenAIFinalPrompt(messages, tools, "")
+	finalPrompt, _ := buildOpenAIFinalPrompt(messages, tools, "", false)
-	if !strings.Contains(finalPrompt, "After receiving a tool result, use it directly.") {
+	if !strings.Contains(finalPrompt, "Remember: Output ONLY the <tool_calls>...</tool_calls> XML block when calling tools.") {
-		t.Fatalf("vercel prepare finalPrompt missing final-answer instruction: %q", finalPrompt)
+		t.Fatalf("vercel prepare finalPrompt missing final tool-call anchor instruction: %q", finalPrompt)
 	}
 	if !strings.Contains(finalPrompt, "Only call another tool if the result is insufficient.") {
 		t.Fatalf("vercel prepare finalPrompt missing retry guard instruction: %q", finalPrompt)
 	}
 	if !strings.Contains(finalPrompt, "TOOL CALL FORMAT") {
 		t.Fatalf("vercel prepare finalPrompt missing xml format instruction: %q", finalPrompt)
 	}
-	if !strings.Contains(finalPrompt, "Do NOT wrap the XML in markdown code fences") {
+	if !strings.Contains(finalPrompt, "Do NOT wrap XML in markdown fences") {
 		t.Fatalf("vercel prepare finalPrompt missing no-fence xml instruction: %q", finalPrompt)
 	}
 	if strings.Contains(finalPrompt, "```json") {
--- a/internal/adapter/openai/responses_embeddings_test.go
+++ b/internal/adapter/openai/responses_embeddings_test.go
@@ -156,6 +156,33 @@ func TestNormalizeResponsesInputAsMessagesFunctionCallItemPreservesConcatenatedA
 	}
 }
 func TestCollectOpenAIRefFileIDs(t *testing.T) {
 	got := collectOpenAIRefFileIDs(map[string]any{
 		"ref_file_ids": []any{"file-top", "file-dup"},
 		"attachments": []any{
 			map[string]any{"file_id": "file-attachment"},
 		},
 		"input": []any{
 			map[string]any{
 				"type": "message",
 				"content": []any{
 					map[string]any{"type": "input_file", "file_id": "file-input"},
 					map[string]any{"type": "input_file", "id": "file-dup"},
 				},
 			},
 		},
 	})
 	want := []string{"file-top", "file-dup", "file-attachment", "file-input"}
 	if len(got) != len(want) {
 		t.Fatalf("expected %d file ids, got %#v", len(want), got)
 	}
 	for i, id := range want {
 		if got[i] != id {
 			t.Fatalf("unexpected file ids at %d: got=%#v want=%#v", i, got, want)
 		}
 	}
 }
 func TestExtractEmbeddingInputs(t *testing.T) {
 	got := extractEmbeddingInputs([]any{"a", "b"})
 	if len(got) != 2 || got[0] != "a" || got[1] != "b" {
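Review note: the implementation of `collectOpenAIRefFileIDs` is not part of this excerpt. The sketch below is a hypothetical stand-in, written only from what `TestCollectOpenAIRefFileIDs` asserts: IDs are gathered from `ref_file_ids`, then `attachments[].file_id`, then `input[].content[]` items of type `input_file` (via `file_id` or `id`), keeping first-seen order and dropping duplicates. The function name and the whitespace trimming are assumptions.

```go
package main

import (
	"fmt"
	"strings"
)

// collectRefFileIDsSketch is a hypothetical illustration, not the
// function from the PR. It walks the three locations the test covers.
func collectRefFileIDsSketch(req map[string]any) []string {
	seen := map[string]struct{}{}
	out := []string{}
	add := func(v any) {
		id, _ := v.(string)
		id = strings.TrimSpace(id)
		if id == "" {
			return
		}
		if _, dup := seen[id]; dup {
			return
		}
		seen[id] = struct{}{}
		out = append(out, id)
	}
	if ids, ok := req["ref_file_ids"].([]any); ok {
		for _, id := range ids {
			add(id)
		}
	}
	if atts, ok := req["attachments"].([]any); ok {
		for _, raw := range atts {
			if att, ok := raw.(map[string]any); ok {
				add(att["file_id"])
			}
		}
	}
	if items, ok := req["input"].([]any); ok {
		for _, raw := range items {
			msg, _ := raw.(map[string]any)
			parts, _ := msg["content"].([]any)
			for _, rawPart := range parts {
				part, _ := rawPart.(map[string]any)
				if part["type"] != "input_file" {
					continue
				}
				add(part["file_id"])
				add(part["id"])
			}
		}
	}
	return out
}

func main() {
	fmt.Println(collectRefFileIDsSketch(map[string]any{
		"ref_file_ids": []any{"file-top", "file-dup"},
		"attachments":  []any{map[string]any{"file_id": "file-attachment"}},
		"input": []any{map[string]any{
			"type": "message",
			"content": []any{
				map[string]any{"type": "input_file", "file_id": "file-input"},
				map[string]any{"type": "input_file", "id": "file-dup"},
			},
		}},
	}))
}
```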
--- a/internal/adapter/openai/responses_handler.go
+++ b/internal/adapter/openai/responses_handler.go
@@ -1,6 +1,7 @@
 package openai
 import (
 	"ds2api/internal/toolcall"
 	"encoding/json"
 	"io"
 	"net/http"
@@ -64,11 +65,20 @@ func (h *Handler) Responses(w http.ResponseWriter, r *http.Request) {
 		return
 	}
 	r.Body = http.MaxBytesReader(w, r.Body, openAIGeneralMaxSize)
 	var req map[string]any
 	if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
 		if strings.Contains(strings.ToLower(err.Error()), "too large") {
 			writeOpenAIError(w, http.StatusRequestEntityTooLarge, "request body too large")
 			return
 		}
 		writeOpenAIError(w, http.StatusBadRequest, "invalid json")
 		return
 	}
 	if err := h.preprocessInlineFileInputs(r.Context(), a, req); err != nil {
 		writeOpenAIInlineFileError(w, err)
 		return
 	}
 	traceID := requestTraceID(r)
 	stdReq, err := normalizeOpenAIResponsesRequest(h.Store, req, traceID)
 	if err != nil {
@@ -106,7 +116,7 @@ func (h *Handler) Responses(w http.ResponseWriter, r *http.Request) {
 }
 func (h *Handler) handleResponsesNonStream(w http.ResponseWriter, resp *http.Response, owner, responseID, model, finalPrompt string, thinkingEnabled bool, toolNames []string, toolChoice util.ToolChoicePolicy, traceID string) {
-	defer resp.Body.Close()
+	defer func() { _ = resp.Body.Close() }()
 	if resp.StatusCode != http.StatusOK {
 		body, _ := io.ReadAll(resp.Body)
 		writeOpenAIError(w, resp.StatusCode, strings.TrimSpace(string(body)))
@@ -116,10 +126,10 @@ func (h *Handler) handleResponsesNonStream(w http.ResponseWriter, resp *http.Res
 	stripReferenceMarkers := h.compatStripReferenceMarkers()
 	sanitizedThinking := cleanVisibleOutput(result.Thinking, stripReferenceMarkers)
 	sanitizedText := cleanVisibleOutput(result.Text, stripReferenceMarkers)
-	if writeUpstreamEmptyOutputError(w, sanitizedThinking, sanitizedText, result.ContentFilter) {
+	if writeUpstreamEmptyOutputError(w, sanitizedText, result.ContentFilter) {
 		return
 	}
-	textParsed := util.ParseStandaloneToolCallsDetailed(sanitizedText, toolNames)
+	textParsed := toolcall.ParseStandaloneToolCallsDetailed(sanitizedText, toolNames)
 	logResponsesToolPolicyRejection(traceID, toolChoice, textParsed, "text")
 	callCount := len(textParsed.Calls)
@@ -129,20 +139,12 @@ func (h *Handler) handleResponsesNonStream(w http.ResponseWriter, resp *http.Res
 	}
 	responseObj := openaifmt.BuildResponseObject(responseID, model, finalPrompt, sanitizedThinking, sanitizedText, toolNames)
-	if result.OutputTokens > 0 {
-		if usage, ok := responseObj["usage"].(map[string]any); ok {
-			usage["output_tokens"] = result.OutputTokens
-			if input, ok := usage["input_tokens"].(int); ok {
-				usage["total_tokens"] = input + result.OutputTokens
-			}
-		}
-	}
 	h.getResponseStore().put(owner, responseID, responseObj)
 	writeJSON(w, http.StatusOK, responseObj)
 }
 func (h *Handler) handleResponsesStream(w http.ResponseWriter, r *http.Request, resp *http.Response, owner, responseID, model, finalPrompt string, thinkingEnabled, searchEnabled bool, toolNames []string, toolChoice util.ToolChoicePolicy, traceID string) {
-	defer resp.Body.Close()
+	defer func() { _ = resp.Body.Close() }()
 	if resp.StatusCode != http.StatusOK {
 		body, _ := io.ReadAll(resp.Body)
 		writeOpenAIError(w, resp.StatusCode, strings.TrimSpace(string(body)))
@@ -200,7 +202,7 @@ func (h *Handler) handleResponsesStream(w http.ResponseWriter, r *http.Request,
 	})
 }
-func logResponsesToolPolicyRejection(traceID string, policy util.ToolChoicePolicy, parsed util.ToolCallParseResult, channel string) {
+func logResponsesToolPolicyRejection(traceID string, policy util.ToolChoicePolicy, parsed toolcall.ToolCallParseResult, channel string) {
 	rejected := filteredRejectedToolNamesForLog(parsed.RejectedToolNames)
 	if !parsed.RejectedByPolicy || len(rejected) == 0 {
 		return
--- a/internal/adapter/openai/responses_stream_runtime_core.go
+++ b/internal/adapter/openai/responses_stream_runtime_core.go
@@ -1,6 +1,7 @@
 package openai
 import (
 	"ds2api/internal/toolcall"
 	"net/http"
 	"strings"
@@ -50,7 +51,6 @@ type responsesStreamRuntime struct {
 	messagePartAdded  bool
 	sequence          int
 	failed            bool
-	outputTokens      int
 	persistResponse func(obj map[string]any)
 }
@@ -99,6 +99,30 @@ func newResponsesStreamRuntime(
 	}
 }
 func (s *responsesStreamRuntime) failResponse(message, code string) {
 	s.failed = true
 	failedResp := map[string]any{
 		"id":          s.responseID,
 		"type":        "response",
 		"object":      "response",
 		"model":       s.model,
 		"status":      "failed",
 		"output":      []any{},
 		"output_text": "",
 		"error": map[string]any{
 			"message": message,
 			"type":    "invalid_request_error",
 			"code":    code,
 			"param":   nil,
 		},
 	}
 	if s.persistResponse != nil {
 		s.persistResponse(failedResp)
 	}
 	s.sendEvent("response.failed", openaifmt.BuildResponsesFailedPayload(s.responseID, s.model, message, code))
 	s.sendDone()
 }
 func (s *responsesStreamRuntime) finalize() {
 	finalThinking := s.thinking.String()
 	finalText := cleanVisibleOutput(s.text.String(), s.stripReferenceMarkers)
@@ -107,7 +131,7 @@ func (s *responsesStreamRuntime) finalize() {
 		s.processToolStreamEvents(flushToolSieve(&s.sieve, s.toolNames), true)
 	}
-	textParsed := util.ParseStandaloneToolCallsDetailed(finalText, s.toolNames)
+	textParsed := toolcall.ParseStandaloneToolCallsDetailed(finalText, s.toolNames)
 	detected := textParsed.Calls
 	s.logToolPolicyRejections(textParsed)
@@ -121,41 +145,21 @@ func (s *responsesStreamRuntime) finalize() {
 	s.closeMessageItem()
 	if s.toolChoice.IsRequired() && len(detected) == 0 {
-		s.failed = true
-		message := "tool_choice requires at least one valid tool call."
-		failedResp := map[string]any{
-			"id":          s.responseID,
-			"type":        "response",
-			"object":      "response",
-			"model":       s.model,
-			"status":      "failed",
-			"output":      []any{},
-			"output_text": "",
-			"error": map[string]any{
-				"message": message,
-				"type":    "invalid_request_error",
-				"code":    "tool_choice_violation",
-				"param":   nil,
-			},
-		}
-		if s.persistResponse != nil {
-			s.persistResponse(failedResp)
-		}
-		s.sendEvent("response.failed", openaifmt.BuildResponsesFailedPayload(s.responseID, s.model, message, "tool_choice_violation"))
-		s.sendDone()
+		s.failResponse("tool_choice requires at least one valid tool call.", "tool_choice_violation")
+		return
+	}
+	if len(detected) == 0 && strings.TrimSpace(finalText) == "" {
+		code := "upstream_empty_output"
+		message := "Upstream model returned empty output."
+		if finalThinking != "" {
+			message = "Upstream model returned reasoning without visible output."
+		}
+		s.failResponse(message, code)
 		return
 	}
 	s.closeIncompleteFunctionItems()
 	obj := s.buildCompletedResponseObject(finalThinking, finalText, detected)
-	if s.outputTokens > 0 {
-		if usage, ok := obj["usage"].(map[string]any); ok {
-			usage["output_tokens"] = s.outputTokens
-			if input, ok := usage["input_tokens"].(int); ok {
-				usage["total_tokens"] = input + s.outputTokens
-			}
-		}
-	}
 	if s.persistResponse != nil {
 		s.persistResponse(obj)
 	}
@@ -163,8 +167,8 @@ func (s *responsesStreamRuntime) finalize() {
 	s.sendDone()
 }
-func (s *responsesStreamRuntime) logToolPolicyRejections(textParsed util.ToolCallParseResult) {
+func (s *responsesStreamRuntime) logToolPolicyRejections(textParsed toolcall.ToolCallParseResult) {
-	logRejected := func(parsed util.ToolCallParseResult, channel string) {
+	logRejected := func(parsed toolcall.ToolCallParseResult, channel string) {
 		rejected := filteredRejectedToolNamesForLog(parsed.RejectedToolNames)
 		if !parsed.RejectedByPolicy || len(rejected) == 0 {
 			return
@@ -184,9 +188,6 @@ func (s *responsesStreamRuntime) onParsed(parsed sse.LineResult) streamengine.Pa
 	if !parsed.Parsed {
 		return streamengine.ParsedDecision{}
 	}
-	if parsed.OutputTokens > 0 {
-		s.outputTokens = parsed.OutputTokens
-	}
 	if parsed.ContentFilter || parsed.ErrorMessage != "" || parsed.Stop {
 		return streamengine.ParsedDecision{Stop: true}
 	}
@@ -205,17 +206,25 @@ func (s *responsesStreamRuntime) onParsed(parsed sse.LineResult) streamengine.Pa
 			if !s.thinkingEnabled {
 				continue
 			}
-			s.thinking.WriteString(cleanedText)
-			s.sendEvent("response.reasoning.delta", openaifmt.BuildResponsesReasoningDeltaPayload(s.responseID, cleanedText))
+			trimmed := sse.TrimContinuationOverlap(s.thinking.String(), cleanedText)
+			if trimmed == "" {
+				continue
+			}
+			s.thinking.WriteString(trimmed)
+			s.sendEvent("response.reasoning.delta", openaifmt.BuildResponsesReasoningDeltaPayload(s.responseID, trimmed))
 			continue
 		}
-		s.text.WriteString(cleanedText)
+		trimmed := sse.TrimContinuationOverlap(s.text.String(), cleanedText)
+		if trimmed == "" {
+			continue
+		}
+		s.text.WriteString(trimmed)
 		if !s.bufferToolContent {
-			s.emitTextDelta(cleanedText)
+			s.emitTextDelta(trimmed)
 			continue
 		}
-		s.processToolStreamEvents(processToolSieveChunk(&s.sieve, cleanedText, s.toolNames), true)
+		s.processToolStreamEvents(processToolSieveChunk(&s.sieve, trimmed, s.toolNames), true)
 	}
 	return streamengine.ParsedDecision{ContentSeen: contentSeen}
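Review note: `sse.TrimContinuationOverlap` is called in the hunk above but its body is not in this excerpt. Judging only by the name and the call sites, it appears to drop the part of an incoming chunk that repeats the tail of the text already accumulated. The sketch below illustrates that idea under this assumption; `trimOverlapSketch` is a hypothetical name and the real helper may differ, for example by requiring a minimum overlap length.

```go
package main

import (
	"fmt"
	"strings"
)

// trimOverlapSketch removes the longest prefix of incoming that is also
// a suffix of existing, and returns what is left of incoming.
// Hypothetical illustration; not the implementation from the PR.
func trimOverlapSketch(existing, incoming string) string {
	limit := len(incoming)
	if len(existing) < limit {
		limit = len(existing)
	}
	// Try the longest candidate first so the largest overlap wins.
	for n := limit; n > 0; n-- {
		if strings.HasSuffix(existing, incoming[:n]) {
			return incoming[n:]
		}
	}
	return incoming
}

func main() {
	fmt.Printf("%q\n", trimOverlapSketch("hello wor", "world"))
	fmt.Printf("%q\n", trimOverlapSketch("abc", "abc"))
	fmt.Printf("%q\n", trimOverlapSketch("abc", "xyz"))
}
```

One caveat with this naive form: a chunk that legitimately begins with the character the previous text ended on (for example "a" followed by "ab") is also trimmed, so a production version would likely need a guard against very short overlaps.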
--- a/internal/adapter/openai/responses_stream_runtime_events.go
+++ b/internal/adapter/openai/responses_stream_runtime_events.go
@@ -48,7 +48,7 @@ func (s *responsesStreamRuntime) processToolStreamEvents(events []toolStreamEven
 			if !s.emitEarlyToolDeltas {
 				continue
 			}
-			filtered := filterIncrementalToolCallDeltasByAllowed(evt.ToolCallDeltas, s.toolNames, s.functionNames)
+			filtered := filterIncrementalToolCallDeltasByAllowed(evt.ToolCallDeltas, s.functionNames)
 			if len(filtered) == 0 {
 				continue
 			}
--- a/internal/adapter/openai/responses_stream_runtime_toolcalls.go
+++ b/internal/adapter/openai/responses_stream_runtime_toolcalls.go
@@ -1,11 +1,11 @@
 package openai
 import (
 	"ds2api/internal/toolcall"
 	"encoding/json"
 	"strings"
 	openaifmt "ds2api/internal/format/openai"
 	"ds2api/internal/util"
 	"github.com/google/uuid"
 )
@@ -208,7 +208,7 @@ func (s *responsesStreamRuntime) emitFunctionCallDeltaEvents(deltas []toolCallDe
 	}
 }
-func (s *responsesStreamRuntime) emitFunctionCallDoneEvents(calls []util.ParsedToolCall) {
+func (s *responsesStreamRuntime) emitFunctionCallDoneEvents(calls []toolcall.ParsedToolCall) {
 	for idx, tc := range calls {
 		if strings.TrimSpace(tc.Name) == "" {
 			continue
--- a/internal/adapter/openai/responses_stream_runtime_toolcalls_finalize.go
+++ b/internal/adapter/openai/responses_stream_runtime_toolcalls_finalize.go
@@ -1,12 +1,12 @@
 package openai
 import (
 	"ds2api/internal/toolcall"
 	"encoding/json"
 	"sort"
 	"strings"
 	openaifmt "ds2api/internal/format/openai"
 	"ds2api/internal/util"
 )
 func (s *responsesStreamRuntime) closeIncompleteFunctionItems() {
@@ -57,7 +57,7 @@ func (s *responsesStreamRuntime) closeIncompleteFunctionItems() {
 	}
 }
-func (s *responsesStreamRuntime) buildCompletedResponseObject(finalThinking, finalText string, calls []util.ParsedToolCall) map[string]any {
+func (s *responsesStreamRuntime) buildCompletedResponseObject(finalThinking, finalText string, calls []toolcall.ParsedToolCall) map[string]any {
 	type indexedItem struct {
 		index int
 		item  map[string]any
--- a/internal/adapter/openai/responses_stream_test.go
+++ b/internal/adapter/openai/responses_stream_test.go
@@ -518,6 +518,44 @@ func TestHandleResponsesStreamRequiredMalformedToolPayloadFails(t *testing.T) {
 	}
 }
 func TestHandleResponsesStreamFailsWhenUpstreamHasOnlyThinking(t *testing.T) {
 	h := &Handler{}
 	req := httptest.NewRequest(http.MethodPost, "/v1/responses", nil)
 	rec := httptest.NewRecorder()
 	sseLine := func(path, value string) string {
 		b, _ := json.Marshal(map[string]any{
 			"p": path,
 			"v": value,
 		})
 		return "data: " + string(b) + "\n"
 	}
 	streamBody := sseLine("response/thinking_content", "Only thinking") + "data: [DONE]\n"
 	resp := &http.Response{
 		StatusCode: http.StatusOK,
 		Body:       io.NopCloser(strings.NewReader(streamBody)),
 	}
 	h.handleResponsesStream(rec, req, resp, "owner-a", "resp_test", "deepseek-reasoner", "prompt", true, false, nil, util.DefaultToolChoicePolicy(), "")
 	body := rec.Body.String()
 	if !strings.Contains(body, "event: response.failed") {
 		t.Fatalf("expected response.failed event, body=%s", body)
 	}
 	if strings.Contains(body, "event: response.completed") {
 		t.Fatalf("did not expect response.completed, body=%s", body)
 	}
 	payload, ok := extractSSEEventPayload(body, "response.failed")
 	if !ok {
 		t.Fatalf("expected response.failed payload, body=%s", body)
 	}
 	errObj, _ := payload["error"].(map[string]any)
 	if asString(errObj["code"]) != "upstream_empty_output" {
 		t.Fatalf("expected code=upstream_empty_output, got %#v", payload)
 	}
 }
 func TestHandleResponsesStreamAllowsUnknownToolName(t *testing.T) {
 	h := &Handler{}
 	req := httptest.NewRequest(http.MethodPost, "/v1/responses", nil)
@@ -627,7 +665,7 @@ func TestHandleResponsesNonStreamToolChoiceNoneStillAllowsFunctionCall(t *testin
 	}
 }
-func TestHandleResponsesNonStreamReturns502WhenUpstreamOutputEmpty(t *testing.T) {
+func TestHandleResponsesNonStreamReturns429WhenUpstreamOutputEmpty(t *testing.T) {
 	h := &Handler{}
 	rec := httptest.NewRecorder()
 	resp := &http.Response{
@@ -639,8 +677,8 @@ func TestHandleResponsesNonStreamReturns502WhenUpstreamOutputEmpty(t *testing.T)
 	}
 	h.handleResponsesNonStream(rec, resp, "owner-a", "resp_test", "deepseek-chat", "prompt", false, nil, util.DefaultToolChoicePolicy(), "")
-	if rec.Code != http.StatusBadGateway {
+	if rec.Code != http.StatusTooManyRequests {
-		t.Fatalf("expected 502 for empty upstream output, got %d body=%s", rec.Code, rec.Body.String())
+		t.Fatalf("expected 429 for empty upstream output, got %d body=%s", rec.Code, rec.Body.String())
 	}
 	out := decodeJSONBody(t, rec.Body.String())
 	errObj, _ := out["error"].(map[string]any)
@@ -671,6 +709,28 @@ func TestHandleResponsesNonStreamReturnsContentFilterErrorWhenUpstreamFilteredWi
 	}
 }
 func TestHandleResponsesNonStreamReturns429WhenUpstreamHasOnlyThinking(t *testing.T) {
 	h := &Handler{}
 	rec := httptest.NewRecorder()
 	resp := &http.Response{
 		StatusCode: http.StatusOK,
 		Body: io.NopCloser(strings.NewReader(
 			`data: {"p":"response/thinking_content","v":"Only thinking"}` + "\n" +
 				`data: [DONE]` + "\n",
 		)),
 	}
 	h.handleResponsesNonStream(rec, resp, "owner-a", "resp_test", "deepseek-reasoner", "prompt", true, nil, util.DefaultToolChoicePolicy(), "")
 	if rec.Code != http.StatusTooManyRequests {
 		t.Fatalf("expected 429 for thinking-only upstream output, got %d body=%s", rec.Code, rec.Body.String())
 	}
 	out := decodeJSONBody(t, rec.Body.String())
 	errObj, _ := out["error"].(map[string]any)
 	if asString(errObj["code"]) != "upstream_empty_output" {
 		t.Fatalf("expected code=upstream_empty_output, got %#v", out)
 	}
 }
 func extractSSEEventPayload(body, targetEvent string) (map[string]any, bool) {
 	scanner := bufio.NewScanner(strings.NewReader(body))
 	matched := false
--- a/internal/adapter/openai/standard_request.go
+++ b/internal/adapter/openai/standard_request.go
@@ -12,11 +12,11 @@ func normalizeOpenAIChatRequest(store ConfigReader, req map[string]any, traceID
 	model, _ := req["model"].(string)
 	messagesRaw, _ := req["messages"].([]any)
 	if strings.TrimSpace(model) == "" || len(messagesRaw) == 0 {
-		return util.StandardRequest{}, fmt.Errorf("Request must include 'model' and 'messages'.")
+		return util.StandardRequest{}, fmt.Errorf("request must include 'model' and 'messages'")
 	}
 	resolvedModel, ok := config.ResolveModel(store, model)
 	if !ok {
-		return util.StandardRequest{}, fmt.Errorf("Model '%s' is not available.", model)
+		return util.StandardRequest{}, fmt.Errorf("model %q is not available", model)
 	}
 	thinkingEnabled, searchEnabled, _ := config.GetModelConfig(resolvedModel)
 	responseModel := strings.TrimSpace(model)
@@ -24,9 +24,10 @@ func normalizeOpenAIChatRequest(store ConfigReader, req map[string]any, traceID
 		responseModel = resolvedModel
 	}
 	toolPolicy := util.DefaultToolChoicePolicy()
-	finalPrompt, toolNames := buildOpenAIFinalPromptWithPolicy(messagesRaw, req["tools"], traceID, toolPolicy)
+	finalPrompt, toolNames := buildOpenAIFinalPromptWithPolicy(messagesRaw, req["tools"], traceID, toolPolicy, thinkingEnabled)
 	toolNames = ensureToolDetectionEnabled(toolNames, req["tools"])
 	passThrough := collectOpenAIChatPassThrough(req)
 	refFileIDs := collectOpenAIRefFileIDs(req)
 	return util.StandardRequest{
 		Surface:        "openai_chat",
@@ -40,6 +41,7 @@ func normalizeOpenAIChatRequest(store ConfigReader, req map[string]any, traceID
 		Stream:         util.ToBool(req["stream"]),
 		Thinking:       thinkingEnabled,
 		Search:         searchEnabled,
 		RefFileIDs:     refFileIDs,
 		PassThrough:    passThrough,
 	}, nil
 }
@@ -48,11 +50,11 @@ func normalizeOpenAIResponsesRequest(store ConfigReader, req map[string]any, tra
 	model, _ := req["model"].(string)
 	model = strings.TrimSpace(model)
 	if model == "" {
-		return util.StandardRequest{}, fmt.Errorf("Request must include 'model'.")
+		return util.StandardRequest{}, fmt.Errorf("request must include 'model'")
 	}
 	resolvedModel, ok := config.ResolveModel(store, model)
 	if !ok {
-		return util.StandardRequest{}, fmt.Errorf("Model '%s' is not available.", model)
+		return util.StandardRequest{}, fmt.Errorf("model %q is not available", model)
 	}
 	thinkingEnabled, searchEnabled, _ := config.GetModelConfig(resolvedModel)
@@ -68,18 +70,19 @@ func normalizeOpenAIResponsesRequest(store ConfigReader, req map[string]any, tra
 		messagesRaw = msgs
 	}
 	if len(messagesRaw) == 0 {
-		return util.StandardRequest{}, fmt.Errorf("Request must include 'input' or 'messages'.")
+		return util.StandardRequest{}, fmt.Errorf("request must include 'input' or 'messages'")
 	}
 	toolPolicy, err := parseToolChoicePolicy(req["tool_choice"], req["tools"])
 	if err != nil {
 		return util.StandardRequest{}, err
 	}
-	finalPrompt, toolNames := buildOpenAIFinalPromptWithPolicy(messagesRaw, req["tools"], traceID, toolPolicy)
+	finalPrompt, toolNames := buildOpenAIFinalPromptWithPolicy(messagesRaw, req["tools"], traceID, toolPolicy, thinkingEnabled)
 	toolNames = ensureToolDetectionEnabled(toolNames, req["tools"])
 	if !toolPolicy.IsNone() {
 		toolPolicy.Allowed = namesToSet(toolNames)
 	}
 	passThrough := collectOpenAIChatPassThrough(req)
 	refFileIDs := collectOpenAIRefFileIDs(req)
 	return util.StandardRequest{
 		Surface:        "openai_responses",
@@ -93,6 +96,7 @@ func normalizeOpenAIResponsesRequest(store ConfigReader, req map[string]any, tra
 		Stream:         util.ToBool(req["stream"]),
 		Thinking:       thinkingEnabled,
 		Search:         searchEnabled,
 		RefFileIDs:     refFileIDs,
 		PassThrough:    passThrough,
 	}, nil
 }
@@ -152,7 +156,7 @@ func parseToolChoicePolicy(toolChoiceRaw any, toolsRaw any) (util.ToolChoicePoli
 		case "required":
 			policy.Mode = util.ToolChoiceRequired
 		default:
-			return util.ToolChoicePolicy{}, fmt.Errorf("Unsupported tool_choice: %q", v)
+			return util.ToolChoicePolicy{}, fmt.Errorf("unsupported tool_choice: %q", v)
 		}
 	case map[string]any:
 		allowedOverride, hasAllowedOverride, err := parseAllowedToolNames(v["allowed_tools"])
@@ -198,7 +202,7 @@ func parseToolChoicePolicy(toolChoiceRaw any, toolsRaw any) (util.ToolChoicePoli
 			policy.ForcedName = name
 			policy.Allowed = namesToSet([]string{name})
 		default:
-			return util.ToolChoicePolicy{}, fmt.Errorf("Unsupported tool_choice.type: %q", typ)
+			return util.ToolChoicePolicy{}, fmt.Errorf("unsupported tool_choice.type: %q", typ)
 		}
 	default:
 		return util.ToolChoicePolicy{}, fmt.Errorf("tool_choice must be a string or object")
@@ -206,7 +210,7 @@ func parseToolChoicePolicy(toolChoiceRaw any, toolsRaw any) (util.ToolChoicePoli
 	if policy.Mode == util.ToolChoiceRequired || policy.Mode == util.ToolChoiceForced {
 		if len(declaredNames) == 0 {
-			return util.ToolChoicePolicy{}, fmt.Errorf("tool_choice=%s requires non-empty tools.", policy.Mode)
+			return util.ToolChoicePolicy{}, fmt.Errorf("tool_choice=%s requires non-empty tools", policy.Mode)
 		}
 	}
 	if policy.Mode == util.ToolChoiceForced {
--- a/internal/adapter/openai/standard_request_test.go
+++ b/internal/adapter/openai/standard_request_test.go
@@ -41,6 +41,36 @@ func TestNormalizeOpenAIChatRequest(t *testing.T) {
 	}
 }
 func TestNormalizeOpenAIChatRequestCollectsRefFileIDs(t *testing.T) {
 	store := newEmptyStoreForNormalizeTest(t)
 	req := map[string]any{
 		"model": "gpt-5-codex",
 		"messages": []any{
 			map[string]any{
 				"role": "user",
 				"content": []any{
 					map[string]any{"type": "input_text", "text": "hello"},
 					map[string]any{"type": "input_file", "file_id": "file-msg"},
 				},
 			},
 		},
 		"attachments": []any{
 			map[string]any{"file_id": "file-attachment"},
 		},
 		"ref_file_ids": []any{"file-top", "file-attachment"},
 	}
 	n, err := normalizeOpenAIChatRequest(store, req, "")
 	if err != nil {
 		t.Fatalf("normalize failed: %v", err)
 	}
 	if len(n.RefFileIDs) != 3 {
 		t.Fatalf("expected 3 distinct file ids, got %#v", n.RefFileIDs)
 	}
 	if n.RefFileIDs[0] != "file-top" || n.RefFileIDs[1] != "file-attachment" || n.RefFileIDs[2] != "file-msg" {
 		t.Fatalf("unexpected file ids: %#v", n.RefFileIDs)
 	}
 }
 func TestNormalizeOpenAIResponsesRequestInput(t *testing.T) {
 	store := newEmptyStoreForNormalizeTest(t)
 	req := map[string]any{
--- a/internal/adapter/openai/stream_status_test.go
+++ b/internal/adapter/openai/stream_status_test.go
@@ -50,6 +50,10 @@ func (m streamStatusDSStub) GetPow(_ context.Context, _ *auth.RequestAuth, _ int
 	return "pow", nil
 }
 func (m streamStatusDSStub) UploadFile(_ context.Context, _ *auth.RequestAuth, _ deepseek.UploadFileRequest, _ int) (*deepseek.UploadFileResult, error) {
 	return &deepseek.UploadFileResult{ID: "file-id", Filename: "file.txt", Bytes: 1, Status: "uploaded"}, nil
 }
 func (m streamStatusDSStub) CallCompletion(_ context.Context, _ *auth.RequestAuth, _ map[string]any, _ string, _ int) (*http.Response, error) {
 	return m.resp, nil
 }
@@ -238,3 +242,140 @@ func TestChatCompletionsStreamContentFilterStopsNormallyWithoutLeak(t *testing.T
 		t.Fatalf("expected finish_reason=stop for content-filter upstream stop, got %#v", choice["finish_reason"])
 	}
 }
 func TestChatCompletionsStreamEmitsFailureFrameWhenUpstreamOutputEmpty(t *testing.T) {
 	statuses := make([]int, 0, 1)
 	h := &Handler{
 		Store: mockOpenAIConfig{wideInput: true},
 		Auth:  streamStatusAuthStub{},
 		DS:    streamStatusDSStub{resp: makeOpenAISSEHTTPResponse("data: [DONE]")},
 	}
 	r := chi.NewRouter()
 	r.Use(captureStatusMiddleware(&statuses))
 	RegisterRoutes(r, h)
 	reqBody := `{"model":"deepseek-chat","messages":[{"role":"user","content":"hi"}],"stream":true}`
 	req := httptest.NewRequest(http.MethodPost, "/v1/chat/completions", strings.NewReader(reqBody))
 	req.Header.Set("Authorization", "Bearer direct-token")
 	req.Header.Set("Content-Type", "application/json")
 	rec := httptest.NewRecorder()
 	r.ServeHTTP(rec, req)
 	if rec.Code != http.StatusOK {
 		t.Fatalf("expected 200, got %d body=%s", rec.Code, rec.Body.String())
 	}
 	if len(statuses) != 1 || statuses[0] != http.StatusOK {
 		t.Fatalf("expected captured status 200, got %#v", statuses)
 	}
 	frames, done := parseSSEDataFrames(t, rec.Body.String())
 	if !done {
 		t.Fatalf("expected [DONE], body=%s", rec.Body.String())
 	}
 	if len(frames) != 1 {
 		t.Fatalf("expected one failure frame, got %#v body=%s", frames, rec.Body.String())
 	}
 	last := frames[0]
 	statusCode, ok := last["status_code"].(float64)
 	if !ok || int(statusCode) != http.StatusTooManyRequests {
 		t.Fatalf("expected status_code=429, got %#v body=%s", last["status_code"], rec.Body.String())
 	}
 	errObj, _ := last["error"].(map[string]any)
 	if asString(errObj["code"]) != "upstream_empty_output" {
 		t.Fatalf("expected code=upstream_empty_output, got %#v", last)
 	}
 }
 func TestResponsesStreamUsageIgnoresBatchAccumulatedTokenUsage(t *testing.T) {
 	statuses := make([]int, 0, 1)
 	h := &Handler{
 		Store: mockOpenAIConfig{wideInput: true},
 		Auth:  streamStatusAuthStub{},
 		DS: streamStatusDSStub{resp: makeOpenAISSEHTTPResponse(
 			`data: {"p":"response/content","v":"hello"}`,
 			`data: {"p":"response","o":"BATCH","v":[{"p":"accumulated_token_usage","v":190},{"p":"quasi_status","v":"FINISHED"}]}`,
 		)},
 	}
 	r := chi.NewRouter()
 	r.Use(captureStatusMiddleware(&statuses))
 	RegisterRoutes(r, h)
 	reqBody := `{"model":"deepseek-chat","input":"hi","stream":true}`
 	req := httptest.NewRequest(http.MethodPost, "/v1/responses", strings.NewReader(reqBody))
 	req.Header.Set("Authorization", "Bearer direct-token")
 	req.Header.Set("Content-Type", "application/json")
 	rec := httptest.NewRecorder()
 	r.ServeHTTP(rec, req)
 	if rec.Code != http.StatusOK {
 		t.Fatalf("expected 200, got %d body=%s", rec.Code, rec.Body.String())
 	}
 	if len(statuses) != 1 || statuses[0] != http.StatusOK {
 		t.Fatalf("expected captured status 200, got %#v", statuses)
 	}
 	frames, done := parseSSEDataFrames(t, rec.Body.String())
 	if !done {
 		t.Fatalf("expected [DONE], body=%s", rec.Body.String())
 	}
 	if len(frames) == 0 {
 		t.Fatalf("expected at least one json frame, body=%s", rec.Body.String())
 	}
 	last := frames[len(frames)-1]
 	resp, _ := last["response"].(map[string]any)
 	if resp == nil {
 		t.Fatalf("expected response payload in final frame, got %#v", last)
 	}
 	usage, _ := resp["usage"].(map[string]any)
 	if usage == nil {
 		t.Fatalf("expected usage in response payload, got %#v", resp)
 	}
 	if got, _ := usage["output_tokens"].(float64); int(got) == 190 {
 		t.Fatalf("expected upstream accumulated token usage to be ignored, got %#v", usage["output_tokens"])
 	}
 }
 func TestResponsesNonStreamUsageIgnoresPromptAndOutputTokenUsage(t *testing.T) {
 	statuses := make([]int, 0, 1)
 	h := &Handler{
 		Store: mockOpenAIConfig{wideInput: true},
 		Auth:  streamStatusAuthStub{},
 		DS: streamStatusDSStub{resp: makeOpenAISSEHTTPResponse(
 			`data: {"p":"response/content","v":"ok"}`,
 			`data: {"p":"response","o":"BATCH","v":[{"p":"token_usage","v":{"prompt_tokens":11,"completion_tokens":29}},{"p":"quasi_status","v":"FINISHED"}]}`,
 		)},
 	}
 	r := chi.NewRouter()
 	r.Use(captureStatusMiddleware(&statuses))
 	RegisterRoutes(r, h)
 	reqBody := `{"model":"deepseek-chat","input":"hi","stream":false}`
 	req := httptest.NewRequest(http.MethodPost, "/v1/responses", strings.NewReader(reqBody))
 	req.Header.Set("Authorization", "Bearer direct-token")
 	req.Header.Set("Content-Type", "application/json")
 	rec := httptest.NewRecorder()
 	r.ServeHTTP(rec, req)
 	if rec.Code != http.StatusOK {
 		t.Fatalf("expected 200, got %d body=%s", rec.Code, rec.Body.String())
 	}
 	if len(statuses) != 1 || statuses[0] != http.StatusOK {
 		t.Fatalf("expected captured status 200, got %#v", statuses)
 	}
 	var out map[string]any
 	if err := json.Unmarshal(rec.Body.Bytes(), &out); err != nil {
 		t.Fatalf("decode response failed: %v body=%s", err, rec.Body.String())
 	}
 	usage, _ := out["usage"].(map[string]any)
 	if usage == nil {
 		t.Fatalf("expected usage object, got %#v", out)
 	}
 	input, _ := usage["input_tokens"].(float64)
 	output, _ := usage["output_tokens"].(float64)
 	total, _ := usage["total_tokens"].(float64)
 	if int(output) == 29 {
 		t.Fatalf("expected upstream completion token usage to be ignored, got %#v", usage["output_tokens"])
 	}
 	if int(total) != int(input)+int(output) {
 		t.Fatalf("expected total_tokens=input_tokens+output_tokens, usage=%#v", usage)
 	}
 }
--- a/internal/adapter/openai/tool_sieve_core.go
+++ b/internal/adapter/openai/tool_sieve_core.go
@@ -3,7 +3,7 @@ package openai
 import (
 	"strings"
-	"ds2api/internal/util"
+	"ds2api/internal/toolcall"
 )
 func processToolSieveChunk(state *toolStreamSieveState, chunk string, toolNames []string) []toolStreamEvent {
@@ -226,7 +226,7 @@ func findToolSegmentStart(s string) int {
 	return start
 }
-func consumeToolCapture(state *toolStreamSieveState, toolNames []string) (prefix string, calls []util.ParsedToolCall, suffix string, ready bool) {
+func consumeToolCapture(state *toolStreamSieveState, toolNames []string) (prefix string, calls []toolcall.ParsedToolCall, suffix string, ready bool) {
 	captured := state.capture.String()
 	if captured == "" {
 		return "", nil, "", false
@@ -267,7 +267,7 @@ func consumeToolCapture(state *toolStreamSieveState, toolNames []string) (prefix
 	}
 	prefixPart := captured[:start]
 	suffixPart := captured[end:]
-	parsed := util.ParseStandaloneToolCallsDetailed(obj, toolNames)
+	parsed := toolcall.ParseStandaloneToolCallsDetailed(obj, toolNames)
 	if len(parsed.Calls) == 0 {
 		if parsed.SawToolCallSyntax && parsed.RejectedByPolicy {
 			// Parsed as tool-call payload but rejected by schema/policy:
--- a/internal/adapter/openai/tool_sieve_state.go
+++ b/internal/adapter/openai/tool_sieve_state.go
@@ -1,9 +1,8 @@
 package openai
 import (
+	"ds2api/internal/toolcall"
 	"strings"
-	"ds2api/internal/util"
 )
 type toolStreamSieveState struct {
@@ -12,7 +11,7 @@ type toolStreamSieveState struct {
 	capturing        bool
 	recentTextTail   string
 	pendingToolRaw   string
-	pendingToolCalls []util.ParsedToolCall
+	pendingToolCalls []toolcall.ParsedToolCall
 	disableDeltas    bool
 	toolNameSent     bool
 	toolName         string
@@ -24,7 +23,7 @@ type toolStreamSieveState struct {
 type toolStreamEvent struct {
 	Content        string
-	ToolCalls      []util.ParsedToolCall
+	ToolCalls      []toolcall.ParsedToolCall
 	ToolCallDeltas []toolCallDelta
 }
--- a/internal/adapter/openai/tool_sieve_xml.go
+++ b/internal/adapter/openai/tool_sieve_xml.go
@@ -1,14 +1,14 @@
 package openai
 import (
+	"ds2api/internal/toolcall"
 	"regexp"
 	"strings"
-	"ds2api/internal/util"
 )
 // --- XML tool call support for the streaming sieve ---
 //nolint:unused // kept as explicit tag inventory for future XML sieve refinements.
 var xmlToolCallClosingTags = []string{"</tool_calls>", "</tool_call>", "</invoke>", "</function_call>", "</function_calls>", "</tool_use>",
 	// Agent-style XML tags (Roo Code, Cline, etc.)
 	"</attempt_completion>", "</ask_followup_question>", "</new_task>", "</result>"}
@@ -34,6 +34,8 @@ var xmlToolCallTagPairs = []struct{ open, close string }{
 }
 // xmlToolCallBlockPattern matches a complete XML tool call block (wrapper or standalone).
 //
 //nolint:unused // reserved for future fast-path XML block detection.
 var xmlToolCallBlockPattern = regexp.MustCompile(`(?is)(<tool_calls>\s*(?:.*?)\s*</tool_calls>|<tool_call>\s*(?:.*?)\s*</tool_call>|<invoke\b[^>]*>(?:.*?)</invoke>|<function_calls?\b[^>]*>(?:.*?)</function_calls?>|<tool_use>(?:.*?)</tool_use>|<attempt_completion>(?:.*?)</attempt_completion>|<ask_followup_question>(?:.*?)</ask_followup_question>|<new_task>(?:.*?)</new_task>)`)
 // xmlToolTagsToDetect is the set of XML tag prefixes used by findToolSegmentStart.
@@ -43,7 +45,7 @@ var xmlToolTagsToDetect = []string{"<tool_calls>", "<tool_calls\n", "<tool_call>
 	"<attempt_completion>", "<ask_followup_question>", "<new_task>"}
 // consumeXMLToolCapture tries to extract complete XML tool call blocks from captured text.
-func consumeXMLToolCapture(captured string, toolNames []string) (prefix string, calls []util.ParsedToolCall, suffix string, ready bool) {
+func consumeXMLToolCapture(captured string, toolNames []string) (prefix string, calls []toolcall.ParsedToolCall, suffix string, ready bool) {
 	lower := strings.ToLower(captured)
 	// Find the FIRST matching open/close pair, preferring wrapper tags.
 	// Tag pairs are ordered longest-first (e.g. <tool_calls before <tool_call)
@@ -66,7 +68,7 @@ func consumeXMLToolCapture(captured string, toolNames []string) (prefix string,
 		xmlBlock := captured[openIdx:closeEnd]
 		prefixPart := captured[:openIdx]
 		suffixPart := captured[closeEnd:]
-		parsed := util.ParseToolCalls(xmlBlock, toolNames)
+		parsed := toolcall.ParseToolCalls(xmlBlock, toolNames)
 		if len(parsed) > 0 {
 			prefixPart, suffixPart = trimWrappingJSONFence(prefixPart, suffixPart)
 			return prefixPart, parsed, suffixPart, true
--- a/internal/adapter/openai/upstream_empty.go
+++ b/internal/adapter/openai/upstream_empty.go
@@ -2,14 +2,14 @@ package openai
 import "net/http"
-func writeUpstreamEmptyOutputError(w http.ResponseWriter, thinking, text string, contentFilter bool) bool {
+func writeUpstreamEmptyOutputError(w http.ResponseWriter, text string, contentFilter bool) bool {
-	if thinking != "" || text != "" {
+	if text != "" {
 		return false
 	}
 	if contentFilter {
 		writeOpenAIErrorWithCode(w, http.StatusBadRequest, "Upstream content filtered the response and returned no output.", "content_filter")
 		return true
 	}
-	writeOpenAIErrorWithCode(w, http.StatusBadGateway, "Upstream model returned empty output.", "upstream_empty_output")
+	writeOpenAIErrorWithCode(w, http.StatusTooManyRequests, "Upstream model returned empty output.", "upstream_empty_output")
 	return true
 }
--- a/internal/adapter/openai/vercel_stream.go
+++ b/internal/adapter/openai/vercel_stream.go
@@ -52,6 +52,10 @@ func (h *Handler) handleVercelStreamPrepare(w http.ResponseWriter, r *http.Reque
 		writeOpenAIError(w, http.StatusBadRequest, "invalid json")
 		return
 	}
 	if err := h.preprocessInlineFileInputs(r.Context(), a, req); err != nil {
 		writeOpenAIInlineFileError(w, err)
 		return
 	}
 	if !util.ToBool(req["stream"]) {
 		writeOpenAIError(w, http.StatusBadRequest, "stream must be true")
 		return
--- a/internal/admin/handler.go
+++ b/internal/admin/handler.go
@@ -26,9 +26,15 @@ func RegisterRoutes(r chi.Router, h *Handler) {
 		pr.Get("/config/export", h.configExport)
 		pr.Post("/keys", h.addKey)
 		pr.Delete("/keys/{key}", h.deleteKey)
 		pr.Get("/proxies", h.listProxies)
 		pr.Post("/proxies", h.addProxy)
 		pr.Put("/proxies/{proxyID}", h.updateProxy)
 		pr.Delete("/proxies/{proxyID}", h.deleteProxy)
 		pr.Post("/proxies/test", h.testProxy)
 		pr.Get("/accounts", h.listAccounts)
 		pr.Post("/accounts", h.addAccount)
 		pr.Delete("/accounts/{identifier}", h.deleteAccount)
 		pr.Put("/accounts/{identifier}/proxy", h.updateAccountProxy)
 		pr.Get("/queue/status", h.queueStatus)
 		pr.Post("/accounts/test", h.testSingleAccount)
 		pr.Post("/accounts/test-all", h.testAllAccounts)
@@ -36,6 +42,8 @@ func RegisterRoutes(r chi.Router, h *Handler) {
 		pr.Post("/import", h.batchImport)
 		pr.Post("/test", h.testAPI)
 		pr.Post("/dev/raw-samples/capture", h.captureRawSample)
 		pr.Get("/dev/raw-samples/query", h.queryRawSampleCaptures)
 		pr.Post("/dev/raw-samples/save", h.saveRawSampleFromCaptures)
 		pr.Post("/vercel/sync", h.syncVercel)
 		pr.Get("/vercel/status", h.vercelStatus)
 		pr.Post("/vercel/status", h.vercelStatus)
--- a/internal/admin/handler_accounts_crud.go
+++ b/internal/admin/handler_accounts_crud.go
@@ -68,6 +68,7 @@ func (h *Handler) listAccounts(w http.ResponseWriter, r *http.Request) {
 			"identifier":    acc.Identifier(),
 			"email":         acc.Email,
 			"mobile":        acc.Mobile,
 			"proxy_id":      acc.ProxyID,
 			"has_password":  acc.Password != "",
 			"has_token":     token != "",
 			"token_preview": preview,
@@ -86,6 +87,11 @@ func (h *Handler) addAccount(w http.ResponseWriter, r *http.Request) {
 		return
 	}
 	err := h.Store.Update(func(c *config.Config) error {
 		if acc.ProxyID != "" {
 			if _, ok := findProxyByID(*c, acc.ProxyID); !ok {
 				return fmt.Errorf("代理不存在")
 			}
 		}
 		mobileKey := config.CanonicalMobileKey(acc.Mobile)
 		for _, a := range c.Accounts {
 			if acc.Email != "" && a.Email == acc.Email {
--- a/internal/admin/handler_accounts_testing.go
+++ b/internal/admin/handler_accounts_testing.go
@@ -15,8 +15,17 @@ import (
 	"ds2api/internal/config"
 	"ds2api/internal/deepseek"
 	"ds2api/internal/sse"
 	"ds2api/internal/util"
 )
 type modelAliasSnapshotReader struct {
 	aliases map[string]string
 }
 func (m modelAliasSnapshotReader) ModelAliases() map[string]string {
 	return m.aliases
 }
 func (h *Handler) testSingleAccount(w http.ResponseWriter, r *http.Request) {
 	var req map[string]any
 	_ = json.NewDecoder(r.Body).Decode(&req)
@@ -115,10 +124,11 @@ func (h *Handler) testAccount(ctx context.Context, acc config.Account, model, me
 		result["message"] = "登录成功但写入运行时 token 失败: " + err.Error()
 		return result
 	}
-	authCtx := &authn.RequestAuth{UseConfigToken: false, DeepSeekToken: token}
-	sessionID, err := h.DS.CreateSession(ctx, authCtx, 1)
+	authCtx := &authn.RequestAuth{UseConfigToken: false, DeepSeekToken: token, AccountID: identifier, Account: acc}
+	proxyCtx := authn.WithAuth(ctx, authCtx)
+	sessionID, err := h.DS.CreateSession(proxyCtx, authCtx, 1)
 	if err != nil {
-		newToken, loginErr := h.DS.Login(ctx, acc)
+		newToken, loginErr := h.DS.Login(proxyCtx, acc)
 		if loginErr != nil {
 			result["message"] = "创建会话失败: " + err.Error()
 			return result
@@ -129,7 +139,7 @@ func (h *Handler) testAccount(ctx context.Context, acc config.Account, model, me
 			result["message"] = "刷新 token 成功但写入运行时 token 失败: " + err.Error()
 			return result
 		}
-		sessionID, err = h.DS.CreateSession(ctx, authCtx, 1)
+		sessionID, err = h.DS.CreateSession(proxyCtx, authCtx, 1)
 		if err != nil {
 			result["message"] = "创建会话失败: " + err.Error()
 			return result
@@ -137,7 +147,7 @@ func (h *Handler) testAccount(ctx context.Context, acc config.Account, model, me
 	}
 	// Get the session count
-	sessionStats, sessionErr := h.DS.GetSessionCountForToken(ctx, token)
+	sessionStats, sessionErr := h.DS.GetSessionCountForToken(proxyCtx, token)
 	if sessionErr == nil && sessionStats != nil {
 		result["session_count"] = sessionStats.FirstPageCount
 	}
@@ -149,23 +159,34 @@ func (h *Handler) testAccount(ctx context.Context, acc config.Account, model, me
 		return result
 	}
 	thinking, search, ok := config.GetModelConfig(model)
 	resolvedModel, resolved := config.ResolveModel(modelAliasSnapshotReader{
 		aliases: h.Store.Snapshot().ModelAliases,
 	}, model)
 	if resolved {
 		model = resolvedModel
 		thinking, search, ok = config.GetModelConfig(model)
 	}
 	if !ok {
 		thinking, search = false, false
 	}
-	_ = search
-	pow, err := h.DS.GetPow(ctx, authCtx, 1)
+	pow, err := h.DS.GetPow(proxyCtx, authCtx, 1)
 	if err != nil {
 		result["message"] = "获取 PoW 失败: " + err.Error()
 		return result
 	}
-	payload := map[string]any{"chat_session_id": sessionID, "prompt": deepseek.MessagesPrepare([]map[string]any{{"role": "user", "content": message}}), "ref_file_ids": []any{}, "thinking_enabled": thinking, "search_enabled": search}
-	resp, err := h.DS.CallCompletion(ctx, authCtx, payload, pow, 1)
+	payload := util.StandardRequest{
+		ResolvedModel: model,
+		FinalPrompt:   deepseek.MessagesPrepare([]map[string]any{{"role": "user", "content": message}}),
+		Thinking:      thinking,
+		Search:        search,
+	}.CompletionPayload(sessionID)
+	resp, err := h.DS.CallCompletion(proxyCtx, authCtx, payload, pow, 1)
 	if err != nil {
 		result["message"] = "请求失败: " + err.Error()
 		return result
 	}
 	if resp.StatusCode != http.StatusOK {
-		defer resp.Body.Close()
+		defer func() { _ = resp.Body.Close() }()
 		result["message"] = fmt.Sprintf("请求失败: HTTP %d", resp.StatusCode)
 		return result
 	}
@@ -218,7 +239,7 @@ func (h *Handler) testAPI(w http.ResponseWriter, r *http.Request) {
 		writeJSON(w, http.StatusOK, map[string]any{"success": false, "error": err.Error()})
 		return
 	}
-	defer resp.Body.Close()
+	defer func() { _ = resp.Body.Close() }()
 	body, _ := io.ReadAll(resp.Body)
 	if resp.StatusCode == http.StatusOK {
 		var parsed any
@@ -244,25 +265,29 @@ func (h *Handler) deleteAllSessions(w http.ResponseWriter, r *http.Request) {
 	}
 	// Always log in first to refresh the token, so an expired token is never used.
-	token, err := h.DS.Login(r.Context(), acc)
+	authCtx := &authn.RequestAuth{UseConfigToken: false, AccountID: acc.Identifier(), Account: acc}
+	proxyCtx := authn.WithAuth(r.Context(), authCtx)
+	token, err := h.DS.Login(proxyCtx, acc)
 	if err != nil {
 		writeJSON(w, http.StatusOK, map[string]any{"success": false, "message": "登录失败: " + err.Error()})
 		return
 	}
 	_ = h.Store.UpdateAccountToken(acc.Identifier(), token)
 	authCtx.DeepSeekToken = token
 	// Delete all sessions
-	err = h.DS.DeleteAllSessionsForToken(r.Context(), token)
+	err = h.DS.DeleteAllSessionsForToken(proxyCtx, token)
 	if err != nil {
 		// The token may have expired; log in again and retry once
-		newToken, loginErr := h.DS.Login(r.Context(), acc)
+		newToken, loginErr := h.DS.Login(proxyCtx, acc)
 		if loginErr != nil {
 			writeJSON(w, http.StatusOK, map[string]any{"success": false, "message": "删除失败: " + err.Error()})
 			return
 		}
 		token = newToken
 		_ = h.Store.UpdateAccountToken(acc.Identifier(), token)
-		if retryErr := h.DS.DeleteAllSessionsForToken(r.Context(), token); retryErr != nil {
+		authCtx.DeepSeekToken = token
+		if retryErr := h.DS.DeleteAllSessionsForToken(proxyCtx, token); retryErr != nil {
 			writeJSON(w, http.StatusOK, map[string]any{"success": false, "message": "删除失败: " + retryErr.Error()})
 			return
 		}
--- a/internal/admin/handler_accounts_testing_test.go
+++ b/internal/admin/handler_accounts_testing_test.go
@@ -5,6 +5,7 @@ import (
 	"context"
 	"encoding/json"
 	"errors"
 	"io"
 	"net/http"
 	"net/http/httptest"
 	"strings"
@@ -133,3 +134,78 @@ func TestDeleteAllSessions_RetryWithReloginOnDeleteFailure(t *testing.T) {
 		t.Fatalf("expected refreshed token persisted, got %q", updated.Token)
 	}
 }
 type completionPayloadDSMock struct {
 	payload map[string]any
 }
 func (m *completionPayloadDSMock) Login(_ context.Context, _ config.Account) (string, error) {
 	return "new-token", nil
 }
 func (m *completionPayloadDSMock) CreateSession(_ context.Context, _ *auth.RequestAuth, _ int) (string, error) {
 	return "session-id", nil
 }
 func (m *completionPayloadDSMock) GetPow(_ context.Context, _ *auth.RequestAuth, _ int) (string, error) {
 	return "pow-ok", nil
 }
 func (m *completionPayloadDSMock) CallCompletion(_ context.Context, _ *auth.RequestAuth, payload map[string]any, _ string, _ int) (*http.Response, error) {
 	m.payload = payload
 	return &http.Response{
 		StatusCode: http.StatusOK,
 		Body:       io.NopCloser(strings.NewReader("data: {\"v\":\"ok\"}\n\ndata: [DONE]\n\n")),
 	}, nil
 }
 func (m *completionPayloadDSMock) DeleteAllSessionsForToken(_ context.Context, _ string) error {
 	return nil
 }
 func (m *completionPayloadDSMock) GetSessionCountForToken(_ context.Context, _ string) (*deepseek.SessionStats, error) {
 	return &deepseek.SessionStats{Success: true}, nil
 }
 func TestTestAccount_MessageModeUsesExpertModelTypeForExpertModel(t *testing.T) {
 	t.Setenv("DS2API_CONFIG_JSON", `{"accounts":[{"email":"batch@example.com","password":"pwd","token":"seed-token"}]}`)
 	store := config.LoadStore()
 	ds := &completionPayloadDSMock{}
 	h := &Handler{Store: store, DS: ds}
 	acc, ok := store.FindAccount("batch@example.com")
 	if !ok {
 		t.Fatal("expected test account")
 	}
 	result := h.testAccount(context.Background(), acc, "deepseek-expert-chat", "hello")
 	if ok, _ := result["success"].(bool); !ok {
 		t.Fatalf("expected success=true, got %#v", result)
 	}
 	if got := ds.payload["model_type"]; got != "expert" {
 		t.Fatalf("expected model_type expert, got %#v", got)
 	}
 	if got := ds.payload["chat_session_id"]; got != "session-id" {
 		t.Fatalf("unexpected chat_session_id: %#v", got)
 	}
 }
 func TestTestAccount_MessageModeUsesVisionModelTypeForVisionModel(t *testing.T) {
 	t.Setenv("DS2API_CONFIG_JSON", `{"accounts":[{"email":"batch@example.com","password":"pwd","token":"seed-token"}]}`)
 	store := config.LoadStore()
 	ds := &completionPayloadDSMock{}
 	h := &Handler{Store: store, DS: ds}
 	acc, ok := store.FindAccount("batch@example.com")
 	if !ok {
 		t.Fatal("expected test account")
 	}
 	result := h.testAccount(context.Background(), acc, "deepseek-vision-chat", "hello")
 	if ok, _ := result["success"].(bool); !ok {
 		t.Fatalf("expected success=true, got %#v", result)
 	}
 	if got := ds.payload["model_type"]; got != "vision" {
 		t.Fatalf("expected model_type vision, got %#v", got)
 	}
 }
--- a/internal/admin/handler_config_read.go
+++ b/internal/admin/handler_config_read.go
@@ -3,6 +3,8 @@ package admin
 import (
 	"net/http"
 	"strings"
 	"ds2api/internal/config"
 )
 func (h *Handler) getConfig(w http.ResponseWriter, _ *http.Request) {
@@ -10,6 +12,7 @@ func (h *Handler) getConfig(w http.ResponseWriter, _ *http.Request) {
 	safe := map[string]any{
 		"keys":                  snap.Keys,
 		"accounts":              []map[string]any{},
 		"proxies":               []map[string]any{},
 		"env_backed":            h.Store.IsEnvBacked(),
 		"env_source_present":    h.Store.HasEnvConfigSource(),
 		"env_writeback_enabled": h.Store.IsEnvWritebackEnabled(),
@@ -36,12 +39,27 @@ func (h *Handler) getConfig(w http.ResponseWriter, _ *http.Request) {
 			"identifier":    acc.Identifier(),
 			"email":         acc.Email,
 			"mobile":        acc.Mobile,
 			"proxy_id":      acc.ProxyID,
 			"has_password":  strings.TrimSpace(acc.Password) != "",
 			"has_token":     token != "",
 			"token_preview": preview,
 		})
 	}
 	safe["accounts"] = accounts
 	proxies := make([]map[string]any, 0, len(snap.Proxies))
 	for _, proxy := range snap.Proxies {
 		proxy = config.NormalizeProxy(proxy)
 		proxies = append(proxies, map[string]any{
 			"id":           proxy.ID,
 			"name":         proxy.Name,
 			"type":         proxy.Type,
 			"host":         proxy.Host,
 			"port":         proxy.Port,
 			"username":     proxy.Username,
 			"has_password": strings.TrimSpace(proxy.Password) != "",
 		})
 	}
 	safe["proxies"] = proxies
 	writeJSON(w, http.StatusOK, safe)
 }
--- a/internal/admin/handler_config_write.go
+++ b/internal/admin/handler_config_write.go
@@ -85,7 +85,7 @@ func (h *Handler) addKey(w http.ResponseWriter, r *http.Request) {
 	err := h.Store.Update(func(c *config.Config) error {
 		for _, k := range c.Keys {
 			if k == key {
-				return fmt.Errorf("Key 已存在")
+				return fmt.Errorf("key 已存在")
 			}
 		}
 		c.Keys = append(c.Keys, key)
@@ -109,7 +109,7 @@ func (h *Handler) deleteKey(w http.ResponseWriter, r *http.Request) {
 			}
 		}
 		if idx < 0 {
-			return fmt.Errorf("Key 不存在")
+			return fmt.Errorf("key 不存在")
 		}
 		c.Keys = append(c.Keys[:idx], c.Keys[idx+1:]...)
 		return nil
--- a/internal/admin/handler_proxies.go
+++ b/internal/admin/handler_proxies.go
@@ -0,0 +1,202 @@
 package admin
 import (
 	"context"
 	"encoding/json"
 	"net/http"
 	"net/url"
 	"strings"
 	"github.com/go-chi/chi/v5"
 	"ds2api/internal/config"
 	"ds2api/internal/deepseek"
 )
 var proxyConnectivityTester = func(ctx context.Context, proxy config.Proxy) map[string]any {
 	return deepseek.TestProxyConnectivity(ctx, proxy)
 }
 func validateProxyMutation(cfg *config.Config) error {
 	if cfg == nil {
 		return nil
 	}
 	if err := config.ValidateProxyConfig(cfg.Proxies); err != nil {
 		return err
 	}
 	return config.ValidateAccountProxyReferences(cfg.Accounts, cfg.Proxies)
 }
 func proxyResponse(proxy config.Proxy) map[string]any {
 	proxy = config.NormalizeProxy(proxy)
 	return map[string]any{
 		"id":           proxy.ID,
 		"name":         proxy.Name,
 		"type":         proxy.Type,
 		"host":         proxy.Host,
 		"port":         proxy.Port,
 		"username":     proxy.Username,
 		"has_password": strings.TrimSpace(proxy.Password) != "",
 	}
 }
 func (h *Handler) listProxies(w http.ResponseWriter, _ *http.Request) {
 	proxies := h.Store.Snapshot().Proxies
 	items := make([]map[string]any, 0, len(proxies))
 	for _, proxy := range proxies {
 		proxy = config.NormalizeProxy(proxy)
 		items = append(items, map[string]any{
 			"id":           proxy.ID,
 			"name":         proxy.Name,
 			"type":         proxy.Type,
 			"host":         proxy.Host,
 			"port":         proxy.Port,
 			"username":     proxy.Username,
 			"has_password": strings.TrimSpace(proxy.Password) != "",
 		})
 	}
 	writeJSON(w, http.StatusOK, map[string]any{"items": items, "total": len(items)})
 }
 func (h *Handler) addProxy(w http.ResponseWriter, r *http.Request) {
 	var req map[string]any
 	_ = json.NewDecoder(r.Body).Decode(&req)
 	proxy := toProxy(req)
 	err := h.Store.Update(func(c *config.Config) error {
 		c.Proxies = append(c.Proxies, proxy)
 		return validateProxyMutation(c)
 	})
 	if err != nil {
 		writeJSON(w, http.StatusBadRequest, map[string]any{"detail": err.Error()})
 		return
 	}
 	writeJSON(w, http.StatusOK, map[string]any{"success": true, "proxy": proxyResponse(proxy)})
 }
 func (h *Handler) updateProxy(w http.ResponseWriter, r *http.Request) {
 	proxyID := chi.URLParam(r, "proxyID")
 	if decoded, err := url.PathUnescape(proxyID); err == nil {
 		proxyID = decoded
 	}
 	var req map[string]any
 	_ = json.NewDecoder(r.Body).Decode(&req)
 	proxy := toProxy(req)
 	proxy.ID = strings.TrimSpace(proxyID)
 	err := h.Store.Update(func(c *config.Config) error {
 		for i, existing := range c.Proxies {
 			existing = config.NormalizeProxy(existing)
 			if existing.ID != proxy.ID {
 				continue
 			}
 			if proxy.Password == "" {
 				proxy.Password = existing.Password
 			}
 			c.Proxies[i] = proxy
 			return validateProxyMutation(c)
 		}
 		return newRequestError("代理不存在")
 	})
 	if err != nil {
 		if detail, ok := requestErrorDetail(err); ok {
 			writeJSON(w, http.StatusNotFound, map[string]any{"detail": detail})
 			return
 		}
 		writeJSON(w, http.StatusBadRequest, map[string]any{"detail": err.Error()})
 		return
 	}
 	writeJSON(w, http.StatusOK, map[string]any{"success": true, "proxy": proxyResponse(proxy)})
 }
 func (h *Handler) deleteProxy(w http.ResponseWriter, r *http.Request) {
 	proxyID := chi.URLParam(r, "proxyID")
 	if decoded, err := url.PathUnescape(proxyID); err == nil {
 		proxyID = decoded
 	}
 	err := h.Store.Update(func(c *config.Config) error {
 		idx := -1
 		for i, existing := range c.Proxies {
 			existing = config.NormalizeProxy(existing)
 			if existing.ID == strings.TrimSpace(proxyID) {
 				idx = i
 				break
 			}
 		}
 		if idx < 0 {
 			return newRequestError("代理不存在") // "proxy not found"
 		}
 		c.Proxies = append(c.Proxies[:idx], c.Proxies[idx+1:]...)
 		for i := range c.Accounts {
 			if strings.TrimSpace(c.Accounts[i].ProxyID) == strings.TrimSpace(proxyID) {
 				c.Accounts[i].ProxyID = ""
 			}
 		}
 		return validateProxyMutation(c)
 	})
 	if err != nil {
 		if detail, ok := requestErrorDetail(err); ok {
 			writeJSON(w, http.StatusNotFound, map[string]any{"detail": detail})
 			return
 		}
 		writeJSON(w, http.StatusBadRequest, map[string]any{"detail": err.Error()})
 		return
 	}
 	writeJSON(w, http.StatusOK, map[string]any{"success": true})
 }
 func (h *Handler) testProxy(w http.ResponseWriter, r *http.Request) {
 	var req map[string]any
 	_ = json.NewDecoder(r.Body).Decode(&req)
 	proxyID := fieldString(req, "proxy_id")
 	var proxy config.Proxy
 	if proxyID != "" {
 		var ok bool
 		proxy, ok = findProxyByID(h.Store.Snapshot(), proxyID)
 		if !ok {
 			writeJSON(w, http.StatusNotFound, map[string]any{"detail": "代理不存在"}) // "proxy not found"
 			return
 		}
 	} else {
 		proxy = toProxy(req)
 	}
 	result := proxyConnectivityTester(r.Context(), proxy)
 	writeJSON(w, http.StatusOK, result)
 }
 func (h *Handler) updateAccountProxy(w http.ResponseWriter, r *http.Request) {
 	identifier := chi.URLParam(r, "identifier")
 	if decoded, err := url.PathUnescape(identifier); err == nil {
 		identifier = decoded
 	}
 	var req map[string]any
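 	// Reject malformed JSON: otherwise proxy_id decodes as "" and silently clears the assignment.
 	if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
 		writeJSON(w, http.StatusBadRequest, map[string]any{"detail": "invalid json"})
 		return
 	}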
 	proxyID := fieldString(req, "proxy_id")
 	err := h.Store.Update(func(c *config.Config) error {
 		if proxyID != "" {
 			if _, ok := findProxyByID(*c, proxyID); !ok {
 				return newRequestError("代理不存在") // "proxy not found"
 			}
 		}
 		for i, acc := range c.Accounts {
 			if !accountMatchesIdentifier(acc, identifier) {
 				continue
 			}
 			c.Accounts[i].ProxyID = proxyID
 			return validateProxyMutation(c)
 		}
 		return newRequestError("账号不存在") // "account not found"
 	})
 	if err != nil {
 		if detail, ok := requestErrorDetail(err); ok {
 			writeJSON(w, http.StatusBadRequest, map[string]any{"detail": detail})
 			return
 		}
 		writeJSON(w, http.StatusBadRequest, map[string]any{"detail": err.Error()})
 		return
 	}
 	h.Pool.Reset()
 	writeJSON(w, http.StatusOK, map[string]any{"success": true, "proxy_id": proxyID})
 }
--- a/internal/admin/handler_proxies_test.go
+++ b/internal/admin/handler_proxies_test.go
@@ -0,0 +1,227 @@
 package admin
 import (
 	"bytes"
 	"context"
 	"encoding/json"
 	"net/http"
 	"net/http/httptest"
 	"testing"
 	"github.com/go-chi/chi/v5"
 	"ds2api/internal/account"
 	"ds2api/internal/config"
 )
 func newAdminProxyTestHandler(t *testing.T, raw string) *Handler {
 	t.Helper()
 	t.Setenv("DS2API_CONFIG_JSON", raw)
 	store := config.LoadStore()
 	return &Handler{
 		Store: store,
 		Pool:  account.NewPool(store),
 	}
 }
 func TestAddProxyPersistsNormalizedProxy(t *testing.T) {
 	h := newAdminProxyTestHandler(t, `{"accounts":[]}`)
 	r := chi.NewRouter()
 	r.Post("/admin/proxies", h.addProxy)
 	req := httptest.NewRequest(http.MethodPost, "/admin/proxies", bytes.NewBufferString(`{
 		"name":"  HK Exit  ",
 		"type":" SOCKS5H ",
 		"host":" 127.0.0.1 ",
 		"port":1081,
 		"username":" user ",
 		"password":" pass "
 	}`))
 	rec := httptest.NewRecorder()
 	r.ServeHTTP(rec, req)
 	if rec.Code != http.StatusOK {
 		t.Fatalf("unexpected status: %d body=%s", rec.Code, rec.Body.String())
 	}
 	proxies := h.Store.Snapshot().Proxies
 	if len(proxies) != 1 {
 		t.Fatalf("expected 1 proxy, got %d", len(proxies))
 	}
 	if proxies[0].Name != "HK Exit" {
 		t.Fatalf("unexpected proxy name: %#v", proxies[0])
 	}
 	if proxies[0].Type != "socks5h" {
 		t.Fatalf("unexpected proxy type: %#v", proxies[0])
 	}
 	if proxies[0].Username != "user" || proxies[0].Password != "pass" {
 		t.Fatalf("expected trimmed credentials, got %#v", proxies[0])
 	}
 	if proxies[0].ID == "" {
 		t.Fatalf("expected generated proxy id, got %#v", proxies[0])
 	}
 }
 func TestAddProxyDoesNotFailOnUnrelatedInvalidRuntimeConfig(t *testing.T) {
 	router := newHTTPAdminHarness(t, `{
 		"keys":["k1"],
 		"runtime":{
 			"account_max_inflight":8,
 			"global_max_inflight":4
 		}
 	}`, &testingDSMock{})
 	rec := httptest.NewRecorder()
 	router.ServeHTTP(rec, adminReq(http.MethodPost, "/proxies", []byte(`{
 		"name":"HK Exit",
 		"type":"socks5h",
 		"host":"127.0.0.1",
 		"port":1080
 	}`)))
 	if rec.Code != http.StatusOK {
 		t.Fatalf("expected add proxy success despite unrelated runtime issue, got %d body=%s", rec.Code, rec.Body.String())
 	}
 	readRec := httptest.NewRecorder()
 	router.ServeHTTP(readRec, adminReq(http.MethodGet, "/config", nil))
 	if readRec.Code != http.StatusOK {
 		t.Fatalf("config read status=%d body=%s", readRec.Code, readRec.Body.String())
 	}
 	var payload map[string]any
 	if err := json.Unmarshal(readRec.Body.Bytes(), &payload); err != nil {
 		t.Fatalf("decode config response: %v", err)
 	}
 	proxies, _ := payload["proxies"].([]any)
 	if len(proxies) != 1 {
 		t.Fatalf("expected proxy to be persisted, got %#v", payload["proxies"])
 	}
 }
 func TestDeleteProxyClearsAssignedAccountProxyID(t *testing.T) {
 	h := newAdminProxyTestHandler(t, `{
 		"proxies":[{"id":"proxy-1","name":"Node 1","type":"socks5","host":"127.0.0.1","port":1080}],
 		"accounts":[{"email":"u@example.com","password":"pwd","proxy_id":"proxy-1"}]
 	}`)
 	r := chi.NewRouter()
 	r.Delete("/admin/proxies/{proxyID}", h.deleteProxy)
 	req := httptest.NewRequest(http.MethodDelete, "/admin/proxies/proxy-1", nil)
 	rec := httptest.NewRecorder()
 	r.ServeHTTP(rec, req)
 	if rec.Code != http.StatusOK {
 		t.Fatalf("unexpected status: %d body=%s", rec.Code, rec.Body.String())
 	}
 	snap := h.Store.Snapshot()
 	if len(snap.Proxies) != 0 {
 		t.Fatalf("expected proxy removed, got %#v", snap.Proxies)
 	}
 	if len(snap.Accounts) != 1 {
 		t.Fatalf("expected account kept, got %#v", snap.Accounts)
 	}
 	if snap.Accounts[0].ProxyID != "" {
 		t.Fatalf("expected proxy assignment cleared, got %#v", snap.Accounts[0])
 	}
 }
 func TestUpdateProxyResponseDoesNotExposeStoredPassword(t *testing.T) {
 	h := newAdminProxyTestHandler(t, `{
 		"proxies":[{"id":"proxy-1","name":"Node 1","type":"socks5h","host":"127.0.0.1","port":1080,"username":"u","password":"secret"}]
 	}`)
 	r := chi.NewRouter()
 	r.Put("/admin/proxies/{proxyID}", h.updateProxy)
 	req := httptest.NewRequest(http.MethodPut, "/admin/proxies/proxy-1", bytes.NewBufferString(`{
 		"name":"Node 1",
 		"type":"socks5h",
 		"host":"127.0.0.2",
 		"port":1081,
 		"username":"u2"
 	}`))
 	rec := httptest.NewRecorder()
 	r.ServeHTTP(rec, req)
 	if rec.Code != http.StatusOK {
 		t.Fatalf("unexpected status: %d body=%s", rec.Code, rec.Body.String())
 	}
 	var payload map[string]any
 	if err := json.Unmarshal(rec.Body.Bytes(), &payload); err != nil {
 		t.Fatalf("decode response: %v", err)
 	}
 	proxy, _ := payload["proxy"].(map[string]any)
 	if _, exists := proxy["password"]; exists {
 		t.Fatalf("response should not expose password, got %#v", proxy)
 	}
 	if hasPassword, _ := proxy["has_password"].(bool); !hasPassword {
 		t.Fatalf("expected has_password=true, got %#v", proxy)
 	}
 }
 func TestUpdateAccountProxyAssignsProxyID(t *testing.T) {
 	h := newAdminProxyTestHandler(t, `{
 		"proxies":[{"id":"proxy-1","name":"Node 1","type":"socks5h","host":"127.0.0.1","port":1080}],
 		"accounts":[{"email":"u@example.com","password":"pwd"}]
 	}`)
 	r := chi.NewRouter()
 	r.Put("/admin/accounts/{identifier}/proxy", h.updateAccountProxy)
 	req := httptest.NewRequest(http.MethodPut, "/admin/accounts/u@example.com/proxy", bytes.NewBufferString(`{"proxy_id":"proxy-1"}`))
 	rec := httptest.NewRecorder()
 	r.ServeHTTP(rec, req)
 	if rec.Code != http.StatusOK {
 		t.Fatalf("unexpected status: %d body=%s", rec.Code, rec.Body.String())
 	}
 	acc, ok := h.Store.FindAccount("u@example.com")
 	if !ok {
 		t.Fatal("expected account")
 	}
 	if acc.ProxyID != "proxy-1" {
 		t.Fatalf("expected proxy assigned, got %#v", acc)
 	}
 }
 func TestTestProxyUsesStoredProxy(t *testing.T) {
 	h := newAdminProxyTestHandler(t, `{
 		"proxies":[{"id":"proxy-1","name":"Node 1","type":"socks5h","host":"127.0.0.1","port":1080}]
 	}`)
 	original := proxyConnectivityTester
 	defer func() { proxyConnectivityTester = original }()
 	var got config.Proxy
 	proxyConnectivityTester = func(_ context.Context, proxy config.Proxy) map[string]any {
 		got = proxy
 		return map[string]any{
 			"success":       true,
 			"proxy_id":      proxy.ID,
 			"proxy_type":    proxy.Type,
 			"response_time": 12,
 		}
 	}
 	r := chi.NewRouter()
 	r.Post("/admin/proxies/test", h.testProxy)
 	req := httptest.NewRequest(http.MethodPost, "/admin/proxies/test", bytes.NewBufferString(`{"proxy_id":"proxy-1"}`))
 	rec := httptest.NewRecorder()
 	r.ServeHTTP(rec, req)
 	if rec.Code != http.StatusOK {
 		t.Fatalf("unexpected status: %d body=%s", rec.Code, rec.Body.String())
 	}
 	if got.ID != "proxy-1" || got.Type != "socks5h" {
 		t.Fatalf("expected stored proxy passed to tester, got %#v", got)
 	}
 	var payload map[string]any
 	if err := json.Unmarshal(rec.Body.Bytes(), &payload); err != nil {
 		t.Fatalf("decode response: %v", err)
 	}
 	if ok, _ := payload["success"].(bool); !ok {
 		t.Fatalf("expected success payload, got %#v", payload)
 	}
 }
--- a/internal/admin/handler_raw_samples.go
+++ b/internal/admin/handler_raw_samples.go
@@ -8,6 +8,7 @@ import (
 	"net/http"
 	"net/http/httptest"
 	"net/url"
 	"sort"
 	"strings"
 	"ds2api/internal/config"
@@ -15,6 +16,11 @@ import (
 	"ds2api/internal/rawsample"
 )
 type captureChain struct {
 	Key     string
 	Entries []devcapture.Entry
 }
 func (h *Handler) captureRawSample(w http.ResponseWriter, r *http.Request) {
 	if h.OpenAI == nil {
 		writeJSON(w, http.StatusInternalServerError, map[string]any{"detail": "OpenAI handler is not configured"})
@@ -231,3 +237,312 @@ func cloneMap(in map[string]any) map[string]any {
 	}
 	return out
 }
 func (h *Handler) queryRawSampleCaptures(w http.ResponseWriter, r *http.Request) {
 	query := strings.TrimSpace(r.URL.Query().Get("q"))
 	limit := intFromQuery(r, "limit", 20)
 	if limit <= 0 {
 		limit = 20
 	}
 	if limit > 50 {
 		limit = 50
 	}
 	chains := buildCaptureChains(devcapture.Global().Snapshot())
 	items := make([]map[string]any, 0, len(chains))
 	for _, chain := range chains {
 		if query != "" && !captureChainMatchesQuery(chain, query) {
 			continue
 		}
 		items = append(items, buildCaptureChainQueryItem(chain, query))
 		if len(items) >= limit {
 			break
 		}
 	}
 	writeJSON(w, http.StatusOK, map[string]any{
 		"query": query,
 		"limit": limit,
 		"count": len(items),
 		"items": items,
 	})
 }
 func (h *Handler) saveRawSampleFromCaptures(w http.ResponseWriter, r *http.Request) {
 	var req map[string]any
 	if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
 		writeJSON(w, http.StatusBadRequest, map[string]any{"detail": "invalid json"})
 		return
 	}
 	snapshot := devcapture.Global().Snapshot()
 	if len(snapshot) == 0 {
 		writeJSON(w, http.StatusBadRequest, map[string]any{"detail": "no capture logs available"})
 		return
 	}
 	chain, err := resolveCaptureChainSelection(snapshot, req)
 	if err != nil {
 		writeJSON(w, http.StatusBadRequest, map[string]any{"detail": err.Error()})
 		return
 	}
 	sampleID := strings.TrimSpace(fieldString(req, "sample_id"))
 	source := strings.TrimSpace(fieldString(req, "source"))
 	if source == "" {
 		source = "admin/dev/raw-samples/save"
 	}
 	requestPayload := captureChainRequestPayload(chain)
 	saved, err := rawsample.Persist(rawsample.PersistOptions{
 		RootDir:      config.RawStreamSampleRoot(),
 		SampleID:     sampleID,
 		Source:       source,
 		Request:      requestPayload,
 		Capture:      captureSummaryFromEntries(chain.Entries),
 		UpstreamBody: combineCaptureBodies(chain.Entries),
 	})
 	if err != nil {
 		writeJSON(w, http.StatusInternalServerError, map[string]any{"detail": err.Error()})
 		return
 	}
 	writeJSON(w, http.StatusOK, map[string]any{
 		"success":       true,
 		"sample_id":     saved.SampleID,
 		"sample_dir":    saved.Dir,
 		"meta_path":     saved.MetaPath,
 		"upstream_path": saved.UpstreamPath,
 		"chain_key":     chain.Key,
 		"capture_ids":   captureChainIDs(chain),
 		"round_count":   len(chain.Entries),
 	})
 }
 func buildCaptureChains(snapshot []devcapture.Entry) []captureChain {
 	if len(snapshot) == 0 {
 		return nil
 	}
 	ordered := make([]devcapture.Entry, len(snapshot))
 	// devcapture snapshots are newest-first because the store prepends entries.
 	// Reverse once so equal-second timestamps can preserve the actual capture
 	// order (completion before continue) under the stable CreatedAt sort below.
 	for i := range snapshot {
 		ordered[len(snapshot)-1-i] = snapshot[i]
 	}
 	sort.SliceStable(ordered, func(i, j int) bool {
 		return ordered[i].CreatedAt < ordered[j].CreatedAt
 	})
 	byKey := make(map[string]*captureChain, len(ordered))
 	keys := make([]string, 0, len(ordered))
 	for _, entry := range ordered {
 		key := captureChainKey(entry)
 		if key == "" {
 			key = "capture:" + entry.ID
 		}
 		if _, ok := byKey[key]; !ok {
 			byKey[key] = &captureChain{Key: key}
 			keys = append(keys, key)
 		}
 		byKey[key].Entries = append(byKey[key].Entries, entry)
 	}
 	chains := make([]captureChain, 0, len(keys))
 	for _, key := range keys {
 		chains = append(chains, *byKey[key])
 	}
 	sort.SliceStable(chains, func(i, j int) bool {
 		return latestCreatedAt(chains[i]) > latestCreatedAt(chains[j])
 	})
 	return chains
 }
 func captureChainKey(entry devcapture.Entry) string {
 	req := parseCaptureRequestBody(entry.RequestBody)
 	if sessionID := strings.TrimSpace(fieldString(req, "chat_session_id")); sessionID != "" {
 		return "session:" + sessionID
 	}
 	return "capture:" + entry.ID
 }
 func parseCaptureRequestBody(raw string) map[string]any {
 	raw = strings.TrimSpace(raw)
 	if raw == "" {
 		return nil
 	}
 	var out map[string]any
 	if err := json.Unmarshal([]byte(raw), &out); err != nil {
 		return nil
 	}
 	return out
 }
 func latestCreatedAt(chain captureChain) int64 {
 	var latest int64
 	for _, entry := range chain.Entries {
 		if entry.CreatedAt > latest {
 			latest = entry.CreatedAt
 		}
 	}
 	return latest
 }
 func captureChainMatchesQuery(chain captureChain, query string) bool {
 	query = strings.ToLower(strings.TrimSpace(query))
 	if query == "" {
 		return true
 	}
 	for _, entry := range chain.Entries {
 		hay := strings.ToLower(strings.Join([]string{
 			entry.Label,
 			entry.URL,
 			entry.AccountID,
 			entry.RequestBody,
 			entry.ResponseBody,
 		}, "\n"))
 		if strings.Contains(hay, query) {
 			return true
 		}
 	}
 	return false
 }
 func buildCaptureChainQueryItem(chain captureChain, query string) map[string]any {
 	first := chain.Entries[0]
 	last := chain.Entries[len(chain.Entries)-1]
 	requestPreview := previewCaptureChainRequest(chain)
 	responsePreview := previewCaptureChainResponse(chain)
 	return map[string]any{
 		"chain_key":          chain.Key,
 		"capture_ids":        captureChainIDs(chain),
 		"created_at":         latestCreatedAt(chain),
 		"round_count":        len(chain.Entries),
 		"account_id":         nilIfEmpty(strings.TrimSpace(first.AccountID)),
 		"initial_label":      first.Label,
 		"initial_url":        first.URL,
 		"latest_label":       last.Label,
 		"latest_url":         last.URL,
 		"request_preview":    requestPreview,
 		"response_preview":   responsePreview,
 		"query":              query,
 		"response_truncated": captureChainHasTruncatedResponse(chain),
 	}
 }
 func captureChainIDs(chain captureChain) []string {
 	out := make([]string, 0, len(chain.Entries))
 	for _, entry := range chain.Entries {
 		out = append(out, entry.ID)
 	}
 	return out
 }
 func previewCaptureChainRequest(chain captureChain) string {
 	for _, entry := range chain.Entries {
 		req := parseCaptureRequestBody(entry.RequestBody)
 		if prompt := strings.TrimSpace(fieldString(req, "prompt")); prompt != "" {
 			return previewText(prompt, 280)
 		}
 		if messages, ok := req["messages"].([]any); ok {
 			var parts []string
 			for _, item := range messages {
 				m, _ := item.(map[string]any)
 				content := strings.TrimSpace(fieldString(m, "content"))
 				if content != "" {
 					parts = append(parts, content)
 				}
 			}
 			if len(parts) > 0 {
 				return previewText(strings.Join(parts, "\n"), 280)
 			}
 		}
 	}
 	return previewText(strings.TrimSpace(chain.Entries[0].RequestBody), 280)
 }
 func previewCaptureChainResponse(chain captureChain) string {
 	var b strings.Builder
 	for _, entry := range chain.Entries {
 		if b.Len() > 0 {
 			b.WriteByte('\n')
 		}
 		b.WriteString(strings.TrimSpace(entry.ResponseBody))
 		if b.Len() >= 280 {
 			break
 		}
 	}
 	return previewText(b.String(), 280)
 }
 func previewText(text string, limit int) string {
 	text = strings.TrimSpace(text)
 	if limit <= 0 || len(text) <= limit {
 		return text
 	}
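 	// Back up to a rune boundary so multi-byte UTF-8 text (e.g. Chinese) is never cut mid-character.
 	for limit > 0 && text[limit]&0xC0 == 0x80 {
 		limit--
 	}
 	return text[:limit] + "..."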
 }
 func captureChainHasTruncatedResponse(chain captureChain) bool {
 	for _, entry := range chain.Entries {
 		if entry.ResponseTruncated {
 			return true
 		}
 	}
 	return false
 }
 func resolveCaptureChainSelection(snapshot []devcapture.Entry, req map[string]any) (captureChain, error) {
 	chains := buildCaptureChains(snapshot)
 	if len(chains) == 0 {
 		return captureChain{}, fmt.Errorf("no capture logs available")
 	}
 	if chainKey := strings.TrimSpace(fieldString(req, "chain_key")); chainKey != "" {
 		for _, chain := range chains {
 			if chain.Key == chainKey {
 				return chain, nil
 			}
 		}
 		return captureChain{}, fmt.Errorf("capture chain not found")
 	}
 	captureID := strings.TrimSpace(fieldString(req, "capture_id"))
 	if captureID == "" {
 		if ids, ok := toStringSlice(req["capture_ids"]); ok && len(ids) > 0 {
 			captureID = strings.TrimSpace(ids[0])
 		}
 	}
 	if captureID != "" {
 		for _, chain := range chains {
 			for _, entry := range chain.Entries {
 				if entry.ID == captureID {
 					return chain, nil
 				}
 			}
 		}
 		return captureChain{}, fmt.Errorf("capture id not found")
 	}
 	query := strings.TrimSpace(fieldString(req, "query"))
 	if query != "" {
 		for _, chain := range chains {
 			if captureChainMatchesQuery(chain, query) {
 				return chain, nil
 			}
 		}
 		return captureChain{}, fmt.Errorf("no capture chain matched query")
 	}
 	return captureChain{}, fmt.Errorf("capture_id, chain_key, or query is required")
 }
 func captureChainRequestPayload(chain captureChain) any {
 	for _, entry := range chain.Entries {
 		if req := parseCaptureRequestBody(entry.RequestBody); req != nil {
 			return req
 		}
 	}
 	return strings.TrimSpace(chain.Entries[0].RequestBody)
 }
--- a/internal/admin/handler_raw_samples_test.go
+++ b/internal/admin/handler_raw_samples_test.go
@@ -230,3 +230,160 @@ func TestCombineCaptureBodiesPreservesOrderAndSeparators(t *testing.T) {
 		t.Fatalf("unexpected combined body: %q", string(got))
 	}
 }
 func TestQueryRawSampleCapturesGroupsBySessionAndMatchesQuestion(t *testing.T) {
 	devcapture.Global().Clear()
 	defer devcapture.Global().Clear()
 	recordCapturedResponse(
 		"deepseek_completion",
 		"https://chat.deepseek.com/api/v0/chat/completion",
 		http.StatusOK,
 		map[string]any{
 			"chat_session_id": "session-query-1",
 			"prompt":          "用户问题：广州天气怎么样？",
 		},
 		"data: {\"v\":\"先看天气\"}\n\n",
 	)
 	recordCapturedResponse(
 		"deepseek_continue",
 		"https://chat.deepseek.com/api/v0/chat/continue",
 		http.StatusOK,
 		map[string]any{
 			"chat_session_id": "session-query-1",
 			"message_id":      2,
 		},
 		"data: {\"v\":\"再补充一点\"}\n\n",
 	)
 	h := &Handler{}
 	rec := httptest.NewRecorder()
 	req := httptest.NewRequest(http.MethodGet, "/admin/dev/raw-samples/query?q=广州天气", nil)
 	h.queryRawSampleCaptures(rec, req)
 	if rec.Code != http.StatusOK {
 		t.Fatalf("expected 200, got %d body=%s", rec.Code, rec.Body.String())
 	}
 	var out map[string]any
 	if err := json.Unmarshal(rec.Body.Bytes(), &out); err != nil {
 		t.Fatalf("decode failed: %v", err)
 	}
 	items, _ := out["items"].([]any)
 	if len(items) != 1 {
 		t.Fatalf("expected 1 item, got %d body=%s", len(items), rec.Body.String())
 	}
 	item, _ := items[0].(map[string]any)
 	if item["chain_key"] != "session:session-query-1" {
 		t.Fatalf("unexpected chain key: %#v", item["chain_key"])
 	}
 	if int(item["round_count"].(float64)) != 2 {
 		t.Fatalf("expected 2 rounds, got %#v", item["round_count"])
 	}
 	reqPreview, _ := item["request_preview"].(string)
 	if !strings.Contains(reqPreview, "广州天气") {
 		t.Fatalf("expected request preview to contain query, got %q", reqPreview)
 	}
 }
 func TestBuildCaptureChainsPreservesCaptureOrderWhenTimestampsCollide(t *testing.T) {
 	snapshot := []devcapture.Entry{
 		{
 			ID:           "cap_continue",
 			CreatedAt:    1712365200,
 			Label:        "deepseek_continue",
 			RequestBody:  `{"chat_session_id":"session-collision","message_id":2}`,
 			ResponseBody: "data: {\"v\":\"第二段\"}\n\n",
 		},
 		{
 			ID:           "cap_completion",
 			CreatedAt:    1712365200,
 			Label:        "deepseek_completion",
 			RequestBody:  `{"chat_session_id":"session-collision","prompt":"题目"}`,
 			ResponseBody: "data: {\"v\":\"第一段\"}\n\n",
 		},
 	}
 	chains := buildCaptureChains(snapshot)
 	if len(chains) != 1 {
 		t.Fatalf("expected 1 chain, got %d", len(chains))
 	}
 	if len(chains[0].Entries) != 2 {
 		t.Fatalf("expected 2 entries, got %d", len(chains[0].Entries))
 	}
 	if chains[0].Entries[0].Label != "deepseek_completion" {
 		t.Fatalf("expected completion first, got %#v", chains[0].Entries)
 	}
 	if chains[0].Entries[1].Label != "deepseek_continue" {
 		t.Fatalf("expected continue second, got %#v", chains[0].Entries)
 	}
 }
 func TestSaveRawSampleFromCapturesPersistsSelectedChain(t *testing.T) {
 	root := t.TempDir()
 	t.Setenv("DS2API_RAW_STREAM_SAMPLE_ROOT", root)
 	devcapture.Global().Clear()
 	defer devcapture.Global().Clear()
 	recordCapturedResponse(
 		"deepseek_completion",
 		"https://chat.deepseek.com/api/v0/chat/completion",
 		http.StatusOK,
 		map[string]any{
 			"chat_session_id": "session-save-1",
 			"prompt":          "请回答深圳天气",
 		},
 		"data: {\"v\":\"第一段\"}\n\n",
 	)
 	recordCapturedResponse(
 		"deepseek_continue",
 		"https://chat.deepseek.com/api/v0/chat/continue",
 		http.StatusOK,
 		map[string]any{
 			"chat_session_id": "session-save-1",
 			"message_id":      2,
 		},
 		"data: {\"v\":\"第二段\"}\n\n",
 	)
 	h := &Handler{}
 	rec := httptest.NewRecorder()
 	reqBody := `{"query":"深圳天气","sample_id":"saved-from-memory"}`
 	req := httptest.NewRequest(http.MethodPost, "/admin/dev/raw-samples/save", strings.NewReader(reqBody))
 	h.saveRawSampleFromCaptures(rec, req)
 	if rec.Code != http.StatusOK {
 		t.Fatalf("expected 200, got %d body=%s", rec.Code, rec.Body.String())
 	}
 	var out map[string]any
 	if err := json.Unmarshal(rec.Body.Bytes(), &out); err != nil {
 		t.Fatalf("decode failed: %v", err)
 	}
 	if out["sample_id"] != "saved-from-memory" {
 		t.Fatalf("unexpected sample id: %#v", out["sample_id"])
 	}
 	if int(out["round_count"].(float64)) != 2 {
 		t.Fatalf("expected round_count=2, got %#v", out["round_count"])
 	}
 	sampleDir := filepath.Join(root, "saved-from-memory")
 	upstreamBytes, err := os.ReadFile(filepath.Join(sampleDir, "upstream.stream.sse"))
 	if err != nil {
 		t.Fatalf("read upstream: %v", err)
 	}
 	upstream := string(upstreamBytes)
 	if !strings.Contains(upstream, "第一段") || !strings.Contains(upstream, "第二段") {
 		t.Fatalf("expected combined upstream, got %q", upstream)
 	}
 	metaBytes, err := os.ReadFile(filepath.Join(sampleDir, "meta.json"))
 	if err != nil {
 		t.Fatalf("read meta: %v", err)
 	}
 	var meta map[string]any
 	if err := json.Unmarshal(metaBytes, &meta); err != nil {
 		t.Fatalf("decode meta: %v", err)
 	}
 	reqMeta, _ := meta["request"].(map[string]any)
 	if fieldString(reqMeta, "chat_session_id") != "session-save-1" {
 		t.Fatalf("expected request to come from selected chain, got %#v", meta["request"])
 	}
 }
--- a/internal/admin/handler_vercel.go
+++ b/internal/admin/handler_vercel.go
@@ -301,7 +301,7 @@ func vercelRequest(ctx context.Context, client *http.Client, method, endpoint st
 	if err != nil {
 		return nil, 0, err
 	}
-	defer resp.Body.Close()
+	defer func() { _ = resp.Body.Close() }()
 	b, _ := io.ReadAll(resp.Body)
 	parsed := map[string]any{}
 	_ = json.Unmarshal(b, &parsed)
--- a/internal/admin/handler_version.go
+++ b/internal/admin/handler_version.go
@@ -43,7 +43,7 @@ func (h *Handler) getVersion(w http.ResponseWriter, _ *http.Request) {
 		writeJSON(w, http.StatusOK, resp)
 		return
 	}
-	defer r.Body.Close()
+	defer func() { _ = r.Body.Close() }()
 	if r.StatusCode < 200 || r.StatusCode >= 300 {
 		resp["check_error"] = "github api status: " + r.Status
 		writeJSON(w, http.StatusOK, resp)
--- a/internal/admin/helpers.go
+++ b/internal/admin/helpers.go
@@ -65,6 +65,7 @@ func toAccount(m map[string]any) config.Account {
 		Email:    email,
 		Mobile:   mobile,
 		Password: fieldString(m, "password"),
 		ProxyID:  fieldString(m, "proxy_id"),
 	}
 }
@@ -100,9 +101,36 @@ func accountMatchesIdentifier(acc config.Account, identifier string) bool {
 func normalizeAccountForStorage(acc config.Account) config.Account {
 	acc.Email = strings.TrimSpace(acc.Email)
 	acc.Mobile = config.NormalizeMobileForStorage(acc.Mobile)
 	acc.ProxyID = strings.TrimSpace(acc.ProxyID)
 	return acc
 }
 func toProxy(m map[string]any) config.Proxy {
 	return config.NormalizeProxy(config.Proxy{
 		ID:       fieldString(m, "id"),
 		Name:     fieldString(m, "name"),
 		Type:     fieldString(m, "type"),
 		Host:     fieldString(m, "host"),
 		Port:     intFrom(m["port"]),
 		Username: fieldString(m, "username"),
 		Password: fieldString(m, "password"),
 	})
 }
 func findProxyByID(c config.Config, proxyID string) (config.Proxy, bool) {
 	id := strings.TrimSpace(proxyID)
 	if id == "" {
 		return config.Proxy{}, false
 	}
 	for _, proxy := range c.Proxies {
 		proxy = config.NormalizeProxy(proxy)
 		if proxy.ID == id {
 			return proxy, true
 		}
 	}
 	return config.Proxy{}, false
 }
 func accountDedupeKey(acc config.Account) string {
 	if email := strings.TrimSpace(acc.Email); email != "" {
 		return "email:" + email
--- a/internal/auth/auth_edge_test.go
+++ b/internal/auth/auth_edge_test.go
@@ -130,9 +130,7 @@ func TestMarkTokenInvalidNotConfigToken(t *testing.T) {
 	a := &RequestAuth{UseConfigToken: false, DeepSeekToken: "direct", resolver: r}
 	r.MarkTokenInvalid(a)
 	// Should not panic, token should be unchanged for non-config
-	if a.DeepSeekToken != "" {
+	_ = a.DeepSeekToken // Actual behavior may clear it; this test only asserts no panic.
-		// Actually it does clear it; that's fine - let's check behavior
-	}
 }
 func TestMarkTokenInvalidEmptyAccountID(t *testing.T) {
--- a/internal/compat/go_compat_test.go
+++ b/internal/compat/go_compat_test.go
@@ -1,6 +1,7 @@
 package compat
 import (
 	"ds2api/internal/toolcall"
 	"encoding/json"
 	"os"
 	"path/filepath"
@@ -36,7 +37,6 @@ func TestGoCompatSSEFixtures(t *testing.T) {
 			Finished      bool             `json:"finished"`
 			NewType       string           `json:"new_type"`
 			ContentFilter bool             `json:"content_filter"`
-			OutputTokens  int              `json:"output_tokens"`
 			ErrorMessage  string           `json:"error_message"`
 		}
 		mustLoadJSON(t, expectedPath, &expected)
@@ -57,11 +57,10 @@ func TestGoCompatSSEFixtures(t *testing.T) {
 			res.Stop != expected.Finished ||
 			res.NextType != expected.NewType ||
 			res.ContentFilter != expected.ContentFilter ||
-			res.OutputTokens != expected.OutputTokens ||
 			res.ErrorMessage != expected.ErrorMessage {
-			t.Fatalf("fixture %s mismatch:\n got parts=%#v finished=%v newType=%q contentFilter=%v outputTokens=%d errorMessage=%q\nwant parts=%#v finished=%v newType=%q contentFilter=%v outputTokens=%d errorMessage=%q",
+			t.Fatalf("fixture %s mismatch:\n got parts=%#v finished=%v newType=%q contentFilter=%v errorMessage=%q\nwant parts=%#v finished=%v newType=%q contentFilter=%v errorMessage=%q",
-				name, gotParts, res.Stop, res.NextType, res.ContentFilter, res.OutputTokens, res.ErrorMessage,
+				name, gotParts, res.Stop, res.NextType, res.ContentFilter, res.ErrorMessage,
-				expected.Parts, expected.Finished, expected.NewType, expected.ContentFilter, expected.OutputTokens, expected.ErrorMessage)
+				expected.Parts, expected.Finished, expected.NewType, expected.ContentFilter, expected.ErrorMessage)
 		}
 	}
 }
@@ -86,22 +85,22 @@ func TestGoCompatToolcallFixtures(t *testing.T) {
 		mustLoadJSON(t, fixturePath, &fixture)
 		var expected struct {
-			Calls             []util.ParsedToolCall `json:"calls"`
+			Calls             []toolcall.ParsedToolCall `json:"calls"`
-			SawToolCallSyntax bool                  `json:"sawToolCallSyntax"`
+			SawToolCallSyntax bool                      `json:"sawToolCallSyntax"`
-			RejectedByPolicy  bool                  `json:"rejectedByPolicy"`
+			RejectedByPolicy  bool                      `json:"rejectedByPolicy"`
-			RejectedToolNames []string              `json:"rejectedToolNames"`
+			RejectedToolNames []string                  `json:"rejectedToolNames"`
 		}
 		mustLoadJSON(t, expectedPath, &expected)
-		var got util.ToolCallParseResult
+		var got toolcall.ToolCallParseResult
 		switch strings.ToLower(strings.TrimSpace(fixture.Mode)) {
 		case "standalone":
-			got = util.ParseStandaloneToolCallsDetailed(fixture.Text, fixture.ToolNames)
+			got = toolcall.ParseStandaloneToolCallsDetailed(fixture.Text, fixture.ToolNames)
 		default:
-			got = util.ParseToolCallsDetailed(fixture.Text, fixture.ToolNames)
+			got = toolcall.ParseToolCallsDetailed(fixture.Text, fixture.ToolNames)
 		}
 		if got.Calls == nil {
-			got.Calls = []util.ParsedToolCall{}
+			got.Calls = []toolcall.ParsedToolCall{}
 		}
 		if got.RejectedToolNames == nil {
 			got.RejectedToolNames = []string{}
--- a/internal/config/codec.go
+++ b/internal/config/codec.go
@@ -20,6 +20,9 @@ func (c Config) MarshalJSON() ([]byte, error) {
 	if len(c.Accounts) > 0 {
 		m["accounts"] = c.Accounts
 	}
 	if len(c.Proxies) > 0 {
 		m["proxies"] = c.Proxies
 	}
 	if len(c.ClaudeMapping) > 0 {
 		m["claude_mapping"] = c.ClaudeMapping
 	}
@@ -70,6 +73,10 @@ func (c *Config) UnmarshalJSON(b []byte) error {
 			if err := json.Unmarshal(v, &c.Accounts); err != nil {
 				return fmt.Errorf("invalid field %q: %w", k, err)
 			}
 		case "proxies":
 			if err := json.Unmarshal(v, &c.Proxies); err != nil {
 				return fmt.Errorf("invalid field %q: %w", k, err)
 			}
 		case "claude_mapping":
 			if err := json.Unmarshal(v, &c.ClaudeMapping); err != nil {
 				return fmt.Errorf("invalid field %q: %w", k, err)
@@ -130,6 +137,7 @@ func (c Config) Clone() Config {
 	clone := Config{
 		Keys:           slices.Clone(c.Keys),
 		Accounts:       slices.Clone(c.Accounts),
 		Proxies:        slices.Clone(c.Proxies),
 		ClaudeMapping:  cloneStringMap(c.ClaudeMapping),
 		ClaudeModelMap: cloneStringMap(c.ClaudeModelMap),
 		ModelAliases:   cloneStringMap(c.ModelAliases),
--- a/internal/config/config.go
+++ b/internal/config/config.go
@@ -1,8 +1,16 @@
 package config
 import (
 	"crypto/sha1"
 	"encoding/hex"
 	"fmt"
 	"strings"
 )
 type Config struct {
 	Keys             []string          `json:"keys,omitempty"`
 	Accounts         []Account         `json:"accounts,omitempty"`
 	Proxies          []Proxy           `json:"proxies,omitempty"`
 	ClaudeMapping    map[string]string `json:"claude_mapping,omitempty"`
 	ClaudeModelMap   map[string]string `json:"claude_model_mapping,omitempty"`
 	ModelAliases     map[string]string `json:"model_aliases,omitempty"`
@@ -22,6 +30,38 @@ type Account struct {
 	Mobile   string `json:"mobile,omitempty"`
 	Password string `json:"password,omitempty"`
 	Token    string `json:"token,omitempty"`
 	ProxyID  string `json:"proxy_id,omitempty"`
 }
 type Proxy struct {
 	ID       string `json:"id,omitempty"`
 	Name     string `json:"name,omitempty"`
 	Type     string `json:"type,omitempty"`
 	Host     string `json:"host,omitempty"`
 	Port     int    `json:"port,omitempty"`
 	Username string `json:"username,omitempty"`
 	Password string `json:"password,omitempty"`
 }
 func NormalizeProxy(p Proxy) Proxy {
 	p.ID = strings.TrimSpace(p.ID)
 	p.Name = strings.TrimSpace(p.Name)
 	p.Type = strings.ToLower(strings.TrimSpace(p.Type))
 	p.Host = strings.TrimSpace(p.Host)
 	p.Username = strings.TrimSpace(p.Username)
 	p.Password = strings.TrimSpace(p.Password)
 	if p.ID == "" {
 		p.ID = StableProxyID(p)
 	}
 	if p.Name == "" && p.Host != "" && p.Port > 0 {
 		p.Name = fmt.Sprintf("%s:%d", p.Host, p.Port)
 	}
 	return p
 }
 func StableProxyID(p Proxy) string {
 	sum := sha1.Sum([]byte(strings.ToLower(strings.TrimSpace(p.Type)) + "|" + strings.ToLower(strings.TrimSpace(p.Host)) + "|" + fmt.Sprintf("%d", p.Port) + "|" + strings.TrimSpace(p.Username)))
 	return "proxy_" + hex.EncodeToString(sum[:6])
 }
 func (c *Config) ClearAccountTokens() {
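Review note: the ID derivation in `StableProxyID` above can be exercised on its own. The following is a minimal sketch, assuming a trimmed-down `proxy` struct and a lowercase `stableProxyID` helper (both hypothetical names introduced here for illustration). It follows the same recipe as the diff: SHA-1 over `type|host|port|username`, keeping the first 6 bytes as hex.

```go
package main

import (
	"crypto/sha1"
	"encoding/hex"
	"fmt"
	"strings"
)

// proxy is a hypothetical stand-in that keeps only the fields feeding the ID.
type proxy struct {
	Type, Host, Username string
	Port                 int
}

// stableProxyID mirrors the recipe shown in the diff: type and host are
// trimmed and lowercased, the username is only trimmed, and the password
// is not part of the key at all.
func stableProxyID(p proxy) string {
	key := strings.Join([]string{
		strings.ToLower(strings.TrimSpace(p.Type)),
		strings.ToLower(strings.TrimSpace(p.Host)),
		fmt.Sprintf("%d", p.Port),
		strings.TrimSpace(p.Username),
	}, "|")
	sum := sha1.Sum([]byte(key))
	return "proxy_" + hex.EncodeToString(sum[:6])
}

func main() {
	a := stableProxyID(proxy{Type: "SOCKS5H", Host: " Example.com ", Port: 1080, Username: "demo"})
	b := stableProxyID(proxy{Type: "socks5h", Host: "example.com", Port: 1080, Username: "demo"})
	// Same ID despite case and whitespace differences; 6 + 12 characters long.
	fmt.Println(a == b, len(a))
}
```

Because the password is excluded from the key, two entries that differ only in password would receive the same generated ID and then trip the duplicate check in `ValidateProxyConfig`, unless explicit `id` values are set.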
--- a/internal/config/config_edge_test.go
+++ b/internal/config/config_edge_test.go
@@ -49,6 +49,51 @@ func TestGetModelConfigDeepSeekReasonerSearch(t *testing.T) {
 	}
 }
 func TestGetModelConfigDeepSeekExpertChat(t *testing.T) {
 	thinking, search, ok := GetModelConfig("deepseek-expert-chat")
 	if !ok {
 		t.Fatal("expected ok for deepseek-expert-chat")
 	}
 	if thinking || search {
 		t.Fatalf("expected no thinking/search for deepseek-expert-chat, got thinking=%v search=%v", thinking, search)
 	}
 }
 func TestGetModelConfigDeepSeekExpertReasonerSearch(t *testing.T) {
 	thinking, search, ok := GetModelConfig("deepseek-expert-reasoner-search")
 	if !ok {
 		t.Fatal("expected ok for deepseek-expert-reasoner-search")
 	}
 	if !thinking || !search {
 		t.Fatalf("expected both true, got thinking=%v search=%v", thinking, search)
 	}
 }
 func TestGetModelConfigDeepSeekVisionReasonerSearch(t *testing.T) {
 	thinking, search, ok := GetModelConfig("deepseek-vision-reasoner-search")
 	if !ok {
 		t.Fatal("expected ok for deepseek-vision-reasoner-search")
 	}
 	if !thinking || !search {
 		t.Fatalf("expected both true, got thinking=%v search=%v", thinking, search)
 	}
 }
 func TestGetModelTypeDefaultExpertAndVision(t *testing.T) {
 	defaultType, ok := GetModelType("deepseek-chat")
 	if !ok || defaultType != "default" {
 		t.Fatalf("expected default model_type, got ok=%v model_type=%q", ok, defaultType)
 	}
 	expertType, ok := GetModelType("deepseek-expert-chat")
 	if !ok || expertType != "expert" {
 		t.Fatalf("expected expert model_type, got ok=%v model_type=%q", ok, expertType)
 	}
 	visionType, ok := GetModelType("deepseek-vision-chat")
 	if !ok || visionType != "vision" {
 		t.Fatalf("expected vision model_type, got ok=%v model_type=%q", ok, visionType)
 	}
 }
 func TestGetModelConfigCaseInsensitive(t *testing.T) {
 	thinking, search, ok := GetModelConfig("DeepSeek-Chat")
 	if !ok {
@@ -551,6 +596,30 @@ func TestOpenAIModelsResponse(t *testing.T) {
 	if len(data) == 0 {
 		t.Fatal("expected non-empty models list")
 	}
 	expected := map[string]bool{
 		"deepseek-chat":                   false,
 		"deepseek-reasoner":               false,
 		"deepseek-chat-search":            false,
 		"deepseek-reasoner-search":        false,
 		"deepseek-expert-chat":            false,
 		"deepseek-expert-reasoner":        false,
 		"deepseek-expert-chat-search":     false,
 		"deepseek-expert-reasoner-search": false,
 		"deepseek-vision-chat":            false,
 		"deepseek-vision-reasoner":        false,
 		"deepseek-vision-chat-search":     false,
 		"deepseek-vision-reasoner-search": false,
 	}
 	for _, model := range data {
 		if _, ok := expected[model.ID]; ok {
 			expected[model.ID] = true
 		}
 	}
 	for id, seen := range expected {
 		if !seen {
 			t.Fatalf("expected OpenAI model list to include %s", id)
 		}
 	}
 }
 func TestClaudeModelsResponse(t *testing.T) {
--- a/internal/config/config_test.go
+++ b/internal/config/config_test.go
@@ -32,6 +32,47 @@ func TestLoadStoreClearsTokensFromConfigInput(t *testing.T) {
 	}
 }
 func TestLoadStorePreservesProxiesAndAccountProxyAssignment(t *testing.T) {
 	t.Setenv("DS2API_CONFIG_JSON", `{
 		"proxies":[
 			{
 				"id":"proxy-sh-1",
 				"name":"Shanghai Exit",
 				"type":"socks5h",
 				"host":"127.0.0.1",
 				"port":1080,
 				"username":"demo",
 				"password":"secret"
 			}
 		],
 		"accounts":[
 			{
 				"email":"u@example.com",
 				"password":"p",
 				"proxy_id":"proxy-sh-1"
 			}
 		]
 	}`)
 	store := LoadStore()
 	snap := store.Snapshot()
 	if len(snap.Proxies) != 1 {
 		t.Fatalf("expected 1 proxy, got %d", len(snap.Proxies))
 	}
 	if snap.Proxies[0].ID != "proxy-sh-1" {
 		t.Fatalf("unexpected proxy id: %#v", snap.Proxies[0])
 	}
 	if snap.Proxies[0].Type != "socks5h" {
 		t.Fatalf("unexpected proxy type: %#v", snap.Proxies[0])
 	}
 	if len(snap.Accounts) != 1 {
 		t.Fatalf("expected 1 account, got %d", len(snap.Accounts))
 	}
 	if snap.Accounts[0].ProxyID != "proxy-sh-1" {
 		t.Fatalf("expected account proxy assignment preserved, got %#v", snap.Accounts[0])
 	}
 }
 func TestLoadStoreDropsLegacyTokenOnlyAccounts(t *testing.T) {
 	t.Setenv("DS2API_CONFIG_JSON", `{
 		"accounts":[
@@ -58,8 +99,7 @@ func TestLoadStorePreservesFileBackedTokensForRuntime(t *testing.T) {
 	if err != nil {
 		t.Fatalf("create temp config: %v", err)
 	}
-	defer tmp.Close()
+	defer func() { _ = tmp.Close() }()
 	if _, err := tmp.WriteString(`{
 		"accounts":[{"email":"u@example.com","password":"p","token":"persisted-token"}]
 	}`); err != nil {
@@ -355,7 +395,7 @@ func TestAccountTestStatusIsRuntimeOnlyAndNotPersisted(t *testing.T) {
 	if err != nil {
 		t.Fatalf("create temp config: %v", err)
 	}
-	defer tmp.Close()
+	defer func() { _ = tmp.Close() }()
 	if _, err := tmp.WriteString(`{
 		"accounts":[{"email":"u@example.com","password":"p","test_status":"ok"}]
 	}`); err != nil {
--- a/internal/config/model_alias_test.go
+++ b/internal/config/model_alias_test.go
@@ -2,6 +2,10 @@ package config
 import "testing"
 type mockModelAliasReader map[string]string
 func (m mockModelAliasReader) ModelAliases() map[string]string { return m }
 func TestResolveModelDirectDeepSeek(t *testing.T) {
 	got, ok := ResolveModel(nil, "deepseek-chat")
 	if !ok || got != "deepseek-chat" {
@@ -30,6 +34,31 @@ func TestResolveModelUnknown(t *testing.T) {
 	}
 }
 func TestResolveModelDirectDeepSeekExpert(t *testing.T) {
 	got, ok := ResolveModel(nil, "deepseek-expert-chat")
 	if !ok || got != "deepseek-expert-chat" {
 		t.Fatalf("expected deepseek-expert-chat, got ok=%v model=%q", ok, got)
 	}
 }
 func TestResolveModelCustomAliasToExpert(t *testing.T) {
 	got, ok := ResolveModel(mockModelAliasReader{
 		"my-expert-model": "deepseek-expert-reasoner-search",
 	}, "my-expert-model")
 	if !ok || got != "deepseek-expert-reasoner-search" {
 		t.Fatalf("expected alias -> deepseek-expert-reasoner-search, got ok=%v model=%q", ok, got)
 	}
 }
 func TestResolveModelCustomAliasToVision(t *testing.T) {
 	got, ok := ResolveModel(mockModelAliasReader{
 		"my-vision-model": "deepseek-vision-chat-search",
 	}, "my-vision-model")
 	if !ok || got != "deepseek-vision-chat-search" {
 		t.Fatalf("expected alias -> deepseek-vision-chat-search, got ok=%v model=%q", ok, got)
 	}
 }
 func TestClaudeModelsResponsePaginationFields(t *testing.T) {
 	resp := ClaudeModelsResponse()
 	if _, ok := resp["first_id"]; !ok {
--- a/internal/config/models.go
+++ b/internal/config/models.go
@@ -19,6 +19,14 @@ var DeepSeekModels = []ModelInfo{
 	{ID: "deepseek-reasoner", Object: "model", Created: 1677610602, OwnedBy: "deepseek", Permission: []any{}},
 	{ID: "deepseek-chat-search", Object: "model", Created: 1677610602, OwnedBy: "deepseek", Permission: []any{}},
 	{ID: "deepseek-reasoner-search", Object: "model", Created: 1677610602, OwnedBy: "deepseek", Permission: []any{}},
 	{ID: "deepseek-expert-chat", Object: "model", Created: 1677610602, OwnedBy: "deepseek", Permission: []any{}},
 	{ID: "deepseek-expert-reasoner", Object: "model", Created: 1677610602, OwnedBy: "deepseek", Permission: []any{}},
 	{ID: "deepseek-expert-chat-search", Object: "model", Created: 1677610602, OwnedBy: "deepseek", Permission: []any{}},
 	{ID: "deepseek-expert-reasoner-search", Object: "model", Created: 1677610602, OwnedBy: "deepseek", Permission: []any{}},
 	{ID: "deepseek-vision-chat", Object: "model", Created: 1677610602, OwnedBy: "deepseek", Permission: []any{}},
 	{ID: "deepseek-vision-reasoner", Object: "model", Created: 1677610602, OwnedBy: "deepseek", Permission: []any{}},
 	{ID: "deepseek-vision-chat-search", Object: "model", Created: 1677610602, OwnedBy: "deepseek", Permission: []any{}},
 	{ID: "deepseek-vision-reasoner-search", Object: "model", Created: 1677610602, OwnedBy: "deepseek", Permission: []any{}},
 }
 var ClaudeModels = []ModelInfo{
@@ -72,11 +80,40 @@ func GetModelConfig(model string) (thinking bool, search bool, ok bool) {
 		return false, true, true
 	case "deepseek-reasoner-search":
 		return true, true, true
 	case "deepseek-expert-chat":
 		return false, false, true
 	case "deepseek-expert-reasoner":
 		return true, false, true
 	case "deepseek-expert-chat-search":
 		return false, true, true
 	case "deepseek-expert-reasoner-search":
 		return true, true, true
 	case "deepseek-vision-chat":
 		return false, false, true
 	case "deepseek-vision-reasoner":
 		return true, false, true
 	case "deepseek-vision-chat-search":
 		return false, true, true
 	case "deepseek-vision-reasoner-search":
 		return true, true, true
 	default:
 		return false, false, false
 	}
 }
 func GetModelType(model string) (modelType string, ok bool) {
 	switch lower(model) {
 	case "deepseek-chat", "deepseek-reasoner", "deepseek-chat-search", "deepseek-reasoner-search":
 		return "default", true
 	case "deepseek-expert-chat", "deepseek-expert-reasoner", "deepseek-expert-chat-search", "deepseek-expert-reasoner-search":
 		return "expert", true
 	case "deepseek-vision-chat", "deepseek-vision-reasoner", "deepseek-vision-chat-search", "deepseek-vision-reasoner-search":
 		return "vision", true
 	default:
 		return "", false
 	}
 }
 func IsSupportedDeepSeekModel(model string) bool {
 	_, _, ok := GetModelConfig(model)
 	return ok
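Review note: the two switch statements above encode a regular naming scheme (`deepseek-[family-]{chat|reasoner}[-search]`). As an illustration only, and not what this PR implements, the same table could be derived from the name. `describeModel` below is a hypothetical helper; it assumes Go 1.20+ for `strings.CutPrefix` and `strings.CutSuffix`.

```go
package main

import (
	"fmt"
	"strings"
)

// describeModel is a hypothetical alternative to the explicit switch tables:
// it derives the family and both flags from the model name itself.
func describeModel(model string) (family string, thinking, search, ok bool) {
	rest, found := strings.CutPrefix(strings.ToLower(model), "deepseek-")
	if !found {
		return "", false, false, false
	}
	family = "default"
	for _, f := range []string{"expert", "vision"} {
		if r, cut := strings.CutPrefix(rest, f+"-"); cut {
			family, rest = f, r
			break
		}
	}
	// A trailing "-search" enables search; what remains must be the base mode.
	rest, search = strings.CutSuffix(rest, "-search")
	switch rest {
	case "chat":
		return family, false, search, true
	case "reasoner":
		return family, true, search, true
	}
	return "", false, false, false
}

func main() {
	for _, m := range []string{"deepseek-chat", "deepseek-expert-reasoner-search", "deepseek-vision-chat-search", "deepseek-foo"} {
		family, thinking, search, ok := describeModel(m)
		fmt.Println(m, family, thinking, search, ok)
	}
}
```

The explicit tables in the diff are easier to grep and cannot accidentally accept an unintended name, which is a reasonable argument for keeping them as written.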
--- a/internal/config/paths.go
+++ b/internal/config/paths.go
@@ -33,10 +33,6 @@ func ConfigPath() string {
 	return ResolvePath("DS2API_CONFIG_PATH", "config.json")
 }
 func WASMPath() string {
 	return ResolvePath("DS2API_WASM_PATH", "sha3_wasm_bg.7b9ca65ddd.wasm")
 }
 func RawStreamSampleRoot() string {
 	return ResolvePath("DS2API_RAW_STREAM_SAMPLE_ROOT", "tests/raw_stream_samples")
 }
--- a/internal/config/validation.go
+++ b/internal/config/validation.go
@@ -6,6 +6,9 @@ import (
 )
 func ValidateConfig(c Config) error {
 	if err := ValidateProxyConfig(c.Proxies); err != nil {
 		return err
 	}
 	if err := ValidateAdminConfig(c.Admin); err != nil {
 		return err
 	}
@@ -21,6 +24,55 @@ func ValidateConfig(c Config) error {
 	if err := ValidateAutoDeleteConfig(c.AutoDelete); err != nil {
 		return err
 	}
 	if err := ValidateAccountProxyReferences(c.Accounts, c.Proxies); err != nil {
 		return err
 	}
 	return nil
 }
 func ValidateProxyConfig(proxies []Proxy) error {
 	seen := make(map[string]struct{}, len(proxies))
 	for _, proxy := range proxies {
 		proxy = NormalizeProxy(proxy)
 		if err := ValidateTrimmedString("proxies.id", proxy.ID, true); err != nil {
 			return err
 		}
 		switch proxy.Type {
 		case "socks5", "socks5h":
 		default:
 			return fmt.Errorf("proxies.type must be one of socks5, socks5h")
 		}
 		if err := ValidateTrimmedString("proxies.host", proxy.Host, true); err != nil {
 			return err
 		}
 		if err := ValidateIntRange("proxies.port", proxy.Port, 1, 65535, true); err != nil {
 			return err
 		}
 		if _, ok := seen[proxy.ID]; ok {
 			return fmt.Errorf("duplicate proxy id: %s", proxy.ID)
 		}
 		seen[proxy.ID] = struct{}{}
 	}
 	return nil
 }
 func ValidateAccountProxyReferences(accounts []Account, proxies []Proxy) error {
 	if len(accounts) == 0 {
 		return nil
 	}
 	ids := make(map[string]struct{}, len(proxies))
 	for _, proxy := range proxies {
 		ids[NormalizeProxy(proxy).ID] = struct{}{}
 	}
 	for _, acc := range accounts {
 		proxyID := strings.TrimSpace(acc.ProxyID)
 		if proxyID == "" {
 			continue
 		}
 		if _, ok := ids[proxyID]; !ok {
 			return fmt.Errorf("account proxy_id references unknown proxy: %s", proxyID)
 		}
 	}
 	return nil
 }
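Review note: the reference check in `ValidateAccountProxyReferences` above reduces to a set-membership test. The sketch below is a simplified, hypothetical stand-in (`checkProxyRefs` works on bare ID strings rather than the `Account` and `Proxy` structs, and it skips `NormalizeProxy`), meant only to show which inputs pass and which fail.

```go
package main

import (
	"fmt"
	"strings"
)

// checkProxyRefs is a hypothetical simplification of the validation in the
// diff: every non-empty account proxy reference must name a known proxy ID.
func checkProxyRefs(accountProxyIDs, proxyIDs []string) error {
	known := make(map[string]struct{}, len(proxyIDs))
	for _, id := range proxyIDs {
		known[strings.TrimSpace(id)] = struct{}{}
	}
	for _, ref := range accountProxyIDs {
		ref = strings.TrimSpace(ref)
		if ref == "" {
			continue // accounts without a proxy binding are allowed
		}
		if _, ok := known[ref]; !ok {
			return fmt.Errorf("account proxy_id references unknown proxy: %s", ref)
		}
	}
	return nil
}

func main() {
	proxies := []string{"proxy-sh-1"}
	fmt.Println(checkProxyRefs([]string{"proxy-sh-1", ""}, proxies))
	fmt.Println(checkProxyRefs([]string{"proxy-missing"}, proxies))
}
```

The first call prints `<nil>`; the second prints the unknown-proxy error naming `proxy-missing`.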
--- a/internal/deepseek/assets/sha3_wasm_bg.7b9ca65ddd.wasm
+++ b/internal/deepseek/assets/sha3_wasm_bg.7b9ca65ddd.wasm
--- a/internal/deepseek/client_auth.go
+++ b/internal/deepseek/client_auth.go
@@ -13,6 +13,7 @@ import (
 )
 func (c *Client) Login(ctx context.Context, acc config.Account) (string, error) {
 	clients := c.requestClientsForAccount(acc)
 	payload := map[string]any{
 		"password":  strings.TrimSpace(acc.Password),
 		"device_id": "deepseek_to_api",
@@ -27,7 +28,7 @@ func (c *Client) Login(ctx context.Context, acc config.Account) (string, error)
 	} else {
 		return "", errors.New("missing email/mobile")
 	}
-	resp, err := c.postJSON(ctx, c.regular, DeepSeekLoginURL, BaseHeaders, payload)
+	resp, err := c.postJSON(ctx, clients.regular, clients.fallback, DeepSeekLoginURL, BaseHeaders, payload)
 	if err != nil {
 		return "", err
 	}
@@ -52,11 +53,12 @@ func (c *Client) CreateSession(ctx context.Context, a *auth.RequestAuth, maxAtte
 	if maxAttempts <= 0 {
 		maxAttempts = c.maxRetries
 	}
 	clients := c.requestClientsForAuth(ctx, a)
 	attempts := 0
 	refreshed := false
 	for attempts < maxAttempts {
 		headers := c.authHeaders(a.DeepSeekToken)
-		resp, status, err := c.postJSONWithStatus(ctx, c.regular, DeepSeekCreateSessionURL, headers, map[string]any{"agent": "chat"})
+		resp, status, err := c.postJSONWithStatus(ctx, clients.regular, clients.fallback, DeepSeekCreateSessionURL, headers, map[string]any{"agent": "chat"})
 		if err != nil {
 			config.Logger.Warn("[create_session] request error", "error", err, "account", a.AccountID)
 			attempts++
@@ -64,9 +66,7 @@ func (c *Client) CreateSession(ctx context.Context, a *auth.RequestAuth, maxAtte
 		}
 		code, bizCode, msg, bizMsg := extractResponseStatus(resp)
 		if status == http.StatusOK && code == 0 && bizCode == 0 {
-			data, _ := resp["data"].(map[string]any)
-			bizData, _ := data["biz_data"].(map[string]any)
-			sessionID, _ := bizData["id"].(string)
+			sessionID := extractCreateSessionID(resp)
 			if sessionID != "" {
 				return sessionID, nil
 			}
@@ -91,16 +91,25 @@ func (c *Client) CreateSession(ctx context.Context, a *auth.RequestAuth, maxAtte
 }
 func (c *Client) GetPow(ctx context.Context, a *auth.RequestAuth, maxAttempts int) (string, error) {
 	return c.GetPowForTarget(ctx, a, DeepSeekCompletionTargetPath, maxAttempts)
 }
 func (c *Client) GetPowForTarget(ctx context.Context, a *auth.RequestAuth, targetPath string, maxAttempts int) (string, error) {
 	if maxAttempts <= 0 {
 		maxAttempts = c.maxRetries
 	}
 	targetPath = strings.TrimSpace(targetPath)
 	if targetPath == "" {
 		targetPath = DeepSeekCompletionTargetPath
 	}
 	clients := c.requestClientsForAuth(ctx, a)
 	attempts := 0
 	refreshed := false
 	for attempts < maxAttempts {
 		headers := c.authHeaders(a.DeepSeekToken)
-		resp, status, err := c.postJSONWithStatus(ctx, c.regular, DeepSeekCreatePowURL, headers, map[string]any{"target_path": "/api/v0/chat/completion"})
+		resp, status, err := c.postJSONWithStatus(ctx, clients.regular, clients.fallback, DeepSeekCreatePowURL, headers, map[string]any{"target_path": targetPath})
 		if err != nil {
-			config.Logger.Warn("[get_pow] request error", "error", err, "account", a.AccountID)
+			config.Logger.Warn("[get_pow] request error", "error", err, "account", a.AccountID, "target_path", targetPath)
 			attempts++
 			continue
 		}
@@ -109,14 +118,14 @@ func (c *Client) GetPow(ctx context.Context, a *auth.RequestAuth, maxAttempts in
 			data, _ := resp["data"].(map[string]any)
 			bizData, _ := data["biz_data"].(map[string]any)
 			challenge, _ := bizData["challenge"].(map[string]any)
-			answer, err := c.powSolver.Compute(ctx, challenge)
+			answer, err := ComputePow(ctx, challenge)
 			if err != nil {
 				attempts++
 				continue
 			}
 			return BuildPowHeader(challenge, answer)
 		}
-		config.Logger.Warn("[get_pow] failed", "status", status, "code", code, "biz_code", bizCode, "msg", msg, "biz_msg", bizMsg, "use_config_token", a.UseConfigToken, "account", a.AccountID)
+		config.Logger.Warn("[get_pow] failed", "status", status, "code", code, "biz_code", bizCode, "msg", msg, "biz_msg", bizMsg, "use_config_token", a.UseConfigToken, "account", a.AccountID, "target_path", targetPath)
 		if a.UseConfigToken {
 			if !refreshed && shouldAttemptRefresh(status, code, bizCode, msg, bizMsg) {
 				if c.Auth.RefreshToken(ctx, a) {
@@ -201,6 +210,22 @@ func isAuthIndicativeBizFailure(msg string, bizMsg string) bool {
 	return false
 }
 // DeepSeek has returned create-session ids in both biz_data.id and
 // biz_data.chat_session.id across observed response variants; accept either.
 func extractCreateSessionID(resp map[string]any) string {
 	data, _ := resp["data"].(map[string]any)
 	bizData, _ := data["biz_data"].(map[string]any)
 	if sessionID, _ := bizData["id"].(string); strings.TrimSpace(sessionID) != "" {
 		return strings.TrimSpace(sessionID)
 	}
 	if chatSession, ok := bizData["chat_session"].(map[string]any); ok {
 		if sessionID, _ := chatSession["id"].(string); strings.TrimSpace(sessionID) != "" {
 			return strings.TrimSpace(sessionID)
 		}
 	}
 	return ""
 }
 func extractResponseStatus(resp map[string]any) (code int, bizCode int, msg string, bizMsg string) {
 	code = intFrom(resp["code"])
 	msg, _ = resp["msg"].(string)
--- a/internal/deepseek/client_auth_test.go
+++ b/internal/deepseek/client_auth_test.go
@@ -0,0 +1,34 @@
 package deepseek
 import "testing"
 func TestExtractCreateSessionIDSupportsLegacyShape(t *testing.T) {
 	resp := map[string]any{
 		"data": map[string]any{
 			"biz_data": map[string]any{
 				"id": "legacy-session-id",
 			},
 		},
 	}
 	if got := extractCreateSessionID(resp); got != "legacy-session-id" {
 		t.Fatalf("expected legacy session id, got %q", got)
 	}
 }
 func TestExtractCreateSessionIDSupportsNestedChatSessionShape(t *testing.T) {
 	resp := map[string]any{
 		"data": map[string]any{
 			"biz_data": map[string]any{
 				"chat_session": map[string]any{
 					"id":         "nested-session-id",
 					"model_type": "default",
 				},
 			},
 		},
 	}
 	if got := extractCreateSessionID(resp); got != "nested-session-id" {
 		t.Fatalf("expected nested session id, got %q", got)
 	}
 }
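Review note: the tests above build the response maps by hand. The sketch below runs the same `extractCreateSessionID` logic (copied from this diff) against decoded JSON, which is closer to how the value arrives at runtime. The three payloads are illustrative assumptions modeled on the test fixtures, not captured upstream responses.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// extractCreateSessionID is copied from the diff: it accepts the session id
// at biz_data.id or at biz_data.chat_session.id.
func extractCreateSessionID(resp map[string]any) string {
	data, _ := resp["data"].(map[string]any)
	bizData, _ := data["biz_data"].(map[string]any)
	if sessionID, _ := bizData["id"].(string); strings.TrimSpace(sessionID) != "" {
		return strings.TrimSpace(sessionID)
	}
	if chatSession, ok := bizData["chat_session"].(map[string]any); ok {
		if sessionID, _ := chatSession["id"].(string); strings.TrimSpace(sessionID) != "" {
			return strings.TrimSpace(sessionID)
		}
	}
	return ""
}

func main() {
	payloads := []string{
		`{"data":{"biz_data":{"id":"legacy-session-id"}}}`,
		`{"data":{"biz_data":{"chat_session":{"id":"nested-session-id","model_type":"default"}}}}`,
		`{"data":{}}`, // neither shape present: yields an empty string
	}
	for _, raw := range payloads {
		var resp map[string]any
		if err := json.Unmarshal([]byte(raw), &resp); err != nil {
			panic(err)
		}
		fmt.Printf("%q\n", extractCreateSessionID(resp))
	}
}
```

Reading from a nil map is safe in Go, so the chained type assertions degrade to an empty result instead of panicking when a level is missing.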
--- a/internal/deepseek/client_completion.go
+++ b/internal/deepseek/client_completion.go
@@ -10,18 +10,20 @@ import (
 	"ds2api/internal/auth"
 	"ds2api/internal/config"
 	trans "ds2api/internal/deepseek/transport"
 )
 func (c *Client) CallCompletion(ctx context.Context, a *auth.RequestAuth, payload map[string]any, powResp string, maxAttempts int) (*http.Response, error) {
 	if maxAttempts <= 0 {
 		maxAttempts = c.maxRetries
 	}
 	clients := c.requestClientsForAuth(ctx, a)
 	headers := c.authHeaders(a.DeepSeekToken)
 	headers["x-ds-pow-response"] = powResp
 	captureSession := c.capture.Start("deepseek_completion", DeepSeekCompletionURL, a.AccountID, payload)
 	attempts := 0
 	for attempts < maxAttempts {
-		resp, err := c.streamPost(ctx, DeepSeekCompletionURL, headers, payload)
+		resp, err := c.streamPost(ctx, clients.stream, DeepSeekCompletionURL, headers, payload)
 		if err != nil {
 			attempts++
 			time.Sleep(time.Second)
@@ -44,11 +46,13 @@ func (c *Client) CallCompletion(ctx context.Context, a *auth.RequestAuth, payloa
 	return nil, errors.New("completion failed")
 }
-func (c *Client) streamPost(ctx context.Context, url string, headers map[string]string, payload any) (*http.Response, error) {
+func (c *Client) streamPost(ctx context.Context, doer trans.Doer, url string, headers map[string]string, payload any) (*http.Response, error) {
 	b, err := json.Marshal(payload)
 	if err != nil {
 		return nil, err
 	}
 	headers = c.jsonHeaders(headers)
 	clients := c.requestClientsFromContext(ctx)
 	req, err := http.NewRequestWithContext(ctx, http.MethodPost, url, bytes.NewReader(b))
 	if err != nil {
 		return nil, err
@@ -56,7 +60,7 @@ func (c *Client) streamPost(ctx context.Context, url string, headers map[string]
 	for k, v := range headers {
 		req.Header.Set(k, v)
 	}
-	resp, err := c.stream.Do(req)
+	resp, err := doer.Do(req)
 	if err != nil {
 		config.Logger.Warn("[deepseek] fingerprint stream request failed, fallback to std transport", "url", url, "error", err)
 		req2, reqErr := http.NewRequestWithContext(ctx, http.MethodPost, url, bytes.NewReader(b))
@@ -66,7 +70,7 @@ func (c *Client) streamPost(ctx context.Context, url string, headers map[string]
 		for k, v := range headers {
 			req2.Header.Set(k, v)
 		}
-		return c.fallbackS.Do(req2)
+		return clients.fallbackS.Do(req2)
 	}
 	return resp, nil
 }
--- a/internal/deepseek/client_continue.go
+++ b/internal/deepseek/client_continue.go
@@ -51,6 +51,7 @@ func (c *Client) callContinue(ctx context.Context, a *auth.RequestAuth, sessionI
 	if strings.TrimSpace(sessionID) == "" || responseMessageID <= 0 {
 		return nil, errors.New("missing continue identifiers")
 	}
 	clients := c.requestClientsForAuth(ctx, a)
 	headers := c.authHeaders(a.DeepSeekToken)
 	headers["x-ds-pow-response"] = powResp
 	payload := map[string]any{
@@ -60,7 +61,7 @@ func (c *Client) callContinue(ctx context.Context, a *auth.RequestAuth, sessionI
 	}
 	config.Logger.Info("[auto_continue] calling continue", "session_id", sessionID, "message_id", responseMessageID)
 	captureSession := c.capture.Start("deepseek_continue", DeepSeekContinueURL, a.AccountID, payload)
-	resp, err := c.streamPost(ctx, DeepSeekContinueURL, headers, payload)
+	resp, err := c.streamPost(ctx, clients.stream, DeepSeekContinueURL, headers, payload)
 	if err != nil {
 		return nil, err
 	}
Author	SHA1	Message	Date
CJACK.	67501cf4d2	Merge pull request #256 from CJackHwang/dev Attachment upload for all models and all channels; full-interface compatibility of DeepSeek features pending testing	2026-04-13 04:00:49 +08:00
CJACK	25234af301	feat: enforce request body size limits and restrict inline file count to prevent resource exhaustion	2026-04-13 03:55:14 +08:00
CJACK	2aee80d0d3	fix: update URL decoding method and refine file ID extraction logic to exclude text-based inputs	2026-04-13 03:49:06 +08:00
CJACK	ab9f3cc417	refactor: remove unused leakedDanglingThinkOpenPattern regex from output sanitizer	2026-04-13 03:40:20 +08:00
CJACK	c92ed8d3c3	refactor: rename apiTester testSuccess key to requestSuccess and update localization files	2026-04-13 03:24:39 +08:00
CJACK	d78789a66e	feat: implement error handling for empty upstream responses in chat streams and update UI to display stream-level errors	2026-04-13 03:22:38 +08:00
CJACK	acb110865f	feat: implement cross-account validation and improved error handling for file attachments in API tester	2026-04-13 03:15:12 +08:00
CJACK	ffca8be597	feat: implement file readiness polling and add IsImage field to upload results	2026-04-13 02:55:45 +08:00
CJACK	7ef6a7d11f	feat: update to v3.4.0 and redesign model selection UI with a dropdown and descriptive panel	2026-04-13 02:27:12 +08:00
CJACK	d53a2ea7d2	refactor: remove unused purpose parameter from upload and upstream empty output handlers	2026-04-13 01:59:51 +08:00
CJACK	daa636e040	refactor: handle upstream thinking-only responses as errors and sanitize dangling think tags in output	2026-04-13 01:55:14 +08:00
CJACK	aa41bae044	feat: add file attachment support to chat interface and API requests	2026-04-13 00:04:38 +08:00
CJACK	2027c7cd77	fix: add JSON headers to DeepSeek requests and prevent string content from being parsed as file IDs in OpenAI adapter	2026-04-12 23:49:56 +08:00
CJACK	0591128601	refactor: fix file handling error suppression, optimize hash calculation, and update API documentation with additional models	2026-04-12 23:35:57 +08:00
CJACK	caafdedb00	feat: implement OpenAI-compatible file upload and reference handling for DeepSeek API	2026-04-12 23:30:22 +08:00
CJACK	0a23c77ff7	feat: add sanitization for think tags and BOS markers in leaked output and update golang.org/x/net dependency	2026-04-12 17:43:57 +08:00
CJACK.	d759804c33	Merge pull request #255 from CJackHwang/codex/refactor-prompt-concatenation-using-tokenizer feat(prompt): tokenizer-style prompt stitching with thinking-prefix support	2026-04-12 17:14:48 +08:00
CJACK.	433a3a877d	feat(prompt): align DeepSeek prompt assembly with tokenizer-style turns	2026-04-12 13:59:42 +08:00
CJACK.	792e295512	Merge pull request #254 from CJackHwang/main Update VERSION	2026-04-08 20:24:03 +08:00
CJACK.	d053d9ad04	Update VERSION	2026-04-08 20:22:55 +08:00
CJACK.	04e025c5e1	Update README.MD	2026-04-08 18:21:09 +08:00
CJACK.	184cbed3cb	Merge pull request #252 from CJackHwang/dev Merge pull request #249 from shuaihaoV/feat/deepseek-model-type-families Add default, expert, and vision DeepSeek model families	2026-04-08 18:06:07 +08:00
CJACK.	378f99be4a	Merge pull request #249 from shuaihaoV/feat/deepseek-model-type-families Add default, expert, and vision DeepSeek model families	2026-04-08 17:53:02 +08:00
Shuaihao	ba76a2163b	Add default, expert, and vision DeepSeek model families	2026-04-08 14:37:22 +08:00
CJACK.	af9c51f3a7	Merge pull request #245 from CJackHwang/dev Merge pull request #244 from CJackHwang/codex/temporarily-switch-to-internal-usage-count Temporarily ignore DeepSeek upstream usage fields and prefer internal token estimation	2026-04-07 21:27:32 +08:00
CJACK.	92bb25265e	Merge pull request #246 from CJackHwang/codex/fix-review-comments-before-merging Fix proxy-bound fallback behavior and redact proxy password responses	2026-04-07 21:26:13 +08:00
CJACK.	84050d87e4	fix proxy fallback binding and redact proxy password responses	2026-04-07 21:22:28 +08:00
CJACK.	c6a6f1cf4e	Merge pull request #244 from CJackHwang/codex/temporarily-switch-to-internal-usage-count Temporarily ignore DeepSeek upstream usage fields and prefer internal token estimation	2026-04-07 20:39:36 +08:00
CJACK.	f4ed10d38d	disable token-mismatch gate by default in raw stream simulator	2026-04-07 20:38:29 +08:00
CJACK.	d9e65c9710	remove upstream token-usage plumbing and always estimate from content	2026-04-07 20:12:18 +08:00
CJACK.	a14e5b0847	temporarily ignore upstream token usage fields globally	2026-04-07 19:40:47 +08:00
CJACK.	b59e991ad5	Merge pull request #241 from tanaer/feat/proxy-ip-management-dev feat: add SOCKS5/SOCKS5H proxy management and account proxy routing	2026-04-07 17:14:48 +08:00
Jason.li	c84347b625	docs: align agent rules with quality gate lint	2026-04-07 14:19:40 +08:00
Jason.li	8ae2ea10c8	feat(proxy): add proxy IP management and account routing Add admin CRUD and connectivity checks for SOCKS5/SOCKS5H proxy nodes. Allow accounts to bind to a proxy, route DeepSeek requests through the selected node, and expose proxy management in the admin UI.	2026-04-07 14:16:13 +08:00
CJACK.	d32765bc84	Merge pull request #240 from CJackHwang/dev Merge pull request #239 from CJackHwang/codex/fix-escaping-issues-and-token-counting Fix HTML-escaped tool-call args and preserve upstream token usage (stream & non-stream)	2026-04-07 13:16:49 +08:00
CJACK.	08b1344f81	Merge pull request #242 from CJackHwang/codex/fix-issues-in-pull-request-#240 fix: avoid double-decoding XML entity text in markup tool-call parsing	2026-04-07 13:16:01 +08:00
CJACK.	8b0da7b6f8	fix: avoid double XML entity decoding in toolcall parser	2026-04-07 13:14:30 +08:00
CJACK.	1c95942e5d	Merge pull request #239 from CJackHwang/codex/fix-escaping-issues-and-token-counting Fix HTML-escaped tool-call args and preserve upstream token usage (stream & non-stream)	2026-04-07 12:56:02 +08:00
CJACK.	da7c46b278	Limit HTML unescape to markup tool-call parsing	2026-04-07 12:55:06 +08:00
CJACK.	cfcca69396	Update VERSION	2026-04-07 12:46:15 +08:00
CJACK.	4475bfe92f	Merge pull request #238 from CJackHwang/codex/remove-project-structure-section-from-main-document docs: remove duplicated project structure sections from READMEs	2026-04-07 12:36:30 +08:00
CJACK.	77a401fb19	Fix tool-call HTML escaping and stabilize usage token mapping	2026-04-07 12:35:50 +08:00
CJACK.	a935f61f74	docs: remove duplicated project structure sections from READMEs	2026-04-07 12:32:52 +08:00
CJACK.	80b88b37ff	Merge pull request #236 from CJackHwang/codex/review-and-reorganize-all-md-documents docs: add architecture docs and centralize documentation index; update READMEs and API links	2026-04-07 11:55:11 +08:00
CJACK.	475c9086d2	docs: add folder-purpose comments to the expanded directory tree	2026-04-07 11:51:14 +08:00
CJACK.	8cfba9c650	Merge pull request #232 from CJackHwang/dev refactor: improve XML tool parsing robustness, update system prompt constraints, and simplify tool filtering logic	2026-04-07 11:13:44 +08:00
CJACK.	98131881ed	Merge pull request #234 from CJackHwang/codex/fix-documentation-and-accumulated_token_usage Propagate DeepSeek SSE token usage to /v1/responses and remove stale POW env docs	2026-04-07 11:02:44 +08:00
CJACK.	86ecbc89bd	Preserve SSE frame delimiters when injecting Gemini usage	2026-04-07 10:59:27 +08:00
CJACK.	668b9c26bd	Unify token usage pass-through on OpenAI translate pipeline	2026-04-07 10:16:23 +08:00
CJACK.	5bcea3d727	Propagate upstream token usage across Gemini usage metadata	2026-04-07 10:16:00 +08:00
CJACK.	96b8587c5b	Fix token usage propagation and remove stale env docs	2026-04-07 08:27:03 +08:00
CJACK.	d09260d06f	Merge pull request #233 from CJackHwang/main Dependency upgrades	2026-04-07 07:12:40 +08:00
CJACK.	554b95d232	Merge pull request #231 from CJackHwang/dependabot/npm_and_yarn/webui/npm_and_yarn-7c6ac41456 chore(deps-dev): bump vite from 8.0.3 to 8.0.5 in /webui in the npm_and_yarn group across 1 directory	2026-04-07 07:02:53 +08:00
dependabot[bot]	b54ee05d12	chore(deps-dev): bump vite Bumps the npm_and_yarn group with 1 update in the /webui directory: [vite](https://github.com/vitejs/vite/tree/HEAD/packages/vite). Updates `vite` from 8.0.3 to 8.0.5 - [Release notes](https://github.com/vitejs/vite/releases) - [Changelog](https://github.com/vitejs/vite/blob/main/packages/vite/CHANGELOG.md) - [Commits](https://github.com/vitejs/vite/commits/v8.0.5/packages/vite) --- updated-dependencies: - dependency-name: vite dependency-version: 8.0.5 dependency-type: direct:development dependency-group: npm_and_yarn ... Signed-off-by: dependabot[bot] <support@github.com>	2026-04-06 18:44:30 +00:00
CJACK	9968221633	refactor: improve XML tool parsing robustness, update system prompt constraints, and simplify tool filtering logic	2026-04-07 02:10:45 +08:00
CJACK	b79a13efd5	feat: support explicit prompt token tracking in SSE parsing and stream handlers	2026-04-07 01:39:27 +08:00
CJACK	da778a18fb	refactor: replace WASM-based PoW with a high-performance native Go implementation and add context support for cancellation.	2026-04-07 01:20:01 +08:00
CJACK.	10921e0f84	Merge pull request #229 from CJackHwang/dev refactor: replace WASM-based PoW solver with a native Go implementation in the pow package	2026-04-07 00:57:33 +08:00
CJACK	e7d561694a	refactor: replace WASM-based PoW solver with a native Go implementation in the pow package	2026-04-07 00:10:36 +08:00
CJACK.	13687ce787	Merge pull request #227 from CJackHwang/codex/change-empty-responses-to-429 fix(openai): return 429 for empty upstream output	2026-04-06 17:11:07 +08:00
CJACK.	26aa02d4b5	fix(openai): return 429 for empty upstream output	2026-04-06 16:56:17 +08:00
CJACK.	89eaf048c3	Merge pull request #221 from CJackHwang/dev Dev	2026-04-06 16:50:00 +08:00
CJACK.	904211469a	Merge pull request #222 from CJackHwang/codex/resolve-pull-request-issues-and-complete-tests Add golangci-lint bootstrap and CI lint gate; update docs and .gitignore	2026-04-06 13:56:05 +08:00
CJACK.	530872ff2f	Merge pull request #224 from CJackHwang/codex/fix-four-issues-from-pull-requests-99aozg Fix lint bootstrap behavior and harden SSE token replay parsing	2026-04-06 13:55:30 +08:00
CJACK.	fbe1e25c7b	Merge pull request #225 from CJackHwang/codex/fix-golangci-lint-bootstrap-compatibility Treat missing golangci-lint as bootstrap-compatible	2026-04-06 13:54:50 +08:00
CJACK.	cd7e03d936	Merge pull request #226 from CJackHwang/codex/fix-issue-with-passing-thresholds Handle deferred Close errors, normalize error messages, and add nolint annotations	2026-04-06 13:54:14 +08:00
CJACK.	37fb758191	Make full quality gates pass across repository	2026-04-06 13:41:58 +08:00
CJACK.	fb6be8a8ee	fix lint bootstrap on missing golangci-lint	2026-04-06 13:38:17 +08:00
CJACK.	57114a36f5	fix: address codex review issues for lint bootstrap and token replay	2026-04-06 13:12:36 +08:00
CJACK.	a671d82759	chore: auto-bootstrap golangci-lint for Go 1.26 compatibility	2026-04-06 12:52:56 +08:00
CJACK.	da75ed6966	Merge pull request #220 from CJackHwang/codex/fix-pull-request-review-comments Migrate and reorganize .golangci.yml to v2 with updated linters and exclusions	2026-04-06 12:33:51 +08:00
CJACK.	3b99d2edbe	docs: add full-sample token replay command and report fields	2026-04-06 12:32:31 +08:00
CJACK.	f6c09ebd63	fix: keep node error-branch token semantics and add grep fallback	2026-04-06 12:32:26 +08:00
CJACK.	36af2e00f6	Merge pull request #219 from CJackHwang/dev Dev	2026-04-06 11:17:39 +08:00
CJACK.	9e0fd83a76	test: validate raw stream token replay and enforce gofmt in lint script	2026-04-06 11:15:08 +08:00
CJACK.	a8c160b05d	fix: parse DeepSeek accumulated_token_usage robustly and stabilize lint	2026-04-06 11:14:48 +08:00
CJACK.	89ca57122c	fix: migrate golangci config to v2 schema	2026-04-06 09:29:22 +08:00
CJACK	6b6ce3eea8	refactor: move toolcall utility files to internal/toolcall directory	2026-04-06 03:56:42 +08:00
CJACK	870144de17	ci: remove golangci-lint step from quality gates workflow	2026-04-06 03:53:03 +08:00
CJACK	1530246e4f	refactor: move tool call parsing and formatting logic to a dedicated internal/toolcall package	2026-04-06 03:19:18 +08:00
CJACK.	d6ecdad6de	Merge pull request #218 from CJackHwang/dev fix: reverse snapshot order to preserve capture sequence during stabl…	2026-04-06 02:55:59 +08:00
CJACK	2857a171cc	fix: reverse snapshot order to preserve capture sequence during stable sort	2026-04-06 02:51:06 +08:00
CJACK.	eb8b45e667	Merge pull request #217 from CJackHwang/dev Dev	2026-04-06 02:47:44 +08:00
CJACK	1664349a29	docs: update documentation for raw stream test samples	2026-04-06 02:44:20 +08:00
CJACK	b105d54c00	feat: add admin endpoints for capturing, querying, and persisting raw upstream samples and increase default capture limits	2026-04-06 02:38:15 +08:00
CJACK	039d7d3db1	feat: implement raw sample capture querying and persistence, and add environment-based configuration for dev capture store.	2026-04-06 02:33:02 +08:00
CJACK	49012a227c	feat: implement trimContinuationOverlap utility to remove redundant stream prefixes and add associated tests.	2026-04-06 02:23:28 +08:00
CJACK	4d36afea4c	fix incremental (delta) bug in continuation stream	2026-04-06 02:01:41 +08:00