mirror of
https://github.com/CJackHwang/ds2api.git
synced 2026-05-02 15:35:27 +08:00
Compare commits
55 Commits
| Author | SHA1 | Date | |
|---|---|---|---|
|
|
46a56d0389 | ||
|
|
cfd57288d7 | ||
|
|
1049a723d8 | ||
|
|
4dae9a3882 | ||
|
|
05422b2449 | ||
|
|
50e66b1571 | ||
|
|
5106773573 | ||
|
|
a9828e33ad | ||
|
|
76ae2fed51 | ||
|
|
d0549c27c7 | ||
|
|
7dcddef91f | ||
|
|
6697d0d227 | ||
|
|
d21fb74f29 | ||
|
|
0f389471ac | ||
|
|
5c1dd59502 | ||
|
|
0bdbb3a4ef | ||
|
|
031f5cd39e | ||
|
|
5fbea97aec | ||
|
|
07de35a093 | ||
|
|
23d5ac7fa2 | ||
|
|
2cde0a1d84 | ||
|
|
534fd1d14b | ||
|
|
4251438ff5 | ||
|
|
f8effc5e84 | ||
|
|
d3ac698129 | ||
|
|
a19f281229 | ||
|
|
8cbb5a4262 | ||
|
|
416b9939fc | ||
|
|
eeccc967f5 | ||
|
|
003a65d9d2 | ||
|
|
555df63fbc | ||
|
|
770f5719d8 | ||
|
|
eb470c33ba | ||
|
|
d70a0acaa8 | ||
|
|
d668465734 | ||
|
|
3cf207bcbb | ||
|
|
a6a87853d4 | ||
|
|
888a0e6bff | ||
|
|
190881f13a | ||
|
|
f82a7e3e3c | ||
|
|
057862f7fb | ||
|
|
844832f31b | ||
|
|
a32154f881 | ||
|
|
056c50676f | ||
|
|
0913c477a6 | ||
|
|
63ee2e41c2 | ||
|
|
cff62ab839 | ||
|
|
c7ed01bfe7 | ||
|
|
dec61b8008 | ||
|
|
ac57cabc80 | ||
|
|
6a6f380987 | ||
|
|
c7ffcd76e6 | ||
|
|
57f2041edb | ||
|
|
bd788a12b1 | ||
|
|
a50e2ef5cd |
136
.env.example
136
.env.example
@@ -1,74 +1,90 @@
|
||||
# DS2API 环境变量配置模板
|
||||
# 复制此文件为 .env 并根据需要修改
|
||||
# 最后更新:2026-02
|
||||
# DS2API environment template (Go runtime)
|
||||
# Copy this file to .env and adjust values.
|
||||
# Updated: 2026-02
|
||||
|
||||
# ===============================================================
|
||||
# 核心配置
|
||||
# ===============================================================
|
||||
|
||||
# ----- 服务配置 -----
|
||||
# 服务端口(默认 5001)
|
||||
# ---------------------------------------------------------------
|
||||
# Runtime
|
||||
# ---------------------------------------------------------------
|
||||
# HTTP listen port (default: 5001)
|
||||
PORT=5001
|
||||
|
||||
# 服务监听地址
|
||||
HOST=0.0.0.0
|
||||
|
||||
# 日志级别 (DEBUG, INFO, WARNING, ERROR)
|
||||
# Log level: DEBUG | INFO | WARN | ERROR
|
||||
LOG_LEVEL=INFO
|
||||
|
||||
# Max concurrent inflight requests per account in managed-key mode.
|
||||
# Default: 2
|
||||
# Recommended client concurrency is calculated dynamically as:
|
||||
# account_count * DS2API_ACCOUNT_MAX_INFLIGHT
|
||||
# So by default it is account_count * 2.
|
||||
# Requests beyond inflight slots enter a waiting queue first.
|
||||
# Default queue size equals recommended concurrency, so 429 starts after:
|
||||
# account_count * DS2API_ACCOUNT_MAX_INFLIGHT * 2
|
||||
# Alias: DS2API_ACCOUNT_CONCURRENCY
|
||||
# DS2API_ACCOUNT_MAX_INFLIGHT=2
|
||||
|
||||
# ===============================================================
|
||||
# 数据配置(三选一)
|
||||
# ===============================================================
|
||||
# Optional waiting queue size override for managed-key mode.
|
||||
# Default: recommended_concurrency (same as account_count * inflight_limit)
|
||||
# Alias: DS2API_ACCOUNT_QUEUE_SIZE
|
||||
# DS2API_ACCOUNT_MAX_QUEUE=10
|
||||
|
||||
# 方式1: JSON 字符串(适合简单配置)
|
||||
# DS2API_CONFIG_JSON={"keys":["your-api-key"],"accounts":[{"email":"user@example.com","password":"xxx","token":""}]}
|
||||
# ---------------------------------------------------------------
|
||||
# Admin auth
|
||||
# ---------------------------------------------------------------
|
||||
# Admin key for /admin login and protected admin APIs.
|
||||
# Default is "admin" when unset, but setting it explicitly is recommended.
|
||||
DS2API_ADMIN_KEY=admin
|
||||
|
||||
# 方式2: Base64 编码的 JSON(推荐用于 Vercel,避免特殊字符转义问题)
|
||||
# 生成方式: echo '{"keys":["your-api-key"],"accounts":[...]}' | base64
|
||||
# DS2API_CONFIG_JSON=eyJrZXlzIjpbInlvdXItYXBpLWtleSJdLCJhY2NvdW50cyI6W3siZW1haWwiOiJ1c2VyQGV4YW1wbGUuY29tIiwicGFzc3dvcmQiOiJ4eHgiLCJ0b2tlbiI6IiJ9XX0=
|
||||
# Optional JWT signing secret for admin token.
|
||||
# Defaults to DS2API_ADMIN_KEY when unset.
|
||||
# DS2API_JWT_SECRET=change-me
|
||||
|
||||
# 方式3: 配置文件路径(本地开发推荐)
|
||||
# Optional admin JWT validity in hours (default: 24)
|
||||
# DS2API_JWT_EXPIRE_HOURS=24
|
||||
|
||||
# ---------------------------------------------------------------
|
||||
# Config source (choose one)
|
||||
# ---------------------------------------------------------------
|
||||
# Option A: config file path (local/dev recommended)
|
||||
# DS2API_CONFIG_PATH=config.json
|
||||
|
||||
# Option B: JSON string
|
||||
# DS2API_CONFIG_JSON={"keys":["your-api-key"],"accounts":[{"email":"user@example.com","password":"xxx","token":""}]}
|
||||
|
||||
# ===============================================================
|
||||
# 管理界面配置
|
||||
# ===============================================================
|
||||
# Option C: Base64 encoded JSON (recommended for Vercel env var)
|
||||
# DS2API_CONFIG_JSON=eyJrZXlzIjpbInlvdXItYXBpLWtleSJdLCJhY2NvdW50cyI6W3siZW1haWwiOiJ1c2VyQGV4YW1wbGUuY29tIiwicGFzc3dvcmQiOiJ4eHgiLCJ0b2tlbiI6IiJ9XX0=
|
||||
|
||||
# Admin API 密钥(Vercel 部署必填!)
|
||||
# 用于保护 WebUI 管理界面,首次访问 /admin 时需要输入此密钥登录
|
||||
DS2API_ADMIN_KEY=your-admin-secret-key
|
||||
|
||||
# JWT Token 过期时间(秒,默认 86400 = 24小时)
|
||||
# DS2API_SESSION_EXPIRE=86400
|
||||
|
||||
|
||||
# ===============================================================
|
||||
# Vercel 集成(可选)
|
||||
# ===============================================================
|
||||
|
||||
# Vercel API Token
|
||||
# 获取方式: https://vercel.com/account/tokens
|
||||
# VERCEL_TOKEN=your-vercel-token
|
||||
|
||||
# Vercel Project ID
|
||||
# 获取方式: Vercel 控制台 -> 项目设置 -> General -> Project ID
|
||||
# VERCEL_PROJECT_ID=prj_xxxxxxxxxxxx
|
||||
|
||||
# Vercel Team ID(个人项目无需填写,团队项目才需要)
|
||||
# VERCEL_TEAM_ID=
|
||||
|
||||
|
||||
# ===============================================================
|
||||
# 高级配置(可选)
|
||||
# ===============================================================
|
||||
|
||||
# Tokenizer 目录(留空使用项目根目录)
|
||||
# DS2API_TOKENIZER_DIR=
|
||||
|
||||
# 模板目录
|
||||
# DS2API_TEMPLATES_DIR=templates
|
||||
|
||||
# WASM 文件路径(PoW 计算用)
|
||||
# ---------------------------------------------------------------
|
||||
# Paths (optional)
|
||||
# ---------------------------------------------------------------
|
||||
# WASM file used for PoW solving
|
||||
# DS2API_WASM_PATH=sha3_wasm_bg.7b9ca65ddd.wasm
|
||||
|
||||
# Built admin static assets directory
|
||||
# DS2API_STATIC_ADMIN_DIR=static/admin
|
||||
|
||||
# Auto-build WebUI on startup when static/admin is missing.
|
||||
# Default: enabled on local/Docker, disabled on Vercel.
|
||||
# DS2API_AUTO_BUILD_WEBUI=true
|
||||
|
||||
# Internal auth secret used by the Vercel hybrid streaming path
|
||||
# (Go prepare endpoint <-> Node stream function).
|
||||
# Optional: falls back to DS2API_ADMIN_KEY when unset.
|
||||
# DS2API_VERCEL_INTERNAL_SECRET=change-me
|
||||
|
||||
# Stream lease TTL seconds for Vercel hybrid streaming.
|
||||
# During this window, the managed account stays occupied until Node calls release.
|
||||
# Default: 900 (15 minutes)
|
||||
# DS2API_VERCEL_STREAM_LEASE_TTL_SECONDS=900
|
||||
|
||||
# ---------------------------------------------------------------
|
||||
# Vercel sync integration (optional)
|
||||
# ---------------------------------------------------------------
|
||||
# VERCEL_TOKEN=your-vercel-token
|
||||
# VERCEL_PROJECT_ID=prj_xxxxxxxxxxxx
|
||||
# VERCEL_TEAM_ID=team_xxxxxxxxxxxx
|
||||
|
||||
# Optional: Vercel deployment protection bypass secret.
|
||||
# If deployment protection is enabled, DS2API will use this value as
|
||||
# x-vercel-protection-bypass for internal Node->Go calls on Vercel.
|
||||
# You can also use VERCEL_AUTOMATION_BYPASS_SECRET directly.
|
||||
# DS2API_VERCEL_PROTECTION_BYPASS=your-bypass-secret
|
||||
|
||||
116
.github/workflows/release-artifacts.yml
vendored
Normal file
116
.github/workflows/release-artifacts.yml
vendored
Normal file
@@ -0,0 +1,116 @@
|
||||
name: Release Artifacts
|
||||
|
||||
on:
|
||||
release:
|
||||
types:
|
||||
- published
|
||||
|
||||
permissions:
|
||||
contents: write
|
||||
packages: write
|
||||
|
||||
jobs:
|
||||
build-and-upload:
|
||||
runs-on: ubuntu-latest
|
||||
steps:
|
||||
- name: Checkout
|
||||
uses: actions/checkout@v4
|
||||
|
||||
- name: Setup Go
|
||||
uses: actions/setup-go@v5
|
||||
with:
|
||||
go-version: "1.24.x"
|
||||
|
||||
- name: Setup Node
|
||||
uses: actions/setup-node@v4
|
||||
with:
|
||||
node-version: "20"
|
||||
cache: "npm"
|
||||
cache-dependency-path: webui/package-lock.json
|
||||
|
||||
- name: Build WebUI
|
||||
run: |
|
||||
npm ci --prefix webui
|
||||
npm run build --prefix webui
|
||||
|
||||
- name: Build Multi-Platform Archives
|
||||
run: |
|
||||
set -euo pipefail
|
||||
TAG="${{ github.event.release.tag_name }}"
|
||||
mkdir -p dist
|
||||
|
||||
targets=(
|
||||
"linux/amd64"
|
||||
"linux/arm64"
|
||||
"darwin/amd64"
|
||||
"darwin/arm64"
|
||||
"windows/amd64"
|
||||
)
|
||||
|
||||
for target in "${targets[@]}"; do
|
||||
GOOS="${target%/*}"
|
||||
GOARCH="${target#*/}"
|
||||
PKG="ds2api_${TAG}_${GOOS}_${GOARCH}"
|
||||
STAGE="dist/${PKG}"
|
||||
BIN="ds2api"
|
||||
if [ "${GOOS}" = "windows" ]; then
|
||||
BIN="ds2api.exe"
|
||||
fi
|
||||
|
||||
mkdir -p "${STAGE}/static"
|
||||
CGO_ENABLED=0 GOOS="${GOOS}" GOARCH="${GOARCH}" \
|
||||
go build -trimpath -ldflags="-s -w" -o "${STAGE}/${BIN}" ./cmd/ds2api
|
||||
|
||||
cp config.example.json .env.example sha3_wasm_bg.7b9ca65ddd.wasm LICENSE README.MD README.en.md "${STAGE}/"
|
||||
cp -R static/admin "${STAGE}/static/admin"
|
||||
|
||||
if [ "${GOOS}" = "windows" ]; then
|
||||
(cd dist && zip -rq "${PKG}.zip" "${PKG}")
|
||||
else
|
||||
tar -C dist -czf "dist/${PKG}.tar.gz" "${PKG}"
|
||||
fi
|
||||
|
||||
rm -rf "${STAGE}"
|
||||
done
|
||||
|
||||
(cd dist && sha256sum *.tar.gz *.zip > sha256sums.txt)
|
||||
|
||||
- name: Upload Release Assets
|
||||
uses: softprops/action-gh-release@v2
|
||||
with:
|
||||
files: |
|
||||
dist/*.tar.gz
|
||||
dist/*.zip
|
||||
dist/sha256sums.txt
|
||||
|
||||
- name: Set up QEMU
|
||||
uses: docker/setup-qemu-action@v3
|
||||
|
||||
- name: Set up Docker Buildx
|
||||
uses: docker/setup-buildx-action@v3
|
||||
|
||||
- name: Log in to GHCR
|
||||
uses: docker/login-action@v3
|
||||
with:
|
||||
registry: ghcr.io
|
||||
username: ${{ github.actor }}
|
||||
password: ${{ secrets.GITHUB_TOKEN }}
|
||||
|
||||
- name: Extract Docker metadata
|
||||
id: meta
|
||||
uses: docker/metadata-action@v5
|
||||
with:
|
||||
images: ghcr.io/${{ github.repository }}
|
||||
tags: |
|
||||
type=raw,value=${{ github.event.release.tag_name }}
|
||||
type=raw,value=latest
|
||||
|
||||
- name: Build and Push Docker Image
|
||||
uses: docker/build-push-action@v6
|
||||
with:
|
||||
context: .
|
||||
file: ./Dockerfile
|
||||
push: true
|
||||
platforms: linux/amd64,linux/arm64
|
||||
tags: ${{ steps.meta.outputs.tags }}
|
||||
labels: ${{ steps.meta.outputs.labels }}
|
||||
9
.gitignore
vendored
9
.gitignore
vendored
@@ -45,6 +45,7 @@ env/
|
||||
*.log
|
||||
logs/
|
||||
uvicorn.log
|
||||
artifacts/
|
||||
|
||||
# Vercel
|
||||
.vercel
|
||||
@@ -64,8 +65,12 @@ pnpm-lock.yaml
|
||||
*.tsbuildinfo
|
||||
.cache/
|
||||
.parcel-cache/
|
||||
static/admin/*
|
||||
!static/admin/.gitkeep
|
||||
static/admin/
|
||||
internal/webui/assets/admin/
|
||||
|
||||
# Go compiled binaries
|
||||
ds2api
|
||||
ds2api-tests
|
||||
|
||||
# Environment
|
||||
.env.local
|
||||
|
||||
@@ -2,93 +2,134 @@
|
||||
|
||||
Language: [中文](CONTRIBUTING.md) | [English](CONTRIBUTING.en.md)
|
||||
|
||||
Thank you for contributing to DS2API!
|
||||
Thanks for your interest in contributing to DS2API!
|
||||
|
||||
## Development Setup
|
||||
|
||||
### Backend
|
||||
### Prerequisites
|
||||
|
||||
- Go 1.24+
|
||||
- Node.js 20+ (for WebUI development)
|
||||
- npm (bundled with Node.js)
|
||||
|
||||
### Backend Development
|
||||
|
||||
```bash
|
||||
# 1. Clone the repo
|
||||
# 1. Clone
|
||||
git clone https://github.com/CJackHwang/ds2api.git
|
||||
cd ds2api
|
||||
|
||||
# 2. Create a virtual environment (recommended)
|
||||
python -m venv venv
|
||||
source venv/bin/activate # Windows: venv\Scripts\activate
|
||||
|
||||
# 3. Install dependencies
|
||||
pip install -r requirements.txt
|
||||
|
||||
# 4. Configure
|
||||
# 2. Configure
|
||||
cp config.example.json config.json
|
||||
# Edit config.json
|
||||
# Edit config.json with test accounts
|
||||
|
||||
# 5. Run
|
||||
python dev.py
|
||||
# 3. Run backend
|
||||
go run ./cmd/ds2api
|
||||
# Default: http://localhost:5001
|
||||
```
|
||||
|
||||
### Frontend (WebUI)
|
||||
### Frontend Development (WebUI)
|
||||
|
||||
```bash
|
||||
# 1. Navigate to WebUI directory
|
||||
cd webui
|
||||
|
||||
# 2. Install dependencies
|
||||
npm install
|
||||
|
||||
# 3. Start dev server (hot reload)
|
||||
npm run dev
|
||||
# Default: http://localhost:5173, auto-proxies API to backend
|
||||
```
|
||||
|
||||
WebUI language packs live in `webui/src/locales/`. Add new locale JSON files there.
|
||||
WebUI tech stack:
|
||||
- React + Vite
|
||||
- Tailwind CSS
|
||||
- Bilingual language packs: `webui/src/locales/zh.json` / `en.json`
|
||||
|
||||
### Docker Development
|
||||
|
||||
```bash
|
||||
docker-compose -f docker-compose.dev.yml up
|
||||
```
|
||||
|
||||
## Code Standards
|
||||
|
||||
- **Python**: Follow PEP 8, use 4-space indentation
|
||||
- **JavaScript/React**: Use 4-space indentation and function components
|
||||
- **Commit messages**: Use semantic prefixes (e.g. `feat:`, `fix:`, `docs:`)
|
||||
| Language | Standards |
|
||||
| --- | --- |
|
||||
| **Go** | Run `gofmt` and ensure `go test ./...` passes before committing |
|
||||
| **JavaScript/React** | Follow existing project style (functional components) |
|
||||
| **Commit messages** | Use semantic prefixes: `feat:`, `fix:`, `docs:`, `refactor:`, `style:`, `perf:`, `chore:` |
|
||||
|
||||
## Submitting a PR
|
||||
|
||||
1. Fork this repo
|
||||
2. Create a feature branch (`git checkout -b feature/xxx`)
|
||||
3. Commit your changes (`git commit -m 'feat: add xxx'`)
|
||||
4. Push your branch (`git push origin feature/xxx`)
|
||||
1. Fork the repo
|
||||
2. Create a branch (e.g. `feature/xxx` or `fix/xxx`)
|
||||
3. Commit changes
|
||||
4. Push your branch
|
||||
5. Open a Pull Request
|
||||
|
||||
## WebUI Build
|
||||
> 💡 If you modify files under `webui/`, no manual build is needed — CI handles it automatically.
|
||||
|
||||
> **Important**: After modifying `webui/`, **no manual build is required**.
|
||||
## Build WebUI
|
||||
|
||||
When a PR is merged into `main`, GitHub Actions will automatically:
|
||||
1. Build the WebUI
|
||||
2. Commit build artifacts to `static/admin/`
|
||||
Manually build WebUI to `static/admin/`:
|
||||
|
||||
If you need a local build (for testing):
|
||||
```bash
|
||||
./scripts/build-webui.sh
|
||||
```
|
||||
|
||||
## Running Tests
|
||||
|
||||
```bash
|
||||
# Go unit tests
|
||||
go test ./...
|
||||
|
||||
# End-to-end live tests (real accounts)
|
||||
./scripts/testsuite/run-live.sh
|
||||
```
|
||||
|
||||
## Project Structure
|
||||
|
||||
```
|
||||
```text
|
||||
ds2api/
|
||||
├── app.py # FastAPI entrypoint
|
||||
├── dev.py # Development server
|
||||
├── core/ # Core modules
|
||||
│ ├── auth.py # Account auth & rotation
|
||||
│ ├── config.py # Configuration management
|
||||
│ ├── deepseek.py # DeepSeek API calls
|
||||
│ ├── models.py # Model definitions
|
||||
│ ├── pow.py # PoW calculations
|
||||
│ └── sse_parser.py # SSE parsing
|
||||
├── routes/ # API routes
|
||||
│ ├── openai.py # OpenAI-compatible endpoints
|
||||
│ ├── claude.py # Claude-compatible endpoints
|
||||
│ ├── home.py # Landing page routes
|
||||
│ └── admin/ # Admin endpoints
|
||||
├── webui/ # React WebUI source
|
||||
├── static/admin/ # WebUI build output (auto-generated)
|
||||
└── scripts/ # Helper scripts
|
||||
├── cmd/
|
||||
│ ├── ds2api/ # Local/container entrypoint
|
||||
│ └── ds2api-tests/ # End-to-end testsuite entrypoint
|
||||
├── api/
|
||||
│ ├── index.go # Vercel Serverless Go entry
|
||||
│ ├── chat-stream.js # Vercel Node.js stream relay
|
||||
│ └── helpers/ # Node.js helper modules
|
||||
├── internal/
|
||||
│ ├── account/ # Account pool and concurrency queue
|
||||
│ ├── adapter/
|
||||
│ │ ├── openai/ # OpenAI adapter
|
||||
│ │ └── claude/ # Claude adapter
|
||||
│ ├── admin/ # Admin API handlers
|
||||
│ ├── auth/ # Auth and JWT
|
||||
│ ├── config/ # Config loading and hot-reload
|
||||
│ ├── deepseek/ # DeepSeek client, PoW WASM
|
||||
│ ├── server/ # HTTP routing (chi router)
|
||||
│ ├── sse/ # SSE parsing utilities
|
||||
│ ├── testsuite/ # Testsuite core logic
|
||||
│ ├── util/ # Common utilities
|
||||
│ └── webui/ # WebUI static hosting
|
||||
├── webui/ # React WebUI source
|
||||
│ └── src/
|
||||
│ ├── components/ # Components
|
||||
│ └── locales/ # Language packs
|
||||
├── scripts/ # Build and test scripts
|
||||
├── static/admin/ # WebUI build output (not committed)
|
||||
├── Dockerfile # Multi-stage build
|
||||
├── docker-compose.yml # Production
|
||||
├── docker-compose.dev.yml # Development
|
||||
└── vercel.json # Vercel config
|
||||
```
|
||||
|
||||
## Reporting Issues
|
||||
|
||||
- Use [GitHub Issues](https://github.com/CJackHwang/ds2api/issues)
|
||||
- Provide detailed reproduction steps and logs
|
||||
Please use [GitHub Issues](https://github.com/CJackHwang/ds2api/issues) and include:
|
||||
|
||||
- Steps to reproduce
|
||||
- Relevant log output
|
||||
- Environment info (OS, Go version, deployment method)
|
||||
|
||||
139
CONTRIBUTING.md
139
CONTRIBUTING.md
@@ -2,93 +2,134 @@
|
||||
|
||||
语言 / Language: [中文](CONTRIBUTING.md) | [English](CONTRIBUTING.en.md)
|
||||
|
||||
感谢你对 DS2API 的贡献!
|
||||
感谢你对 DS2API 的关注与贡献!
|
||||
|
||||
## 开发环境设置
|
||||
|
||||
### 后端
|
||||
### 前置要求
|
||||
|
||||
- Go 1.24+
|
||||
- Node.js 20+(WebUI 开发时)
|
||||
- npm(随 Node.js 提供)
|
||||
|
||||
### 后端开发
|
||||
|
||||
```bash
|
||||
# 1. 克隆仓库
|
||||
git clone https://github.com/CJackHwang/ds2api.git
|
||||
cd ds2api
|
||||
|
||||
# 2. 创建虚拟环境(推荐)
|
||||
python -m venv venv
|
||||
source venv/bin/activate # Windows: venv\Scripts\activate
|
||||
|
||||
# 3. 安装依赖
|
||||
pip install -r requirements.txt
|
||||
|
||||
# 4. 配置
|
||||
# 2. 配置
|
||||
cp config.example.json config.json
|
||||
# 编辑 config.json
|
||||
# 编辑 config.json,填入测试账号
|
||||
|
||||
# 5. 启动
|
||||
python dev.py
|
||||
# 3. 启动后端
|
||||
go run ./cmd/ds2api
|
||||
# 默认监听 http://localhost:5001
|
||||
```
|
||||
|
||||
### 前端 (WebUI)
|
||||
### 前端开发(WebUI)
|
||||
|
||||
```bash
|
||||
# 1. 进入 WebUI 目录
|
||||
cd webui
|
||||
|
||||
# 2. 安装依赖
|
||||
npm install
|
||||
|
||||
# 3. 启动开发服务器(热更新)
|
||||
npm run dev
|
||||
# 默认监听 http://localhost:5173,自动代理 API 到后端
|
||||
```
|
||||
|
||||
WebUI 语言包位于 `webui/src/locales/`,新增语言请在此处添加对应 JSON 文件。
|
||||
WebUI 技术栈:
|
||||
- React + Vite
|
||||
- Tailwind CSS
|
||||
- 中英文语言包:`webui/src/locales/zh.json` / `en.json`
|
||||
|
||||
### Docker 开发环境
|
||||
|
||||
```bash
|
||||
docker-compose -f docker-compose.dev.yml up
|
||||
```
|
||||
|
||||
## 代码规范
|
||||
|
||||
- **Python**: 遵循 PEP 8,使用 4 空格缩进
|
||||
- **JavaScript/React**: 使用 4 空格缩进,使用函数组件
|
||||
- **提交信息**: 使用语义化提交格式(如 `feat:`, `fix:`, `docs:`)
|
||||
| 语言 | 规范 |
|
||||
| --- | --- |
|
||||
| **Go** | 提交前运行 `gofmt`,确保 `go test ./...` 通过 |
|
||||
| **JavaScript/React** | 保持现有代码风格(函数组件) |
|
||||
| **提交信息** | 使用语义化前缀:`feat:`、`fix:`、`docs:`、`refactor:`、`style:`、`perf:`、`chore:` |
|
||||
|
||||
## 提交 PR
|
||||
|
||||
1. Fork 本仓库
|
||||
2. 创建功能分支 (`git checkout -b feature/xxx`)
|
||||
3. 提交更改 (`git commit -m 'feat: 添加xxx功能'`)
|
||||
4. 推送分支 (`git push origin feature/xxx`)
|
||||
5. 创建 Pull Request
|
||||
1. Fork 仓库
|
||||
2. 创建分支(如 `feature/xxx` 或 `fix/xxx`)
|
||||
3. 提交更改
|
||||
4. 推送分支
|
||||
5. 发起 Pull Request
|
||||
|
||||
> 💡 如果修改了 `webui/` 目录下的文件,无需手动构建——CI 会自动处理。
|
||||
|
||||
## WebUI 构建
|
||||
|
||||
> **重要**: 修改 `webui/` 目录后 **无需手动构建**!
|
||||
手动构建 WebUI 到 `static/admin/`:
|
||||
|
||||
当 PR 合并到 `main` 分支后,GitHub Actions 会自动:
|
||||
1. 构建 WebUI
|
||||
2. 提交构建产物到 `static/admin/`
|
||||
|
||||
如果需要本地构建(测试用):
|
||||
```bash
|
||||
./scripts/build-webui.sh
|
||||
```
|
||||
|
||||
## 运行测试
|
||||
|
||||
```bash
|
||||
# Go 单元测试
|
||||
go test ./...
|
||||
|
||||
# 端到端全链路测试(真实账号)
|
||||
./scripts/testsuite/run-live.sh
|
||||
```
|
||||
|
||||
## 项目结构
|
||||
|
||||
```
|
||||
```text
|
||||
ds2api/
|
||||
├── app.py # FastAPI 应用入口
|
||||
├── dev.py # 开发服务器
|
||||
├── core/ # 核心模块
|
||||
│ ├── auth.py # 账号认证与轮询
|
||||
│ ├── config.py # 配置管理
|
||||
│ ├── deepseek.py # DeepSeek API 调用
|
||||
│ ├── models.py # 模型定义
|
||||
│ ├── pow.py # PoW 计算
|
||||
│ └── sse_parser.py # SSE 解析
|
||||
├── routes/ # API 路由
|
||||
│ ├── openai.py # OpenAI 兼容接口
|
||||
│ ├── claude.py # Claude 兼容接口
|
||||
│ ├── home.py # 首页路由
|
||||
│ └── admin/ # 管理接口
|
||||
├── webui/ # React WebUI 源码
|
||||
├── static/admin/ # WebUI 构建产物(自动生成)
|
||||
└── scripts/ # 辅助脚本
|
||||
├── cmd/
|
||||
│ ├── ds2api/ # 本地/容器启动入口
|
||||
│ └── ds2api-tests/ # 端到端测试集入口
|
||||
├── api/
|
||||
│ ├── index.go # Vercel Serverless Go 入口
|
||||
│ ├── chat-stream.js # Vercel Node.js 流式转发
|
||||
│ └── helpers/ # Node.js 辅助模块
|
||||
├── internal/
|
||||
│ ├── account/ # 账号池与并发队列
|
||||
│ ├── adapter/
|
||||
│ │ ├── openai/ # OpenAI 兼容适配器
|
||||
│ │ └── claude/ # Claude 兼容适配器
|
||||
│ ├── admin/ # Admin API handlers
|
||||
│ ├── auth/ # 鉴权与 JWT
|
||||
│ ├── config/ # 配置加载与热更新
|
||||
│ ├── deepseek/ # DeepSeek 客户端、PoW WASM
|
||||
│ ├── server/ # HTTP 路由(chi router)
|
||||
│ ├── sse/ # SSE 解析工具
|
||||
│ ├── testsuite/ # 测试集核心逻辑
|
||||
│ ├── util/ # 通用工具
|
||||
│ └── webui/ # WebUI 静态托管
|
||||
├── webui/ # React WebUI 源码
|
||||
│ └── src/
|
||||
│ ├── components/ # 组件
|
||||
│ └── locales/ # 语言包
|
||||
├── scripts/ # 构建与测试脚本
|
||||
├── static/admin/ # WebUI 构建产物(不提交)
|
||||
├── Dockerfile # 多阶段构建
|
||||
├── docker-compose.yml # 生产环境
|
||||
├── docker-compose.dev.yml # 开发环境
|
||||
└── vercel.json # Vercel 配置
|
||||
```
|
||||
|
||||
## 问题反馈
|
||||
|
||||
- 使用 [GitHub Issues](https://github.com/CJackHwang/ds2api/issues) 报告问题
|
||||
- 提供详细的复现步骤和日志信息
|
||||
请使用 [GitHub Issues](https://github.com/CJackHwang/ds2api/issues) 并附上:
|
||||
|
||||
- 复现步骤
|
||||
- 相关日志输出
|
||||
- 运行环境信息(OS、Go 版本、部署方式)
|
||||
|
||||
675
DEPLOY.en.md
675
DEPLOY.en.md
@@ -2,409 +2,492 @@
|
||||
|
||||
Language: [中文](DEPLOY.md) | [English](DEPLOY.en.md)
|
||||
|
||||
This document covers all supported DS2API deployment methods.
|
||||
This guide covers all deployment methods for the current Go-based codebase.
|
||||
|
||||
---
|
||||
|
||||
## Table of Contents
|
||||
|
||||
- [Vercel Deployment (Recommended)](#vercel-deployment-recommended)
|
||||
- [Docker Deployment (Recommended)](#docker-deployment-recommended)
|
||||
- [Local Development](#local-development)
|
||||
- [Production Deployment](#production-deployment)
|
||||
- [FAQ](#faq)
|
||||
- [Prerequisites](#0-prerequisites)
|
||||
- [1. Local Run](#1-local-run)
|
||||
- [2. Docker Deployment](#2-docker-deployment)
|
||||
- [3. Vercel Deployment](#3-vercel-deployment)
|
||||
- [4. Download Release Binaries](#4-download-release-binaries)
|
||||
- [5. Reverse Proxy (Nginx)](#5-reverse-proxy-nginx)
|
||||
- [6. Linux systemd Service](#6-linux-systemd-service)
|
||||
- [7. Post-Deploy Checks](#7-post-deploy-checks)
|
||||
- [8. Pre-Release Local Regression](#8-pre-release-local-regression)
|
||||
|
||||
---
|
||||
|
||||
## Vercel Deployment (Recommended)
|
||||
## 0. Prerequisites
|
||||
|
||||
### One-click deployment
|
||||
| Dependency | Minimum Version | Notes |
|
||||
| --- | --- | --- |
|
||||
| Go | 1.24+ | Build backend |
|
||||
| Node.js | 20+ | Only needed to build WebUI locally |
|
||||
| npm | Bundled with Node.js | Install WebUI dependencies |
|
||||
|
||||
[](https://vercel.com/new/clone?repository-url=https%3A%2F%2Fgithub.com%2FCJackHwang%2Fds2api&env=DS2API_ADMIN_KEY&envDescription=Admin%20console%20access%20key%20%28required%29&envLink=https%3A%2F%2Fgithub.com%2FCJackHwang%2Fds2api%23environment-variables&project-name=ds2api&repository-name=ds2api)
|
||||
Config source (choose one):
|
||||
|
||||
### Steps
|
||||
|
||||
1. **Click the deploy button**
|
||||
- Sign in to GitHub
|
||||
- Authorize Vercel access
|
||||
|
||||
2. **Set environment variables**
|
||||
- `DS2API_ADMIN_KEY`: Admin console password (**required**)
|
||||
|
||||
3. **Wait for deployment**
|
||||
- Vercel builds and deploys automatically
|
||||
- You will receive a deployment URL
|
||||
|
||||
4. **Configure accounts**
|
||||
- Visit `https://your-project.vercel.app/admin`
|
||||
- Log in with the admin key
|
||||
- Add DeepSeek accounts
|
||||
- Set custom API keys
|
||||
|
||||
5. **Sync configuration**
|
||||
- Click "Sync to Vercel"
|
||||
- The first sync requires a Vercel token and project ID
|
||||
- After sync, the configuration is persisted
|
||||
|
||||
### Get Vercel credentials
|
||||
|
||||
**Vercel token**:
|
||||
1. Visit https://vercel.com/account/tokens
|
||||
2. Click "Create Token"
|
||||
3. Set a name and expiration
|
||||
4. Copy the token
|
||||
|
||||
**Project ID**:
|
||||
1. Open your Vercel project
|
||||
2. Go to Settings → General
|
||||
3. Copy the "Project ID"
|
||||
- **File**: `config.json` (recommended for local/Docker)
|
||||
- **Environment variable**: `DS2API_CONFIG_JSON` (recommended for Vercel; supports raw JSON or Base64)
|
||||
|
||||
---
|
||||
|
||||
## Local Development
|
||||
## 1. Local Run
|
||||
|
||||
### Requirements
|
||||
|
||||
- Python 3.9+
|
||||
- Node.js 18+ (WebUI development)
|
||||
- pip
|
||||
|
||||
### Quick start
|
||||
### 1.1 Basic Steps
|
||||
|
||||
```bash
|
||||
# 1. Clone the repo
|
||||
# Clone
|
||||
git clone https://github.com/CJackHwang/ds2api.git
|
||||
cd ds2api
|
||||
|
||||
# 2. Install Python dependencies
|
||||
pip install -r requirements.txt
|
||||
|
||||
# 3. Configure accounts
|
||||
# Copy and edit config
|
||||
cp config.example.json config.json
|
||||
# Edit config.json and fill in DeepSeek account info
|
||||
# Open config.json and fill in:
|
||||
# - keys: your API access keys
|
||||
# - accounts: DeepSeek accounts (email or mobile + password)
|
||||
|
||||
# 4. Start the service
|
||||
python dev.py
|
||||
# Start
|
||||
go run ./cmd/ds2api
|
||||
```
|
||||
|
||||
### Config example
|
||||
Default address: `http://0.0.0.0:5001` (override with `PORT`).
|
||||
|
||||
```json
|
||||
{
|
||||
"keys": ["my-api-key-1", "my-api-key-2"],
|
||||
"accounts": [
|
||||
{
|
||||
"email": "your-email@example.com",
|
||||
"password": "your-password",
|
||||
"token": ""
|
||||
},
|
||||
{
|
||||
"mobile": "12345678901",
|
||||
"password": "your-password",
|
||||
"token": ""
|
||||
}
|
||||
]
|
||||
}
|
||||
```
|
||||
### 1.2 WebUI Build
|
||||
|
||||
**Notes**:
|
||||
- `keys`: Custom API keys for calling the service
|
||||
- `accounts`: DeepSeek Web accounts
|
||||
- Supports `email` or `mobile` login
|
||||
- Leave `token` blank; it will be fetched automatically
|
||||
On first local startup, if `static/admin/` is missing, DS2API will automatically attempt to build the WebUI (requires Node.js/npm).
|
||||
|
||||
### WebUI development
|
||||
Manual build:
|
||||
|
||||
```bash
|
||||
# Enter the WebUI directory
|
||||
cd webui
|
||||
|
||||
# Install dependencies
|
||||
npm install
|
||||
|
||||
# Start the dev server
|
||||
npm run dev
|
||||
```
|
||||
|
||||
The WebUI dev server runs on `http://localhost:5173` and proxies API requests to `http://localhost:5001`.
|
||||
|
||||
### WebUI build
|
||||
|
||||
Build artifacts are located in `static/admin/`.
|
||||
|
||||
**Automatic build (recommended)**:
|
||||
- Vercel builds the WebUI during deployment (see `vercel.json` `buildCommand`)
|
||||
- The GitHub Actions WebUI build workflow is disabled
|
||||
- `static/admin/` build artifacts are no longer committed
|
||||
|
||||
**Manual build**:
|
||||
```bash
|
||||
# Option 1: use script
|
||||
./scripts/build-webui.sh
|
||||
```
|
||||
|
||||
# Option 2: run directly
|
||||
Or step by step:
|
||||
|
||||
```bash
|
||||
cd webui
|
||||
npm install
|
||||
npm run build
|
||||
# Output goes to static/admin/
|
||||
```
|
||||
|
||||
> **Contributor note**: No manual build is required after modifying WebUI; Vercel deploys will build it automatically.
|
||||
Control auto-build via environment variable:
|
||||
|
||||
```bash
|
||||
# Disable auto-build
|
||||
DS2API_AUTO_BUILD_WEBUI=false go run ./cmd/ds2api
|
||||
|
||||
# Force enable auto-build
|
||||
DS2API_AUTO_BUILD_WEBUI=true go run ./cmd/ds2api
|
||||
```
|
||||
|
||||
### 1.3 Compile to Binary
|
||||
|
||||
```bash
|
||||
go build -o ds2api ./cmd/ds2api
|
||||
./ds2api
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Docker Deployment (Recommended)
|
||||
## 2. Docker Deployment
|
||||
|
||||
Docker uses a **non-invasive, decoupled design**:
|
||||
- Dockerfile executes standard Python steps and avoids hardcoded project configs
|
||||
- WebUI is built during image build (for non-Vercel deployments)
|
||||
- Configuration lives in environment variables and `.env`
|
||||
- **Rebuild the image to update code without touching Docker config**
|
||||
|
||||
### Quick start (Docker Compose)
|
||||
### 2.1 Basic Steps
|
||||
|
||||
```bash
|
||||
# 1. Copy the environment template
|
||||
# Copy and edit environment
|
||||
cp .env.example .env
|
||||
# Edit .env with DS2API_ADMIN_KEY and DS2API_CONFIG_JSON
|
||||
# Edit .env, at minimum set:
|
||||
# DS2API_ADMIN_KEY=your-admin-key
|
||||
# DS2API_CONFIG_JSON={"keys":[...],"accounts":[...]}
|
||||
|
||||
# 2. Start the service
|
||||
# Start
|
||||
docker-compose up -d
|
||||
|
||||
# 3. Check logs
|
||||
# View logs
|
||||
docker-compose logs -f
|
||||
```
|
||||
|
||||
# 4. Rebuild after code updates
|
||||
### 2.2 Update
|
||||
|
||||
```bash
|
||||
docker-compose up -d --build
|
||||
```
|
||||
|
||||
### Mount a config file
|
||||
### 2.3 Docker Architecture
|
||||
|
||||
To use `config.json` instead of environment variables:
|
||||
The `Dockerfile` uses a three-stage build:
|
||||
|
||||
```yaml
|
||||
# docker-compose.yml
|
||||
services:
|
||||
ds2api:
|
||||
build: .
|
||||
ports:
|
||||
- "5001:5001"
|
||||
environment:
|
||||
- DS2API_ADMIN_KEY=your-admin-key
|
||||
volumes:
|
||||
- ./config.json:/app/config.json:ro
|
||||
restart: unless-stopped
|
||||
```
|
||||
1. **WebUI build stage**: `node:20` image, runs `npm ci && npm run build`
|
||||
2. **Go build stage**: `golang:1.24` image, compiles the binary
|
||||
3. **Runtime stage**: `debian:bookworm-slim` minimal image
|
||||
|
||||
### Docker CLI deployment
|
||||
Container entry command: `/usr/local/bin/ds2api`, default exposed port: `5001`.
|
||||
|
||||
### 2.4 Development Mode
|
||||
|
||||
```bash
|
||||
# Build the image
|
||||
docker build -t ds2api:latest .
|
||||
|
||||
# Run with env variables
|
||||
docker run -d \
|
||||
--name ds2api \
|
||||
-p 5001:5001 \
|
||||
-e DS2API_ADMIN_KEY=your-admin-key \
|
||||
-e DS2API_CONFIG_JSON='{"keys":["api-key"],"accounts":[...]}' \
|
||||
--restart unless-stopped \
|
||||
ds2api:latest
|
||||
|
||||
# Or mount a config file
|
||||
docker run -d \
|
||||
--name ds2api \
|
||||
-p 5001:5001 \
|
||||
-e DS2API_ADMIN_KEY=your-admin-key \
|
||||
-v $(pwd)/config.json:/app/config.json:ro \
|
||||
--restart unless-stopped \
|
||||
ds2api:latest
|
||||
```
|
||||
|
||||
### Development mode (hot reload)
|
||||
|
||||
```bash
|
||||
# Use the dev compose file to enable hot reload
|
||||
docker-compose -f docker-compose.dev.yml up
|
||||
```
|
||||
|
||||
Development mode:
|
||||
- Source code is mounted into the container
|
||||
- Log level set to DEBUG
|
||||
- Reads local `config.json`
|
||||
Development features:
|
||||
- Source code mounted (live changes)
|
||||
- `LOG_LEVEL=DEBUG`
|
||||
- No auto-restart
|
||||
|
||||
### Maintenance commands
|
||||
### 2.5 Health Check
|
||||
|
||||
Docker Compose includes a built-in health check:
|
||||
|
||||
```yaml
|
||||
healthcheck:
|
||||
test: ["CMD", "wget", "-qO-", "http://localhost:${PORT:-5001}/healthz"]
|
||||
interval: 30s
|
||||
timeout: 10s
|
||||
retries: 3
|
||||
start_period: 10s
|
||||
```
|
||||
|
||||
### 2.6 Docker Troubleshooting
|
||||
|
||||
If container logs look normal but the admin panel is unreachable, check these first:
|
||||
|
||||
1. **Port alignment**: when `PORT` is not `5001`, use the same port in your URL (for example `http://localhost:8080/admin`).
|
||||
2. **WebUI assets in dev compose**: `docker-compose.dev.yml` runs `go run` in a dev image and does not auto-install Node.js inside the container; if `static/admin` is missing in your repo, `/admin` will return 404. Build once on host: `./scripts/build-webui.sh`.
|
||||
|
||||
---
|
||||
|
||||
## 3. Vercel Deployment
|
||||
|
||||
### 3.1 Steps
|
||||
|
||||
1. **Fork** the repo to your GitHub account
|
||||
2. **Import** the project on Vercel
|
||||
3. **Set environment variables** (at minimum):
|
||||
|
||||
| Variable | Description |
|
||||
| --- | --- |
|
||||
| `DS2API_ADMIN_KEY` | Admin key (required) |
|
||||
| `DS2API_CONFIG_JSON` | Config content, raw JSON or Base64 (required) |
|
||||
|
||||
4. **Deploy**
|
||||
|
||||
### 3.2 Optional Environment Variables
|
||||
|
||||
| Variable | Description | Default |
|
||||
| --- | --- | --- |
|
||||
| `DS2API_ACCOUNT_MAX_INFLIGHT` | Per-account inflight limit | `2` |
|
||||
| `DS2API_ACCOUNT_CONCURRENCY` | Alias (legacy compat) | — |
|
||||
| `DS2API_ACCOUNT_MAX_QUEUE` | Waiting queue limit | `recommended_concurrency` |
|
||||
| `DS2API_ACCOUNT_QUEUE_SIZE` | Alias (legacy compat) | — |
|
||||
| `DS2API_VERCEL_INTERNAL_SECRET` | Hybrid streaming internal auth | Falls back to `DS2API_ADMIN_KEY` |
|
||||
| `DS2API_VERCEL_STREAM_LEASE_TTL_SECONDS` | Stream lease TTL | `900` |
|
||||
| `VERCEL_TOKEN` | Vercel sync token | — |
|
||||
| `VERCEL_PROJECT_ID` | Vercel project ID | — |
|
||||
| `VERCEL_TEAM_ID` | Vercel team ID | — |
|
||||
| `DS2API_VERCEL_PROTECTION_BYPASS` | Deployment protection bypass for internal Node→Go calls | — |
|
||||
|
||||
### 3.3 Vercel Architecture
|
||||
|
||||
```text
|
||||
Request ──────┐
|
||||
│
|
||||
▼
|
||||
vercel.json routing
|
||||
│
|
||||
┌─────┴─────┐
|
||||
│ │
|
||||
▼ ▼
|
||||
api/index.go api/chat-stream.js
|
||||
(Go Runtime) (Node Runtime)
|
||||
```
|
||||
|
||||
- **Go entry**: `api/index.go` (Serverless Go)
|
||||
- **Stream entry**: `api/chat-stream.js` (Node Runtime for real-time SSE)
|
||||
- **Routing**: `vercel.json`
|
||||
- **Build command**: `npm ci --prefix webui && npm run build --prefix webui` (automatic)
|
||||
|
||||
#### Streaming Pipeline
|
||||
|
||||
Vercel Go Runtime applies platform-level response buffering, so this project uses a hybrid "**Go prepare + Node stream**" path on Vercel:
|
||||
|
||||
1. `api/chat-stream.js` receives `/v1/chat/completions` request
|
||||
2. Node calls Go internal prepare endpoint (`?__stream_prepare=1`) for session ID, PoW, token
|
||||
3. Go prepare creates a stream lease, locking the account
|
||||
4. Node connects directly to DeepSeek upstream, relays SSE in real-time to client (including OpenAI chunk framing and tools anti-leak sieve)
|
||||
5. After stream ends, Node calls Go release endpoint (`?__stream_release=1`) to free the account
|
||||
|
||||
> This adaptation is **Vercel-only**; local and Docker remain pure Go.
|
||||
|
||||
#### Non-Stream Fallback and Tool Call Handling
|
||||
|
||||
- `api/chat-stream.js` falls back to Go entry (`?__go=1`) for non-stream requests only
|
||||
- Streaming requests (including requests with `tools`) stay on the Node path and use Go-aligned tool-call anti-leak handling
|
||||
- WebUI non-stream test calls `?__go=1` directly to avoid Node hop timeout on long requests
|
||||
|
||||
#### Function Duration
|
||||
|
||||
`vercel.json` sets `maxDuration: 300` for both `api/chat-stream.js` and `api/index.go` (subject to your Vercel plan limits).
|
||||
|
||||
### 3.4 Vercel Troubleshooting
|
||||
|
||||
#### Go Build Failure
|
||||
|
||||
```text
|
||||
Error: Command failed: go build -ldflags -s -w -o .../bootstrap ...
|
||||
```
|
||||
|
||||
**Cause**: Invalid Go build flag settings in Vercel (`-ldflags` not passed as a single argument).
|
||||
|
||||
**Fix**:
|
||||
|
||||
1. Open Vercel Project Settings → Build and Development Settings
|
||||
2. **Clear** custom Go Build Flags / Build Command (recommended)
|
||||
3. If ldflags must be used, set `-ldflags="-s -w"` (ensure it's one argument)
|
||||
4. Verify `go.mod` uses a supported version (currently `go 1.24`)
|
||||
5. Redeploy (recommended: clear cache)
|
||||
|
||||
#### Internal Package Import Error
|
||||
|
||||
```text
|
||||
use of internal package ds2api/internal/server not allowed
|
||||
```
|
||||
|
||||
**Cause**: Vercel Go entrypoint directly imports `internal/...`.
|
||||
|
||||
**Fix**: This repo uses a public bridge package: `api/index.go` → `ds2api/app` → `internal/server`.
|
||||
|
||||
#### Output Directory Error
|
||||
|
||||
```text
|
||||
No Output Directory named "public" found after the Build completed.
|
||||
```
|
||||
|
||||
**Fix**: This repo uses `static` as output directory (`"outputDirectory": "static"` in `vercel.json`). If you manually changed Output Directory in Project Settings, set it to `static` or clear it.
|
||||
|
||||
#### Deployment Protection Blocking
|
||||
|
||||
If API responses return Vercel HTML `Authentication Required`:
|
||||
|
||||
- **Option A**: Disable Deployment Protection for that environment (recommended for public APIs)
|
||||
- **Option B**: Add `x-vercel-protection-bypass` header to requests
|
||||
- **Option C**: Set `VERCEL_AUTOMATION_BYPASS_SECRET` (or `DS2API_VERCEL_PROTECTION_BYPASS`) for internal Node→Go calls
|
||||
|
||||
### 3.5 Build Artifacts Not Committed
|
||||
|
||||
- `static/admin` directory is not in Git
|
||||
- Vercel / Docker automatically generate WebUI assets during build
|
||||
|
||||
---
|
||||
|
||||
## 4. Download Release Binaries
|
||||
|
||||
Built-in GitHub Actions workflow: `.github/workflows/release-artifacts.yml`
|
||||
|
||||
- **Trigger**: only on Release `published` (no build on normal push)
|
||||
- **Outputs**: multi-platform binary archives + `sha256sums.txt`
|
||||
|
||||
| Platform | Architecture | Format |
|
||||
| --- | --- | --- |
|
||||
| Linux | amd64, arm64 | `.tar.gz` |
|
||||
| macOS | amd64, arm64 | `.tar.gz` |
|
||||
| Windows | amd64 | `.zip` |
|
||||
|
||||
Each archive includes:
|
||||
|
||||
- `ds2api` executable (`ds2api.exe` on Windows)
|
||||
- `static/admin/` (built WebUI assets)
|
||||
- `sha3_wasm_bg.7b9ca65ddd.wasm`
|
||||
- `config.example.json`, `.env.example`
|
||||
- `README.MD`, `README.en.md`, `LICENSE`
|
||||
|
||||
### Usage
|
||||
|
||||
```bash
|
||||
# Check container status
|
||||
docker-compose ps
|
||||
# 1. Download the archive for your platform
|
||||
# 2. Extract
|
||||
tar -xzf ds2api_v1.7.0_linux_amd64.tar.gz
|
||||
cd ds2api_v1.7.0_linux_amd64
|
||||
|
||||
# View logs
|
||||
docker-compose logs -f ds2api
|
||||
# 3. Configure
|
||||
cp config.example.json config.json
|
||||
# Edit config.json
|
||||
|
||||
# Restart
|
||||
docker-compose restart
|
||||
# 4. Start
|
||||
./ds2api
|
||||
```
|
||||
|
||||
# Stop
|
||||
docker-compose down
|
||||
### Maintainer Release Flow
|
||||
|
||||
# Full rebuild (clear cache)
|
||||
docker-compose down
|
||||
docker-compose build --no-cache
|
||||
docker-compose up -d
|
||||
1. Create and publish a GitHub Release (with tag, e.g. `v1.7.0`)
|
||||
2. Wait for the `Release Artifacts` workflow to complete
|
||||
3. Download the matching archive from Release Assets
|
||||
|
||||
---
|
||||
|
||||
## 5. Reverse Proxy (Nginx)
|
||||
|
||||
When deploying behind Nginx, **you must disable buffering** for SSE streaming to work:
|
||||
|
||||
```nginx
|
||||
location / {
|
||||
proxy_pass http://127.0.0.1:5001;
|
||||
proxy_http_version 1.1;
|
||||
proxy_set_header Connection "";
|
||||
proxy_buffering off;
|
||||
proxy_cache off;
|
||||
chunked_transfer_encoding on;
|
||||
tcp_nodelay on;
|
||||
}
|
||||
```
|
||||
|
||||
For HTTPS, add SSL at the Nginx layer:
|
||||
|
||||
```nginx
|
||||
server {
|
||||
listen 443 ssl;
|
||||
server_name api.example.com;
|
||||
|
||||
ssl_certificate /path/to/cert.pem;
|
||||
ssl_certificate_key /path/to/key.pem;
|
||||
|
||||
location / {
|
||||
proxy_pass http://127.0.0.1:5001;
|
||||
proxy_http_version 1.1;
|
||||
proxy_set_header Connection "";
|
||||
proxy_set_header Host $host;
|
||||
proxy_set_header X-Real-IP $remote_addr;
|
||||
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
|
||||
proxy_set_header X-Forwarded-Proto $scheme;
|
||||
proxy_buffering off;
|
||||
proxy_cache off;
|
||||
chunked_transfer_encoding on;
|
||||
tcp_nodelay on;
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Production Deployment
|
||||
## 6. Linux systemd Service
|
||||
|
||||
### Using systemd (Linux)
|
||||
|
||||
1. **Create the service file**
|
||||
### 6.1 Installation
|
||||
|
||||
```bash
|
||||
sudo nano /etc/systemd/system/ds2api.service
|
||||
# Copy compiled binary and related files to target directory
|
||||
sudo mkdir -p /opt/ds2api
|
||||
sudo cp ds2api config.json sha3_wasm_bg.7b9ca65ddd.wasm /opt/ds2api/
|
||||
sudo cp -r static/admin /opt/ds2api/static/admin
|
||||
```
|
||||
|
||||
### 6.2 Create systemd Service File
|
||||
|
||||
```ini
|
||||
# /etc/systemd/system/ds2api.service
|
||||
|
||||
[Unit]
|
||||
Description=DS2API Service
|
||||
Description=DS2API (Go)
|
||||
After=network.target
|
||||
|
||||
[Service]
|
||||
Type=simple
|
||||
User=www-data
|
||||
WorkingDirectory=/opt/ds2api
|
||||
ExecStart=/usr/bin/python3 app.py
|
||||
Restart=always
|
||||
RestartSec=10
|
||||
Environment=PORT=5001
|
||||
Environment=DS2API_ADMIN_KEY=your-admin-key
|
||||
Environment=DS2API_CONFIG_PATH=/opt/ds2api/config.json
|
||||
Environment=DS2API_ADMIN_KEY=your-admin-key-here
|
||||
ExecStart=/opt/ds2api/ds2api
|
||||
Restart=always
|
||||
RestartSec=5
|
||||
|
||||
[Install]
|
||||
WantedBy=multi-user.target
|
||||
```
|
||||
|
||||
2. **Start the service**
|
||||
### 6.3 Common Commands
|
||||
|
||||
```bash
|
||||
# Reload service config
|
||||
sudo systemctl daemon-reload
|
||||
|
||||
# Enable on boot
|
||||
sudo systemctl enable ds2api
|
||||
|
||||
# Start
|
||||
sudo systemctl start ds2api
|
||||
```
|
||||
|
||||
3. **Check status**
|
||||
|
||||
```bash
|
||||
# Check status
|
||||
sudo systemctl status ds2api
|
||||
|
||||
# View logs
|
||||
sudo journalctl -u ds2api -f
|
||||
```
|
||||
|
||||
### Nginx reverse proxy
|
||||
# Restart
|
||||
sudo systemctl restart ds2api
|
||||
|
||||
```nginx
|
||||
server {
|
||||
listen 80;
|
||||
server_name api.yourdomain.com;
|
||||
|
||||
# SSL configuration (recommended)
|
||||
# listen 443 ssl http2;
|
||||
# ssl_certificate /path/to/cert.pem;
|
||||
# ssl_certificate_key /path/to/key.pem;
|
||||
|
||||
location / {
|
||||
proxy_pass http://127.0.0.1:5001;
|
||||
proxy_http_version 1.1;
|
||||
|
||||
# Disable buffering for SSE
|
||||
proxy_buffering off;
|
||||
proxy_cache off;
|
||||
|
||||
# Connection settings
|
||||
proxy_set_header Host $host;
|
||||
proxy_set_header X-Real-IP $remote_addr;
|
||||
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
|
||||
proxy_set_header X-Forwarded-Proto $scheme;
|
||||
|
||||
# SSE timeouts
|
||||
proxy_read_timeout 300s;
|
||||
proxy_send_timeout 300s;
|
||||
|
||||
# Chunked transfer
|
||||
chunked_transfer_encoding on;
|
||||
tcp_nopush on;
|
||||
tcp_nodelay on;
|
||||
keepalive_timeout 120;
|
||||
}
|
||||
}
|
||||
# Stop
|
||||
sudo systemctl stop ds2api
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## FAQ
|
||||
## 7. Post-Deploy Checks
|
||||
|
||||
### Q: What if account validation fails?
|
||||
After deployment (any method), verify in order:
|
||||
|
||||
**A**: Check the following:
|
||||
1. Confirm the DeepSeek account password is correct
|
||||
2. Ensure the account is not banned or requires verification
|
||||
3. Log in once in a browser
|
||||
4. Check logs for detailed errors
|
||||
|
||||
### Q: Streaming responses disconnect?
|
||||
|
||||
**A**:
|
||||
1. Check Nginx / reverse proxy config and ensure `proxy_buffering` is off
|
||||
2. Increase `proxy_read_timeout`
|
||||
3. Verify network stability
|
||||
|
||||
### Q: Configuration lost after Vercel deploy?
|
||||
|
||||
**A**:
|
||||
1. Ensure you clicked "Sync to Vercel"
|
||||
2. Verify the Vercel token is valid and unexpired
|
||||
3. Ensure the project ID is correct
|
||||
|
||||
### Q: How to update to the latest version?
|
||||
|
||||
**Local deployment**:
|
||||
```bash
|
||||
git pull origin main
|
||||
pip install -r requirements.txt
|
||||
# Restart the service
|
||||
# 1. Liveness probe
|
||||
curl -s http://127.0.0.1:5001/healthz
|
||||
# Expected: {"status":"ok"}
|
||||
|
||||
# 2. Readiness probe
|
||||
curl -s http://127.0.0.1:5001/readyz
|
||||
# Expected: {"status":"ready"}
|
||||
|
||||
# 3. Model list
|
||||
curl -s http://127.0.0.1:5001/v1/models
|
||||
# Expected: {"object":"list","data":[...]}
|
||||
|
||||
# 4. Admin panel (if WebUI is built)
|
||||
curl -s -o /dev/null -w "%{http_code}" http://127.0.0.1:5001/admin
|
||||
# Expected: 200
|
||||
|
||||
# 5. Test API call
|
||||
curl http://127.0.0.1:5001/v1/chat/completions \
|
||||
-H "Authorization: Bearer your-api-key" \
|
||||
-H "Content-Type: application/json" \
|
||||
-d '{"model":"deepseek-chat","messages":[{"role":"user","content":"hello"}]}'
|
||||
```
|
||||
|
||||
**Docker deployment**:
|
||||
```bash
|
||||
# Pull the latest code
|
||||
git pull origin main
|
||||
|
||||
# Rebuild and start (Docker config unchanged)
|
||||
docker-compose up -d --build
|
||||
```
|
||||
|
||||
**Vercel deployment**:
|
||||
- The project auto-syncs from GitHub
|
||||
- Or trigger a redeploy in the Vercel console
|
||||
|
||||
### Q: How do I view logs?
|
||||
|
||||
**Local dev**:
|
||||
```bash
|
||||
# Set log level
|
||||
export LOG_LEVEL=DEBUG
|
||||
python dev.py
|
||||
```
|
||||
|
||||
**Vercel**:
|
||||
- Vercel console → Project → Deployments → Logs
|
||||
|
||||
### Q: Token counting is inaccurate?
|
||||
|
||||
**A**: DS2API uses a heuristic estimate (characters / 4). The official OpenAI tokenizer may differ, so treat it as a reference only.
|
||||
|
||||
---
|
||||
|
||||
## Get Help
|
||||
## 8. Pre-Release Local Regression
|
||||
|
||||
- **GitHub Issues**: https://github.com/CJackHwang/ds2api/issues
|
||||
- **Docs**: https://github.com/CJackHwang/ds2api
|
||||
Run the full live testsuite before release (real account tests):
|
||||
|
||||
```bash
|
||||
./scripts/testsuite/run-live.sh
|
||||
```
|
||||
|
||||
With custom flags:
|
||||
|
||||
```bash
|
||||
go run ./cmd/ds2api-tests \
|
||||
--config config.json \
|
||||
--admin-key admin \
|
||||
--out artifacts/testsuite \
|
||||
--timeout 120 \
|
||||
--retries 2
|
||||
```
|
||||
|
||||
The testsuite automatically performs:
|
||||
|
||||
- ✅ Preflight checks (syntax/build/unit tests)
|
||||
- ✅ Isolated config copy startup (no mutation to your original `config.json`)
|
||||
- ✅ Live scenario verification (OpenAI/Claude/Admin/concurrency/toolcall/streaming)
|
||||
- ✅ Full request/response artifact logging for debugging
|
||||
|
||||
For detailed testsuite documentation, see [TESTING.md](TESTING.md).
|
||||
|
||||
687
DEPLOY.md
687
DEPLOY.md
@@ -2,409 +2,492 @@
|
||||
|
||||
语言 / Language: [中文](DEPLOY.md) | [English](DEPLOY.en.md)
|
||||
|
||||
本文档详细介绍 DS2API 的各种部署方式。
|
||||
本指南基于当前 Go 代码库,详细说明各种部署方式。
|
||||
|
||||
---
|
||||
|
||||
## 目录
|
||||
|
||||
- [Vercel 部署(推荐)](#vercel-部署推荐)
|
||||
- [Docker 部署(推荐)](#docker-部署推荐)
|
||||
- [本地开发](#本地开发)
|
||||
- [生产环境部署](#生产环境部署)
|
||||
- [常见问题](#常见问题)
|
||||
- [前置要求](#0-前置要求)
|
||||
- [一、本地运行](#一本地运行)
|
||||
- [二、Docker 部署](#二docker-部署)
|
||||
- [三、Vercel 部署](#三vercel-部署)
|
||||
- [四、下载 Release 构建包](#四下载-release-构建包)
|
||||
- [五、反向代理(Nginx)](#五反向代理nginx)
|
||||
- [六、Linux systemd 服务化](#六linux-systemd-服务化)
|
||||
- [七、部署后检查](#七部署后检查)
|
||||
- [八、发布前进行本地回归](#八发布前进行本地回归)
|
||||
|
||||
---
|
||||
|
||||
## Vercel 部署(推荐)
|
||||
## 0. 前置要求
|
||||
|
||||
### 一键部署
|
||||
| 依赖 | 最低版本 | 说明 |
|
||||
| --- | --- | --- |
|
||||
| Go | 1.24+ | 编译后端 |
|
||||
| Node.js | 20+ | 仅在需要本地构建 WebUI 时 |
|
||||
| npm | 随 Node.js 提供 | 安装 WebUI 依赖 |
|
||||
|
||||
[](https://vercel.com/new/clone?repository-url=https%3A%2F%2Fgithub.com%2FCJackHwang%2Fds2api&env=DS2API_ADMIN_KEY&envDescription=管理面板访问密码(必填)&envLink=https%3A%2F%2Fgithub.com%2FCJackHwang%2Fds2api%23环境变量&project-name=ds2api&repository-name=ds2api)
|
||||
配置来源(任选其一):
|
||||
|
||||
### 部署步骤
|
||||
|
||||
1. **点击部署按钮**
|
||||
- 登录你的 GitHub 账号
|
||||
- 授权 Vercel 访问
|
||||
|
||||
2. **设置环境变量**
|
||||
- `DS2API_ADMIN_KEY`: 管理面板密码(**必填**)
|
||||
|
||||
3. **等待部署完成**
|
||||
- Vercel 会自动构建并部署项目
|
||||
- 部署完成后获得访问 URL
|
||||
|
||||
4. **配置账号**
|
||||
- 访问 `https://your-project.vercel.app/admin`
|
||||
- 输入管理密码登录
|
||||
- 添加 DeepSeek 账号
|
||||
- 设置自定义 API Key
|
||||
|
||||
5. **同步配置**
|
||||
- 点击「同步到 Vercel」按钮
|
||||
- 首次需要输入 Vercel Token 和 Project ID
|
||||
- 同步成功后配置会持久化
|
||||
|
||||
### 获取 Vercel 凭证
|
||||
|
||||
**Vercel Token**:
|
||||
1. 访问 https://vercel.com/account/tokens
|
||||
2. 点击 "Create Token"
|
||||
3. 设置名称和有效期
|
||||
4. 复制生成的 Token
|
||||
|
||||
**Project ID**:
|
||||
1. 进入 Vercel 项目页面
|
||||
2. 点击 Settings -> General
|
||||
3. 复制 "Project ID"
|
||||
- **文件方式**:`config.json`(推荐本地/Docker 使用)
|
||||
- **环境变量方式**:`DS2API_CONFIG_JSON`(推荐 Vercel 使用,支持 JSON 字符串或 Base64 编码)
|
||||
|
||||
---
|
||||
|
||||
## 本地开发
|
||||
## 一、本地运行
|
||||
|
||||
### 环境要求
|
||||
|
||||
- Python 3.9+
|
||||
- Node.js 18+ (WebUI 开发)
|
||||
- pip
|
||||
|
||||
### 快速开始
|
||||
### 1.1 基本步骤
|
||||
|
||||
```bash
|
||||
# 1. 克隆项目
|
||||
# 克隆仓库
|
||||
git clone https://github.com/CJackHwang/ds2api.git
|
||||
cd ds2api
|
||||
|
||||
# 2. 安装 Python 依赖
|
||||
pip install -r requirements.txt
|
||||
|
||||
# 3. 配置账号
|
||||
# 复制并编辑配置
|
||||
cp config.example.json config.json
|
||||
# 编辑 config.json,填入 DeepSeek 账号信息
|
||||
# 使用你喜欢的编辑器打开 config.json,填入:
|
||||
# - keys: 你的 API 访问密钥
|
||||
# - accounts: DeepSeek 账号(email 或 mobile + password)
|
||||
|
||||
# 4. 启动服务
|
||||
python dev.py
|
||||
# 启动服务
|
||||
go run ./cmd/ds2api
|
||||
```
|
||||
|
||||
### 配置文件示例
|
||||
默认监听 `http://0.0.0.0:5001`,可通过 `PORT` 环境变量覆盖。
|
||||
|
||||
```json
|
||||
{
|
||||
"keys": ["my-api-key-1", "my-api-key-2"],
|
||||
"accounts": [
|
||||
{
|
||||
"email": "your-email@example.com",
|
||||
"password": "your-password",
|
||||
"token": ""
|
||||
},
|
||||
{
|
||||
"mobile": "12345678901",
|
||||
"password": "your-password",
|
||||
"token": ""
|
||||
}
|
||||
]
|
||||
}
|
||||
```
|
||||
### 1.2 WebUI 构建
|
||||
|
||||
**说明**:
|
||||
- `keys`: 自定义 API Key,用于调用本服务的接口
|
||||
- `accounts`: DeepSeek 网页版账号
|
||||
- 支持 `email` 或 `mobile` 登录
|
||||
- `token` 留空,系统会自动获取
|
||||
本地首次启动时,若 `static/admin/` 不存在,服务会自动尝试构建 WebUI(需要 Node.js/npm)。
|
||||
|
||||
### WebUI 开发
|
||||
你也可以手动构建:
|
||||
|
||||
```bash
|
||||
# 进入 WebUI 目录
|
||||
cd webui
|
||||
|
||||
# 安装依赖
|
||||
npm install
|
||||
|
||||
# 启动开发服务器
|
||||
npm run dev
|
||||
```
|
||||
|
||||
WebUI 开发服务器会启动在 `http://localhost:5173`,并自动代理 API 请求到后端 `http://localhost:5001`。
|
||||
|
||||
### WebUI 构建
|
||||
|
||||
WebUI 构建产物位于 `static/admin/` 目录。
|
||||
|
||||
**自动构建(推荐)**:
|
||||
- 当前由 Vercel 在部署时执行 WebUI 构建(见 `vercel.json` 的 `buildCommand`)
|
||||
- GitHub Actions 的 WebUI 自动构建流程已关闭
|
||||
- `static/admin/` 构建产物不再提交到仓库
|
||||
|
||||
**手动构建**:
|
||||
```bash
|
||||
# 方式1:使用脚本
|
||||
./scripts/build-webui.sh
|
||||
```
|
||||
|
||||
# 方式2:直接执行
|
||||
或手动执行:
|
||||
|
||||
```bash
|
||||
cd webui
|
||||
npm install
|
||||
npm run build
|
||||
# 产物输出到 static/admin/
|
||||
```
|
||||
|
||||
> **贡献者注意**:修改 WebUI 后无需手动构建,Vercel 部署会自动构建。
|
||||
通过环境变量控制自动构建行为:
|
||||
|
||||
```bash
|
||||
# 强制关闭自动构建
|
||||
DS2API_AUTO_BUILD_WEBUI=false go run ./cmd/ds2api
|
||||
|
||||
# 强制开启自动构建
|
||||
DS2API_AUTO_BUILD_WEBUI=true go run ./cmd/ds2api
|
||||
```
|
||||
|
||||
### 1.3 编译为二进制文件
|
||||
|
||||
```bash
|
||||
go build -o ds2api ./cmd/ds2api
|
||||
./ds2api
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Docker 部署(推荐)
|
||||
## 二、Docker 部署
|
||||
|
||||
Docker 部署采用**零侵入、解耦设计**:
|
||||
- Dockerfile 仅执行标准 Python 项目操作,不硬编码任何项目特定配置
|
||||
- 构建镜像时会一并构建 WebUI(便于非 Vercel 部署直接访问管理面板)
|
||||
- 所有配置通过环境变量和 `.env` 文件管理
|
||||
- **主代码更新时只需重新构建镜像,无需修改 Docker 配置**
|
||||
|
||||
### 快速开始(Docker Compose)
|
||||
### 2.1 基本步骤
|
||||
|
||||
```bash
|
||||
# 1. 复制环境变量模板
|
||||
# 复制并编辑环境变量
|
||||
cp .env.example .env
|
||||
# 编辑 .env,填写 DS2API_ADMIN_KEY 和 DS2API_CONFIG_JSON
|
||||
# 编辑 .env,至少设置:
|
||||
# DS2API_ADMIN_KEY=your-admin-key
|
||||
# DS2API_CONFIG_JSON={"keys":[...],"accounts":[...]}
|
||||
|
||||
# 2. 启动服务
|
||||
# 启动
|
||||
docker-compose up -d
|
||||
|
||||
# 3. 查看日志
|
||||
# 查看日志
|
||||
docker-compose logs -f
|
||||
```
|
||||
|
||||
# 4. 主代码更新后重新构建
|
||||
### 2.2 更新
|
||||
|
||||
```bash
|
||||
docker-compose up -d --build
|
||||
```
|
||||
|
||||
### 配置文件挂载方式
|
||||
### 2.3 Docker 架构说明
|
||||
|
||||
如需使用 `config.json` 而非环境变量:
|
||||
`Dockerfile` 使用三阶段构建:
|
||||
|
||||
```yaml
|
||||
# docker-compose.yml
|
||||
services:
|
||||
ds2api:
|
||||
build: .
|
||||
ports:
|
||||
- "5001:5001"
|
||||
environment:
|
||||
- DS2API_ADMIN_KEY=your-admin-key
|
||||
volumes:
|
||||
- ./config.json:/app/config.json:ro
|
||||
restart: unless-stopped
|
||||
```
|
||||
1. **WebUI 构建阶段**:`node:20` 镜像,执行 `npm ci && npm run build`
|
||||
2. **Go 构建阶段**:`golang:1.24` 镜像,编译二进制文件
|
||||
3. **运行阶段**:`debian:bookworm-slim` 精简镜像
|
||||
|
||||
### Docker 命令行部署
|
||||
容器内启动命令:`/usr/local/bin/ds2api`,默认暴露端口 `5001`。
|
||||
|
||||
### 2.4 开发环境
|
||||
|
||||
```bash
|
||||
# 构建镜像
|
||||
docker build -t ds2api:latest .
|
||||
|
||||
# 使用环境变量运行
|
||||
docker run -d \
|
||||
--name ds2api \
|
||||
-p 5001:5001 \
|
||||
-e DS2API_ADMIN_KEY=your-admin-key \
|
||||
-e DS2API_CONFIG_JSON='{"keys":["api-key"],"accounts":[...]}' \
|
||||
--restart unless-stopped \
|
||||
ds2api:latest
|
||||
|
||||
# 或使用配置文件挂载
|
||||
docker run -d \
|
||||
--name ds2api \
|
||||
-p 5001:5001 \
|
||||
-e DS2API_ADMIN_KEY=your-admin-key \
|
||||
-v $(pwd)/config.json:/app/config.json:ro \
|
||||
--restart unless-stopped \
|
||||
ds2api:latest
|
||||
```
|
||||
|
||||
### 开发模式(热重载)
|
||||
|
||||
```bash
|
||||
# 使用开发配置启动,代码修改实时生效
|
||||
docker-compose -f docker-compose.dev.yml up
|
||||
```
|
||||
|
||||
开发模式特性:
|
||||
- 源代码挂载到容器,修改即时生效
|
||||
- 日志级别设为 DEBUG
|
||||
- 自动读取本地 `config.json`
|
||||
- 源代码挂载(修改即生效)
|
||||
- `LOG_LEVEL=DEBUG`
|
||||
- 不自动重启
|
||||
|
||||
### 维护命令
|
||||
### 2.5 健康检查
|
||||
|
||||
Docker Compose 已配置内置健康检查:
|
||||
|
||||
```yaml
|
||||
healthcheck:
|
||||
test: ["CMD", "wget", "-qO-", "http://localhost:${PORT:-5001}/healthz"]
|
||||
interval: 30s
|
||||
timeout: 10s
|
||||
retries: 3
|
||||
start_period: 10s
|
||||
```
|
||||
|
||||
### 2.6 Docker 常见排查
|
||||
|
||||
如果容器日志正常但面板打不开,优先检查:
|
||||
|
||||
1. **端口是否一致**:`PORT` 改成非 `5001` 时,访问地址也要改成对应端口(如 `http://localhost:8080/admin`)。
|
||||
2. **开发 compose 的 WebUI 静态文件**:`docker-compose.dev.yml` 使用 `go run` 开发镜像,不会在容器内自动安装 Node.js;若仓库里没有 `static/admin`,`/admin` 会返回 404。可先在宿主机构建一次:`./scripts/build-webui.sh`。
|
||||
|
||||
---
|
||||
|
||||
## 三、Vercel 部署
|
||||
|
||||
### 3.1 部署步骤
|
||||
|
||||
1. **Fork 仓库**到你的 GitHub 账号
|
||||
2. **在 Vercel 上导入项目**
|
||||
3. **配置环境变量**(至少设置以下两项):
|
||||
|
||||
| 变量 | 说明 |
|
||||
| --- | --- |
|
||||
| `DS2API_ADMIN_KEY` | 管理密钥(必填) |
|
||||
| `DS2API_CONFIG_JSON` | 配置内容,JSON 字符串或 Base64 编码(必填) |
|
||||
|
||||
4. **部署**
|
||||
|
||||
### 3.2 可选环境变量
|
||||
|
||||
| 变量 | 说明 | 默认值 |
|
||||
| --- | --- | --- |
|
||||
| `DS2API_ACCOUNT_MAX_INFLIGHT` | 每账号并发上限 | `2` |
|
||||
| `DS2API_ACCOUNT_CONCURRENCY` | 同上(兼容别名) | — |
|
||||
| `DS2API_ACCOUNT_MAX_QUEUE` | 等待队列上限 | `recommended_concurrency` |
|
||||
| `DS2API_ACCOUNT_QUEUE_SIZE` | 同上(兼容别名) | — |
|
||||
| `DS2API_VERCEL_INTERNAL_SECRET` | 混合流式内部鉴权 | 回退用 `DS2API_ADMIN_KEY` |
|
||||
| `DS2API_VERCEL_STREAM_LEASE_TTL_SECONDS` | 流式 lease TTL | `900` |
|
||||
| `VERCEL_TOKEN` | Vercel 同步 token | — |
|
||||
| `VERCEL_PROJECT_ID` | Vercel 项目 ID | — |
|
||||
| `VERCEL_TEAM_ID` | Vercel 团队 ID | — |
|
||||
| `DS2API_VERCEL_PROTECTION_BYPASS` | 部署保护绕过密钥(内部 Node→Go 调用) | — |
|
||||
|
||||
### 3.3 Vercel 架构说明
|
||||
|
||||
```text
|
||||
请求 ─────┐
|
||||
│
|
||||
▼
|
||||
vercel.json 路由规则
|
||||
│
|
||||
┌─────┴─────┐
|
||||
│ │
|
||||
▼ ▼
|
||||
api/index.go api/chat-stream.js
|
||||
(Go Runtime) (Node Runtime)
|
||||
```
|
||||
|
||||
- **入口文件**:`api/index.go`(Serverless Go)
|
||||
- **流式入口**:`api/chat-stream.js`(Node Runtime,保证实时 SSE)
|
||||
- **路由重写**:`vercel.json`
|
||||
- **构建命令**:`npm ci --prefix webui && npm run build --prefix webui`(自动执行)
|
||||
|
||||
#### 流式处理链路
|
||||
|
||||
由于 Vercel Go Runtime 存在平台层响应缓冲,本项目在 Vercel 上采用"**Go prepare + Node stream**"的混合链路:
|
||||
|
||||
1. `api/chat-stream.js` 收到 `/v1/chat/completions` 请求
|
||||
2. Node 调用 Go 内部 prepare 接口(`?__stream_prepare=1`),获取会话 ID、PoW、token 等
|
||||
3. Go prepare 创建 stream lease,锁定账号
|
||||
4. Node 直连 DeepSeek 上游,实时流式转发 SSE 给客户端(含 OpenAI chunk 封装与 tools 防泄漏筛分)
|
||||
5. 流结束后 Node 调用 Go release 接口(`?__stream_release=1`),释放账号
|
||||
|
||||
> 该适配**仅在 Vercel 环境生效**;本地与 Docker 仍走纯 Go 链路。
|
||||
|
||||
#### 非流式回退与 Tool Call 处理
|
||||
|
||||
- `api/chat-stream.js` 仅对非流式请求回退到 Go 入口(`?__go=1`)
|
||||
- 流式请求(包括带 `tools`)走 Node 路径,并执行与 Go 对齐的 tool-call 防泄漏处理
|
||||
- WebUI 的"非流式测试"直接请求 `?__go=1`,避免 Node 中转造成长请求超时
|
||||
|
||||
#### 函数时长
|
||||
|
||||
`vercel.json` 已将 `api/chat-stream.js` 与 `api/index.go` 的 `maxDuration` 设为 `300`(受 Vercel 套餐上限约束)。
|
||||
|
||||
### 3.4 Vercel 常见报错排查
|
||||
|
||||
#### Go 构建失败
|
||||
|
||||
```text
|
||||
Error: Command failed: go build -ldflags -s -w -o .../bootstrap ...
|
||||
```
|
||||
|
||||
**原因**:Vercel 项目的 Go 构建参数配置不正确(`-ldflags` 没有作为一个整体字符串传递)。
|
||||
|
||||
**解决**:
|
||||
|
||||
1. 进入 Vercel Project Settings → Build and Development Settings
|
||||
2. **清空**自定义 Go Build Flags / Build Command(推荐)
|
||||
3. 若必须设置 ldflags,使用 `-ldflags="-s -w"`(保证它是一个参数)
|
||||
4. 确认仓库 `go.mod` 为受支持版本(当前为 `go 1.24`)
|
||||
5. 重新部署(建议清缓存后 Redeploy)
|
||||
|
||||
#### Internal 包导入错误
|
||||
|
||||
```text
|
||||
use of internal package ds2api/internal/server not allowed
|
||||
```
|
||||
|
||||
**原因**:Vercel Go 入口文件直接 `import internal/...`。
|
||||
|
||||
**解决**:当前仓库已通过公开桥接包 `app` 解决:`api/index.go` → `ds2api/app` → `internal/server`。
|
||||
|
||||
#### 输出目录错误
|
||||
|
||||
```text
|
||||
No Output Directory named "public" found after the Build completed.
|
||||
```
|
||||
|
||||
**解决**:当前仓库使用 `static` 作为输出目录(`vercel.json` 中 `"outputDirectory": "static"`)。若你在项目设置里手动改过 Output Directory,请设为 `static` 或清空让仓库配置生效。
|
||||
|
||||
#### 部署保护拦截
|
||||
|
||||
如果接口返回 Vercel HTML 页面 `Authentication Required`:
|
||||
|
||||
- **方案 A**:关闭该部署/环境的 Deployment Protection(推荐用于公开 API)
|
||||
- **方案 B**:请求中添加 `x-vercel-protection-bypass` 头
|
||||
- **方案 C**:设置 `VERCEL_AUTOMATION_BYPASS_SECRET`(或 `DS2API_VERCEL_PROTECTION_BYPASS`),仅影响内部 Node→Go 调用
|
||||
|
||||
### 3.5 仓库不提交构建产物
|
||||
|
||||
- `static/admin` 目录不在 Git 中
|
||||
- Vercel / Docker 构建阶段自动生成 WebUI 静态文件
|
||||
|
||||
---
|
||||
|
||||
## 四、下载 Release 构建包
|
||||
|
||||
仓库内置 GitHub Actions 工作流:`.github/workflows/release-artifacts.yml`
|
||||
|
||||
- **触发条件**:仅在 Release `published` 时触发(普通 push 不会构建)
|
||||
- **构建产物**:多平台二进制压缩包 + `sha256sums.txt`
|
||||
|
||||
| 平台 | 架构 | 文件格式 |
|
||||
| --- | --- | --- |
|
||||
| Linux | amd64, arm64 | `.tar.gz` |
|
||||
| macOS | amd64, arm64 | `.tar.gz` |
|
||||
| Windows | amd64 | `.zip` |
|
||||
|
||||
每个压缩包包含:
|
||||
|
||||
- `ds2api` 可执行文件(Windows 为 `ds2api.exe`)
|
||||
- `static/admin/`(WebUI 构建产物)
|
||||
- `sha3_wasm_bg.7b9ca65ddd.wasm`
|
||||
- `config.example.json`、`.env.example`
|
||||
- `README.MD`、`README.en.md`、`LICENSE`
|
||||
|
||||
### 使用步骤
|
||||
|
||||
```bash
|
||||
# 查看容器状态
|
||||
docker-compose ps
|
||||
# 1. 下载对应平台的压缩包
|
||||
# 2. 解压
|
||||
tar -xzf ds2api_v1.7.0_linux_amd64.tar.gz
|
||||
cd ds2api_v1.7.0_linux_amd64
|
||||
|
||||
# 查看日志
|
||||
docker-compose logs -f ds2api
|
||||
# 3. 配置
|
||||
cp config.example.json config.json
|
||||
# 编辑 config.json
|
||||
|
||||
# 重启服务
|
||||
docker-compose restart
|
||||
# 4. 启动
|
||||
./ds2api
|
||||
```
|
||||
|
||||
# 停止服务
|
||||
docker-compose down
|
||||
### 维护者发布步骤
|
||||
|
||||
# 完全重建(清除缓存)
|
||||
docker-compose down
|
||||
docker-compose build --no-cache
|
||||
docker-compose up -d
|
||||
1. 在 GitHub 创建并发布 Release(带 tag,如 `v1.7.0`)
|
||||
2. 等待 Actions 工作流 `Release Artifacts` 完成
|
||||
3. 在 Release 的 Assets 下载对应平台压缩包
|
||||
|
||||
---
|
||||
|
||||
## 五、反向代理(Nginx)
|
||||
|
||||
如果在 Nginx 后部署,**必须关闭缓冲**以保证 SSE 流式响应正常工作:
|
||||
|
||||
```nginx
|
||||
location / {
|
||||
proxy_pass http://127.0.0.1:5001;
|
||||
proxy_http_version 1.1;
|
||||
proxy_set_header Connection "";
|
||||
proxy_buffering off;
|
||||
proxy_cache off;
|
||||
chunked_transfer_encoding on;
|
||||
tcp_nodelay on;
|
||||
}
|
||||
```
|
||||
|
||||
如果需要 HTTPS,可以在 Nginx 层配置 SSL 证书:
|
||||
|
||||
```nginx
|
||||
server {
|
||||
listen 443 ssl;
|
||||
server_name api.example.com;
|
||||
|
||||
ssl_certificate /path/to/cert.pem;
|
||||
ssl_certificate_key /path/to/key.pem;
|
||||
|
||||
location / {
|
||||
proxy_pass http://127.0.0.1:5001;
|
||||
proxy_http_version 1.1;
|
||||
proxy_set_header Connection "";
|
||||
proxy_set_header Host $host;
|
||||
proxy_set_header X-Real-IP $remote_addr;
|
||||
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
|
||||
proxy_set_header X-Forwarded-Proto $scheme;
|
||||
proxy_buffering off;
|
||||
proxy_cache off;
|
||||
chunked_transfer_encoding on;
|
||||
tcp_nodelay on;
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 生产环境部署
|
||||
## 六、Linux systemd 服务化
|
||||
|
||||
### 使用 systemd (Linux)
|
||||
|
||||
1. **创建服务文件**
|
||||
### 6.1 安装
|
||||
|
||||
```bash
|
||||
sudo nano /etc/systemd/system/ds2api.service
|
||||
# 将编译好的二进制文件和相关文件复制到目标目录
|
||||
sudo mkdir -p /opt/ds2api
|
||||
sudo cp ds2api config.json sha3_wasm_bg.7b9ca65ddd.wasm /opt/ds2api/
|
||||
sudo cp -r static/admin /opt/ds2api/static/admin
|
||||
```
|
||||
|
||||
### 6.2 创建 systemd 服务文件
|
||||
|
||||
```ini
|
||||
# /etc/systemd/system/ds2api.service
|
||||
|
||||
[Unit]
|
||||
Description=DS2API Service
|
||||
Description=DS2API (Go)
|
||||
After=network.target
|
||||
|
||||
[Service]
|
||||
Type=simple
|
||||
User=www-data
|
||||
WorkingDirectory=/opt/ds2api
|
||||
ExecStart=/usr/bin/python3 app.py
|
||||
Restart=always
|
||||
RestartSec=10
|
||||
Environment=PORT=5001
|
||||
Environment=DS2API_ADMIN_KEY=your-admin-key
|
||||
Environment=DS2API_CONFIG_PATH=/opt/ds2api/config.json
|
||||
Environment=DS2API_ADMIN_KEY=your-admin-key-here
|
||||
ExecStart=/opt/ds2api/ds2api
|
||||
Restart=always
|
||||
RestartSec=5
|
||||
|
||||
[Install]
|
||||
WantedBy=multi-user.target
|
||||
```
|
||||
|
||||
2. **启动服务**
|
||||
### 6.3 常用命令
|
||||
|
||||
```bash
|
||||
# 加载服务配置
|
||||
sudo systemctl daemon-reload
|
||||
|
||||
# 设置开机自启
|
||||
sudo systemctl enable ds2api
|
||||
|
||||
# 启动服务
|
||||
sudo systemctl start ds2api
|
||||
```
|
||||
|
||||
3. **查看状态**
|
||||
|
||||
```bash
|
||||
# 查看状态
|
||||
sudo systemctl status ds2api
|
||||
|
||||
# 查看日志
|
||||
sudo journalctl -u ds2api -f
|
||||
```
|
||||
|
||||
### Nginx 反向代理
|
||||
|
||||
```nginx
|
||||
server {
|
||||
listen 80;
|
||||
server_name api.yourdomain.com;
|
||||
|
||||
# SSL 配置(推荐)
|
||||
# listen 443 ssl http2;
|
||||
# ssl_certificate /path/to/cert.pem;
|
||||
# ssl_certificate_key /path/to/key.pem;
|
||||
|
||||
location / {
|
||||
proxy_pass http://127.0.0.1:5001;
|
||||
proxy_http_version 1.1;
|
||||
|
||||
# 关闭缓冲,支持 SSE
|
||||
proxy_buffering off;
|
||||
proxy_cache off;
|
||||
|
||||
# 连接设置
|
||||
proxy_set_header Host $host;
|
||||
proxy_set_header X-Real-IP $remote_addr;
|
||||
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
|
||||
proxy_set_header X-Forwarded-Proto $scheme;
|
||||
|
||||
# SSE 超时设置
|
||||
proxy_read_timeout 300s;
|
||||
proxy_send_timeout 300s;
|
||||
|
||||
# 分块传输
|
||||
chunked_transfer_encoding on;
|
||||
tcp_nopush on;
|
||||
tcp_nodelay on;
|
||||
keepalive_timeout 120;
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 常见问题
|
||||
|
||||
### Q: 账号验证失败怎么办?
|
||||
|
||||
**A**: 检查以下几点:
|
||||
1. 确认 DeepSeek 账号密码正确
|
||||
2. 检查账号是否被封禁或需要验证
|
||||
3. 尝试在浏览器中手动登录一次
|
||||
4. 查看日志获取详细错误信息
|
||||
|
||||
### Q: 流式响应断开怎么办?
|
||||
|
||||
**A**:
|
||||
1. 检查 Nginx/反向代理配置,确保关闭了 `proxy_buffering`
|
||||
2. 增加 `proxy_read_timeout` 超时时间
|
||||
3. 检查网络连接稳定性
|
||||
|
||||
### Q: Vercel 部署后配置丢失?
|
||||
|
||||
**A**:
|
||||
1. 确保点击了「同步到 Vercel」按钮
|
||||
2. 检查 Vercel Token 是否正确且未过期
|
||||
3. 确认 Project ID 正确
|
||||
|
||||
### Q: 如何更新到新版本?
|
||||
|
||||
**本地部署**:
|
||||
```bash
|
||||
git pull origin main
|
||||
pip install -r requirements.txt
|
||||
# 重启服务
|
||||
sudo systemctl restart ds2api
|
||||
|
||||
# 停止服务
|
||||
sudo systemctl stop ds2api
|
||||
```
|
||||
|
||||
**Docker 部署**:
|
||||
```bash
|
||||
# 拉取最新代码
|
||||
git pull origin main
|
||||
|
||||
# 重新构建并启动(无需修改 Docker 配置)
|
||||
docker-compose up -d --build
|
||||
```
|
||||
|
||||
**Vercel 部署**:
|
||||
- 项目会自动从 GitHub 同步更新
|
||||
- 或在 Vercel 控制台手动触发重新部署
|
||||
|
||||
### Q: 如何查看日志?
|
||||
|
||||
**本地开发**:
```bash
# 设置日志级别
export LOG_LEVEL=DEBUG
go run ./cmd/ds2api
```
|
||||
|
||||
**Vercel**:
|
||||
- 访问 Vercel 控制台 -> 项目 -> Deployments -> Logs
|
||||
|
||||
### Q: Token 计数不准确?
|
||||
|
||||
**A**: DS2API 使用估算方式计算 token 数量(字符数 / 4),与 OpenAI 官方的 tokenizer 可能有差异,仅供参考。
|
||||
|
||||
---
|
||||
|
||||
## 获取帮助

- **GitHub Issues**: https://github.com/CJackHwang/ds2api/issues
- **文档**: https://github.com/CJackHwang/ds2api

---

## 七、部署后检查

无论使用哪种部署方式,启动后建议依次检查:
|
||||
|
||||
```bash
|
||||
# 1. 存活探针
|
||||
curl -s http://127.0.0.1:5001/healthz
|
||||
# 预期: {"status":"ok"}
|
||||
|
||||
# 2. 就绪探针
|
||||
curl -s http://127.0.0.1:5001/readyz
|
||||
# 预期: {"status":"ready"}
|
||||
|
||||
# 3. 模型列表
|
||||
curl -s http://127.0.0.1:5001/v1/models
|
||||
# 预期: {"object":"list","data":[...]}
|
||||
|
||||
# 4. 管理台页面(如果已构建 WebUI)
|
||||
curl -s -o /dev/null -w "%{http_code}" http://127.0.0.1:5001/admin
|
||||
# 预期: 200
|
||||
|
||||
# 5. 测试 API 调用
|
||||
curl http://127.0.0.1:5001/v1/chat/completions \
|
||||
-H "Authorization: Bearer your-api-key" \
|
||||
-H "Content-Type: application/json" \
|
||||
-d '{"model":"deepseek-chat","messages":[{"role":"user","content":"hello"}]}'
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 八、发布前进行本地回归
|
||||
|
||||
建议在发布前执行完整的端到端测试集(使用真实账号):
|
||||
|
||||
```bash
|
||||
./scripts/testsuite/run-live.sh
|
||||
```
|
||||
|
||||
可自定义参数:
|
||||
|
||||
```bash
|
||||
go run ./cmd/ds2api-tests \
|
||||
--config config.json \
|
||||
--admin-key admin \
|
||||
--out artifacts/testsuite \
|
||||
--timeout 120 \
|
||||
--retries 2
|
||||
```
|
||||
|
||||
测试集自动执行内容:
|
||||
|
||||
- ✅ 语法/构建/单测 preflight
|
||||
- ✅ 隔离副本配置启动服务(不污染原始 `config.json`)
|
||||
- ✅ 真实调用场景验证(OpenAI/Claude/Admin/并发/toolcall/流式)
|
||||
- ✅ 全量请求与响应日志落盘(用于故障复盘)
|
||||
|
||||
详细测试集说明参阅 [TESTING.md](TESTING.md)。
|
||||
|
||||
33
Dockerfile
33
Dockerfile
@@ -1,33 +1,26 @@
|
||||
# DS2API Docker 镜像
|
||||
# 采用极简、零侵入设计,所有配置通过环境变量传递
|
||||
# 主代码更新时只需重新构建镜像,无需修改 Dockerfile
|
||||
|
||||
FROM node:20 AS webui-builder
|
||||
|
||||
WORKDIR /app/webui
|
||||
|
||||
COPY webui/package.json webui/package-lock.json ./
|
||||
RUN npm ci
|
||||
|
||||
COPY webui ./
|
||||
RUN npm run build
|
||||
|
||||
FROM python:3.11-slim
|
||||
|
||||
FROM golang:1.24 AS go-builder
|
||||
WORKDIR /app
|
||||
|
||||
# 安装依赖(利用 Docker 缓存层)
|
||||
COPY requirements.txt .
|
||||
RUN pip install --no-cache-dir -r requirements.txt
|
||||
|
||||
# 复制整个项目(保留原始目录结构)
|
||||
ARG TARGETOS=linux
|
||||
ARG TARGETARCH=amd64
|
||||
COPY go.mod go.sum* ./
|
||||
RUN go mod download
|
||||
COPY . .
|
||||
RUN CGO_ENABLED=0 GOOS=${TARGETOS} GOARCH=${TARGETARCH} go build -o /out/ds2api ./cmd/ds2api
|
||||
|
||||
# 拷贝 WebUI 构建产物(非 Vercel / Docker 部署可直接使用)
|
||||
FROM debian:bookworm-slim
|
||||
WORKDIR /app
|
||||
RUN apt-get update && apt-get install -y --no-install-recommends ca-certificates wget && rm -rf /var/lib/apt/lists/*
|
||||
COPY --from=go-builder /out/ds2api /usr/local/bin/ds2api
|
||||
COPY --from=go-builder /app/sha3_wasm_bg.7b9ca65ddd.wasm /app/sha3_wasm_bg.7b9ca65ddd.wasm
|
||||
COPY --from=go-builder /app/config.example.json /app/config.example.json
|
||||
COPY --from=webui-builder /app/static/admin /app/static/admin
|
||||
|
||||
# 暴露服务端口
|
||||
EXPOSE 5001
|
||||
|
||||
# 启动命令(依赖项目自身的启动逻辑)
|
||||
CMD ["python", "app.py"]
|
||||
CMD ["/usr/local/bin/ds2api"]
|
||||
|
||||
419
README.MD
419
README.MD
@@ -4,98 +4,170 @@
|
||||

|
||||

|
||||
[](version.txt)
|
||||
[](DEPLOY.md#docker-部署推荐)
|
||||
[](DEPLOY.md)
|
||||
|
||||
语言 / Language: [中文](README.MD) | [English](README.en.md)
|
||||
|
||||
将 DeepSeek 免费对话版转换为 **OpenAI & Claude 兼容 API**,支持多账号轮询、自动 Token 刷新、可视化管理界面。
|
||||
将 DeepSeek Web 对话能力转换为 OpenAI 与 Claude 兼容 API。后端为 **Go 全量实现**,前端为 React WebUI 管理台(源码在 `webui/`,部署时自动构建到 `static/admin`)。
|
||||
|
||||

|
||||

|
||||

|
||||

|
||||
## 架构概览
|
||||
|
||||
```mermaid
|
||||
flowchart LR
|
||||
Client["🖥️ 客户端\n(OpenAI / Claude 兼容)"]
|
||||
|
||||
subgraph DS2API["DS2API 服务"]
|
||||
direction TB
|
||||
CORS["CORS 中间件"]
|
||||
Auth["🔐 鉴权中间件"]
|
||||
|
||||
## ✨ 特性
|
||||
subgraph Adapters["适配器层"]
|
||||
OA["OpenAI 适配器\n/v1/*"]
|
||||
CA["Claude 适配器\n/anthropic/*"]
|
||||
end
|
||||
|
||||
- 🔄 **双协议兼容** - 同时支持 OpenAI 和 Claude (Anthropic) API 格式
|
||||
- 🚀 **多账号轮询** - Round-Robin 负载均衡,支持高并发场景
|
||||
- 🔐 **Token 自动刷新** - 过期自动重新登录,无需手动维护
|
||||
- 🌐 **WebUI 管理** - 可视化添加账号、测试 API、同步 Vercel 配置
|
||||
- 🌍 **多语言切换** - WebUI 内置中英双语,可随时切换
|
||||
- 🔍 **联网搜索** - 支持 DeepSeek 原生搜索增强模式
|
||||
- 🧠 **深度思考** - 支持推理模式,输出思考过程
|
||||
- 🛠️ **工具调用** - 兼容 OpenAI Function Calling 格式
|
||||
- ☁️ **Vercel 一键部署** - 无需服务器,快速上线
|
||||
subgraph Support["支撑模块"]
|
||||
Pool["📦 账号池 / 并发队列"]
|
||||
PoW["⚙️ PoW WASM\n(wazero)"]
|
||||
end
|
||||
|
||||
## 📋 模型支持
|
||||
Admin["🛠️ Admin API\n/admin/*"]
|
||||
WebUI["🌐 WebUI\n(/admin)"]
|
||||
end
|
||||
|
||||
### OpenAI 兼容接口 (`/v1/chat/completions`)
|
||||
DS["☁️ DeepSeek API"]
|
||||
|
||||
| 模型 | 深度思考 | 联网搜索 | 说明 |
|
||||
|-----|:--------:|:--------:|------|
|
||||
| `deepseek-chat` | ❌ | ❌ | 标准对话模式 |
|
||||
| `deepseek-reasoner` | ✅ | ❌ | 推理模式(输出思考过程) |
|
||||
| `deepseek-chat-search` | ❌ | ✅ | 联网搜索模式 |
|
||||
| `deepseek-reasoner-search` | ✅ | ✅ | 推理 + 联网搜索 |
|
||||
Client -- "请求" --> CORS --> Auth
|
||||
Auth --> OA & CA
|
||||
OA & CA -- "调用" --> DS
|
||||
Auth --> Admin
|
||||
OA & CA -. "轮询选账号" .-> Pool
|
||||
OA & CA -. "计算 PoW" .-> PoW
|
||||
DS -- "响应" --> Client
|
||||
```
|
||||
|
||||
### Claude 兼容接口 (`/anthropic/v1/messages`)
|
||||
- **后端**:Go(`cmd/ds2api/`、`api/`、`internal/`),不依赖 Python 运行时
|
||||
- **前端**:React 管理台(`webui/`),运行时托管静态构建产物
|
||||
- **部署**:本地运行、Docker、Vercel Serverless、Linux systemd
|
||||
|
||||
| 模型 | 说明 |
|
||||
|-----|------|
|
||||
| `claude-sonnet-4-20250514` | 映射到 deepseek-chat(标准模式) |
|
||||
| `claude-sonnet-4-20250514-fast` | 映射到 deepseek-chat(快速模式) |
|
||||
| `claude-sonnet-4-20250514-slow` | 映射到 deepseek-reasoner(推理模式) |
|
||||
## 核心能力
|
||||
|
||||
> **提示**:Claude 接口实际调用的是 DeepSeek,响应格式会自动转换为 Anthropic 标准格式。
|
||||
| 能力 | 说明 |
|
||||
| --- | --- |
|
||||
| OpenAI 兼容 | `GET /v1/models`、`POST /v1/chat/completions`(流式/非流式) |
|
||||
| Claude 兼容 | `GET /anthropic/v1/models`、`POST /anthropic/v1/messages`、`POST /anthropic/v1/messages/count_tokens` |
|
||||
| 多账号轮询 | 自动 token 刷新、邮箱/手机号双登录方式 |
|
||||
| 并发队列控制 | 每账号 in-flight 上限 + 等待队列,动态计算建议并发值 |
|
||||
| DeepSeek PoW | WASM 计算(`wazero`),无需外部 Node.js 依赖 |
|
||||
| Tool Calling | 防泄漏处理:自动缓冲、识别、结构化输出 |
|
||||
| Admin API | 配置管理、账号测试 / 批量测试、导入导出、Vercel 同步 |
|
||||
| WebUI 管理台 | `/admin` 单页应用(中英文双语、深色模式) |
|
||||
| 运维探针 | `GET /healthz`(存活)、`GET /readyz`(就绪) |
|
||||
|
||||
## 🚀 快速开始
|
||||
## 模型支持
|
||||
|
||||
### 方式一:Vercel 部署(推荐)
|
||||
### OpenAI 接口
|
||||
|
||||
[](https://vercel.com/new/clone?repository-url=https%3A%2F%2Fgithub.com%2FCJackHwang%2Fds2api&env=DS2API_ADMIN_KEY&envDescription=管理面板访问密码(必填)&envLink=https%3A%2F%2Fgithub.com%2FCJackHwang%2Fds2api%23环境变量&project-name=ds2api&repository-name=ds2api)
|
||||
| 模型 | thinking | search |
|
||||
| --- | --- | --- |
|
||||
| `deepseek-chat` | ❌ | ❌ |
|
||||
| `deepseek-reasoner` | ✅ | ❌ |
|
||||
| `deepseek-chat-search` | ❌ | ✅ |
|
||||
| `deepseek-reasoner-search` | ✅ | ✅ |
|
||||
|
||||
1. 点击上方按钮,设置管理密码 `DS2API_ADMIN_KEY`
|
||||
2. 部署完成后访问 `/admin` 管理界面
|
||||
3. 添加 DeepSeek 账号和自定义 API Key
|
||||
4. 点击「同步到 Vercel」保存配置
|
||||
### Claude 接口
|
||||
|
||||
> **首次同步会自动验证账号并保存 Token,后续操作无需重复输入凭证。**
|
||||
| 模型 | 默认映射 |
|
||||
| --- | --- |
|
||||
| `claude-sonnet-4-5` | `deepseek-chat` |
|
||||
| `claude-haiku-4-5`(兼容 `claude-3-5-haiku-latest`) | `deepseek-chat` |
|
||||
| `claude-opus-4-6` | `deepseek-reasoner` |
|
||||
|
||||
### 方式二:本地开发
|
||||
可通过配置中的 `claude_mapping` 或 `claude_model_mapping` 覆盖映射关系。
|
||||
另外,`/anthropic/v1/models` 现已包含 Claude 1.x/2.x/3.x/4.x 历史模型 ID 与常见别名,便于旧客户端直接兼容。
|
||||
|
||||
## 快速开始
|
||||
|
||||
### 方式一:本地运行
|
||||
|
||||
**前置要求**:Go 1.24+,Node.js 20+(仅在需要构建 WebUI 时)
|
||||
|
||||
```bash
|
||||
# 1. 克隆仓库
|
||||
git clone https://github.com/CJackHwang/ds2api.git
|
||||
cd ds2api
|
||||
|
||||
# 2. 安装依赖
|
||||
pip install -r requirements.txt
|
||||
|
||||
# 3. 配置账号
|
||||
# 2. 配置
|
||||
cp config.example.json config.json
|
||||
# 编辑 config.json,添加 DeepSeek 账号信息
|
||||
# 编辑 config.json,填入你的 DeepSeek 账号信息和 API key
|
||||
|
||||
# 4. 启动服务
|
||||
python dev.py
|
||||
# 3. 启动
|
||||
go run ./cmd/ds2api
|
||||
```
|
||||
|
||||
服务启动后访问 `http://localhost:5001`
|
||||
默认监听地址:`http://localhost:5001`
|
||||
|
||||
## ⚙️ 配置说明
|
||||
> **WebUI 自动构建**:本地首次启动时,若 `static/admin` 不存在,会自动尝试执行 `npm install && npm run build`(需要本机有 Node.js)。你也可以手动构建:`./scripts/build-webui.sh`
|
||||
|
||||
### 环境变量
|
||||
### 方式二:Docker 运行
|
||||
|
||||
| 变量 | 说明 | 必填 |
|
||||
|-----|------|:----:|
|
||||
| `DS2API_ADMIN_KEY` | 管理面板密码 | Vercel 必填 |
|
||||
| `DS2API_CONFIG_JSON` | 配置 JSON 或 Base64 编码 | 可选 |
|
||||
| `VERCEL_TOKEN` | Vercel API Token(用于同步) | 可选 |
|
||||
| `VERCEL_PROJECT_ID` | Vercel 项目 ID | 可选 |
|
||||
| `PORT` | 服务端口(默认 5001) | 可选 |
|
||||
```bash
|
||||
# 1. 配置环境变量
|
||||
cp .env.example .env
|
||||
# 编辑 .env
|
||||
|
||||
### 配置文件格式 (`config.json`)
|
||||
# 2. 启动
|
||||
docker-compose up -d
|
||||
|
||||
# 3. 查看日志
|
||||
docker-compose logs -f
|
||||
```
|
||||
|
||||
更新镜像:`docker-compose up -d --build`
|
||||
|
||||
### 方式三:Vercel 部署
|
||||
|
||||
1. Fork 仓库到自己的 GitHub
|
||||
2. 在 Vercel 上导入项目
|
||||
3. 配置环境变量(至少设置 `DS2API_ADMIN_KEY` 和 `DS2API_CONFIG_JSON`)
|
||||
4. 部署
|
||||
|
||||
> **流式说明**:`/v1/chat/completions` 在 Vercel 上默认走 `api/chat-stream.js`(Node Runtime)以保证实时 SSE。鉴权、账号选择、会话/PoW 准备仍由 Go 内部 prepare 接口完成;流式响应(含 `tools`)在 Node 侧执行与 Go 对齐的输出组装与防泄漏处理。
|
||||
|
||||
详细部署说明请参阅 [部署指南](DEPLOY.md)。
|
||||
|
||||
### 方式四:下载 Release 构建包
|
||||
|
||||
每次发布 Release 时,GitHub Actions 会自动构建多平台二进制包:
|
||||
|
||||
```bash
|
||||
# 下载对应平台的压缩包后
|
||||
tar -xzf ds2api_v1.7.0_linux_amd64.tar.gz
|
||||
cd ds2api_v1.7.0_linux_amd64
|
||||
cp config.example.json config.json
|
||||
# 编辑 config.json
|
||||
./ds2api
|
||||
```
|
||||
|
||||
### 方式五:OpenCode CLI 接入
|
||||
|
||||
1. 复制示例配置:
|
||||
|
||||
```bash
|
||||
cp opencode.json.example opencode.json
|
||||
```
|
||||
|
||||
2. 编辑 `opencode.json`:
|
||||
- 将 `baseURL` 改为你的 DS2API 地址(例如 `https://your-domain.com/v1`)
|
||||
- 将 `apiKey` 改为你的 DS2API key(对应 `config.keys`)
|
||||
|
||||
3. 在项目目录启动 OpenCode CLI(按你的安装方式运行 `opencode`)。
|
||||
|
||||
> 建议优先使用 OpenAI 兼容路径(`/v1/*`),即示例里的 `@ai-sdk/openai-compatible` provider。
|
||||
|
||||
## 配置说明
|
||||
|
||||
### `config.json` 示例
|
||||
|
||||
```json
|
||||
{
|
||||
@@ -111,125 +183,158 @@ python dev.py
|
||||
"password": "your-password",
|
||||
"token": ""
|
||||
}
|
||||
]
|
||||
],
|
||||
"claude_model_mapping": {
|
||||
"fast": "deepseek-chat",
|
||||
"slow": "deepseek-reasoner"
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
> **说明**:
|
||||
> - `keys`: 自定义的 API 密钥,用于调用本服务
|
||||
> - `accounts`: DeepSeek 网页版账号,支持邮箱或手机号登录
|
||||
> - `token`: 留空即可,系统会自动获取并刷新
|
||||
- `keys`:API 访问密钥列表,客户端通过 `Authorization: Bearer <key>` 鉴权
|
||||
- `accounts`:DeepSeek 账号列表,支持 `email` 或 `mobile` 登录
|
||||
- `token`:留空则首次请求时自动登录获取;也可预填已有 token
|
||||
- `claude_model_mapping`:字典中 `fast`/`slow` 后缀映射到对应 DeepSeek 模型
|
||||
|
||||
## 📡 API 使用
|
||||
### 环境变量
|
||||
|
||||
完整 API 文档请参阅 **[API.md](API.md)**
|
||||
| 变量 | 用途 | 默认值 |
|
||||
| --- | --- | --- |
|
||||
| `PORT` | 服务端口 | `5001` |
|
||||
| `LOG_LEVEL` | 日志级别 | `INFO`(可选:`DEBUG`/`WARN`/`ERROR`) |
|
||||
| `DS2API_ADMIN_KEY` | Admin 登录密钥 | `admin` |
|
||||
| `DS2API_JWT_SECRET` | Admin JWT 签名密钥 | 等同 `DS2API_ADMIN_KEY` |
|
||||
| `DS2API_JWT_EXPIRE_HOURS` | Admin JWT 过期小时数 | `24` |
|
||||
| `DS2API_CONFIG_PATH` | 配置文件路径 | `config.json` |
|
||||
| `DS2API_CONFIG_JSON` | 直接注入配置(JSON 或 Base64) | — |
|
||||
| `DS2API_WASM_PATH` | PoW WASM 文件路径 | 自动查找 |
|
||||
| `DS2API_STATIC_ADMIN_DIR` | 管理台静态文件目录 | `static/admin` |
|
||||
| `DS2API_AUTO_BUILD_WEBUI` | 启动时自动构建 WebUI | 本地开启,Vercel 关闭 |
|
||||
| `DS2API_ACCOUNT_MAX_INFLIGHT` | 每账号最大并发 in-flight 请求数 | `2` |
|
||||
| `DS2API_ACCOUNT_CONCURRENCY` | 同上(兼容旧名) | — |
|
||||
| `DS2API_ACCOUNT_MAX_QUEUE` | 等待队列上限 | `recommended_concurrency` |
|
||||
| `DS2API_ACCOUNT_QUEUE_SIZE` | 同上(兼容旧名) | — |
|
||||
| `DS2API_VERCEL_INTERNAL_SECRET` | Vercel 混合流式内部鉴权密钥 | 回退用 `DS2API_ADMIN_KEY` |
|
||||
| `DS2API_VERCEL_STREAM_LEASE_TTL_SECONDS` | 流式 lease 过期秒数 | `900` |
|
||||
| `VERCEL_TOKEN` | Vercel 同步 token | — |
|
||||
| `VERCEL_PROJECT_ID` | Vercel 项目 ID | — |
|
||||
| `VERCEL_TEAM_ID` | Vercel 团队 ID | — |
|
||||
| `DS2API_VERCEL_PROTECTION_BYPASS` | Vercel 部署保护绕过密钥(内部 Node→Go 调用) | — |
|
||||
|
||||
### 快速示例
|
||||
## 鉴权模式
|
||||
|
||||
**获取模型列表**:
|
||||
```bash
|
||||
curl http://localhost:5001/v1/models
|
||||
调用业务接口(`/v1/*`、`/anthropic/*`)时支持两种模式:
|
||||
|
||||
| 模式 | 说明 |
|
||||
| --- | --- |
|
||||
| **托管账号模式** | `Bearer` 或 `x-api-key` 传入 `config.keys` 中的 key,由服务自动轮询选择账号 |
|
||||
| **直通 token 模式** | 传入 token 不在 `config.keys` 中时,直接作为 DeepSeek token 使用 |
|
||||
|
||||
可选请求头 `X-Ds2-Target-Account`:指定使用某个托管账号(值为 email 或 mobile)。
|
||||
|
||||
## 并发模型
|
||||
|
||||
```
|
||||
每账号可用并发 = DS2API_ACCOUNT_MAX_INFLIGHT(默认 2)
|
||||
建议并发值 = 账号数量 × 每账号并发上限
|
||||
等待队列上限 = DS2API_ACCOUNT_MAX_QUEUE(默认 = 建议并发值)
|
||||
429 阈值 = in-flight + 等待队列 ≈ 账号数量 × 4
|
||||
```
|
||||
|
||||
**OpenAI 格式调用**:
|
||||
```bash
|
||||
curl http://localhost:5001/v1/chat/completions \
|
||||
-H "Authorization: Bearer your-api-key" \
|
||||
-H "Content-Type: application/json" \
|
||||
-d '{
|
||||
"model": "deepseek-chat",
|
||||
"messages": [{"role": "user", "content": "你好"}],
|
||||
"stream": true
|
||||
}'
|
||||
- 当 in-flight 槽位满时,请求进入等待队列,**不会立即 429**
|
||||
- 超出总承载上限后才返回 `429 Too Many Requests`
|
||||
- `GET /admin/queue/status` 返回实时并发状态
|
||||
|
||||
## Tool Call 适配
|
||||
|
||||
当请求中带 `tools` 时,DS2API 会做防泄漏处理:
|
||||
|
||||
1. `stream=true` 时先**缓冲**正文片段
|
||||
2. 若识别到工具调用 → 仅输出结构化 `tool_calls`,不透传原始 JSON 文本
|
||||
3. 若最终不是工具调用 → 一次性输出普通文本
|
||||
4. 解析器支持混合文本、fenced JSON、`function.arguments` 字符串等格式
|
||||
|
||||
## 项目结构
|
||||
|
||||
```text
|
||||
ds2api/
|
||||
├── cmd/
|
||||
│ ├── ds2api/ # 本地 / 容器启动入口
|
||||
│ └── ds2api-tests/ # 端到端测试集入口
|
||||
├── api/
|
||||
│ ├── index.go # Vercel Serverless Go 入口
|
||||
│ ├── chat-stream.js # Vercel Node.js 流式转发
|
||||
│ └── helpers/ # Node.js 辅助模块
|
||||
├── internal/
|
||||
│ ├── account/ # 账号池与并发队列
|
||||
│ ├── adapter/
|
||||
│ │ ├── openai/ # OpenAI 兼容适配器(含 Tool Call 解析、Vercel 流式 prepare/release)
|
||||
│ │ └── claude/ # Claude 兼容适配器
|
||||
│ ├── admin/ # Admin API handlers
|
||||
│ ├── auth/ # 鉴权与 JWT
|
||||
│ ├── config/ # 配置加载与热更新
|
||||
│ ├── deepseek/ # DeepSeek API 客户端、PoW WASM
|
||||
│ ├── server/ # HTTP 路由与中间件(chi router)
|
||||
│ ├── sse/ # SSE 解析工具
|
||||
│ ├── util/ # 通用工具函数
|
||||
│ └── webui/ # WebUI 静态文件托管与自动构建
|
||||
├── webui/ # React WebUI 源码(Vite + Tailwind)
|
||||
│ └── src/
|
||||
│ ├── components/ # AccountManager / ApiTester / BatchImport / VercelSync / Login / LandingPage
|
||||
│ └── locales/ # 中英文语言包(zh.json / en.json)
|
||||
├── scripts/
|
||||
│ ├── build-webui.sh # WebUI 手动构建脚本
|
||||
│ └── testsuite/ # 测试集运行脚本
|
||||
├── static/admin/ # WebUI 构建产物(不提交到 Git)
|
||||
├── .github/
|
||||
│ ├── workflows/ # GitHub Actions(Release 自动构建)
|
||||
│ ├── ISSUE_TEMPLATE/ # Issue 模板
|
||||
│ └── PULL_REQUEST_TEMPLATE.md
|
||||
├── config.example.json # 配置文件示例
|
||||
├── .env.example # 环境变量示例
|
||||
├── Dockerfile # 多阶段构建(WebUI + Go)
|
||||
├── docker-compose.yml # 生产环境 Docker Compose
|
||||
├── docker-compose.dev.yml # 开发环境 Docker Compose
|
||||
├── vercel.json # Vercel 路由与构建配置
|
||||
├── go.mod / go.sum # Go 模块依赖
|
||||
└── version.txt # 版本号
|
||||
```
|
||||
|
||||
**Claude 格式调用**:
|
||||
```bash
|
||||
curl http://localhost:5001/anthropic/v1/messages \
|
||||
-H "x-api-key: your-api-key" \
|
||||
-H "Content-Type: application/json" \
|
||||
-H "anthropic-version: 2023-06-01" \
|
||||
-d '{
|
||||
"model": "claude-sonnet-4-20250514",
|
||||
"max_tokens": 1024,
|
||||
"messages": [{"role": "user", "content": "你好"}]
|
||||
}'
|
||||
```
|
||||
## 文档索引
|
||||
|
||||
### Python SDK 使用
|
||||
| 文档 | 说明 |
|
||||
| --- | --- |
|
||||
| [API.md](API.md) / [API.en.md](API.en.md) | API 接口文档(含请求/响应示例) |
|
||||
| [DEPLOY.md](DEPLOY.md) / [DEPLOY.en.md](DEPLOY.en.md) | 部署指南(本地/Docker/Vercel/systemd) |
|
||||
| [CONTRIBUTING.md](CONTRIBUTING.md) / [CONTRIBUTING.en.md](CONTRIBUTING.en.md) | 贡献指南 |
|
||||
| [TESTING.md](TESTING.md) | 测试集使用指南 |
|
||||
|
||||
```python
|
||||
from openai import OpenAI
|
||||
|
||||
client = OpenAI(
|
||||
api_key="your-api-key",
|
||||
base_url="http://localhost:5001/v1"
|
||||
)
|
||||
|
||||
response = client.chat.completions.create(
|
||||
model="deepseek-reasoner",
|
||||
messages=[{"role": "user", "content": "请解释量子纠缠"}],
|
||||
stream=True
|
||||
)
|
||||
|
||||
for chunk in response:
|
||||
if chunk.choices[0].delta.content:
|
||||
print(chunk.choices[0].delta.content, end="")
|
||||
```
|
||||
|
||||
## 🔧 部署配置
|
||||
|
||||
### Nginx 反向代理
|
||||
|
||||
```nginx
|
||||
location / {
|
||||
proxy_pass http://localhost:5001;
|
||||
proxy_http_version 1.1;
|
||||
proxy_set_header Connection "";
|
||||
proxy_buffering off;
|
||||
proxy_cache off;
|
||||
chunked_transfer_encoding on;
|
||||
tcp_nopush on;
|
||||
tcp_nodelay on;
|
||||
keepalive_timeout 120;
|
||||
}
|
||||
```
|
||||
|
||||
### 方式三:Docker 部署
|
||||
## 测试
|
||||
|
||||
```bash
|
||||
# 1. 克隆仓库并进入目录
|
||||
git clone https://github.com/CJackHwang/ds2api.git
|
||||
cd ds2api
|
||||
# 单元测试
|
||||
go test ./...
|
||||
|
||||
# 2. 配置环境变量
|
||||
cp .env.example .env
|
||||
# 编辑 .env,填写 DS2API_ADMIN_KEY 和 DS2API_CONFIG_JSON
|
||||
# 一键端到端全链路测试(真实账号,生成完整请求/响应日志)
|
||||
./scripts/testsuite/run-live.sh
|
||||
|
||||
# 3. 启动服务
|
||||
docker-compose up -d
|
||||
|
||||
# 4. 查看日志
|
||||
docker-compose logs -f
|
||||
# 或自定义参数
|
||||
go run ./cmd/ds2api-tests \
|
||||
--config config.json \
|
||||
--admin-key admin \
|
||||
--out artifacts/testsuite \
|
||||
--timeout 120 \
|
||||
--retries 2
|
||||
```
|
||||
|
||||
> **Docker 优势**:零侵入设计,主代码更新只需 `docker-compose up -d --build`,无需修改 Docker 配置。详见 [DEPLOY.md](DEPLOY.md#docker-部署推荐)。
|
||||
## Release 自动构建(GitHub Actions)
|
||||
|
||||
## ⚠️ 免责声明
|
||||
工作流文件:`.github/workflows/release-artifacts.yml`
|
||||
|
||||
**本项目基于逆向工程实现,服务稳定性无法保证。**
|
||||
- **触发条件**:仅在 GitHub Release `published` 时触发(普通 push 不会触发)
|
||||
- **构建产物**:多平台二进制包(`linux/amd64`、`linux/arm64`、`darwin/amd64`、`darwin/arm64`、`windows/amd64`)+ `sha256sums.txt`
|
||||
- **每个压缩包包含**:`ds2api` 可执行文件、`static/admin`、WASM 文件、配置示例、README、LICENSE
|
||||
|
||||
- 仅供学习研究使用,**禁止商业用途或对外提供服务**
|
||||
- 建议正式项目使用 [DeepSeek 官方 API](https://platform.deepseek.com/)
|
||||
- 使用本项目产生的任何风险由用户自行承担
|
||||
## 免责声明
|
||||
|
||||
## 📜 鸣谢
|
||||
|
||||
本项目基于以下开源项目:
|
||||
|
||||
- [iidamie/deepseek2api](https://github.com/iidamie/deepseek2api)
|
||||
- [LLM-Red-Team/deepseek-free-api](https://github.com/LLM-Red-Team/deepseek-free-api)
|
||||
|
||||
## 📊 Star History
|
||||
|
||||
[](https://star-history.com/#CJackHwang/ds2api&Date)
|
||||
本项目基于逆向方式实现,仅供学习与研究使用。稳定性和可用性不作保证,请勿用于违反服务条款或法律法规的场景。
|
||||
|
||||
423
README.en.md
423
README.en.md
@@ -4,96 +4,170 @@
|
||||

|
||||

|
||||
[](version.txt)
|
||||
[](DEPLOY.md#docker-deployment-recommended)
|
||||
[](DEPLOY.en.md)
|
||||
|
||||
Language: [中文](README.MD) | [English](README.en.md)
|
||||
|
||||
Convert DeepSeek Web into an **OpenAI & Claude compatible API**, with multi-account rotation, automatic token refresh, and a visual admin console.
|
||||
DS2API converts DeepSeek Web chat capability into OpenAI-compatible and Claude-compatible APIs. The backend is a **pure Go implementation**, with a React WebUI admin panel (source in `webui/`, build output auto-generated to `static/admin` during deployment).
|
||||
|
||||

|
||||

|
||||

|
||||

|
||||
## Architecture Overview
|
||||
|
||||
## ✨ Features
|
||||
```mermaid
|
||||
flowchart LR
|
||||
Client["🖥️ Clients\n(OpenAI / Claude compat)"]
|
||||
|
||||
- 🔄 **Dual-protocol support** - OpenAI and Claude (Anthropic) compatible APIs
|
||||
- 🚀 **Multi-account rotation** - Round-robin load balancing for high concurrency
|
||||
- 🔐 **Automatic token refresh** - Re-auth on expiry without manual maintenance
|
||||
- 🌐 **WebUI management** - Add accounts, test APIs, and sync Vercel settings visually
|
||||
- 🌍 **Language toggle** - Built-in Chinese and English UI switcher
|
||||
- 🔍 **Web search** - DeepSeek native search enhancement mode
|
||||
- 🧠 **Deep reasoning** - Reasoning mode with trace output
|
||||
- 🛠️ **Tool calling** - OpenAI Function Calling compatible
|
||||
- ☁️ **One-click Vercel deploy** - No server required
|
||||
subgraph DS2API["DS2API Service"]
|
||||
direction TB
|
||||
CORS["CORS Middleware"]
|
||||
Auth["🔐 Auth Middleware"]
|
||||
|
||||
## 📋 Model Support
|
||||
subgraph Adapters["Adapter Layer"]
|
||||
OA["OpenAI Adapter\n/v1/*"]
|
||||
CA["Claude Adapter\n/anthropic/*"]
|
||||
end
|
||||
|
||||
### OpenAI compatible endpoint (`/v1/chat/completions`)
|
||||
subgraph Support["Support Modules"]
|
||||
Pool["📦 Account Pool / Queue"]
|
||||
PoW["⚙️ PoW WASM\n(wazero)"]
|
||||
end
|
||||
|
||||
| Model | Reasoning | Search | Notes |
|
||||
|-----|:--------:|:------:|------|
|
||||
| `deepseek-chat` | ❌ | ❌ | Standard chat |
|
||||
| `deepseek-reasoner` | ✅ | ❌ | Reasoning (shows trace) |
|
||||
| `deepseek-chat-search` | ❌ | ✅ | Web search mode |
|
||||
| `deepseek-reasoner-search` | ✅ | ✅ | Reasoning + search |
|
||||
Admin["🛠️ Admin API\n/admin/*"]
|
||||
WebUI["🌐 WebUI\n(/admin)"]
|
||||
end
|
||||
|
||||
### Claude compatible endpoint (`/anthropic/v1/messages`)
|
||||
DS["☁️ DeepSeek API"]
|
||||
|
||||
| Model | Notes |
|
||||
|-----|------|
|
||||
| `claude-sonnet-4-20250514` | Maps to deepseek-chat (standard) |
|
||||
| `claude-sonnet-4-20250514-fast` | Maps to deepseek-chat (fast) |
|
||||
| `claude-sonnet-4-20250514-slow` | Maps to deepseek-reasoner (reasoning) |
|
||||
Client -- "Request" --> CORS --> Auth
|
||||
Auth --> OA & CA
|
||||
OA & CA -- "Call" --> DS
|
||||
Auth --> Admin
|
||||
OA & CA -. "Rotate accounts" .-> Pool
|
||||
OA & CA -. "Compute PoW" .-> PoW
|
||||
DS -- "Response" --> Client
|
||||
```
|
||||
|
||||
> **Tip**: The Claude endpoint actually calls DeepSeek and returns Anthropic-format responses.
|
||||
- **Backend**: Go (`cmd/ds2api/`, `api/`, `internal/`), no Python runtime
|
||||
- **Frontend**: React admin panel (`webui/`), served as static build at runtime
|
||||
- **Deployment**: local run, Docker, Vercel serverless, Linux systemd
|
||||
|
||||
## 🚀 Quick Start
|
||||
## Key Capabilities
|
||||
|
||||
### Option 1: Vercel deployment (recommended)
|
||||
| Capability | Details |
|
||||
| --- | --- |
|
||||
| OpenAI compatible | `GET /v1/models`, `POST /v1/chat/completions` (stream/non-stream) |
|
||||
| Claude compatible | `GET /anthropic/v1/models`, `POST /anthropic/v1/messages`, `POST /anthropic/v1/messages/count_tokens` |
|
||||
| Multi-account rotation | Auto token refresh, email/mobile dual login |
|
||||
| Concurrency control | Per-account in-flight limit + waiting queue, dynamic recommended concurrency |
|
||||
| DeepSeek PoW | WASM solving via `wazero`, no external Node.js dependency |
|
||||
| Tool Calling | Anti-leak handling: auto buffer, detect, structured output |
|
||||
| Admin API | Config management, account testing/batch test, import/export, Vercel sync |
|
||||
| WebUI Admin Panel | SPA at `/admin` (bilingual Chinese/English, dark mode) |
|
||||
| Health Probes | `GET /healthz` (liveness), `GET /readyz` (readiness) |
|
||||
|
||||
[](https://vercel.com/new/clone?repository-url=https%3A%2F%2Fgithub.com%2FCJackHwang%2Fds2api&env=DS2API_ADMIN_KEY&envDescription=Admin%20console%20access%20key%20%28required%29&envLink=https%3A%2F%2Fgithub.com%2FCJackHwang%2Fds2api%23environment-variables&project-name=ds2api&repository-name=ds2api)
|
||||
## Model Support
|
||||
|
||||
1. Click the button above and set `DS2API_ADMIN_KEY`
|
||||
2. After deployment, visit `/admin`
|
||||
3. Add DeepSeek accounts and custom API keys
|
||||
4. Click "Sync to Vercel" to persist configuration
|
||||
### OpenAI Endpoint
|
||||
|
||||
> **First sync validates accounts and stores tokens automatically.**
|
||||
| Model | thinking | search |
|
||||
| --- | --- | --- |
|
||||
| `deepseek-chat` | ❌ | ❌ |
|
||||
| `deepseek-reasoner` | ✅ | ❌ |
|
||||
| `deepseek-chat-search` | ❌ | ✅ |
|
||||
| `deepseek-reasoner-search` | ✅ | ✅ |
|
||||
|
||||
### Option 2: Local development
|
||||
### Claude Endpoint
|
||||
|
||||
| Model | Default Mapping |
|
||||
| --- | --- |
|
||||
| `claude-sonnet-4-5` | `deepseek-chat` |
|
||||
| `claude-haiku-4-5` (compatible with `claude-3-5-haiku-latest`) | `deepseek-chat` |
|
||||
| `claude-opus-4-6` | `deepseek-reasoner` |
|
||||
|
||||
Override mapping via `claude_mapping` or `claude_model_mapping` in config.
|
||||
In addition, `/anthropic/v1/models` now includes historical Claude 1.x/2.x/3.x/4.x IDs and common aliases for legacy client compatibility.
|
||||
|
||||
## Quick Start
|
||||
|
||||
### Option 1: Local Run
|
||||
|
||||
**Prerequisites**: Go 1.24+, Node.js 20+ (only if building WebUI locally)
|
||||
|
||||
```bash
|
||||
# 1. Clone the repo
|
||||
# 1. Clone
|
||||
git clone https://github.com/CJackHwang/ds2api.git
|
||||
cd ds2api
|
||||
|
||||
# 2. Install dependencies
|
||||
pip install -r requirements.txt
|
||||
|
||||
# 3. Configure accounts
|
||||
# 2. Configure
|
||||
cp config.example.json config.json
|
||||
# Edit config.json to add DeepSeek account info
|
||||
# Edit config.json with your DeepSeek account info and API keys
|
||||
|
||||
# 4. Start the service
|
||||
python dev.py
|
||||
# 3. Start
|
||||
go run ./cmd/ds2api
|
||||
```
|
||||
|
||||
Visit `http://localhost:5001` after startup.
|
||||
Default URL: `http://localhost:5001`
|
||||
|
||||
## ⚙️ Configuration
|
||||
> **WebUI auto-build**: On first local startup, if `static/admin` is missing, DS2API will auto-run `npm install && npm run build` (requires Node.js). You can also build manually: `./scripts/build-webui.sh`
|
||||
|
||||
### Environment variables
|
||||
### Option 2: Docker
|
||||
|
||||
| Variable | Description | Required |
|
||||
|-----|------|:----:|
|
||||
| `DS2API_ADMIN_KEY` | Admin console password | Required on Vercel |
|
||||
| `DS2API_CONFIG_JSON` | Config JSON or Base64 | Optional |
|
||||
| `VERCEL_TOKEN` | Vercel API token (for sync) | Optional |
|
||||
| `VERCEL_PROJECT_ID` | Vercel project ID | Optional |
|
||||
| `PORT` | Service port (default 5001) | Optional |
|
||||
```bash
|
||||
# 1. Configure environment
|
||||
cp .env.example .env
|
||||
# Edit .env
|
||||
|
||||
### Config file format (`config.json`)
|
||||
# 2. Start
|
||||
docker-compose up -d
|
||||
|
||||
# 3. View logs
|
||||
docker-compose logs -f
|
||||
```
|
||||
|
||||
Rebuild after updates: `docker-compose up -d --build`
|
||||
|
||||
### Option 3: Vercel
|
||||
|
||||
1. Fork this repo to your GitHub account
|
||||
2. Import the project on Vercel
|
||||
3. Set environment variables (minimum: `DS2API_ADMIN_KEY` and `DS2API_CONFIG_JSON`)
|
||||
4. Deploy
|
||||
|
||||
> **Streaming note**: `/v1/chat/completions` on Vercel is routed to `api/chat-stream.js` (Node Runtime) for real-time SSE. Auth, account selection, and session/PoW preparation are still handled by the Go internal prepare endpoint; streaming output (including `tools`) is assembled on Node with Go-aligned anti-leak handling.
|
||||
|
||||
For detailed deployment instructions, see the [Deployment Guide](DEPLOY.en.md).
|
||||
|
||||
### Option 4: Download Release Binaries
|
||||
|
||||
GitHub Actions automatically builds multi-platform archives on each Release:
|
||||
|
||||
```bash
|
||||
# After downloading the archive for your platform
|
||||
tar -xzf ds2api_v1.7.0_linux_amd64.tar.gz
|
||||
cd ds2api_v1.7.0_linux_amd64
|
||||
cp config.example.json config.json
|
||||
# Edit config.json
|
||||
./ds2api
|
||||
```
|
||||
|
||||
### Option 5: OpenCode CLI
|
||||
|
||||
1. Copy the example config:
|
||||
|
||||
```bash
|
||||
cp opencode.json.example opencode.json
|
||||
```
|
||||
|
||||
2. Edit `opencode.json`:
|
||||
- Set `baseURL` to your DS2API endpoint (for example, `https://your-domain.com/v1`)
|
||||
- Set `apiKey` to your DS2API key (from `config.keys`)
|
||||
|
||||
3. Start OpenCode CLI in the project directory (run `opencode` using your installed method).
|
||||
|
||||
> Recommended: use the OpenAI-compatible path (`/v1/*`) via `@ai-sdk/openai-compatible` as shown in the example.
|
||||
|
||||
## Configuration
|
||||
|
||||
### `config.json` Example
|
||||
|
||||
```json
|
||||
{
|
||||
@@ -109,125 +183,158 @@ Visit `http://localhost:5001` after startup.
|
||||
"password": "your-password",
|
||||
"token": ""
|
||||
}
|
||||
]
|
||||
],
|
||||
"claude_model_mapping": {
|
||||
"fast": "deepseek-chat",
|
||||
"slow": "deepseek-reasoner"
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
> **Notes**:
|
||||
> - `keys`: Custom API keys for calling this service
|
||||
> - `accounts`: DeepSeek Web accounts (email or mobile)
|
||||
> - `token`: Leave blank; DS2API will fetch and refresh automatically
|
||||
- `keys`: API access keys; clients authenticate via `Authorization: Bearer <key>`
|
||||
- `accounts`: DeepSeek account list, supports `email` or `mobile` login
|
||||
- `token`: Leave empty for auto-login on first request; or pre-fill an existing token
|
||||
- `claude_model_mapping`: Maps `fast`/`slow` suffixes to corresponding DeepSeek models
|
||||
|
||||
## 📡 API Usage
|
||||
### Environment Variables
|
||||
|
||||
See **[API.md](API.md)** for full API documentation.
|
||||
| Variable | Purpose | Default |
|
||||
| --- | --- | --- |
|
||||
| `PORT` | Service port | `5001` |
|
||||
| `LOG_LEVEL` | Log level | `INFO` (`DEBUG`/`WARN`/`ERROR`) |
|
||||
| `DS2API_ADMIN_KEY` | Admin login key | `admin` |
|
||||
| `DS2API_JWT_SECRET` | Admin JWT signing secret | Same as `DS2API_ADMIN_KEY` |
|
||||
| `DS2API_JWT_EXPIRE_HOURS` | Admin JWT TTL in hours | `24` |
|
||||
| `DS2API_CONFIG_PATH` | Config file path | `config.json` |
|
||||
| `DS2API_CONFIG_JSON` | Inline config (JSON or Base64) | — |
|
||||
| `DS2API_WASM_PATH` | PoW WASM file path | Auto-detect |
|
||||
| `DS2API_STATIC_ADMIN_DIR` | Admin static assets dir | `static/admin` |
|
||||
| `DS2API_AUTO_BUILD_WEBUI` | Auto-build WebUI on startup | Enabled locally, disabled on Vercel |
|
||||
| `DS2API_ACCOUNT_MAX_INFLIGHT` | Max in-flight requests per account | `2` |
|
||||
| `DS2API_ACCOUNT_CONCURRENCY` | Alias (legacy compat) | — |
|
||||
| `DS2API_ACCOUNT_MAX_QUEUE` | Waiting queue limit | `recommended_concurrency` |
|
||||
| `DS2API_ACCOUNT_QUEUE_SIZE` | Alias (legacy compat) | — |
|
||||
| `DS2API_VERCEL_INTERNAL_SECRET` | Vercel hybrid streaming internal auth | Falls back to `DS2API_ADMIN_KEY` |
|
||||
| `DS2API_VERCEL_STREAM_LEASE_TTL_SECONDS` | Stream lease TTL seconds | `900` |
|
||||
| `VERCEL_TOKEN` | Vercel sync token | — |
|
||||
| `VERCEL_PROJECT_ID` | Vercel project ID | — |
|
||||
| `VERCEL_TEAM_ID` | Vercel team ID | — |
|
||||
| `DS2API_VERCEL_PROTECTION_BYPASS` | Vercel deployment protection bypass for internal Node→Go calls | — |
|
||||
|
||||
### Quick examples
|
||||
## Authentication Modes
|
||||
|
||||
**List models**:
|
||||
```bash
|
||||
curl http://localhost:5001/v1/models
|
||||
For business endpoints (`/v1/*`, `/anthropic/*`), DS2API supports two modes:
|
||||
|
||||
| Mode | Description |
|
||||
| --- | --- |
|
||||
| **Managed account** | Use a key from `config.keys` via `Authorization: Bearer ...` or `x-api-key`; DS2API auto-selects an account |
|
||||
| **Direct token** | If the token is not in `config.keys`, DS2API treats it as a DeepSeek token directly |
|
||||
|
||||
Optional header `X-Ds2-Target-Account`: Pin a specific managed account (value is email or mobile).
|
||||
|
||||
## Concurrency Model
|
||||
|
||||
```
|
||||
Per-account inflight = DS2API_ACCOUNT_MAX_INFLIGHT (default 2)
|
||||
Recommended concurrency = account_count × per_account_inflight
|
||||
Queue limit = DS2API_ACCOUNT_MAX_QUEUE (default = recommended concurrency)
|
||||
429 threshold = inflight + queue ≈ account_count × 4
|
||||
```
|
||||
|
||||
**OpenAI-compatible call**:
|
||||
```bash
|
||||
curl http://localhost:5001/v1/chat/completions \
|
||||
-H "Authorization: Bearer your-api-key" \
|
||||
-H "Content-Type: application/json" \
|
||||
-d '{
|
||||
"model": "deepseek-chat",
|
||||
"messages": [{"role": "user", "content": "Hello"}],
|
||||
"stream": true
|
||||
}'
|
||||
- When inflight slots are full, requests enter a waiting queue — **no immediate 429**
|
||||
- 429 is returned only when total load exceeds inflight + queue capacity
|
||||
- `GET /admin/queue/status` returns real-time concurrency state
|
||||
|
||||
## Tool Call Adaptation
|
||||
|
||||
When `tools` is present in the request, DS2API performs anti-leak handling:
|
||||
|
||||
1. With `stream=true`, DS2API **buffers** text deltas first
|
||||
2. If a tool call is detected → only structured `tool_calls` are emitted, raw JSON is not leaked
|
||||
3. If no tool call → buffered text is emitted at once
|
||||
4. Parser supports mixed text, fenced JSON, and `function.arguments` payloads
|
||||
|
||||
## Project Structure
|
||||
|
||||
```text
|
||||
ds2api/
|
||||
├── cmd/
|
||||
│ ├── ds2api/ # Local / container entrypoint
|
||||
│ └── ds2api-tests/ # End-to-end testsuite entrypoint
|
||||
├── api/
|
||||
│ ├── index.go # Vercel Serverless Go entry
|
||||
│ ├── chat-stream.js # Vercel Node.js stream relay
|
||||
│ └── helpers/ # Node.js helper modules
|
||||
├── internal/
|
||||
│ ├── account/ # Account pool and concurrency queue
|
||||
│ ├── adapter/
|
||||
│ │ ├── openai/ # OpenAI adapter (incl. tool call parsing, Vercel stream prepare/release)
|
||||
│ │ └── claude/ # Claude adapter
|
||||
│ ├── admin/ # Admin API handlers
|
||||
│ ├── auth/ # Auth and JWT
|
||||
│ ├── config/ # Config loading and hot-reload
|
||||
│ ├── deepseek/ # DeepSeek API client, PoW WASM
|
||||
│ ├── server/ # HTTP routing and middleware (chi router)
|
||||
│ ├── sse/ # SSE parsing utilities
|
||||
│ ├── util/ # Common utilities
|
||||
│ └── webui/ # WebUI static file serving and auto-build
|
||||
├── webui/ # React WebUI source (Vite + Tailwind)
|
||||
│ └── src/
|
||||
│ ├── components/ # AccountManager / ApiTester / BatchImport / VercelSync / Login / LandingPage
|
||||
│ └── locales/ # Language packs (zh.json / en.json)
|
||||
├── scripts/
|
||||
│ ├── build-webui.sh # Manual WebUI build script
|
||||
│ └── testsuite/ # Testsuite runner scripts
|
||||
├── static/admin/ # WebUI build output (not committed to Git)
|
||||
├── .github/
|
||||
│ ├── workflows/ # GitHub Actions (Release artifact automation)
|
||||
│ ├── ISSUE_TEMPLATE/ # Issue templates
|
||||
│ └── PULL_REQUEST_TEMPLATE.md
|
||||
├── config.example.json # Config file template
|
||||
├── .env.example # Environment variable template
|
||||
├── Dockerfile # Multi-stage build (WebUI + Go)
|
||||
├── docker-compose.yml # Production Docker Compose
|
||||
├── docker-compose.dev.yml # Development Docker Compose
|
||||
├── vercel.json # Vercel routing and build config
|
||||
├── go.mod / go.sum # Go module dependencies
|
||||
└── version.txt # Version number
|
||||
```
|
||||
|
||||
**Claude-compatible call**:
|
||||
```bash
|
||||
curl http://localhost:5001/anthropic/v1/messages \
|
||||
-H "x-api-key: your-api-key" \
|
||||
-H "Content-Type: application/json" \
|
||||
-H "anthropic-version: 2023-06-01" \
|
||||
-d '{
|
||||
"model": "claude-sonnet-4-20250514",
|
||||
"max_tokens": 1024,
|
||||
"messages": [{"role": "user", "content": "Hello"}]
|
||||
}'
|
||||
```
|
||||
## Documentation Index
|
||||
|
||||
### Python SDK usage
|
||||
| Document | Description |
|
||||
| --- | --- |
|
||||
| [API.md](API.md) / [API.en.md](API.en.md) | API reference with request/response examples |
|
||||
| [DEPLOY.md](DEPLOY.md) / [DEPLOY.en.md](DEPLOY.en.md) | Deployment guide (local/Docker/Vercel/systemd) |
|
||||
| [CONTRIBUTING.md](CONTRIBUTING.md) / [CONTRIBUTING.en.md](CONTRIBUTING.en.md) | Contributing guide |
|
||||
| [TESTING.md](TESTING.md) | Testsuite guide |
|
||||
|
||||
```python
|
||||
from openai import OpenAI
|
||||
|
||||
client = OpenAI(
|
||||
api_key="your-api-key",
|
||||
base_url="http://localhost:5001/v1"
|
||||
)
|
||||
|
||||
response = client.chat.completions.create(
|
||||
model="deepseek-reasoner",
|
||||
messages=[{"role": "user", "content": "Explain quantum entanglement"}],
|
||||
stream=True
|
||||
)
|
||||
|
||||
for chunk in response:
|
||||
if chunk.choices[0].delta.content:
|
||||
print(chunk.choices[0].delta.content, end="")
|
||||
```
|
||||
|
||||
## 🔧 Deployment Notes
|
||||
|
||||
### Nginx reverse proxy
|
||||
|
||||
```nginx
|
||||
location / {
|
||||
proxy_pass http://localhost:5001;
|
||||
proxy_http_version 1.1;
|
||||
proxy_set_header Connection "";
|
||||
proxy_buffering off;
|
||||
proxy_cache off;
|
||||
chunked_transfer_encoding on;
|
||||
tcp_nopush on;
|
||||
tcp_nodelay on;
|
||||
keepalive_timeout 120;
|
||||
}
|
||||
```
|
||||
|
||||
### Option 3: Docker deployment
|
||||
## Testing
|
||||
|
||||
```bash
|
||||
# 1. Clone the repo and enter the directory
|
||||
git clone https://github.com/CJackHwang/ds2api.git
|
||||
cd ds2api
|
||||
# Unit tests
|
||||
go test ./...
|
||||
|
||||
# 2. Configure environment variables
|
||||
cp .env.example .env
|
||||
# Edit .env and fill in DS2API_ADMIN_KEY and DS2API_CONFIG_JSON
|
||||
# One-command live end-to-end tests (real accounts, full request/response logs)
|
||||
./scripts/testsuite/run-live.sh
|
||||
|
||||
# 3. Start the service
|
||||
docker-compose up -d
|
||||
|
||||
# 4. Check logs
|
||||
docker-compose logs -f
|
||||
# Or with custom flags
|
||||
go run ./cmd/ds2api-tests \
|
||||
--config config.json \
|
||||
--admin-key admin \
|
||||
--out artifacts/testsuite \
|
||||
--timeout 120 \
|
||||
--retries 2
|
||||
```
|
||||
|
||||
> **Docker advantage**: Zero-intrusion design; update the main code with `docker-compose up -d --build` without changing Docker configuration. See [DEPLOY.md](DEPLOY.md#docker-deployment-recommended).
|
||||
## Release Artifact Automation (GitHub Actions)
|
||||
|
||||
## ⚠️ Disclaimer
|
||||
Workflow: `.github/workflows/release-artifacts.yml`
|
||||
|
||||
**This project is based on reverse engineering and stability is not guaranteed.**
|
||||
- **Trigger**: only on GitHub Release `published` (normal pushes do not trigger builds)
|
||||
- **Outputs**: multi-platform archives (`linux/amd64`, `linux/arm64`, `darwin/amd64`, `darwin/arm64`, `windows/amd64`) + `sha256sums.txt`
|
||||
- **Each archive includes**: `ds2api` executable, `static/admin`, WASM file, config template, README, LICENSE
|
||||
|
||||
- For learning and research only. **No commercial use or public service is allowed.**
|
||||
- For production, use the official [DeepSeek API](https://platform.deepseek.com/)
|
||||
- You assume all risks from using this project
|
||||
## Disclaimer
|
||||
|
||||
## 📜 Acknowledgements
|
||||
|
||||
This project is based on the following open-source projects:
|
||||
|
||||
- [iidamie/deepseek2api](https://github.com/iidamie/deepseek2api)
|
||||
- [LLM-Red-Team/deepseek-free-api](https://github.com/LLM-Red-Team/deepseek-free-api)
|
||||
|
||||
## 📊 Star History
|
||||
|
||||
[](https://star-history.com/#CJackHwang/ds2api&Date)
|
||||
This project is built through reverse engineering and is provided for learning and research only. Stability is not guaranteed. Do not use it in scenarios that violate terms of service or laws.
|
||||
|
||||
188
TESTING.md
Normal file
188
TESTING.md
Normal file
@@ -0,0 +1,188 @@
|
||||
# DS2API 测试指南
|
||||
|
||||
语言 / Language: [中文 + English](TESTING.md)
|
||||
|
||||
## 概述 | Overview
|
||||
|
||||
DS2API 提供两个层级的测试:
|
||||
|
||||
| 层级 | 命令 | 说明 |
|
||||
| --- | --- | --- |
|
||||
| 单元测试 | `go test ./...` | 不需要真实账号 |
|
||||
| 端到端测试 | `./scripts/testsuite/run-live.sh` | 使用真实账号执行全链路测试 |
|
||||
|
||||
端到端测试集会录制完整的请求/响应日志,用于故障排查。
|
||||
|
||||
---
|
||||
|
||||
## 快速开始 | Quick Start
|
||||
|
||||
### 单元测试 | Unit Tests
|
||||
|
||||
```bash
|
||||
go test ./...
|
||||
```
|
||||
|
||||
```bash
|
||||
node --test api/helpers/stream-tool-sieve.test.js api/chat-stream.test.js
|
||||
```
|
||||
|
||||
### 端到端测试 | End-to-End Tests
|
||||
|
||||
```bash
|
||||
./scripts/testsuite/run-live.sh
|
||||
```
|
||||
|
||||
**默认行为**:
|
||||
|
||||
1. **Preflight 检查**:
|
||||
- `go test ./... -count=1`(单元测试)
|
||||
- `node --check api/chat-stream.js`(语法检查)
|
||||
- `node --check api/helpers/stream-tool-sieve.js`(语法检查)
|
||||
- `node --test api/helpers/stream-tool-sieve.test.js api/chat-stream.test.js`(Node 流式拦截单测)
|
||||
- `npm run build --prefix webui`(WebUI 构建检查)
|
||||
|
||||
2. **隔离启动**:复制 `config.json` 到临时目录,启动独立服务进程
|
||||
|
||||
3. **场景测试**:
|
||||
- ✅ OpenAI 非流式 / 流式
|
||||
- ✅ Claude 非流式 / 流式
|
||||
- ✅ Admin API(登录 / 配置 / 账号管理)
|
||||
- ✅ Tool Calling
|
||||
- ✅ 并发压力测试
|
||||
- ✅ Search 模型
|
||||
|
||||
4. **结果收集**:继续执行所有用例(不中断),写入最终汇总
|
||||
|
||||
---
|
||||
|
||||
## CLI 参数 | CLI Flags
|
||||
|
||||
```bash
|
||||
go run ./cmd/ds2api-tests \
|
||||
--config config.json \
|
||||
--admin-key admin \
|
||||
--out artifacts/testsuite \
|
||||
--port 0 \
|
||||
--timeout 120 \
|
||||
--retries 2 \
|
||||
--no-preflight=false \
|
||||
--keep 5
|
||||
```
|
||||
|
||||
| 参数 | 说明 | 默认值 |
|
||||
| --- | --- | --- |
|
||||
| `--config` | 配置文件路径 | `config.json` |
|
||||
| `--admin-key` | Admin 密钥 | `DS2API_ADMIN_KEY` 环境变量,回退 `admin` |
|
||||
| `--out` | 产物输出根目录 | `artifacts/testsuite` |
|
||||
| `--port` | 测试服务端口(`0` = 自动分配空闲端口) | `0` |
|
||||
| `--timeout` | 单个请求超时秒数 | `120` |
|
||||
| `--retries` | 网络/5xx 请求重试次数 | `2` |
|
||||
| `--no-preflight` | 跳过 preflight 检查 | `false` |
|
||||
| `--keep` | 保留最近几次测试结果(`0` = 全部保留) | `5` |
|
||||
|
||||
---
|
||||
|
||||
## 自动清理 | Auto Cleanup
|
||||
|
||||
每次测试运行完成后,程序会自动扫描输出目录(`--out`),按时间排序保留最近 `--keep` 次运行的结果,超出部分自动删除。
|
||||
|
||||
- 默认保留 **5** 次
|
||||
- 设置 `--keep 0` 可关闭自动清理
|
||||
- 被删除的旧运行目录会打印日志提示
|
||||
|
||||
---
|
||||
|
||||
## 产物结构 | Artifact Layout
|
||||
|
||||
每次运行会创建一个以运行 ID 命名的目录:
|
||||
|
||||
```text
|
||||
artifacts/testsuite/<run_id>/
|
||||
├── summary.json # 机器可读报告
|
||||
├── summary.md # 人类可读报告
|
||||
├── server.log # 测试期间服务端日志
|
||||
├── preflight.log # Preflight 命令输出
|
||||
└── cases/
|
||||
└── <case_id>/
|
||||
├── request.json # 请求体
|
||||
├── response.headers # 响应头
|
||||
├── response.body # 响应体
|
||||
├── stream.raw # 原始 SSE 数据(流式用例)
|
||||
├── assertions.json # 断言结果
|
||||
└── meta.json # 元信息(耗时、状态码等)
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Trace 关联 | Trace Binding
|
||||
|
||||
每个测试请求自动注入 trace 信息,便于快速定位问题:
|
||||
|
||||
| 位置 | 格式 |
|
||||
| --- | --- |
|
||||
| 请求头 | `X-Ds2-Test-Trace: <trace_id>` |
|
||||
| 查询参数 | `__trace_id=<trace_id>` |
|
||||
|
||||
当用例失败时,`summary.md` 中会包含 trace ID。你可以快速搜索对应的服务端日志:
|
||||
|
||||
```bash
|
||||
rg "<trace_id>" artifacts/testsuite/<run_id>/server.log
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 退出码 | Exit Code
|
||||
|
||||
| 退出码 | 含义 |
|
||||
| --- | --- |
|
||||
| `0` | 所有用例通过 ✅ |
|
||||
| `1` | 有用例失败 ❌ |
|
||||
|
||||
可将测试集作为本地发布门禁使用(CI/CD 集成)。
|
||||
|
||||
---
|
||||
|
||||
## 安全提醒 | Sensitive Data Warning
|
||||
|
||||
⚠️ 测试集会存储**完整的原始请求/响应载荷**用于调试。
|
||||
|
||||
- **不要**将 artifacts 目录上传到公开仓库
|
||||
- **不要**在 Issue tracker 中分享未脱敏的 artifact 文件
|
||||
- 如需分享日志,请先手动清除敏感信息(token、密码等)
|
||||
|
||||
---
|
||||
|
||||
## 常见用法 | Common Usage
|
||||
|
||||
### 仅跑单元测试
|
||||
|
||||
```bash
|
||||
go test ./...
|
||||
```
|
||||
|
||||
### 跑端到端测试(跳过 preflight)
|
||||
|
||||
```bash
|
||||
go run ./cmd/ds2api-tests --no-preflight
|
||||
```
|
||||
|
||||
### 指定输出目录和超时
|
||||
|
||||
```bash
|
||||
go run ./cmd/ds2api-tests \
|
||||
--out /tmp/ds2api-test \
|
||||
--timeout 60
|
||||
```
|
||||
|
||||
### 在 CI 中使用
|
||||
|
||||
```bash
|
||||
# 确保 config.json 存在且包含有效测试账号
|
||||
./scripts/testsuite/run-live.sh
|
||||
exit_code=$?
|
||||
if [ $exit_code -ne 0 ]; then
|
||||
echo "Tests failed! Check artifacts for details."
|
||||
exit 1
|
||||
fi
|
||||
```
|
||||
770
api/chat-stream.js
Normal file
770
api/chat-stream.js
Normal file
@@ -0,0 +1,770 @@
|
||||
'use strict';
|
||||
|
||||
const {
|
||||
extractToolNames,
|
||||
createToolSieveState,
|
||||
processToolSieveChunk,
|
||||
flushToolSieve,
|
||||
parseToolCalls,
|
||||
formatOpenAIStreamToolCalls,
|
||||
} = require('./helpers/stream-tool-sieve');
|
||||
|
||||
const DEEPSEEK_COMPLETION_URL = 'https://chat.deepseek.com/api/v0/chat/completion';
|
||||
|
||||
const BASE_HEADERS = {
|
||||
Host: 'chat.deepseek.com',
|
||||
'User-Agent': 'DeepSeek/1.6.11 Android/35',
|
||||
Accept: 'application/json',
|
||||
'Content-Type': 'application/json',
|
||||
'x-client-platform': 'android',
|
||||
'x-client-version': '1.6.11',
|
||||
'x-client-locale': 'zh_CN',
|
||||
'accept-charset': 'UTF-8',
|
||||
};
|
||||
|
||||
const SKIP_PATTERNS = [
|
||||
'quasi_status',
|
||||
'elapsed_secs',
|
||||
'token_usage',
|
||||
'pending_fragment',
|
||||
'conversation_mode',
|
||||
'fragments/-1/status',
|
||||
'fragments/-2/status',
|
||||
'fragments/-3/status',
|
||||
];
|
||||
|
||||
module.exports = async function handler(req, res) {
|
||||
setCorsHeaders(res);
|
||||
if (req.method === 'OPTIONS') {
|
||||
res.statusCode = 204;
|
||||
res.end();
|
||||
return;
|
||||
}
|
||||
if (req.method !== 'POST') {
|
||||
writeOpenAIError(res, 405, 'method not allowed');
|
||||
return;
|
||||
}
|
||||
|
||||
const rawBody = await readRawBody(req);
|
||||
|
||||
// Hard guard: only use Node data path for streaming on Vercel runtime.
|
||||
// Any non-Vercel runtime always falls back to Go for full behavior parity.
|
||||
if (!isVercelRuntime()) {
|
||||
await proxyToGo(req, res, rawBody);
|
||||
return;
|
||||
}
|
||||
|
||||
let payload;
|
||||
try {
|
||||
payload = JSON.parse(rawBody.toString('utf8') || '{}');
|
||||
} catch (_err) {
|
||||
writeOpenAIError(res, 400, 'invalid json');
|
||||
return;
|
||||
}
|
||||
|
||||
// Keep all non-stream behavior on Go side to avoid compatibility regressions.
|
||||
if (!toBool(payload.stream)) {
|
||||
await proxyToGo(req, res, rawBody);
|
||||
return;
|
||||
}
|
||||
|
||||
const prep = await fetchStreamPrepare(req, rawBody);
|
||||
if (!prep.ok) {
|
||||
relayPreparedFailure(res, prep);
|
||||
return;
|
||||
}
|
||||
|
||||
const model = asString(prep.body.model) || asString(payload.model);
|
||||
const sessionID = asString(prep.body.session_id) || `chatcmpl-${Date.now()}`;
|
||||
const leaseID = asString(prep.body.lease_id);
|
||||
const deepseekToken = asString(prep.body.deepseek_token);
|
||||
const powHeader = asString(prep.body.pow_header);
|
||||
const completionPayload = prep.body.payload && typeof prep.body.payload === 'object' ? prep.body.payload : null;
|
||||
const finalPrompt = asString(prep.body.final_prompt);
|
||||
const thinkingEnabled = toBool(prep.body.thinking_enabled);
|
||||
const searchEnabled = toBool(prep.body.search_enabled);
|
||||
const toolNames = extractToolNames(payload.tools);
|
||||
|
||||
if (!model || !leaseID || !deepseekToken || !powHeader || !completionPayload) {
|
||||
writeOpenAIError(res, 500, 'invalid vercel prepare response');
|
||||
return;
|
||||
}
|
||||
const releaseLease = createLeaseReleaser(req, leaseID);
|
||||
try {
|
||||
const completionRes = await fetch(DEEPSEEK_COMPLETION_URL, {
|
||||
method: 'POST',
|
||||
headers: {
|
||||
...BASE_HEADERS,
|
||||
authorization: `Bearer ${deepseekToken}`,
|
||||
'x-ds-pow-response': powHeader,
|
||||
},
|
||||
body: JSON.stringify(completionPayload),
|
||||
});
|
||||
|
||||
if (!completionRes.ok || !completionRes.body) {
|
||||
const detail = await safeReadText(completionRes);
|
||||
writeOpenAIError(res, 500, detail ? `Failed to get completion: ${detail}` : 'Failed to get completion.');
|
||||
return;
|
||||
}
|
||||
|
||||
res.statusCode = 200;
|
||||
res.setHeader('Content-Type', 'text/event-stream');
|
||||
res.setHeader('Cache-Control', 'no-cache, no-transform');
|
||||
res.setHeader('Connection', 'keep-alive');
|
||||
res.setHeader('X-Accel-Buffering', 'no');
|
||||
if (typeof res.flushHeaders === 'function') {
|
||||
res.flushHeaders();
|
||||
}
|
||||
|
||||
const created = Math.floor(Date.now() / 1000);
|
||||
let firstChunkSent = false;
|
||||
let currentType = thinkingEnabled ? 'thinking' : 'text';
|
||||
let thinkingText = '';
|
||||
let outputText = '';
|
||||
const toolSieveEnabled = toolNames.length > 0;
|
||||
const toolSieveState = createToolSieveState();
|
||||
let toolCallsEmitted = false;
|
||||
const decoder = new TextDecoder();
|
||||
const reader = completionRes.body.getReader();
|
||||
let buffered = '';
|
||||
let ended = false;
|
||||
|
||||
const sendFrame = (obj) => {
|
||||
res.write(`data: ${JSON.stringify(obj)}\n\n`);
|
||||
if (typeof res.flush === 'function') {
|
||||
res.flush();
|
||||
}
|
||||
};
|
||||
|
||||
const sendDeltaFrame = (delta) => {
|
||||
const payloadDelta = { ...delta };
|
||||
if (!firstChunkSent) {
|
||||
payloadDelta.role = 'assistant';
|
||||
firstChunkSent = true;
|
||||
}
|
||||
sendFrame({
|
||||
id: sessionID,
|
||||
object: 'chat.completion.chunk',
|
||||
created,
|
||||
model,
|
||||
choices: [{ delta: payloadDelta, index: 0 }],
|
||||
});
|
||||
};
|
||||
|
||||
const finish = async (reason) => {
|
||||
if (ended) {
|
||||
return;
|
||||
}
|
||||
ended = true;
|
||||
const detected = parseToolCalls(outputText, toolNames);
|
||||
if (detected.length > 0 && !toolCallsEmitted) {
|
||||
toolCallsEmitted = true;
|
||||
sendDeltaFrame({ tool_calls: formatOpenAIStreamToolCalls(detected) });
|
||||
} else if (toolSieveEnabled) {
|
||||
const tailEvents = flushToolSieve(toolSieveState, toolNames);
|
||||
for (const evt of tailEvents) {
|
||||
if (evt.text) {
|
||||
sendDeltaFrame({ content: evt.text });
|
||||
}
|
||||
}
|
||||
}
|
||||
if (detected.length > 0 || toolCallsEmitted) {
|
||||
reason = 'tool_calls';
|
||||
}
|
||||
sendFrame({
|
||||
id: sessionID,
|
||||
object: 'chat.completion.chunk',
|
||||
created,
|
||||
model,
|
||||
choices: [{ delta: {}, index: 0, finish_reason: reason }],
|
||||
usage: buildUsage(finalPrompt, thinkingText, outputText),
|
||||
});
|
||||
res.write('data: [DONE]\n\n');
|
||||
await releaseLease();
|
||||
res.end();
|
||||
};
|
||||
|
||||
try {
|
||||
// eslint-disable-next-line no-constant-condition
|
||||
while (true) {
|
||||
const { value, done } = await reader.read();
|
||||
if (done) {
|
||||
break;
|
||||
}
|
||||
buffered += decoder.decode(value, { stream: true });
|
||||
const lines = buffered.split('\n');
|
||||
buffered = lines.pop() || '';
|
||||
|
||||
for (const rawLine of lines) {
|
||||
const line = rawLine.trim();
|
||||
if (!line.startsWith('data:')) {
|
||||
continue;
|
||||
}
|
||||
const dataStr = line.slice(5).trim();
|
||||
if (!dataStr) {
|
||||
continue;
|
||||
}
|
||||
if (dataStr === '[DONE]') {
|
||||
await finish('stop');
|
||||
return;
|
||||
}
|
||||
let chunk;
|
||||
try {
|
||||
chunk = JSON.parse(dataStr);
|
||||
} catch (_err) {
|
||||
continue;
|
||||
}
|
||||
if (chunk.error || chunk.code === 'content_filter') {
|
||||
await finish('content_filter');
|
||||
return;
|
||||
}
|
||||
const parsed = parseChunkForContent(chunk, thinkingEnabled, currentType);
|
||||
currentType = parsed.newType;
|
||||
if (parsed.finished) {
|
||||
await finish('stop');
|
||||
return;
|
||||
}
|
||||
|
||||
for (const p of parsed.parts) {
|
||||
if (!p.text) {
|
||||
continue;
|
||||
}
|
||||
if (searchEnabled && isCitation(p.text)) {
|
||||
continue;
|
||||
}
|
||||
if (p.type === 'thinking') {
|
||||
if (thinkingEnabled) {
|
||||
thinkingText += p.text;
|
||||
sendDeltaFrame({ reasoning_content: p.text });
|
||||
}
|
||||
} else {
|
||||
outputText += p.text;
|
||||
if (!toolSieveEnabled) {
|
||||
sendDeltaFrame({ content: p.text });
|
||||
continue;
|
||||
}
|
||||
const events = processToolSieveChunk(toolSieveState, p.text, toolNames);
|
||||
for (const evt of events) {
|
||||
if (evt.type === 'tool_calls') {
|
||||
toolCallsEmitted = true;
|
||||
sendDeltaFrame({ tool_calls: formatOpenAIStreamToolCalls(evt.calls) });
|
||||
continue;
|
||||
}
|
||||
if (evt.text) {
|
||||
sendDeltaFrame({ content: evt.text });
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
await finish('stop');
|
||||
} catch (_err) {
|
||||
await finish('stop');
|
||||
}
|
||||
} finally {
|
||||
await releaseLease();
|
||||
}
|
||||
};
|
||||
|
||||
function setCorsHeaders(res) {
|
||||
res.setHeader('Access-Control-Allow-Origin', '*');
|
||||
res.setHeader('Access-Control-Allow-Methods', 'GET, POST, OPTIONS, PUT, DELETE');
|
||||
res.setHeader(
|
||||
'Access-Control-Allow-Headers',
|
||||
'Content-Type, Authorization, X-API-Key, X-Ds2-Target-Account, X-Vercel-Protection-Bypass',
|
||||
);
|
||||
}
|
||||
|
||||
function header(req, key) {
|
||||
if (!req || !req.headers) {
|
||||
return '';
|
||||
}
|
||||
return asString(req.headers[key.toLowerCase()]);
|
||||
}
|
||||
|
||||
async function readRawBody(req) {
|
||||
if (Buffer.isBuffer(req.body)) {
|
||||
return req.body;
|
||||
}
|
||||
if (typeof req.body === 'string') {
|
||||
return Buffer.from(req.body);
|
||||
}
|
||||
if (req.body && typeof req.body === 'object') {
|
||||
return Buffer.from(JSON.stringify(req.body));
|
||||
}
|
||||
const chunks = [];
|
||||
for await (const chunk of req) {
|
||||
chunks.push(Buffer.isBuffer(chunk) ? chunk : Buffer.from(chunk));
|
||||
}
|
||||
return Buffer.concat(chunks);
|
||||
}
|
||||
|
||||
async function fetchStreamPrepare(req, rawBody) {
|
||||
const url = buildInternalGoURL(req);
|
||||
url.searchParams.set('__stream_prepare', '1');
|
||||
|
||||
const upstream = await fetch(url.toString(), {
|
||||
method: 'POST',
|
||||
headers: buildInternalGoHeaders(req, { withInternalToken: true, withContentType: true }),
|
||||
body: rawBody,
|
||||
});
|
||||
|
||||
const text = await upstream.text();
|
||||
let body = {};
|
||||
try {
|
||||
body = JSON.parse(text || '{}');
|
||||
} catch (_err) {
|
||||
body = {};
|
||||
}
|
||||
|
||||
return {
|
||||
ok: upstream.ok,
|
||||
status: upstream.status,
|
||||
contentType: upstream.headers.get('content-type') || 'application/json',
|
||||
text,
|
||||
body,
|
||||
};
|
||||
}
|
||||
|
||||
function relayPreparedFailure(res, prep) {
|
||||
if (prep.status === 401 && looksLikeVercelAuthPage(prep.text)) {
|
||||
writeOpenAIError(
|
||||
res,
|
||||
401,
|
||||
'Vercel Deployment Protection blocked internal prepare request. Disable protection for this deployment or set VERCEL_AUTOMATION_BYPASS_SECRET.',
|
||||
);
|
||||
return;
|
||||
}
|
||||
res.statusCode = prep.status || 500;
|
||||
res.setHeader('Content-Type', prep.contentType || 'application/json');
|
||||
if (prep.text) {
|
||||
res.end(prep.text);
|
||||
return;
|
||||
}
|
||||
writeOpenAIError(res, prep.status || 500, 'vercel prepare failed');
|
||||
}
|
||||
|
||||
async function safeReadText(resp) {
|
||||
if (!resp) {
|
||||
return '';
|
||||
}
|
||||
try {
|
||||
const text = await resp.text();
|
||||
return text.trim();
|
||||
} catch (_err) {
|
||||
return '';
|
||||
}
|
||||
}
|
||||
|
||||
function internalSecret() {
|
||||
return asString(process.env.DS2API_VERCEL_INTERNAL_SECRET) || asString(process.env.DS2API_ADMIN_KEY) || 'admin';
|
||||
}
|
||||
|
||||
function buildInternalGoURL(req) {
|
||||
const proto = asString(header(req, 'x-forwarded-proto')) || 'https';
|
||||
const host = asString(header(req, 'host'));
|
||||
const url = new URL(`${proto}://${host}${req.url || '/v1/chat/completions'}`);
|
||||
url.searchParams.set('__go', '1');
|
||||
const protectionBypass = resolveProtectionBypass(req);
|
||||
if (protectionBypass) {
|
||||
url.searchParams.set('x-vercel-protection-bypass', protectionBypass);
|
||||
}
|
||||
return url;
|
||||
}
|
||||
|
||||
function buildInternalGoHeaders(req, opts = {}) {
|
||||
const headers = {
|
||||
authorization: asString(header(req, 'authorization')),
|
||||
'x-api-key': asString(header(req, 'x-api-key')),
|
||||
'x-ds2-target-account': asString(header(req, 'x-ds2-target-account')),
|
||||
'x-vercel-protection-bypass': resolveProtectionBypass(req),
|
||||
};
|
||||
if (opts.withInternalToken) {
|
||||
headers['x-ds2-internal-token'] = internalSecret();
|
||||
}
|
||||
if (opts.withContentType) {
|
||||
headers['content-type'] = asString(header(req, 'content-type')) || 'application/json';
|
||||
}
|
||||
return headers;
|
||||
}
|
||||
|
||||
function createLeaseReleaser(req, leaseID) {
|
||||
let released = false;
|
||||
return async () => {
|
||||
if (released || !leaseID) {
|
||||
return;
|
||||
}
|
||||
released = true;
|
||||
try {
|
||||
await releaseStreamLease(req, leaseID);
|
||||
} catch (_err) {
|
||||
// Ignore release errors. Lease TTL cleanup on Go side still prevents permanent leaks.
|
||||
}
|
||||
};
|
||||
}
|
||||
|
||||
async function releaseStreamLease(req, leaseID) {
|
||||
const url = buildInternalGoURL(req);
|
||||
url.searchParams.set('__stream_release', '1');
|
||||
const body = Buffer.from(JSON.stringify({ lease_id: leaseID }));
|
||||
|
||||
const controller = new AbortController();
|
||||
const timeout = setTimeout(() => controller.abort(), 1500);
|
||||
try {
|
||||
await fetch(url.toString(), {
|
||||
method: 'POST',
|
||||
headers: buildInternalGoHeaders(req, { withInternalToken: true, withContentType: true }),
|
||||
body,
|
||||
signal: controller.signal,
|
||||
});
|
||||
} finally {
|
||||
clearTimeout(timeout);
|
||||
}
|
||||
}
|
||||
|
||||
function resolveProtectionBypass(req) {
|
||||
const fromHeader = asString(header(req, 'x-vercel-protection-bypass'));
|
||||
if (fromHeader) {
|
||||
return fromHeader;
|
||||
}
|
||||
return asString(process.env.VERCEL_AUTOMATION_BYPASS_SECRET) || asString(process.env.DS2API_VERCEL_PROTECTION_BYPASS);
|
||||
}
|
||||
|
||||
function looksLikeVercelAuthPage(text) {
|
||||
const body = asString(text).toLowerCase();
|
||||
if (!body) {
|
||||
return false;
|
||||
}
|
||||
return body.includes('authentication required') && body.includes('vercel');
|
||||
}
|
||||
|
||||
function parseChunkForContent(chunk, thinkingEnabled, currentType) {
|
||||
if (!chunk || typeof chunk !== 'object' || !Object.prototype.hasOwnProperty.call(chunk, 'v')) {
|
||||
return { parts: [], finished: false, newType: currentType };
|
||||
}
|
||||
const pathValue = asString(chunk.p);
|
||||
if (shouldSkipPath(pathValue)) {
|
||||
return { parts: [], finished: false, newType: currentType };
|
||||
}
|
||||
if (pathValue === 'response/status' && asString(chunk.v) === 'FINISHED') {
|
||||
return { parts: [], finished: true, newType: currentType };
|
||||
}
|
||||
|
||||
let newType = currentType;
|
||||
const parts = [];
|
||||
|
||||
if (pathValue === 'response/fragments' && asString(chunk.o).toUpperCase() === 'APPEND' && Array.isArray(chunk.v)) {
|
||||
for (const frag of chunk.v) {
|
||||
if (!frag || typeof frag !== 'object') {
|
||||
continue;
|
||||
}
|
||||
const fragType = asString(frag.type).toUpperCase();
|
||||
const content = asString(frag.content);
|
||||
if (!content) {
|
||||
continue;
|
||||
}
|
||||
if (fragType === 'THINK' || fragType === 'THINKING') {
|
||||
newType = 'thinking';
|
||||
parts.push({ text: content, type: 'thinking' });
|
||||
} else if (fragType === 'RESPONSE') {
|
||||
newType = 'text';
|
||||
parts.push({ text: content, type: 'text' });
|
||||
} else {
|
||||
parts.push({ text: content, type: 'text' });
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
if (pathValue === 'response' && Array.isArray(chunk.v)) {
|
||||
for (const item of chunk.v) {
|
||||
if (!item || typeof item !== 'object') {
|
||||
continue;
|
||||
}
|
||||
if (item.p === 'fragments' && item.o === 'APPEND' && Array.isArray(item.v)) {
|
||||
for (const frag of item.v) {
|
||||
const fragType = asString(frag && frag.type).toUpperCase();
|
||||
if (fragType === 'THINK' || fragType === 'THINKING') {
|
||||
newType = 'thinking';
|
||||
} else if (fragType === 'RESPONSE') {
|
||||
newType = 'text';
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
let partType = 'text';
|
||||
if (pathValue === 'response/thinking_content') {
|
||||
partType = 'thinking';
|
||||
} else if (pathValue === 'response/content') {
|
||||
partType = 'text';
|
||||
} else if (pathValue.includes('response/fragments') && pathValue.includes('/content')) {
|
||||
partType = newType;
|
||||
} else if (!pathValue && thinkingEnabled) {
|
||||
partType = newType;
|
||||
}
|
||||
|
||||
const val = chunk.v;
|
||||
if (typeof val === 'string') {
|
||||
if (val === 'FINISHED' && (!pathValue || pathValue === 'status')) {
|
||||
return { parts: [], finished: true, newType };
|
||||
}
|
||||
if (val) {
|
||||
parts.push({ text: val, type: partType });
|
||||
}
|
||||
return { parts, finished: false, newType };
|
||||
}
|
||||
|
||||
if (Array.isArray(val)) {
|
||||
const extracted = extractContentRecursive(val, partType);
|
||||
if (extracted.finished) {
|
||||
return { parts: [], finished: true, newType };
|
||||
}
|
||||
parts.push(...extracted.parts);
|
||||
return { parts, finished: false, newType };
|
||||
}
|
||||
|
||||
if (val && typeof val === 'object') {
|
||||
const resp = val.response && typeof val.response === 'object' ? val.response : val;
|
||||
if (Array.isArray(resp.fragments)) {
|
||||
for (const frag of resp.fragments) {
|
||||
if (!frag || typeof frag !== 'object') {
|
||||
continue;
|
||||
}
|
||||
const content = asString(frag.content);
|
||||
if (!content) {
|
||||
continue;
|
||||
}
|
||||
const t = asString(frag.type).toUpperCase();
|
||||
if (t === 'THINK' || t === 'THINKING') {
|
||||
newType = 'thinking';
|
||||
parts.push({ text: content, type: 'thinking' });
|
||||
} else if (t === 'RESPONSE') {
|
||||
newType = 'text';
|
||||
parts.push({ text: content, type: 'text' });
|
||||
} else {
|
||||
parts.push({ text: content, type: partType });
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
return { parts, finished: false, newType };
|
||||
}
|
||||
|
||||
// Walk a list of DeepSeek delta items and collect their text payloads.
//
// Each item is expected to carry a path `p` and a value `v`; items without a
// `v` property are ignored (matching the Go parser). A nested
// {p:'status', v:'FINISHED'} item short-circuits the scan as finished.
// Fragment `type` markers (THINK/THINKING vs RESPONSE) override the type
// inferred from the item path, which in turn overrides `defaultType`.
function extractContentRecursive(items, defaultType) {
  const collected = [];

  // Map a fragment `type` marker onto an output part type.
  const classify = (rawType, fallback) => {
    const marker = asString(rawType).toUpperCase();
    if (marker === 'THINK' || marker === 'THINKING') {
      return 'thinking';
    }
    if (marker === 'RESPONSE') {
      return 'text';
    }
    return fallback;
  };

  for (const entry of items) {
    if (!entry || typeof entry !== 'object') {
      continue;
    }
    if (!Object.prototype.hasOwnProperty.call(entry, 'v')) {
      continue;
    }
    const entryPath = asString(entry.p);
    const entryValue = entry.v;

    if (entryPath === 'status' && asString(entryValue) === 'FINISHED') {
      return { parts: [], finished: true };
    }
    if (shouldSkipPath(entryPath)) {
      continue;
    }

    // Items carrying inline content use their own type marker directly.
    const inlineContent = asString(entry.content);
    if (inlineContent) {
      collected.push({ text: inlineContent, type: classify(entry.type, defaultType) });
      continue;
    }

    // Otherwise infer the part type from the item path.
    let inferredType = defaultType;
    if (entryPath.includes('thinking')) {
      inferredType = 'thinking';
    } else if (entryPath.includes('content') || entryPath === 'response' || entryPath === 'fragments') {
      inferredType = 'text';
    }

    if (typeof entryValue === 'string') {
      // Plain string payload; 'FINISHED' markers are swallowed here.
      if (entryValue && entryValue !== 'FINISHED') {
        collected.push({ text: entryValue, type: inferredType });
      }
      continue;
    }
    if (!Array.isArray(entryValue)) {
      continue;
    }

    // Nested arrays may mix bare strings and fragment objects.
    for (const nested of entryValue) {
      if (typeof nested === 'string') {
        if (nested) {
          collected.push({ text: nested, type: inferredType });
        }
        continue;
      }
      if (!nested || typeof nested !== 'object') {
        continue;
      }
      const nestedContent = asString(nested.content);
      if (nestedContent) {
        collected.push({ text: nestedContent, type: classify(nested.type, inferredType) });
      }
    }
  }
  return { parts: collected, finished: false };
}
|
||||
|
||||
// True when a chunk path carries metadata (e.g. search status) that must
// never be forwarded to the client as chat content.
function shouldSkipPath(pathValue) {
  if (pathValue === 'response/search_status') {
    return true;
  }
  return SKIP_PATTERNS.some((pattern) => pathValue.includes(pattern));
}
|
||||
|
||||
// True when the text starts with a DeepSeek citation marker ("[citation:").
function isCitation(text) {
  const normalized = asString(text).trim();
  return normalized.startsWith('[citation:');
}
|
||||
|
||||
// Build an OpenAI-style usage object from estimated token counts.
// Reasoning tokens are folded into completion_tokens and also reported
// separately under completion_tokens_details.
function buildUsage(prompt, thinking, output) {
  const inputCount = estimateTokens(prompt);
  const thoughtCount = estimateTokens(thinking);
  const answerCount = estimateTokens(output);
  return {
    prompt_tokens: inputCount,
    completion_tokens: thoughtCount + answerCount,
    total_tokens: inputCount + thoughtCount + answerCount,
    completion_tokens_details: {
      reasoning_tokens: thoughtCount,
    },
  };
}
|
||||
|
||||
// Rough token estimate: ~4 Unicode code points per token (Array.from counts
// code points, not UTF-16 units). Non-empty text always yields at least 1.
function estimateTokens(text) {
  const normalized = asString(text);
  if (!normalized) {
    return 0;
  }
  const approx = Math.floor(Array.from(normalized).length / 4);
  return Math.max(1, approx);
}
|
||||
|
||||
// Forward the raw request body to the internal Go backend and stream the
// upstream response back to the client unmodified.
//
// - Copies the upstream status and headers, dropping Content-Length because
//   the body is re-streamed (and possibly re-chunked) rather than sent as a
//   single buffer.
// - When the upstream body is not a WHATWG ReadableStream (no getReader),
//   falls back to buffering the entire payload before responding.
// - Flushes after each written chunk when the response supports flush() so
//   streaming (SSE) clients receive data promptly.
async function proxyToGo(req, res, rawBody) {
  const url = buildInternalGoURL(req);

  const upstream = await fetch(url.toString(), {
    method: 'POST',
    headers: buildInternalGoHeaders(req, { withContentType: true }),
    body: rawBody,
  });

  res.statusCode = upstream.status;
  upstream.headers.forEach((value, key) => {
    // Content-Length would be wrong once the body is re-streamed.
    if (key.toLowerCase() === 'content-length') {
      return;
    }
    res.setHeader(key, value);
  });

  if (!upstream.body || typeof upstream.body.getReader !== 'function') {
    // Non-streaming body: buffer it all and respond at once.
    const bytes = Buffer.from(await upstream.arrayBuffer());
    res.end(bytes);
    return;
  }

  const reader = upstream.body.getReader();
  try {
    // eslint-disable-next-line no-constant-condition
    while (true) {
      const { value, done } = await reader.read();
      if (done) {
        break;
      }
      if (value && value.length > 0) {
        res.write(Buffer.from(value));
        if (typeof res.flush === 'function') {
          res.flush();
        }
      }
    }
    res.end();
  } catch (_err) {
    // Reader error or client disconnect: just ensure the response is closed.
    if (!res.writableEnded) {
      res.end();
    }
  }
}
|
||||
|
||||
// Write an OpenAI-style JSON error body with the given HTTP status.
function writeOpenAIError(res, status, message) {
  const payload = {
    error: {
      message,
      type: openAIErrorType(status),
    },
  };
  res.statusCode = status;
  res.setHeader('Content-Type', 'application/json');
  res.end(JSON.stringify(payload));
}
|
||||
|
||||
// Map an HTTP status code onto an OpenAI error `type` string.
// Unknown 5xx codes become 'api_error'; all other unknown codes are
// treated as client errors.
function openAIErrorType(status) {
  const known = {
    400: 'invalid_request_error',
    401: 'authentication_error',
    403: 'permission_error',
    429: 'rate_limit_error',
    503: 'service_unavailable_error',
  };
  if (Object.prototype.hasOwnProperty.call(known, status)) {
    return known[status];
  }
  return status >= 500 ? 'api_error' : 'invalid_request_error';
}
|
||||
|
||||
// Strict boolean check: only the literal `true` passes (no truthy coercion).
function toBool(v) {
  return Object.is(v, true);
}
|
||||
|
||||
// Detect the Vercel serverless runtime via its well-known env markers.
function isVercelRuntime() {
  const markers = [process.env.VERCEL, process.env.NOW_REGION];
  return markers.some((value) => asString(value) !== '');
}
|
||||
|
||||
// Coerce an arbitrary value to a trimmed string.
// Arrays resolve to their first element (recursively); null/undefined
// become the empty string.
function asString(v) {
  if (v == null) {
    return '';
  }
  if (typeof v === 'string') {
    return v.trim();
  }
  if (Array.isArray(v)) {
    return asString(v[0]);
  }
  return String(v).trim();
}
|
||||
|
||||
// Test-only hooks: expose internal parser helpers to the unit tests in
// api/chat-stream.test.js without making them part of the public handler API.
module.exports.__test = {
  parseChunkForContent,
  extractContentRecursive,
  shouldSkipPath,
  asString,
};
|
||||
--- new file: api/chat-stream.test.js (+128 lines) @@ -0,0 +1,128 @@ ---
|
||||
'use strict';

// Unit tests for the chat-stream parser hooks (module.exports.__test),
// written for the built-in `node --test` runner.

const test = require('node:test');
const assert = require('node:assert/strict');

const handler = require('./chat-stream');
const {
  createToolSieveState,
  processToolSieveChunk,
  flushToolSieve,
} = require('./helpers/stream-tool-sieve');

const { parseChunkForContent } = handler.__test;

test('chat-stream exposes parser test hooks', () => {
  assert.equal(typeof parseChunkForContent, 'function');
});

test('parseChunkForContent keeps split response/content fragments inside response array', () => {
  const chunk = {
    p: 'response',
    v: [
      { p: 'response/content', v: '{"' },
      { p: 'response/content', v: 'tool_calls":[{"name":"read_file","input":{"path":"README.MD"}}]}' },
    ],
  };
  const parsed = parseChunkForContent(chunk, false, 'text');
  assert.equal(parsed.finished, false);
  assert.equal(parsed.newType, 'text');
  assert.equal(parsed.parts.length, 2);
  const combined = parsed.parts.map((p) => p.text).join('');
  assert.equal(combined, '{"tool_calls":[{"name":"read_file","input":{"path":"README.MD"}}]}');
});

// End-to-end: parser output piped through the tool sieve must never leak
// raw tool-call JSON as text.
test('parseChunkForContent + sieve does not leak suspicious prefix in split tool json case', () => {
  const chunk = {
    p: 'response',
    v: [
      { p: 'response/content', v: '{"' },
      { p: 'response/content', v: 'tool_calls":[{"name":"read_file","input":{"path":"README.MD"}}]}' },
    ],
  };
  const parsed = parseChunkForContent(chunk, false, 'text');
  const state = createToolSieveState();
  const events = [];
  for (const part of parsed.parts) {
    events.push(...processToolSieveChunk(state, part.text, ['read_file']));
  }
  events.push(...flushToolSieve(state, ['read_file']));

  const hasToolCalls = events.some((evt) => evt.type === 'tool_calls' && evt.calls && evt.calls.length > 0);
  const leakedText = events
    .filter((evt) => evt.type === 'text' && evt.text)
    .map((evt) => evt.text)
    .join('');

  assert.equal(hasToolCalls, true);
  assert.equal(leakedText.includes('{'), false);
  assert.equal(leakedText.toLowerCase().includes('tool_calls'), false);
});

test('parseChunkForContent consumes nested item.v array payloads', () => {
  const chunk = {
    p: 'response',
    v: [
      { p: 'response/content', v: ['A', 'B'] },
      { p: 'response/content', v: [{ content: 'C', type: 'RESPONSE' }] },
    ],
  };
  const parsed = parseChunkForContent(chunk, false, 'text');
  assert.equal(parsed.finished, false);
  assert.equal(parsed.parts.map((p) => p.text).join(''), 'ABC');
});

test('parseChunkForContent detects nested status FINISHED in array payload', () => {
  const chunk = {
    p: 'response',
    v: [{ p: 'status', v: 'FINISHED' }],
  };
  const parsed = parseChunkForContent(chunk, false, 'text');
  assert.equal(parsed.finished, true);
  assert.deepEqual(parsed.parts, []);
});

test('parseChunkForContent ignores items without v to match Go parser behavior', () => {
  const chunk = {
    p: 'response',
    v: [{ type: 'RESPONSE', content: 'no-v-content' }],
  };
  const parsed = parseChunkForContent(chunk, false, 'text');
  assert.equal(parsed.finished, false);
  assert.deepEqual(parsed.parts, []);
});

test('parseChunkForContent handles response/fragments APPEND with thinking and response transitions', () => {
  const chunk = {
    p: 'response/fragments',
    o: 'APPEND',
    v: [
      { type: 'THINK', content: '思考中' },
      { type: 'RESPONSE', content: '结论' },
    ],
  };
  const parsed = parseChunkForContent(chunk, true, 'thinking');
  assert.equal(parsed.finished, false);
  assert.equal(parsed.newType, 'text');
  assert.deepEqual(parsed.parts, [
    { text: '思考中', type: 'thinking' },
    { text: '结论', type: 'text' },
  ]);
});

test('parseChunkForContent supports wrapped response.fragments object shape', () => {
  const chunk = {
    p: 'response',
    v: {
      response: {
        fragments: [
          { type: 'RESPONSE', content: 'A' },
          { type: 'RESPONSE', content: 'B' },
        ],
      },
    },
  };
  const parsed = parseChunkForContent(chunk, false, 'text');
  assert.equal(parsed.finished, false);
  assert.equal(parsed.parts.map((p) => p.text).join(''), 'AB');
});
|
||||
--- new file: api/helpers/stream-tool-sieve.js (+477 lines) @@ -0,0 +1,477 @@ ---
|
||||
'use strict';

const crypto = require('crypto');

// Loose matcher for an inline {"tool_calls":[ ... ]} object, used as a
// last-resort parse candidate when stricter JSON extraction fails. The `s`
// flag lets `.` span newlines inside the array body.
const TOOL_CALL_PATTERN = /\{\s*["']tool_calls["']\s*:\s*\[(.*?)\]\s*\}/s;
|
||||
|
||||
// Collect declared tool names from an OpenAI-style `tools` array.
// Entries may be bare {name} objects or wrapped as {function:{name}}.
// Nameless object entries still yield the fallback "unknown" so tool mode
// stays enabled (parity with the Go injectToolPrompt behaviour).
function extractToolNames(tools) {
  if (!Array.isArray(tools) || tools.length === 0) {
    return [];
  }
  const names = [];
  for (const tool of tools) {
    if (!tool || typeof tool !== 'object') {
      continue;
    }
    const spec = tool.function && typeof tool.function === 'object' ? tool.function : tool;
    names.push(toStringSafe(spec.name) || 'unknown');
  }
  return names;
}
|
||||
|
||||
// Fresh sieve state: `pending` holds not-yet-classified text; `capture`
// accumulates a suspected tool-call JSON span while `capturing` is true.
function createToolSieveState() {
  const state = {
    pending: '',
    capture: '',
    capturing: false,
  };
  return state;
}
|
||||
|
||||
// Feed one streamed text chunk through the tool-call sieve state machine.
//
// The sieve withholds any span that is — or may still become — a
// `tool_calls` JSON block so raw tool JSON never reaches the client as
// plain text. Returns an ordered list of events of two shapes:
//   { type: 'text', text }        — safe text to forward as-is
//   { type: 'tool_calls', calls } — parsed tool calls
function processToolSieveChunk(state, chunk, toolNames) {
  if (!state) {
    return [];
  }
  if (chunk) {
    state.pending += chunk;
  }
  const events = [];
  // eslint-disable-next-line no-constant-condition
  while (true) {
    if (state.capturing) {
      // Capture mode: funnel everything into the capture buffer until a
      // complete tool JSON object can be consumed out of it.
      if (state.pending) {
        state.capture += state.pending;
        state.pending = '';
      }
      const consumed = consumeToolCapture(state.capture, toolNames);
      if (!consumed.ready) {
        break;
      }
      state.capture = '';
      state.capturing = false;
      if (consumed.prefix) {
        events.push({ type: 'text', text: consumed.prefix });
      }
      if (Array.isArray(consumed.calls) && consumed.calls.length > 0) {
        events.push({ type: 'tool_calls', calls: consumed.calls });
      }
      if (consumed.suffix) {
        // Text after the tool JSON is re-queued and re-scanned.
        state.pending += consumed.suffix;
      }
      continue;
    }

    if (!state.pending) {
      break;
    }

    // Explicit `tool_calls` key seen: emit the safe prefix, enter capture.
    const start = findToolSegmentStart(state.pending);
    if (start >= 0) {
      const prefix = state.pending.slice(0, start);
      if (prefix) {
        events.push({ type: 'text', text: prefix });
      }
      state.capture = state.pending.slice(start);
      state.pending = '';
      state.capturing = true;
      continue;
    }

    // No tool key yet: emit the provably-safe prefix and keep holding any
    // trailing '{' / '[' / '```' that might start a tool block.
    const [safe, hold] = splitSafeContentForToolDetection(state.pending);
    if (!safe) {
      break;
    }
    state.pending = hold;
    events.push({ type: 'text', text: safe });
  }
  return events;
}
|
||||
|
||||
// Finalize the sieve at end of stream.
//
// Drains pending text first, then resolves any open capture: a complete
// tool JSON is parsed and emitted; an INCOMPLETE captured tool JSON is
// deliberately suppressed so partial tool syntax never leaks to the client.
// Any remaining held-back plain text is released verbatim.
function flushToolSieve(state, toolNames) {
  if (!state) {
    return [];
  }
  const events = processToolSieveChunk(state, '', toolNames);
  if (state.capturing) {
    const consumed = consumeToolCapture(state.capture, toolNames);
    if (consumed.ready) {
      if (consumed.prefix) {
        events.push({ type: 'text', text: consumed.prefix });
      }
      if (Array.isArray(consumed.calls) && consumed.calls.length > 0) {
        events.push({ type: 'tool_calls', calls: consumed.calls });
      }
      if (consumed.suffix) {
        events.push({ type: 'text', text: consumed.suffix });
      }
    } else if (state.capture) {
      // Incomplete captured tool JSON at stream end: suppress raw capture.
    }
    state.capture = '';
    state.capturing = false;
  }
  if (state.pending) {
    events.push({ type: 'text', text: state.pending });
    state.pending = '';
  }
  return events;
}
|
||||
|
||||
// Split buffered text into [safe, hold]: `safe` can be emitted immediately,
// `hold` starts at the last suspicious marker ('{', '[' or '```') and must
// stay buffered in case it grows into a tool-call JSON block.
function splitSafeContentForToolDetection(s) {
  const text = s || '';
  if (!text) {
    return ['', ''];
  }
  const cut = findSuspiciousPrefixStart(text);
  if (cut < 0) {
    return [text, ''];
  }
  if (cut === 0) {
    // Suspicious content right at the start: hold everything until we can
    // either parse a full tool JSON block or reach stream flush.
    return ['', text];
  }
  return [text.slice(0, cut), text.slice(cut)];
}
|
||||
|
||||
// Index of the LAST occurrence of any marker that could begin a tool-call
// block ('{', '[' or '```'); -1 when none is present.
function findSuspiciousPrefixStart(s) {
  const markers = ['{', '[', '```'];
  let best = -1;
  for (const marker of markers) {
    best = Math.max(best, s.lastIndexOf(marker));
  }
  return best;
}
|
||||
|
||||
// Start index of a suspected tool-call segment: the '{' immediately
// enclosing a case-insensitive 'tool_calls' key, or the key position itself
// when no brace precedes it. Returns -1 when the key is absent.
function findToolSegmentStart(s) {
  if (!s) {
    return -1;
  }
  const keyIdx = s.toLowerCase().indexOf('tool_calls');
  if (keyIdx < 0) {
    return -1;
  }
  // lastIndexOf with fromIndex=keyIdx is safe: s[keyIdx] is 't'/'T', never '{'.
  const braceIdx = s.lastIndexOf('{', keyIdx);
  return braceIdx >= 0 ? braceIdx : keyIdx;
}
|
||||
|
||||
// Try to consume one complete {...tool_calls...} JSON object out of the
// captured buffer.
//
// Not ready (keep capturing) until the key, its enclosing '{' and a balanced
// closing '}' have all arrived. Once a complete object is present:
//   prefix — text before the object (safe to emit)
//   calls  — parsed tool calls; [] when strict parsing failed, in which case
//            the raw object body is still dropped so tool JSON cannot leak
//   suffix — text after the object (re-queued by the caller)
function consumeToolCapture(captured, toolNames) {
  if (!captured) {
    return { ready: false, prefix: '', calls: [], suffix: '' };
  }
  const lower = captured.toLowerCase();
  const keyIdx = lower.indexOf('tool_calls');
  if (keyIdx < 0) {
    return { ready: false, prefix: '', calls: [], suffix: '' };
  }
  const start = captured.slice(0, keyIdx).lastIndexOf('{');
  if (start < 0) {
    return { ready: false, prefix: '', calls: [], suffix: '' };
  }
  const obj = extractJSONObjectFrom(captured, start);
  if (!obj.ok) {
    return { ready: false, prefix: '', calls: [], suffix: '' };
  }
  const parsed = parseToolCalls(captured.slice(start, obj.end), toolNames);
  if (parsed.length === 0) {
    // `tool_calls` key exists but strict JSON parse failed.
    // Drop the captured object body to avoid leaking raw tool JSON.
    return {
      ready: true,
      prefix: captured.slice(0, start),
      calls: [],
      suffix: captured.slice(obj.end),
    };
  }
  return {
    ready: true,
    prefix: captured.slice(0, start),
    calls: parsed,
    suffix: captured.slice(obj.end),
  };
}
|
||||
|
||||
// Scan for the end of a balanced JSON-like object starting at text[start].
// Brace depth is tracked outside of quoted strings; both '"' and "'" open a
// string (tool output is not always strict JSON) and backslash escapes are
// honoured inside them. Returns {ok:true, end} with `end` one past the
// closing '}', or {ok:false, end:0} when the object never closes.
function extractJSONObjectFrom(text, start) {
  if (!text || start < 0 || start >= text.length || text[start] !== '{') {
    return { ok: false, end: 0 };
  }
  let depth = 0;
  let activeQuote = '';
  let sawEscape = false;
  for (let pos = start; pos < text.length; pos += 1) {
    const ch = text[pos];
    if (activeQuote) {
      if (sawEscape) {
        sawEscape = false;
      } else if (ch === '\\') {
        sawEscape = true;
      } else if (ch === activeQuote) {
        activeQuote = '';
      }
      continue;
    }
    if (ch === '"' || ch === "'") {
      activeQuote = ch;
    } else if (ch === '{') {
      depth += 1;
    } else if (ch === '}') {
      depth -= 1;
      if (depth === 0) {
        return { ok: true, end: pos + 1 };
      }
    }
  }
  return { ok: false, end: 0 };
}
|
||||
|
||||
// Parse tool calls out of a raw text span using progressively looser
// candidate extraction (verbatim text, fenced ```json blocks, embedded
// objects, widest {...} slice, regex fallback) — first candidate that
// yields calls wins.
//
// `toolNames` acts only as an ADVISORY filter: schema-listed names are
// preferred, but if filtering would discard every parsed call the
// unfiltered list is returned instead — intercepting an off-schema call is
// better than leaking its JSON as text.
function parseToolCalls(text, toolNames) {
  if (!toStringSafe(text)) {
    return [];
  }
  const candidates = buildToolCallCandidates(text);
  let parsed = [];
  for (const c of candidates) {
    parsed = parseToolCallsPayload(c);
    if (parsed.length > 0) {
      break;
    }
  }
  if (parsed.length === 0) {
    return [];
  }
  const allowed = new Set((toolNames || []).filter(Boolean));
  const out = [];
  for (const tc of parsed) {
    if (!tc || !tc.name) {
      continue;
    }
    if (allowed.size > 0 && !allowed.has(tc.name)) {
      continue;
    }
    out.push({ name: tc.name, input: tc.input || {} });
  }
  if (out.length === 0 && parsed.length > 0) {
    // Schema filter removed everything: fall back to the unfiltered calls.
    for (const tc of parsed) {
      if (!tc || !tc.name) {
        continue;
      }
      out.push({ name: tc.name, input: tc.input || {} });
    }
  }
  return out;
}
|
||||
|
||||
// Build an ordered, de-duplicated list of strings that might parse as a
// tool-call payload: the trimmed text itself, the inside of any fenced
// ```json blocks, each embedded object enclosing a tool_calls key, the
// widest {...} slice, and finally a regex-reconstructed object.
function buildToolCallCandidates(text) {
  const base = toStringSafe(text);
  const seen = [base];

  const fencedBlocks = base.match(/```(?:json)?\s*([\s\S]*?)\s*```/gi) || [];
  for (const block of fencedBlocks) {
    const inner = block.match(/```(?:json)?\s*([\s\S]*?)\s*```/i);
    if (inner && inner[1]) {
      seen.push(toStringSafe(inner[1]));
    }
  }

  for (const objText of extractToolCallObjects(base)) {
    seen.push(toStringSafe(objText));
  }

  const open = base.indexOf('{');
  const close = base.lastIndexOf('}');
  if (open >= 0 && close > open) {
    seen.push(toStringSafe(base.slice(open, close + 1)));
  }

  const keyMatch = base.match(TOOL_CALL_PATTERN);
  if (keyMatch && keyMatch[1]) {
    seen.push(`{"tool_calls":[${keyMatch[1]}]}`);
  }

  return [...new Set(seen.filter(Boolean))];
}
|
||||
|
||||
// Find every complete JSON-like object in `text` that encloses a
// case-insensitive 'tool_calls' key and return the balanced object slices.
// For each key occurrence the nearest preceding '{' is tried, stepping
// outward to earlier braces until one closes into a balanced object; if no
// enclosing object closes, the scan resumes just past the key.
function extractToolCallObjects(text) {
  const raw = toStringSafe(text);
  if (!raw) {
    return [];
  }
  const lower = raw.toLowerCase();
  const out = [];
  let offset = 0;
  // eslint-disable-next-line no-constant-condition
  while (true) {
    let idx = lower.indexOf('tool_calls', offset);
    if (idx < 0) {
      break;
    }
    let start = raw.slice(0, idx).lastIndexOf('{');
    while (start >= 0) {
      const obj = extractJSONObjectFrom(raw, start);
      if (obj.ok) {
        out.push(raw.slice(start, obj.end).trim());
        offset = obj.end;
        idx = -1; // signal: handled, do not re-skip the key below
        break;
      }
      start = raw.slice(0, start).lastIndexOf('{');
    }
    if (idx >= 0) {
      // No balanced object around this key: move past the key and continue.
      offset = idx + 'tool_calls'.length;
    }
  }
  return out;
}
|
||||
|
||||
// Strictly JSON-parse one candidate payload and normalize it into a list of
// tool calls. Accepts a bare array, an object with a `tool_calls` key, or a
// single call object; anything else (or invalid JSON) yields [].
function parseToolCallsPayload(payload) {
  let data;
  try {
    data = JSON.parse(payload);
  } catch (_err) {
    return [];
  }
  if (Array.isArray(data)) {
    return parseToolCallList(data);
  }
  const isObject = data && typeof data === 'object';
  if (!isObject) {
    return [];
  }
  if (data.tool_calls) {
    return parseToolCallList(data.tool_calls);
  }
  const single = parseToolCallItem(data);
  return single === null ? [] : [single];
}
|
||||
|
||||
// Normalize an array of raw tool-call entries, dropping non-objects and
// entries that fail to parse (no usable name).
function parseToolCallList(v) {
  if (!Array.isArray(v)) {
    return [];
  }
  return v
    .filter((item) => item && typeof item === 'object')
    .map((item) => parseToolCallItem(item))
    .filter((call) => call !== null);
}
|
||||
|
||||
// Normalize one raw tool-call entry into {name, input}.
// The name comes from `m.name` or, failing that, `m.function.name`.
// The input comes from `m.input`, then `m.function.arguments`, then the
// alias keys arguments/args/parameters/params. Returns null without a name.
function parseToolCallItem(m) {
  const fn = m.function && typeof m.function === 'object' ? m.function : null;

  let name = toStringSafe(m.name);
  if (!name && fn) {
    name = toStringSafe(fn.name);
  }

  let hasInput = Object.prototype.hasOwnProperty.call(m, 'input');
  let inputRaw = m.input;
  if (!hasInput && fn && Object.prototype.hasOwnProperty.call(fn, 'arguments')) {
    inputRaw = fn.arguments;
    hasInput = true;
  }
  if (!hasInput) {
    for (const key of ['arguments', 'args', 'parameters', 'params']) {
      if (Object.prototype.hasOwnProperty.call(m, key)) {
        inputRaw = m[key];
        hasInput = true;
        break;
      }
    }
  }

  if (!name) {
    return null;
  }
  return {
    name,
    input: parseToolCallInput(inputRaw),
  };
}
|
||||
|
||||
// Coerce a raw tool-call input value into a plain object.
// Strings are JSON-parsed; non-object parse results (and unparseable
// strings) are wrapped as { _raw }. Plain objects pass through; other
// values are round-tripped through JSON, falling back to {}.
function parseToolCallInput(v) {
  if (v == null) {
    return {};
  }
  if (typeof v === 'string') {
    const raw = toStringSafe(v);
    if (!raw) {
      return {};
    }
    let parsed;
    try {
      parsed = JSON.parse(raw);
    } catch (_err) {
      return { _raw: raw };
    }
    const isPlainObject = parsed && typeof parsed === 'object' && !Array.isArray(parsed);
    return isPlainObject ? parsed : { _raw: raw };
  }
  if (typeof v === 'object' && !Array.isArray(v)) {
    return v;
  }
  try {
    const cloned = JSON.parse(JSON.stringify(v));
    if (cloned && typeof cloned === 'object' && !Array.isArray(cloned)) {
      return cloned;
    }
  } catch (_err) {
    return {};
  }
  return {};
}
|
||||
|
||||
// Convert internal {name, input} calls into OpenAI streaming tool_call
// delta objects (indexed, with stringified arguments and fresh call ids).
function formatOpenAIStreamToolCalls(calls) {
  if (!Array.isArray(calls) || calls.length === 0) {
    return [];
  }
  const formatted = [];
  calls.forEach((call, position) => {
    formatted.push({
      index: position,
      id: `call_${newCallID()}`,
      type: 'function',
      function: {
        name: call.name,
        arguments: JSON.stringify(call.input || {}),
      },
    });
  });
  return formatted;
}
|
||||
|
||||
function newCallID() {
|
||||
if (typeof crypto.randomUUID === 'function') {
|
||||
return crypto.randomUUID().replace(/-/g, '');
|
||||
}
|
||||
return `${Date.now()}${Math.floor(Math.random() * 1e9)}`;
|
||||
}
|
||||
|
||||
// Coerce an arbitrary value to a trimmed string. Arrays resolve to their
// first element (recursively); null/undefined become the empty string.
function toStringSafe(v) {
  if (v == null) {
    return '';
  }
  if (typeof v === 'string') {
    return v.trim();
  }
  if (Array.isArray(v)) {
    return toStringSafe(v[0]);
  }
  return String(v).trim();
}
|
||||
|
||||
// Public sieve API consumed by api/chat-stream.js and the unit tests.
module.exports = {
  extractToolNames,
  createToolSieveState,
  processToolSieveChunk,
  flushToolSieve,
  parseToolCalls,
  formatOpenAIStreamToolCalls,
};
|
||||
--- new file: api/helpers/stream-tool-sieve.test.js (+130 lines) @@ -0,0 +1,130 @@ ---
|
||||
'use strict';

// Unit tests for the streaming tool-call sieve, run with `node --test`.

const test = require('node:test');
const assert = require('node:assert/strict');

const {
  extractToolNames,
  createToolSieveState,
  processToolSieveChunk,
  flushToolSieve,
  parseToolCalls,
} = require('./stream-tool-sieve');

// Drive the sieve over a sequence of chunks and collect every emitted event.
function runSieve(chunks, toolNames) {
  const state = createToolSieveState();
  const events = [];
  for (const chunk of chunks) {
    events.push(...processToolSieveChunk(state, chunk, toolNames));
  }
  events.push(...flushToolSieve(state, toolNames));
  return events;
}

// Concatenate the text of all 'text' events.
function collectText(events) {
  return events
    .filter((evt) => evt.type === 'text' && evt.text)
    .map((evt) => evt.text)
    .join('');
}

test('extractToolNames keeps tool mode enabled with unknown fallback', () => {
  const names = extractToolNames([
    { function: { description: 'no name tool' } },
    { function: { name: ' read_file ' } },
    {},
  ]);
  assert.deepEqual(names, ['unknown', 'read_file', 'unknown']);
});

test('parseToolCalls keeps non-object argument strings as _raw (Go parity)', () => {
  const payload = JSON.stringify({
    tool_calls: [
      { name: 'read_file', input: '123' },
      { name: 'list_dir', input: '[1,2,3]' },
    ],
  });
  const calls = parseToolCalls(payload, ['read_file', 'list_dir']);
  assert.deepEqual(calls, [
    { name: 'read_file', input: { _raw: '123' } },
    { name: 'list_dir', input: { _raw: '[1,2,3]' } },
  ]);
});

test('parseToolCalls still intercepts unknown schema names to avoid leaks', () => {
  const payload = JSON.stringify({
    tool_calls: [{ name: 'not_in_schema', input: { q: 'go' } }],
  });
  const calls = parseToolCalls(payload, ['search']);
  assert.equal(calls.length, 1);
  assert.equal(calls[0].name, 'not_in_schema');
});

test('parseToolCalls supports fenced json and function.arguments string payload', () => {
  const text = [
    'I will call a tool now.',
    '```json',
    '{"tool_calls":[{"function":{"name":"read_file","arguments":"{\\"path\\":\\"README.md\\"}"}}]}',
    '```',
  ].join('\n');
  const calls = parseToolCalls(text, ['read_file']);
  assert.equal(calls.length, 1);
  assert.equal(calls[0].name, 'read_file');
  assert.deepEqual(calls[0].input, { path: 'README.md' });
});

test('sieve emits tool_calls and does not leak suspicious prefix on late key convergence', () => {
  const events = runSieve(
    [
      '{"',
      'tool_calls":[{"name":"read_file","input":{"path":"README.MD"}}]}',
      '后置正文C。',
    ],
    ['read_file'],
  );
  const leakedText = collectText(events);
  const hasToolCall = events.some((evt) => evt.type === 'tool_calls' && Array.isArray(evt.calls) && evt.calls.length > 0);
  assert.equal(hasToolCall, true);
  assert.equal(leakedText.includes('{'), false);
  assert.equal(leakedText.toLowerCase().includes('tool_calls'), false);
  assert.equal(leakedText.includes('后置正文C。'), true);
});

test('sieve drops invalid tool json body while preserving surrounding text', () => {
  const events = runSieve(
    [
      '前置正文D。',
      "{'tool_calls':[{'name':'read_file','input':{'path':'README.MD'}}]}",
      '后置正文E。',
    ],
    ['read_file'],
  );
  const leakedText = collectText(events);
  const hasToolCall = events.some((evt) => evt.type === 'tool_calls');
  assert.equal(hasToolCall, false);
  assert.equal(leakedText.includes('前置正文D。'), true);
  assert.equal(leakedText.includes('后置正文E。'), true);
  assert.equal(leakedText.toLowerCase().includes('tool_calls'), false);
});

test('sieve suppresses incomplete captured tool json on stream finalize', () => {
  const events = runSieve(
    ['前置正文F。', '{"tool_calls":[{"name":"read_file"'],
    ['read_file'],
  );
  const leakedText = collectText(events);
  assert.equal(leakedText.includes('前置正文F。'), true);
  assert.equal(leakedText.toLowerCase().includes('tool_calls'), false);
  assert.equal(leakedText.includes('{'), false);
});

test('sieve keeps plain text intact in tool mode when no tool call appears', () => {
  const events = runSieve(
    ['你好,', '这是普通文本回复。', '请继续。'],
    ['read_file'],
  );
  const leakedText = collectText(events);
  const hasToolCall = events.some((evt) => evt.type === 'tool_calls');
  assert.equal(hasToolCall, false);
  assert.equal(leakedText, '你好,这是普通文本回复。请继续。');
});
|
||||
--- new file: api/index.go (+20 lines) @@ -0,0 +1,20 @@ ---
|
||||
// Package handler is the serverless HTTP entry point for ds2api
// (Vercel-style Go function).
package handler

import (
	"net/http"
	"sync"

	"ds2api/app"
)

var (
	// once guards the lazy, one-time construction of the shared handler so
	// the application is built on the first request and reused afterwards.
	once sync.Once
	h    http.Handler
)

// Handler is invoked for every HTTP request; it delegates to the
// lazily-initialized application handler.
func Handler(w http.ResponseWriter, r *http.Request) {
	once.Do(func() {
		h = app.NewHandler()
	})
	h.ServeHTTP(w, r)
}
|
||||
--- deleted file: app.py (-69 lines) @@ -1,69 +0,0 @@ ---
|
||||
# -*- coding: utf-8 -*-
"""
DS2API - DeepSeek to OpenAI API conversion service.

Supports:
- OpenAI-compatible endpoints: /v1/chat/completions, /v1/models
- Claude-compatible endpoints: /anthropic/v1/messages, /anthropic/v1/models

Usage:
    Local development: python dev.py
    Production:        uvicorn app:app --host 0.0.0.0 --port 5001
    Vercel:            deployed automatically
"""
import os

from fastapi import FastAPI, Request
from fastapi.middleware.cors import CORSMiddleware
from fastapi.responses import JSONResponse

from core.config import IS_VERCEL, logger

# Create the FastAPI application.
app = FastAPI(
    title="DS2API",
    description="DeepSeek to OpenAI/Claude API",
    version="1.0.0",
)


# Global exception handler: log the failure and return a generic
# OpenAI-style error body instead of leaking internals to the client.
@app.exception_handler(Exception)
async def unhandled_exception_handler(request: Request, exc: Exception):
    logger.exception(f"[unhandled_exception] {request.method} {request.url.path}: {exc}")
    return JSONResponse(
        status_code=500,
        content={"error": {"type": "api_error", "message": "Internal Server Error"}},
    )


# CORS middleware (wide open: any origin is allowed).
app.add_middleware(
    CORSMiddleware,
    allow_origins=["*"],
    allow_credentials=True,
    allow_methods=["GET", "POST", "OPTIONS", "PUT", "DELETE"],
    allow_headers=["Content-Type", "Authorization"],
)

# Register routers.
from routes.openai import router as openai_router
from routes.claude import router as claude_router
from routes.home import router as home_router
from routes.admin import router as admin_router

app.include_router(openai_router)
app.include_router(claude_router)
# admin_router must be registered before home_router; otherwise home.py's
# catch-all /admin/{path:path} route would intercept the admin API.
app.include_router(admin_router)
app.include_router(home_router)


# ----------------------------------------------------------------------
# Local run entry point
# ----------------------------------------------------------------------
if __name__ == "__main__" and not IS_VERCEL:
    import uvicorn

    port = int(os.getenv("PORT", "5001"))
    uvicorn.run(app, host="0.0.0.0", port=port)
|
||||
11
app/handler.go
Normal file
11
app/handler.go
Normal file
@@ -0,0 +1,11 @@
|
||||
package app
|
||||
|
||||
import (
|
||||
"net/http"
|
||||
|
||||
"ds2api/internal/server"
|
||||
)
|
||||
|
||||
func NewHandler() http.Handler {
|
||||
return server.NewApp().Router
|
||||
}
|
||||
37
cmd/ds2api-tests/main.go
Normal file
37
cmd/ds2api-tests/main.go
Normal file
@@ -0,0 +1,37 @@
|
||||
package main
|
||||
|
||||
import (
|
||||
"context"
|
||||
"flag"
|
||||
"fmt"
|
||||
"os"
|
||||
"time"
|
||||
|
||||
"ds2api/internal/testsuite"
|
||||
)
|
||||
|
||||
func main() {
|
||||
opts := testsuite.DefaultOptions()
|
||||
var timeoutSeconds int
|
||||
|
||||
flag.StringVar(&opts.ConfigPath, "config", opts.ConfigPath, "Path to config file (default: config.json)")
|
||||
flag.StringVar(&opts.AdminKey, "admin-key", opts.AdminKey, "Admin key (default: DS2API_ADMIN_KEY or admin)")
|
||||
flag.StringVar(&opts.OutputDir, "out", opts.OutputDir, "Output artifact directory")
|
||||
flag.IntVar(&opts.Port, "port", opts.Port, "Server port (0 means auto-select free port)")
|
||||
flag.IntVar(&timeoutSeconds, "timeout", int(opts.Timeout.Seconds()), "Per-request timeout in seconds")
|
||||
flag.IntVar(&opts.Retries, "retries", opts.Retries, "Retry count for network/5xx requests")
|
||||
flag.BoolVar(&opts.NoPreflight, "no-preflight", opts.NoPreflight, "Skip preflight checks")
|
||||
flag.IntVar(&opts.MaxKeepRuns, "keep", opts.MaxKeepRuns, "Max test runs to keep (0 = keep all)")
|
||||
flag.Parse()
|
||||
|
||||
if timeoutSeconds <= 0 {
|
||||
timeoutSeconds = 120
|
||||
}
|
||||
opts.Timeout = time.Duration(timeoutSeconds) * time.Second
|
||||
|
||||
if err := testsuite.Run(context.Background(), opts); err != nil {
|
||||
fmt.Fprintln(os.Stderr, err.Error())
|
||||
os.Exit(1)
|
||||
}
|
||||
fmt.Fprintln(os.Stdout, "testsuite completed successfully")
|
||||
}
|
||||
56
cmd/ds2api/main.go
Normal file
56
cmd/ds2api/main.go
Normal file
@@ -0,0 +1,56 @@
|
||||
package main
|
||||
|
||||
import (
|
||||
"context"
|
||||
"net/http"
|
||||
"os"
|
||||
"os/signal"
|
||||
"strings"
|
||||
"syscall"
|
||||
"time"
|
||||
|
||||
"ds2api/internal/auth"
|
||||
"ds2api/internal/config"
|
||||
"ds2api/internal/server"
|
||||
"ds2api/internal/webui"
|
||||
)
|
||||
|
||||
func main() {
|
||||
webui.EnsureBuiltOnStartup()
|
||||
_ = auth.AdminKey()
|
||||
app := server.NewApp()
|
||||
port := strings.TrimSpace(os.Getenv("PORT"))
|
||||
if port == "" {
|
||||
port = "5001"
|
||||
}
|
||||
|
||||
srv := &http.Server{
|
||||
Addr: "0.0.0.0:" + port,
|
||||
Handler: app.Router,
|
||||
}
|
||||
|
||||
// Start server in a goroutine so we can listen for shutdown signals.
|
||||
go func() {
|
||||
config.Logger.Info("starting ds2api", "port", port)
|
||||
if err := srv.ListenAndServe(); err != nil && err != http.ErrServerClosed {
|
||||
config.Logger.Error("server stopped unexpectedly", "error", err)
|
||||
os.Exit(1)
|
||||
}
|
||||
}()
|
||||
|
||||
// Wait for interrupt signal (Ctrl+C / SIGTERM).
|
||||
quit := make(chan os.Signal, 1)
|
||||
signal.Notify(quit, os.Interrupt, syscall.SIGTERM)
|
||||
sig := <-quit
|
||||
config.Logger.Info("shutdown signal received", "signal", sig.String())
|
||||
|
||||
// Graceful shutdown: allow up to 10 seconds for in-flight requests to complete.
|
||||
ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
|
||||
defer cancel()
|
||||
|
||||
if err := srv.Shutdown(ctx); err != nil {
|
||||
config.Logger.Error("graceful shutdown failed, forcing exit", "error", err)
|
||||
os.Exit(1)
|
||||
}
|
||||
config.Logger.Info("server gracefully stopped")
|
||||
}
|
||||
@@ -1 +0,0 @@
|
||||
# DS2API Core Modules
|
||||
247
core/auth.py
247
core/auth.py
@@ -1,247 +0,0 @@
|
||||
# -*- coding: utf-8 -*-
|
||||
"""账号认证与管理模块 - 轮询(Round-Robin)策略"""
|
||||
import threading
|
||||
from fastapi import HTTPException, Request
|
||||
|
||||
from .config import CONFIG, logger
|
||||
from .deepseek import login_deepseek_via_account, BASE_HEADERS
|
||||
from .utils import get_account_identifier
|
||||
|
||||
# -------------------------- 全局账号队列 --------------------------
|
||||
# 使用列表实现轮询队列,配合线程锁保证并发安全
|
||||
account_queue = [] # 可用账号队列
|
||||
in_use_accounts = {} # 正在使用的账号 {account_id: account}
|
||||
_queue_lock = threading.Lock() # 线程锁
|
||||
|
||||
claude_api_key_queue = [] # 维护所有可用的Claude API keys
|
||||
|
||||
|
||||
def init_account_queue():
|
||||
"""初始化时从配置加载账号(不再随机排序,保持配置顺序)"""
|
||||
global account_queue, in_use_accounts
|
||||
with _queue_lock:
|
||||
account_queue = CONFIG.get("accounts", [])[:] # 深拷贝
|
||||
in_use_accounts = {}
|
||||
# 按 token 有无排序:有 token 的账号优先
|
||||
account_queue.sort(key=lambda a: 0 if a.get("token", "").strip() else 1)
|
||||
logger.info(f"[init_account_queue] 初始化 {len(account_queue)} 个账号,轮询模式")
|
||||
|
||||
|
||||
def init_claude_api_key_queue():
|
||||
"""Claude API keys由用户自己的token提供,这里初始化为空"""
|
||||
global claude_api_key_queue
|
||||
claude_api_key_queue = []
|
||||
|
||||
|
||||
# 初始化
|
||||
init_account_queue()
|
||||
init_claude_api_key_queue()
|
||||
|
||||
|
||||
# get_account_identifier 已移至 core.utils
|
||||
|
||||
|
||||
def get_queue_status() -> dict:
|
||||
"""获取账号队列状态(用于监控)"""
|
||||
with _queue_lock:
|
||||
# total 应该是配置中的账号总数,而非队列相加(避免状态不一致导致重复计数)
|
||||
total_accounts = len(CONFIG.get("accounts", []))
|
||||
return {
|
||||
"available": len(account_queue),
|
||||
"in_use": len(in_use_accounts),
|
||||
"total": total_accounts,
|
||||
"available_accounts": [get_account_identifier(a) for a in account_queue],
|
||||
"in_use_accounts": list(in_use_accounts.keys()),
|
||||
}
|
||||
|
||||
|
||||
# ----------------------------------------------------------------------
|
||||
# 账号选择与释放 - 轮询(Round-Robin)策略
|
||||
# ----------------------------------------------------------------------
|
||||
def choose_new_account(exclude_ids=None, target_id=None):
|
||||
"""轮询选择策略:
|
||||
1. 使用线程锁保证并发安全
|
||||
2. 如果指定了 target_id,优先尝试获取该账号
|
||||
3. 优先选择队首的有 token 账号
|
||||
4. 从队列头部取出账号(FIFO)
|
||||
5. 请求完成后调用 release_account 将账号放回队尾
|
||||
"""
|
||||
if exclude_ids is None:
|
||||
exclude_ids = []
|
||||
|
||||
with _queue_lock:
|
||||
# 0. 如果指定了目标账号,优先尝试获取
|
||||
if target_id:
|
||||
for i in range(len(account_queue)):
|
||||
acc = account_queue[i]
|
||||
acc_id = get_account_identifier(acc)
|
||||
if acc_id == target_id:
|
||||
selected = account_queue.pop(i)
|
||||
in_use_accounts[acc_id] = selected
|
||||
logger.info(f"[choose_new_account] 指定选择: {acc_id} | 队列剩余: {len(account_queue)}")
|
||||
return selected
|
||||
# 如果队列中没找到,且不在 in_use 中,说明账号不存在
|
||||
if target_id not in in_use_accounts:
|
||||
logger.warning(f"[choose_new_account] 指定账号不存在: {target_id}")
|
||||
else:
|
||||
logger.warning(f"[choose_new_account] 指定账号正忙: {target_id}")
|
||||
return None
|
||||
|
||||
# 第一轮:优先选择已有 token 的账号
|
||||
for i in range(len(account_queue)):
|
||||
acc = account_queue[i]
|
||||
acc_id = get_account_identifier(acc)
|
||||
if acc_id and acc_id not in exclude_ids:
|
||||
if acc.get("token", "").strip(): # 已有 token
|
||||
selected = account_queue.pop(i)
|
||||
in_use_accounts[acc_id] = selected
|
||||
logger.info(f"[choose_new_account] 轮询选择(有token): {acc_id} | 队列剩余: {len(account_queue)}")
|
||||
return selected
|
||||
|
||||
# 第二轮:选择任意账号(需要登录)
|
||||
for i in range(len(account_queue)):
|
||||
acc = account_queue[i]
|
||||
acc_id = get_account_identifier(acc)
|
||||
if acc_id and acc_id not in exclude_ids:
|
||||
selected = account_queue.pop(i)
|
||||
in_use_accounts[acc_id] = selected
|
||||
logger.info(f"[choose_new_account] 轮询选择(需登录): {acc_id} | 队列剩余: {len(account_queue)}")
|
||||
return selected
|
||||
|
||||
logger.warning(f"[choose_new_account] 没有可用账号 | 队列: {len(account_queue)}, 使用中: {len(in_use_accounts)}")
|
||||
return None
|
||||
|
||||
|
||||
def release_account(account: dict):
|
||||
"""将账号重新加入队列末尾(轮询核心:用完放队尾)"""
|
||||
if not account:
|
||||
return
|
||||
|
||||
acc_id = get_account_identifier(account)
|
||||
with _queue_lock:
|
||||
# 从使用中移除
|
||||
if acc_id in in_use_accounts:
|
||||
del in_use_accounts[acc_id]
|
||||
# 放回队尾
|
||||
account_queue.append(account)
|
||||
logger.debug(f"[release_account] 释放账号: {acc_id} | 队列长度: {len(account_queue)}")
|
||||
else:
|
||||
logger.warning(f"[release_account] 账号 {acc_id} 不在使用列表中 (可能是因为重置了队列),跳过释放")
|
||||
|
||||
|
||||
# ----------------------------------------------------------------------
|
||||
# Claude API key 管理函数(简化版本)
|
||||
# ----------------------------------------------------------------------
|
||||
def choose_claude_api_key():
|
||||
"""选择一个可用的Claude API key - 现在直接由用户提供"""
|
||||
return None
|
||||
|
||||
|
||||
def release_claude_api_key(api_key):
|
||||
"""释放Claude API key - 现在无需操作"""
|
||||
pass
|
||||
|
||||
|
||||
# ----------------------------------------------------------------------
|
||||
# 判断调用模式:配置模式 vs 用户自带 token
|
||||
# ----------------------------------------------------------------------
|
||||
def determine_mode_and_token(request: Request):
|
||||
"""
|
||||
根据请求头 Authorization 判断使用哪种模式:
|
||||
- 如果 Bearer token 出现在 CONFIG["keys"] 中,则为配置模式,从 CONFIG["accounts"] 中随机选择一个账号(排除已尝试账号),
|
||||
检查该账号是否已有 token,否则调用登录接口获取;
|
||||
- 否则,直接使用请求中的 Bearer 值作为 DeepSeek token。
|
||||
结果存入 request.state.deepseek_token;配置模式下同时存入 request.state.account 与 request.state.tried_accounts。
|
||||
"""
|
||||
auth_header = request.headers.get("Authorization", "")
|
||||
if not auth_header.startswith("Bearer "):
|
||||
raise HTTPException(
|
||||
status_code=401, detail="Unauthorized: missing Bearer token."
|
||||
)
|
||||
caller_key = auth_header.replace("Bearer ", "", 1).strip()
|
||||
config_keys = CONFIG.get("keys", [])
|
||||
if caller_key in config_keys:
|
||||
request.state.use_config_token = True
|
||||
request.state.tried_accounts = [] # 初始化已尝试账号
|
||||
|
||||
target_account = request.headers.get("X-Ds2-Target-Account")
|
||||
selected_account = choose_new_account(target_id=target_account)
|
||||
|
||||
if not selected_account:
|
||||
detail_msg = "No accounts configured or all accounts are busy."
|
||||
if target_account:
|
||||
detail_msg = f"Target account {target_account} is busy or not found."
|
||||
raise HTTPException(
|
||||
status_code=429,
|
||||
detail=detail_msg,
|
||||
)
|
||||
if not selected_account.get("token", "").strip():
|
||||
try:
|
||||
login_deepseek_via_account(selected_account)
|
||||
except Exception as e:
|
||||
logger.error(
|
||||
f"[determine_mode_and_token] 账号 {get_account_identifier(selected_account)} 登录失败:{e}"
|
||||
)
|
||||
raise HTTPException(status_code=500, detail="Account login failed.")
|
||||
|
||||
request.state.deepseek_token = selected_account.get("token")
|
||||
request.state.account = selected_account
|
||||
|
||||
else:
|
||||
request.state.use_config_token = False
|
||||
request.state.deepseek_token = caller_key
|
||||
|
||||
|
||||
def get_auth_headers(request: Request) -> dict:
|
||||
"""返回 DeepSeek 请求所需的公共请求头"""
|
||||
return {**BASE_HEADERS, "authorization": f"Bearer {request.state.deepseek_token}"}
|
||||
|
||||
|
||||
# determine_claude_mode_and_token 已移除(直接使用 determine_mode_and_token)
|
||||
|
||||
|
||||
# ----------------------------------------------------------------------
|
||||
# Token 刷新机制
|
||||
# ----------------------------------------------------------------------
|
||||
def refresh_account_token(request: Request) -> bool:
|
||||
"""当 token 过期时,刷新账号 token。
|
||||
|
||||
返回 True 表示刷新成功,False 表示刷新失败。
|
||||
调用后 request.state.deepseek_token 会被更新。
|
||||
"""
|
||||
if not getattr(request.state, 'use_config_token', False):
|
||||
# 用户自带 token,无法刷新
|
||||
return False
|
||||
|
||||
account = getattr(request.state, 'account', None)
|
||||
if not account:
|
||||
return False
|
||||
|
||||
acc_id = get_account_identifier(account)
|
||||
logger.info(f"[refresh_account_token] 尝试刷新账号 {acc_id} 的 token")
|
||||
|
||||
try:
|
||||
# 清除旧 token
|
||||
account["token"] = ""
|
||||
# 重新登录
|
||||
login_deepseek_via_account(account)
|
||||
# 更新 request 状态
|
||||
request.state.deepseek_token = account.get("token")
|
||||
logger.info(f"[refresh_account_token] 账号 {acc_id} token 刷新成功")
|
||||
return True
|
||||
except Exception as e:
|
||||
logger.error(f"[refresh_account_token] 账号 {acc_id} token 刷新失败: {e}")
|
||||
return False
|
||||
|
||||
|
||||
def mark_token_invalid(request: Request):
|
||||
"""标记当前账号的 token 为无效,清除它以便下次重新登录"""
|
||||
if not getattr(request.state, 'use_config_token', False):
|
||||
return
|
||||
|
||||
account = getattr(request.state, 'account', None)
|
||||
if account:
|
||||
acc_id = get_account_identifier(account)
|
||||
logger.warning(f"[mark_token_invalid] 标记账号 {acc_id} 的 token 为无效")
|
||||
account["token"] = ""
|
||||
|
||||
111
core/config.py
111
core/config.py
@@ -1,111 +0,0 @@
|
||||
# -*- coding: utf-8 -*-
|
||||
"""配置管理模块"""
|
||||
import base64
|
||||
import json
|
||||
import logging
|
||||
import os
|
||||
import sys
|
||||
|
||||
import transformers
|
||||
|
||||
# -------------------------- 获取项目根目录 --------------------------
|
||||
BASE_DIR = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
|
||||
IS_VERCEL = bool(os.getenv("VERCEL")) or bool(os.getenv("NOW_REGION"))
|
||||
|
||||
|
||||
def resolve_path(env_key: str, default_rel: str) -> str:
|
||||
"""解析路径,支持环境变量覆盖"""
|
||||
raw = os.getenv(env_key)
|
||||
if raw:
|
||||
return raw if os.path.isabs(raw) else os.path.join(BASE_DIR, raw)
|
||||
return os.path.join(BASE_DIR, default_rel)
|
||||
|
||||
|
||||
# -------------------------- 日志配置 --------------------------
|
||||
logging.basicConfig(
|
||||
level=os.getenv("LOG_LEVEL", "INFO").upper(),
|
||||
format="%(asctime)s [%(levelname)s] %(name)s: %(message)s",
|
||||
handlers=[logging.StreamHandler(sys.stdout)],
|
||||
force=True,
|
||||
)
|
||||
logger = logging.getLogger("ds2api")
|
||||
|
||||
# -------------------------- 初始化 tokenizer --------------------------
|
||||
chat_tokenizer_dir = resolve_path("DS2API_TOKENIZER_DIR", "")
|
||||
# 抑制 Mistral tokenizer regex 警告(不影响 DeepSeek tokenization)
|
||||
_tf_logger = logging.getLogger("transformers")
|
||||
_tf_log_level = _tf_logger.level
|
||||
_tf_logger.setLevel(logging.ERROR)
|
||||
tokenizer = transformers.AutoTokenizer.from_pretrained(
|
||||
chat_tokenizer_dir, trust_remote_code=True
|
||||
)
|
||||
_tf_logger.setLevel(_tf_log_level)
|
||||
|
||||
# ----------------------------------------------------------------------
|
||||
# 配置文件的读写函数
|
||||
# ----------------------------------------------------------------------
|
||||
CONFIG_PATH = resolve_path("DS2API_CONFIG_PATH", "config.json")
|
||||
|
||||
|
||||
def load_config() -> dict:
|
||||
"""加载配置。
|
||||
|
||||
优先从环境变量读取:
|
||||
- DS2API_CONFIG_JSON / CONFIG_JSON: 直接 JSON 字符串,或 base64 编码后的 JSON
|
||||
|
||||
若未提供环境变量,再从 CONFIG_PATH 指向的文件读取。
|
||||
"""
|
||||
raw_cfg = os.getenv("DS2API_CONFIG_JSON") or os.getenv("CONFIG_JSON")
|
||||
if raw_cfg:
|
||||
try:
|
||||
return json.loads(raw_cfg)
|
||||
except json.JSONDecodeError:
|
||||
try:
|
||||
decoded = base64.b64decode(raw_cfg).decode("utf-8")
|
||||
return json.loads(decoded)
|
||||
except Exception as e:
|
||||
logger.warning(f"[load_config] 环境变量配置解析失败: {e}")
|
||||
return {}
|
||||
|
||||
try:
|
||||
with open(CONFIG_PATH, "r", encoding="utf-8") as f:
|
||||
return json.load(f)
|
||||
except Exception as e:
|
||||
logger.warning(f"[load_config] 无法读取配置文件({CONFIG_PATH}): {e}")
|
||||
return {}
|
||||
|
||||
|
||||
def save_config(cfg: dict) -> None:
|
||||
"""将配置写回 config.json。
|
||||
|
||||
Vercel 环境文件系统通常是只读的;且如果配置来自环境变量,也无法回写。
|
||||
所以这里失败不应影响主流程。
|
||||
"""
|
||||
if os.getenv("DS2API_CONFIG_JSON") or os.getenv("CONFIG_JSON"):
|
||||
logger.info("[save_config] 配置来自环境变量,跳过写回")
|
||||
return
|
||||
|
||||
try:
|
||||
with open(CONFIG_PATH, "w", encoding="utf-8") as f:
|
||||
json.dump(cfg, f, ensure_ascii=False, indent=2)
|
||||
except PermissionError as e:
|
||||
logger.warning(f"[save_config] 配置文件不可写({CONFIG_PATH}): {e}")
|
||||
except Exception as e:
|
||||
logger.exception(f"[save_config] 写入 config.json 失败: {e}")
|
||||
|
||||
|
||||
# 全局配置
|
||||
CONFIG = load_config()
|
||||
if not CONFIG:
|
||||
logger.warning(
|
||||
"[config] 未加载到有效配置,请提供 config.json(路径可用 DS2API_CONFIG_PATH 指定)或设置环境变量 DS2API_CONFIG_JSON"
|
||||
)
|
||||
|
||||
# WASM 模块文件路径
|
||||
WASM_PATH = resolve_path("DS2API_WASM_PATH", "sha3_wasm_bg.7b9ca65ddd.wasm")
|
||||
|
||||
# 模板目录
|
||||
TEMPLATES_DIR = resolve_path("DS2API_TEMPLATES_DIR", "templates")
|
||||
|
||||
# WebUI 静态文件目录
|
||||
STATIC_ADMIN_DIR = resolve_path("DS2API_STATIC_ADMIN_DIR", "static/admin")
|
||||
@@ -1,43 +0,0 @@
|
||||
# -*- coding: utf-8 -*-
|
||||
"""常量定义模块 - 统一管理项目中的所有常量"""
|
||||
|
||||
# ----------------------------------------------------------------------
|
||||
# 网络和超时配置
|
||||
# ----------------------------------------------------------------------
|
||||
KEEP_ALIVE_TIMEOUT = 5 # 保活超时(秒)
|
||||
STREAM_IDLE_TIMEOUT = 30 # 流无新内容超时(秒)
|
||||
MAX_KEEPALIVE_COUNT = 10 # 最大连续 keepalive 次数
|
||||
|
||||
# ----------------------------------------------------------------------
|
||||
# DeepSeek API 配置
|
||||
# ----------------------------------------------------------------------
|
||||
DEEPSEEK_HOST = "chat.deepseek.com"
|
||||
DEEPSEEK_LOGIN_URL = f"https://{DEEPSEEK_HOST}/api/v0/users/login"
|
||||
DEEPSEEK_CREATE_SESSION_URL = f"https://{DEEPSEEK_HOST}/api/v0/chat_session/create"
|
||||
DEEPSEEK_CREATE_POW_URL = f"https://{DEEPSEEK_HOST}/api/v0/chat/create_pow_challenge"
|
||||
DEEPSEEK_COMPLETION_URL = f"https://{DEEPSEEK_HOST}/api/v0/chat/completion"
|
||||
|
||||
# ----------------------------------------------------------------------
|
||||
# 请求头配置
|
||||
# ----------------------------------------------------------------------
|
||||
BASE_HEADERS = {
|
||||
"Host": "chat.deepseek.com",
|
||||
"User-Agent": "DeepSeek/1.6.11 Android/35",
|
||||
"Accept": "application/json",
|
||||
"Accept-Encoding": "gzip",
|
||||
"Content-Type": "application/json",
|
||||
"x-client-platform": "android",
|
||||
"x-client-version": "1.6.11",
|
||||
"x-client-locale": "zh_CN",
|
||||
"accept-charset": "UTF-8",
|
||||
}
|
||||
|
||||
# ----------------------------------------------------------------------
|
||||
# SSE 解析配置
|
||||
# ----------------------------------------------------------------------
|
||||
# 跳过的路径模式(状态相关,不是内容)
|
||||
SKIP_PATTERNS = [
|
||||
"quasi_status", "elapsed_secs", "token_usage",
|
||||
"pending_fragment", "conversation_mode",
|
||||
"fragments/-1/status", "fragments/-2/status", "fragments/-3/status"
|
||||
]
|
||||
138
core/deepseek.py
138
core/deepseek.py
@@ -1,138 +0,0 @@
|
||||
# -*- coding: utf-8 -*-
|
||||
"""DeepSeek API 相关逻辑"""
|
||||
import time
|
||||
from curl_cffi import requests
|
||||
from fastapi import HTTPException
|
||||
|
||||
from .config import CONFIG, save_config, logger
|
||||
from .utils import get_account_identifier
|
||||
from .constants import (
|
||||
DEEPSEEK_HOST,
|
||||
DEEPSEEK_LOGIN_URL,
|
||||
DEEPSEEK_CREATE_SESSION_URL,
|
||||
DEEPSEEK_CREATE_POW_URL,
|
||||
DEEPSEEK_COMPLETION_URL,
|
||||
BASE_HEADERS,
|
||||
)
|
||||
|
||||
|
||||
# get_account_identifier 已移至 core.utils
|
||||
|
||||
|
||||
|
||||
|
||||
# ----------------------------------------------------------------------
|
||||
# 登录函数:支持使用 email 或 mobile 登录
|
||||
# ----------------------------------------------------------------------
|
||||
def login_deepseek_via_account(account: dict) -> str:
|
||||
"""使用 account 中的 email 或 mobile 登录 DeepSeek,
|
||||
成功后将返回的 token 写入 account 并保存至配置文件,返回新 token。
|
||||
"""
|
||||
email = account.get("email", "").strip()
|
||||
mobile = account.get("mobile", "").strip()
|
||||
password = account.get("password", "").strip()
|
||||
if not password or (not email and not mobile):
|
||||
raise HTTPException(
|
||||
status_code=400,
|
||||
detail="账号缺少必要的登录信息(必须提供 email 或 mobile 以及 password)",
|
||||
)
|
||||
if email:
|
||||
payload = {
|
||||
"email": email,
|
||||
"password": password,
|
||||
"device_id": "deepseek_to_api",
|
||||
"os": "android",
|
||||
}
|
||||
else:
|
||||
payload = {
|
||||
"mobile": mobile,
|
||||
"area_code": None,
|
||||
"password": password,
|
||||
"device_id": "deepseek_to_api",
|
||||
"os": "android",
|
||||
}
|
||||
try:
|
||||
resp = requests.post(
|
||||
DEEPSEEK_LOGIN_URL, headers=BASE_HEADERS, json=payload, impersonate="safari15_3"
|
||||
)
|
||||
resp.raise_for_status()
|
||||
except Exception as e:
|
||||
logger.error(f"[login_deepseek_via_account] 登录请求异常: {e}")
|
||||
raise HTTPException(status_code=500, detail="Account login failed: 请求异常")
|
||||
try:
|
||||
logger.warning(f"[login_deepseek_via_account] {resp.text}")
|
||||
data = resp.json()
|
||||
except Exception as e:
|
||||
logger.error(f"[login_deepseek_via_account] JSON解析失败: {e}")
|
||||
raise HTTPException(
|
||||
status_code=500, detail="Account login failed: invalid JSON response"
|
||||
)
|
||||
|
||||
# 检查 API 错误码
|
||||
if data.get("code") != 0:
|
||||
error_msg = data.get("msg", "Unknown error")
|
||||
logger.error(f"[login_deepseek_via_account] API错误: {error_msg}")
|
||||
raise HTTPException(
|
||||
status_code=500, detail=f"Account login failed: {error_msg}"
|
||||
)
|
||||
|
||||
# 检查业务错误码
|
||||
biz_code = data.get("data", {}).get("biz_code")
|
||||
biz_msg = data.get("data", {}).get("biz_msg", "")
|
||||
if biz_code != 0:
|
||||
logger.error(f"[login_deepseek_via_account] 业务错误: {biz_msg}")
|
||||
raise HTTPException(
|
||||
status_code=500, detail=f"Account login failed: {biz_msg}"
|
||||
)
|
||||
|
||||
# 校验响应数据格式是否正确
|
||||
if (
|
||||
data.get("data") is None
|
||||
or data["data"].get("biz_data") is None
|
||||
or data["data"]["biz_data"].get("user") is None
|
||||
):
|
||||
logger.error(f"[login_deepseek_via_account] 登录响应格式错误: {data}")
|
||||
raise HTTPException(
|
||||
status_code=500, detail="Account login failed: invalid response format"
|
||||
)
|
||||
new_token = data["data"]["biz_data"]["user"].get("token")
|
||||
if not new_token:
|
||||
logger.error(f"[login_deepseek_via_account] 登录响应中缺少 token: {data}")
|
||||
raise HTTPException(
|
||||
status_code=500, detail="Account login failed: missing token"
|
||||
)
|
||||
account["token"] = new_token
|
||||
save_config(CONFIG)
|
||||
return new_token
|
||||
|
||||
|
||||
# ----------------------------------------------------------------------
|
||||
# 封装对话接口调用的重试机制
|
||||
# ----------------------------------------------------------------------
|
||||
def call_completion_endpoint(payload: dict, headers: dict, max_attempts: int = 3):
|
||||
"""调用 DeepSeek 对话接口,支持重试"""
|
||||
attempts = 0
|
||||
while attempts < max_attempts:
|
||||
try:
|
||||
deepseek_resp = requests.post(
|
||||
DEEPSEEK_COMPLETION_URL,
|
||||
headers=headers,
|
||||
json=payload,
|
||||
stream=True,
|
||||
impersonate="safari15_3",
|
||||
)
|
||||
except Exception as e:
|
||||
logger.warning(f"[call_completion_endpoint] 请求异常: {e}")
|
||||
time.sleep(1)
|
||||
attempts += 1
|
||||
continue
|
||||
if deepseek_resp.status_code == 200:
|
||||
return deepseek_resp
|
||||
else:
|
||||
logger.warning(
|
||||
f"[call_completion_endpoint] 调用对话接口失败, 状态码: {deepseek_resp.status_code}"
|
||||
)
|
||||
deepseek_resp.close()
|
||||
time.sleep(1)
|
||||
attempts += 1
|
||||
return None
|
||||
118
core/messages.py
118
core/messages.py
@@ -1,118 +0,0 @@
|
||||
# -*- coding: utf-8 -*-
|
||||
"""消息处理模块"""
|
||||
import re
|
||||
|
||||
from .config import CONFIG, logger
|
||||
|
||||
# Claude 默认模型
|
||||
CLAUDE_DEFAULT_MODEL = "claude-sonnet-4-20250514"
|
||||
|
||||
# 预编译正则表达式(性能优化)
|
||||
_MARKDOWN_IMAGE_PATTERN = re.compile(r"!\[(.*?)\]\((.*?)\)")
|
||||
|
||||
|
||||
# ----------------------------------------------------------------------
|
||||
# 消息预处理函数,将多轮对话合并成最终 prompt
|
||||
# ----------------------------------------------------------------------
|
||||
def messages_prepare(messages: list) -> str:
|
||||
"""处理消息列表,合并连续相同角色的消息,并添加角色标签:
|
||||
- 对于 assistant 消息,加上 <|Assistant|> 前缀及 <|end▁of▁sentence|> 结束标签;
|
||||
- 对于 user/system 消息(除第一条外)加上 <|User|> 前缀;
|
||||
- 如果消息 content 为数组,则提取其中 type 为 "text" 的部分;
|
||||
- 最后移除 markdown 图片格式的内容。
|
||||
"""
|
||||
processed = []
|
||||
for m in messages:
|
||||
role = m.get("role", "")
|
||||
content = m.get("content", "")
|
||||
if isinstance(content, list):
|
||||
texts = [
|
||||
item.get("text", "") for item in content if item.get("type") == "text"
|
||||
]
|
||||
text = "\n".join(texts)
|
||||
else:
|
||||
text = str(content)
|
||||
processed.append({"role": role, "text": text})
|
||||
if not processed:
|
||||
return ""
|
||||
# 合并连续同一角色的消息
|
||||
merged = [processed[0]]
|
||||
for msg in processed[1:]:
|
||||
if msg["role"] == merged[-1]["role"]:
|
||||
merged[-1]["text"] += "\n\n" + msg["text"]
|
||||
else:
|
||||
merged.append(msg)
|
||||
# 添加标签
|
||||
parts = []
|
||||
for idx, block in enumerate(merged):
|
||||
role = block["role"]
|
||||
text = block["text"]
|
||||
if role == "assistant":
|
||||
parts.append(f"<|Assistant|>{text}<|end▁of▁sentence|>")
|
||||
elif role in ("user", "system"):
|
||||
if idx > 0:
|
||||
parts.append(f"<|User|>{text}")
|
||||
else:
|
||||
parts.append(text)
|
||||
else:
|
||||
parts.append(text)
|
||||
final_prompt = "".join(parts)
|
||||
# 仅移除 markdown 图片格式(不全部移除 !)- 使用预编译的正则表达式
|
||||
final_prompt = _MARKDOWN_IMAGE_PATTERN.sub(r"[\1](\2)", final_prompt)
|
||||
return final_prompt
|
||||
|
||||
|
||||
# ----------------------------------------------------------------------
|
||||
# OpenAI到Claude格式转换函数
|
||||
# ----------------------------------------------------------------------
|
||||
def convert_claude_to_deepseek(claude_request: dict) -> dict:
|
||||
"""将Claude格式的请求转换为DeepSeek格式(基于现有OpenAI接口)"""
|
||||
messages = claude_request.get("messages", [])
|
||||
model = claude_request.get("model", CLAUDE_DEFAULT_MODEL)
|
||||
|
||||
# 从配置文件读取Claude模型映射
|
||||
claude_mapping = CONFIG.get(
|
||||
"claude_model_mapping", {"fast": "deepseek-chat", "slow": "deepseek-chat"}
|
||||
)
|
||||
|
||||
# Claude模型映射到DeepSeek模型 - 基于配置和模型特征判断
|
||||
if (
|
||||
"opus" in model.lower()
|
||||
or "reasoner" in model.lower()
|
||||
or "slow" in model.lower()
|
||||
):
|
||||
deepseek_model = claude_mapping.get("slow", "deepseek-chat")
|
||||
else:
|
||||
deepseek_model = claude_mapping.get("fast", "deepseek-chat")
|
||||
|
||||
deepseek_request = {"model": deepseek_model, "messages": messages.copy()}
|
||||
|
||||
# 处理system消息 - 将system参数转换为system role消息
|
||||
if "system" in claude_request:
|
||||
system_msg = {"role": "system", "content": claude_request["system"]}
|
||||
deepseek_request["messages"].insert(0, system_msg)
|
||||
|
||||
# 添加可选参数
|
||||
if "temperature" in claude_request:
|
||||
deepseek_request["temperature"] = claude_request["temperature"]
|
||||
if "top_p" in claude_request:
|
||||
deepseek_request["top_p"] = claude_request["top_p"]
|
||||
if "stop_sequences" in claude_request:
|
||||
deepseek_request["stop"] = claude_request["stop_sequences"]
|
||||
if "stream" in claude_request:
|
||||
deepseek_request["stream"] = claude_request["stream"]
|
||||
|
||||
return deepseek_request
|
||||
|
||||
|
||||
def convert_deepseek_to_claude_format(
|
||||
deepseek_response: dict, original_claude_model: str = CLAUDE_DEFAULT_MODEL
|
||||
) -> dict:
|
||||
"""将DeepSeek响应转换为Claude格式的OpenAI响应"""
|
||||
# DeepSeek响应已经是OpenAI格式,只需要修改模型名称
|
||||
if isinstance(deepseek_response, dict):
|
||||
claude_response = deepseek_response.copy()
|
||||
claude_response["model"] = original_claude_model
|
||||
return claude_response
|
||||
|
||||
return deepseek_response
|
||||
@@ -1,90 +0,0 @@
|
||||
# -*- coding: utf-8 -*-
|
||||
"""模型定义模块 - 集中管理所有支持的模型"""
|
||||
|
||||
# DeepSeek 模型列表(官方模型名称)
|
||||
DEEPSEEK_MODELS = [
|
||||
{
|
||||
"id": "deepseek-chat",
|
||||
"object": "model",
|
||||
"created": 1677610602,
|
||||
"owned_by": "deepseek",
|
||||
"permission": [],
|
||||
},
|
||||
{
|
||||
"id": "deepseek-reasoner",
|
||||
"object": "model",
|
||||
"created": 1677610602,
|
||||
"owned_by": "deepseek",
|
||||
"permission": [],
|
||||
},
|
||||
{
|
||||
"id": "deepseek-chat-search",
|
||||
"object": "model",
|
||||
"created": 1677610602,
|
||||
"owned_by": "deepseek",
|
||||
"permission": [],
|
||||
},
|
||||
{
|
||||
"id": "deepseek-reasoner-search",
|
||||
"object": "model",
|
||||
"created": 1677610602,
|
||||
"owned_by": "deepseek",
|
||||
"permission": [],
|
||||
},
|
||||
]
|
||||
|
||||
# Claude 模型映射列表
|
||||
CLAUDE_MODELS = [
|
||||
{
|
||||
"id": "claude-sonnet-4-20250514",
|
||||
"object": "model",
|
||||
"created": 1715635200,
|
||||
"owned_by": "anthropic",
|
||||
},
|
||||
{
|
||||
"id": "claude-sonnet-4-20250514-fast",
|
||||
"object": "model",
|
||||
"created": 1715635200,
|
||||
"owned_by": "anthropic",
|
||||
},
|
||||
{
|
||||
"id": "claude-sonnet-4-20250514-slow",
|
||||
"object": "model",
|
||||
"created": 1715635200,
|
||||
"owned_by": "anthropic",
|
||||
},
|
||||
]
|
||||
|
||||
|
||||
def get_model_config(model: str) -> tuple[bool, bool]:
|
||||
"""根据模型名称获取配置
|
||||
|
||||
Args:
|
||||
model: 模型名称
|
||||
|
||||
Returns:
|
||||
(thinking_enabled, search_enabled) 元组
|
||||
"""
|
||||
model_lower = model.lower()
|
||||
|
||||
if model_lower == "deepseek-chat":
|
||||
return False, False
|
||||
elif model_lower == "deepseek-reasoner":
|
||||
return True, False
|
||||
elif model_lower == "deepseek-chat-search":
|
||||
return False, True
|
||||
elif model_lower == "deepseek-reasoner-search":
|
||||
return True, True
|
||||
else:
|
||||
return None, None # 不支持的模型
|
||||
|
||||
|
||||
def get_openai_models_response() -> dict:
|
||||
"""获取 OpenAI 格式的模型列表响应"""
|
||||
return {"object": "list", "data": DEEPSEEK_MODELS}
|
||||
|
||||
|
||||
def get_claude_models_response() -> dict:
|
||||
"""获取 Claude 格式的模型列表响应"""
|
||||
return {"object": "list", "data": CLAUDE_MODELS}
|
||||
|
||||
253
core/pow.py
253
core/pow.py
@@ -1,253 +0,0 @@
|
||||
# -*- coding: utf-8 -*-
|
||||
"""PoW (Proof of Work) 计算模块"""
|
||||
import base64
|
||||
import ctypes
|
||||
import json
|
||||
import struct
|
||||
import threading
|
||||
import time
|
||||
|
||||
from curl_cffi import requests
|
||||
from wasmtime import Engine, Linker, Module, Store
|
||||
|
||||
from .config import CONFIG, WASM_PATH, logger
|
||||
from .utils import get_account_identifier
|
||||
|
||||
# ----------------------------------------------------------------------
|
||||
# WASM 模块缓存 - 避免每次请求都重新加载
|
||||
# ----------------------------------------------------------------------
|
||||
_wasm_cache_lock = threading.Lock()
|
||||
_wasm_engine = None
|
||||
_wasm_module = None
|
||||
|
||||
|
||||
def _get_cached_wasm_module(wasm_path: str):
|
||||
"""获取缓存的 WASM 模块,首次调用时加载"""
|
||||
global _wasm_engine, _wasm_module
|
||||
|
||||
if _wasm_module is not None:
|
||||
return _wasm_engine, _wasm_module
|
||||
|
||||
with _wasm_cache_lock:
|
||||
# 双重检查锁定
|
||||
if _wasm_module is not None:
|
||||
return _wasm_engine, _wasm_module
|
||||
|
||||
try:
|
||||
with open(wasm_path, "rb") as f:
|
||||
wasm_bytes = f.read()
|
||||
_wasm_engine = Engine()
|
||||
_wasm_module = Module(_wasm_engine, wasm_bytes)
|
||||
logger.info(f"[WASM] 已缓存 WASM 模块: {wasm_path}")
|
||||
except Exception as e:
|
||||
logger.error(f"[WASM] 加载 WASM 模块失败: {e}")
|
||||
raise RuntimeError(f"加载 wasm 文件失败: {wasm_path}, 错误: {e}")
|
||||
|
||||
return _wasm_engine, _wasm_module
|
||||
|
||||
|
||||
# 启动时预加载 WASM 模块
|
||||
try:
|
||||
_get_cached_wasm_module(WASM_PATH)
|
||||
except Exception as e:
|
||||
logger.warning(f"[WASM] 启动时预加载失败(将在首次使用时重试): {e}")
|
||||
|
||||
# get_account_identifier 已移至 core.utils
|
||||
|
||||
|
||||
# ----------------------------------------------------------------------
|
||||
# 使用 WASM 模块计算 PoW 答案的辅助函数
|
||||
# ----------------------------------------------------------------------
|
||||
def compute_pow_answer(
|
||||
algorithm: str,
|
||||
challenge_str: str,
|
||||
salt: str,
|
||||
difficulty: int,
|
||||
expire_at: int,
|
||||
signature: str,
|
||||
target_path: str,
|
||||
wasm_path: str,
|
||||
) -> int:
|
||||
"""
|
||||
使用 WASM 模块计算 DeepSeekHash 答案(answer)。
|
||||
根据 JS 逻辑:
|
||||
- 拼接前缀: "{salt}_{expire_at}_"
|
||||
- 将 challenge 与前缀写入 wasm 内存后调用 wasm_solve 进行求解,
|
||||
- 从 wasm 内存中读取状态与求解结果,
|
||||
- 若状态非 0,则返回整数形式的答案,否则返回 None。
|
||||
|
||||
优化:使用缓存的 WASM 模块,避免每次请求都重新加载文件。
|
||||
"""
|
||||
if algorithm != "DeepSeekHashV1":
|
||||
raise ValueError(f"不支持的算法:{algorithm}")
|
||||
|
||||
prefix = f"{salt}_{expire_at}_"
|
||||
|
||||
# 获取缓存的 WASM 模块(避免重复加载文件)
|
||||
engine, module = _get_cached_wasm_module(wasm_path)
|
||||
|
||||
# 每次调用创建新的 Store 和实例(必须的,因为 Store 不是线程安全的)
|
||||
store = Store(engine)
|
||||
linker = Linker(engine)
|
||||
instance = linker.instantiate(store, module)
|
||||
exports = instance.exports(store)
|
||||
|
||||
try:
|
||||
memory = exports["memory"]
|
||||
add_to_stack = exports["__wbindgen_add_to_stack_pointer"]
|
||||
alloc = exports["__wbindgen_export_0"]
|
||||
wasm_solve = exports["wasm_solve"]
|
||||
except KeyError as e:
|
||||
raise RuntimeError(f"缺少 wasm 导出函数: {e}")
|
||||
|
||||
def write_memory(offset: int, data: bytes):
|
||||
size = len(data)
|
||||
base_addr = ctypes.cast(memory.data_ptr(store), ctypes.c_void_p).value
|
||||
ctypes.memmove(base_addr + offset, data, size)
|
||||
|
||||
def read_memory(offset: int, size: int) -> bytes:
|
||||
base_addr = ctypes.cast(memory.data_ptr(store), ctypes.c_void_p).value
|
||||
return ctypes.string_at(base_addr + offset, size)
|
||||
|
||||
def encode_string(text: str):
|
||||
data = text.encode("utf-8")
|
||||
length = len(data)
|
||||
ptr_val = alloc(store, length, 1)
|
||||
ptr = int(ptr_val.value) if hasattr(ptr_val, "value") else int(ptr_val)
|
||||
write_memory(ptr, data)
|
||||
return ptr, length
|
||||
|
||||
# 1. 申请 16 字节栈空间
|
||||
retptr = add_to_stack(store, -16)
|
||||
# 2. 编码 challenge 与 prefix 到 wasm 内存中
|
||||
ptr_challenge, len_challenge = encode_string(challenge_str)
|
||||
ptr_prefix, len_prefix = encode_string(prefix)
|
||||
# 3. 调用 wasm_solve(注意:difficulty 以 float 形式传入)
|
||||
wasm_solve(
|
||||
store,
|
||||
retptr,
|
||||
ptr_challenge,
|
||||
len_challenge,
|
||||
ptr_prefix,
|
||||
len_prefix,
|
||||
float(difficulty),
|
||||
)
|
||||
# 4. 从 retptr 处读取 4 字节状态和 8 字节求解结果
|
||||
status_bytes = read_memory(retptr, 4)
|
||||
if len(status_bytes) != 4:
|
||||
add_to_stack(store, 16)
|
||||
raise RuntimeError("读取状态字节失败")
|
||||
status = struct.unpack("<i", status_bytes)[0]
|
||||
value_bytes = read_memory(retptr + 8, 8)
|
||||
if len(value_bytes) != 8:
|
||||
add_to_stack(store, 16)
|
||||
raise RuntimeError("读取结果字节失败")
|
||||
value = struct.unpack("<d", value_bytes)[0]
|
||||
# 5. 恢复栈指针
|
||||
add_to_stack(store, 16)
|
||||
if status == 0:
|
||||
return None
|
||||
return int(value)
|
||||
|
||||
|
||||
def get_pow_response(request, max_attempts: int = 3):
|
||||
"""获取 PoW 响应
|
||||
|
||||
Args:
|
||||
request: FastAPI 请求对象
|
||||
max_attempts: 最大重试次数
|
||||
|
||||
Returns:
|
||||
Base64 编码的 PoW 响应,如果失败返回 None
|
||||
"""
|
||||
from .auth import get_auth_headers, choose_new_account
|
||||
from .deepseek import BASE_HEADERS, login_deepseek_via_account, DEEPSEEK_CREATE_POW_URL
|
||||
|
||||
pow_url = DEEPSEEK_CREATE_POW_URL
|
||||
|
||||
attempts = 0
|
||||
while attempts < max_attempts:
|
||||
headers = get_auth_headers(request)
|
||||
try:
|
||||
resp = requests.post(
|
||||
pow_url,
|
||||
headers=headers,
|
||||
json={"target_path": "/api/v0/chat/completion"},
|
||||
timeout=30,
|
||||
impersonate="safari15_3",
|
||||
)
|
||||
except Exception as e:
|
||||
logger.error(f"[get_pow_response] 请求异常: {e}")
|
||||
attempts += 1
|
||||
continue
|
||||
try:
|
||||
data = resp.json()
|
||||
except Exception as e:
|
||||
logger.error(f"[get_pow_response] JSON解析异常: {e}")
|
||||
data = {}
|
||||
if resp.status_code == 200 and data.get("code") == 0:
|
||||
challenge = data["data"]["biz_data"]["challenge"]
|
||||
difficulty = challenge.get("difficulty", 144000)
|
||||
expire_at = challenge.get("expire_at", 1680000000)
|
||||
try:
|
||||
answer = compute_pow_answer(
|
||||
challenge["algorithm"],
|
||||
challenge["challenge"],
|
||||
challenge["salt"],
|
||||
difficulty,
|
||||
expire_at,
|
||||
challenge["signature"],
|
||||
challenge["target_path"],
|
||||
WASM_PATH,
|
||||
)
|
||||
except Exception as e:
|
||||
logger.error(f"[get_pow_response] PoW 答案计算异常: {e}")
|
||||
answer = None
|
||||
if answer is None:
|
||||
logger.warning("[get_pow_response] PoW 答案计算失败,重试中...")
|
||||
resp.close()
|
||||
attempts += 1
|
||||
continue
|
||||
pow_dict = {
|
||||
"algorithm": challenge["algorithm"],
|
||||
"challenge": challenge["challenge"],
|
||||
"salt": challenge["salt"],
|
||||
"answer": answer,
|
||||
"signature": challenge["signature"],
|
||||
"target_path": challenge["target_path"],
|
||||
}
|
||||
pow_str = json.dumps(pow_dict, separators=(",", ":"), ensure_ascii=False)
|
||||
encoded = base64.b64encode(pow_str.encode("utf-8")).decode("utf-8").rstrip()
|
||||
resp.close()
|
||||
return encoded
|
||||
else:
|
||||
code = data.get("code")
|
||||
logger.warning(
|
||||
f"[get_pow_response] 获取 PoW 失败, code={code}, msg={data.get('msg')}"
|
||||
)
|
||||
resp.close()
|
||||
if request.state.use_config_token:
|
||||
current_id = get_account_identifier(request.state.account)
|
||||
if not hasattr(request.state, "tried_accounts"):
|
||||
request.state.tried_accounts = []
|
||||
if current_id not in request.state.tried_accounts:
|
||||
request.state.tried_accounts.append(current_id)
|
||||
new_account = choose_new_account(request.state.tried_accounts)
|
||||
if new_account is None:
|
||||
break
|
||||
try:
|
||||
login_deepseek_via_account(new_account)
|
||||
except Exception as e:
|
||||
logger.error(
|
||||
f"[get_pow_response] 账号 {get_account_identifier(new_account)} 登录失败:{e}"
|
||||
)
|
||||
attempts += 1
|
||||
continue
|
||||
request.state.account = new_account
|
||||
request.state.deepseek_token = new_account.get("token")
|
||||
else:
|
||||
attempts += 1
|
||||
continue
|
||||
attempts += 1
|
||||
return None
|
||||
|
||||
@@ -1,165 +0,0 @@
|
||||
# -*- coding: utf-8 -*-
|
||||
"""会话管理模块 - 封装公共的会话创建和 PoW 获取逻辑"""
|
||||
from curl_cffi import requests as cffi_requests
|
||||
from fastapi import HTTPException, Request
|
||||
|
||||
from .config import logger
|
||||
from .utils import get_account_identifier
|
||||
from .models import get_model_config
|
||||
from .auth import (
|
||||
get_auth_headers,
|
||||
choose_new_account,
|
||||
release_account,
|
||||
refresh_account_token,
|
||||
)
|
||||
from .deepseek import (
|
||||
DEEPSEEK_CREATE_SESSION_URL,
|
||||
DEEPSEEK_CREATE_POW_URL,
|
||||
login_deepseek_via_account,
|
||||
call_completion_endpoint,
|
||||
)
|
||||
from .pow import get_pow_response
|
||||
|
||||
|
||||
def create_session(request: Request, max_attempts: int = 3) -> str | None:
|
||||
"""创建 DeepSeek 会话
|
||||
|
||||
Args:
|
||||
request: FastAPI 请求对象
|
||||
max_attempts: 最大重试次数
|
||||
|
||||
Returns:
|
||||
会话 ID,如果失败返回 None
|
||||
"""
|
||||
attempts = 0
|
||||
token_refreshed = False # 标记是否已尝试刷新 token
|
||||
|
||||
while attempts < max_attempts:
|
||||
headers = get_auth_headers(request)
|
||||
try:
|
||||
resp = cffi_requests.post(
|
||||
DEEPSEEK_CREATE_SESSION_URL,
|
||||
headers=headers,
|
||||
json={"agent": "chat"},
|
||||
impersonate="safari15_3",
|
||||
)
|
||||
except Exception as e:
|
||||
logger.error(f"[create_session] 请求异常: {e}")
|
||||
attempts += 1
|
||||
continue
|
||||
|
||||
try:
|
||||
data = resp.json()
|
||||
except Exception as e:
|
||||
logger.error(f"[create_session] JSON解析异常: {e}")
|
||||
data = {}
|
||||
|
||||
if resp.status_code == 200 and data.get("code") == 0:
|
||||
session_id = data["data"]["biz_data"]["id"]
|
||||
resp.close()
|
||||
return session_id
|
||||
else:
|
||||
code = data.get("code")
|
||||
msg = data.get("msg", "")
|
||||
logger.warning(
|
||||
f"[create_session] 创建会话失败, code={code}, msg={msg}"
|
||||
)
|
||||
resp.close()
|
||||
|
||||
# 配置模式下尝试处理 token 问题
|
||||
if request.state.use_config_token:
|
||||
# token 无效(认证失败)时,先尝试刷新当前账号的 token
|
||||
if code in [40001, 40002, 40003] or "token" in msg.lower() or "unauthorized" in msg.lower():
|
||||
if not token_refreshed:
|
||||
logger.info("[create_session] 检测到 token 可能过期,尝试刷新")
|
||||
if refresh_account_token(request):
|
||||
token_refreshed = True
|
||||
continue # 使用新 token 重试
|
||||
else:
|
||||
logger.warning("[create_session] token 刷新失败,尝试切换账号")
|
||||
|
||||
# token 刷新失败或其他错误,尝试切换账号
|
||||
current_id = get_account_identifier(request.state.account)
|
||||
if not hasattr(request.state, "tried_accounts"):
|
||||
request.state.tried_accounts = []
|
||||
if current_id not in request.state.tried_accounts:
|
||||
request.state.tried_accounts.append(current_id)
|
||||
new_account = choose_new_account(request.state.tried_accounts)
|
||||
if new_account is None:
|
||||
break
|
||||
try:
|
||||
login_deepseek_via_account(new_account)
|
||||
except Exception as e:
|
||||
logger.error(
|
||||
f"[create_session] 账号 {get_account_identifier(new_account)} 登录失败:{e}"
|
||||
)
|
||||
attempts += 1
|
||||
continue
|
||||
request.state.account = new_account
|
||||
request.state.deepseek_token = new_account.get("token")
|
||||
token_refreshed = False # 新账号重置刷新标记
|
||||
else:
|
||||
attempts += 1
|
||||
continue
|
||||
attempts += 1
|
||||
return None
|
||||
|
||||
|
||||
def get_pow(request: Request, max_attempts: int = 3) -> str | None:
|
||||
"""获取 PoW 响应的包装函数
|
||||
|
||||
Args:
|
||||
request: FastAPI 请求对象
|
||||
max_attempts: 最大重试次数
|
||||
|
||||
Returns:
|
||||
Base64 编码的 PoW 响应,如果失败返回 None
|
||||
"""
|
||||
return get_pow_response(request, max_attempts)
|
||||
|
||||
|
||||
def prepare_completion_request(
|
||||
request: Request,
|
||||
session_id: str,
|
||||
prompt: str,
|
||||
thinking_enabled: bool = False,
|
||||
search_enabled: bool = False,
|
||||
max_attempts: int = 3,
|
||||
):
|
||||
"""准备并执行对话补全请求
|
||||
|
||||
Args:
|
||||
request: FastAPI 请求对象
|
||||
session_id: 会话 ID
|
||||
prompt: 处理后的提示词
|
||||
thinking_enabled: 是否启用思考模式
|
||||
search_enabled: 是否启用搜索
|
||||
max_attempts: 最大重试次数
|
||||
|
||||
Returns:
|
||||
DeepSeek 响应对象,如果失败返回 None
|
||||
"""
|
||||
pow_resp = get_pow(request, max_attempts)
|
||||
if not pow_resp:
|
||||
return None
|
||||
|
||||
headers = {**get_auth_headers(request), "x-ds-pow-response": pow_resp}
|
||||
payload = {
|
||||
"chat_session_id": session_id,
|
||||
"parent_message_id": None,
|
||||
"prompt": prompt,
|
||||
"ref_file_ids": [],
|
||||
"thinking_enabled": thinking_enabled,
|
||||
"search_enabled": search_enabled,
|
||||
}
|
||||
|
||||
return call_completion_endpoint(payload, headers, max_attempts)
|
||||
|
||||
|
||||
# get_model_config 已移至 core.models
|
||||
|
||||
|
||||
def cleanup_account(request: Request):
|
||||
"""清理账号资源(将账号放回队列)"""
|
||||
if getattr(request.state, "use_config_token", False) and hasattr(request.state, "account"):
|
||||
release_account(request.state.account)
|
||||
@@ -1,470 +0,0 @@
|
||||
# -*- coding: utf-8 -*-
|
||||
"""DeepSeek SSE 流解析模块
|
||||
|
||||
这个模块包含解析 DeepSeek SSE 响应的公共逻辑,供 openai.py、claude.py 和 accounts.py 共用。
|
||||
合并了原 sse_parser.py 和 stream_parser.py 的功能。
|
||||
"""
|
||||
import json
|
||||
import re
|
||||
from typing import List, Tuple, Optional, Dict, Any, Generator
|
||||
|
||||
from .config import logger
|
||||
from .constants import SKIP_PATTERNS
|
||||
|
||||
# 预编译正则表达式
|
||||
_TOOL_CALL_PATTERN = re.compile(r'\{\s*["\']tool_calls["\']\s*:\s*\[(.*?)\]\s*\}', re.DOTALL)
|
||||
_CITATION_PATTERN = re.compile(r"^\[citation:")
|
||||
|
||||
|
||||
# ----------------------------------------------------------------------
|
||||
# 基础解析函数
|
||||
# ----------------------------------------------------------------------
|
||||
|
||||
def parse_deepseek_sse_line(raw_line: bytes) -> Optional[Dict[str, Any]]:
|
||||
"""解析 DeepSeek SSE 行
|
||||
|
||||
Args:
|
||||
raw_line: 原始字节行
|
||||
|
||||
Returns:
|
||||
解析后的 chunk 字典,如果解析失败或应跳过则返回 None
|
||||
"""
|
||||
try:
|
||||
line = raw_line.decode("utf-8")
|
||||
except Exception as e:
|
||||
logger.warning(f"[parse_deepseek_sse_line] 解码失败: {e}")
|
||||
return None
|
||||
|
||||
if not line or not line.startswith("data:"):
|
||||
return None
|
||||
|
||||
data_str = line[5:].strip()
|
||||
|
||||
if data_str == "[DONE]":
|
||||
return {"type": "done"}
|
||||
|
||||
try:
|
||||
chunk = json.loads(data_str)
|
||||
return chunk
|
||||
except json.JSONDecodeError as e:
|
||||
logger.warning(f"[parse_deepseek_sse_line] JSON解析失败: {e}")
|
||||
return None
|
||||
|
||||
|
||||
def should_skip_chunk(chunk_path: str) -> bool:
|
||||
"""判断是否应该跳过这个 chunk(状态相关,不是内容)"""
|
||||
if chunk_path == "response/search_status":
|
||||
return True
|
||||
return any(kw in chunk_path for kw in SKIP_PATTERNS)
|
||||
|
||||
|
||||
def is_response_finished(chunk_path: str, v_value: Any) -> bool:
|
||||
"""判断是否是响应结束信号"""
|
||||
return chunk_path == "response/status" and isinstance(v_value, str) and v_value == "FINISHED"
|
||||
|
||||
|
||||
def is_finished_signal(chunk_path: str, v_value: str) -> bool:
|
||||
"""判断字符串 v_value 是否是结束信号"""
|
||||
return v_value == "FINISHED" and (not chunk_path or chunk_path == "status")
|
||||
|
||||
|
||||
def is_search_result(item: dict) -> bool:
|
||||
"""判断是否是搜索结果项(url/title/snippet)"""
|
||||
return "url" in item and "title" in item
|
||||
|
||||
|
||||
# ----------------------------------------------------------------------
|
||||
# 内容提取函数
|
||||
# ----------------------------------------------------------------------
|
||||
|
||||
def extract_content_from_item(item: dict, default_type: str = "text") -> Optional[Tuple[str, str]]:
|
||||
"""从包含 content 和 type 的项中提取内容
|
||||
|
||||
返回 (content, content_type) 或 None
|
||||
"""
|
||||
if "content" in item and "type" in item:
|
||||
inner_type = item.get("type", "").upper()
|
||||
content = item.get("content", "")
|
||||
if content:
|
||||
if inner_type == "THINK" or inner_type == "THINKING":
|
||||
return (content, "thinking")
|
||||
elif inner_type == "RESPONSE":
|
||||
return (content, "text")
|
||||
else:
|
||||
return (content, default_type)
|
||||
return None
|
||||
|
||||
|
||||
def extract_content_recursive(items: List[Dict], default_type: str = "text") -> Optional[List[Tuple[str, str]]]:
|
||||
"""递归提取列表中的内容
|
||||
|
||||
返回 [(content, content_type), ...] 列表,
|
||||
如果遇到 FINISHED 信号返回 None
|
||||
"""
|
||||
extracted: List[Tuple[str, str]] = []
|
||||
for item in items:
|
||||
if not isinstance(item, dict):
|
||||
continue
|
||||
|
||||
item_p = item.get("p", "")
|
||||
item_v = item.get("v")
|
||||
|
||||
# 跳过搜索结果项
|
||||
if is_search_result(item):
|
||||
continue
|
||||
|
||||
# 只有当 p="status" (精确匹配) 且 v="FINISHED" 才认为是真正结束
|
||||
if item_p == "status" and item_v == "FINISHED":
|
||||
return None # 信号结束
|
||||
|
||||
# 跳过状态相关
|
||||
if should_skip_chunk(item_p):
|
||||
continue
|
||||
|
||||
# 直接处理包含 content 和 type 的项
|
||||
result = extract_content_from_item(item, default_type)
|
||||
if result:
|
||||
extracted.append(result)
|
||||
continue
|
||||
|
||||
# 确定类型(基于 p 字段)
|
||||
if "thinking" in item_p:
|
||||
content_type = "thinking"
|
||||
elif "content" in item_p or item_p == "response" or item_p == "fragments":
|
||||
content_type = "text"
|
||||
else:
|
||||
content_type = default_type
|
||||
|
||||
# 处理不同的 v 类型
|
||||
if isinstance(item_v, str):
|
||||
if item_v and item_v != "FINISHED":
|
||||
extracted.append((item_v, content_type))
|
||||
elif isinstance(item_v, list):
|
||||
# 内层可能是 [{"content": "text", "type": "THINK/RESPONSE", ...}] 格式
|
||||
for inner in item_v:
|
||||
if isinstance(inner, dict):
|
||||
# 检查内层的 type 字段
|
||||
inner_type = inner.get("type", "").upper()
|
||||
# DeepSeek 使用 THINK 而不是 THINKING
|
||||
if inner_type == "THINK" or inner_type == "THINKING":
|
||||
final_type = "thinking"
|
||||
elif inner_type == "RESPONSE":
|
||||
final_type = "text"
|
||||
else:
|
||||
final_type = content_type # 继承外层类型
|
||||
|
||||
content = inner.get("content", "")
|
||||
if content:
|
||||
extracted.append((content, final_type))
|
||||
elif isinstance(inner, str) and inner:
|
||||
extracted.append((inner, content_type))
|
||||
return extracted
|
||||
|
||||
|
||||
# ----------------------------------------------------------------------
|
||||
# 高级解析函数
|
||||
# ----------------------------------------------------------------------
|
||||
|
||||
def parse_sse_chunk_for_content(
|
||||
chunk: Dict[str, Any],
|
||||
thinking_enabled: bool = False,
|
||||
current_fragment_type: str = "thinking"
|
||||
) -> Tuple[List[Tuple[str, str]], bool, str]:
|
||||
"""解析单个 SSE chunk 并提取内容
|
||||
|
||||
Args:
|
||||
chunk: 解析后的 JSON chunk
|
||||
thinking_enabled: 是否启用思考模式
|
||||
current_fragment_type: 当前活跃的 fragment 类型 ("thinking" 或 "text")
|
||||
用于处理没有明确路径的空 p 字段内容
|
||||
|
||||
Returns:
|
||||
(contents, is_finished, new_fragment_type)
|
||||
- contents: [(content, content_type), ...] 列表
|
||||
- is_finished: 是否是结束信号
|
||||
- new_fragment_type: 更新后的 fragment 类型,供下一个 chunk 使用
|
||||
"""
|
||||
if "v" not in chunk:
|
||||
return ([], False, current_fragment_type)
|
||||
|
||||
v_value = chunk["v"]
|
||||
chunk_path = chunk.get("p", "")
|
||||
contents: List[Tuple[str, str]] = []
|
||||
new_fragment_type = current_fragment_type
|
||||
|
||||
# 跳过状态相关 chunk
|
||||
if should_skip_chunk(chunk_path):
|
||||
return ([], False, current_fragment_type)
|
||||
|
||||
# 检查是否是真正的响应结束信号
|
||||
if is_response_finished(chunk_path, v_value):
|
||||
return ([], True, current_fragment_type)
|
||||
|
||||
# 检测 fragment 类型变化(来自 APPEND 操作)
|
||||
# 格式: {'p': 'response', 'o': 'BATCH', 'v': [{'p': 'fragments', 'o': 'APPEND', 'v': [{'type': 'THINK/RESPONSE', ...}]}]}
|
||||
if chunk_path == "response" and isinstance(v_value, list):
|
||||
for batch_item in v_value:
|
||||
if isinstance(batch_item, dict) and batch_item.get("p") == "fragments" and batch_item.get("o") == "APPEND":
|
||||
fragments = batch_item.get("v", [])
|
||||
for frag in fragments:
|
||||
if isinstance(frag, dict):
|
||||
frag_type = frag.get("type", "").upper()
|
||||
if frag_type == "THINK" or frag_type == "THINKING":
|
||||
new_fragment_type = "thinking"
|
||||
elif frag_type == "RESPONSE":
|
||||
new_fragment_type = "text"
|
||||
|
||||
# 也检测直接的 fragments 路径
|
||||
if "response/fragments" in chunk_path and isinstance(v_value, list):
|
||||
for frag in v_value:
|
||||
if isinstance(frag, dict):
|
||||
frag_type = frag.get("type", "").upper()
|
||||
if frag_type == "THINK" or frag_type == "THINKING":
|
||||
new_fragment_type = "thinking"
|
||||
elif frag_type == "RESPONSE":
|
||||
new_fragment_type = "text"
|
||||
|
||||
# 确定当前内容类型
|
||||
if chunk_path == "response/thinking_content":
|
||||
ptype = "thinking"
|
||||
elif chunk_path == "response/content":
|
||||
ptype = "text"
|
||||
elif "response/fragments" in chunk_path and "/content" in chunk_path:
|
||||
# 如 response/fragments/-1/content - 使用当前 fragment 类型
|
||||
ptype = new_fragment_type
|
||||
elif not chunk_path:
|
||||
# 空路径内容:使用当前活跃的 fragment 类型
|
||||
if thinking_enabled:
|
||||
ptype = new_fragment_type
|
||||
else:
|
||||
ptype = "text"
|
||||
else:
|
||||
ptype = "text"
|
||||
|
||||
# 处理字符串值
|
||||
if isinstance(v_value, str):
|
||||
if is_finished_signal(chunk_path, v_value):
|
||||
return ([], True, new_fragment_type)
|
||||
if v_value:
|
||||
contents.append((v_value, ptype))
|
||||
|
||||
# 处理列表值
|
||||
elif isinstance(v_value, list):
|
||||
result = extract_content_recursive(v_value, ptype)
|
||||
if result is None:
|
||||
return ([], True, new_fragment_type)
|
||||
contents.extend(result)
|
||||
|
||||
# 处理字典值(初始响应 chunk,包含 response.fragments)
|
||||
elif isinstance(v_value, dict):
|
||||
response_obj = v_value.get("response", v_value)
|
||||
fragments = response_obj.get("fragments", [])
|
||||
if isinstance(fragments, list):
|
||||
for frag in fragments:
|
||||
if isinstance(frag, dict):
|
||||
frag_type = frag.get("type", "").upper()
|
||||
frag_content = frag.get("content", "")
|
||||
if frag_type == "THINK" or frag_type == "THINKING":
|
||||
new_fragment_type = "thinking"
|
||||
if frag_content:
|
||||
contents.append((frag_content, "thinking"))
|
||||
elif frag_type == "RESPONSE":
|
||||
new_fragment_type = "text"
|
||||
if frag_content:
|
||||
contents.append((frag_content, "text"))
|
||||
elif frag_content:
|
||||
contents.append((frag_content, ptype))
|
||||
|
||||
return (contents, False, new_fragment_type)
|
||||
|
||||
|
||||
def extract_content_from_chunk(chunk: Dict[str, Any]) -> Tuple[str, str, bool]:
|
||||
"""从 DeepSeek chunk 中提取内容(简化版本,兼容旧接口)
|
||||
|
||||
Args:
|
||||
chunk: 解析后的 chunk 字典
|
||||
|
||||
Returns:
|
||||
(content, content_type, is_finished) 元组
|
||||
content_type 为 "thinking" 或 "text"
|
||||
is_finished 为 True 表示响应结束
|
||||
"""
|
||||
if chunk.get("type") == "done":
|
||||
return "", "text", True
|
||||
|
||||
# 检测内容审核/敏感词阻止
|
||||
if "error" in chunk or chunk.get("code") == "content_filter":
|
||||
logger.warning(f"[extract_content_from_chunk] 检测到内容过滤: {chunk}")
|
||||
return "", "text", True
|
||||
|
||||
if "v" not in chunk:
|
||||
return "", "text", False
|
||||
|
||||
v_value = chunk["v"]
|
||||
ptype = "text"
|
||||
|
||||
# 检查路径确定类型
|
||||
path = chunk.get("p", "")
|
||||
if path == "response/search_status":
|
||||
return "", "text", False # 跳过搜索状态
|
||||
elif path == "response/thinking_content":
|
||||
ptype = "thinking"
|
||||
elif path == "response/content":
|
||||
ptype = "text"
|
||||
|
||||
if isinstance(v_value, str):
|
||||
if v_value == "FINISHED":
|
||||
return "", ptype, True
|
||||
return v_value, ptype, False
|
||||
elif isinstance(v_value, list):
|
||||
for item in v_value:
|
||||
if isinstance(item, dict):
|
||||
if item.get("p") == "status" and item.get("v") == "FINISHED":
|
||||
return "", ptype, True
|
||||
return "", ptype, False
|
||||
|
||||
return "", ptype, False
|
||||
|
||||
|
||||
# ----------------------------------------------------------------------
|
||||
# 响应收集函数
|
||||
# ----------------------------------------------------------------------
|
||||
|
||||
def collect_deepseek_response(response: Any) -> Tuple[str, str]:
|
||||
"""收集 DeepSeek 流响应的完整内容
|
||||
|
||||
Args:
|
||||
response: DeepSeek 流响应对象
|
||||
|
||||
Returns:
|
||||
(reasoning_content, text_content) 元组
|
||||
"""
|
||||
thinking_parts: List[str] = []
|
||||
text_parts: List[str] = []
|
||||
|
||||
try:
|
||||
for raw_line in response.iter_lines():
|
||||
chunk = parse_deepseek_sse_line(raw_line)
|
||||
if not chunk:
|
||||
continue
|
||||
|
||||
content, content_type, is_finished = extract_content_from_chunk(chunk)
|
||||
|
||||
if is_finished:
|
||||
break
|
||||
|
||||
if content:
|
||||
if content_type == "thinking":
|
||||
thinking_parts.append(content)
|
||||
else:
|
||||
text_parts.append(content)
|
||||
except Exception as e:
|
||||
logger.error(f"[collect_deepseek_response] 收集响应失败: {e}")
|
||||
finally:
|
||||
try:
|
||||
response.close()
|
||||
except Exception:
|
||||
pass
|
||||
|
||||
return "".join(thinking_parts), "".join(text_parts)
|
||||
|
||||
|
||||
# ----------------------------------------------------------------------
|
||||
# 工具调用解析
|
||||
# ----------------------------------------------------------------------
|
||||
|
||||
def parse_tool_calls(text: str, tools_requested: List[Dict]) -> List[Dict[str, Any]]:
|
||||
"""从响应文本中解析工具调用
|
||||
|
||||
Args:
|
||||
text: 响应文本
|
||||
tools_requested: 请求中定义的工具列表
|
||||
|
||||
Returns:
|
||||
检测到的工具调用列表,每项包含 name 和 input
|
||||
"""
|
||||
detected_tools: List[Dict[str, Any]] = []
|
||||
cleaned_text = text.strip()
|
||||
|
||||
# 尝试直接解析完整 JSON
|
||||
if cleaned_text.startswith('{"tool_calls":') and cleaned_text.endswith("]}"):
|
||||
try:
|
||||
tool_data = json.loads(cleaned_text)
|
||||
for tool_call in tool_data.get("tool_calls", []):
|
||||
tool_name = tool_call.get("name")
|
||||
tool_input = tool_call.get("input", {})
|
||||
if any(tool.get("name") == tool_name for tool in tools_requested):
|
||||
detected_tools.append({"name": tool_name, "input": tool_input})
|
||||
if detected_tools:
|
||||
return detected_tools
|
||||
except json.JSONDecodeError:
|
||||
pass
|
||||
|
||||
# 使用正则匹配
|
||||
matches = _TOOL_CALL_PATTERN.findall(cleaned_text)
|
||||
for match in matches:
|
||||
try:
|
||||
tool_calls_json = f'{{"tool_calls": [{match}]}}'
|
||||
tool_data = json.loads(tool_calls_json)
|
||||
for tool_call in tool_data.get("tool_calls", []):
|
||||
tool_name = tool_call.get("name")
|
||||
tool_input = tool_call.get("input", {})
|
||||
if any(tool.get("name") == tool_name for tool in tools_requested):
|
||||
detected_tools.append({"name": tool_name, "input": tool_input})
|
||||
except json.JSONDecodeError:
|
||||
continue
|
||||
|
||||
return detected_tools
|
||||
|
||||
|
||||
# ----------------------------------------------------------------------
|
||||
# 引用过滤
|
||||
# ----------------------------------------------------------------------
|
||||
|
||||
def should_filter_citation(text: str, search_enabled: bool) -> bool:
|
||||
"""检查是否应该过滤引用内容
|
||||
|
||||
Args:
|
||||
text: 内容文本
|
||||
search_enabled: 是否启用搜索
|
||||
|
||||
Returns:
|
||||
是否应该过滤
|
||||
"""
|
||||
if not search_enabled:
|
||||
return False
|
||||
return _CITATION_PATTERN.match(text) is not None
|
||||
|
||||
|
||||
# ----------------------------------------------------------------------
|
||||
# 工具调用格式化
|
||||
# ----------------------------------------------------------------------
|
||||
|
||||
def format_openai_tool_calls(
|
||||
detected_tools: List[Dict[str, Any]],
|
||||
base_id: str = ""
|
||||
) -> List[Dict[str, Any]]:
|
||||
"""将检测到的工具调用格式化为 OpenAI API 格式
|
||||
|
||||
Args:
|
||||
detected_tools: parse_tool_calls 返回的工具调用列表
|
||||
base_id: 用于生成唯一 ID 的基础字符串(可选)
|
||||
|
||||
Returns:
|
||||
OpenAI 格式的 tool_calls 数组,例如:
|
||||
[{"id": "call_xxx", "type": "function", "function": {"name": "...", "arguments": "..."}}]
|
||||
"""
|
||||
import random
|
||||
import time
|
||||
|
||||
tool_calls_data = []
|
||||
for idx, tool_info in enumerate(detected_tools):
|
||||
tool_calls_data.append({
|
||||
"id": f"call_{base_id or int(time.time())}_{random.randint(1000,9999)}_{idx}",
|
||||
"type": "function",
|
||||
"function": {
|
||||
"name": tool_info["name"],
|
||||
"arguments": json.dumps(tool_info.get("input", {}), ensure_ascii=False)
|
||||
}
|
||||
})
|
||||
return tool_calls_data
|
||||
@@ -1,29 +0,0 @@
|
||||
# -*- coding: utf-8 -*-
|
||||
"""公共工具函数模块"""
|
||||
|
||||
|
||||
def get_account_identifier(account: dict) -> str:
|
||||
"""返回账号的唯一标识,优先使用 email,否则使用 mobile"""
|
||||
return account.get("email", "").strip() or account.get("mobile", "").strip()
|
||||
|
||||
|
||||
def estimate_tokens(text) -> int:
|
||||
"""估算文本的 token 数量(简单估算:字符数/4)
|
||||
|
||||
Args:
|
||||
text: 字符串或其他类型
|
||||
|
||||
Returns:
|
||||
估算的 token 数量,最小为 1
|
||||
"""
|
||||
if isinstance(text, str):
|
||||
return max(1, len(text) // 4)
|
||||
elif isinstance(text, list):
|
||||
return sum(
|
||||
estimate_tokens(item.get("text", ""))
|
||||
if isinstance(item, dict)
|
||||
else estimate_tokens(str(item))
|
||||
for item in text
|
||||
)
|
||||
else:
|
||||
return max(1, len(str(text)) // 4)
|
||||
151
dev.py
151
dev.py
@@ -1,151 +0,0 @@
|
||||
#!/usr/bin/env python3
|
||||
# -*- coding: utf-8 -*-
|
||||
"""
|
||||
DS2API 开发服务器 - 统一启动后端和前端
|
||||
|
||||
使用方法:
|
||||
python dev.py # 同时启动后端和前端
|
||||
python dev.py --backend # 仅启动后端
|
||||
python dev.py --frontend # 仅启动前端
|
||||
python dev.py --install # 安装所有依赖
|
||||
|
||||
环境变量:
|
||||
PORT - 后端服务端口,默认 5001
|
||||
LOG_LEVEL - 日志级别,默认 INFO
|
||||
"""
|
||||
import os
|
||||
import sys
|
||||
import signal
|
||||
import subprocess
|
||||
import time
|
||||
from pathlib import Path
|
||||
|
||||
# 配置
|
||||
BACKEND_PORT = int(os.getenv("PORT", "5001"))
|
||||
FRONTEND_PORT = 5173
|
||||
HOST = os.getenv("HOST", "0.0.0.0")
|
||||
LOG_LEVEL = os.getenv("LOG_LEVEL", "info").lower()
|
||||
PROJECT_DIR = Path(__file__).parent
|
||||
WEBUI_DIR = PROJECT_DIR / "webui"
|
||||
REQUIREMENTS_FILE = PROJECT_DIR / "requirements.txt"
|
||||
|
||||
processes = []
|
||||
|
||||
|
||||
def install_dependencies():
|
||||
"""安装所有 Python 和 Node.js 依赖"""
|
||||
print("\n📦 安装 Python 依赖...")
|
||||
subprocess.run([
|
||||
sys.executable, "-m", "pip", "install", "-r", str(REQUIREMENTS_FILE), "-q"
|
||||
], check=True)
|
||||
print("✅ Python 依赖安装完成")
|
||||
|
||||
if WEBUI_DIR.exists():
|
||||
print("\n📦 安装前端依赖...")
|
||||
subprocess.run(["npm", "install"], cwd=WEBUI_DIR, check=True)
|
||||
print("✅ 前端依赖安装完成")
|
||||
|
||||
print("\n🎉 所有依赖安装完成!运行 `python dev.py` 启动服务\n")
|
||||
|
||||
|
||||
def signal_handler(sig, frame):
|
||||
"""处理退出信号,终止所有子进程"""
|
||||
print("\n\n🛑 正在关闭所有服务...")
|
||||
for proc in processes:
|
||||
if proc.poll() is None:
|
||||
proc.terminate()
|
||||
try:
|
||||
proc.wait(timeout=3)
|
||||
except subprocess.TimeoutExpired:
|
||||
proc.kill()
|
||||
print("👋 已退出\n")
|
||||
sys.exit(0)
|
||||
|
||||
|
||||
def start_backend():
|
||||
"""启动后端服务"""
|
||||
print(f"🚀 启动后端服务... http://localhost:{BACKEND_PORT}")
|
||||
proc = subprocess.Popen(
|
||||
[
|
||||
sys.executable, "-m", "uvicorn",
|
||||
"app:app",
|
||||
"--host", HOST,
|
||||
"--port", str(BACKEND_PORT),
|
||||
"--reload",
|
||||
"--reload-dir", str(PROJECT_DIR),
|
||||
"--log-level", LOG_LEVEL,
|
||||
],
|
||||
cwd=PROJECT_DIR,
|
||||
)
|
||||
processes.append(proc)
|
||||
return proc
|
||||
|
||||
|
||||
def start_frontend():
|
||||
"""启动前端开发服务器"""
|
||||
if not WEBUI_DIR.exists():
|
||||
print("⚠️ webui 目录不存在,跳过前端启动")
|
||||
return None
|
||||
|
||||
node_modules = WEBUI_DIR / "node_modules"
|
||||
if not node_modules.exists():
|
||||
print("📦 安装前端依赖...")
|
||||
subprocess.run(["npm", "install"], cwd=WEBUI_DIR, check=True)
|
||||
|
||||
print(f"🎨 启动前端服务... http://localhost:{FRONTEND_PORT}")
|
||||
proc = subprocess.Popen(
|
||||
["npm", "run", "dev"],
|
||||
cwd=WEBUI_DIR,
|
||||
)
|
||||
processes.append(proc)
|
||||
return proc
|
||||
|
||||
|
||||
def main():
    """Entry point: parse CLI flags, start the requested services, babysit them."""
    cli_flags = set(sys.argv[1:])
    if cli_flags & {"--install", "-i"}:
        install_dependencies()
        return

    backend_only = bool(cli_flags & {"--backend", "-b"})
    frontend_only = bool(cli_flags & {"--frontend", "-f"})

    # Install handlers before spawning children so Ctrl+C always cleans up.
    signal.signal(signal.SIGINT, signal_handler)
    signal.signal(signal.SIGTERM, signal_handler)

    print("\n" + "=" * 50)
    print(" DS2API 开发服务器")
    print("=" * 50)

    if frontend_only:
        start_frontend()
    elif backend_only:
        start_backend()
    else:
        start_backend()
        time.sleep(1)  # give the backend a head start before the frontend
        start_frontend()

    print("\n" + "-" * 50)
    if not frontend_only:
        print(f"📡 后端 API: http://localhost:{BACKEND_PORT}")
    if not backend_only:
        print(f"🎨 管理界面: http://localhost:{FRONTEND_PORT}")
    print("-" * 50)
    print("按 Ctrl+C 停止所有服务\n")

    # Reap children as they exit; stop looping once none remain.
    try:
        while processes:
            for child in list(processes):
                if child.poll() is not None:
                    processes.remove(child)
            time.sleep(0.5)
    except KeyboardInterrupt:
        signal_handler(None, None)
|
||||
|
||||
|
||||
# Run the launcher only when executed directly (not on import).
if __name__ == "__main__":
    main()
|
||||
@@ -9,24 +9,14 @@
|
||||
|
||||
services:
|
||||
ds2api:
|
||||
build: .
|
||||
build:
|
||||
context: .
|
||||
target: go-builder
|
||||
image: ds2api:dev
|
||||
container_name: ds2api-dev
|
||||
command: [
|
||||
"uvicorn",
|
||||
"app:app",
|
||||
"--host",
|
||||
"0.0.0.0",
|
||||
"--port",
|
||||
"5001",
|
||||
"--reload",
|
||||
"--reload-dir",
|
||||
"/app",
|
||||
"--log-level",
|
||||
"debug"
|
||||
]
|
||||
command: ["go", "run", "./cmd/ds2api"]
|
||||
ports:
|
||||
- "${PORT:-5001}:5001"
|
||||
- "${PORT:-5001}:${PORT:-5001}"
|
||||
env_file:
|
||||
- .env
|
||||
environment:
|
||||
@@ -34,10 +24,7 @@ services:
|
||||
- LOG_LEVEL=DEBUG
|
||||
volumes:
|
||||
# 源代码挂载(开发时实时生效)
|
||||
- ./app.py:/app/app.py:ro
|
||||
- ./core:/app/core:ro
|
||||
- ./routes:/app/routes:ro
|
||||
- ./static:/app/static:ro
|
||||
- ./:/app
|
||||
# 配置文件挂载(便于本地修改)
|
||||
- ./config.json:/app/config.json
|
||||
restart: "no"
|
||||
|
||||
@@ -1,28 +1,17 @@
|
||||
# DS2API 生产环境配置
|
||||
# 使用说明:
|
||||
# 1. 复制 .env.example 为 .env 并填写配置
|
||||
# 2. docker-compose up -d
|
||||
# 3. 主代码更新后:docker-compose up -d --build
|
||||
#
|
||||
# 设计原则:
|
||||
# - 零侵入:所有项目配置通过 .env 文件传递
|
||||
# - 易维护:主代码更新只需重新构建镜像
|
||||
|
||||
services:
|
||||
ds2api:
|
||||
build: .
|
||||
image: ds2api:latest
|
||||
container_name: ds2api
|
||||
ports:
|
||||
- "${PORT:-5001}:5001"
|
||||
- "${PORT:-5001}:${PORT:-5001}"
|
||||
env_file:
|
||||
- .env
|
||||
environment:
|
||||
# 确保容器内使用正确的主机绑定
|
||||
- HOST=0.0.0.0
|
||||
restart: unless-stopped
|
||||
healthcheck:
|
||||
test: ["CMD", "python", "-c", "import urllib.request; urllib.request.urlopen('http://localhost:5001/v1/models')"]
|
||||
test: ["CMD", "wget", "-qO-", "http://localhost:${PORT:-5001}/healthz"]
|
||||
interval: 30s
|
||||
timeout: 10s
|
||||
retries: 3
|
||||
|
||||
17
go.mod
Normal file
17
go.mod
Normal file
@@ -0,0 +1,17 @@
|
||||
module ds2api
|
||||
|
||||
go 1.24
|
||||
|
||||
require (
|
||||
github.com/andybalholm/brotli v1.0.6
|
||||
github.com/go-chi/chi/v5 v5.2.3
|
||||
github.com/google/uuid v1.6.0
|
||||
github.com/refraction-networking/utls v1.8.1
|
||||
github.com/tetratelabs/wazero v1.9.0
|
||||
)
|
||||
|
||||
require (
|
||||
github.com/klauspost/compress v1.17.4 // indirect
|
||||
golang.org/x/crypto v0.36.0 // indirect
|
||||
golang.org/x/sys v0.31.0 // indirect
|
||||
)
|
||||
16
go.sum
Normal file
16
go.sum
Normal file
@@ -0,0 +1,16 @@
|
||||
github.com/andybalholm/brotli v1.0.6 h1:Yf9fFpf49Zrxb9NlQaluyE92/+X7UVHlhMNJN2sxfOI=
|
||||
github.com/andybalholm/brotli v1.0.6/go.mod h1:fO7iG3H7G2nSZ7m0zPUDn85XEX2GTukHGRSepvi9Eig=
|
||||
github.com/go-chi/chi/v5 v5.2.3 h1:WQIt9uxdsAbgIYgid+BpYc+liqQZGMHRaUwp0JUcvdE=
|
||||
github.com/go-chi/chi/v5 v5.2.3/go.mod h1:L2yAIGWB3H+phAw1NxKwWM+7eUH/lU8pOMm5hHcoops=
|
||||
github.com/google/uuid v1.6.0 h1:NIvaJDMOsjHA8n1jAhLSgzrAzy1Hgr+hNrb57e+94F0=
|
||||
github.com/google/uuid v1.6.0/go.mod h1:TIyPZe4MgqvfeYDBFedMoGGpEw/LqOeaOT+nhxU+yHo=
|
||||
github.com/klauspost/compress v1.17.4 h1:Ej5ixsIri7BrIjBkRZLTo6ghwrEtHFk7ijlczPW4fZ4=
|
||||
github.com/klauspost/compress v1.17.4/go.mod h1:/dCuZOvVtNoHsyb+cuJD3itjs3NbnF6KH9zAO4BDxPM=
|
||||
github.com/refraction-networking/utls v1.8.1 h1:yNY1kapmQU8JeM1sSw2H2asfTIwWxIkrMJI0pRUOCAo=
|
||||
github.com/refraction-networking/utls v1.8.1/go.mod h1:jkSOEkLqn+S/jtpEHPOsVv/4V4EVnelwbMQl4vCWXAM=
|
||||
github.com/tetratelabs/wazero v1.9.0 h1:IcZ56OuxrtaEz8UYNRHBrUa9bYeX9oVY93KspZZBf/I=
|
||||
github.com/tetratelabs/wazero v1.9.0/go.mod h1:TSbcXCfFP0L2FGkRPxHphadXPjo1T6W+CseNNY7EkjM=
|
||||
golang.org/x/crypto v0.36.0 h1:AnAEvhDddvBdpY+uR+MyHmuZzzNqXSe/GvuDeob5L34=
|
||||
golang.org/x/crypto v0.36.0/go.mod h1:Y4J0ReaxCR1IMaabaSMugxJES1EpwhBHhv2bDHklZvc=
|
||||
golang.org/x/sys v0.31.0 h1:ioabZlmFYtWhL+TRYpcnNlLwhyxaM9kWTDEmfnprqik=
|
||||
golang.org/x/sys v0.31.0/go.mod h1:BJP2sWEmIv4KK5OTEluFJCKSidICx8ciO85XgH3Ak8k=
|
||||
302
internal/account/pool.go
Normal file
302
internal/account/pool.go
Normal file
@@ -0,0 +1,302 @@
|
||||
package account
|
||||
|
||||
import (
|
||||
"context"
|
||||
"os"
|
||||
"sort"
|
||||
"strconv"
|
||||
"strings"
|
||||
"sync"
|
||||
|
||||
"ds2api/internal/config"
|
||||
)
|
||||
|
||||
// Pool hands out accounts in approximate round-robin order while capping
// concurrent use ("inflight" slots) per account and bounding how many
// callers may queue for a free slot.
type Pool struct {
	store *config.Store
	mu    sync.Mutex
	// queue holds account identifiers in rotation order; an acquired ID is
	// moved to the back (bumpQueue) so other accounts are preferred next.
	queue []string
	// inUse counts inflight slots currently held, keyed by identifier.
	inUse map[string]int
	// waiters are blocked AcquireWait callers, woken FIFO on Release/Reset.
	waiters []chan struct{}
	maxInflightPerAccount  int
	recommendedConcurrency int
	maxQueueSize           int
}
|
||||
|
||||
func NewPool(store *config.Store) *Pool {
|
||||
p := &Pool{
|
||||
store: store,
|
||||
inUse: map[string]int{},
|
||||
maxInflightPerAccount: maxInflightFromEnv(),
|
||||
}
|
||||
p.Reset()
|
||||
return p
|
||||
}
|
||||
|
||||
// Reset rebuilds the rotation queue from the store's current accounts,
// clears all inflight counts, and wakes every queued waiter so it retries
// against the fresh state. Token-holding accounts are sorted to the front
// (stable sort, so the configured order is otherwise preserved).
func (p *Pool) Reset() {
	accounts := p.store.Accounts()
	sort.SliceStable(accounts, func(i, j int) bool {
		iHas := accounts[i].Token != ""
		jHas := accounts[j].Token != ""
		if iHas == jHas {
			return i < j // stable tie-break: keep original order
		}
		return iHas
	})
	ids := make([]string, 0, len(accounts))
	for _, a := range accounts {
		id := a.Identifier()
		if id != "" { // accounts without an identifier cannot be tracked
			ids = append(ids, id)
		}
	}
	// Sizing depends only on env + account count; compute before locking.
	recommended := defaultRecommendedConcurrency(len(ids), p.maxInflightPerAccount)
	queueLimit := maxQueueFromEnv(recommended)
	p.mu.Lock()
	defer p.mu.Unlock()
	p.drainWaitersLocked() // woken waiters loop and see the rebuilt queue
	p.queue = ids
	p.inUse = map[string]int{}
	p.recommendedConcurrency = recommended
	p.maxQueueSize = queueLimit
	config.Logger.Info(
		"[init_account_queue] initialized",
		"total", len(ids),
		"max_inflight_per_account", p.maxInflightPerAccount,
		"recommended_concurrency", p.recommendedConcurrency,
		"max_queue_size", p.maxQueueSize,
	)
}
|
||||
|
||||
// Acquire makes a single non-blocking attempt to claim an inflight slot,
// either on the specific target account or (target == "") on any account
// via round-robin. It returns false when no slot is currently free.
func (p *Pool) Acquire(target string, exclude map[string]bool) (config.Account, bool) {
	p.mu.Lock()
	defer p.mu.Unlock()
	return p.acquireLocked(target, normalizeExclude(exclude))
}
|
||||
|
||||
// AcquireWait claims an inflight slot, blocking in a bounded FIFO wait
// queue until a slot frees up or ctx is done. It returns false when ctx
// expires, when the wait queue is disabled/full, or when the target can
// never be served (excluded or unknown).
func (p *Pool) AcquireWait(ctx context.Context, target string, exclude map[string]bool) (config.Account, bool) {
	if ctx == nil {
		ctx = context.Background()
	}
	exclude = normalizeExclude(exclude)
	for {
		if ctx.Err() != nil {
			return config.Account{}, false
		}

		p.mu.Lock()
		if acc, ok := p.acquireLocked(target, exclude); ok {
			p.mu.Unlock()
			return acc, true
		}
		// Fail fast rather than queueing when queueing cannot help.
		if !p.canQueueLocked(target, exclude) {
			p.mu.Unlock()
			return config.Account{}, false
		}
		waiter := make(chan struct{})
		p.waiters = append(p.waiters, waiter)
		p.mu.Unlock()

		select {
		case <-ctx.Done():
			// Deregister our waiter. NOTE(review): if Release closed this
			// channel just before cancellation, that wake-up is dropped
			// here and no other waiter is re-notified until the next
			// Release — confirm this lost-wakeup window is acceptable.
			p.mu.Lock()
			p.removeWaiterLocked(waiter)
			p.mu.Unlock()
			return config.Account{}, false
		case <-waiter:
			// Woken by Release/Reset: loop and retry from scratch.
		}
	}
}
|
||||
|
||||
// acquireLocked implements the non-blocking claim. With a target it only
// considers that account; otherwise it scans the rotation in two passes,
// preferring accounts that already hold a token. Caller must hold p.mu.
func (p *Pool) acquireLocked(target string, exclude map[string]bool) (config.Account, bool) {
	if target != "" {
		if exclude[target] || p.inUse[target] >= p.maxInflightPerAccount {
			return config.Account{}, false
		}
		acc, ok := p.store.FindAccount(target)
		if !ok {
			return config.Account{}, false
		}
		p.inUse[target]++
		p.bumpQueue(target) // targeted hits still rotate the queue
		return acc, true
	}

	// Pass 1: token-holding accounts; pass 2: the rest.
	if acc, ok := p.tryAcquire(exclude, true); ok {
		return acc, true
	}
	if acc, ok := p.tryAcquire(exclude, false); ok {
		return acc, true
	}
	return config.Account{}, false
}
|
||||
|
||||
func (p *Pool) tryAcquire(exclude map[string]bool, requireToken bool) (config.Account, bool) {
|
||||
for i := 0; i < len(p.queue); i++ {
|
||||
id := p.queue[i]
|
||||
if exclude[id] || p.inUse[id] >= p.maxInflightPerAccount {
|
||||
continue
|
||||
}
|
||||
acc, ok := p.store.FindAccount(id)
|
||||
if !ok {
|
||||
continue
|
||||
}
|
||||
if requireToken && acc.Token == "" {
|
||||
continue
|
||||
}
|
||||
p.inUse[id]++
|
||||
p.bumpQueue(id)
|
||||
return acc, true
|
||||
}
|
||||
return config.Account{}, false
|
||||
}
|
||||
|
||||
func (p *Pool) bumpQueue(accountID string) {
|
||||
for i, id := range p.queue {
|
||||
if id != accountID {
|
||||
continue
|
||||
}
|
||||
p.queue = append(p.queue[:i], p.queue[i+1:]...)
|
||||
p.queue = append(p.queue, accountID)
|
||||
return
|
||||
}
|
||||
}
|
||||
|
||||
func (p *Pool) Release(accountID string) {
|
||||
if accountID == "" {
|
||||
return
|
||||
}
|
||||
p.mu.Lock()
|
||||
defer p.mu.Unlock()
|
||||
count := p.inUse[accountID]
|
||||
if count <= 0 {
|
||||
return
|
||||
}
|
||||
if count == 1 {
|
||||
delete(p.inUse, accountID)
|
||||
p.notifyWaiterLocked()
|
||||
return
|
||||
}
|
||||
p.inUse[accountID] = count - 1
|
||||
p.notifyWaiterLocked()
|
||||
}
|
||||
|
||||
// Status returns a point-in-time diagnostic snapshot: which accounts have
// a free slot, which are busy (and how many slots they hold), queue-sizing
// knobs, and the current waiter count.
func (p *Pool) Status() map[string]any {
	p.mu.Lock()
	defer p.mu.Unlock()
	available := make([]string, 0, len(p.queue))
	inUseAccounts := make([]string, 0, len(p.inUse))
	inUseSlots := 0
	for _, id := range p.queue {
		if p.inUse[id] < p.maxInflightPerAccount {
			available = append(available, id)
		}
	}
	for id, count := range p.inUse {
		if count > 0 {
			inUseAccounts = append(inUseAccounts, id)
			inUseSlots += count
		}
	}
	sort.Strings(inUseAccounts) // map iteration order is random; stabilize
	return map[string]any{
		"available":                len(available),
		"in_use":                   inUseSlots,
		"total":                    len(p.store.Accounts()),
		"available_accounts":       available,
		"in_use_accounts":          inUseAccounts,
		"max_inflight_per_account": p.maxInflightPerAccount,
		"recommended_concurrency":  p.recommendedConcurrency,
		"waiting":                  len(p.waiters),
		"max_queue_size":           p.maxQueueSize,
	}
}
|
||||
|
||||
func maxInflightFromEnv() int {
|
||||
for _, key := range []string{"DS2API_ACCOUNT_MAX_INFLIGHT", "DS2API_ACCOUNT_CONCURRENCY"} {
|
||||
raw := strings.TrimSpace(os.Getenv(key))
|
||||
if raw == "" {
|
||||
continue
|
||||
}
|
||||
n, err := strconv.Atoi(raw)
|
||||
if err == nil && n > 0 {
|
||||
return n
|
||||
}
|
||||
}
|
||||
return 2
|
||||
}
|
||||
|
||||
// defaultRecommendedConcurrency suggests a total request concurrency of
// accountCount * maxInflightPerAccount, substituting the built-in cap of
// 2 for a non-positive per-account value. Zero accounts yields zero.
func defaultRecommendedConcurrency(accountCount, maxInflightPerAccount int) int {
	switch {
	case accountCount <= 0:
		return 0
	case maxInflightPerAccount <= 0:
		return accountCount * 2
	default:
		return accountCount * maxInflightPerAccount
	}
}
|
||||
|
||||
// normalizeExclude guarantees a non-nil set so callers can index freely.
func normalizeExclude(exclude map[string]bool) map[string]bool {
	if exclude != nil {
		return exclude
	}
	return map[string]bool{}
}
|
||||
|
||||
// canQueueLocked reports whether a caller may join the wait queue: any
// target must be servable in principle (not excluded, known to the store),
// and the queue must be enabled (maxQueueSize > 0) and not yet full.
// Caller must hold p.mu.
func (p *Pool) canQueueLocked(target string, exclude map[string]bool) bool {
	if target != "" {
		if exclude[target] {
			return false
		}
		if _, ok := p.store.FindAccount(target); !ok {
			return false
		}
	}
	if p.maxQueueSize <= 0 {
		return false // queueing disabled entirely
	}
	return len(p.waiters) < p.maxQueueSize
}
|
||||
|
||||
// notifyWaiterLocked pops the oldest waiter and closes its channel (FIFO
// wake-up); no-op when nobody is waiting. Caller must hold p.mu.
func (p *Pool) notifyWaiterLocked() {
	if len(p.waiters) == 0 {
		return
	}
	waiter := p.waiters[0]
	p.waiters = p.waiters[1:]
	close(waiter)
}
|
||||
|
||||
func (p *Pool) removeWaiterLocked(waiter chan struct{}) bool {
|
||||
for i, w := range p.waiters {
|
||||
if w != waiter {
|
||||
continue
|
||||
}
|
||||
p.waiters = append(p.waiters[:i], p.waiters[i+1:]...)
|
||||
return true
|
||||
}
|
||||
return false
|
||||
}
|
||||
|
||||
// drainWaitersLocked wakes every queued waiter at once (used by Reset so
// all of them retry against the rebuilt queue). Caller must hold p.mu.
func (p *Pool) drainWaitersLocked() {
	for _, waiter := range p.waiters {
		close(waiter)
	}
	p.waiters = nil
}
|
||||
|
||||
func maxQueueFromEnv(defaultSize int) int {
|
||||
for _, key := range []string{"DS2API_ACCOUNT_MAX_QUEUE", "DS2API_ACCOUNT_QUEUE_SIZE"} {
|
||||
raw := strings.TrimSpace(os.Getenv(key))
|
||||
if raw == "" {
|
||||
continue
|
||||
}
|
||||
n, err := strconv.Atoi(raw)
|
||||
if err == nil && n >= 0 {
|
||||
return n
|
||||
}
|
||||
}
|
||||
if defaultSize < 0 {
|
||||
return 0
|
||||
}
|
||||
return defaultSize
|
||||
}
|
||||
296
internal/account/pool_test.go
Normal file
296
internal/account/pool_test.go
Normal file
@@ -0,0 +1,296 @@
|
||||
package account
|
||||
|
||||
import (
|
||||
"context"
|
||||
"sync"
|
||||
"testing"
|
||||
"time"
|
||||
|
||||
"ds2api/internal/config"
|
||||
)
|
||||
|
||||
// newPoolForTest builds a two-account pool from inline JSON config, with
// all queue/concurrency env overrides cleared except the given cap.
func newPoolForTest(t *testing.T, maxInflight string) *Pool {
	t.Helper()
	t.Setenv("DS2API_ACCOUNT_MAX_INFLIGHT", maxInflight)
	t.Setenv("DS2API_ACCOUNT_CONCURRENCY", "")
	t.Setenv("DS2API_ACCOUNT_MAX_QUEUE", "")
	t.Setenv("DS2API_ACCOUNT_QUEUE_SIZE", "")
	t.Setenv("DS2API_CONFIG_JSON", `{
	"keys":["k1"],
	"accounts":[
		{"email":"acc1@example.com","token":"token1"},
		{"email":"acc2@example.com","token":"token2"}
	]
}`)
	store := config.LoadStore()
	return NewPool(store)
}
|
||||
|
||||
// newSingleAccountPoolForTest is newPoolForTest with a single account, so
// one inflight slot can saturate the whole pool.
func newSingleAccountPoolForTest(t *testing.T, maxInflight string) *Pool {
	t.Helper()
	t.Setenv("DS2API_ACCOUNT_MAX_INFLIGHT", maxInflight)
	t.Setenv("DS2API_ACCOUNT_CONCURRENCY", "")
	t.Setenv("DS2API_ACCOUNT_MAX_QUEUE", "")
	t.Setenv("DS2API_ACCOUNT_QUEUE_SIZE", "")
	t.Setenv("DS2API_CONFIG_JSON", `{
	"keys":["k1"],
	"accounts":[{"email":"acc1@example.com","token":"token1"}]
}`)
	return NewPool(config.LoadStore())
}
|
||||
|
||||
// waitForWaitingCount polls Status() until the "waiting" count equals
// want, failing the test after roughly 800ms.
func waitForWaitingCount(t *testing.T, pool *Pool, want int) {
	t.Helper()
	deadline := time.Now().Add(800 * time.Millisecond)
	for time.Now().Before(deadline) {
		status := pool.Status()
		if got, ok := status["waiting"].(int); ok && got == want {
			return
		}
		time.Sleep(10 * time.Millisecond)
	}
	status := pool.Status()
	t.Fatalf("waiting count did not reach %d, current status=%v", want, status)
}
|
||||
|
||||
// Verifies alternating acc1/acc2 acquisition with two slots each, that a
// fifth acquire fails when all four slots are held, and that releasing
// one slot makes that account acquirable again.
func TestPoolRoundRobinWithConcurrentSlots(t *testing.T) {
	pool := newPoolForTest(t, "2")

	order := make([]string, 0, 4)
	for i := 0; i < 4; i++ {
		acc, ok := pool.Acquire("", nil)
		if !ok {
			t.Fatalf("expected acquire success at step %d", i+1)
		}
		order = append(order, acc.Identifier())
	}
	want := []string{"acc1@example.com", "acc2@example.com", "acc1@example.com", "acc2@example.com"}
	for i := range want {
		if order[i] != want[i] {
			t.Fatalf("unexpected order at %d: got %q want %q (full=%v)", i, order[i], want[i], order)
		}
	}

	if _, ok := pool.Acquire("", nil); ok {
		t.Fatalf("expected acquire to fail when all inflight slots are occupied")
	}

	pool.Release("acc1@example.com")
	acc, ok := pool.Acquire("", nil)
	if !ok || acc.Identifier() != "acc1@example.com" {
		t.Fatalf("expected reacquire acc1 after releasing one slot, got ok=%v id=%q", ok, acc.Identifier())
	}
}
|
||||
|
||||
// Verifies a targeted account cannot be acquired past its inflight cap.
func TestPoolTargetAccountInflightLimit(t *testing.T) {
	pool := newPoolForTest(t, "2")

	for i := 0; i < 2; i++ {
		if _, ok := pool.Acquire("acc1@example.com", nil); !ok {
			t.Fatalf("expected target acquire success at step %d", i+1)
		}
	}
	if _, ok := pool.Acquire("acc1@example.com", nil); ok {
		t.Fatalf("expected third acquire on same target to fail due to inflight limit")
	}
}
|
||||
|
||||
// Races six goroutines at a 2x2-slot pool: exactly four must succeed, two
// must fail, and no account may exceed its two-slot inflight cap.
func TestPoolConcurrentAcquireDistribution(t *testing.T) {
	pool := newPoolForTest(t, "2")

	start := make(chan struct{}) // barrier so all goroutines race together
	results := make(chan string, 6)
	var wg sync.WaitGroup
	for i := 0; i < 6; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			<-start
			acc, ok := pool.Acquire("", nil)
			if !ok {
				results <- "FAIL"
				return
			}
			results <- acc.Identifier()
		}()
	}

	close(start)
	wg.Wait()
	close(results)

	success := 0
	fail := 0
	perAccount := map[string]int{}
	for id := range results {
		if id == "FAIL" {
			fail++
			continue
		}
		success++
		perAccount[id]++
	}
	if success != 4 || fail != 2 {
		t.Fatalf("unexpected concurrent acquire result: success=%d fail=%d perAccount=%v", success, fail, perAccount)
	}
	for id, n := range perAccount {
		if n > 2 {
			t.Fatalf("account %s exceeded inflight limit: %d", id, n)
		}
	}
}
|
||||
|
||||
// With no env override, 2 accounts * default cap 2 should yield a
// recommended concurrency and queue size of 4.
func TestPoolStatusRecommendedConcurrencyDefault(t *testing.T) {
	pool := newPoolForTest(t, "")
	status := pool.Status()

	if got, ok := status["max_inflight_per_account"].(int); !ok || got != 2 {
		t.Fatalf("unexpected max_inflight_per_account: %#v", status["max_inflight_per_account"])
	}
	if got, ok := status["recommended_concurrency"].(int); !ok || got != 4 {
		t.Fatalf("unexpected recommended_concurrency: %#v", status["recommended_concurrency"])
	}
	if got, ok := status["max_queue_size"].(int); !ok || got != 4 {
		t.Fatalf("unexpected max_queue_size: %#v", status["max_queue_size"])
	}
}
|
||||
|
||||
// An explicit cap of 3 over 2 accounts should yield recommended
// concurrency and queue size of 6.
func TestPoolStatusRecommendedConcurrencyRespectsOverride(t *testing.T) {
	pool := newPoolForTest(t, "3")
	status := pool.Status()

	if got, ok := status["max_inflight_per_account"].(int); !ok || got != 3 {
		t.Fatalf("unexpected max_inflight_per_account: %#v", status["max_inflight_per_account"])
	}
	if got, ok := status["recommended_concurrency"].(int); !ok || got != 6 {
		t.Fatalf("unexpected recommended_concurrency: %#v", status["recommended_concurrency"])
	}
	if got, ok := status["max_queue_size"].(int); !ok || got != 6 {
		t.Fatalf("unexpected max_queue_size: %#v", status["max_queue_size"])
	}
}
|
||||
|
||||
// Verifies the legacy DS2API_ACCOUNT_CONCURRENCY alias supplies the cap
// when the primary env var is empty.
func TestPoolAccountConcurrencyAliasEnv(t *testing.T) {
	t.Setenv("DS2API_ACCOUNT_MAX_INFLIGHT", "")
	t.Setenv("DS2API_ACCOUNT_CONCURRENCY", "4")
	t.Setenv("DS2API_CONFIG_JSON", `{
	"keys":["k1"],
	"accounts":[
		{"email":"acc1@example.com","token":"token1"},
		{"email":"acc2@example.com","token":"token2"}
	]
}`)

	pool := NewPool(config.LoadStore())
	status := pool.Status()
	if got, ok := status["max_inflight_per_account"].(int); !ok || got != 4 {
		t.Fatalf("unexpected max_inflight_per_account: %#v", status["max_inflight_per_account"])
	}
	if got, ok := status["recommended_concurrency"].(int); !ok || got != 8 {
		t.Fatalf("unexpected recommended_concurrency: %#v", status["recommended_concurrency"])
	}
	if got, ok := status["max_queue_size"].(int); !ok || got != 8 {
		t.Fatalf("unexpected max_queue_size: %#v", status["max_queue_size"])
	}
}
|
||||
|
||||
// An account with a token but no email must still be counted, listed as
// available, and acquirable.
func TestPoolSupportsTokenOnlyAccount(t *testing.T) {
	t.Setenv("DS2API_ACCOUNT_MAX_INFLIGHT", "1")
	t.Setenv("DS2API_CONFIG_JSON", `{
	"keys":["k1"],
	"accounts":[{"token":"token-only-account"}]
}`)

	pool := NewPool(config.LoadStore())
	status := pool.Status()
	if got, ok := status["total"].(int); !ok || got != 1 {
		t.Fatalf("unexpected total in pool status: %#v", status["total"])
	}
	if got, ok := status["available"].(int); !ok || got != 1 {
		t.Fatalf("unexpected available in pool status: %#v", status["available"])
	}

	acc, ok := pool.Acquire("", nil)
	if !ok {
		t.Fatalf("expected acquire success for token-only account")
	}
	if acc.Token != "token-only-account" {
		t.Fatalf("unexpected token on acquired account: %q", acc.Token)
	}
}
|
||||
|
||||
// A second AcquireWait on a saturated one-account pool must queue, then
// succeed once the first holder releases its slot.
func TestPoolAcquireWaitQueuesAndSucceedsAfterRelease(t *testing.T) {
	pool := newSingleAccountPoolForTest(t, "1")
	first, ok := pool.Acquire("", nil)
	if !ok {
		t.Fatal("expected first acquire to succeed")
	}

	type result struct {
		id string
		ok bool
	}
	resCh := make(chan result, 1)
	ctx, cancel := context.WithTimeout(context.Background(), time.Second)
	defer cancel()
	go func() {
		acc, ok := pool.AcquireWait(ctx, "", nil)
		resCh <- result{id: acc.Identifier(), ok: ok}
	}()

	// Ensure the goroutine is actually queued before releasing.
	waitForWaitingCount(t, pool, 1)
	pool.Release(first.Identifier())

	select {
	case res := <-resCh:
		if !res.ok {
			t.Fatal("expected queued acquire to succeed after release")
		}
		if res.id != "acc1@example.com" {
			t.Fatalf("unexpected account id from queued acquire: %q", res.id)
		}
	case <-time.After(time.Second):
		t.Fatal("timed out waiting for queued acquire result")
	}
}
|
||||
|
||||
// With a queue limit of 1 (derived from one account * cap 1), a second
// queued AcquireWait must fail fast instead of blocking; the first queued
// waiter still succeeds after release.
func TestPoolAcquireWaitQueueLimitReturnsFalse(t *testing.T) {
	pool := newSingleAccountPoolForTest(t, "1")
	first, ok := pool.Acquire("", nil)
	if !ok {
		t.Fatal("expected first acquire to succeed")
	}

	type result struct {
		id string
		ok bool
	}
	firstWaiter := make(chan result, 1)
	ctx1, cancel1 := context.WithTimeout(context.Background(), 1200*time.Millisecond)
	defer cancel1()
	go func() {
		acc, ok := pool.AcquireWait(ctx1, "", nil)
		firstWaiter <- result{id: acc.Identifier(), ok: ok}
	}()
	waitForWaitingCount(t, pool, 1)

	ctx2, cancel2 := context.WithTimeout(context.Background(), 500*time.Millisecond)
	defer cancel2()
	start := time.Now()
	if _, ok := pool.AcquireWait(ctx2, "", nil); ok {
		t.Fatal("expected second queued acquire to fail when queue is full")
	}
	// Queue-full must be detected immediately, not via ctx timeout.
	if time.Since(start) > 120*time.Millisecond {
		t.Fatalf("queue-full acquire should fail fast, took %s", time.Since(start))
	}

	pool.Release(first.Identifier())
	select {
	case res := <-firstWaiter:
		if !res.ok {
			t.Fatal("expected first queued acquire to succeed after release")
		}
	case <-time.After(time.Second):
		t.Fatal("timed out waiting for first queued acquire")
	}
}
|
||||
603
internal/adapter/claude/handler.go
Normal file
603
internal/adapter/claude/handler.go
Normal file
@@ -0,0 +1,603 @@
|
||||
package claude
|
||||
|
||||
import (
|
||||
"encoding/json"
|
||||
"fmt"
|
||||
"io"
|
||||
"net/http"
|
||||
"strings"
|
||||
"time"
|
||||
|
||||
"github.com/go-chi/chi/v5"
|
||||
|
||||
"ds2api/internal/auth"
|
||||
"ds2api/internal/config"
|
||||
"ds2api/internal/deepseek"
|
||||
"ds2api/internal/sse"
|
||||
"ds2api/internal/util"
|
||||
)
|
||||
|
||||
// writeJSON is a package-internal alias for util.WriteJSON, kept to avoid
// mass-renaming all call-sites in this package.
var writeJSON = util.WriteJSON
|
||||
|
||||
// Handler serves the Anthropic-compatible (/anthropic/v1/*) endpoints,
// bridging Claude-style requests onto the DeepSeek upstream client.
type Handler struct {
	Store *config.Store    // runtime configuration store
	Auth  *auth.Resolver   // resolves and acquires an account per request
	DS    *deepseek.Client // upstream DeepSeek API client
}
|
||||
|
||||
// Streaming knobs derived from the deepseek package's constants.
// NOTE(review): names suggest ping cadence, idle cutoff, and a keepalive
// budget for the SSE stream — confirm against their use in
// handleClaudeStreamRealtime before relying on these semantics.
var (
	claudeStreamPingInterval    = time.Duration(deepseek.KeepAliveTimeout) * time.Second
	claudeStreamIdleTimeout     = time.Duration(deepseek.StreamIdleTimeout) * time.Second
	claudeStreamMaxKeepaliveCnt = deepseek.MaxKeepaliveCount
)
|
||||
|
||||
// RegisterRoutes mounts the Anthropic-compatible endpoints on r.
func RegisterRoutes(r chi.Router, h *Handler) {
	r.Get("/anthropic/v1/models", h.ListModels)
	r.Post("/anthropic/v1/messages", h.Messages)
	r.Post("/anthropic/v1/messages/count_tokens", h.CountTokens)
}
|
||||
|
||||
// ListModels serves the static Claude-flavoured model catalogue.
func (h *Handler) ListModels(w http.ResponseWriter, _ *http.Request) {
	writeJSON(w, http.StatusOK, config.ClaudeModelsResponse())
}
|
||||
|
||||
// Messages implements POST /anthropic/v1/messages: it resolves an account,
// translates the Claude-style request into a DeepSeek completion call, and
// replies either as an SSE stream or as a single Anthropic "message".
func (h *Handler) Messages(w http.ResponseWriter, r *http.Request) {
	a, err := h.Auth.Determine(r)
	if err != nil {
		status := http.StatusUnauthorized
		detail := err.Error()
		// "No account free" is a capacity problem, not an auth problem.
		if err == auth.ErrNoAccount {
			status = http.StatusTooManyRequests
		}
		writeJSON(w, status, map[string]any{"error": map[string]any{"type": "invalid_request_error", "message": detail}})
		return
	}
	defer h.Auth.Release(a)

	var req map[string]any
	if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
		writeJSON(w, http.StatusBadRequest, map[string]any{"error": map[string]any{"type": "invalid_request_error", "message": "invalid json"}})
		return
	}
	model, _ := req["model"].(string)
	messagesRaw, _ := req["messages"].([]any)
	if model == "" || len(messagesRaw) == 0 {
		writeJSON(w, http.StatusBadRequest, map[string]any{"error": map[string]any{"type": "invalid_request_error", "message": "Request must include 'model' and 'messages'."}})
		return
	}

	// Normalize Claude message shapes; when tools are requested but no
	// system message exists, prepend a synthetic tool-describing one.
	normalized := normalizeClaudeMessages(messagesRaw)
	payload := cloneMap(req)
	payload["messages"] = normalized
	toolsRequested, _ := req["tools"].([]any)
	if len(toolsRequested) > 0 && !hasSystemMessage(normalized) {
		payload["messages"] = append([]any{map[string]any{"role": "system", "content": buildClaudeToolPrompt(toolsRequested)}}, normalized...)
	}

	// Map onto a DeepSeek model and flattened prompt; unknown models fall
	// back to thinking/search disabled.
	dsPayload := util.ConvertClaudeToDeepSeek(payload, h.Store)
	dsModel, _ := dsPayload["model"].(string)
	thinkingEnabled, searchEnabled, ok := config.GetModelConfig(dsModel)
	if !ok {
		thinkingEnabled = false
		searchEnabled = false
	}
	finalPrompt := util.MessagesPrepare(toMessageMaps(dsPayload["messages"]))

	// Upstream handshake: chat session + proof-of-work, 3 retries each.
	sessionID, err := h.DS.CreateSession(r.Context(), a, 3)
	if err != nil {
		writeJSON(w, http.StatusUnauthorized, map[string]any{"error": map[string]any{"type": "api_error", "message": "invalid token."}})
		return
	}
	pow, err := h.DS.GetPow(r.Context(), a, 3)
	if err != nil {
		writeJSON(w, http.StatusUnauthorized, map[string]any{"error": map[string]any{"type": "api_error", "message": "Failed to get PoW"}})
		return
	}
	requestPayload := map[string]any{
		"chat_session_id":   sessionID,
		"parent_message_id": nil,
		"prompt":            finalPrompt,
		"ref_file_ids":      []any{},
		"thinking_enabled":  thinkingEnabled,
		"search_enabled":    searchEnabled,
	}
	resp, err := h.DS.CallCompletion(r.Context(), a, requestPayload, pow, 3)
	if err != nil {
		writeJSON(w, http.StatusInternalServerError, map[string]any{"error": map[string]any{"type": "api_error", "message": "Failed to get Claude response."}})
		return
	}
	if resp.StatusCode != http.StatusOK {
		defer resp.Body.Close()
		body, _ := io.ReadAll(resp.Body)
		writeJSON(w, http.StatusInternalServerError, map[string]any{"error": map[string]any{"type": "api_error", "message": string(body)}})
		return
	}

	toolNames := extractClaudeToolNames(toolsRequested)
	if util.ToBool(req["stream"]) {
		// Streaming path takes over ownership of resp (it closes the body).
		h.handleClaudeStreamRealtime(w, r, resp, model, normalized, thinkingEnabled, searchEnabled, toolNames)
		return
	}
	// Non-streaming: drain the upstream stream, then assemble one
	// Anthropic message (thinking block, then tool_use blocks or text).
	// NOTE(review): resp.Body is not closed explicitly on this path —
	// confirm sse.CollectStream closes it.
	result := sse.CollectStream(resp, thinkingEnabled, true)
	fullText := result.Text
	fullThinking := result.Thinking
	detected := util.ParseToolCalls(fullText, toolNames)
	content := make([]map[string]any, 0, 4)
	if fullThinking != "" {
		content = append(content, map[string]any{"type": "thinking", "thinking": fullThinking})
	}
	stopReason := "end_turn"
	if len(detected) > 0 {
		stopReason = "tool_use"
		for i, tc := range detected {
			content = append(content, map[string]any{
				"type":  "tool_use",
				"id":    fmt.Sprintf("toolu_%d_%d", time.Now().Unix(), i),
				"name":  tc.Name,
				"input": tc.Input,
			})
		}
	} else {
		if fullText == "" {
			fullText = "抱歉,没有生成有效的响应内容。"
		}
		content = append(content, map[string]any{"type": "text", "text": fullText})
	}
	writeJSON(w, http.StatusOK, map[string]any{
		"id":            fmt.Sprintf("msg_%d", time.Now().UnixNano()),
		"type":          "message",
		"role":          "assistant",
		"model":         model,
		"content":       content,
		"stop_reason":   stopReason,
		"stop_sequence": nil,
		"usage": map[string]any{
			"input_tokens":  util.EstimateTokens(fmt.Sprintf("%v", normalized)),
			"output_tokens": util.EstimateTokens(fullThinking) + util.EstimateTokens(fullText),
		},
	})
}
|
||||
|
||||
func (h *Handler) CountTokens(w http.ResponseWriter, r *http.Request) {
|
||||
a, err := h.Auth.Determine(r)
|
||||
if err != nil {
|
||||
writeJSON(w, http.StatusUnauthorized, map[string]any{"error": err.Error()})
|
||||
return
|
||||
}
|
||||
defer h.Auth.Release(a)
|
||||
|
||||
var req map[string]any
|
||||
if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
|
||||
writeJSON(w, http.StatusBadRequest, map[string]any{"error": "invalid json"})
|
||||
return
|
||||
}
|
||||
model, _ := req["model"].(string)
|
||||
messages, _ := req["messages"].([]any)
|
||||
if model == "" || len(messages) == 0 {
|
||||
writeJSON(w, http.StatusBadRequest, map[string]any{"error": "Request must include 'model' and 'messages'."})
|
||||
return
|
||||
}
|
||||
inputTokens := 0
|
||||
if sys, ok := req["system"].(string); ok {
|
||||
inputTokens += util.EstimateTokens(sys)
|
||||
}
|
||||
for _, item := range messages {
|
||||
msg, ok := item.(map[string]any)
|
||||
if !ok {
|
||||
continue
|
||||
}
|
||||
inputTokens += 2
|
||||
inputTokens += util.EstimateTokens(extractMessageContent(msg["content"]))
|
||||
}
|
||||
if tools, ok := req["tools"].([]any); ok {
|
||||
for _, t := range tools {
|
||||
b, _ := json.Marshal(t)
|
||||
inputTokens += util.EstimateTokens(string(b))
|
||||
}
|
||||
}
|
||||
if inputTokens < 1 {
|
||||
inputTokens = 1
|
||||
}
|
||||
writeJSON(w, http.StatusOK, map[string]any{"input_tokens": inputTokens})
|
||||
}
|
||||
|
||||
// handleClaudeStreamRealtime bridges the upstream DeepSeek SSE stream into
// the Anthropic Messages streaming protocol: message_start, interleaved
// content_block_start/delta/stop frames (thinking and text), then a final
// message_delta (stop_reason + usage) and message_stop.
//
// When toolNames is non-empty, text output is buffered instead of streamed
// so that tool-call JSON can be parsed out in finalize() and emitted as
// tool_use blocks — this prevents raw {"tool_calls":...} JSON from leaking
// into text deltas.
func (h *Handler) handleClaudeStreamRealtime(w http.ResponseWriter, r *http.Request, resp *http.Response, model string, messages []any, thinkingEnabled, searchEnabled bool, toolNames []string) {
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		body, _ := io.ReadAll(resp.Body)
		writeJSON(w, http.StatusInternalServerError, map[string]any{"error": map[string]any{"type": "api_error", "message": string(body)}})
		return
	}

	// Standard SSE headers; X-Accel-Buffering disables proxy buffering.
	w.Header().Set("Content-Type", "text/event-stream")
	w.Header().Set("Cache-Control", "no-cache, no-transform")
	w.Header().Set("Connection", "keep-alive")
	w.Header().Set("X-Accel-Buffering", "no")
	rc := http.NewResponseController(w)
	canFlush := rc.Flush() == nil
	if !canFlush {
		config.Logger.Warn("[claude_stream] response writer does not support flush; streaming may be buffered")
	}
	// send writes one SSE frame ("event: <name>\ndata: <json>\n\n") and
	// flushes immediately when the writer supports it.
	send := func(event string, v any) {
		b, _ := json.Marshal(v)
		_, _ = w.Write([]byte("event: "))
		_, _ = w.Write([]byte(event))
		_, _ = w.Write([]byte("\n"))
		_, _ = w.Write([]byte("data: "))
		_, _ = w.Write(b)
		_, _ = w.Write([]byte("\n\n"))
		if canFlush {
			_ = rc.Flush()
		}
	}
	// sendError emits an Anthropic-style error event frame.
	sendError := func(message string) {
		msg := strings.TrimSpace(message)
		if msg == "" {
			msg = "upstream stream error"
		}
		send("error", map[string]any{
			"type": "error",
			"error": map[string]any{
				"type":    "api_error",
				"message": msg,
			},
		})
	}

	messageID := fmt.Sprintf("msg_%d", time.Now().UnixNano())
	// Rough input-token estimate from the stringified message list.
	inputTokens := util.EstimateTokens(fmt.Sprintf("%v", messages))
	send("message_start", map[string]any{
		"type": "message_start",
		"message": map[string]any{
			"id":            messageID,
			"type":          "message",
			"role":          "assistant",
			"model":         model,
			"content":       []any{},
			"stop_reason":   nil,
			"stop_sequence": nil,
			"usage":         map[string]any{"input_tokens": inputTokens, "output_tokens": 0},
		},
	})

	// The upstream parser needs to know which part type a bare chunk belongs
	// to before any explicit path marker arrives.
	initialType := "text"
	if thinkingEnabled {
		initialType = "thinking"
	}
	parsedLines, done := sse.StartParsedLinePump(r.Context(), resp.Body, thinkingEnabled, initialType)
	bufferToolContent := len(toolNames) > 0
	hasContent := false
	lastContent := time.Now()
	keepaliveCount := 0

	// Accumulators for usage estimation and (when buffering) tool parsing.
	thinking := strings.Builder{}
	text := strings.Builder{}

	// Content-block bookkeeping: Anthropic blocks are indexed and must be
	// explicitly opened/closed; thinking and text blocks never overlap.
	nextBlockIndex := 0
	thinkingBlockOpen := false
	thinkingBlockIndex := -1
	textBlockOpen := false
	textBlockIndex := -1
	ended := false

	closeThinkingBlock := func() {
		if !thinkingBlockOpen {
			return
		}
		send("content_block_stop", map[string]any{
			"type":  "content_block_stop",
			"index": thinkingBlockIndex,
		})
		thinkingBlockOpen = false
		thinkingBlockIndex = -1
	}
	closeTextBlock := func() {
		if !textBlockOpen {
			return
		}
		send("content_block_stop", map[string]any{
			"type":  "content_block_stop",
			"index": textBlockIndex,
		})
		textBlockOpen = false
		textBlockIndex = -1
	}

	// finalize closes any open blocks, emits buffered tool_use/text blocks,
	// and terminates the stream. Idempotent via the `ended` flag.
	finalize := func(stopReason string) {
		if ended {
			return
		}
		ended = true

		closeThinkingBlock()
		closeTextBlock()

		finalThinking := thinking.String()
		finalText := text.String()

		if bufferToolContent {
			detected := util.ParseToolCalls(finalText, toolNames)
			if len(detected) > 0 {
				// Tool calls found: emit each as a complete tool_use block.
				stopReason = "tool_use"
				for i, tc := range detected {
					idx := nextBlockIndex + i
					send("content_block_start", map[string]any{
						"type":  "content_block_start",
						"index": idx,
						"content_block": map[string]any{
							"type":  "tool_use",
							"id":    fmt.Sprintf("toolu_%d_%d", time.Now().Unix(), idx),
							"name":  tc.Name,
							"input": tc.Input,
						},
					})
					send("content_block_stop", map[string]any{
						"type":  "content_block_stop",
						"index": idx,
					})
				}
				nextBlockIndex += len(detected)
			} else if finalText != "" {
				// No tool calls: flush the buffered text as one block.
				idx := nextBlockIndex
				nextBlockIndex++
				send("content_block_start", map[string]any{
					"type":  "content_block_start",
					"index": idx,
					"content_block": map[string]any{
						"type": "text",
						"text": "",
					},
				})
				send("content_block_delta", map[string]any{
					"type":  "content_block_delta",
					"index": idx,
					"delta": map[string]any{
						"type": "text_delta",
						"text": finalText,
					},
				})
				send("content_block_stop", map[string]any{
					"type":  "content_block_stop",
					"index": idx,
				})
			}
		}

		outputTokens := util.EstimateTokens(finalThinking) + util.EstimateTokens(finalText)
		send("message_delta", map[string]any{
			"type": "message_delta",
			"delta": map[string]any{
				"stop_reason":   stopReason,
				"stop_sequence": nil,
			},
			"usage": map[string]any{
				"output_tokens": outputTokens,
			},
		})
		send("message_stop", map[string]any{"type": "message_stop"})
	}

	pingTicker := time.NewTicker(claudeStreamPingInterval)
	defer pingTicker.Stop()

	for {
		select {
		case <-r.Context().Done():
			// Client went away; nothing more to write.
			return
		case <-pingTicker.C:
			// Give up if the upstream never produces content, or goes idle
			// after producing some; otherwise keep the connection warm.
			if !hasContent {
				keepaliveCount++
				if keepaliveCount >= claudeStreamMaxKeepaliveCnt {
					finalize("end_turn")
					return
				}
			}
			if hasContent && time.Since(lastContent) > claudeStreamIdleTimeout {
				finalize("end_turn")
				return
			}
			send("ping", map[string]any{"type": "ping"})
		case parsed, ok := <-parsedLines:
			if !ok {
				// Channel closed: surface the pump's terminal error, if any,
				// then finish the message normally.
				if err := <-done; err != nil {
					sendError(err.Error())
					return
				}
				finalize("end_turn")
				return
			}
			if !parsed.Parsed {
				continue
			}
			if parsed.ErrorMessage != "" {
				sendError(parsed.ErrorMessage)
				return
			}
			if parsed.Stop {
				finalize("end_turn")
				return
			}

			for _, p := range parsed.Parts {
				if p.Text == "" {
					continue
				}
				// Drop citation markers injected by upstream web search.
				if p.Type != "thinking" && searchEnabled && sse.IsCitation(p.Text) {
					continue
				}

				hasContent = true
				lastContent = time.Now()
				keepaliveCount = 0

				if p.Type == "thinking" {
					if !thinkingEnabled {
						continue
					}
					thinking.WriteString(p.Text)
					// A thinking part after text means a new block boundary.
					closeTextBlock()
					if !thinkingBlockOpen {
						thinkingBlockIndex = nextBlockIndex
						nextBlockIndex++
						send("content_block_start", map[string]any{
							"type":  "content_block_start",
							"index": thinkingBlockIndex,
							"content_block": map[string]any{
								"type":     "thinking",
								"thinking": "",
							},
						})
						thinkingBlockOpen = true
					}
					send("content_block_delta", map[string]any{
						"type":  "content_block_delta",
						"index": thinkingBlockIndex,
						"delta": map[string]any{
							"type":     "thinking_delta",
							"thinking": p.Text,
						},
					})
					continue
				}

				text.WriteString(p.Text)
				if bufferToolContent {
					// Tool mode: accumulate only; finalize() decides whether
					// this becomes tool_use blocks or plain text.
					continue
				}
				closeThinkingBlock()
				if !textBlockOpen {
					textBlockIndex = nextBlockIndex
					nextBlockIndex++
					send("content_block_start", map[string]any{
						"type":  "content_block_start",
						"index": textBlockIndex,
						"content_block": map[string]any{
							"type": "text",
							"text": "",
						},
					})
					textBlockOpen = true
				}
				send("content_block_delta", map[string]any{
					"type":  "content_block_delta",
					"index": textBlockIndex,
					"delta": map[string]any{
						"type": "text_delta",
						"text": p.Text,
					},
				})
			}
		}
	}
}
|
||||
|
||||
// normalizeClaudeMessages flattens Claude-style structured message content
// into plain strings so the prompt builder can consume it. Blocks of type
// "text" contribute their text; "tool_result" blocks contribute their
// stringified content; everything else is ignored. Messages that are not
// maps are dropped; string content passes through untouched.
func normalizeClaudeMessages(messages []any) []any {
	normalized := make([]any, 0, len(messages))
	for _, raw := range messages {
		original, isMap := raw.(map[string]any)
		if !isMap {
			continue
		}
		// Shallow-copy so the caller's message maps are never mutated.
		msg := make(map[string]any, len(original))
		for key, value := range original {
			msg[key] = value
		}
		if blocks, isList := original["content"].([]any); isList {
			segments := make([]string, 0, len(blocks))
			for _, rawBlock := range blocks {
				block, blockOK := rawBlock.(map[string]any)
				if !blockOK {
					continue
				}
				blockType, _ := block["type"].(string)
				switch blockType {
				case "text":
					if s, sOK := block["text"].(string); sOK {
						segments = append(segments, s)
					}
				case "tool_result":
					segments = append(segments, fmt.Sprintf("%v", block["content"]))
				}
			}
			msg["content"] = strings.Join(segments, "\n")
		}
		normalized = append(normalized, msg)
	}
	return normalized
}
|
||||
|
||||
func buildClaudeToolPrompt(tools []any) string {
|
||||
parts := []string{"You are Claude, a helpful AI assistant. You have access to these tools:"}
|
||||
for _, t := range tools {
|
||||
m, ok := t.(map[string]any)
|
||||
if !ok {
|
||||
continue
|
||||
}
|
||||
name, _ := m["name"].(string)
|
||||
desc, _ := m["description"].(string)
|
||||
schema, _ := json.Marshal(m["input_schema"])
|
||||
parts = append(parts, fmt.Sprintf("Tool: %s\nDescription: %s\nParameters: %s", name, desc, schema))
|
||||
}
|
||||
parts = append(parts, "When you need to use tools, you can call multiple tools in one response. Output ONLY JSON like {\"tool_calls\":[{\"name\":\"tool\",\"input\":{}}]}")
|
||||
return strings.Join(parts, "\n\n")
|
||||
}
|
||||
|
||||
// hasSystemMessage reports whether any message in the list is a map with
// role == "system". Non-map entries are ignored.
func hasSystemMessage(messages []any) bool {
	for _, raw := range messages {
		if msg, isMap := raw.(map[string]any); isMap && msg["role"] == "system" {
			return true
		}
	}
	return false
}
|
||||
|
||||
// extractClaudeToolNames collects the non-empty "name" fields from a list
// of tool-definition maps, skipping anything malformed.
func extractClaudeToolNames(tools []any) []string {
	names := make([]string, 0, len(tools))
	for _, rawTool := range tools {
		tool, isMap := rawTool.(map[string]any)
		if !isMap {
			continue
		}
		name, isString := tool["name"].(string)
		if isString && name != "" {
			names = append(names, name)
		}
	}
	return names
}
|
||||
|
||||
// toMessageMaps converts a decoded JSON value into a slice of message maps.
// Returns nil when v is not a []any; non-map elements are silently skipped.
func toMessageMaps(v any) []map[string]any {
	items, isList := v.([]any)
	if !isList {
		return nil
	}
	result := make([]map[string]any, 0, len(items))
	for _, raw := range items {
		if msg, isMap := raw.(map[string]any); isMap {
			result = append(result, msg)
		}
	}
	return result
}
|
||||
|
||||
// extractMessageContent coerces a message "content" value to a string:
// strings pass through, lists are stringified element-wise and joined with
// newlines, and anything else is formatted with %v.
func extractMessageContent(v any) string {
	if s, isString := v.(string); isString {
		return s
	}
	if items, isList := v.([]any); isList {
		var sb strings.Builder
		for i, item := range items {
			if i > 0 {
				sb.WriteString("\n")
			}
			fmt.Fprintf(&sb, "%v", item)
		}
		return sb.String()
	}
	return fmt.Sprintf("%v", v)
}
|
||||
|
||||
// cloneMap returns a shallow copy of in. Values are shared, not deep-copied;
// a nil input yields an empty (non-nil) map.
func cloneMap(in map[string]any) map[string]any {
	copied := make(map[string]any, len(in))
	for key, value := range in {
		copied[key] = value
	}
	return copied
}
|
||||
New file: internal/adapter/claude/handler_stream_test.go (257 lines added)
@@ -0,0 +1,257 @@
|
||||
package claude
|
||||
|
||||
import (
|
||||
"ds2api/internal/sse"
|
||||
"encoding/json"
|
||||
"io"
|
||||
"net/http"
|
||||
"net/http/httptest"
|
||||
"strings"
|
||||
"testing"
|
||||
"time"
|
||||
)
|
||||
|
||||
// claudeFrame is one decoded SSE frame from the Claude-compatible stream:
// the event name plus its JSON-decoded data payload.
type claudeFrame struct {
	Event   string         // value of the "event:" line
	Payload map[string]any // JSON object from the "data:" line
}
|
||||
|
||||
func makeClaudeSSEHTTPResponse(lines ...string) *http.Response {
|
||||
body := strings.Join(lines, "\n")
|
||||
if !strings.HasSuffix(body, "\n") {
|
||||
body += "\n"
|
||||
}
|
||||
return &http.Response{
|
||||
StatusCode: http.StatusOK,
|
||||
Header: make(http.Header),
|
||||
Body: io.NopCloser(strings.NewReader(body)),
|
||||
}
|
||||
}
|
||||
|
||||
func parseClaudeFrames(t *testing.T, body string) []claudeFrame {
|
||||
t.Helper()
|
||||
chunks := strings.Split(body, "\n\n")
|
||||
frames := make([]claudeFrame, 0, len(chunks))
|
||||
for _, chunk := range chunks {
|
||||
chunk = strings.TrimSpace(chunk)
|
||||
if chunk == "" {
|
||||
continue
|
||||
}
|
||||
lines := strings.Split(chunk, "\n")
|
||||
eventName := ""
|
||||
dataPayload := ""
|
||||
for _, line := range lines {
|
||||
line = strings.TrimSpace(line)
|
||||
switch {
|
||||
case strings.HasPrefix(line, "event:"):
|
||||
eventName = strings.TrimSpace(strings.TrimPrefix(line, "event:"))
|
||||
case strings.HasPrefix(line, "data:"):
|
||||
dataPayload = strings.TrimSpace(strings.TrimPrefix(line, "data:"))
|
||||
}
|
||||
}
|
||||
if eventName == "" || dataPayload == "" {
|
||||
continue
|
||||
}
|
||||
var payload map[string]any
|
||||
if err := json.Unmarshal([]byte(dataPayload), &payload); err != nil {
|
||||
t.Fatalf("decode frame failed: %v, payload=%s", err, dataPayload)
|
||||
}
|
||||
frames = append(frames, claudeFrame{Event: eventName, Payload: payload})
|
||||
}
|
||||
return frames
|
||||
}
|
||||
|
||||
func findClaudeFrames(frames []claudeFrame, event string) []claudeFrame {
|
||||
out := make([]claudeFrame, 0)
|
||||
for _, f := range frames {
|
||||
if f.Event == event {
|
||||
out = append(out, f)
|
||||
}
|
||||
}
|
||||
return out
|
||||
}
|
||||
|
||||
// TestHandleClaudeStreamRealtimeTextIncrementsWithEventHeaders verifies that
// plain text chunks are streamed incrementally as text_delta frames with the
// mandatory Anthropic event headers, and that the deltas reassemble to the
// full upstream text.
func TestHandleClaudeStreamRealtimeTextIncrementsWithEventHeaders(t *testing.T) {
	h := &Handler{}
	resp := makeClaudeSSEHTTPResponse(
		`data: {"p":"response/content","v":"Hel"}`,
		`data: {"p":"response/content","v":"lo"}`,
		`data: [DONE]`,
	)
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodPost, "/anthropic/v1/messages", nil)

	h.handleClaudeStreamRealtime(rec, req, resp, "claude-sonnet-4-5", []any{map[string]any{"role": "user", "content": "hi"}}, false, false, nil)

	body := rec.Body.String()
	// Protocol frames must carry explicit "event:" headers.
	if !strings.Contains(body, "event: message_start") {
		t.Fatalf("missing event header: message_start, body=%s", body)
	}
	if !strings.Contains(body, "event: content_block_delta") {
		t.Fatalf("missing event header: content_block_delta, body=%s", body)
	}
	if !strings.Contains(body, "event: message_stop") {
		t.Fatalf("missing event header: message_stop, body=%s", body)
	}

	// Two upstream chunks should yield two separate deltas, not one merged one.
	frames := parseClaudeFrames(t, body)
	deltas := findClaudeFrames(frames, "content_block_delta")
	if len(deltas) < 2 {
		t.Fatalf("expected at least 2 text deltas, got=%d body=%s", len(deltas), body)
	}
	combined := strings.Builder{}
	for _, f := range deltas {
		delta, _ := f.Payload["delta"].(map[string]any)
		if delta["type"] == "text_delta" {
			combined.WriteString(asString(delta["text"]))
		}
	}
	if combined.String() != "Hello" {
		t.Fatalf("unexpected combined text: %q body=%s", combined.String(), body)
	}
}
|
||||
|
||||
// TestHandleClaudeStreamRealtimeThinkingDelta verifies that upstream
// thinking_content chunks are forwarded as thinking_delta frames when
// thinking is enabled.
func TestHandleClaudeStreamRealtimeThinkingDelta(t *testing.T) {
	h := &Handler{}
	resp := makeClaudeSSEHTTPResponse(
		`data: {"p":"response/thinking_content","v":"思"}`,
		`data: {"p":"response/thinking_content","v":"考"}`,
		`data: {"p":"response/content","v":"ok"}`,
		`data: [DONE]`,
	)
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodPost, "/anthropic/v1/messages", nil)

	// thinkingEnabled=true is required for thinking parts to be emitted.
	h.handleClaudeStreamRealtime(rec, req, resp, "claude-sonnet-4-5", []any{map[string]any{"role": "user", "content": "hi"}}, true, false, nil)

	frames := parseClaudeFrames(t, rec.Body.String())
	foundThinkingDelta := false
	for _, f := range findClaudeFrames(frames, "content_block_delta") {
		delta, _ := f.Payload["delta"].(map[string]any)
		if delta["type"] == "thinking_delta" {
			foundThinkingDelta = true
			break
		}
	}
	if !foundThinkingDelta {
		t.Fatalf("expected thinking_delta event, body=%s", rec.Body.String())
	}
}
|
||||
|
||||
// TestHandleClaudeStreamRealtimeToolSafety verifies the tool-buffering path:
// when tool names are registered, raw {"tool_calls":...} JSON split across
// chunks must never leak into text deltas; instead the stream must contain a
// tool_use content block and finish with stop_reason=tool_use.
func TestHandleClaudeStreamRealtimeToolSafety(t *testing.T) {
	h := &Handler{}
	// The tool-call JSON is deliberately split mid-object across two chunks.
	resp := makeClaudeSSEHTTPResponse(
		`data: {"p":"response/content","v":"{\"tool_calls\":[{\"name\":\"search\""}`,
		`data: {"p":"response/content","v":",\"input\":{\"q\":\"go\"}}]}"}`,
		`data: [DONE]`,
	)
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodPost, "/anthropic/v1/messages", nil)

	h.handleClaudeStreamRealtime(rec, req, resp, "claude-sonnet-4-5", []any{map[string]any{"role": "user", "content": "use tool"}}, false, false, []string{"search"})

	frames := parseClaudeFrames(t, rec.Body.String())
	// No text delta may contain the raw tool-call JSON.
	for _, f := range findClaudeFrames(frames, "content_block_delta") {
		delta, _ := f.Payload["delta"].(map[string]any)
		if delta["type"] == "text_delta" && strings.Contains(asString(delta["text"]), `"tool_calls"`) {
			t.Fatalf("raw tool_calls JSON leaked in text delta: body=%s", rec.Body.String())
		}
	}

	// A tool_use content block must have been emitted.
	foundToolUse := false
	for _, f := range findClaudeFrames(frames, "content_block_start") {
		contentBlock, _ := f.Payload["content_block"].(map[string]any)
		if contentBlock["type"] == "tool_use" {
			foundToolUse = true
			break
		}
	}
	if !foundToolUse {
		t.Fatalf("expected tool_use block in stream, body=%s", rec.Body.String())
	}

	// The final message_delta must report stop_reason=tool_use.
	foundToolUseStop := false
	for _, f := range findClaudeFrames(frames, "message_delta") {
		delta, _ := f.Payload["delta"].(map[string]any)
		if delta["stop_reason"] == "tool_use" {
			foundToolUseStop = true
			break
		}
	}
	if !foundToolUseStop {
		t.Fatalf("expected stop_reason=tool_use, body=%s", rec.Body.String())
	}
}
|
||||
|
||||
// TestHandleClaudeStreamRealtimeUpstreamErrorEvent verifies that an upstream
// error payload is translated into an Anthropic "error" event frame.
func TestHandleClaudeStreamRealtimeUpstreamErrorEvent(t *testing.T) {
	h := &Handler{}
	resp := makeClaudeSSEHTTPResponse(
		`data: {"error":{"message":"boom"}}`,
	)
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodPost, "/anthropic/v1/messages", nil)

	h.handleClaudeStreamRealtime(rec, req, resp, "claude-sonnet-4-5", []any{map[string]any{"role": "user", "content": "hi"}}, false, false, nil)

	frames := parseClaudeFrames(t, rec.Body.String())
	errFrames := findClaudeFrames(frames, "error")
	if len(errFrames) == 0 {
		t.Fatalf("expected error event frame, body=%s", rec.Body.String())
	}
	if errFrames[0].Payload["type"] != "error" {
		t.Fatalf("expected error payload type, body=%s", rec.Body.String())
	}
}
|
||||
|
||||
// TestHandleClaudeStreamRealtimePingEvent verifies that ping keepalive frames
// are emitted while the upstream is silent. The package-level timing knobs
// are shrunk for the test and restored afterwards.
func TestHandleClaudeStreamRealtimePingEvent(t *testing.T) {
	h := &Handler{}
	oldPing := claudeStreamPingInterval
	oldIdle := claudeStreamIdleTimeout
	oldKeepalive := claudeStreamMaxKeepaliveCnt
	claudeStreamPingInterval = 10 * time.Millisecond
	claudeStreamIdleTimeout = 300 * time.Millisecond
	claudeStreamMaxKeepaliveCnt = 50
	defer func() {
		claudeStreamPingInterval = oldPing
		claudeStreamIdleTimeout = oldIdle
		claudeStreamMaxKeepaliveCnt = oldKeepalive
	}()

	// A pipe lets us delay upstream content past several ping intervals.
	pr, pw := io.Pipe()
	resp := &http.Response{StatusCode: http.StatusOK, Header: make(http.Header), Body: pr}
	go func() {
		time.Sleep(40 * time.Millisecond)
		_, _ = io.WriteString(pw, "data: {\"p\":\"response/content\",\"v\":\"hi\"}\n")
		_, _ = io.WriteString(pw, "data: [DONE]\n")
		_ = pw.Close()
	}()

	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodPost, "/anthropic/v1/messages", nil)
	h.handleClaudeStreamRealtime(rec, req, resp, "claude-sonnet-4-5", []any{map[string]any{"role": "user", "content": "hi"}}, false, false, nil)

	frames := parseClaudeFrames(t, rec.Body.String())
	if len(findClaudeFrames(frames, "ping")) == 0 {
		t.Fatalf("expected ping event in stream, body=%s", rec.Body.String())
	}
}
|
||||
|
||||
// TestCollectDeepSeekRegression is a regression check that sse.CollectStream
// still separates thinking_content from content on the DeepSeek wire format.
func TestCollectDeepSeekRegression(t *testing.T) {
	resp := makeClaudeSSEHTTPResponse(
		`data: {"p":"response/thinking_content","v":"想"}`,
		`data: {"p":"response/content","v":"答"}`,
		`data: [DONE]`,
	)
	result := sse.CollectStream(resp, true, true)
	if result.Thinking != "想" {
		t.Fatalf("unexpected thinking: %q", result.Thinking)
	}
	if result.Text != "答" {
		t.Fatalf("unexpected text: %q", result.Text)
	}
}
|
||||
|
||||
// asString returns v as a string, or "" when v is not a string.
func asString(v any) string {
	if s, ok := v.(string); ok {
		return s
	}
	return ""
}
|
||||
New file: internal/adapter/openai/handler.go (487 lines added)
@@ -0,0 +1,487 @@
|
||||
package openai
|
||||
|
||||
import (
|
||||
"context"
|
||||
"encoding/json"
|
||||
"fmt"
|
||||
"io"
|
||||
"net/http"
|
||||
"strings"
|
||||
"sync"
|
||||
"time"
|
||||
|
||||
"github.com/go-chi/chi/v5"
|
||||
|
||||
"ds2api/internal/auth"
|
||||
"ds2api/internal/config"
|
||||
"ds2api/internal/deepseek"
|
||||
"ds2api/internal/sse"
|
||||
"ds2api/internal/util"
|
||||
)
|
||||
|
||||
// writeJSON is a package-internal alias kept to avoid mass-renaming across
|
||||
// every call-site in this file. It delegates to the shared util version.
|
||||
var writeJSON = util.WriteJSON
|
||||
|
||||
// Handler serves the OpenAI-compatible API surface: it resolves request
// auth, proxies chat completions to the DeepSeek upstream, and tracks
// two-phase (prepare/release) stream leases for the Vercel flow.
type Handler struct {
	Store *config.Store    // runtime configuration store
	Auth  *auth.Resolver   // per-request account/key resolution
	DS    *deepseek.Client // upstream DeepSeek client

	leaseMu      sync.Mutex             // guards streamLeases
	streamLeases map[string]streamLease // lease id -> held auth, with expiry
}
|
||||
|
||||
// streamLease holds a resolved auth reserved between a stream "prepare"
// call and the matching "release"; ExpiresAt bounds how long it may be held.
type streamLease struct {
	Auth      *auth.RequestAuth
	ExpiresAt time.Time
}
|
||||
|
||||
// RegisterRoutes mounts the OpenAI-compatible endpoints on the router.
func RegisterRoutes(r chi.Router, h *Handler) {
	r.Get("/v1/models", h.ListModels)
	r.Post("/v1/chat/completions", h.ChatCompletions)
}
|
||||
|
||||
// ListModels returns the static OpenAI-format model catalog.
func (h *Handler) ListModels(w http.ResponseWriter, _ *http.Request) {
	writeJSON(w, http.StatusOK, config.OpenAIModelsResponse())
}
|
||||
|
||||
// ChatCompletions handles POST /v1/chat/completions. Flow:
//  1. special-case Vercel stream prepare/release requests,
//  2. resolve and hold auth for the duration of the request,
//  3. validate the body and look up the model's thinking/search flags,
//  4. normalize messages (injecting a tool prompt when tools are given),
//  5. create an upstream session, solve PoW, start the completion,
//  6. dispatch to the streaming or non-streaming response path.
func (h *Handler) ChatCompletions(w http.ResponseWriter, r *http.Request) {
	// Vercel two-phase streaming: these control requests short-circuit
	// before normal auth/body handling.
	if isVercelStreamReleaseRequest(r) {
		h.handleVercelStreamRelease(w, r)
		return
	}
	if isVercelStreamPrepareRequest(r) {
		h.handleVercelStreamPrepare(w, r)
		return
	}

	a, err := h.Auth.Determine(r)
	if err != nil {
		status := http.StatusUnauthorized
		detail := err.Error()
		// No free account available maps to 429 rather than 401.
		if err == auth.ErrNoAccount {
			status = http.StatusTooManyRequests
		}
		writeOpenAIError(w, status, detail)
		return
	}
	defer h.Auth.Release(a)
	// Make the resolved auth reachable from downstream context users.
	r = r.WithContext(auth.WithAuth(r.Context(), a))

	var req map[string]any
	if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
		writeOpenAIError(w, http.StatusBadRequest, "invalid json")
		return
	}
	model, _ := req["model"].(string)
	messagesRaw, _ := req["messages"].([]any)
	if model == "" || len(messagesRaw) == 0 {
		writeOpenAIError(w, http.StatusBadRequest, "Request must include 'model' and 'messages'.")
		return
	}
	thinkingEnabled, searchEnabled, ok := config.GetModelConfig(model)
	if !ok {
		writeOpenAIError(w, http.StatusServiceUnavailable, fmt.Sprintf("Model '%s' is not available.", model))
		return
	}

	messages := normalizeMessages(messagesRaw)
	toolNames := []string{}
	// Tool definitions are flattened into the prompt; toolNames drives
	// tool-call detection on the response side.
	if tools, ok := req["tools"].([]any); ok && len(tools) > 0 {
		messages, toolNames = injectToolPrompt(messages, tools)
	}
	finalPrompt := util.MessagesPrepare(messages)

	// Upstream setup, each with 3 retries: session, then proof-of-work.
	sessionID, err := h.DS.CreateSession(r.Context(), a, 3)
	if err != nil {
		if a.UseConfigToken {
			writeOpenAIError(w, http.StatusUnauthorized, "Account token is invalid. Please re-login the account in admin.")
		} else {
			writeOpenAIError(w, http.StatusUnauthorized, "Invalid token. If this should be a DS2API key, add it to config.keys first.")
		}
		return
	}
	pow, err := h.DS.GetPow(r.Context(), a, 3)
	if err != nil {
		writeOpenAIError(w, http.StatusUnauthorized, "Failed to get PoW (invalid token or unknown error).")
		return
	}
	payload := map[string]any{
		"chat_session_id":   sessionID,
		"parent_message_id": nil,
		"prompt":            finalPrompt,
		"ref_file_ids":      []any{},
		"thinking_enabled":  thinkingEnabled,
		"search_enabled":    searchEnabled,
	}
	resp, err := h.DS.CallCompletion(r.Context(), a, payload, pow, 3)
	if err != nil {
		writeOpenAIError(w, http.StatusInternalServerError, "Failed to get completion.")
		return
	}
	if util.ToBool(req["stream"]) {
		h.handleStream(w, r, resp, sessionID, model, finalPrompt, thinkingEnabled, searchEnabled, toolNames)
		return
	}
	h.handleNonStream(w, r.Context(), resp, sessionID, model, finalPrompt, thinkingEnabled, searchEnabled, toolNames)
}
|
||||
|
||||
// handleNonStream drains the entire upstream SSE response and replies with a
// single OpenAI chat.completion object, including reasoning_content when
// thinking is enabled and tool_calls when the text parses as tool output.
//
// NOTE(review): ctx and searchEnabled are accepted but unused here — citation
// filtering appears to happen only on the streaming path; confirm intended.
func (h *Handler) handleNonStream(w http.ResponseWriter, ctx context.Context, resp *http.Response, completionID, model, finalPrompt string, thinkingEnabled, searchEnabled bool, toolNames []string) {
	if resp.StatusCode != http.StatusOK {
		defer resp.Body.Close()
		body, _ := io.ReadAll(resp.Body)
		writeOpenAIError(w, resp.StatusCode, string(body))
		return
	}
	_ = ctx
	// CollectStream consumes the SSE body into final thinking/text strings.
	result := sse.CollectStream(resp, thinkingEnabled, true)

	finalThinking := result.Thinking
	finalText := result.Text
	detected := util.ParseToolCalls(finalText, toolNames)
	finishReason := "stop"
	messageObj := map[string]any{"role": "assistant", "content": finalText}
	if thinkingEnabled && finalThinking != "" {
		messageObj["reasoning_content"] = finalThinking
	}
	if len(detected) > 0 {
		// Tool calls replace plain content, per the OpenAI schema.
		finishReason = "tool_calls"
		messageObj["tool_calls"] = util.FormatOpenAIToolCalls(detected)
		messageObj["content"] = nil
	}
	promptTokens := util.EstimateTokens(finalPrompt)
	reasoningTokens := util.EstimateTokens(finalThinking)
	completionTokens := util.EstimateTokens(finalText)

	writeJSON(w, http.StatusOK, map[string]any{
		"id":      completionID,
		"object":  "chat.completion",
		"created": time.Now().Unix(),
		"model":   model,
		"choices": []map[string]any{{"index": 0, "message": messageObj, "finish_reason": finishReason}},
		"usage": map[string]any{
			"prompt_tokens":     promptTokens,
			"completion_tokens": reasoningTokens + completionTokens,
			"total_tokens":      promptTokens + reasoningTokens + completionTokens,
			"completion_tokens_details": map[string]any{
				"reasoning_tokens": reasoningTokens,
			},
		},
	})
}
|
||||
|
||||
func (h *Handler) handleStream(w http.ResponseWriter, r *http.Request, resp *http.Response, completionID, model, finalPrompt string, thinkingEnabled, searchEnabled bool, toolNames []string) {
|
||||
defer resp.Body.Close()
|
||||
if resp.StatusCode != http.StatusOK {
|
||||
body, _ := io.ReadAll(resp.Body)
|
||||
writeOpenAIError(w, resp.StatusCode, string(body))
|
||||
return
|
||||
}
|
||||
w.Header().Set("Content-Type", "text/event-stream")
|
||||
w.Header().Set("Cache-Control", "no-cache, no-transform")
|
||||
w.Header().Set("Connection", "keep-alive")
|
||||
w.Header().Set("X-Accel-Buffering", "no")
|
||||
rc := http.NewResponseController(w)
|
||||
canFlush := rc.Flush() == nil
|
||||
if !canFlush {
|
||||
config.Logger.Warn("[stream] response writer does not support flush; streaming may be buffered")
|
||||
}
|
||||
|
||||
created := time.Now().Unix()
|
||||
firstChunkSent := false
|
||||
bufferToolContent := len(toolNames) > 0
|
||||
var toolSieve toolStreamSieveState
|
||||
toolCallsEmitted := false
|
||||
initialType := "text"
|
||||
if thinkingEnabled {
|
||||
initialType = "thinking"
|
||||
}
|
||||
parsedLines, done := sse.StartParsedLinePump(r.Context(), resp.Body, thinkingEnabled, initialType)
|
||||
thinking := strings.Builder{}
|
||||
text := strings.Builder{}
|
||||
lastContent := time.Now()
|
||||
hasContent := false
|
||||
keepaliveTicker := time.NewTicker(time.Duration(deepseek.KeepAliveTimeout) * time.Second)
|
||||
defer keepaliveTicker.Stop()
|
||||
keepaliveCountWithoutContent := 0
|
||||
|
||||
sendChunk := func(v any) {
|
||||
b, _ := json.Marshal(v)
|
||||
_, _ = w.Write([]byte("data: "))
|
||||
_, _ = w.Write(b)
|
||||
_, _ = w.Write([]byte("\n\n"))
|
||||
if canFlush {
|
||||
_ = rc.Flush()
|
||||
}
|
||||
}
|
||||
sendDone := func() {
|
||||
_, _ = w.Write([]byte("data: [DONE]\n\n"))
|
||||
if canFlush {
|
||||
_ = rc.Flush()
|
||||
}
|
||||
}
|
||||
|
||||
finalize := func(finishReason string) {
|
||||
finalThinking := thinking.String()
|
||||
finalText := text.String()
|
||||
detected := util.ParseToolCalls(finalText, toolNames)
|
||||
if len(detected) > 0 && !toolCallsEmitted {
|
||||
finishReason = "tool_calls"
|
||||
delta := map[string]any{
|
||||
"tool_calls": util.FormatOpenAIStreamToolCalls(detected),
|
||||
}
|
||||
if !firstChunkSent {
|
||||
delta["role"] = "assistant"
|
||||
firstChunkSent = true
|
||||
}
|
||||
sendChunk(map[string]any{
|
||||
"id": completionID,
|
||||
"object": "chat.completion.chunk",
|
||||
"created": created,
|
||||
"model": model,
|
||||
"choices": []map[string]any{{"delta": delta, "index": 0}},
|
||||
})
|
||||
} else if bufferToolContent {
|
||||
for _, evt := range flushToolSieve(&toolSieve, toolNames) {
|
||||
if evt.Content == "" {
|
||||
continue
|
||||
}
|
||||
delta := map[string]any{
|
||||
"content": evt.Content,
|
||||
}
|
||||
if !firstChunkSent {
|
||||
delta["role"] = "assistant"
|
||||
firstChunkSent = true
|
||||
}
|
||||
sendChunk(map[string]any{
|
||||
"id": completionID,
|
||||
"object": "chat.completion.chunk",
|
||||
"created": created,
|
||||
"model": model,
|
||||
"choices": []map[string]any{{"delta": delta, "index": 0}},
|
||||
})
|
||||
}
|
||||
}
|
||||
if len(detected) > 0 || toolCallsEmitted {
|
||||
finishReason = "tool_calls"
|
||||
}
|
||||
promptTokens := util.EstimateTokens(finalPrompt)
|
||||
reasoningTokens := util.EstimateTokens(finalThinking)
|
||||
completionTokens := util.EstimateTokens(finalText)
|
||||
sendChunk(map[string]any{
|
||||
"id": completionID,
|
||||
"object": "chat.completion.chunk",
|
||||
"created": created,
|
||||
"model": model,
|
||||
"choices": []map[string]any{{"delta": map[string]any{}, "index": 0, "finish_reason": finishReason}},
|
||||
"usage": map[string]any{
|
||||
"prompt_tokens": promptTokens,
|
||||
"completion_tokens": reasoningTokens + completionTokens,
|
||||
"total_tokens": promptTokens + reasoningTokens + completionTokens,
|
||||
"completion_tokens_details": map[string]any{
|
||||
"reasoning_tokens": reasoningTokens,
|
||||
},
|
||||
},
|
||||
})
|
||||
sendDone()
|
||||
}
|
||||
|
||||
for {
|
||||
select {
|
||||
case <-r.Context().Done():
|
||||
return
|
||||
case <-keepaliveTicker.C:
|
||||
if !hasContent {
|
||||
keepaliveCountWithoutContent++
|
||||
if keepaliveCountWithoutContent >= deepseek.MaxKeepaliveCount {
|
||||
finalize("stop")
|
||||
return
|
||||
}
|
||||
}
|
||||
if hasContent && time.Since(lastContent) > time.Duration(deepseek.StreamIdleTimeout)*time.Second {
|
||||
finalize("stop")
|
||||
return
|
||||
}
|
||||
if canFlush {
|
||||
_, _ = w.Write([]byte(": keep-alive\n\n"))
|
||||
_ = rc.Flush()
|
||||
}
|
||||
case parsed, ok := <-parsedLines:
|
||||
if !ok {
|
||||
// Ensure scanner completion is observed only after all queued
|
||||
// SSE lines are drained, avoiding early finalize races.
|
||||
_ = <-done
|
||||
finalize("stop")
|
||||
return
|
||||
}
|
||||
if !parsed.Parsed {
|
||||
continue
|
||||
}
|
||||
if parsed.ContentFilter || parsed.ErrorMessage != "" {
|
||||
finalize("content_filter")
|
||||
return
|
||||
}
|
||||
if parsed.Stop {
|
||||
finalize("stop")
|
||||
return
|
||||
}
|
||||
newChoices := make([]map[string]any, 0, len(parsed.Parts))
|
||||
for _, p := range parsed.Parts {
|
||||
if searchEnabled && sse.IsCitation(p.Text) {
|
||||
continue
|
||||
}
|
||||
if p.Text == "" {
|
||||
continue
|
||||
}
|
||||
hasContent = true
|
||||
lastContent = time.Now()
|
||||
keepaliveCountWithoutContent = 0
|
||||
delta := map[string]any{}
|
||||
if !firstChunkSent {
|
||||
delta["role"] = "assistant"
|
||||
firstChunkSent = true
|
||||
}
|
||||
if p.Type == "thinking" {
|
||||
if thinkingEnabled {
|
||||
thinking.WriteString(p.Text)
|
||||
delta["reasoning_content"] = p.Text
|
||||
}
|
||||
} else {
|
||||
text.WriteString(p.Text)
|
||||
if !bufferToolContent {
|
||||
delta["content"] = p.Text
|
||||
} else {
|
||||
events := processToolSieveChunk(&toolSieve, p.Text, toolNames)
|
||||
if len(events) == 0 {
|
||||
// Keep thinking delta only frame.
|
||||
}
|
||||
for _, evt := range events {
|
||||
if len(evt.ToolCalls) > 0 {
|
||||
toolCallsEmitted = true
|
||||
tcDelta := map[string]any{
|
||||
"tool_calls": util.FormatOpenAIStreamToolCalls(evt.ToolCalls),
|
||||
}
|
||||
if !firstChunkSent {
|
||||
tcDelta["role"] = "assistant"
|
||||
firstChunkSent = true
|
||||
}
|
||||
newChoices = append(newChoices, map[string]any{
|
||||
"delta": tcDelta,
|
||||
"index": 0,
|
||||
})
|
||||
continue
|
||||
}
|
||||
if evt.Content != "" {
|
||||
contentDelta := map[string]any{
|
||||
"content": evt.Content,
|
||||
}
|
||||
if !firstChunkSent {
|
||||
contentDelta["role"] = "assistant"
|
||||
firstChunkSent = true
|
||||
}
|
||||
newChoices = append(newChoices, map[string]any{
|
||||
"delta": contentDelta,
|
||||
"index": 0,
|
||||
})
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
if len(delta) > 0 {
|
||||
newChoices = append(newChoices, map[string]any{"delta": delta, "index": 0})
|
||||
}
|
||||
}
|
||||
if len(newChoices) > 0 {
|
||||
sendChunk(map[string]any{
|
||||
"id": completionID,
|
||||
"object": "chat.completion.chunk",
|
||||
"created": created,
|
||||
"model": model,
|
||||
"choices": newChoices,
|
||||
})
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
func normalizeMessages(raw []any) []map[string]any {
|
||||
out := make([]map[string]any, 0, len(raw))
|
||||
for _, item := range raw {
|
||||
m, ok := item.(map[string]any)
|
||||
if ok {
|
||||
out = append(out, m)
|
||||
}
|
||||
}
|
||||
return out
|
||||
}
|
||||
|
||||
func injectToolPrompt(messages []map[string]any, tools []any) ([]map[string]any, []string) {
|
||||
toolSchemas := make([]string, 0, len(tools))
|
||||
names := make([]string, 0, len(tools))
|
||||
for _, t := range tools {
|
||||
tool, ok := t.(map[string]any)
|
||||
if !ok {
|
||||
continue
|
||||
}
|
||||
fn, _ := tool["function"].(map[string]any)
|
||||
if len(fn) == 0 {
|
||||
fn = tool
|
||||
}
|
||||
name, _ := fn["name"].(string)
|
||||
desc, _ := fn["description"].(string)
|
||||
schema, _ := fn["parameters"].(map[string]any)
|
||||
if name == "" {
|
||||
name = "unknown"
|
||||
}
|
||||
names = append(names, name)
|
||||
if desc == "" {
|
||||
desc = "No description available"
|
||||
}
|
||||
b, _ := json.Marshal(schema)
|
||||
toolSchemas = append(toolSchemas, fmt.Sprintf("Tool: %s\nDescription: %s\nParameters: %s", name, desc, string(b)))
|
||||
}
|
||||
if len(toolSchemas) == 0 {
|
||||
return messages, names
|
||||
}
|
||||
toolPrompt := "You have access to these tools:\n\n" + strings.Join(toolSchemas, "\n\n") + "\n\nWhen you need to use tools, output ONLY this JSON format (no other text):\n{\"tool_calls\": [{\"name\": \"tool_name\", \"input\": {\"param\": \"value\"}}]}\n\nIMPORTANT: If calling tools, output ONLY the JSON. The response must start with { and end with }"
|
||||
|
||||
for i := range messages {
|
||||
if messages[i]["role"] == "system" {
|
||||
old, _ := messages[i]["content"].(string)
|
||||
messages[i]["content"] = strings.TrimSpace(old + "\n\n" + toolPrompt)
|
||||
return messages, names
|
||||
}
|
||||
}
|
||||
messages = append([]map[string]any{{"role": "system", "content": toolPrompt}}, messages...)
|
||||
return messages, names
|
||||
}
|
||||
|
||||
func writeOpenAIError(w http.ResponseWriter, status int, message string) {
|
||||
writeJSON(w, status, map[string]any{
|
||||
"error": map[string]any{
|
||||
"message": message,
|
||||
"type": openAIErrorType(status),
|
||||
},
|
||||
})
|
||||
}
|
||||
|
||||
func openAIErrorType(status int) string {
|
||||
switch status {
|
||||
case http.StatusBadRequest:
|
||||
return "invalid_request_error"
|
||||
case http.StatusUnauthorized:
|
||||
return "authentication_error"
|
||||
case http.StatusForbidden:
|
||||
return "permission_error"
|
||||
case http.StatusTooManyRequests:
|
||||
return "rate_limit_error"
|
||||
case http.StatusServiceUnavailable:
|
||||
return "service_unavailable_error"
|
||||
default:
|
||||
if status >= 500 {
|
||||
return "api_error"
|
||||
}
|
||||
return "invalid_request_error"
|
||||
}
|
||||
}
|
||||
539
internal/adapter/openai/handler_toolcall_test.go
Normal file
539
internal/adapter/openai/handler_toolcall_test.go
Normal file
@@ -0,0 +1,539 @@
|
||||
package openai
|
||||
|
||||
import (
|
||||
"context"
|
||||
"encoding/json"
|
||||
"io"
|
||||
"net/http"
|
||||
"net/http/httptest"
|
||||
"strings"
|
||||
"testing"
|
||||
)
|
||||
|
||||
func makeSSEHTTPResponse(lines ...string) *http.Response {
|
||||
body := strings.Join(lines, "\n")
|
||||
if !strings.HasSuffix(body, "\n") {
|
||||
body += "\n"
|
||||
}
|
||||
return &http.Response{
|
||||
StatusCode: http.StatusOK,
|
||||
Header: make(http.Header),
|
||||
Body: io.NopCloser(strings.NewReader(body)),
|
||||
}
|
||||
}
|
||||
|
||||
func decodeJSONBody(t *testing.T, body string) map[string]any {
|
||||
t.Helper()
|
||||
var out map[string]any
|
||||
if err := json.Unmarshal([]byte(body), &out); err != nil {
|
||||
t.Fatalf("decode json failed: %v, body=%s", err, body)
|
||||
}
|
||||
return out
|
||||
}
|
||||
|
||||
func parseSSEDataFrames(t *testing.T, body string) ([]map[string]any, bool) {
|
||||
t.Helper()
|
||||
lines := strings.Split(body, "\n")
|
||||
frames := make([]map[string]any, 0, len(lines))
|
||||
done := false
|
||||
for _, line := range lines {
|
||||
line = strings.TrimSpace(line)
|
||||
if !strings.HasPrefix(line, "data:") {
|
||||
continue
|
||||
}
|
||||
payload := strings.TrimSpace(strings.TrimPrefix(line, "data:"))
|
||||
if payload == "" {
|
||||
continue
|
||||
}
|
||||
if payload == "[DONE]" {
|
||||
done = true
|
||||
continue
|
||||
}
|
||||
var frame map[string]any
|
||||
if err := json.Unmarshal([]byte(payload), &frame); err != nil {
|
||||
t.Fatalf("decode sse frame failed: %v, payload=%s", err, payload)
|
||||
}
|
||||
frames = append(frames, frame)
|
||||
}
|
||||
return frames, done
|
||||
}
|
||||
|
||||
func streamHasRawToolJSONContent(frames []map[string]any) bool {
|
||||
for _, frame := range frames {
|
||||
choices, _ := frame["choices"].([]any)
|
||||
for _, item := range choices {
|
||||
choice, _ := item.(map[string]any)
|
||||
delta, _ := choice["delta"].(map[string]any)
|
||||
content, _ := delta["content"].(string)
|
||||
if strings.Contains(content, `"tool_calls"`) {
|
||||
return true
|
||||
}
|
||||
}
|
||||
}
|
||||
return false
|
||||
}
|
||||
|
||||
func streamHasToolCallsDelta(frames []map[string]any) bool {
|
||||
for _, frame := range frames {
|
||||
choices, _ := frame["choices"].([]any)
|
||||
for _, item := range choices {
|
||||
choice, _ := item.(map[string]any)
|
||||
delta, _ := choice["delta"].(map[string]any)
|
||||
if _, ok := delta["tool_calls"]; ok {
|
||||
return true
|
||||
}
|
||||
}
|
||||
}
|
||||
return false
|
||||
}
|
||||
|
||||
func streamFinishReason(frames []map[string]any) string {
|
||||
for _, frame := range frames {
|
||||
choices, _ := frame["choices"].([]any)
|
||||
for _, item := range choices {
|
||||
choice, _ := item.(map[string]any)
|
||||
if reason, ok := choice["finish_reason"].(string); ok && reason != "" {
|
||||
return reason
|
||||
}
|
||||
}
|
||||
}
|
||||
return ""
|
||||
}
|
||||
|
||||
func TestHandleNonStreamToolCallInterceptsChatModel(t *testing.T) {
|
||||
h := &Handler{}
|
||||
resp := makeSSEHTTPResponse(
|
||||
`data: {"p":"response/content","v":"{\"tool_calls\":[{\"name\":\"search\",\"input\":{\"q\":\"go\"}}]}"}`,
|
||||
`data: [DONE]`,
|
||||
)
|
||||
rec := httptest.NewRecorder()
|
||||
|
||||
h.handleNonStream(rec, context.Background(), resp, "cid1", "deepseek-chat", "prompt", false, false, []string{"search"})
|
||||
if rec.Code != http.StatusOK {
|
||||
t.Fatalf("unexpected status: %d", rec.Code)
|
||||
}
|
||||
|
||||
out := decodeJSONBody(t, rec.Body.String())
|
||||
choices, _ := out["choices"].([]any)
|
||||
if len(choices) != 1 {
|
||||
t.Fatalf("unexpected choices: %#v", out["choices"])
|
||||
}
|
||||
choice, _ := choices[0].(map[string]any)
|
||||
if choice["finish_reason"] != "tool_calls" {
|
||||
t.Fatalf("expected finish_reason=tool_calls, got %#v", choice["finish_reason"])
|
||||
}
|
||||
msg, _ := choice["message"].(map[string]any)
|
||||
if msg["content"] != nil {
|
||||
t.Fatalf("expected content nil, got %#v", msg["content"])
|
||||
}
|
||||
toolCalls, _ := msg["tool_calls"].([]any)
|
||||
if len(toolCalls) != 1 {
|
||||
t.Fatalf("expected 1 tool call, got %#v", msg["tool_calls"])
|
||||
}
|
||||
}
|
||||
|
||||
func TestHandleNonStreamToolCallInterceptsReasonerModel(t *testing.T) {
|
||||
h := &Handler{}
|
||||
resp := makeSSEHTTPResponse(
|
||||
`data: {"p":"response/thinking_content","v":"先想一下"}`,
|
||||
`data: {"p":"response/content","v":"{\"tool_calls\":[{\"name\":\"search\",\"input\":{\"q\":\"go\"}}]}"}`,
|
||||
`data: [DONE]`,
|
||||
)
|
||||
rec := httptest.NewRecorder()
|
||||
|
||||
h.handleNonStream(rec, context.Background(), resp, "cid2", "deepseek-reasoner", "prompt", true, false, []string{"search"})
|
||||
if rec.Code != http.StatusOK {
|
||||
t.Fatalf("unexpected status: %d", rec.Code)
|
||||
}
|
||||
|
||||
out := decodeJSONBody(t, rec.Body.String())
|
||||
choices, _ := out["choices"].([]any)
|
||||
choice, _ := choices[0].(map[string]any)
|
||||
msg, _ := choice["message"].(map[string]any)
|
||||
if msg["reasoning_content"] != "先想一下" {
|
||||
t.Fatalf("expected reasoning_content, got %#v", msg["reasoning_content"])
|
||||
}
|
||||
if msg["content"] != nil {
|
||||
t.Fatalf("expected content nil, got %#v", msg["content"])
|
||||
}
|
||||
if choice["finish_reason"] != "tool_calls" {
|
||||
t.Fatalf("expected finish_reason=tool_calls, got %#v", choice["finish_reason"])
|
||||
}
|
||||
}
|
||||
|
||||
func TestHandleNonStreamUnknownToolStillIntercepted(t *testing.T) {
|
||||
h := &Handler{}
|
||||
resp := makeSSEHTTPResponse(
|
||||
`data: {"p":"response/content","v":"{\"tool_calls\":[{\"name\":\"not_in_schema\",\"input\":{\"q\":\"go\"}}]}"}`,
|
||||
`data: [DONE]`,
|
||||
)
|
||||
rec := httptest.NewRecorder()
|
||||
|
||||
h.handleNonStream(rec, context.Background(), resp, "cid2b", "deepseek-chat", "prompt", false, false, []string{"search"})
|
||||
if rec.Code != http.StatusOK {
|
||||
t.Fatalf("unexpected status: %d", rec.Code)
|
||||
}
|
||||
|
||||
out := decodeJSONBody(t, rec.Body.String())
|
||||
choices, _ := out["choices"].([]any)
|
||||
choice, _ := choices[0].(map[string]any)
|
||||
if choice["finish_reason"] != "tool_calls" {
|
||||
t.Fatalf("expected finish_reason=tool_calls, got %#v", choice["finish_reason"])
|
||||
}
|
||||
msg, _ := choice["message"].(map[string]any)
|
||||
if msg["content"] != nil {
|
||||
t.Fatalf("expected content nil, got %#v", msg["content"])
|
||||
}
|
||||
toolCalls, _ := msg["tool_calls"].([]any)
|
||||
if len(toolCalls) != 1 {
|
||||
t.Fatalf("expected 1 tool call, got %#v", msg["tool_calls"])
|
||||
}
|
||||
}
|
||||
|
||||
func TestHandleStreamToolCallInterceptsWithoutRawContentLeak(t *testing.T) {
|
||||
h := &Handler{}
|
||||
resp := makeSSEHTTPResponse(
|
||||
`data: {"p":"response/content","v":"{\"tool_calls\":[{\"name\":\"search\""}`,
|
||||
`data: {"p":"response/content","v":",\"input\":{\"q\":\"go\"}}]}"}`,
|
||||
`data: [DONE]`,
|
||||
)
|
||||
rec := httptest.NewRecorder()
|
||||
req := httptest.NewRequest(http.MethodPost, "/v1/chat/completions", nil)
|
||||
|
||||
h.handleStream(rec, req, resp, "cid3", "deepseek-chat", "prompt", false, false, []string{"search"})
|
||||
|
||||
frames, done := parseSSEDataFrames(t, rec.Body.String())
|
||||
if !done {
|
||||
t.Fatalf("expected [DONE], body=%s", rec.Body.String())
|
||||
}
|
||||
if !streamHasToolCallsDelta(frames) {
|
||||
t.Fatalf("expected tool_calls delta, body=%s", rec.Body.String())
|
||||
}
|
||||
foundToolIndex := false
|
||||
for _, frame := range frames {
|
||||
choices, _ := frame["choices"].([]any)
|
||||
for _, item := range choices {
|
||||
choice, _ := item.(map[string]any)
|
||||
delta, _ := choice["delta"].(map[string]any)
|
||||
toolCalls, _ := delta["tool_calls"].([]any)
|
||||
for _, tc := range toolCalls {
|
||||
tcm, _ := tc.(map[string]any)
|
||||
if _, ok := tcm["index"].(float64); ok {
|
||||
foundToolIndex = true
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
if !foundToolIndex {
|
||||
t.Fatalf("expected stream tool_calls item with index, body=%s", rec.Body.String())
|
||||
}
|
||||
if streamHasRawToolJSONContent(frames) {
|
||||
t.Fatalf("raw tool_calls JSON leaked in content delta: %s", rec.Body.String())
|
||||
}
|
||||
if streamFinishReason(frames) != "tool_calls" {
|
||||
t.Fatalf("expected finish_reason=tool_calls, body=%s", rec.Body.String())
|
||||
}
|
||||
}
|
||||
|
||||
func TestHandleStreamReasonerToolCallInterceptsWithoutRawContentLeak(t *testing.T) {
|
||||
h := &Handler{}
|
||||
resp := makeSSEHTTPResponse(
|
||||
`data: {"p":"response/thinking_content","v":"思考中"}`,
|
||||
`data: {"p":"response/content","v":"{\"tool_calls\":[{\"name\":\"search\",\"input\":{\"q\":\"go\"}}]}"}`,
|
||||
`data: [DONE]`,
|
||||
)
|
||||
rec := httptest.NewRecorder()
|
||||
req := httptest.NewRequest(http.MethodPost, "/v1/chat/completions", nil)
|
||||
|
||||
h.handleStream(rec, req, resp, "cid4", "deepseek-reasoner", "prompt", true, false, []string{"search"})
|
||||
|
||||
frames, done := parseSSEDataFrames(t, rec.Body.String())
|
||||
if !done {
|
||||
t.Fatalf("expected [DONE], body=%s", rec.Body.String())
|
||||
}
|
||||
if !streamHasToolCallsDelta(frames) {
|
||||
t.Fatalf("expected tool_calls delta, body=%s", rec.Body.String())
|
||||
}
|
||||
foundToolIndex := false
|
||||
for _, frame := range frames {
|
||||
choices, _ := frame["choices"].([]any)
|
||||
for _, item := range choices {
|
||||
choice, _ := item.(map[string]any)
|
||||
delta, _ := choice["delta"].(map[string]any)
|
||||
toolCalls, _ := delta["tool_calls"].([]any)
|
||||
for _, tc := range toolCalls {
|
||||
tcm, _ := tc.(map[string]any)
|
||||
if _, ok := tcm["index"].(float64); ok {
|
||||
foundToolIndex = true
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
if !foundToolIndex {
|
||||
t.Fatalf("expected stream tool_calls item with index, body=%s", rec.Body.String())
|
||||
}
|
||||
if streamHasRawToolJSONContent(frames) {
|
||||
t.Fatalf("raw tool_calls JSON leaked in content delta: %s", rec.Body.String())
|
||||
}
|
||||
if streamFinishReason(frames) != "tool_calls" {
|
||||
t.Fatalf("expected finish_reason=tool_calls, body=%s", rec.Body.String())
|
||||
}
|
||||
|
||||
hasThinkingDelta := false
|
||||
for _, frame := range frames {
|
||||
choices, _ := frame["choices"].([]any)
|
||||
for _, item := range choices {
|
||||
choice, _ := item.(map[string]any)
|
||||
delta, _ := choice["delta"].(map[string]any)
|
||||
if _, ok := delta["reasoning_content"]; ok {
|
||||
hasThinkingDelta = true
|
||||
}
|
||||
}
|
||||
}
|
||||
if !hasThinkingDelta {
|
||||
t.Fatalf("expected reasoning_content delta in reasoner stream: %s", rec.Body.String())
|
||||
}
|
||||
}
|
||||
|
||||
func TestHandleStreamUnknownToolStillIntercepted(t *testing.T) {
|
||||
h := &Handler{}
|
||||
resp := makeSSEHTTPResponse(
|
||||
`data: {"p":"response/content","v":"{\"tool_calls\":[{\"name\":\"not_in_schema\",\"input\":{\"q\":\"go\"}}]}"}`,
|
||||
`data: [DONE]`,
|
||||
)
|
||||
rec := httptest.NewRecorder()
|
||||
req := httptest.NewRequest(http.MethodPost, "/v1/chat/completions", nil)
|
||||
|
||||
h.handleStream(rec, req, resp, "cid5", "deepseek-chat", "prompt", false, false, []string{"search"})
|
||||
|
||||
frames, done := parseSSEDataFrames(t, rec.Body.String())
|
||||
if !done {
|
||||
t.Fatalf("expected [DONE], body=%s", rec.Body.String())
|
||||
}
|
||||
if !streamHasToolCallsDelta(frames) {
|
||||
t.Fatalf("expected tool_calls delta, body=%s", rec.Body.String())
|
||||
}
|
||||
foundToolIndex := false
|
||||
for _, frame := range frames {
|
||||
choices, _ := frame["choices"].([]any)
|
||||
for _, item := range choices {
|
||||
choice, _ := item.(map[string]any)
|
||||
delta, _ := choice["delta"].(map[string]any)
|
||||
toolCalls, _ := delta["tool_calls"].([]any)
|
||||
for _, tc := range toolCalls {
|
||||
tcm, _ := tc.(map[string]any)
|
||||
if _, ok := tcm["index"].(float64); ok {
|
||||
foundToolIndex = true
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
if !foundToolIndex {
|
||||
t.Fatalf("expected stream tool_calls item with index, body=%s", rec.Body.String())
|
||||
}
|
||||
if streamHasRawToolJSONContent(frames) {
|
||||
t.Fatalf("raw tool_calls JSON leaked in content delta: %s", rec.Body.String())
|
||||
}
|
||||
}
|
||||
|
||||
func TestHandleStreamToolsPlainTextStreamsBeforeFinish(t *testing.T) {
|
||||
h := &Handler{}
|
||||
resp := makeSSEHTTPResponse(
|
||||
`data: {"p":"response/content","v":"你好,"}`,
|
||||
`data: {"p":"response/content","v":"这是普通文本回复。"}`,
|
||||
`data: [DONE]`,
|
||||
)
|
||||
rec := httptest.NewRecorder()
|
||||
req := httptest.NewRequest(http.MethodPost, "/v1/chat/completions", nil)
|
||||
|
||||
h.handleStream(rec, req, resp, "cid6", "deepseek-chat", "prompt", false, false, []string{"search"})
|
||||
|
||||
frames, done := parseSSEDataFrames(t, rec.Body.String())
|
||||
if !done {
|
||||
t.Fatalf("expected [DONE], body=%s", rec.Body.String())
|
||||
}
|
||||
if streamHasToolCallsDelta(frames) {
|
||||
t.Fatalf("did not expect tool_calls delta for plain text: %s", rec.Body.String())
|
||||
}
|
||||
content := strings.Builder{}
|
||||
for _, frame := range frames {
|
||||
choices, _ := frame["choices"].([]any)
|
||||
for _, item := range choices {
|
||||
choice, _ := item.(map[string]any)
|
||||
delta, _ := choice["delta"].(map[string]any)
|
||||
if c, ok := delta["content"].(string); ok {
|
||||
content.WriteString(c)
|
||||
}
|
||||
}
|
||||
}
|
||||
if got := content.String(); got == "" {
|
||||
t.Fatalf("expected streamed content in tool mode plain text, body=%s", rec.Body.String())
|
||||
}
|
||||
if streamFinishReason(frames) != "stop" {
|
||||
t.Fatalf("expected finish_reason=stop, body=%s", rec.Body.String())
|
||||
}
|
||||
}
|
||||
|
||||
func TestHandleStreamToolCallMixedWithPlainTextSegments(t *testing.T) {
|
||||
h := &Handler{}
|
||||
resp := makeSSEHTTPResponse(
|
||||
`data: {"p":"response/content","v":"前置正文A。"}`,
|
||||
`data: {"p":"response/content","v":"{\"tool_calls\":[{\"name\":\"search\",\"input\":{\"q\":\"go\"}}]}"}`,
|
||||
`data: {"p":"response/content","v":"后置正文B。"}`,
|
||||
`data: [DONE]`,
|
||||
)
|
||||
rec := httptest.NewRecorder()
|
||||
req := httptest.NewRequest(http.MethodPost, "/v1/chat/completions", nil)
|
||||
|
||||
h.handleStream(rec, req, resp, "cid7", "deepseek-chat", "prompt", false, false, []string{"search"})
|
||||
|
||||
frames, done := parseSSEDataFrames(t, rec.Body.String())
|
||||
if !done {
|
||||
t.Fatalf("expected [DONE], body=%s", rec.Body.String())
|
||||
}
|
||||
if !streamHasToolCallsDelta(frames) {
|
||||
t.Fatalf("expected tool_calls delta in mixed stream, body=%s", rec.Body.String())
|
||||
}
|
||||
if streamHasRawToolJSONContent(frames) {
|
||||
t.Fatalf("raw tool_calls JSON leaked in mixed stream: %s", rec.Body.String())
|
||||
}
|
||||
content := strings.Builder{}
|
||||
for _, frame := range frames {
|
||||
choices, _ := frame["choices"].([]any)
|
||||
for _, item := range choices {
|
||||
choice, _ := item.(map[string]any)
|
||||
delta, _ := choice["delta"].(map[string]any)
|
||||
if c, ok := delta["content"].(string); ok {
|
||||
content.WriteString(c)
|
||||
}
|
||||
}
|
||||
}
|
||||
got := content.String()
|
||||
if !strings.Contains(got, "前置正文A。") || !strings.Contains(got, "后置正文B。") {
|
||||
t.Fatalf("expected pre/post plain text to pass sieve, got=%q", got)
|
||||
}
|
||||
if streamFinishReason(frames) != "tool_calls" {
|
||||
t.Fatalf("expected finish_reason=tool_calls, body=%s", rec.Body.String())
|
||||
}
|
||||
}
|
||||
|
||||
func TestHandleStreamToolCallKeyAppearsLateStillNoPrefixLeak(t *testing.T) {
|
||||
h := &Handler{}
|
||||
spaces := strings.Repeat(" ", 200)
|
||||
resp := makeSSEHTTPResponse(
|
||||
`data: {"p":"response/content","v":"{`+spaces+`"}`,
|
||||
`data: {"p":"response/content","v":"\"tool_calls\":[{\"name\":\"search\",\"input\":{\"q\":\"go\"}}]}"}`,
|
||||
`data: {"p":"response/content","v":"后置正文C。"}`,
|
||||
`data: [DONE]`,
|
||||
)
|
||||
rec := httptest.NewRecorder()
|
||||
req := httptest.NewRequest(http.MethodPost, "/v1/chat/completions", nil)
|
||||
|
||||
h.handleStream(rec, req, resp, "cid8", "deepseek-chat", "prompt", false, false, []string{"search"})
|
||||
|
||||
frames, done := parseSSEDataFrames(t, rec.Body.String())
|
||||
if !done {
|
||||
t.Fatalf("expected [DONE], body=%s", rec.Body.String())
|
||||
}
|
||||
if !streamHasToolCallsDelta(frames) {
|
||||
t.Fatalf("expected tool_calls delta, body=%s", rec.Body.String())
|
||||
}
|
||||
if streamHasRawToolJSONContent(frames) {
|
||||
t.Fatalf("raw tool_calls JSON leaked in content delta: %s", rec.Body.String())
|
||||
}
|
||||
content := strings.Builder{}
|
||||
for _, frame := range frames {
|
||||
choices, _ := frame["choices"].([]any)
|
||||
for _, item := range choices {
|
||||
choice, _ := item.(map[string]any)
|
||||
delta, _ := choice["delta"].(map[string]any)
|
||||
if c, ok := delta["content"].(string); ok {
|
||||
content.WriteString(c)
|
||||
}
|
||||
}
|
||||
}
|
||||
got := content.String()
|
||||
if strings.Contains(got, "{") {
|
||||
t.Fatalf("unexpected suspicious prefix leak in content: %q", got)
|
||||
}
|
||||
if !strings.Contains(got, "后置正文C。") {
|
||||
t.Fatalf("expected stream to continue after tool json convergence, got=%q", got)
|
||||
}
|
||||
if streamFinishReason(frames) != "tool_calls" {
|
||||
t.Fatalf("expected finish_reason=tool_calls, body=%s", rec.Body.String())
|
||||
}
|
||||
}
|
||||
|
||||
func TestHandleStreamInvalidToolJSONDoesNotLeakRawObject(t *testing.T) {
|
||||
h := &Handler{}
|
||||
resp := makeSSEHTTPResponse(
|
||||
`data: {"p":"response/content","v":"前置正文D。"}`,
|
||||
`data: {"p":"response/content","v":"{'tool_calls':[{'name':'search','input':{'q':'go'}}]}"}`,
|
||||
`data: {"p":"response/content","v":"后置正文E。"}`,
|
||||
`data: [DONE]`,
|
||||
)
|
||||
rec := httptest.NewRecorder()
|
||||
req := httptest.NewRequest(http.MethodPost, "/v1/chat/completions", nil)
|
||||
|
||||
h.handleStream(rec, req, resp, "cid9", "deepseek-chat", "prompt", false, false, []string{"search"})
|
||||
|
||||
frames, done := parseSSEDataFrames(t, rec.Body.String())
|
||||
if !done {
|
||||
t.Fatalf("expected [DONE], body=%s", rec.Body.String())
|
||||
}
|
||||
if streamHasToolCallsDelta(frames) {
|
||||
t.Fatalf("did not expect tool_calls delta for invalid json, body=%s", rec.Body.String())
|
||||
}
|
||||
content := strings.Builder{}
|
||||
for _, frame := range frames {
|
||||
choices, _ := frame["choices"].([]any)
|
||||
for _, item := range choices {
|
||||
choice, _ := item.(map[string]any)
|
||||
delta, _ := choice["delta"].(map[string]any)
|
||||
if c, ok := delta["content"].(string); ok {
|
||||
content.WriteString(c)
|
||||
}
|
||||
}
|
||||
}
|
||||
got := strings.ToLower(content.String())
|
||||
if strings.Contains(got, "tool_calls") {
|
||||
t.Fatalf("unexpected raw tool_calls leak in content: %q", content.String())
|
||||
}
|
||||
if !strings.Contains(content.String(), "前置正文D。") || !strings.Contains(content.String(), "后置正文E。") {
|
||||
t.Fatalf("expected pre/post plain text to remain, got=%q", content.String())
|
||||
}
|
||||
}
|
||||
|
||||
func TestHandleStreamIncompleteCapturedToolJSONDoesNotLeakOnFinalize(t *testing.T) {
|
||||
h := &Handler{}
|
||||
resp := makeSSEHTTPResponse(
|
||||
`data: {"p":"response/content","v":"{\"tool_calls\":[{\"name\":\"search\""}`,
|
||||
`data: [DONE]`,
|
||||
)
|
||||
rec := httptest.NewRecorder()
|
||||
req := httptest.NewRequest(http.MethodPost, "/v1/chat/completions", nil)
|
||||
|
||||
h.handleStream(rec, req, resp, "cid10", "deepseek-chat", "prompt", false, false, []string{"search"})
|
||||
|
||||
frames, done := parseSSEDataFrames(t, rec.Body.String())
|
||||
if !done {
|
||||
t.Fatalf("expected [DONE], body=%s", rec.Body.String())
|
||||
}
|
||||
if streamHasToolCallsDelta(frames) {
|
||||
t.Fatalf("did not expect tool_calls delta for incomplete json, body=%s", rec.Body.String())
|
||||
}
|
||||
content := strings.Builder{}
|
||||
for _, frame := range frames {
|
||||
choices, _ := frame["choices"].([]any)
|
||||
for _, item := range choices {
|
||||
choice, _ := item.(map[string]any)
|
||||
delta, _ := choice["delta"].(map[string]any)
|
||||
if c, ok := delta["content"].(string); ok {
|
||||
content.WriteString(c)
|
||||
}
|
||||
}
|
||||
}
|
||||
if strings.Contains(strings.ToLower(content.String()), "tool_calls") || strings.Contains(content.String(), "{") {
|
||||
t.Fatalf("unexpected incomplete tool json leak in content: %q", content.String())
|
||||
}
|
||||
}
|
||||
223
internal/adapter/openai/tool_sieve.go
Normal file
223
internal/adapter/openai/tool_sieve.go
Normal file
@@ -0,0 +1,223 @@
|
||||
package openai
|
||||
|
||||
import (
|
||||
"strings"
|
||||
|
||||
"ds2api/internal/util"
|
||||
)
|
||||
|
||||
type toolStreamSieveState struct {
|
||||
pending strings.Builder
|
||||
capture strings.Builder
|
||||
capturing bool
|
||||
}
|
||||
|
||||
type toolStreamEvent struct {
|
||||
Content string
|
||||
ToolCalls []util.ParsedToolCall
|
||||
}
|
||||
|
||||
func processToolSieveChunk(state *toolStreamSieveState, chunk string, toolNames []string) []toolStreamEvent {
|
||||
if state == nil {
|
||||
return nil
|
||||
}
|
||||
if chunk != "" {
|
||||
state.pending.WriteString(chunk)
|
||||
}
|
||||
events := make([]toolStreamEvent, 0, 2)
|
||||
|
||||
for {
|
||||
if state.capturing {
|
||||
if state.pending.Len() > 0 {
|
||||
state.capture.WriteString(state.pending.String())
|
||||
state.pending.Reset()
|
||||
}
|
||||
prefix, calls, suffix, ready := consumeToolCapture(state.capture.String(), toolNames)
|
||||
if !ready {
|
||||
break
|
||||
}
|
||||
state.capture.Reset()
|
||||
state.capturing = false
|
||||
if prefix != "" {
|
||||
events = append(events, toolStreamEvent{Content: prefix})
|
||||
}
|
||||
if len(calls) > 0 {
|
||||
events = append(events, toolStreamEvent{ToolCalls: calls})
|
||||
}
|
||||
if suffix != "" {
|
||||
state.pending.WriteString(suffix)
|
||||
}
|
||||
continue
|
||||
}
|
||||
|
||||
pending := state.pending.String()
|
||||
if pending == "" {
|
||||
break
|
||||
}
|
||||
start := findToolSegmentStart(pending)
|
||||
if start >= 0 {
|
||||
prefix := pending[:start]
|
||||
if prefix != "" {
|
||||
events = append(events, toolStreamEvent{Content: prefix})
|
||||
}
|
||||
state.pending.Reset()
|
||||
state.capture.WriteString(pending[start:])
|
||||
state.capturing = true
|
||||
continue
|
||||
}
|
||||
|
||||
safe, hold := splitSafeContentForToolDetection(pending)
|
||||
if safe == "" {
|
||||
break
|
||||
}
|
||||
state.pending.Reset()
|
||||
state.pending.WriteString(hold)
|
||||
events = append(events, toolStreamEvent{Content: safe})
|
||||
}
|
||||
|
||||
return events
|
||||
}
|
||||
|
||||
func flushToolSieve(state *toolStreamSieveState, toolNames []string) []toolStreamEvent {
|
||||
if state == nil {
|
||||
return nil
|
||||
}
|
||||
events := processToolSieveChunk(state, "", toolNames)
|
||||
if state.capturing {
|
||||
consumedPrefix, consumedCalls, consumedSuffix, ready := consumeToolCapture(state.capture.String(), toolNames)
|
||||
if ready {
|
||||
if consumedPrefix != "" {
|
||||
events = append(events, toolStreamEvent{Content: consumedPrefix})
|
||||
}
|
||||
if len(consumedCalls) > 0 {
|
||||
events = append(events, toolStreamEvent{ToolCalls: consumedCalls})
|
||||
}
|
||||
if consumedSuffix != "" {
|
||||
events = append(events, toolStreamEvent{Content: consumedSuffix})
|
||||
}
|
||||
} else {
|
||||
// Incomplete captured tool JSON at stream end: suppress raw capture.
|
||||
}
|
||||
state.capture.Reset()
|
||||
state.capturing = false
|
||||
}
|
||||
if state.pending.Len() > 0 {
|
||||
events = append(events, toolStreamEvent{Content: state.pending.String()})
|
||||
state.pending.Reset()
|
||||
}
|
||||
return events
|
||||
}
|
||||
|
||||
func splitSafeContentForToolDetection(s string) (safe, hold string) {
|
||||
if s == "" {
|
||||
return "", ""
|
||||
}
|
||||
suspiciousStart := findSuspiciousPrefixStart(s)
|
||||
if suspiciousStart < 0 {
|
||||
return s, ""
|
||||
}
|
||||
if suspiciousStart > 0 {
|
||||
return s[:suspiciousStart], s[suspiciousStart:]
|
||||
}
|
||||
// If suspicious content starts at position 0, keep holding until we can
|
||||
// parse a complete tool JSON block or reach stream flush.
|
||||
return "", s
|
||||
}
|
||||
|
||||
func findSuspiciousPrefixStart(s string) int {
|
||||
start := -1
|
||||
indices := []int{
|
||||
strings.LastIndex(s, "{"),
|
||||
strings.LastIndex(s, "["),
|
||||
strings.LastIndex(s, "```"),
|
||||
}
|
||||
for _, idx := range indices {
|
||||
if idx > start {
|
||||
start = idx
|
||||
}
|
||||
}
|
||||
return start
|
||||
}
|
||||
|
||||
func findToolSegmentStart(s string) int {
|
||||
if s == "" {
|
||||
return -1
|
||||
}
|
||||
lower := strings.ToLower(s)
|
||||
keyIdx := strings.Index(lower, "tool_calls")
|
||||
if keyIdx < 0 {
|
||||
return -1
|
||||
}
|
||||
if start := strings.LastIndex(s[:keyIdx], "{"); start >= 0 {
|
||||
return start
|
||||
}
|
||||
return keyIdx
|
||||
}
|
||||
|
||||
func consumeToolCapture(captured string, toolNames []string) (prefix string, calls []util.ParsedToolCall, suffix string, ready bool) {
|
||||
if captured == "" {
|
||||
return "", nil, "", false
|
||||
}
|
||||
lower := strings.ToLower(captured)
|
||||
keyIdx := strings.Index(lower, "tool_calls")
|
||||
if keyIdx < 0 {
|
||||
return "", nil, "", false
|
||||
}
|
||||
start := strings.LastIndex(captured[:keyIdx], "{")
|
||||
if start < 0 {
|
||||
return "", nil, "", false
|
||||
}
|
||||
obj, end, ok := extractJSONObjectFrom(captured, start)
|
||||
if !ok {
|
||||
return "", nil, "", false
|
||||
}
|
||||
parsed := util.ParseToolCalls(obj, toolNames)
|
||||
if len(parsed) == 0 {
|
||||
// `tool_calls` key exists but strict JSON parse failed.
|
||||
// Drop the captured object body to avoid leaking raw tool JSON.
|
||||
return captured[:start], nil, captured[end:], true
|
||||
}
|
||||
return captured[:start], parsed, captured[end:], true
|
||||
}
|
||||
|
||||
// extractJSONObjectFrom scans text from start (which must point at '{') and
// returns the balanced JSON-ish object, the index one past its closing '}',
// and true on success. Brace counting ignores braces inside single- or
// double-quoted strings and honours backslash escapes. Returns
// ("", 0, false) when start is invalid or the object never closes.
func extractJSONObjectFrom(text string, start int) (string, int, bool) {
	if start < 0 || start >= len(text) || text[start] != '{' {
		return "", 0, false
	}
	var (
		depth   int
		quote   byte
		escaped bool
	)
	for i := start; i < len(text); i++ {
		c := text[i]
		switch {
		case quote != 0:
			// Inside a quoted string: track escapes, look for the closer.
			switch {
			case escaped:
				escaped = false
			case c == '\\':
				escaped = true
			case c == quote:
				quote = 0
			}
		case c == '"' || c == '\'':
			quote = c
		case c == '{':
			depth++
		case c == '}':
			depth--
			if depth == 0 {
				return text[start : i+1], i + 1, true
			}
		}
	}
	return "", 0, false
}
|
||||
--- new file: internal/adapter/openai/vercel_prepare_test.go (83 lines) ---
@@ -0,0 +1,83 @@
|
||||
package openai
|
||||
|
||||
import (
|
||||
"ds2api/internal/auth"
|
||||
"net/http/httptest"
|
||||
"testing"
|
||||
"time"
|
||||
)
|
||||
|
||||
func TestIsVercelStreamPrepareRequest(t *testing.T) {
|
||||
req := httptest.NewRequest("POST", "/v1/chat/completions?__stream_prepare=1", nil)
|
||||
if !isVercelStreamPrepareRequest(req) {
|
||||
t.Fatalf("expected prepare request to be detected")
|
||||
}
|
||||
|
||||
req2 := httptest.NewRequest("POST", "/v1/chat/completions", nil)
|
||||
if isVercelStreamPrepareRequest(req2) {
|
||||
t.Fatalf("expected non-prepare request")
|
||||
}
|
||||
}
|
||||
|
||||
func TestIsVercelStreamReleaseRequest(t *testing.T) {
|
||||
req := httptest.NewRequest("POST", "/v1/chat/completions?__stream_release=1", nil)
|
||||
if !isVercelStreamReleaseRequest(req) {
|
||||
t.Fatalf("expected release request to be detected")
|
||||
}
|
||||
|
||||
req2 := httptest.NewRequest("POST", "/v1/chat/completions", nil)
|
||||
if isVercelStreamReleaseRequest(req2) {
|
||||
t.Fatalf("expected non-release request")
|
||||
}
|
||||
}
|
||||
|
||||
func TestVercelInternalSecret(t *testing.T) {
|
||||
t.Run("prefer explicit secret", func(t *testing.T) {
|
||||
t.Setenv("DS2API_VERCEL_INTERNAL_SECRET", "stream-secret")
|
||||
t.Setenv("DS2API_ADMIN_KEY", "admin-fallback")
|
||||
if got := vercelInternalSecret(); got != "stream-secret" {
|
||||
t.Fatalf("expected explicit secret, got %q", got)
|
||||
}
|
||||
})
|
||||
|
||||
t.Run("fallback to admin key", func(t *testing.T) {
|
||||
t.Setenv("DS2API_VERCEL_INTERNAL_SECRET", "")
|
||||
t.Setenv("DS2API_ADMIN_KEY", "admin-fallback")
|
||||
if got := vercelInternalSecret(); got != "admin-fallback" {
|
||||
t.Fatalf("expected admin key fallback, got %q", got)
|
||||
}
|
||||
})
|
||||
|
||||
t.Run("default admin when env missing", func(t *testing.T) {
|
||||
t.Setenv("DS2API_VERCEL_INTERNAL_SECRET", "")
|
||||
t.Setenv("DS2API_ADMIN_KEY", "")
|
||||
if got := vercelInternalSecret(); got != "admin" {
|
||||
t.Fatalf("expected default admin fallback, got %q", got)
|
||||
}
|
||||
})
|
||||
}
|
||||
|
||||
func TestStreamLeaseLifecycle(t *testing.T) {
|
||||
h := &Handler{}
|
||||
leaseID := h.holdStreamLease(&auth.RequestAuth{UseConfigToken: false})
|
||||
if leaseID == "" {
|
||||
t.Fatalf("expected non-empty lease id")
|
||||
}
|
||||
if ok := h.releaseStreamLease(leaseID); !ok {
|
||||
t.Fatalf("expected lease release success")
|
||||
}
|
||||
if ok := h.releaseStreamLease(leaseID); ok {
|
||||
t.Fatalf("expected duplicate release to fail")
|
||||
}
|
||||
}
|
||||
|
||||
func TestStreamLeaseTTL(t *testing.T) {
|
||||
t.Setenv("DS2API_VERCEL_STREAM_LEASE_TTL_SECONDS", "120")
|
||||
if got := streamLeaseTTL(); got != 120*time.Second {
|
||||
t.Fatalf("expected ttl=120s, got %v", got)
|
||||
}
|
||||
t.Setenv("DS2API_VERCEL_STREAM_LEASE_TTL_SECONDS", "invalid")
|
||||
if got := streamLeaseTTL(); got != 15*time.Minute {
|
||||
t.Fatalf("expected default ttl on invalid value, got %v", got)
|
||||
}
|
||||
}
|
||||
--- new file: internal/adapter/openai/vercel_stream.go (275 lines) ---
@@ -0,0 +1,275 @@
|
||||
package openai
|
||||
|
||||
import (
|
||||
"crypto/rand"
|
||||
"crypto/subtle"
|
||||
"encoding/hex"
|
||||
"encoding/json"
|
||||
"fmt"
|
||||
"net/http"
|
||||
"os"
|
||||
"strconv"
|
||||
"strings"
|
||||
"time"
|
||||
|
||||
"ds2api/internal/auth"
|
||||
"ds2api/internal/config"
|
||||
"ds2api/internal/util"
|
||||
)
|
||||
|
||||
// handleVercelStreamPrepare is an internal endpoint for the Vercel split
// streaming flow: it authenticates, creates the upstream DeepSeek session,
// fetches a PoW header, and hands everything the edge runtime needs to do
// the actual streaming back as JSON, holding the auth slot behind a lease
// until the companion release endpoint (or TTL expiry) frees it.
// Guarded by the shared X-Ds2-Internal-Token secret.
func (h *Handler) handleVercelStreamPrepare(w http.ResponseWriter, r *http.Request) {
	if !config.IsVercel() {
		http.NotFound(w, r)
		return
	}
	// Opportunistically reclaim leases whose TTL elapsed.
	h.sweepExpiredStreamLeases()
	internalSecret := vercelInternalSecret()
	internalToken := strings.TrimSpace(r.Header.Get("X-Ds2-Internal-Token"))
	// Constant-time compare avoids timing attacks on the shared secret.
	if internalSecret == "" || subtle.ConstantTimeCompare([]byte(internalToken), []byte(internalSecret)) != 1 {
		writeOpenAIError(w, http.StatusUnauthorized, "unauthorized internal request")
		return
	}

	a, err := h.Auth.Determine(r)
	if err != nil {
		status := http.StatusUnauthorized
		if err == auth.ErrNoAccount {
			status = http.StatusTooManyRequests
		}
		writeOpenAIError(w, status, err.Error())
		return
	}
	// The auth slot is released on every early-exit path; only a successful
	// lease handoff (leased = true) keeps it reserved past this handler.
	leased := false
	defer func() {
		if !leased {
			h.Auth.Release(a)
		}
	}()
	r = r.WithContext(auth.WithAuth(r.Context(), a))

	var req map[string]any
	if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
		writeOpenAIError(w, http.StatusBadRequest, "invalid json")
		return
	}
	if !util.ToBool(req["stream"]) {
		writeOpenAIError(w, http.StatusBadRequest, "stream must be true")
		return
	}
	model, _ := req["model"].(string)
	messagesRaw, _ := req["messages"].([]any)
	if model == "" || len(messagesRaw) == 0 {
		writeOpenAIError(w, http.StatusBadRequest, "Request must include 'model' and 'messages'.")
		return
	}
	thinkingEnabled, searchEnabled, ok := config.GetModelConfig(model)
	if !ok {
		writeOpenAIError(w, http.StatusServiceUnavailable, fmt.Sprintf("Model '%s' is not available.", model))
		return
	}

	messages := normalizeMessages(messagesRaw)
	// Tool definitions are folded into the prompt rather than sent upstream.
	if tools, ok := req["tools"].([]any); ok && len(tools) > 0 {
		messages, _ = injectToolPrompt(messages, tools)
	}
	finalPrompt := util.MessagesPrepare(messages)

	sessionID, err := h.DS.CreateSession(r.Context(), a, 3)
	if err != nil {
		if a.UseConfigToken {
			writeOpenAIError(w, http.StatusUnauthorized, "Account token is invalid. Please re-login the account in admin.")
		} else {
			writeOpenAIError(w, http.StatusUnauthorized, "Invalid token. If this should be a DS2API key, add it to config.keys first.")
		}
		return
	}
	powHeader, err := h.DS.GetPow(r.Context(), a, 3)
	if err != nil {
		writeOpenAIError(w, http.StatusUnauthorized, "Failed to get PoW (invalid token or unknown error).")
		return
	}
	if strings.TrimSpace(a.DeepSeekToken) == "" {
		writeOpenAIError(w, http.StatusUnauthorized, "Invalid token. If this should be a DS2API key, add it to config.keys first.")
		return
	}

	// Ready-to-send upstream completion payload for the edge runtime.
	payload := map[string]any{
		"chat_session_id":   sessionID,
		"parent_message_id": nil,
		"prompt":            finalPrompt,
		"ref_file_ids":      []any{},
		"thinking_enabled":  thinkingEnabled,
		"search_enabled":    searchEnabled,
	}
	leaseID := h.holdStreamLease(a)
	if leaseID == "" {
		writeOpenAIError(w, http.StatusInternalServerError, "failed to create stream lease")
		return
	}
	leased = true
	writeJSON(w, http.StatusOK, map[string]any{
		"session_id":       sessionID,
		"lease_id":         leaseID,
		"model":            model,
		"final_prompt":     finalPrompt,
		"thinking_enabled": thinkingEnabled,
		"search_enabled":   searchEnabled,
		"deepseek_token":   a.DeepSeekToken,
		"pow_header":       powHeader,
		"payload":          payload,
	})
}
|
||||
|
||||
// handleVercelStreamRelease is the internal endpoint the edge runtime calls
// after a streamed response finishes; it frees the auth slot held by the
// lease created in handleVercelStreamPrepare. Guarded by the shared
// X-Ds2-Internal-Token secret.
func (h *Handler) handleVercelStreamRelease(w http.ResponseWriter, r *http.Request) {
	if !config.IsVercel() {
		http.NotFound(w, r)
		return
	}
	// Opportunistically reclaim leases whose TTL elapsed.
	h.sweepExpiredStreamLeases()
	internalSecret := vercelInternalSecret()
	internalToken := strings.TrimSpace(r.Header.Get("X-Ds2-Internal-Token"))
	// Constant-time compare avoids timing attacks on the shared secret.
	if internalSecret == "" || subtle.ConstantTimeCompare([]byte(internalToken), []byte(internalSecret)) != 1 {
		writeOpenAIError(w, http.StatusUnauthorized, "unauthorized internal request")
		return
	}

	var req map[string]any
	if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
		writeOpenAIError(w, http.StatusBadRequest, "invalid json")
		return
	}
	leaseID, _ := req["lease_id"].(string)
	leaseID = strings.TrimSpace(leaseID)
	if leaseID == "" {
		writeOpenAIError(w, http.StatusBadRequest, "lease_id is required")
		return
	}
	if !h.releaseStreamLease(leaseID) {
		writeOpenAIError(w, http.StatusNotFound, "stream lease not found")
		return
	}
	writeJSON(w, http.StatusOK, map[string]any{"success": true})
}
|
||||
|
||||
func isVercelStreamPrepareRequest(r *http.Request) bool {
|
||||
if r == nil {
|
||||
return false
|
||||
}
|
||||
return strings.TrimSpace(r.URL.Query().Get("__stream_prepare")) == "1"
|
||||
}
|
||||
|
||||
func isVercelStreamReleaseRequest(r *http.Request) bool {
|
||||
if r == nil {
|
||||
return false
|
||||
}
|
||||
return strings.TrimSpace(r.URL.Query().Get("__stream_release")) == "1"
|
||||
}
|
||||
|
||||
func vercelInternalSecret() string {
|
||||
if v := strings.TrimSpace(os.Getenv("DS2API_VERCEL_INTERNAL_SECRET")); v != "" {
|
||||
return v
|
||||
}
|
||||
if v := strings.TrimSpace(os.Getenv("DS2API_ADMIN_KEY")); v != "" {
|
||||
return v
|
||||
}
|
||||
return "admin"
|
||||
}
|
||||
|
||||
// holdStreamLease registers a time-limited lease that keeps the auth slot a
// reserved while an external (Vercel edge) process streams the response.
// Returns the new lease id, or "" when a is nil. Expired leases are reaped
// under the lock and their auth slots released after it is dropped.
func (h *Handler) holdStreamLease(a *auth.RequestAuth) string {
	if a == nil {
		return ""
	}
	now := time.Now()
	ttl := streamLeaseTTL()
	if ttl <= 0 {
		// Defensive: never store a lease that is already expired.
		ttl = 15 * time.Minute
	}

	h.leaseMu.Lock()
	expired := h.popExpiredLeasesLocked(now)
	if h.streamLeases == nil {
		// Lazily initialised so a zero-value Handler is usable.
		h.streamLeases = make(map[string]streamLease)
	}
	leaseID := newLeaseID()
	h.streamLeases[leaseID] = streamLease{
		Auth:      a,
		ExpiresAt: now.Add(ttl),
	}
	h.leaseMu.Unlock()
	// Pool interaction happens outside the mutex to keep the critical
	// section small and avoid lock-order issues with the auth pool.
	h.releaseExpiredAuths(expired)
	return leaseID
}
|
||||
|
||||
// releaseStreamLease removes the lease with the given id and returns its
// auth slot to the pool. Reports false for unknown (or already released /
// expired) lease ids, letting callers detect duplicate releases.
func (h *Handler) releaseStreamLease(leaseID string) bool {
	leaseID = strings.TrimSpace(leaseID)
	if leaseID == "" {
		return false
	}

	h.leaseMu.Lock()
	expired := h.popExpiredLeasesLocked(time.Now())
	lease, ok := h.streamLeases[leaseID]
	if ok {
		delete(h.streamLeases, leaseID)
	}
	h.leaseMu.Unlock()
	// Auth pool interaction happens outside the mutex.
	h.releaseExpiredAuths(expired)

	if !ok {
		return false
	}
	if h.Auth != nil {
		h.Auth.Release(lease.Auth)
	}
	return true
}
|
||||
|
||||
// popExpiredLeasesLocked removes every lease whose deadline has passed and
// returns the auth slots they held so the caller can release them after
// dropping the lock. Caller must hold h.leaseMu. (Deleting map entries
// while ranging over the map is well-defined in Go.)
func (h *Handler) popExpiredLeasesLocked(now time.Time) []*auth.RequestAuth {
	if len(h.streamLeases) == 0 {
		return nil
	}
	expired := make([]*auth.RequestAuth, 0)
	for leaseID, lease := range h.streamLeases {
		if now.After(lease.ExpiresAt) {
			delete(h.streamLeases, leaseID)
			expired = append(expired, lease.Auth)
		}
	}
	return expired
}
|
||||
|
||||
func (h *Handler) releaseExpiredAuths(expired []*auth.RequestAuth) {
|
||||
if h.Auth == nil || len(expired) == 0 {
|
||||
return
|
||||
}
|
||||
for _, a := range expired {
|
||||
h.Auth.Release(a)
|
||||
}
|
||||
}
|
||||
|
||||
func (h *Handler) sweepExpiredStreamLeases() {
|
||||
h.leaseMu.Lock()
|
||||
expired := h.popExpiredLeasesLocked(time.Now())
|
||||
h.leaseMu.Unlock()
|
||||
h.releaseExpiredAuths(expired)
|
||||
}
|
||||
|
||||
func streamLeaseTTL() time.Duration {
|
||||
raw := strings.TrimSpace(os.Getenv("DS2API_VERCEL_STREAM_LEASE_TTL_SECONDS"))
|
||||
if raw == "" {
|
||||
return 15 * time.Minute
|
||||
}
|
||||
seconds, err := strconv.Atoi(raw)
|
||||
if err != nil || seconds <= 0 {
|
||||
return 15 * time.Minute
|
||||
}
|
||||
return time.Duration(seconds) * time.Second
|
||||
}
|
||||
|
||||
func newLeaseID() string {
|
||||
buf := make([]byte, 16)
|
||||
if _, err := rand.Read(buf); err == nil {
|
||||
return hex.EncodeToString(buf)
|
||||
}
|
||||
return fmt.Sprintf("lease-%d", time.Now().UnixNano())
|
||||
}
|
||||
--- new file: internal/admin/handler.go (39 lines) ---
@@ -0,0 +1,39 @@
|
||||
package admin
|
||||
|
||||
import (
|
||||
"github.com/go-chi/chi/v5"
|
||||
|
||||
"ds2api/internal/account"
|
||||
"ds2api/internal/config"
|
||||
"ds2api/internal/deepseek"
|
||||
)
|
||||
|
||||
// Handler bundles the dependencies required by the admin HTTP endpoints.
type Handler struct {
	Store *config.Store    // persisted keys/accounts configuration
	Pool  *account.Pool    // account rotation pool; reset after config edits
	DS    *deepseek.Client // upstream DeepSeek API client
}
|
||||
|
||||
// RegisterRoutes mounts the admin API on r: /login and /verify are public,
// everything else requires a valid admin JWT (enforced by requireAdmin).
func RegisterRoutes(r chi.Router, h *Handler) {
	// Public endpoints.
	r.Post("/login", h.login)
	r.Get("/verify", h.verify)
	// Admin-authenticated endpoints.
	r.Group(func(pr chi.Router) {
		pr.Use(h.requireAdmin)
		pr.Get("/vercel/config", h.getVercelConfig)
		pr.Get("/config", h.getConfig)
		pr.Post("/config", h.updateConfig)
		pr.Post("/keys", h.addKey)
		pr.Delete("/keys/{key}", h.deleteKey)
		pr.Get("/accounts", h.listAccounts)
		pr.Post("/accounts", h.addAccount)
		pr.Delete("/accounts/{identifier}", h.deleteAccount)
		pr.Get("/queue/status", h.queueStatus)
		pr.Post("/accounts/test", h.testSingleAccount)
		pr.Post("/accounts/test-all", h.testAllAccounts)
		pr.Post("/import", h.batchImport)
		pr.Post("/test", h.testAPI)
		pr.Post("/vercel/sync", h.syncVercel)
		pr.Get("/vercel/status", h.vercelStatus)
		pr.Get("/export", h.exportConfig)
	})
}
|
||||
--- new file: internal/admin/handler_accounts.go (305 lines) ---
@@ -0,0 +1,305 @@
|
||||
package admin
|
||||
|
||||
import (
|
||||
"bytes"
|
||||
"context"
|
||||
"encoding/json"
|
||||
"fmt"
|
||||
"io"
|
||||
"net/http"
|
||||
"strings"
|
||||
"sync"
|
||||
"time"
|
||||
|
||||
"github.com/go-chi/chi/v5"
|
||||
|
||||
authn "ds2api/internal/auth"
|
||||
"ds2api/internal/config"
|
||||
"ds2api/internal/sse"
|
||||
)
|
||||
|
||||
// listAccounts returns a paginated, newest-first view of configured
// accounts with sensitive fields reduced to presence flags and a short
// token preview. Query params: page (>=1) and page_size (clamped to 1..100).
func (h *Handler) listAccounts(w http.ResponseWriter, r *http.Request) {
	page := intFromQuery(r, "page", 1)
	pageSize := intFromQuery(r, "page_size", 10)
	if page < 1 {
		page = 1
	}
	if pageSize < 1 {
		pageSize = 1
	}
	if pageSize > 100 {
		pageSize = 100
	}
	accounts := h.Store.Snapshot().Accounts
	total := len(accounts)
	// Newest entries first. NOTE(review): this mutates the slice returned by
	// Snapshot(); assumes Snapshot returns a copy — confirm in config.Store.
	reverseAccounts(accounts)
	totalPages := 1
	if total > 0 {
		totalPages = (total + pageSize - 1) / pageSize
	}
	// Clamp the page window to the slice bounds.
	start := (page - 1) * pageSize
	if start > total {
		start = total
	}
	end := start + pageSize
	if end > total {
		end = total
	}
	items := make([]map[string]any, 0, end-start)
	for _, acc := range accounts[start:end] {
		token := strings.TrimSpace(acc.Token)
		preview := ""
		if token != "" {
			// Only the first 20 characters of a token are ever exposed.
			if len(token) > 20 {
				preview = token[:20] + "..."
			} else {
				preview = token
			}
		}
		items = append(items, map[string]any{"email": acc.Email, "mobile": acc.Mobile, "has_password": acc.Password != "", "has_token": token != "", "token_preview": preview})
	}
	writeJSON(w, http.StatusOK, map[string]any{"items": items, "total": total, "page": page, "page_size": pageSize, "total_pages": totalPages})
}
|
||||
|
||||
// addAccount appends a new account (identified by email or mobile) to the
// config, rejecting duplicates, then resets the account pool so the change
// takes effect immediately.
func (h *Handler) addAccount(w http.ResponseWriter, r *http.Request) {
	var req map[string]any
	// Decode errors are tolerated; the identifier check below rejects
	// unusable payloads.
	_ = json.NewDecoder(r.Body).Decode(&req)
	acc := toAccount(req)
	if acc.Identifier() == "" {
		writeJSON(w, http.StatusBadRequest, map[string]any{"detail": "需要 email 或 mobile"})
		return
	}
	err := h.Store.Update(func(c *config.Config) error {
		for _, a := range c.Accounts {
			if acc.Email != "" && a.Email == acc.Email {
				return fmt.Errorf("邮箱已存在")
			}
			if acc.Mobile != "" && a.Mobile == acc.Mobile {
				return fmt.Errorf("手机号已存在")
			}
		}
		c.Accounts = append(c.Accounts, acc)
		return nil
	})
	if err != nil {
		writeJSON(w, http.StatusBadRequest, map[string]any{"detail": err.Error()})
		return
	}
	// Invalidate the rotation pool so the new account becomes eligible.
	h.Pool.Reset()
	writeJSON(w, http.StatusOK, map[string]any{"success": true, "total_accounts": len(h.Store.Snapshot().Accounts)})
}
|
||||
|
||||
// deleteAccount removes the first account whose email or mobile matches the
// {identifier} URL parameter, then resets the account pool.
func (h *Handler) deleteAccount(w http.ResponseWriter, r *http.Request) {
	identifier := chi.URLParam(r, "identifier")
	err := h.Store.Update(func(c *config.Config) error {
		idx := -1
		for i, a := range c.Accounts {
			if a.Email == identifier || a.Mobile == identifier {
				idx = i
				break
			}
		}
		if idx < 0 {
			return fmt.Errorf("账号不存在")
		}
		c.Accounts = append(c.Accounts[:idx], c.Accounts[idx+1:]...)
		return nil
	})
	if err != nil {
		writeJSON(w, http.StatusNotFound, map[string]any{"detail": err.Error()})
		return
	}
	h.Pool.Reset()
	writeJSON(w, http.StatusOK, map[string]any{"success": true, "total_accounts": len(h.Store.Snapshot().Accounts)})
}
|
||||
|
||||
// queueStatus reports the account pool's current queue/usage status.
func (h *Handler) queueStatus(w http.ResponseWriter, _ *http.Request) {
	writeJSON(w, http.StatusOK, h.Pool.Status())
}
|
||||
|
||||
// testSingleAccount runs a connectivity test for one account identified by
// the "identifier" field of the JSON body; "model" defaults to
// deepseek-chat and an empty "message" restricts the test to session
// creation only (see testAccount).
func (h *Handler) testSingleAccount(w http.ResponseWriter, r *http.Request) {
	var req map[string]any
	_ = json.NewDecoder(r.Body).Decode(&req)
	identifier, _ := req["identifier"].(string)
	if strings.TrimSpace(identifier) == "" {
		writeJSON(w, http.StatusBadRequest, map[string]any{"detail": "需要账号标识(email 或 mobile)"})
		return
	}
	acc, ok := h.Store.FindAccount(identifier)
	if !ok {
		writeJSON(w, http.StatusNotFound, map[string]any{"detail": "账号不存在"})
		return
	}
	model, _ := req["model"].(string)
	if model == "" {
		model = "deepseek-chat"
	}
	message, _ := req["message"].(string)
	result := h.testAccount(r.Context(), acc, model, message)
	writeJSON(w, http.StatusOK, result)
}
|
||||
|
||||
// testAllAccounts runs a session-creation test against every configured
// account concurrently (bounded parallelism) and returns per-account
// results plus success/failure totals.
func (h *Handler) testAllAccounts(w http.ResponseWriter, r *http.Request) {
	var req map[string]any
	_ = json.NewDecoder(r.Body).Decode(&req)
	model, _ := req["model"].(string)
	if model == "" {
		model = "deepseek-chat"
	}
	accounts := h.Store.Snapshot().Accounts
	if len(accounts) == 0 {
		writeJSON(w, http.StatusOK, map[string]any{"total": 0, "success": 0, "failed": 0, "results": []any{}})
		return
	}

	// Concurrent testing with a semaphore to limit parallelism.
	const maxConcurrency = 5
	// Empty message => session-creation-only test per account.
	results := runAccountTestsConcurrently(accounts, maxConcurrency, func(_ int, account config.Account) map[string]any {
		return h.testAccount(r.Context(), account, model, "")
	})

	success := 0
	for _, res := range results {
		if ok, _ := res["success"].(bool); ok {
			success++
		}
	}
	writeJSON(w, http.StatusOK, map[string]any{"total": len(accounts), "success": success, "failed": len(accounts) - success, "results": results})
}
|
||||
|
||||
// runAccountTestsConcurrently runs testFn for every account with at most
// maxConcurrency calls in flight, preserving input order in the returned
// slice. All goroutines are spawned up front; the buffered-channel
// semaphore bounds how many execute testFn at once.
func runAccountTestsConcurrently(accounts []config.Account, maxConcurrency int, testFn func(int, config.Account) map[string]any) []map[string]any {
	if maxConcurrency <= 0 {
		maxConcurrency = 1
	}
	sem := make(chan struct{}, maxConcurrency)
	results := make([]map[string]any, len(accounts))
	var wg sync.WaitGroup
	for i, acc := range accounts {
		wg.Add(1)
		go func(idx int, account config.Account) {
			defer wg.Done()
			sem <- struct{}{}        // acquire
			defer func() { <-sem }() // release
			// Each goroutine writes a distinct index: no extra locking needed.
			results[idx] = testFn(idx, account)
		}(i, acc)
	}
	wg.Wait()
	return results
}
|
||||
|
||||
func (h *Handler) testAccount(ctx context.Context, acc config.Account, model, message string) map[string]any {
|
||||
start := time.Now()
|
||||
result := map[string]any{"account": acc.Identifier(), "success": false, "response_time": 0, "message": "", "model": model}
|
||||
token := strings.TrimSpace(acc.Token)
|
||||
if token == "" {
|
||||
newToken, err := h.DS.Login(ctx, acc)
|
||||
if err != nil {
|
||||
result["message"] = "登录失败: " + err.Error()
|
||||
return result
|
||||
}
|
||||
token = newToken
|
||||
_ = h.Store.UpdateAccountToken(acc.Identifier(), token)
|
||||
}
|
||||
authCtx := &authn.RequestAuth{UseConfigToken: false, DeepSeekToken: token}
|
||||
sessionID, err := h.DS.CreateSession(ctx, authCtx, 1)
|
||||
if err != nil {
|
||||
newToken, loginErr := h.DS.Login(ctx, acc)
|
||||
if loginErr != nil {
|
||||
result["message"] = "创建会话失败: " + err.Error()
|
||||
return result
|
||||
}
|
||||
token = newToken
|
||||
authCtx.DeepSeekToken = token
|
||||
_ = h.Store.UpdateAccountToken(acc.Identifier(), token)
|
||||
sessionID, err = h.DS.CreateSession(ctx, authCtx, 1)
|
||||
if err != nil {
|
||||
result["message"] = "创建会话失败: " + err.Error()
|
||||
return result
|
||||
}
|
||||
}
|
||||
if strings.TrimSpace(message) == "" {
|
||||
result["success"] = true
|
||||
result["message"] = "API 测试成功(仅会话创建)"
|
||||
result["response_time"] = int(time.Since(start).Milliseconds())
|
||||
return result
|
||||
}
|
||||
thinking, search, ok := config.GetModelConfig(model)
|
||||
if !ok {
|
||||
thinking, search = false, false
|
||||
}
|
||||
_ = search
|
||||
pow, err := h.DS.GetPow(ctx, authCtx, 1)
|
||||
if err != nil {
|
||||
result["message"] = "获取 PoW 失败: " + err.Error()
|
||||
return result
|
||||
}
|
||||
payload := map[string]any{"chat_session_id": sessionID, "prompt": "<|User|>" + message, "ref_file_ids": []any{}, "thinking_enabled": thinking, "search_enabled": search}
|
||||
resp, err := h.DS.CallCompletion(ctx, authCtx, payload, pow, 1)
|
||||
if err != nil {
|
||||
result["message"] = "请求失败: " + err.Error()
|
||||
return result
|
||||
}
|
||||
if resp.StatusCode != http.StatusOK {
|
||||
defer resp.Body.Close()
|
||||
result["message"] = fmt.Sprintf("请求失败: HTTP %d", resp.StatusCode)
|
||||
return result
|
||||
}
|
||||
collected := sse.CollectStream(resp, thinking, true)
|
||||
result["success"] = true
|
||||
result["response_time"] = int(time.Since(start).Milliseconds())
|
||||
if collected.Text != "" {
|
||||
result["message"] = collected.Text
|
||||
} else {
|
||||
result["message"] = "(无回复内容)"
|
||||
}
|
||||
if collected.Thinking != "" {
|
||||
result["thinking"] = collected.Thinking
|
||||
}
|
||||
return result
|
||||
}
|
||||
|
||||
// testAPI exercises the public /v1/chat/completions endpoint by calling the
// server back over HTTP using a configured (or caller-supplied) API key,
// and relays the upstream status and body to the admin UI.
func (h *Handler) testAPI(w http.ResponseWriter, r *http.Request) {
	var req map[string]any
	_ = json.NewDecoder(r.Body).Decode(&req)
	model, _ := req["model"].(string)
	message, _ := req["message"].(string)
	apiKey, _ := req["api_key"].(string)
	if model == "" {
		model = "deepseek-chat"
	}
	if message == "" {
		message = "你好"
	}
	if apiKey == "" {
		// Fall back to the first configured key.
		keys := h.Store.Snapshot().Keys
		if len(keys) == 0 {
			writeJSON(w, http.StatusBadRequest, map[string]any{"detail": "没有可用的 API Key"})
			return
		}
		apiKey = keys[0]
	}
	host := r.Host
	// Heuristic scheme detection: Vercel hosts and forwarded-HTTPS proxies
	// get https, everything else http.
	scheme := "http"
	if strings.Contains(strings.ToLower(host), "vercel") || strings.Contains(strings.ToLower(r.Header.Get("X-Forwarded-Proto")), "https") {
		scheme = "https"
	}
	payload := map[string]any{"model": model, "messages": []map[string]any{{"role": "user", "content": message}}, "stream": false}
	b, _ := json.Marshal(payload)
	request, _ := http.NewRequestWithContext(r.Context(), http.MethodPost, fmt.Sprintf("%s://%s/v1/chat/completions", scheme, host), bytes.NewReader(b))
	request.Header.Set("Authorization", "Bearer "+apiKey)
	request.Header.Set("Content-Type", "application/json")
	resp, err := (&http.Client{Timeout: 60 * time.Second}).Do(request)
	if err != nil {
		// Transport failure is still reported as HTTP 200 with success=false.
		writeJSON(w, http.StatusOK, map[string]any{"success": false, "error": err.Error()})
		return
	}
	defer resp.Body.Close()
	body, _ := io.ReadAll(resp.Body)
	if resp.StatusCode == http.StatusOK {
		var parsed any
		_ = json.Unmarshal(body, &parsed)
		writeJSON(w, http.StatusOK, map[string]any{"success": true, "status_code": resp.StatusCode, "response": parsed})
		return
	}
	writeJSON(w, http.StatusOK, map[string]any{"success": false, "status_code": resp.StatusCode, "response": string(body)})
}
|
||||
--- new file: internal/admin/handler_auth.go (69 lines) ---
@@ -0,0 +1,69 @@
|
||||
package admin
|
||||
|
||||
import (
|
||||
"encoding/json"
|
||||
"net/http"
|
||||
"os"
|
||||
"strings"
|
||||
"time"
|
||||
|
||||
authn "ds2api/internal/auth"
|
||||
)
|
||||
|
||||
// requireAdmin is middleware that rejects requests lacking a valid admin
// credential (as judged by authn.VerifyAdminRequest) with HTTP 401.
func (h *Handler) requireAdmin(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if err := authn.VerifyAdminRequest(r); err != nil {
			writeJSON(w, http.StatusUnauthorized, map[string]any{"detail": err.Error()})
			return
		}
		next.ServeHTTP(w, r)
	})
}
|
||||
|
||||
// login exchanges the admin key for a JWT. Body fields: admin_key,
// expire_hours (defaults to 24 when missing or non-positive). Responds with
// the token and its lifetime in seconds.
func (h *Handler) login(w http.ResponseWriter, r *http.Request) {
	var req map[string]any
	_ = json.NewDecoder(r.Body).Decode(&req)
	adminKey, _ := req["admin_key"].(string)
	expireHours := intFrom(req["expire_hours"])
	if expireHours <= 0 {
		expireHours = 24
	}
	if adminKey != authn.AdminKey() {
		writeJSON(w, http.StatusUnauthorized, map[string]any{"detail": "Invalid admin key"})
		return
	}
	token, err := authn.CreateJWT(expireHours)
	if err != nil {
		writeJSON(w, http.StatusInternalServerError, map[string]any{"detail": err.Error()})
		return
	}
	writeJSON(w, http.StatusOK, map[string]any{"success": true, "token": token, "expires_in": expireHours * 3600})
}
|
||||
|
||||
// verify validates the Bearer JWT in the Authorization header and reports
// its expiry time and remaining lifetime in seconds (clamped at 0).
func (h *Handler) verify(w http.ResponseWriter, r *http.Request) {
	header := strings.TrimSpace(r.Header.Get("Authorization"))
	// Case-insensitive "Bearer " prefix check; the token starts at offset 7.
	if !strings.HasPrefix(strings.ToLower(header), "bearer ") {
		writeJSON(w, http.StatusUnauthorized, map[string]any{"detail": "No credentials provided"})
		return
	}
	token := strings.TrimSpace(header[7:])
	payload, err := authn.VerifyJWT(token)
	if err != nil {
		writeJSON(w, http.StatusUnauthorized, map[string]any{"detail": err.Error()})
		return
	}
	// JSON numbers decode as float64; exp is a Unix timestamp.
	exp, _ := payload["exp"].(float64)
	remaining := int64(exp) - time.Now().Unix()
	if remaining < 0 {
		remaining = 0
	}
	writeJSON(w, http.StatusOK, map[string]any{"valid": true, "expires_at": int64(exp), "remaining_seconds": remaining})
}
|
||||
|
||||
// getVercelConfig exposes whether a Vercel deploy token is configured plus
// the project/team identifiers, without revealing the token itself.
func (h *Handler) getVercelConfig(w http.ResponseWriter, _ *http.Request) {
	writeJSON(w, http.StatusOK, map[string]any{
		"has_token":  strings.TrimSpace(os.Getenv("VERCEL_TOKEN")) != "",
		"project_id": strings.TrimSpace(os.Getenv("VERCEL_PROJECT_ID")),
		"team_id":    nilIfEmpty(strings.TrimSpace(os.Getenv("VERCEL_TEAM_ID"))),
	})
}
|
||||
--- new file: internal/admin/handler_config.go (240 lines) ---
@@ -0,0 +1,240 @@
|
||||
package admin
|
||||
|
||||
import (
|
||||
"crypto/md5"
|
||||
"encoding/json"
|
||||
"fmt"
|
||||
"net/http"
|
||||
"sort"
|
||||
"strings"
|
||||
|
||||
"github.com/go-chi/chi/v5"
|
||||
|
||||
"ds2api/internal/config"
|
||||
)
|
||||
|
||||
// getConfig returns a sanitized view of the current configuration: API keys,
// accounts with secrets reduced to presence flags and a 20-char token
// preview, and the Claude model mapping (preferring the newer ClaudeMapping
// field over the legacy ClaudeModelMap).
func (h *Handler) getConfig(w http.ResponseWriter, _ *http.Request) {
	snap := h.Store.Snapshot()
	safe := map[string]any{
		"keys":     snap.Keys,
		"accounts": []map[string]any{},
		"claude_mapping": func() map[string]string {
			// Prefer the new field; fall back to the legacy one.
			if len(snap.ClaudeMapping) > 0 {
				return snap.ClaudeMapping
			}
			return snap.ClaudeModelMap
		}(),
	}
	accounts := make([]map[string]any, 0, len(snap.Accounts))
	for _, acc := range snap.Accounts {
		token := strings.TrimSpace(acc.Token)
		preview := ""
		if token != "" {
			if len(token) > 20 {
				preview = token[:20] + "..."
			} else {
				preview = token
			}
		}
		accounts = append(accounts, map[string]any{
			"email":         acc.Email,
			"mobile":        acc.Mobile,
			"has_password":  strings.TrimSpace(acc.Password) != "",
			"has_token":     token != "",
			"token_preview": preview,
		})
	}
	safe["accounts"] = accounts
	writeJSON(w, http.StatusOK, safe)
}
|
||||
|
||||
// updateConfig replaces keys, accounts and/or the Claude model mapping from
// the request body. Incoming accounts with blank password/token inherit the
// stored secret for the same identifier, so the sanitized view produced by
// getConfig can be round-tripped without wiping credentials. Resets the
// account pool afterwards.
func (h *Handler) updateConfig(w http.ResponseWriter, r *http.Request) {
	var req map[string]any
	if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
		writeJSON(w, http.StatusBadRequest, map[string]any{"detail": "invalid json"})
		return
	}
	// Snapshot taken before the update: source of the preserved secrets.
	old := h.Store.Snapshot()
	err := h.Store.Update(func(c *config.Config) error {
		if keys, ok := toStringSlice(req["keys"]); ok {
			c.Keys = keys
		}
		if accountsRaw, ok := req["accounts"].([]any); ok {
			existing := map[string]config.Account{}
			for _, a := range old.Accounts {
				existing[a.Identifier()] = a
			}
			accounts := make([]config.Account, 0, len(accountsRaw))
			for _, item := range accountsRaw {
				m, ok := item.(map[string]any)
				if !ok {
					continue
				}
				acc := toAccount(m)
				id := acc.Identifier()
				if prev, ok := existing[id]; ok {
					// Blank secret fields mean "keep the stored value".
					if strings.TrimSpace(acc.Password) == "" {
						acc.Password = prev.Password
					}
					if strings.TrimSpace(acc.Token) == "" {
						acc.Token = prev.Token
					}
				}
				accounts = append(accounts, acc)
			}
			c.Accounts = accounts
		}
		if m, ok := req["claude_mapping"].(map[string]any); ok {
			newMap := map[string]string{}
			for k, v := range m {
				newMap[k] = fmt.Sprintf("%v", v)
			}
			c.ClaudeMapping = newMap
		}
		return nil
	})
	if err != nil {
		writeJSON(w, http.StatusInternalServerError, map[string]any{"detail": err.Error()})
		return
	}
	h.Pool.Reset()
	writeJSON(w, http.StatusOK, map[string]any{"success": true, "message": "配置已更新"})
}
|
||||
|
||||
// addKey appends a new API key from the request body, rejecting blank and
// duplicate keys.
func (h *Handler) addKey(w http.ResponseWriter, r *http.Request) {
	var req map[string]any
	_ = json.NewDecoder(r.Body).Decode(&req)
	key, _ := req["key"].(string)
	key = strings.TrimSpace(key)
	if key == "" {
		writeJSON(w, http.StatusBadRequest, map[string]any{"detail": "Key 不能为空"})
		return
	}
	err := h.Store.Update(func(c *config.Config) error {
		for _, k := range c.Keys {
			if k == key {
				return fmt.Errorf("Key 已存在")
			}
		}
		c.Keys = append(c.Keys, key)
		return nil
	})
	if err != nil {
		writeJSON(w, http.StatusBadRequest, map[string]any{"detail": err.Error()})
		return
	}
	writeJSON(w, http.StatusOK, map[string]any{"success": true, "total_keys": len(h.Store.Snapshot().Keys)})
}
|
||||
|
||||
func (h *Handler) deleteKey(w http.ResponseWriter, r *http.Request) {
|
||||
key := chi.URLParam(r, "key")
|
||||
err := h.Store.Update(func(c *config.Config) error {
|
||||
idx := -1
|
||||
for i, k := range c.Keys {
|
||||
if k == key {
|
||||
idx = i
|
||||
break
|
||||
}
|
||||
}
|
||||
if idx < 0 {
|
||||
return fmt.Errorf("Key 不存在")
|
||||
}
|
||||
c.Keys = append(c.Keys[:idx], c.Keys[idx+1:]...)
|
||||
return nil
|
||||
})
|
||||
if err != nil {
|
||||
writeJSON(w, http.StatusNotFound, map[string]any{"detail": err.Error()})
|
||||
return
|
||||
}
|
||||
writeJSON(w, http.StatusOK, map[string]any{"success": true, "total_keys": len(h.Store.Snapshot().Keys)})
|
||||
}
|
||||
|
||||
// batchImport merges API keys and accounts from a JSON payload of the form
// {"keys": [...], "accounts": [...]} into the stored config. Entries already
// present (same key string / same account identifier) are skipped, so the
// endpoint is idempotent. The account pool is reset afterwards so newly
// imported accounts become schedulable.
func (h *Handler) batchImport(w http.ResponseWriter, r *http.Request) {
	var req map[string]any
	if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
		writeJSON(w, http.StatusBadRequest, map[string]any{"detail": "无效的 JSON 格式"})
		return
	}
	importedKeys, importedAccounts := 0, 0
	err := h.Store.Update(func(c *config.Config) error {
		if keys, ok := req["keys"].([]any); ok {
			// Index current keys for O(1) duplicate checks.
			existing := map[string]bool{}
			for _, k := range c.Keys {
				existing[k] = true
			}
			for _, k := range keys {
				key := strings.TrimSpace(fmt.Sprintf("%v", k))
				if key == "" || existing[key] {
					continue // skip blanks and duplicates
				}
				c.Keys = append(c.Keys, key)
				existing[key] = true
				importedKeys++
			}
		}
		if accounts, ok := req["accounts"].([]any); ok {
			// Deduplicate accounts by their stable identifier
			// (email / mobile / token-derived id).
			existing := map[string]bool{}
			for _, a := range c.Accounts {
				existing[a.Identifier()] = true
			}
			for _, item := range accounts {
				m, ok := item.(map[string]any)
				if !ok {
					continue // non-object entries are ignored
				}
				acc := toAccount(m)
				id := acc.Identifier()
				if id == "" || existing[id] {
					continue
				}
				c.Accounts = append(c.Accounts, acc)
				existing[id] = true
				importedAccounts++
			}
		}
		return nil
	})
	if err != nil {
		writeJSON(w, http.StatusInternalServerError, map[string]any{"detail": err.Error()})
		return
	}
	// Rebuild the account pool so imported accounts are picked up.
	h.Pool.Reset()
	writeJSON(w, http.StatusOK, map[string]any{"success": true, "imported_keys": importedKeys, "imported_accounts": importedAccounts})
}
|
||||
|
||||
func (h *Handler) exportConfig(w http.ResponseWriter, _ *http.Request) {
|
||||
jsonStr, b64, err := h.Store.ExportJSONAndBase64()
|
||||
if err != nil {
|
||||
writeJSON(w, http.StatusInternalServerError, map[string]any{"detail": err.Error()})
|
||||
return
|
||||
}
|
||||
writeJSON(w, http.StatusOK, map[string]any{"json": jsonStr, "base64": b64})
|
||||
}
|
||||
|
||||
// computeSyncHash produces a deterministic MD5 fingerprint of the syncable
// part of the config: API keys plus account credentials (email/mobile/password
// only — cached tokens are deliberately excluded). The hash is compared
// against the value recorded at the last Vercel sync to detect drift.
// MD5 is used purely as a change detector here, not for security.
func (h *Handler) computeSyncHash() string {
	snap := h.Store.Snapshot()
	syncable := map[string]any{"keys": snap.Keys, "accounts": []map[string]any{}}
	accounts := make([]map[string]any, 0, len(snap.Accounts))
	for _, a := range snap.Accounts {
		m := map[string]any{}
		if a.Email != "" {
			m["email"] = a.Email
		}
		if a.Mobile != "" {
			m["mobile"] = a.Mobile
		}
		if a.Password != "" {
			m["password"] = a.Password
		}
		accounts = append(accounts, m)
	}
	// Sort so the hash is independent of account ordering.
	sort.Slice(accounts, func(i, j int) bool {
		ai := fmt.Sprintf("%v%v", accounts[i]["email"], accounts[i]["mobile"])
		aj := fmt.Sprintf("%v%v", accounts[j]["email"], accounts[j]["mobile"])
		return ai < aj
	})
	syncable["accounts"] = accounts
	// json.Marshal emits map keys in sorted order, so serialization is stable.
	b, _ := json.Marshal(syncable)
	sum := md5.Sum(b)
	return fmt.Sprintf("%x", sum)
}
|
||||
93
internal/admin/handler_test.go
Normal file
93
internal/admin/handler_test.go
Normal file
@@ -0,0 +1,93 @@
|
||||
package admin
|
||||
|
||||
import (
|
||||
"sync/atomic"
|
||||
"testing"
|
||||
"time"
|
||||
|
||||
"ds2api/internal/config"
|
||||
)
|
||||
|
||||
// TestToAccountMissingFieldsRemainEmpty verifies that toAccount leaves fields
// absent from the input map as empty strings instead of inventing values.
func TestToAccountMissingFieldsRemainEmpty(t *testing.T) {
	acc := toAccount(map[string]any{
		"email":    "user@example.com",
		"password": "secret",
	})
	if acc.Email != "user@example.com" {
		t.Fatalf("unexpected email: %q", acc.Email)
	}
	if acc.Mobile != "" {
		t.Fatalf("expected empty mobile, got %q", acc.Mobile)
	}
	if acc.Token != "" {
		t.Fatalf("expected empty token, got %q", acc.Token)
	}
}
|
||||
|
||||
// TestFieldStringNilToEmpty verifies that fieldString maps both explicit nil
// values and missing keys to the empty string.
func TestFieldStringNilToEmpty(t *testing.T) {
	if got := fieldString(map[string]any{"token": nil}, "token"); got != "" {
		t.Fatalf("expected empty string for nil field, got %q", got)
	}
	if got := fieldString(map[string]any{}, "token"); got != "" {
		t.Fatalf("expected empty string for missing field, got %q", got)
	}
}
|
||||
|
||||
// TestRunAccountTestsConcurrentlyKeepsInputOrder verifies that results are
// returned at the same index as their input account, regardless of which
// goroutine finished first.
func TestRunAccountTestsConcurrentlyKeepsInputOrder(t *testing.T) {
	accounts := []config.Account{
		{Email: "a@example.com"},
		{Email: "b@example.com"},
		{Email: "c@example.com"},
	}
	results := runAccountTestsConcurrently(accounts, 2, func(idx int, acc config.Account) map[string]any {
		// Echo back the index/identity so ordering can be checked.
		return map[string]any{
			"idx":     idx,
			"account": acc.Identifier(),
		}
	})
	if len(results) != len(accounts) {
		t.Fatalf("unexpected result length: got %d want %d", len(results), len(accounts))
	}
	for i := range accounts {
		gotIdx, _ := results[i]["idx"].(int)
		if gotIdx != i {
			t.Fatalf("result index mismatch at %d: got %d", i, gotIdx)
		}
		gotID, _ := results[i]["account"].(string)
		if gotID != accounts[i].Identifier() {
			t.Fatalf("result order mismatch at %d: got %q want %q", i, gotID, accounts[i].Identifier())
		}
	}
}
|
||||
|
||||
// TestRunAccountTestsConcurrentlyRespectsLimit verifies the worker-pool bound:
// at most `limit` callbacks run at once, but more than one does (i.e. the
// work is actually concurrent). The callback tracks a high-water mark of
// in-flight invocations via an atomic CAS loop.
func TestRunAccountTestsConcurrentlyRespectsLimit(t *testing.T) {
	const limit = 3
	accounts := []config.Account{
		{Email: "1@example.com"},
		{Email: "2@example.com"},
		{Email: "3@example.com"},
		{Email: "4@example.com"},
		{Email: "5@example.com"},
		{Email: "6@example.com"},
	}
	var current int32 // number of callbacks currently in flight
	var maxSeen int32 // high-water mark of `current`
	_ = runAccountTestsConcurrently(accounts, limit, func(_ int, _ config.Account) map[string]any {
		c := atomic.AddInt32(&current, 1)
		// CAS loop: raise maxSeen to c unless another goroutine already did.
		for {
			m := atomic.LoadInt32(&maxSeen)
			if c <= m || atomic.CompareAndSwapInt32(&maxSeen, m, c) {
				break
			}
		}
		// Hold the slot briefly so overlapping callbacks are observable.
		time.Sleep(20 * time.Millisecond)
		atomic.AddInt32(&current, -1)
		return map[string]any{"success": true}
	})
	if maxSeen > limit {
		t.Fatalf("concurrency exceeded limit: got %d > %d", maxSeen, limit)
	}
	if maxSeen < 2 {
		t.Fatalf("expected concurrent execution, max seen %d", maxSeen)
	}
}
|
||||
197
internal/admin/handler_vercel.go
Normal file
197
internal/admin/handler_vercel.go
Normal file
@@ -0,0 +1,197 @@
|
||||
package admin
|
||||
|
||||
import (
|
||||
"bytes"
|
||||
"context"
|
||||
"encoding/base64"
|
||||
"encoding/json"
|
||||
"io"
|
||||
"net/http"
|
||||
"net/url"
|
||||
"os"
|
||||
"strings"
|
||||
"time"
|
||||
)
|
||||
|
||||
// syncVercel pushes the current config to a Vercel project as the
// DS2API_CONFIG_JSON environment variable and, when the project is linked to
// GitHub, triggers a production redeploy.
//
// Request body fields:
//   vercel_token     - Vercel API token; "__USE_PRECONFIG__" or blank falls
//                      back to the VERCEL_TOKEN env var
//   project_id       - Vercel project id (falls back to VERCEL_PROJECT_ID)
//   team_id          - optional team id (falls back to VERCEL_TEAM_ID)
//   auto_validate    - default true: log in accounts missing a token first
//   save_credentials - default true: also store the Vercel credentials as
//                      project env vars (skipped when preconfig was used)
func (h *Handler) syncVercel(w http.ResponseWriter, r *http.Request) {
	var req map[string]any
	if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
		writeJSON(w, http.StatusBadRequest, map[string]any{"detail": "invalid json"})
		return
	}
	vercelToken, _ := req["vercel_token"].(string)
	projectID, _ := req["project_id"].(string)
	teamID, _ := req["team_id"].(string)
	autoValidate := true
	if v, ok := req["auto_validate"].(bool); ok {
		autoValidate = v
	}
	saveCreds := true
	if v, ok := req["save_credentials"].(bool); ok {
		saveCreds = v
	}
	// Fall back to pre-configured env vars for any missing credential.
	usePreconfig := vercelToken == "__USE_PRECONFIG__" || strings.TrimSpace(vercelToken) == ""
	if usePreconfig {
		vercelToken = strings.TrimSpace(os.Getenv("VERCEL_TOKEN"))
	}
	if strings.TrimSpace(projectID) == "" {
		projectID = strings.TrimSpace(os.Getenv("VERCEL_PROJECT_ID"))
	}
	if strings.TrimSpace(teamID) == "" {
		teamID = strings.TrimSpace(os.Getenv("VERCEL_TEAM_ID"))
	}
	if vercelToken == "" || projectID == "" {
		writeJSON(w, http.StatusBadRequest, map[string]any{"detail": "需要 Vercel Token 和 Project ID"})
		return
	}
	// Optionally pre-validate accounts that have no cached token, pacing
	// logins 500ms apart (presumably to rate-limit upstream logins).
	validated, failed := 0, []string{}
	if autoValidate {
		for _, acc := range h.Store.Snapshot().Accounts {
			if strings.TrimSpace(acc.Token) != "" {
				continue
			}
			token, err := h.DS.Login(r.Context(), acc)
			if err != nil {
				failed = append(failed, acc.Identifier())
			} else {
				validated++
				_ = h.Store.UpdateAccountToken(acc.Identifier(), token)
			}
			time.Sleep(500 * time.Millisecond)
		}
	}

	cfgJSON, _, err := h.Store.ExportJSONAndBase64()
	if err != nil {
		writeJSON(w, http.StatusInternalServerError, map[string]any{"detail": err.Error()})
		return
	}
	cfgB64 := base64.StdEncoding.EncodeToString([]byte(cfgJSON))
	client := &http.Client{Timeout: 30 * time.Second}
	params := url.Values{}
	if teamID != "" {
		params.Set("teamId", teamID)
	}
	headers := map[string]string{"Authorization": "Bearer " + vercelToken}
	// Upsert DS2API_CONFIG_JSON: PATCH the existing env var or POST a new one.
	envResp, status, err := vercelRequest(r.Context(), client, http.MethodGet, "https://api.vercel.com/v9/projects/"+projectID+"/env", params, headers, nil)
	if err != nil || status != http.StatusOK {
		writeJSON(w, statusOr(status, http.StatusInternalServerError), map[string]any{"detail": "获取环境变量失败"})
		return
	}
	envs, _ := envResp["envs"].([]any)
	existingEnvID := findEnvID(envs, "DS2API_CONFIG_JSON")
	if existingEnvID != "" {
		_, status, err = vercelRequest(r.Context(), client, http.MethodPatch, "https://api.vercel.com/v9/projects/"+projectID+"/env/"+existingEnvID, params, headers, map[string]any{"value": cfgB64})
	} else {
		_, status, err = vercelRequest(r.Context(), client, http.MethodPost, "https://api.vercel.com/v10/projects/"+projectID+"/env", params, headers, map[string]any{"key": "DS2API_CONFIG_JSON", "value": cfgB64, "type": "encrypted", "target": []string{"production", "preview"}})
	}
	if err != nil || (status != http.StatusOK && status != http.StatusCreated) {
		writeJSON(w, statusOr(status, http.StatusInternalServerError), map[string]any{"detail": "更新环境变量失败"})
		return
	}
	// Optionally persist the Vercel credentials themselves as env vars.
	// Best-effort: individual failures are not fatal, only successes are
	// reported back in saved_credentials.
	savedCreds := []string{}
	if saveCreds && !usePreconfig {
		creds := [][2]string{{"VERCEL_TOKEN", vercelToken}, {"VERCEL_PROJECT_ID", projectID}}
		if teamID != "" {
			creds = append(creds, [2]string{"VERCEL_TEAM_ID", teamID})
		}
		for _, kv := range creds {
			id := findEnvID(envs, kv[0])
			if id != "" {
				_, status, _ = vercelRequest(r.Context(), client, http.MethodPatch, "https://api.vercel.com/v9/projects/"+projectID+"/env/"+id, params, headers, map[string]any{"value": kv[1]})
			} else {
				_, status, _ = vercelRequest(r.Context(), client, http.MethodPost, "https://api.vercel.com/v10/projects/"+projectID+"/env", params, headers, map[string]any{"key": kv[0], "value": kv[1], "type": "encrypted", "target": []string{"production", "preview"}})
			}
			if status == http.StatusOK || status == http.StatusCreated {
				savedCreds = append(savedCreds, kv[0])
			}
		}
	}
	// Trigger a redeploy when the project is GitHub-linked; otherwise the
	// caller must redeploy manually for the new env var to take effect.
	projectResp, status, _ := vercelRequest(r.Context(), client, http.MethodGet, "https://api.vercel.com/v9/projects/"+projectID, params, headers, nil)
	manual := true
	deployURL := ""
	if status == http.StatusOK {
		if link, ok := projectResp["link"].(map[string]any); ok {
			if linkType, _ := link["type"].(string); linkType == "github" {
				repoID := intFrom(link["repoId"])
				ref, _ := link["productionBranch"].(string)
				if ref == "" {
					ref = "main"
				}
				depResp, depStatus, _ := vercelRequest(r.Context(), client, http.MethodPost, "https://api.vercel.com/v13/deployments", params, headers, map[string]any{"name": projectID, "project": projectID, "target": "production", "gitSource": map[string]any{"type": "github", "repoId": repoID, "ref": ref}})
				if depStatus == http.StatusOK || depStatus == http.StatusCreated {
					deployURL, _ = depResp["url"].(string)
					manual = false
				}
			}
		}
	}
	// Record the hash of what was synced so vercelStatus can detect drift.
	_ = h.Store.SetVercelSync(h.computeSyncHash(), time.Now().Unix())
	result := map[string]any{"success": true, "validated_accounts": validated}
	if manual {
		result["message"] = "配置已同步到 Vercel,请手动触发重新部署"
		result["manual_deploy_required"] = true
	} else {
		result["message"] = "配置已同步,正在重新部署..."
		result["deployment_url"] = deployURL
	}
	if len(failed) > 0 {
		result["failed_accounts"] = failed
	}
	if len(savedCreds) > 0 {
		result["saved_credentials"] = savedCreds
	}
	writeJSON(w, http.StatusOK, result)
}
|
||||
|
||||
func (h *Handler) vercelStatus(w http.ResponseWriter, _ *http.Request) {
|
||||
snap := h.Store.Snapshot()
|
||||
current := h.computeSyncHash()
|
||||
synced := snap.VercelSyncHash != "" && snap.VercelSyncHash == current
|
||||
writeJSON(w, http.StatusOK, map[string]any{"synced": synced, "last_sync_time": nilIfZero(snap.VercelSyncTime), "has_synced_before": snap.VercelSyncHash != ""})
|
||||
}
|
||||
|
||||
func vercelRequest(ctx context.Context, client *http.Client, method, endpoint string, params url.Values, headers map[string]string, body any) (map[string]any, int, error) {
|
||||
if len(params) > 0 {
|
||||
endpoint += "?" + params.Encode()
|
||||
}
|
||||
var reader io.Reader
|
||||
if body != nil {
|
||||
b, _ := json.Marshal(body)
|
||||
reader = bytes.NewReader(b)
|
||||
}
|
||||
req, err := http.NewRequestWithContext(ctx, method, endpoint, reader)
|
||||
if err != nil {
|
||||
return nil, 0, err
|
||||
}
|
||||
for k, v := range headers {
|
||||
req.Header.Set(k, v)
|
||||
}
|
||||
req.Header.Set("Content-Type", "application/json")
|
||||
resp, err := client.Do(req)
|
||||
if err != nil {
|
||||
return nil, 0, err
|
||||
}
|
||||
defer resp.Body.Close()
|
||||
b, _ := io.ReadAll(resp.Body)
|
||||
parsed := map[string]any{}
|
||||
_ = json.Unmarshal(b, &parsed)
|
||||
if len(parsed) == 0 {
|
||||
parsed["raw"] = string(b)
|
||||
}
|
||||
return parsed, resp.StatusCode, nil
|
||||
}
|
||||
|
||||
// findEnvID scans a Vercel env-var listing (as decoded JSON objects) and
// returns the id of the entry whose "key" equals key, or "" when absent.
// Non-object entries are skipped.
func findEnvID(envs []any, key string) string {
	for _, raw := range envs {
		entry, isMap := raw.(map[string]any)
		if !isMap {
			continue
		}
		name, _ := entry["key"].(string)
		if name != key {
			continue
		}
		id, _ := entry["id"].(string)
		return id
	}
	return ""
}
|
||||
83
internal/admin/helpers.go
Normal file
83
internal/admin/helpers.go
Normal file
@@ -0,0 +1,83 @@
|
||||
package admin
|
||||
|
||||
import (
|
||||
"fmt"
|
||||
"net/http"
|
||||
"strconv"
|
||||
"strings"
|
||||
|
||||
"ds2api/internal/config"
|
||||
"ds2api/internal/util"
|
||||
)
|
||||
|
||||
// writeJSON and intFrom are package-internal aliases for the shared util versions.
|
||||
var writeJSON = util.WriteJSON
|
||||
var intFrom = util.IntFrom
|
||||
|
||||
func reverseAccounts(a []config.Account) {
|
||||
for i, j := 0, len(a)-1; i < j; i, j = i+1, j-1 {
|
||||
a[i], a[j] = a[j], a[i]
|
||||
}
|
||||
}
|
||||
|
||||
func intFromQuery(r *http.Request, key string, d int) int {
|
||||
v := r.URL.Query().Get(key)
|
||||
if v == "" {
|
||||
return d
|
||||
}
|
||||
n, err := strconv.Atoi(v)
|
||||
if err != nil {
|
||||
return d
|
||||
}
|
||||
return n
|
||||
}
|
||||
|
||||
// nilIfEmpty maps the empty string to nil (so JSON emits null) and passes
// any other string through unchanged.
func nilIfEmpty(s string) any {
	if s != "" {
		return s
	}
	return nil
}
|
||||
|
||||
// nilIfZero maps a zero int64 to nil (so JSON emits null) and passes any
// other value through unchanged.
func nilIfZero(v int64) any {
	if v != 0 {
		return v
	}
	return nil
}
|
||||
|
||||
// toStringSlice converts a decoded JSON array into trimmed strings.
// Each element is stringified via fmt, so numbers/bools are accepted too.
// Returns (nil, false) when v is not a []any.
func toStringSlice(v any) ([]string, bool) {
	items, isSlice := v.([]any)
	if !isSlice {
		return nil, false
	}
	result := make([]string, 0, len(items))
	for _, item := range items {
		result = append(result, strings.TrimSpace(fmt.Sprintf("%v", item)))
	}
	return result, true
}
|
||||
|
||||
func toAccount(m map[string]any) config.Account {
|
||||
return config.Account{
|
||||
Email: fieldString(m, "email"),
|
||||
Mobile: fieldString(m, "mobile"),
|
||||
Password: fieldString(m, "password"),
|
||||
Token: fieldString(m, "token"),
|
||||
}
|
||||
}
|
||||
|
||||
// fieldString extracts m[key] as a trimmed string. Missing keys and explicit
// nil values both yield ""; non-string values are stringified via fmt.
func fieldString(m map[string]any, key string) string {
	value, present := m[key]
	if !present || value == nil {
		return ""
	}
	return strings.TrimSpace(fmt.Sprintf("%v", value))
}
|
||||
|
||||
// statusOr returns v unless it is zero (no HTTP status available, e.g. a
// transport error), in which case it returns the default d.
func statusOr(v int, d int) int {
	if v != 0 {
		return v
	}
	return d
}
|
||||
120
internal/auth/admin.go
Normal file
120
internal/auth/admin.go
Normal file
@@ -0,0 +1,120 @@
|
||||
package auth
|
||||
|
||||
import (
|
||||
"crypto/hmac"
|
||||
"crypto/sha256"
|
||||
"encoding/base64"
|
||||
"encoding/json"
|
||||
"errors"
|
||||
"log/slog"
|
||||
"net/http"
|
||||
"os"
|
||||
"strconv"
|
||||
"strings"
|
||||
"sync"
|
||||
"time"
|
||||
)
|
||||
|
||||
// warnOnce ensures the missing-admin-key warning is logged only once.
var warnOnce sync.Once
|
||||
|
||||
func AdminKey() string {
|
||||
if v := strings.TrimSpace(os.Getenv("DS2API_ADMIN_KEY")); v != "" {
|
||||
return v
|
||||
}
|
||||
warnOnce.Do(func() {
|
||||
slog.Warn("⚠️ DS2API_ADMIN_KEY is not set! Using insecure default \"admin\". Set a strong key in production!")
|
||||
})
|
||||
return "admin"
|
||||
}
|
||||
|
||||
func jwtSecret() string {
|
||||
if v := strings.TrimSpace(os.Getenv("DS2API_JWT_SECRET")); v != "" {
|
||||
return v
|
||||
}
|
||||
return AdminKey()
|
||||
}
|
||||
|
||||
func jwtExpireHours() int {
|
||||
if v := strings.TrimSpace(os.Getenv("DS2API_JWT_EXPIRE_HOURS")); v != "" {
|
||||
if n, err := strconv.Atoi(v); err == nil && n > 0 {
|
||||
return n
|
||||
}
|
||||
}
|
||||
return 24
|
||||
}
|
||||
|
||||
func CreateJWT(expireHours int) (string, error) {
|
||||
if expireHours <= 0 {
|
||||
expireHours = jwtExpireHours()
|
||||
}
|
||||
header := map[string]any{"alg": "HS256", "typ": "JWT"}
|
||||
payload := map[string]any{"iat": time.Now().Unix(), "exp": time.Now().Add(time.Duration(expireHours) * time.Hour).Unix(), "role": "admin"}
|
||||
h, _ := json.Marshal(header)
|
||||
p, _ := json.Marshal(payload)
|
||||
headerB64 := rawB64Encode(h)
|
||||
payloadB64 := rawB64Encode(p)
|
||||
msg := headerB64 + "." + payloadB64
|
||||
sig := signHS256(msg)
|
||||
return msg + "." + rawB64Encode(sig), nil
|
||||
}
|
||||
|
||||
// VerifyJWT validates an HS256 JWT minted by CreateJWT and returns its
// decoded payload. Checks, in order: three-part structure, HMAC signature
// (constant-time via hmac.Equal), payload decodability, and the exp claim
// against the current time. The header's alg field is ignored — signatures
// are always recomputed with HS256 and the local secret, so alg-confusion
// does not apply here.
func VerifyJWT(token string) (map[string]any, error) {
	parts := strings.Split(token, ".")
	if len(parts) != 3 {
		return nil, errors.New("invalid token format")
	}
	msg := parts[0] + "." + parts[1]
	expected := signHS256(msg)
	actual, err := rawB64Decode(parts[2])
	if err != nil {
		return nil, errors.New("invalid signature")
	}
	if !hmac.Equal(expected, actual) {
		return nil, errors.New("invalid signature")
	}
	payloadBytes, err := rawB64Decode(parts[1])
	if err != nil {
		return nil, errors.New("invalid payload")
	}
	var payload map[string]any
	if err := json.Unmarshal(payloadBytes, &payload); err != nil {
		return nil, errors.New("invalid payload")
	}
	// A missing exp decodes as 0 and is therefore treated as expired.
	exp, _ := payload["exp"].(float64)
	if int64(exp) < time.Now().Unix() {
		return nil, errors.New("token expired")
	}
	return payload, nil
}
|
||||
|
||||
func VerifyAdminRequest(r *http.Request) error {
|
||||
authHeader := strings.TrimSpace(r.Header.Get("Authorization"))
|
||||
if !strings.HasPrefix(strings.ToLower(authHeader), "bearer ") {
|
||||
return errors.New("authentication required")
|
||||
}
|
||||
token := strings.TrimSpace(authHeader[7:])
|
||||
if token == "" {
|
||||
return errors.New("authentication required")
|
||||
}
|
||||
if token == AdminKey() {
|
||||
return nil
|
||||
}
|
||||
if _, err := VerifyJWT(token); err == nil {
|
||||
return nil
|
||||
}
|
||||
return errors.New("invalid credentials")
|
||||
}
|
||||
|
||||
func signHS256(msg string) []byte {
|
||||
h := hmac.New(sha256.New, []byte(jwtSecret()))
|
||||
_, _ = h.Write([]byte(msg))
|
||||
return h.Sum(nil)
|
||||
}
|
||||
|
||||
// rawB64Encode encodes b as unpadded URL-safe base64 (JWT segment encoding).
func rawB64Encode(b []byte) string {
	return base64.RawURLEncoding.EncodeToString(b)
}
|
||||
|
||||
// rawB64Decode decodes an unpadded URL-safe base64 string (JWT segment).
func rawB64Decode(s string) ([]byte, error) {
	return base64.RawURLEncoding.DecodeString(s)
}
|
||||
29
internal/auth/admin_test.go
Normal file
29
internal/auth/admin_test.go
Normal file
@@ -0,0 +1,29 @@
|
||||
package auth
|
||||
|
||||
import (
|
||||
"net/http"
|
||||
"testing"
|
||||
)
|
||||
|
||||
// TestJWTCreateVerify verifies a freshly minted token round-trips through
// VerifyJWT and carries the admin role claim.
func TestJWTCreateVerify(t *testing.T) {
	token, err := CreateJWT(1)
	if err != nil {
		t.Fatalf("create jwt failed: %v", err)
	}
	payload, err := VerifyJWT(token)
	if err != nil {
		t.Fatalf("verify jwt failed: %v", err)
	}
	if payload["role"] != "admin" {
		t.Fatalf("unexpected payload: %#v", payload)
	}
}
|
||||
|
||||
// TestVerifyAdminRequest verifies that a bearer JWT is accepted by the admin
// request guard.
func TestVerifyAdminRequest(t *testing.T) {
	token, _ := CreateJWT(1)
	req, _ := http.NewRequest(http.MethodGet, "/admin/config", nil)
	req.Header.Set("Authorization", "Bearer "+token)
	if err := VerifyAdminRequest(req); err != nil {
		t.Fatalf("expected token accepted: %v", err)
	}
}
|
||||
160
internal/auth/request.go
Normal file
160
internal/auth/request.go
Normal file
@@ -0,0 +1,160 @@
|
||||
package auth
|
||||
|
||||
import (
|
||||
"context"
|
||||
"errors"
|
||||
"net/http"
|
||||
"strings"
|
||||
|
||||
"ds2api/internal/account"
|
||||
"ds2api/internal/config"
|
||||
)
|
||||
|
||||
// ctxKey is a private context-key type to avoid collisions with other
// packages' context values.
type ctxKey string

const authCtxKey ctxKey = "auth_context"

var (
	// ErrUnauthorized is returned when no caller token is present.
	ErrUnauthorized = errors.New("unauthorized: missing auth token")
	// ErrNoAccount is returned when the pool cannot supply an account.
	ErrNoAccount = errors.New("no accounts configured or all accounts are busy")
)

// RequestAuth carries the resolved authentication state for one request.
type RequestAuth struct {
	UseConfigToken bool            // true when a pooled configured account backs this request
	DeepSeekToken  string          // token used upstream (caller-supplied or account token)
	AccountID      string          // identifier of the pooled account, if any
	Account        config.Account  // the pooled account, if any
	TriedAccounts  map[string]bool // accounts already attempted (consumed by SwitchAccount)
	resolver       *Resolver
}

// LoginFunc performs a DeepSeek login for an account and returns its token.
type LoginFunc func(ctx context.Context, acc config.Account) (string, error)

// Resolver turns incoming HTTP requests into RequestAuth values, backed by
// the config store, the account pool, and a login function.
type Resolver struct {
	Store *config.Store
	Pool  *account.Pool
	Login LoginFunc
}
|
||||
|
||||
func NewResolver(store *config.Store, pool *account.Pool, login LoginFunc) *Resolver {
|
||||
return &Resolver{Store: store, Pool: pool, Login: login}
|
||||
}
|
||||
|
||||
// Determine resolves authentication for an inbound request. The caller token
// comes from "Authorization: Bearer ..." or the "x-api-key" header.
//
//   - Missing token          -> ErrUnauthorized.
//   - Token not in config    -> treated as a raw DeepSeek token, used as-is.
//   - Token is a managed key -> an account is acquired from the pool
//     (optionally targeted via X-Ds2-Target-Account), logging in first when
//     the account has no cached token; on login failure the account is
//     released before the error is returned.
//
// When UseConfigToken is true the caller owns the pool slot and must release
// it via Resolver.Release.
func (r *Resolver) Determine(req *http.Request) (*RequestAuth, error) {
	callerKey := extractCallerToken(req)
	if callerKey == "" {
		return nil, ErrUnauthorized
	}
	ctx := req.Context()
	if !r.Store.HasAPIKey(callerKey) {
		// Pass-through mode: caller supplied their own DeepSeek token.
		return &RequestAuth{UseConfigToken: false, DeepSeekToken: callerKey, resolver: r, TriedAccounts: map[string]bool{}}, nil
	}
	target := strings.TrimSpace(req.Header.Get("X-Ds2-Target-Account"))
	acc, ok := r.Pool.AcquireWait(ctx, target, nil)
	if !ok {
		return nil, ErrNoAccount
	}
	a := &RequestAuth{
		UseConfigToken: true,
		AccountID:      acc.Identifier(),
		Account:        acc,
		TriedAccounts:  map[string]bool{},
		resolver:       r,
	}
	if acc.Token == "" {
		if err := r.loginAndPersist(ctx, a); err != nil {
			// Give the slot back; the caller never saw this account.
			r.Pool.Release(a.AccountID)
			return nil, err
		}
	} else {
		a.DeepSeekToken = acc.Token
	}
	return a, nil
}
|
||||
|
||||
// WithAuth stores the resolved RequestAuth on the context for downstream
// handlers (retrieved with FromContext).
func WithAuth(ctx context.Context, a *RequestAuth) context.Context {
	return context.WithValue(ctx, authCtxKey, a)
}
|
||||
|
||||
func FromContext(ctx context.Context) (*RequestAuth, bool) {
|
||||
v := ctx.Value(authCtxKey)
|
||||
a, ok := v.(*RequestAuth)
|
||||
return a, ok
|
||||
}
|
||||
|
||||
// loginAndPersist logs the current account in, records the fresh token on the
// RequestAuth, and persists it to the config store so later requests reuse it.
func (r *Resolver) loginAndPersist(ctx context.Context, a *RequestAuth) error {
	token, err := r.Login(ctx, a.Account)
	if err != nil {
		return err
	}
	a.Account.Token = token
	a.DeepSeekToken = token
	return r.Store.UpdateAccountToken(a.AccountID, token)
}
|
||||
|
||||
func (r *Resolver) RefreshToken(ctx context.Context, a *RequestAuth) bool {
|
||||
if !a.UseConfigToken || a.AccountID == "" {
|
||||
return false
|
||||
}
|
||||
_ = r.Store.UpdateAccountToken(a.AccountID, "")
|
||||
a.Account.Token = ""
|
||||
if err := r.loginAndPersist(ctx, a); err != nil {
|
||||
config.Logger.Error("[refresh_token] failed", "account", a.AccountID, "error", err)
|
||||
return false
|
||||
}
|
||||
return true
|
||||
}
|
||||
|
||||
func (r *Resolver) MarkTokenInvalid(a *RequestAuth) {
|
||||
if !a.UseConfigToken || a.AccountID == "" {
|
||||
return
|
||||
}
|
||||
a.Account.Token = ""
|
||||
a.DeepSeekToken = ""
|
||||
_ = r.Store.UpdateAccountToken(a.AccountID, "")
|
||||
}
|
||||
|
||||
// SwitchAccount rotates the request to a different pooled account after a
// failure: the current account is recorded in TriedAccounts and released,
// then an untried account is acquired and logged in if it has no cached
// token. Returns false when the request is not pool-backed, no eligible
// account is available, or the replacement login fails.
func (r *Resolver) SwitchAccount(ctx context.Context, a *RequestAuth) bool {
	if !a.UseConfigToken {
		return false
	}
	if a.TriedAccounts == nil {
		a.TriedAccounts = map[string]bool{}
	}
	if a.AccountID != "" {
		// Remember the failed account so Acquire skips it, then free its slot.
		a.TriedAccounts[a.AccountID] = true
		r.Pool.Release(a.AccountID)
	}
	acc, ok := r.Pool.Acquire("", a.TriedAccounts)
	if !ok {
		return false
	}
	a.Account = acc
	a.AccountID = acc.Identifier()
	if acc.Token == "" {
		if err := r.loginAndPersist(ctx, a); err != nil {
			return false
		}
	} else {
		a.DeepSeekToken = acc.Token
	}
	return true
}
|
||||
|
||||
func (r *Resolver) Release(a *RequestAuth) {
|
||||
if a == nil || !a.UseConfigToken || a.AccountID == "" {
|
||||
return
|
||||
}
|
||||
r.Pool.Release(a.AccountID)
|
||||
}
|
||||
|
||||
func extractCallerToken(req *http.Request) string {
|
||||
authHeader := strings.TrimSpace(req.Header.Get("Authorization"))
|
||||
if strings.HasPrefix(strings.ToLower(authHeader), "bearer ") {
|
||||
token := strings.TrimSpace(authHeader[7:])
|
||||
if token != "" {
|
||||
return token
|
||||
}
|
||||
}
|
||||
return strings.TrimSpace(req.Header.Get("x-api-key"))
|
||||
}
|
||||
74
internal/auth/request_test.go
Normal file
74
internal/auth/request_test.go
Normal file
@@ -0,0 +1,74 @@
|
||||
package auth
|
||||
|
||||
import (
|
||||
"context"
|
||||
"net/http"
|
||||
"testing"
|
||||
|
||||
"ds2api/internal/account"
|
||||
"ds2api/internal/config"
|
||||
)
|
||||
|
||||
// newTestResolver builds a Resolver backed by an in-memory config (one
// managed API key, one account with a cached token) loaded from the
// DS2API_CONFIG_JSON env var, and a stub login that always succeeds with
// "fresh-token".
func newTestResolver(t *testing.T) *Resolver {
	t.Helper()
	t.Setenv("DS2API_CONFIG_JSON", `{
		"keys":["managed-key"],
		"accounts":[{"email":"acc@example.com","password":"pwd","token":"account-token"}]
	}`)
	store := config.LoadStore()
	pool := account.NewPool(store)
	return NewResolver(store, pool, func(_ context.Context, _ config.Account) (string, error) {
		return "fresh-token", nil
	})
}
||||
|
||||
// TestDetermineWithXAPIKeyUsesDirectToken verifies that an x-api-key value
// not present in the config's key list is passed through as a raw DeepSeek
// token (no pooled account involved).
func TestDetermineWithXAPIKeyUsesDirectToken(t *testing.T) {
	r := newTestResolver(t)
	req, _ := http.NewRequest(http.MethodPost, "/anthropic/v1/messages", nil)
	req.Header.Set("x-api-key", "direct-token")

	auth, err := r.Determine(req)
	if err != nil {
		t.Fatalf("determine failed: %v", err)
	}
	if auth.UseConfigToken {
		t.Fatalf("expected direct token mode")
	}
	if auth.DeepSeekToken != "direct-token" {
		t.Fatalf("unexpected token: %q", auth.DeepSeekToken)
	}
}
|
||||
|
||||
// TestDetermineWithXAPIKeyManagedKeyAcquiresAccount verifies that presenting
// a configured API key switches Determine into pooled-account mode and uses
// the account's cached token.
func TestDetermineWithXAPIKeyManagedKeyAcquiresAccount(t *testing.T) {
	r := newTestResolver(t)
	req, _ := http.NewRequest(http.MethodPost, "/anthropic/v1/messages", nil)
	req.Header.Set("x-api-key", "managed-key")

	auth, err := r.Determine(req)
	if err != nil {
		t.Fatalf("determine failed: %v", err)
	}
	defer r.Release(auth) // return the pool slot acquired by Determine
	if !auth.UseConfigToken {
		t.Fatalf("expected managed key mode")
	}
	if auth.AccountID != "acc@example.com" {
		t.Fatalf("unexpected account id: %q", auth.AccountID)
	}
	if auth.DeepSeekToken != "account-token" {
		t.Fatalf("unexpected account token: %q", auth.DeepSeekToken)
	}
}
|
||||
|
||||
// TestDetermineMissingToken verifies that a request carrying no credentials
// is rejected with ErrUnauthorized.
func TestDetermineMissingToken(t *testing.T) {
	r := newTestResolver(t)
	req, _ := http.NewRequest(http.MethodPost, "/v1/chat/completions", nil)

	_, err := r.Determine(req)
	if err == nil {
		t.Fatal("expected unauthorized error")
	}
	if err != ErrUnauthorized {
		t.Fatalf("unexpected error: %v", err)
	}
}
|
||||
415
internal/config/config.go
Normal file
415
internal/config/config.go
Normal file
@@ -0,0 +1,415 @@
|
||||
package config
|
||||
|
||||
import (
|
||||
"crypto/sha256"
|
||||
"encoding/base64"
|
||||
"encoding/hex"
|
||||
"encoding/json"
|
||||
"errors"
|
||||
"log/slog"
|
||||
"os"
|
||||
"path/filepath"
|
||||
"slices"
|
||||
"strings"
|
||||
"sync"
|
||||
)
|
||||
|
||||
// Logger is the package-wide structured logger; its level is configured via
// the LOG_LEVEL env var (see newLogger).
var Logger = newLogger()
|
||||
|
||||
func newLogger() *slog.Logger {
|
||||
level := new(slog.LevelVar)
|
||||
switch strings.ToUpper(strings.TrimSpace(os.Getenv("LOG_LEVEL"))) {
|
||||
case "DEBUG":
|
||||
level.Set(slog.LevelDebug)
|
||||
case "WARN":
|
||||
level.Set(slog.LevelWarn)
|
||||
case "ERROR":
|
||||
level.Set(slog.LevelError)
|
||||
default:
|
||||
level.Set(slog.LevelInfo)
|
||||
}
|
||||
h := slog.NewTextHandler(os.Stdout, &slog.HandlerOptions{Level: level})
|
||||
return slog.New(h)
|
||||
}
|
||||
|
||||
// Account holds the credentials for one DeepSeek account. An account is
// identified by email or mobile (see Identifier); Token caches the session
// token obtained after a successful login and may be empty.
type Account struct {
	Email    string `json:"email,omitempty"`
	Mobile   string `json:"mobile,omitempty"`
	Password string `json:"password,omitempty"`
	Token    string `json:"token,omitempty"` // cached login token; cleared to force re-login
}
||||
|
||||
func (a Account) Identifier() string {
|
||||
if strings.TrimSpace(a.Email) != "" {
|
||||
return strings.TrimSpace(a.Email)
|
||||
}
|
||||
if strings.TrimSpace(a.Mobile) != "" {
|
||||
return strings.TrimSpace(a.Mobile)
|
||||
}
|
||||
// Backward compatibility: old configs may contain token-only accounts.
|
||||
// Use a stable non-sensitive synthetic id so they can still join the pool.
|
||||
token := strings.TrimSpace(a.Token)
|
||||
if token == "" {
|
||||
return ""
|
||||
}
|
||||
sum := sha256.Sum256([]byte(token))
|
||||
return "token:" + hex.EncodeToString(sum[:8])
|
||||
}
|
||||
|
||||
// Config is the persisted application configuration. AdditionalFields keeps
// unknown top-level JSON keys so they survive load/save round-trips
// (see MarshalJSON / UnmarshalJSON).
type Config struct {
	Keys             []string          `json:"keys,omitempty"`     // API keys accepted from callers
	Accounts         []Account         `json:"accounts,omitempty"` // DeepSeek account pool
	ClaudeMapping    map[string]string `json:"claude_mapping,omitempty"`
	ClaudeModelMap   map[string]string `json:"claude_model_mapping,omitempty"`
	VercelSyncHash   string            `json:"_vercel_sync_hash,omitempty"` // config hash at last Vercel sync
	VercelSyncTime   int64             `json:"_vercel_sync_time,omitempty"` // unix seconds of last Vercel sync
	AdditionalFields map[string]any    `json:"-"`                           // unknown top-level keys, kept verbatim
}
|
||||
|
||||
// MarshalJSON flattens the config into a single JSON object: preserved
// unknown fields are written first, then the known fields (which therefore
// win on a key collision). Empty/zero known fields are omitted entirely.
func (c Config) MarshalJSON() ([]byte, error) {
	m := map[string]any{}
	// Unknown keys first so known fields below take precedence on collision.
	for k, v := range c.AdditionalFields {
		m[k] = v
	}
	if len(c.Keys) > 0 {
		m["keys"] = c.Keys
	}
	if len(c.Accounts) > 0 {
		m["accounts"] = c.Accounts
	}
	if len(c.ClaudeMapping) > 0 {
		m["claude_mapping"] = c.ClaudeMapping
	}
	if len(c.ClaudeModelMap) > 0 {
		m["claude_model_mapping"] = c.ClaudeModelMap
	}
	if c.VercelSyncHash != "" {
		m["_vercel_sync_hash"] = c.VercelSyncHash
	}
	if c.VercelSyncTime != 0 {
		m["_vercel_sync_time"] = c.VercelSyncTime
	}
	return json.Marshal(m)
}
|
||||
|
||||
func (c *Config) UnmarshalJSON(b []byte) error {
|
||||
raw := map[string]json.RawMessage{}
|
||||
if err := json.Unmarshal(b, &raw); err != nil {
|
||||
return err
|
||||
}
|
||||
c.AdditionalFields = map[string]any{}
|
||||
for k, v := range raw {
|
||||
switch k {
|
||||
case "keys":
|
||||
_ = json.Unmarshal(v, &c.Keys)
|
||||
case "accounts":
|
||||
_ = json.Unmarshal(v, &c.Accounts)
|
||||
case "claude_mapping":
|
||||
_ = json.Unmarshal(v, &c.ClaudeMapping)
|
||||
case "claude_model_mapping":
|
||||
_ = json.Unmarshal(v, &c.ClaudeModelMap)
|
||||
case "_vercel_sync_hash":
|
||||
_ = json.Unmarshal(v, &c.VercelSyncHash)
|
||||
case "_vercel_sync_time":
|
||||
_ = json.Unmarshal(v, &c.VercelSyncTime)
|
||||
default:
|
||||
var anyVal any
|
||||
if err := json.Unmarshal(v, &anyVal); err == nil {
|
||||
c.AdditionalFields[k] = anyVal
|
||||
}
|
||||
}
|
||||
}
|
||||
return nil
|
||||
}
|
||||
|
||||
func (c Config) Clone() Config {
|
||||
clone := Config{
|
||||
Keys: slices.Clone(c.Keys),
|
||||
Accounts: slices.Clone(c.Accounts),
|
||||
ClaudeMapping: cloneStringMap(c.ClaudeMapping),
|
||||
ClaudeModelMap: cloneStringMap(c.ClaudeModelMap),
|
||||
VercelSyncHash: c.VercelSyncHash,
|
||||
VercelSyncTime: c.VercelSyncTime,
|
||||
AdditionalFields: map[string]any{},
|
||||
}
|
||||
for k, v := range c.AdditionalFields {
|
||||
clone.AdditionalFields[k] = v
|
||||
}
|
||||
return clone
|
||||
}
|
||||
|
||||
// cloneStringMap copies a string map. Empty or nil input yields nil so
// that omitempty JSON tags continue to suppress the field.
func cloneStringMap(in map[string]string) map[string]string {
	if len(in) == 0 {
		return nil
	}
	dup := make(map[string]string, len(in))
	for key, val := range in {
		dup[key] = val
	}
	return dup
}
|
||||
|
||||
// Store is a concurrency-safe holder for the live Config plus derived
// lookup indexes. Configs sourced from environment variables (fromEnv)
// are never written back to disk.
type Store struct {
	mu      sync.RWMutex
	cfg     Config
	path    string // on-disk config location (unused when fromEnv)
	fromEnv bool   // config came from DS2API_CONFIG_JSON / CONFIG_JSON

	keyMap map[string]struct{} // O(1) API key lookup index
	accMap map[string]int      // O(1) account lookup: identifier -> slice index
}
|
||||
|
||||
// BaseDir returns the process working directory, or "." when it cannot be
// determined.
func BaseDir() string {
	cwd, err := os.Getwd()
	if err != nil {
		return "."
	}
	return cwd
}

// IsVercel reports whether the process appears to be running on Vercel;
// the legacy NOW_REGION variable is honored as well.
func IsVercel() bool {
	return strings.TrimSpace(os.Getenv("VERCEL")) != "" || strings.TrimSpace(os.Getenv("NOW_REGION")) != ""
}
|
||||
|
||||
func ResolvePath(envKey, defaultRel string) string {
|
||||
raw := strings.TrimSpace(os.Getenv(envKey))
|
||||
if raw != "" {
|
||||
if filepath.IsAbs(raw) {
|
||||
return raw
|
||||
}
|
||||
return filepath.Join(BaseDir(), raw)
|
||||
}
|
||||
return filepath.Join(BaseDir(), defaultRel)
|
||||
}
|
||||
|
||||
// ConfigPath returns the config file location (override: DS2API_CONFIG_PATH).
func ConfigPath() string {
	return ResolvePath("DS2API_CONFIG_PATH", "config.json")
}

// WASMPath returns the PoW WASM module location (override: DS2API_WASM_PATH).
func WASMPath() string {
	return ResolvePath("DS2API_WASM_PATH", "sha3_wasm_bg.7b9ca65ddd.wasm")
}

// StaticAdminDir returns the admin UI asset directory
// (override: DS2API_STATIC_ADMIN_DIR).
func StaticAdminDir() string {
	return ResolvePath("DS2API_STATIC_ADMIN_DIR", "static/admin")
}
|
||||
|
||||
// LoadStore loads the configuration (environment first, then disk), logs
// load problems without failing startup, and returns a Store with its
// lookup indexes built.
func LoadStore() *Store {
	cfg, fromEnv, err := loadConfig()
	if err != nil {
		Logger.Warn("[config] load failed", "error", err)
	}
	if len(cfg.Keys) == 0 && len(cfg.Accounts) == 0 {
		Logger.Warn("[config] empty config loaded")
	}
	s := &Store{cfg: cfg, path: ConfigPath(), fromEnv: fromEnv}
	s.rebuildIndexes()
	return s
}
|
||||
|
||||
// rebuildIndexes must be called with the lock already held (or during init).
|
||||
func (s *Store) rebuildIndexes() {
|
||||
s.keyMap = make(map[string]struct{}, len(s.cfg.Keys))
|
||||
for _, k := range s.cfg.Keys {
|
||||
s.keyMap[k] = struct{}{}
|
||||
}
|
||||
s.accMap = make(map[string]int, len(s.cfg.Accounts))
|
||||
for i, acc := range s.cfg.Accounts {
|
||||
id := acc.Identifier()
|
||||
if id != "" {
|
||||
s.accMap[id] = i
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
func loadConfig() (Config, bool, error) {
|
||||
rawCfg := strings.TrimSpace(os.Getenv("DS2API_CONFIG_JSON"))
|
||||
if rawCfg == "" {
|
||||
rawCfg = strings.TrimSpace(os.Getenv("CONFIG_JSON"))
|
||||
}
|
||||
if rawCfg != "" {
|
||||
cfg, err := parseConfigString(rawCfg)
|
||||
return cfg, true, err
|
||||
}
|
||||
|
||||
content, err := os.ReadFile(ConfigPath())
|
||||
if err != nil {
|
||||
return Config{}, false, err
|
||||
}
|
||||
var cfg Config
|
||||
if err := json.Unmarshal(content, &cfg); err != nil {
|
||||
return Config{}, false, err
|
||||
}
|
||||
return cfg, false, nil
|
||||
}
|
||||
|
||||
func parseConfigString(raw string) (Config, error) {
|
||||
var cfg Config
|
||||
if err := json.Unmarshal([]byte(raw), &cfg); err == nil {
|
||||
return cfg, nil
|
||||
}
|
||||
decoded, err := base64.StdEncoding.DecodeString(raw)
|
||||
if err != nil {
|
||||
return Config{}, err
|
||||
}
|
||||
if err := json.Unmarshal(decoded, &cfg); err != nil {
|
||||
return Config{}, err
|
||||
}
|
||||
return cfg, nil
|
||||
}
|
||||
|
||||
// Snapshot returns a deep copy of the current config.
func (s *Store) Snapshot() Config {
	s.mu.RLock()
	defer s.mu.RUnlock()
	return s.cfg.Clone()
}

// HasAPIKey reports whether k is a configured API key (O(1) via keyMap).
func (s *Store) HasAPIKey(k string) bool {
	s.mu.RLock()
	defer s.mu.RUnlock()
	_, ok := s.keyMap[k]
	return ok
}

// Keys returns a copy of the configured API keys.
func (s *Store) Keys() []string {
	s.mu.RLock()
	defer s.mu.RUnlock()
	return slices.Clone(s.cfg.Keys)
}

// Accounts returns a shallow copy of the configured accounts.
func (s *Store) Accounts() []Account {
	s.mu.RLock()
	defer s.mu.RUnlock()
	return slices.Clone(s.cfg.Accounts)
}
|
||||
|
||||
func (s *Store) FindAccount(identifier string) (Account, bool) {
|
||||
identifier = strings.TrimSpace(identifier)
|
||||
s.mu.RLock()
|
||||
defer s.mu.RUnlock()
|
||||
if idx, ok := s.findAccountIndexLocked(identifier); ok {
|
||||
return s.cfg.Accounts[idx], true
|
||||
}
|
||||
return Account{}, false
|
||||
}
|
||||
|
||||
// UpdateAccountToken replaces the stored token for the account addressed
// by identifier and persists the config. The requested identifier, the
// pre-update derived identifier, and the post-update derived identifier
// are all kept in accMap so long-lived queues holding stale ids resolve.
func (s *Store) UpdateAccountToken(identifier, token string) error {
	identifier = strings.TrimSpace(identifier)
	s.mu.Lock()
	defer s.mu.Unlock()
	idx, ok := s.findAccountIndexLocked(identifier)
	if !ok {
		return errors.New("account not found")
	}
	oldID := s.cfg.Accounts[idx].Identifier()
	s.cfg.Accounts[idx].Token = token
	newID := s.cfg.Accounts[idx].Identifier()
	// Keep historical aliases usable for long-lived queues while also adding
	// the latest identifier after token refresh.
	if identifier != "" {
		s.accMap[identifier] = idx
	}
	if oldID != "" {
		s.accMap[oldID] = idx
	}
	if newID != "" {
		s.accMap[newID] = idx
	}
	return s.saveLocked()
}
|
||||
|
||||
// Replace swaps in a complete new config (deep-copied), rebuilds the
// lookup indexes, and persists the result.
func (s *Store) Replace(cfg Config) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.cfg = cfg.Clone()
	s.rebuildIndexes()
	return s.saveLocked()
}

// Update applies mutator to a copy of the config under the write lock; on
// success the copy becomes current and is persisted. If mutator errors,
// the stored config is left untouched.
func (s *Store) Update(mutator func(*Config) error) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	cfg := s.cfg.Clone()
	if err := mutator(&cfg); err != nil {
		return err
	}
	s.cfg = cfg
	s.rebuildIndexes()
	return s.saveLocked()
}
|
||||
|
||||
func (s *Store) Save() error {
|
||||
s.mu.Lock()
|
||||
defer s.mu.Unlock()
|
||||
if s.fromEnv {
|
||||
Logger.Info("[save_config] source from env, skip write")
|
||||
return nil
|
||||
}
|
||||
b, err := json.MarshalIndent(s.cfg, "", " ")
|
||||
if err != nil {
|
||||
return err
|
||||
}
|
||||
return os.WriteFile(s.path, b, 0o644)
|
||||
}
|
||||
|
||||
func (s *Store) saveLocked() error {
|
||||
if s.fromEnv {
|
||||
Logger.Info("[save_config] source from env, skip write")
|
||||
return nil
|
||||
}
|
||||
b, err := json.MarshalIndent(s.cfg, "", " ")
|
||||
if err != nil {
|
||||
return err
|
||||
}
|
||||
return os.WriteFile(s.path, b, 0o644)
|
||||
}
|
||||
|
||||
// findAccountIndexLocked expects the store lock to already be held.
|
||||
func (s *Store) findAccountIndexLocked(identifier string) (int, bool) {
|
||||
if idx, ok := s.accMap[identifier]; ok && idx >= 0 && idx < len(s.cfg.Accounts) {
|
||||
return idx, true
|
||||
}
|
||||
// Fallback for token-only accounts whose derived identifier changed after
|
||||
// a token refresh; this preserves correctness on map misses.
|
||||
for i, acc := range s.cfg.Accounts {
|
||||
if acc.Identifier() == identifier {
|
||||
return i, true
|
||||
}
|
||||
}
|
||||
return -1, false
|
||||
}
|
||||
|
||||
// IsEnvBacked reports whether the config came from the environment, in
// which case Save/saveLocked are no-ops.
func (s *Store) IsEnvBacked() bool {
	s.mu.RLock()
	defer s.mu.RUnlock()
	return s.fromEnv
}

// SetVercelSync records the hash and timestamp of the last Vercel config
// sync and persists them.
func (s *Store) SetVercelSync(hash string, ts int64) error {
	return s.Update(func(c *Config) error {
		c.VercelSyncHash = hash
		c.VercelSyncTime = ts
		return nil
	})
}
|
||||
|
||||
func (s *Store) ExportJSONAndBase64() (string, string, error) {
|
||||
s.mu.RLock()
|
||||
defer s.mu.RUnlock()
|
||||
b, err := json.Marshal(s.cfg)
|
||||
if err != nil {
|
||||
return "", "", err
|
||||
}
|
||||
return string(b), base64.StdEncoding.EncodeToString(b), nil
|
||||
}
|
||||
|
||||
func (s *Store) ClaudeMapping() map[string]string {
|
||||
s.mu.RLock()
|
||||
defer s.mu.RUnlock()
|
||||
if len(s.cfg.ClaudeModelMap) > 0 {
|
||||
return cloneStringMap(s.cfg.ClaudeModelMap)
|
||||
}
|
||||
if len(s.cfg.ClaudeMapping) > 0 {
|
||||
return cloneStringMap(s.cfg.ClaudeMapping)
|
||||
}
|
||||
return map[string]string{"fast": "deepseek-chat", "slow": "deepseek-reasoner"}
|
||||
}
|
||||
72
internal/config/config_test.go
Normal file
72
internal/config/config_test.go
Normal file
@@ -0,0 +1,72 @@
|
||||
package config

import (
	"strings"
	"testing"
)

// TestAccountIdentifierFallsBackToTokenHash verifies that a token-only
// account derives a "token:"-prefixed synthetic identifier of the expected
// length (prefix + 8 hex-encoded bytes = 16 hex chars).
func TestAccountIdentifierFallsBackToTokenHash(t *testing.T) {
	acc := Account{Token: "example-token-value"}
	id := acc.Identifier()
	if !strings.HasPrefix(id, "token:") {
		t.Fatalf("expected token-prefixed identifier, got %q", id)
	}
	if len(id) != len("token:")+16 {
		t.Fatalf("unexpected identifier length: %d (%q)", len(id), id)
	}
}

// TestStoreFindAccountWithTokenOnlyIdentifier checks that a token-only
// account loaded from DS2API_CONFIG_JSON is findable via its synthetic id.
func TestStoreFindAccountWithTokenOnlyIdentifier(t *testing.T) {
	t.Setenv("DS2API_CONFIG_JSON", `{
  "keys":["k1"],
  "accounts":[{"token":"token-only-account"}]
}`)

	store := LoadStore()
	accounts := store.Accounts()
	if len(accounts) != 1 {
		t.Fatalf("expected 1 account, got %d", len(accounts))
	}
	id := accounts[0].Identifier()
	if id == "" {
		t.Fatalf("expected synthetic identifier for token-only account")
	}
	found, ok := store.FindAccount(id)
	if !ok {
		t.Fatalf("expected FindAccount to locate token-only account by synthetic id")
	}
	if found.Token != "token-only-account" {
		t.Fatalf("unexpected token value: %q", found.Token)
	}
}

// TestStoreUpdateAccountTokenKeepsOldAndNewIdentifierResolvable asserts the
// accMap aliasing behavior in UpdateAccountToken: after a refresh both the
// pre-refresh and post-refresh identifiers resolve to the same account.
func TestStoreUpdateAccountTokenKeepsOldAndNewIdentifierResolvable(t *testing.T) {
	t.Setenv("DS2API_CONFIG_JSON", `{
  "accounts":[{"token":"old-token"}]
}`)

	store := LoadStore()
	before := store.Accounts()
	if len(before) != 1 {
		t.Fatalf("expected 1 account, got %d", len(before))
	}
	oldID := before[0].Identifier()
	if oldID == "" {
		t.Fatal("expected old identifier")
	}
	if err := store.UpdateAccountToken(oldID, "new-token"); err != nil {
		t.Fatalf("update token failed: %v", err)
	}

	after := store.Accounts()
	newID := after[0].Identifier()
	if newID == "" || newID == oldID {
		t.Fatalf("expected changed identifier, old=%q new=%q", oldID, newID)
	}
	if got, ok := store.FindAccount(newID); !ok || got.Token != "new-token" {
		t.Fatalf("expected find by new identifier")
	}
	if got, ok := store.FindAccount(oldID); !ok || got.Token != "new-token" {
		t.Fatalf("expected find by old identifier alias")
	}
}
|
||||
90
internal/config/models.go
Normal file
90
internal/config/models.go
Normal file
@@ -0,0 +1,90 @@
|
||||
package config
|
||||
|
||||
// ModelInfo is an OpenAI-style model-list entry returned by the models
// endpoints.
type ModelInfo struct {
	ID         string `json:"id"`
	Object     string `json:"object"`
	Created    int64  `json:"created"`
	OwnedBy    string `json:"owned_by"`
	Permission []any  `json:"permission,omitempty"`
}

// DeepSeekModels are the model ids this proxy serves; the "-search"
// variants enable web search and "reasoner" enables thinking (see
// GetModelConfig).
var DeepSeekModels = []ModelInfo{
	{ID: "deepseek-chat", Object: "model", Created: 1677610602, OwnedBy: "deepseek", Permission: []any{}},
	{ID: "deepseek-reasoner", Object: "model", Created: 1677610602, OwnedBy: "deepseek", Permission: []any{}},
	{ID: "deepseek-chat-search", Object: "model", Created: 1677610602, OwnedBy: "deepseek", Permission: []any{}},
	{ID: "deepseek-reasoner-search", Object: "model", Created: 1677610602, OwnedBy: "deepseek", Permission: []any{}},
}

// ClaudeModels lists every Anthropic model id accepted for compatibility,
// including retired ones; requests for them are translated elsewhere.
var ClaudeModels = []ModelInfo{
	// Current aliases
	{ID: "claude-opus-4-6", Object: "model", Created: 1715635200, OwnedBy: "anthropic"},
	{ID: "claude-sonnet-4-5", Object: "model", Created: 1715635200, OwnedBy: "anthropic"},
	{ID: "claude-haiku-4-5", Object: "model", Created: 1715635200, OwnedBy: "anthropic"},

	// Current snapshots
	{ID: "claude-opus-4-5-20251101", Object: "model", Created: 1715635200, OwnedBy: "anthropic"},
	{ID: "claude-opus-4-1", Object: "model", Created: 1715635200, OwnedBy: "anthropic"},
	{ID: "claude-opus-4-1-20250805", Object: "model", Created: 1715635200, OwnedBy: "anthropic"},
	{ID: "claude-opus-4-0", Object: "model", Created: 1715635200, OwnedBy: "anthropic"},
	{ID: "claude-opus-4-20250514", Object: "model", Created: 1715635200, OwnedBy: "anthropic"},
	{ID: "claude-sonnet-4-5-20250929", Object: "model", Created: 1715635200, OwnedBy: "anthropic"},
	{ID: "claude-sonnet-4-0", Object: "model", Created: 1715635200, OwnedBy: "anthropic"},
	{ID: "claude-sonnet-4-20250514", Object: "model", Created: 1715635200, OwnedBy: "anthropic"},
	{ID: "claude-haiku-4-5-20251001", Object: "model", Created: 1715635200, OwnedBy: "anthropic"},

	// Claude 3.x (legacy/deprecated snapshots and aliases)
	{ID: "claude-3-7-sonnet-latest", Object: "model", Created: 1715635200, OwnedBy: "anthropic"},
	{ID: "claude-3-7-sonnet-20250219", Object: "model", Created: 1715635200, OwnedBy: "anthropic"},
	{ID: "claude-3-5-sonnet-latest", Object: "model", Created: 1715635200, OwnedBy: "anthropic"},
	{ID: "claude-3-5-sonnet-20240620", Object: "model", Created: 1715635200, OwnedBy: "anthropic"},
	{ID: "claude-3-5-sonnet-20241022", Object: "model", Created: 1715635200, OwnedBy: "anthropic"},
	{ID: "claude-3-opus-20240229", Object: "model", Created: 1715635200, OwnedBy: "anthropic"},
	{ID: "claude-3-sonnet-20240229", Object: "model", Created: 1715635200, OwnedBy: "anthropic"},
	{ID: "claude-3-5-haiku-latest", Object: "model", Created: 1715635200, OwnedBy: "anthropic"},
	{ID: "claude-3-5-haiku-20241022", Object: "model", Created: 1715635200, OwnedBy: "anthropic"},
	{ID: "claude-3-haiku-20240307", Object: "model", Created: 1715635200, OwnedBy: "anthropic"},

	// Claude 2.x and 1.x (retired but accepted for compatibility)
	{ID: "claude-2.1", Object: "model", Created: 1715635200, OwnedBy: "anthropic"},
	{ID: "claude-2.0", Object: "model", Created: 1715635200, OwnedBy: "anthropic"},
	{ID: "claude-1.3", Object: "model", Created: 1715635200, OwnedBy: "anthropic"},
	{ID: "claude-1.2", Object: "model", Created: 1715635200, OwnedBy: "anthropic"},
	{ID: "claude-1.1", Object: "model", Created: 1715635200, OwnedBy: "anthropic"},
	{ID: "claude-1.0", Object: "model", Created: 1715635200, OwnedBy: "anthropic"},
	{ID: "claude-instant-1.2", Object: "model", Created: 1715635200, OwnedBy: "anthropic"},
	{ID: "claude-instant-1.1", Object: "model", Created: 1715635200, OwnedBy: "anthropic"},
	{ID: "claude-instant-1.0", Object: "model", Created: 1715635200, OwnedBy: "anthropic"},
}
|
||||
|
||||
func GetModelConfig(model string) (thinking bool, search bool, ok bool) {
|
||||
switch lower(model) {
|
||||
case "deepseek-chat":
|
||||
return false, false, true
|
||||
case "deepseek-reasoner":
|
||||
return true, false, true
|
||||
case "deepseek-chat-search":
|
||||
return false, true, true
|
||||
case "deepseek-reasoner-search":
|
||||
return true, true, true
|
||||
default:
|
||||
return false, false, false
|
||||
}
|
||||
}
|
||||
|
||||
// lower ASCII-lowercases s without pulling in the strings package.
// Non-ASCII bytes are left untouched.
func lower(s string) string {
	out := []byte(s)
	for i := range out {
		if out[i] >= 'A' && out[i] <= 'Z' {
			out[i] |= 0x20 // set the lowercase bit (equivalent to +32 for A-Z)
		}
	}
	return string(out)
}
|
||||
|
||||
// OpenAIModelsResponse returns the OpenAI-shaped model list payload.
func OpenAIModelsResponse() map[string]any {
	return map[string]any{"object": "list", "data": DeepSeekModels}
}

// ClaudeModelsResponse returns the model list payload for the Claude
// compatibility endpoint.
func ClaudeModelsResponse() map[string]any {
	return map[string]any{"object": "list", "data": ClaudeModels}
}
|
||||
BIN
internal/deepseek/assets/sha3_wasm_bg.7b9ca65ddd.wasm
Normal file
BIN
internal/deepseek/assets/sha3_wasm_bg.7b9ca65ddd.wasm
Normal file
Binary file not shown.
337
internal/deepseek/client.go
Normal file
337
internal/deepseek/client.go
Normal file
@@ -0,0 +1,337 @@
|
||||
package deepseek
|
||||
|
||||
import (
|
||||
"bufio"
|
||||
"bytes"
|
||||
"compress/gzip"
|
||||
"context"
|
||||
"encoding/json"
|
||||
"errors"
|
||||
"fmt"
|
||||
"io"
|
||||
"net/http"
|
||||
"strings"
|
||||
"time"
|
||||
|
||||
"ds2api/internal/auth"
|
||||
"ds2api/internal/config"
|
||||
trans "ds2api/internal/deepseek/transport"
|
||||
"ds2api/internal/util"
|
||||
|
||||
"github.com/andybalholm/brotli"
|
||||
)
|
||||
|
||||
// intFrom is a package-internal alias for the shared util version.
var intFrom = util.IntFrom

// Client talks to the DeepSeek web API. Requests go through the
// fingerprinted transports (trans.Doer) first and fall back to plain
// net/http clients on transport errors; stream variants have no timeout.
type Client struct {
	Store      *config.Store
	Auth       *auth.Resolver
	regular    trans.Doer   // fingerprinted transport, 60s timeout
	stream     trans.Doer   // fingerprinted transport, no timeout (streaming)
	fallback   *http.Client // std-transport fallback, 60s timeout
	fallbackS  *http.Client // std-transport fallback for streaming, no timeout
	powSolver  *PowSolver   // solves upstream PoW challenges
	maxRetries int          // default attempt budget for retried calls
}
||||
|
||||
// NewClient builds a Client with paired fingerprinted/standard transports
// and a PoW solver backed by the configured WASM module path.
func NewClient(store *config.Store, resolver *auth.Resolver) *Client {
	return &Client{
		Store:      store,
		Auth:       resolver,
		regular:    trans.New(60 * time.Second),
		stream:     trans.New(0), // 0 = no timeout, required for long streams
		fallback:   &http.Client{Timeout: 60 * time.Second},
		fallbackS:  &http.Client{Timeout: 0},
		powSolver:  NewPowSolver(config.WASMPath()),
		maxRetries: 3,
	}
}

// PreloadPow eagerly initializes the PoW WASM runtime so the first chat
// request does not pay the compile/instantiate cost.
func (c *Client) PreloadPow(ctx context.Context) error {
	return c.powSolver.init(ctx)
}
||||
|
||||
// Login authenticates an account (email or mobile plus password) against
// the DeepSeek login endpoint and returns the session token found at
// data.biz_data.user.token.
func (c *Client) Login(ctx context.Context, acc config.Account) (string, error) {
	payload := map[string]any{
		"password":  strings.TrimSpace(acc.Password),
		"device_id": "deepseek_to_api",
		"os":        "android",
	}
	if email := strings.TrimSpace(acc.Email); email != "" {
		payload["email"] = email
	} else if mobile := strings.TrimSpace(acc.Mobile); mobile != "" {
		payload["mobile"] = mobile
		// NOTE(review): area_code is sent as explicit null — presumably the
		// upstream API requires the key to be present; confirm before removing.
		payload["area_code"] = nil
	} else {
		return "", errors.New("missing email/mobile")
	}
	resp, err := c.postJSON(ctx, c.regular, DeepSeekLoginURL, BaseHeaders, payload)
	if err != nil {
		return "", err
	}
	// Both the outer code and the inner biz_code must be zero for success.
	code := intFrom(resp["code"])
	if code != 0 {
		return "", fmt.Errorf("login failed: %v", resp["msg"])
	}
	data, _ := resp["data"].(map[string]any)
	if intFrom(data["biz_code"]) != 0 {
		return "", fmt.Errorf("login failed: %v", data["biz_msg"])
	}
	bizData, _ := data["biz_data"].(map[string]any)
	user, _ := bizData["user"].(map[string]any)
	token, _ := user["token"].(string)
	if strings.TrimSpace(token) == "" {
		return "", errors.New("missing login token")
	}
	return token, nil
}
|
||||
|
||||
// CreateSession opens a new chat session and returns its id. On failures
// with pool-managed tokens it first tries one token refresh per account
// (the refresh retry does not consume an attempt), then rotates to another
// account; total tries are bounded by maxAttempts (default c.maxRetries).
func (c *Client) CreateSession(ctx context.Context, a *auth.RequestAuth, maxAttempts int) (string, error) {
	if maxAttempts <= 0 {
		maxAttempts = c.maxRetries
	}
	attempts := 0
	refreshed := false // at most one token refresh per account
	for attempts < maxAttempts {
		headers := c.authHeaders(a.DeepSeekToken)
		resp, status, err := c.postJSONWithStatus(ctx, c.regular, DeepSeekCreateSessionURL, headers, map[string]any{"agent": "chat"})
		if err != nil {
			config.Logger.Warn("[create_session] request error", "error", err, "account", a.AccountID)
			attempts++
			continue
		}
		code := intFrom(resp["code"])
		if status == http.StatusOK && code == 0 {
			data, _ := resp["data"].(map[string]any)
			bizData, _ := data["biz_data"].(map[string]any)
			sessionID, _ := bizData["id"].(string)
			if sessionID != "" {
				return sessionID, nil
			}
		}
		msg, _ := resp["msg"].(string)
		config.Logger.Warn("[create_session] failed", "status", status, "code", code, "msg", msg, "use_config_token", a.UseConfigToken, "account", a.AccountID)
		if a.UseConfigToken {
			if isTokenInvalid(status, code, msg) && !refreshed {
				if c.Auth.RefreshToken(ctx, a) {
					refreshed = true
					continue
				}
			}
			// Rotate to a different pool account; the new account gets its
			// own refresh budget.
			if c.Auth.SwitchAccount(ctx, a) {
				refreshed = false
				attempts++
				continue
			}
		}
		attempts++
	}
	return "", errors.New("create session failed")
}
||||
|
||||
func (c *Client) GetPow(ctx context.Context, a *auth.RequestAuth, maxAttempts int) (string, error) {
|
||||
if maxAttempts <= 0 {
|
||||
maxAttempts = c.maxRetries
|
||||
}
|
||||
attempts := 0
|
||||
for attempts < maxAttempts {
|
||||
headers := c.authHeaders(a.DeepSeekToken)
|
||||
resp, status, err := c.postJSONWithStatus(ctx, c.regular, DeepSeekCreatePowURL, headers, map[string]any{"target_path": "/api/v0/chat/completion"})
|
||||
if err != nil {
|
||||
config.Logger.Warn("[get_pow] request error", "error", err, "account", a.AccountID)
|
||||
attempts++
|
||||
continue
|
||||
}
|
||||
code := intFrom(resp["code"])
|
||||
if status == http.StatusOK && code == 0 {
|
||||
data, _ := resp["data"].(map[string]any)
|
||||
bizData, _ := data["biz_data"].(map[string]any)
|
||||
challenge, _ := bizData["challenge"].(map[string]any)
|
||||
answer, err := c.powSolver.Compute(ctx, challenge)
|
||||
if err != nil {
|
||||
attempts++
|
||||
continue
|
||||
}
|
||||
return BuildPowHeader(challenge, answer)
|
||||
}
|
||||
msg, _ := resp["msg"].(string)
|
||||
config.Logger.Warn("[get_pow] failed", "status", status, "code", code, "msg", msg, "use_config_token", a.UseConfigToken, "account", a.AccountID)
|
||||
if a.UseConfigToken {
|
||||
if isTokenInvalid(status, code, msg) {
|
||||
if c.Auth.RefreshToken(ctx, a) {
|
||||
continue
|
||||
}
|
||||
}
|
||||
if c.Auth.SwitchAccount(ctx, a) {
|
||||
attempts++
|
||||
continue
|
||||
}
|
||||
}
|
||||
attempts++
|
||||
}
|
||||
return "", errors.New("get pow failed")
|
||||
}
|
||||
|
||||
func (c *Client) CallCompletion(ctx context.Context, a *auth.RequestAuth, payload map[string]any, powResp string, maxAttempts int) (*http.Response, error) {
|
||||
if maxAttempts <= 0 {
|
||||
maxAttempts = c.maxRetries
|
||||
}
|
||||
headers := c.authHeaders(a.DeepSeekToken)
|
||||
headers["x-ds-pow-response"] = powResp
|
||||
attempts := 0
|
||||
for attempts < maxAttempts {
|
||||
resp, err := c.streamPost(ctx, DeepSeekCompletionURL, headers, payload)
|
||||
if err != nil {
|
||||
attempts++
|
||||
time.Sleep(time.Second)
|
||||
continue
|
||||
}
|
||||
if resp.StatusCode == http.StatusOK {
|
||||
return resp, nil
|
||||
}
|
||||
_ = resp.Body.Close()
|
||||
attempts++
|
||||
time.Sleep(time.Second)
|
||||
}
|
||||
return nil, errors.New("completion failed")
|
||||
}
|
||||
|
||||
func (c *Client) postJSON(ctx context.Context, doer trans.Doer, url string, headers map[string]string, payload any) (map[string]any, error) {
|
||||
body, status, err := c.postJSONWithStatus(ctx, doer, url, headers, payload)
|
||||
if err != nil {
|
||||
return nil, err
|
||||
}
|
||||
if status == 0 {
|
||||
return nil, errors.New("request failed")
|
||||
}
|
||||
return body, nil
|
||||
}
|
||||
|
||||
// postJSONWithStatus POSTs payload as JSON via doer, falling back to the
// plain http.Client when the fingerprinted transport errors. The response
// body is decompressed (see readResponseBody) and best-effort JSON-decoded:
// a decode failure is logged and yields an empty map rather than an error,
// so callers still observe the HTTP status.
func (c *Client) postJSONWithStatus(ctx context.Context, doer trans.Doer, url string, headers map[string]string, payload any) (map[string]any, int, error) {
	b, err := json.Marshal(payload)
	if err != nil {
		return nil, 0, err
	}
	req, err := http.NewRequestWithContext(ctx, http.MethodPost, url, bytes.NewReader(b))
	if err != nil {
		return nil, 0, err
	}
	for k, v := range headers {
		req.Header.Set(k, v)
	}
	resp, err := doer.Do(req)
	if err != nil {
		config.Logger.Warn("[deepseek] fingerprint request failed, fallback to std transport", "url", url, "error", err)
		// Rebuild the request: the first attempt consumed the body reader.
		req2, reqErr := http.NewRequestWithContext(ctx, http.MethodPost, url, bytes.NewReader(b))
		if reqErr != nil {
			return nil, 0, err // report the original transport error
		}
		for k, v := range headers {
			req2.Header.Set(k, v)
		}
		resp, err = c.fallback.Do(req2)
		if err != nil {
			return nil, 0, err
		}
	}
	defer resp.Body.Close()
	payloadBytes, err := readResponseBody(resp)
	if err != nil {
		return nil, resp.StatusCode, err
	}
	out := map[string]any{}
	if len(payloadBytes) > 0 {
		if err := json.Unmarshal(payloadBytes, &out); err != nil {
			config.Logger.Warn("[deepseek] json parse failed", "url", url, "status", resp.StatusCode, "content_encoding", resp.Header.Get("Content-Encoding"), "preview", preview(payloadBytes))
		}
	}
	return out, resp.StatusCode, nil
}
|
||||
|
||||
// streamPost is the streaming variant of postJSONWithStatus: it returns
// the raw *http.Response (caller must close Body) and falls back to the
// plain no-timeout client when the fingerprinted stream transport errors.
func (c *Client) streamPost(ctx context.Context, url string, headers map[string]string, payload any) (*http.Response, error) {
	b, err := json.Marshal(payload)
	if err != nil {
		return nil, err
	}
	req, err := http.NewRequestWithContext(ctx, http.MethodPost, url, bytes.NewReader(b))
	if err != nil {
		return nil, err
	}
	for k, v := range headers {
		req.Header.Set(k, v)
	}
	resp, err := c.stream.Do(req)
	if err != nil {
		config.Logger.Warn("[deepseek] fingerprint stream request failed, fallback to std transport", "url", url, "error", err)
		// Rebuild the request: the first attempt consumed the body reader.
		req2, reqErr := http.NewRequestWithContext(ctx, http.MethodPost, url, bytes.NewReader(b))
		if reqErr != nil {
			return nil, err // report the original transport error
		}
		for k, v := range headers {
			req2.Header.Set(k, v)
		}
		return c.fallbackS.Do(req2)
	}
	return resp, nil
}
|
||||
|
||||
func (c *Client) authHeaders(token string) map[string]string {
|
||||
headers := make(map[string]string, len(BaseHeaders)+1)
|
||||
for k, v := range BaseHeaders {
|
||||
headers[k] = v
|
||||
}
|
||||
headers["authorization"] = "Bearer " + token
|
||||
return headers
|
||||
}
|
||||
|
||||
// isTokenInvalid heuristically classifies an upstream failure as an
// expired/invalid token: 401/403 HTTP status, the known auth biz codes, or
// an error message mentioning the token.
func isTokenInvalid(status int, code int, msg string) bool {
	switch status {
	case http.StatusUnauthorized, http.StatusForbidden:
		return true
	}
	switch code {
	case 40001, 40002, 40003:
		return true
	}
	lowered := strings.ToLower(msg)
	return strings.Contains(lowered, "token") || strings.Contains(lowered, "unauthorized")
}
|
||||
|
||||
// readResponseBody reads resp.Body fully, transparently decompressing gzip
// and brotli payloads based on Content-Encoding; any other (or absent)
// encoding is read as-is.
func readResponseBody(resp *http.Response) ([]byte, error) {
	encoding := strings.ToLower(strings.TrimSpace(resp.Header.Get("Content-Encoding")))
	var reader io.Reader = resp.Body
	switch encoding {
	case "gzip":
		gz, err := gzip.NewReader(resp.Body)
		if err != nil {
			return nil, err
		}
		defer gz.Close()
		reader = gz
	case "br":
		reader = brotli.NewReader(resp.Body)
	}
	return io.ReadAll(reader)
}
|
||||
|
||||
// preview returns up to 160 bytes of the whitespace-trimmed body, for log
// output only.
func preview(b []byte) string {
	const limit = 160
	s := strings.TrimSpace(string(b))
	if len(s) <= limit {
		return s
	}
	return s[:limit]
}
|
||||
|
||||
// ScanSSELines feeds each line from resp.Body to onLine until onLine
// returns false or the stream ends. Lines up to 2 MiB are supported; any
// scan error is returned.
func ScanSSELines(resp *http.Response, onLine func([]byte) bool) error {
	sc := bufio.NewScanner(resp.Body)
	sc.Buffer(make([]byte, 0, 64*1024), 2*1024*1024)
	for sc.Scan() {
		if !onLine(sc.Bytes()) {
			break
		}
	}
	return sc.Err()
}
|
||||
26
internal/deepseek/constants.go
Normal file
26
internal/deepseek/constants.go
Normal file
@@ -0,0 +1,26 @@
|
||||
package deepseek
|
||||
|
||||
// Upstream DeepSeek web-app endpoints.
const (
	DeepSeekHost             = "chat.deepseek.com"
	DeepSeekLoginURL         = "https://chat.deepseek.com/api/v0/users/login"
	DeepSeekCreateSessionURL = "https://chat.deepseek.com/api/v0/chat_session/create"
	DeepSeekCreatePowURL     = "https://chat.deepseek.com/api/v0/chat/create_pow_challenge"
	DeepSeekCompletionURL    = "https://chat.deepseek.com/api/v0/chat/completion"
)

// BaseHeaders mimics the official Android client's request headers.
var BaseHeaders = map[string]string{
	"Host":              "chat.deepseek.com",
	"User-Agent":        "DeepSeek/1.6.11 Android/35",
	"Accept":            "application/json",
	"Content-Type":      "application/json",
	"x-client-platform": "android",
	"x-client-version":  "1.6.11",
	"x-client-locale":   "zh_CN",
	"accept-charset":    "UTF-8",
}

// Streaming keepalive tuning. NOTE(review): units (presumably seconds for
// the timeouts) are not visible here — confirm against the streaming
// handler that consumes these constants.
const (
	KeepAliveTimeout  = 5
	StreamIdleTimeout = 30
	MaxKeepaliveCount = 10
)
|
||||
6
internal/deepseek/embedded_pow.go
Normal file
6
internal/deepseek/embedded_pow.go
Normal file
@@ -0,0 +1,6 @@
|
||||
package deepseek

import _ "embed"

// embeddedWASM is a build-time copy of the DeepSeek PoW hash module, used
// by PowSolver as a fallback when the on-disk WASM file cannot be read.
//
//go:embed assets/sha3_wasm_bg.7b9ca65ddd.wasm
var embeddedWASM []byte
||||
288
internal/deepseek/pow.go
Normal file
288
internal/deepseek/pow.go
Normal file
@@ -0,0 +1,288 @@
|
||||
package deepseek
|
||||
|
||||
import (
|
||||
"context"
|
||||
"encoding/base64"
|
||||
"encoding/binary"
|
||||
"encoding/json"
|
||||
"errors"
|
||||
"math"
|
||||
"os"
|
||||
stdruntime "runtime"
|
||||
"strconv"
|
||||
"sync"
|
||||
|
||||
"ds2api/internal/config"
|
||||
|
||||
"github.com/tetratelabs/wazero"
|
||||
"github.com/tetratelabs/wazero/api"
|
||||
)
|
||||
|
||||
// PowSolver solves DeepSeek PoW challenges by running the upstream hash
// WASM module under wazero. Initialization is lazy and happens exactly
// once; instantiated modules are pooled for concurrent use.
type PowSolver struct {
	wasmPath string
	once     sync.Once
	err      error // sticky init error, returned by every later call

	runtime  wazero.Runtime
	compiled wazero.CompiledModule
	pool     chan *pooledModule // fixed-size pool of ready module instances
	poolSize int
}

// pooledModule bundles one instantiated WASM module with the exported
// functions the solver calls (stack pointer adjust, alloc/free, solve).
type pooledModule struct {
	mod     api.Module
	stackFn api.Function
	allocFn api.Function
	freeFn  api.Function
	solveFn api.Function
}

// NewPowSolver returns a lazy solver; no WASM work happens until init.
func NewPowSolver(wasmPath string) *PowSolver {
	return &PowSolver{wasmPath: wasmPath}
}
|
||||
|
||||
// init lazily loads and compiles the WASM module — preferring the on-disk
// file and falling back to the embedded copy — then pre-instantiates the
// module pool. The first error encountered is cached and returned by every
// subsequent call.
func (p *PowSolver) init(ctx context.Context) error {
	p.once.Do(func() {
		wasmBytes, err := os.ReadFile(p.wasmPath)
		if err != nil {
			if len(embeddedWASM) == 0 {
				p.err = err
				return
			}
			// Disk read failed but an embedded copy exists; use it.
			wasmBytes = embeddedWASM
		}
		p.runtime = wazero.NewRuntime(ctx)
		p.compiled, p.err = p.runtime.CompileModule(ctx, wasmBytes)
		if p.err == nil {
			p.poolSize = powPoolSizeFromEnv()
			p.pool = make(chan *pooledModule, p.poolSize)
			for range p.poolSize {
				inst, err := p.createModule(ctx)
				if err != nil {
					p.err = err
					return
				}
				p.pool <- inst
			}
		}
	})
	return p.err
}
|
||||
|
||||
func (p *PowSolver) Compute(ctx context.Context, challenge map[string]any) (int64, error) {
|
||||
if err := p.init(ctx); err != nil {
|
||||
return 0, err
|
||||
}
|
||||
algo, _ := challenge["algorithm"].(string)
|
||||
if algo != "DeepSeekHashV1" {
|
||||
return 0, errors.New("unsupported algorithm")
|
||||
}
|
||||
challengeStr, _ := challenge["challenge"].(string)
|
||||
salt, _ := challenge["salt"].(string)
|
||||
signature, _ := challenge["signature"].(string)
|
||||
targetPath, _ := challenge["target_path"].(string)
|
||||
_ = signature
|
||||
_ = targetPath
|
||||
|
||||
difficulty := toFloat64(challenge["difficulty"], 144000)
|
||||
expireAt := toInt64(challenge["expire_at"], 1680000000)
|
||||
prefix := salt + "_" + itoa(expireAt) + "_"
|
||||
|
||||
pm, err := p.acquireModule(ctx)
|
||||
if err != nil {
|
||||
return 0, err
|
||||
}
|
||||
defer p.releaseModule(pm)
|
||||
|
||||
mem := pm.mod.Memory()
|
||||
if mem == nil {
|
||||
return 0, errors.New("wasm memory missing")
|
||||
}
|
||||
retPtrs, err := pm.stackFn.Call(ctx, uint64(uint32(^uint32(15)))) // -16 i32
|
||||
if err != nil || len(retPtrs) == 0 {
|
||||
return 0, errors.New("stack alloc failed")
|
||||
}
|
||||
retptr := uint32(retPtrs[0])
|
||||
defer func() {
|
||||
_, _ = pm.stackFn.Call(context.Background(), 16)
|
||||
}()
|
||||
|
||||
chPtr, chLen, err := writeUTF8(ctx, pm.allocFn, mem, challengeStr)
|
||||
if err != nil {
|
||||
return 0, err
|
||||
}
|
||||
defer freeUTF8(pm.freeFn, chPtr, chLen)
|
||||
|
||||
prefixPtr, prefixLen, err := writeUTF8(ctx, pm.allocFn, mem, prefix)
|
||||
if err != nil {
|
||||
return 0, err
|
||||
}
|
||||
defer freeUTF8(pm.freeFn, prefixPtr, prefixLen)
|
||||
|
||||
if _, err := pm.solveFn.Call(ctx,
|
||||
uint64(retptr),
|
||||
uint64(chPtr), uint64(chLen),
|
||||
uint64(prefixPtr), uint64(prefixLen),
|
||||
math.Float64bits(difficulty),
|
||||
); err != nil {
|
||||
return 0, err
|
||||
}
|
||||
|
||||
statusBytes, ok := mem.Read(retptr, 4)
|
||||
if !ok {
|
||||
return 0, errors.New("read status failed")
|
||||
}
|
||||
status := int32(binary.LittleEndian.Uint32(statusBytes))
|
||||
valueBytes, ok := mem.Read(retptr+8, 8)
|
||||
if !ok {
|
||||
return 0, errors.New("read value failed")
|
||||
}
|
||||
value := math.Float64frombits(binary.LittleEndian.Uint64(valueBytes))
|
||||
if status == 0 {
|
||||
return 0, errors.New("pow solve failed")
|
||||
}
|
||||
return int64(value), nil
|
||||
}
|
||||
|
||||
func (p *PowSolver) createModule(ctx context.Context) (*pooledModule, error) {
|
||||
mod, err := p.runtime.InstantiateModule(ctx, p.compiled, wazero.NewModuleConfig())
|
||||
if err != nil {
|
||||
return nil, err
|
||||
}
|
||||
stackFn := mod.ExportedFunction("__wbindgen_add_to_stack_pointer")
|
||||
allocFn := mod.ExportedFunction("__wbindgen_export_0")
|
||||
solveFn := mod.ExportedFunction("wasm_solve")
|
||||
if stackFn == nil || allocFn == nil || solveFn == nil {
|
||||
_ = mod.Close(context.Background())
|
||||
return nil, errors.New("required wasm exports missing")
|
||||
}
|
||||
return &pooledModule{
|
||||
mod: mod,
|
||||
stackFn: stackFn,
|
||||
allocFn: allocFn,
|
||||
freeFn: mod.ExportedFunction("__wbindgen_export_2"),
|
||||
solveFn: solveFn,
|
||||
}, nil
|
||||
}
|
||||
|
||||
func (p *PowSolver) acquireModule(ctx context.Context) (*pooledModule, error) {
|
||||
if p.pool != nil {
|
||||
for {
|
||||
select {
|
||||
case pm := <-p.pool:
|
||||
if pm != nil {
|
||||
return pm, nil
|
||||
}
|
||||
case <-ctx.Done():
|
||||
return nil, ctx.Err()
|
||||
}
|
||||
}
|
||||
}
|
||||
return p.createModule(ctx)
|
||||
}
|
||||
|
||||
func (p *PowSolver) releaseModule(pm *pooledModule) {
|
||||
if pm == nil || pm.mod == nil {
|
||||
return
|
||||
}
|
||||
if p.pool != nil {
|
||||
select {
|
||||
case p.pool <- pm:
|
||||
return
|
||||
default:
|
||||
}
|
||||
}
|
||||
_ = pm.mod.Close(context.Background())
|
||||
}
|
||||
|
||||
func writeUTF8(ctx context.Context, allocFn api.Function, mem api.Memory, text string) (uint32, uint32, error) {
|
||||
data := []byte(text)
|
||||
res, err := allocFn.Call(ctx, uint64(len(data)), 1)
|
||||
if err != nil || len(res) == 0 {
|
||||
return 0, 0, errors.New("alloc failed")
|
||||
}
|
||||
ptr := uint32(res[0])
|
||||
if !mem.Write(ptr, data) {
|
||||
return 0, 0, errors.New("mem write failed")
|
||||
}
|
||||
return ptr, uint32(len(data)), nil
|
||||
}
|
||||
|
||||
func freeUTF8(freeFn api.Function, ptr, size uint32) {
|
||||
if freeFn == nil || ptr == 0 || size == 0 {
|
||||
return
|
||||
}
|
||||
_, _ = freeFn.Call(context.Background(), uint64(ptr), uint64(size), 1)
|
||||
}
|
||||
|
||||
func BuildPowHeader(challenge map[string]any, answer int64) (string, error) {
|
||||
payload := map[string]any{
|
||||
"algorithm": challenge["algorithm"],
|
||||
"challenge": challenge["challenge"],
|
||||
"salt": challenge["salt"],
|
||||
"answer": answer,
|
||||
"signature": challenge["signature"],
|
||||
"target_path": challenge["target_path"],
|
||||
}
|
||||
b, err := json.Marshal(payload)
|
||||
if err != nil {
|
||||
return "", err
|
||||
}
|
||||
return base64.StdEncoding.EncodeToString(b), nil
|
||||
}
|
||||
|
||||
// toFloat64 coerces common JSON numeric representations to float64,
// returning the default d when v is not a recognized numeric type.
func toFloat64(v any, d float64) float64 {
	switch n := v.(type) {
	case float64:
		return n
	case int:
		return float64(n)
	case int64:
		return float64(n)
	}
	return d
}
|
||||
|
||||
// toInt64 coerces common JSON numeric representations to int64,
// returning the default d when v is not a recognized numeric type.
// Note: float64 values are truncated toward zero.
func toInt64(v any, d int64) int64 {
	switch n := v.(type) {
	case float64:
		return int64(n)
	case int:
		return int64(n)
	case int64:
		return n
	}
	return d
}
|
||||
|
||||
// itoa is a tiny base-10 shorthand for strconv.FormatInt.
func itoa(v int64) string {
	return strconv.FormatInt(v, 10)
}
|
||||
|
||||
func powPoolSizeFromEnv() int {
|
||||
const fallback = 4
|
||||
n := fallback
|
||||
if cpus := stdruntime.GOMAXPROCS(0); cpus > 0 {
|
||||
n = cpus
|
||||
}
|
||||
if raw := os.Getenv("DS2API_POW_POOL_SIZE"); raw != "" {
|
||||
if v, err := strconv.Atoi(raw); err == nil && v > 0 {
|
||||
n = v
|
||||
}
|
||||
}
|
||||
if n > 64 {
|
||||
return 64
|
||||
}
|
||||
return n
|
||||
}
|
||||
|
||||
func PreloadWASM(wasmPath string) {
|
||||
solver := NewPowSolver(wasmPath)
|
||||
if err := solver.init(context.Background()); err != nil {
|
||||
config.Logger.Warn("[WASM] preload failed", "error", err)
|
||||
return
|
||||
}
|
||||
config.Logger.Info("[WASM] module preloaded", "path", wasmPath)
|
||||
}
|
||||
68
internal/deepseek/pow_test.go
Normal file
68
internal/deepseek/pow_test.go
Normal file
@@ -0,0 +1,68 @@
|
||||
package deepseek
|
||||
|
||||
import (
|
||||
"context"
|
||||
"testing"
|
||||
"time"
|
||||
)
|
||||
|
||||
func TestPowPoolSizeFromEnv(t *testing.T) {
|
||||
t.Setenv("DS2API_POW_POOL_SIZE", "3")
|
||||
if got := powPoolSizeFromEnv(); got != 3 {
|
||||
t.Fatalf("expected pool size 3, got %d", got)
|
||||
}
|
||||
}
|
||||
|
||||
func TestPowSolverAcquireReleaseReusesModule(t *testing.T) {
|
||||
t.Setenv("DS2API_POW_POOL_SIZE", "1")
|
||||
solver := NewPowSolver("missing-file.wasm")
|
||||
if err := solver.init(context.Background()); err != nil {
|
||||
t.Fatalf("init failed: %v", err)
|
||||
}
|
||||
|
||||
pm1, err := solver.acquireModule(context.Background())
|
||||
if err != nil {
|
||||
t.Fatalf("acquire first module failed: %v", err)
|
||||
}
|
||||
solver.releaseModule(pm1)
|
||||
|
||||
pm2, err := solver.acquireModule(context.Background())
|
||||
if err != nil {
|
||||
t.Fatalf("acquire second module failed: %v", err)
|
||||
}
|
||||
if pm1 != pm2 {
|
||||
t.Fatalf("expected pooled module reuse, got different instances")
|
||||
}
|
||||
solver.releaseModule(pm2)
|
||||
}
|
||||
|
||||
func TestPowSolverAcquireHonorsContextWhenPoolExhausted(t *testing.T) {
|
||||
t.Setenv("DS2API_POW_POOL_SIZE", "1")
|
||||
solver := NewPowSolver("missing-file.wasm")
|
||||
if err := solver.init(context.Background()); err != nil {
|
||||
t.Fatalf("init failed: %v", err)
|
||||
}
|
||||
|
||||
held, err := solver.acquireModule(context.Background())
|
||||
if err != nil {
|
||||
t.Fatalf("acquire held module failed: %v", err)
|
||||
}
|
||||
defer solver.releaseModule(held)
|
||||
|
||||
ctx, cancel := context.WithTimeout(context.Background(), 20*time.Millisecond)
|
||||
defer cancel()
|
||||
if _, err := solver.acquireModule(ctx); err == nil {
|
||||
t.Fatalf("expected context cancellation while pool is exhausted")
|
||||
}
|
||||
}
|
||||
|
||||
func TestClientPreloadPowUsesClientSolver(t *testing.T) {
|
||||
t.Setenv("DS2API_POW_POOL_SIZE", "1")
|
||||
client := NewClient(nil, nil)
|
||||
if err := client.PreloadPow(context.Background()); err != nil {
|
||||
t.Fatalf("preload failed: %v", err)
|
||||
}
|
||||
if client.powSolver.runtime == nil || client.powSolver.compiled == nil {
|
||||
t.Fatalf("expected client pow solver to be initialized")
|
||||
}
|
||||
}
|
||||
80
internal/deepseek/transport/transport.go
Normal file
80
internal/deepseek/transport/transport.go
Normal file
@@ -0,0 +1,80 @@
|
||||
package transport
|
||||
|
||||
import (
|
||||
"context"
|
||||
"crypto/tls"
|
||||
"fmt"
|
||||
"net"
|
||||
"net/http"
|
||||
"time"
|
||||
|
||||
utls "github.com/refraction-networking/utls"
|
||||
)
|
||||
|
||||
// Doer is the minimal request-executing interface consumers depend on,
// allowing the concrete Client to be substituted in tests.
type Doer interface {
	Do(req *http.Request) (*http.Response, error)
}

// Client wraps a net/http client configured with the uTLS transport.
type Client struct {
	http *http.Client
}
||||
|
||||
func New(timeout time.Duration) *Client {
|
||||
base := &http.Transport{
|
||||
Proxy: http.ProxyFromEnvironment,
|
||||
ForceAttemptHTTP2: false,
|
||||
MaxIdleConns: 200,
|
||||
MaxIdleConnsPerHost: 100,
|
||||
IdleConnTimeout: 90 * time.Second,
|
||||
DialContext: (&net.Dialer{Timeout: 15 * time.Second, KeepAlive: 30 * time.Second}).DialContext,
|
||||
DialTLSContext: safariTLSDialer(),
|
||||
TLSClientConfig: &tls.Config{MinVersion: tls.VersionTLS12},
|
||||
}
|
||||
return &Client{http: &http.Client{Timeout: timeout, Transport: base}}
|
||||
}
|
||||
|
||||
func (c *Client) Do(req *http.Request) (*http.Response, error) {
|
||||
return c.http.Do(req)
|
||||
}
|
||||
|
||||
func safariTLSDialer() func(ctx context.Context, network, addr string) (net.Conn, error) {
|
||||
var dialer net.Dialer
|
||||
return func(ctx context.Context, network, addr string) (net.Conn, error) {
|
||||
plainConn, err := dialer.DialContext(ctx, network, addr)
|
||||
if err != nil {
|
||||
return nil, err
|
||||
}
|
||||
host, _, _ := net.SplitHostPort(addr)
|
||||
uCfg := &utls.Config{ServerName: host}
|
||||
uConn := utls.UClient(plainConn, uCfg, utls.HelloSafari_Auto)
|
||||
if err := forceHTTP11ALPN(uConn); err != nil {
|
||||
_ = plainConn.Close()
|
||||
return nil, err
|
||||
}
|
||||
err = uConn.HandshakeContext(ctx)
|
||||
if err != nil {
|
||||
_ = plainConn.Close()
|
||||
return nil, err
|
||||
}
|
||||
if negotiated := uConn.ConnectionState().NegotiatedProtocol; negotiated != "" && negotiated != "http/1.1" {
|
||||
_ = uConn.Close()
|
||||
return nil, fmt.Errorf("unexpected ALPN protocol negotiated: %s", negotiated)
|
||||
}
|
||||
return uConn, nil
|
||||
}
|
||||
}
|
||||
|
||||
func forceHTTP11ALPN(uConn *utls.UConn) error {
|
||||
if err := uConn.BuildHandshakeState(); err != nil {
|
||||
return err
|
||||
}
|
||||
for _, ext := range uConn.Extensions {
|
||||
alpnExt, ok := ext.(*utls.ALPNExtension)
|
||||
if !ok {
|
||||
continue
|
||||
}
|
||||
alpnExt.AlpnProtocols = []string{"http/1.1"}
|
||||
return nil
|
||||
}
|
||||
return nil
|
||||
}
|
||||
108
internal/server/router.go
Normal file
108
internal/server/router.go
Normal file
@@ -0,0 +1,108 @@
|
||||
package server
|
||||
|
||||
import (
|
||||
"context"
|
||||
"encoding/json"
|
||||
"net/http"
|
||||
"strings"
|
||||
"time"
|
||||
|
||||
"github.com/go-chi/chi/v5"
|
||||
"github.com/go-chi/chi/v5/middleware"
|
||||
|
||||
"ds2api/internal/account"
|
||||
"ds2api/internal/adapter/claude"
|
||||
"ds2api/internal/adapter/openai"
|
||||
"ds2api/internal/admin"
|
||||
"ds2api/internal/auth"
|
||||
"ds2api/internal/config"
|
||||
"ds2api/internal/deepseek"
|
||||
"ds2api/internal/webui"
|
||||
)
|
||||
|
||||
type App struct {
|
||||
Store *config.Store
|
||||
Pool *account.Pool
|
||||
Resolver *auth.Resolver
|
||||
DS *deepseek.Client
|
||||
Router http.Handler
|
||||
}
|
||||
|
||||
func NewApp() *App {
|
||||
store := config.LoadStore()
|
||||
pool := account.NewPool(store)
|
||||
var dsClient *deepseek.Client
|
||||
resolver := auth.NewResolver(store, pool, func(ctx context.Context, acc config.Account) (string, error) {
|
||||
return dsClient.Login(ctx, acc)
|
||||
})
|
||||
dsClient = deepseek.NewClient(store, resolver)
|
||||
if err := dsClient.PreloadPow(context.Background()); err != nil {
|
||||
config.Logger.Warn("[WASM] preload failed", "error", err)
|
||||
} else {
|
||||
config.Logger.Info("[WASM] module preloaded", "path", config.WASMPath())
|
||||
}
|
||||
|
||||
openaiHandler := &openai.Handler{Store: store, Auth: resolver, DS: dsClient}
|
||||
claudeHandler := &claude.Handler{Store: store, Auth: resolver, DS: dsClient}
|
||||
adminHandler := &admin.Handler{Store: store, Pool: pool, DS: dsClient}
|
||||
webuiHandler := webui.NewHandler()
|
||||
|
||||
r := chi.NewRouter()
|
||||
r.Use(middleware.RequestID)
|
||||
r.Use(middleware.RealIP)
|
||||
r.Use(middleware.Logger)
|
||||
r.Use(middleware.Recoverer)
|
||||
r.Use(cors)
|
||||
r.Use(timeout(0))
|
||||
|
||||
r.Get("/healthz", func(w http.ResponseWriter, _ *http.Request) {
|
||||
w.Header().Set("Content-Type", "application/json")
|
||||
w.WriteHeader(http.StatusOK)
|
||||
_, _ = w.Write([]byte(`{"status":"ok"}`))
|
||||
})
|
||||
r.Get("/readyz", func(w http.ResponseWriter, _ *http.Request) {
|
||||
w.Header().Set("Content-Type", "application/json")
|
||||
w.WriteHeader(http.StatusOK)
|
||||
_, _ = w.Write([]byte(`{"status":"ready"}`))
|
||||
})
|
||||
openai.RegisterRoutes(r, openaiHandler)
|
||||
claude.RegisterRoutes(r, claudeHandler)
|
||||
r.Route("/admin", func(ar chi.Router) {
|
||||
admin.RegisterRoutes(ar, adminHandler)
|
||||
})
|
||||
webui.RegisterRoutes(r, webuiHandler)
|
||||
r.NotFound(func(w http.ResponseWriter, req *http.Request) {
|
||||
if strings.HasPrefix(req.URL.Path, "/admin/") && webuiHandler.HandleAdminFallback(w, req) {
|
||||
return
|
||||
}
|
||||
http.NotFound(w, req)
|
||||
})
|
||||
|
||||
return &App{Store: store, Pool: pool, Resolver: resolver, DS: dsClient, Router: r}
|
||||
}
|
||||
|
||||
func timeout(d time.Duration) func(http.Handler) http.Handler {
|
||||
if d <= 0 {
|
||||
return func(next http.Handler) http.Handler { return next }
|
||||
}
|
||||
return middleware.Timeout(d)
|
||||
}
|
||||
|
||||
func cors(next http.Handler) http.Handler {
|
||||
return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
|
||||
w.Header().Set("Access-Control-Allow-Origin", "*")
|
||||
w.Header().Set("Access-Control-Allow-Methods", "GET, POST, OPTIONS, PUT, DELETE")
|
||||
w.Header().Set("Access-Control-Allow-Headers", "Content-Type, Authorization")
|
||||
if r.Method == http.MethodOptions {
|
||||
w.WriteHeader(http.StatusNoContent)
|
||||
return
|
||||
}
|
||||
next.ServeHTTP(w, r)
|
||||
})
|
||||
}
|
||||
|
||||
func WriteUnhandledError(w http.ResponseWriter, err error) {
|
||||
w.Header().Set("Content-Type", "application/json")
|
||||
w.WriteHeader(http.StatusInternalServerError)
|
||||
_ = json.NewEncoder(w).Encode(map[string]any{"error": map[string]any{"type": "api_error", "message": "Internal Server Error", "detail": err.Error()}})
|
||||
}
|
||||
52
internal/sse/consumer.go
Normal file
52
internal/sse/consumer.go
Normal file
@@ -0,0 +1,52 @@
|
||||
package sse
|
||||
|
||||
import (
|
||||
"net/http"
|
||||
"strings"
|
||||
|
||||
"ds2api/internal/deepseek"
|
||||
)
|
||||
|
||||
// CollectResult holds the aggregated text and thinking content from a
|
||||
// DeepSeek SSE stream, consumed to completion (non-streaming use case).
|
||||
type CollectResult struct {
|
||||
Text string
|
||||
Thinking string
|
||||
}
|
||||
|
||||
// CollectStream fully consumes a DeepSeek SSE response and separates
|
||||
// thinking content from text content. This replaces the duplicated
|
||||
// stream-collection logic in openai.handleNonStream, claude.collectDeepSeek,
|
||||
// and admin.testAccount.
|
||||
//
|
||||
// The caller is responsible for closing resp.Body unless closeBody is true.
|
||||
func CollectStream(resp *http.Response, thinkingEnabled bool, closeBody bool) CollectResult {
|
||||
if closeBody {
|
||||
defer resp.Body.Close()
|
||||
}
|
||||
text := strings.Builder{}
|
||||
thinking := strings.Builder{}
|
||||
currentType := "text"
|
||||
if thinkingEnabled {
|
||||
currentType = "thinking"
|
||||
}
|
||||
_ = deepseek.ScanSSELines(resp, func(line []byte) bool {
|
||||
result := ParseDeepSeekContentLine(line, thinkingEnabled, currentType)
|
||||
currentType = result.NextType
|
||||
if !result.Parsed {
|
||||
return true
|
||||
}
|
||||
if result.Stop {
|
||||
return false
|
||||
}
|
||||
for _, p := range result.Parts {
|
||||
if p.Type == "thinking" {
|
||||
thinking.WriteString(p.Text)
|
||||
} else {
|
||||
text.WriteString(p.Text)
|
||||
}
|
||||
}
|
||||
return true
|
||||
})
|
||||
return CollectResult{Text: text.String(), Thinking: thinking.String()}
|
||||
}
|
||||
49
internal/sse/line.go
Normal file
49
internal/sse/line.go
Normal file
@@ -0,0 +1,49 @@
|
||||
package sse
|
||||
|
||||
import "fmt"
|
||||
|
||||
// LineResult is the normalized parse result for one DeepSeek SSE line.
|
||||
type LineResult struct {
|
||||
Parsed bool
|
||||
Stop bool
|
||||
ContentFilter bool
|
||||
ErrorMessage string
|
||||
Parts []ContentPart
|
||||
NextType string
|
||||
}
|
||||
|
||||
// ParseDeepSeekContentLine centralizes one-line DeepSeek SSE parsing for both
|
||||
// streaming and non-streaming handlers.
|
||||
func ParseDeepSeekContentLine(raw []byte, thinkingEnabled bool, currentType string) LineResult {
|
||||
chunk, done, parsed := ParseDeepSeekSSELine(raw)
|
||||
if !parsed {
|
||||
return LineResult{NextType: currentType}
|
||||
}
|
||||
if done {
|
||||
return LineResult{Parsed: true, Stop: true, NextType: currentType}
|
||||
}
|
||||
if errObj, hasErr := chunk["error"]; hasErr {
|
||||
return LineResult{
|
||||
Parsed: true,
|
||||
Stop: true,
|
||||
ErrorMessage: fmt.Sprintf("%v", errObj),
|
||||
NextType: currentType,
|
||||
}
|
||||
}
|
||||
if code, _ := chunk["code"].(string); code == "content_filter" {
|
||||
return LineResult{
|
||||
Parsed: true,
|
||||
Stop: true,
|
||||
ContentFilter: true,
|
||||
ErrorMessage: "content filtered by upstream",
|
||||
NextType: currentType,
|
||||
}
|
||||
}
|
||||
parts, finished, nextType := ParseSSEChunkForContent(chunk, thinkingEnabled, currentType)
|
||||
return LineResult{
|
||||
Parsed: true,
|
||||
Stop: finished,
|
||||
Parts: parts,
|
||||
NextType: nextType,
|
||||
}
|
||||
}
|
||||
37
internal/sse/line_test.go
Normal file
37
internal/sse/line_test.go
Normal file
@@ -0,0 +1,37 @@
|
||||
package sse
|
||||
|
||||
import "testing"
|
||||
|
||||
func TestParseDeepSeekContentLineDone(t *testing.T) {
|
||||
res := ParseDeepSeekContentLine([]byte("data: [DONE]"), false, "text")
|
||||
if !res.Parsed || !res.Stop {
|
||||
t.Fatalf("expected parsed stop result: %#v", res)
|
||||
}
|
||||
}
|
||||
|
||||
func TestParseDeepSeekContentLineError(t *testing.T) {
|
||||
res := ParseDeepSeekContentLine([]byte(`data: {"error":"boom"}`), false, "text")
|
||||
if !res.Parsed || !res.Stop {
|
||||
t.Fatalf("expected stop on error: %#v", res)
|
||||
}
|
||||
if res.ErrorMessage == "" {
|
||||
t.Fatalf("expected non-empty error message")
|
||||
}
|
||||
}
|
||||
|
||||
func TestParseDeepSeekContentLineContentFilter(t *testing.T) {
|
||||
res := ParseDeepSeekContentLine([]byte(`data: {"code":"content_filter"}`), false, "text")
|
||||
if !res.Parsed || !res.Stop || !res.ContentFilter {
|
||||
t.Fatalf("expected content-filter stop result: %#v", res)
|
||||
}
|
||||
}
|
||||
|
||||
func TestParseDeepSeekContentLineContent(t *testing.T) {
|
||||
res := ParseDeepSeekContentLine([]byte(`data: {"p":"response/content","v":"hi"}`), false, "text")
|
||||
if !res.Parsed || res.Stop {
|
||||
t.Fatalf("expected parsed non-stop result: %#v", res)
|
||||
}
|
||||
if len(res.Parts) != 1 || res.Parts[0].Text != "hi" || res.Parts[0].Type != "text" {
|
||||
t.Fatalf("unexpected parts: %#v", res.Parts)
|
||||
}
|
||||
}
|
||||
259
internal/sse/parser.go
Normal file
259
internal/sse/parser.go
Normal file
@@ -0,0 +1,259 @@
|
||||
package sse
|
||||
|
||||
import (
|
||||
"bytes"
|
||||
"encoding/json"
|
||||
"strings"
|
||||
)
|
||||
|
||||
type ContentPart struct {
|
||||
Text string
|
||||
Type string
|
||||
}
|
||||
|
||||
var skipPatterns = []string{
|
||||
"quasi_status", "elapsed_secs", "token_usage", "pending_fragment", "conversation_mode",
|
||||
"fragments/-1/status", "fragments/-2/status", "fragments/-3/status",
|
||||
}
|
||||
|
||||
// ParseDeepSeekSSELine parses one raw SSE line into (chunk, done, ok).
// ok is false for blank, non-data, or invalid-JSON lines; done is true
// only for the "[DONE]" sentinel.
func ParseDeepSeekSSELine(raw []byte) (map[string]any, bool, bool) {
	line := strings.TrimSpace(string(raw))
	if !strings.HasPrefix(line, "data:") {
		// Covers empty lines too: they cannot carry the prefix.
		return nil, false, false
	}
	payload := strings.TrimSpace(strings.TrimPrefix(line, "data:"))
	if payload == "[DONE]" {
		return nil, true, true
	}
	chunk := map[string]any{}
	if json.Unmarshal([]byte(payload), &chunk) != nil {
		return nil, false, false
	}
	return chunk, false, true
}
|
||||
|
||||
func shouldSkipPath(path string) bool {
|
||||
if path == "response/search_status" {
|
||||
return true
|
||||
}
|
||||
for _, p := range skipPatterns {
|
||||
if strings.Contains(path, p) {
|
||||
return true
|
||||
}
|
||||
}
|
||||
return false
|
||||
}
|
||||
|
||||
func ParseSSEChunkForContent(chunk map[string]any, thinkingEnabled bool, currentFragmentType string) ([]ContentPart, bool, string) {
|
||||
v, ok := chunk["v"]
|
||||
if !ok {
|
||||
return nil, false, currentFragmentType
|
||||
}
|
||||
path, _ := chunk["p"].(string)
|
||||
if shouldSkipPath(path) {
|
||||
return nil, false, currentFragmentType
|
||||
}
|
||||
if path == "response/status" {
|
||||
if s, ok := v.(string); ok && s == "FINISHED" {
|
||||
return nil, true, currentFragmentType
|
||||
}
|
||||
}
|
||||
newType := currentFragmentType
|
||||
parts := make([]ContentPart, 0, 8)
|
||||
|
||||
// Newer DeepSeek responses may emit fragment APPEND directly on
|
||||
// path "response/fragments" instead of wrapping it in path "response".
|
||||
if path == "response/fragments" {
|
||||
if op, _ := chunk["o"].(string); strings.EqualFold(op, "APPEND") {
|
||||
if frags, ok := v.([]any); ok {
|
||||
for _, frag := range frags {
|
||||
fm, ok := frag.(map[string]any)
|
||||
if !ok {
|
||||
continue
|
||||
}
|
||||
t, _ := fm["type"].(string)
|
||||
content, _ := fm["content"].(string)
|
||||
t = strings.ToUpper(t)
|
||||
switch t {
|
||||
case "THINK", "THINKING":
|
||||
newType = "thinking"
|
||||
if content != "" {
|
||||
parts = append(parts, ContentPart{Text: content, Type: "thinking"})
|
||||
}
|
||||
case "RESPONSE":
|
||||
newType = "text"
|
||||
if content != "" {
|
||||
parts = append(parts, ContentPart{Text: content, Type: "text"})
|
||||
}
|
||||
default:
|
||||
if content != "" {
|
||||
parts = append(parts, ContentPart{Text: content, Type: "text"})
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
if path == "response" {
|
||||
if arr, ok := v.([]any); ok {
|
||||
for _, it := range arr {
|
||||
m, ok := it.(map[string]any)
|
||||
if !ok {
|
||||
continue
|
||||
}
|
||||
if m["p"] == "fragments" && m["o"] == "APPEND" {
|
||||
if frags, ok := m["v"].([]any); ok {
|
||||
for _, frag := range frags {
|
||||
fm, ok := frag.(map[string]any)
|
||||
if !ok {
|
||||
continue
|
||||
}
|
||||
t, _ := fm["type"].(string)
|
||||
t = strings.ToUpper(t)
|
||||
if t == "THINK" || t == "THINKING" {
|
||||
newType = "thinking"
|
||||
} else if t == "RESPONSE" {
|
||||
newType = "text"
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
partType := "text"
|
||||
switch {
|
||||
case path == "response/thinking_content":
|
||||
partType = "thinking"
|
||||
case path == "response/content":
|
||||
partType = "text"
|
||||
case strings.Contains(path, "response/fragments") && strings.Contains(path, "/content"):
|
||||
partType = newType
|
||||
case path == "":
|
||||
if thinkingEnabled {
|
||||
partType = newType
|
||||
}
|
||||
}
|
||||
switch val := v.(type) {
|
||||
case string:
|
||||
if val == "FINISHED" && (path == "" || path == "status") {
|
||||
return nil, true, newType
|
||||
}
|
||||
if val != "" {
|
||||
parts = append(parts, ContentPart{Text: val, Type: partType})
|
||||
}
|
||||
case []any:
|
||||
pp, finished := extractContentRecursive(val, partType)
|
||||
if finished {
|
||||
return nil, true, newType
|
||||
}
|
||||
parts = append(parts, pp...)
|
||||
case map[string]any:
|
||||
resp := val
|
||||
if wrapped, ok := val["response"].(map[string]any); ok {
|
||||
resp = wrapped
|
||||
}
|
||||
if frags, ok := resp["fragments"].([]any); ok {
|
||||
for _, item := range frags {
|
||||
m, ok := item.(map[string]any)
|
||||
if !ok {
|
||||
continue
|
||||
}
|
||||
t, _ := m["type"].(string)
|
||||
content, _ := m["content"].(string)
|
||||
t = strings.ToUpper(t)
|
||||
if t == "THINK" || t == "THINKING" {
|
||||
newType = "thinking"
|
||||
if content != "" {
|
||||
parts = append(parts, ContentPart{Text: content, Type: "thinking"})
|
||||
}
|
||||
} else if t == "RESPONSE" {
|
||||
newType = "text"
|
||||
if content != "" {
|
||||
parts = append(parts, ContentPart{Text: content, Type: "text"})
|
||||
}
|
||||
} else if content != "" {
|
||||
parts = append(parts, ContentPart{Text: content, Type: partType})
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
return parts, false, newType
|
||||
}
|
||||
|
||||
func extractContentRecursive(items []any, defaultType string) ([]ContentPart, bool) {
|
||||
parts := make([]ContentPart, 0, len(items))
|
||||
for _, it := range items {
|
||||
m, ok := it.(map[string]any)
|
||||
if !ok {
|
||||
continue
|
||||
}
|
||||
itemPath, _ := m["p"].(string)
|
||||
itemV, hasV := m["v"]
|
||||
if !hasV {
|
||||
continue
|
||||
}
|
||||
if itemPath == "status" {
|
||||
if s, ok := itemV.(string); ok && s == "FINISHED" {
|
||||
return nil, true
|
||||
}
|
||||
}
|
||||
if shouldSkipPath(itemPath) {
|
||||
continue
|
||||
}
|
||||
if content, ok := m["content"].(string); ok && content != "" {
|
||||
typeName, _ := m["type"].(string)
|
||||
typeName = strings.ToUpper(typeName)
|
||||
switch typeName {
|
||||
case "THINK", "THINKING":
|
||||
parts = append(parts, ContentPart{Text: content, Type: "thinking"})
|
||||
case "RESPONSE":
|
||||
parts = append(parts, ContentPart{Text: content, Type: "text"})
|
||||
default:
|
||||
parts = append(parts, ContentPart{Text: content, Type: defaultType})
|
||||
}
|
||||
continue
|
||||
}
|
||||
partType := defaultType
|
||||
if strings.Contains(itemPath, "thinking") {
|
||||
partType = "thinking"
|
||||
} else if strings.Contains(itemPath, "content") || itemPath == "response" || itemPath == "fragments" {
|
||||
partType = "text"
|
||||
}
|
||||
switch v := itemV.(type) {
|
||||
case string:
|
||||
if v != "" && v != "FINISHED" {
|
||||
parts = append(parts, ContentPart{Text: v, Type: partType})
|
||||
}
|
||||
case []any:
|
||||
for _, inner := range v {
|
||||
switch x := inner.(type) {
|
||||
case map[string]any:
|
||||
ct, _ := x["content"].(string)
|
||||
if ct == "" {
|
||||
continue
|
||||
}
|
||||
typeName, _ := x["type"].(string)
|
||||
typeName = strings.ToUpper(typeName)
|
||||
if typeName == "THINK" || typeName == "THINKING" {
|
||||
parts = append(parts, ContentPart{Text: ct, Type: "thinking"})
|
||||
} else if typeName == "RESPONSE" {
|
||||
parts = append(parts, ContentPart{Text: ct, Type: "text"})
|
||||
} else {
|
||||
parts = append(parts, ContentPart{Text: ct, Type: partType})
|
||||
}
|
||||
case string:
|
||||
if x != "" {
|
||||
parts = append(parts, ContentPart{Text: x, Type: partType})
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
return parts, false
|
||||
}
|
||||
|
||||
// IsCitation reports whether text (after trimming whitespace) starts
// with a DeepSeek "[citation:" marker.
// NOTE(review): strings.HasPrefix would avoid the []byte round-trip,
// but this function is the file's only user of the bytes import.
func IsCitation(text string) bool {
	trimmed := strings.TrimSpace(text)
	return bytes.HasPrefix([]byte(trimmed), []byte("[citation:"))
}
|
||||
89
internal/sse/parser_test.go
Normal file
89
internal/sse/parser_test.go
Normal file
@@ -0,0 +1,89 @@
|
||||
package sse
|
||||
|
||||
import "testing"
|
||||
|
||||
func TestParseDeepSeekSSELine(t *testing.T) {
|
||||
chunk, done, ok := ParseDeepSeekSSELine([]byte(`data: {"v":"你好"}`))
|
||||
if !ok || done {
|
||||
t.Fatalf("expected parsed chunk")
|
||||
}
|
||||
if chunk["v"] != "你好" {
|
||||
t.Fatalf("unexpected chunk: %#v", chunk)
|
||||
}
|
||||
}
|
||||
|
||||
func TestParseDeepSeekSSELineDone(t *testing.T) {
|
||||
_, done, ok := ParseDeepSeekSSELine([]byte(`data: [DONE]`))
|
||||
if !ok || !done {
|
||||
t.Fatalf("expected done signal")
|
||||
}
|
||||
}
|
||||
|
||||
func TestParseSSEChunkForContentSimple(t *testing.T) {
|
||||
parts, finished, _ := ParseSSEChunkForContent(map[string]any{"v": "hello"}, false, "text")
|
||||
if finished {
|
||||
t.Fatal("expected unfinished")
|
||||
}
|
||||
if len(parts) != 1 || parts[0].Text != "hello" || parts[0].Type != "text" {
|
||||
t.Fatalf("unexpected parts: %#v", parts)
|
||||
}
|
||||
}
|
||||
|
||||
func TestParseSSEChunkForContentThinking(t *testing.T) {
|
||||
parts, finished, _ := ParseSSEChunkForContent(map[string]any{"p": "response/thinking_content", "v": "think"}, true, "thinking")
|
||||
if finished {
|
||||
t.Fatal("expected unfinished")
|
||||
}
|
||||
if len(parts) != 1 || parts[0].Type != "thinking" {
|
||||
t.Fatalf("unexpected parts: %#v", parts)
|
||||
}
|
||||
}
|
||||
|
||||
func TestIsCitation(t *testing.T) {
|
||||
if !IsCitation("[citation:1] abc") {
|
||||
t.Fatal("expected citation true")
|
||||
}
|
||||
if IsCitation("normal text") {
|
||||
t.Fatal("expected citation false")
|
||||
}
|
||||
}
|
||||
|
||||
func TestParseSSEChunkForContentFragmentsAppendSwitchToResponse(t *testing.T) {
|
||||
chunk := map[string]any{
|
||||
"p": "response/fragments",
|
||||
"o": "APPEND",
|
||||
"v": []any{
|
||||
map[string]any{
|
||||
"type": "RESPONSE",
|
||||
"content": "你好",
|
||||
},
|
||||
},
|
||||
}
|
||||
parts, finished, nextType := ParseSSEChunkForContent(chunk, true, "thinking")
|
||||
if finished {
|
||||
t.Fatal("expected unfinished")
|
||||
}
|
||||
if nextType != "text" {
|
||||
t.Fatalf("expected next type text, got %q", nextType)
|
||||
}
|
||||
if len(parts) != 1 || parts[0].Type != "text" || parts[0].Text != "你好" {
|
||||
t.Fatalf("unexpected parts: %#v", parts)
|
||||
}
|
||||
}
|
||||
|
||||
func TestParseSSEChunkForContentAfterAppendUsesUpdatedType(t *testing.T) {
|
||||
chunk := map[string]any{
|
||||
"p": "response/fragments/-1/content",
|
||||
"v": "!",
|
||||
}
|
||||
parts, finished, nextType := ParseSSEChunkForContent(chunk, true, "text")
|
||||
if finished {
|
||||
t.Fatal("expected unfinished")
|
||||
}
|
||||
if nextType != "text" {
|
||||
t.Fatalf("expected next type text, got %q", nextType)
|
||||
}
|
||||
if len(parts) != 1 || parts[0].Type != "text" || parts[0].Text != "!" {
|
||||
t.Fatalf("unexpected parts: %#v", parts)
|
||||
}
|
||||
}
|
||||
40
internal/sse/stream.go
Normal file
40
internal/sse/stream.go
Normal file
@@ -0,0 +1,40 @@
|
||||
package sse
|
||||
|
||||
import (
|
||||
"bufio"
|
||||
"context"
|
||||
"io"
|
||||
)
|
||||
|
||||
const (
|
||||
parsedLineBufferSize = 128
|
||||
scannerBufferSize = 64 * 1024
|
||||
maxScannerLineSize = 2 * 1024 * 1024
|
||||
)
|
||||
|
||||
// StartParsedLinePump scans an upstream DeepSeek SSE body and emits normalized
|
||||
// line parse results. It centralizes scanner setup + current fragment type
|
||||
// tracking for all streaming adapters.
|
||||
func StartParsedLinePump(ctx context.Context, body io.Reader, thinkingEnabled bool, initialType string) (<-chan LineResult, <-chan error) {
|
||||
out := make(chan LineResult, parsedLineBufferSize)
|
||||
done := make(chan error, 1)
|
||||
go func() {
|
||||
defer close(out)
|
||||
scanner := bufio.NewScanner(body)
|
||||
scanner.Buffer(make([]byte, 0, scannerBufferSize), maxScannerLineSize)
|
||||
currentType := initialType
|
||||
for scanner.Scan() {
|
||||
line := append([]byte{}, scanner.Bytes()...)
|
||||
result := ParseDeepSeekContentLine(line, thinkingEnabled, currentType)
|
||||
currentType = result.NextType
|
||||
select {
|
||||
case out <- result:
|
||||
case <-ctx.Done():
|
||||
done <- ctx.Err()
|
||||
return
|
||||
}
|
||||
}
|
||||
done <- scanner.Err()
|
||||
}()
|
||||
return out, done
|
||||
}
|
||||
30
internal/sse/stream_test.go
Normal file
30
internal/sse/stream_test.go
Normal file
@@ -0,0 +1,30 @@
|
||||
package sse
|
||||
|
||||
import (
|
||||
"context"
|
||||
"strings"
|
||||
"testing"
|
||||
)
|
||||
|
||||
func TestStartParsedLinePumpParsesAndStops(t *testing.T) {
|
||||
body := strings.NewReader("data: {\"p\":\"response/content\",\"v\":\"hi\"}\n\ndata: [DONE]\n")
|
||||
results, done := StartParsedLinePump(context.Background(), body, false, "text")
|
||||
|
||||
collected := make([]LineResult, 0, 2)
|
||||
for r := range results {
|
||||
collected = append(collected, r)
|
||||
}
|
||||
if err := <-done; err != nil {
|
||||
t.Fatalf("unexpected scanner error: %v", err)
|
||||
}
|
||||
if len(collected) < 2 {
|
||||
t.Fatalf("expected at least 2 parsed results, got %d", len(collected))
|
||||
}
|
||||
if !collected[0].Parsed || len(collected[0].Parts) == 0 {
|
||||
t.Fatalf("expected first line to contain parsed content")
|
||||
}
|
||||
last := collected[len(collected)-1]
|
||||
if !last.Parsed || !last.Stop {
|
||||
t.Fatalf("expected last line to stop stream, got parsed=%v stop=%v", last.Parsed, last.Stop)
|
||||
}
|
||||
}
|
||||
494
internal/testsuite/edge_cases.go
Normal file
494
internal/testsuite/edge_cases.go
Normal file
@@ -0,0 +1,494 @@
|
||||
package testsuite
|
||||
|
||||
import (
|
||||
"bytes"
|
||||
"context"
|
||||
"encoding/json"
|
||||
"fmt"
|
||||
"net/http"
|
||||
"strings"
|
||||
"sync"
|
||||
"time"
|
||||
)
|
||||
|
||||
func (r *Runner) caseConcurrencyThresholdLimit(ctx context.Context, cc *caseContext) error {
|
||||
status, err := r.fetchQueueStatus(ctx, cc)
|
||||
if err != nil {
|
||||
return err
|
||||
}
|
||||
total := toInt(status["total"])
|
||||
maxInflight := toInt(status["max_inflight_per_account"])
|
||||
maxQueue := toInt(status["max_queue_size"])
|
||||
if total <= 0 || maxInflight <= 0 {
|
||||
cc.assert("queue_capacity_known", false, fmt.Sprintf("queue_status=%v", status))
|
||||
return nil
|
||||
}
|
||||
capacity := total*maxInflight + maxQueue
|
||||
if capacity <= 0 {
|
||||
capacity = total * maxInflight
|
||||
}
|
||||
n := capacity + 8
|
||||
if n < 8 {
|
||||
n = 8
|
||||
}
|
||||
type one struct {
|
||||
Status int
|
||||
Err string
|
||||
}
|
||||
res := make([]one, n)
|
||||
var wg sync.WaitGroup
|
||||
for i := 0; i < n; i++ {
|
||||
wg.Add(1)
|
||||
go func(idx int) {
|
||||
defer wg.Done()
|
||||
resp, err := cc.request(ctx, requestSpec{
|
||||
Method: http.MethodPost,
|
||||
Path: "/v1/chat/completions",
|
||||
Headers: map[string]string{
|
||||
"Authorization": "Bearer " + r.apiKey,
|
||||
},
|
||||
Body: map[string]any{
|
||||
"model": "deepseek-chat",
|
||||
"messages": []map[string]any{
|
||||
{"role": "user", "content": fmt.Sprintf("并发边界测试 #%d,请输出不少于300字。", idx)},
|
||||
},
|
||||
"stream": true,
|
||||
},
|
||||
Stream: true,
|
||||
Retryable: true,
|
||||
})
|
||||
if err != nil {
|
||||
res[idx] = one{Err: err.Error()}
|
||||
return
|
||||
}
|
||||
res[idx] = one{Status: resp.StatusCode}
|
||||
}(i)
|
||||
}
|
||||
wg.Wait()
|
||||
|
||||
dist := map[int]int{}
|
||||
for _, it := range res {
|
||||
if it.Status > 0 {
|
||||
dist[it.Status]++
|
||||
}
|
||||
}
|
||||
cc.assert("has_200", dist[http.StatusOK] > 0, fmt.Sprintf("distribution=%v", dist))
|
||||
cc.assert("has_429_when_over_capacity", dist[http.StatusTooManyRequests] > 0, fmt.Sprintf("distribution=%v capacity=%d n=%d", dist, capacity, n))
|
||||
_, has5xx := has5xx(dist)
|
||||
cc.assert("no_5xx", !has5xx, fmt.Sprintf("distribution=%v", dist))
|
||||
return nil
|
||||
}
|
||||
|
||||
func (r *Runner) caseStreamAbortRelease(ctx context.Context, cc *caseContext) error {
|
||||
before, err := r.fetchQueueStatus(ctx, cc)
|
||||
if err != nil {
|
||||
return err
|
||||
}
|
||||
baseInUse := toInt(before["in_use"])
|
||||
for i := 0; i < 3; i++ {
|
||||
if err := cc.abortStreamRequest(ctx, requestSpec{
|
||||
Method: http.MethodPost,
|
||||
Path: "/v1/chat/completions",
|
||||
Headers: map[string]string{
|
||||
"Authorization": "Bearer " + r.apiKey,
|
||||
},
|
||||
Body: map[string]any{
|
||||
"model": "deepseek-chat",
|
||||
"messages": []map[string]any{
|
||||
{"role": "user", "content": fmt.Sprintf("中断释放测试 #%d,请流式回复", i)},
|
||||
},
|
||||
"stream": true,
|
||||
},
|
||||
Stream: true,
|
||||
}); err != nil {
|
||||
cc.assert("abort_request_no_error", false, err.Error())
|
||||
}
|
||||
}
|
||||
|
||||
deadline := time.Now().Add(25 * time.Second)
|
||||
recovered := false
|
||||
lastInUse := -1
|
||||
for time.Now().Before(deadline) {
|
||||
st, err := r.fetchQueueStatus(ctx, cc)
|
||||
if err != nil {
|
||||
time.Sleep(500 * time.Millisecond)
|
||||
continue
|
||||
}
|
||||
lastInUse = toInt(st["in_use"])
|
||||
if lastInUse <= baseInUse {
|
||||
recovered = true
|
||||
break
|
||||
}
|
||||
time.Sleep(time.Second)
|
||||
}
|
||||
cc.assert("in_use_recovered_after_abort", recovered, fmt.Sprintf("base=%d last=%d", baseInUse, lastInUse))
|
||||
return nil
|
||||
}
|
||||
|
||||
func (cc *caseContext) abortStreamRequest(ctx context.Context, spec requestSpec) error {
|
||||
cc.seq++
|
||||
traceID := fmt.Sprintf("ts_%s_%s_%03d", cc.runner.runID, sanitizeID(cc.id), cc.seq)
|
||||
cc.traceIDsSet[traceID] = struct{}{}
|
||||
fullURL, err := withTraceQuery(cc.runner.baseURL+spec.Path, traceID)
|
||||
if err != nil {
|
||||
return err
|
||||
}
|
||||
headers := map[string]string{}
|
||||
for k, v := range spec.Headers {
|
||||
headers[k] = v
|
||||
}
|
||||
headers["X-Ds2-Test-Trace"] = traceID
|
||||
bodyBytes, _ := json.Marshal(spec.Body)
|
||||
headers["Content-Type"] = "application/json"
|
||||
cc.requests = append(cc.requests, requestLog{
|
||||
Seq: cc.seq,
|
||||
Attempt: 1,
|
||||
TraceID: traceID,
|
||||
Method: spec.Method,
|
||||
URL: fullURL,
|
||||
Headers: headers,
|
||||
Body: spec.Body,
|
||||
Timestamp: time.Now().Format(time.RFC3339Nano),
|
||||
})
|
||||
|
||||
reqCtx, cancel := context.WithTimeout(ctx, cc.runner.opts.Timeout)
|
||||
defer cancel()
|
||||
req, err := http.NewRequestWithContext(reqCtx, spec.Method, fullURL, bytes.NewReader(bodyBytes))
|
||||
if err != nil {
|
||||
return err
|
||||
}
|
||||
for k, v := range headers {
|
||||
req.Header.Set(k, v)
|
||||
}
|
||||
start := time.Now()
|
||||
resp, err := cc.runner.httpClient.Do(req)
|
||||
if err != nil {
|
||||
cc.responses = append(cc.responses, responseLog{
|
||||
Seq: cc.seq,
|
||||
Attempt: 1,
|
||||
TraceID: traceID,
|
||||
StatusCode: 0,
|
||||
DurationMS: time.Since(start).Milliseconds(),
|
||||
NetworkErr: err.Error(),
|
||||
ReceivedAt: time.Now().Format(time.RFC3339Nano),
|
||||
})
|
||||
return err
|
||||
}
|
||||
defer resp.Body.Close()
|
||||
buf := make([]byte, 512)
|
||||
_, _ = resp.Body.Read(buf)
|
||||
_ = resp.Body.Close()
|
||||
cc.responses = append(cc.responses, responseLog{
|
||||
Seq: cc.seq,
|
||||
Attempt: 1,
|
||||
TraceID: traceID,
|
||||
StatusCode: resp.StatusCode,
|
||||
Headers: resp.Header,
|
||||
BodyText: "aborted_after_first_chunk",
|
||||
DurationMS: time.Since(start).Milliseconds(),
|
||||
ReceivedAt: time.Now().Format(time.RFC3339Nano),
|
||||
})
|
||||
return nil
|
||||
}
|
||||
|
||||
func (r *Runner) caseToolcallStreamMixed(ctx context.Context, cc *caseContext) error {
|
||||
payload := toolcallPayload(true)
|
||||
payload["messages"] = []map[string]any{
|
||||
{
|
||||
"role": "user",
|
||||
"content": "请先输出一句普通文本,再调用工具 search 查询 golang,最后再输出一句普通文本。",
|
||||
},
|
||||
}
|
||||
resp, err := cc.request(ctx, requestSpec{
|
||||
Method: http.MethodPost,
|
||||
Path: "/v1/chat/completions",
|
||||
Headers: map[string]string{
|
||||
"Authorization": "Bearer " + r.apiKey,
|
||||
},
|
||||
Body: payload,
|
||||
Stream: true,
|
||||
Retryable: false,
|
||||
})
|
||||
if err != nil {
|
||||
return err
|
||||
}
|
||||
cc.assert("status_200", resp.StatusCode == http.StatusOK, fmt.Sprintf("status=%d", resp.StatusCode))
|
||||
frames, done := parseSSEFrames(resp.Body)
|
||||
hasTool := false
|
||||
hasText := false
|
||||
rawLeak := false
|
||||
for _, f := range frames {
|
||||
choices, _ := f["choices"].([]any)
|
||||
for _, c := range choices {
|
||||
ch, _ := c.(map[string]any)
|
||||
delta, _ := ch["delta"].(map[string]any)
|
||||
if _, ok := delta["tool_calls"]; ok {
|
||||
hasTool = true
|
||||
}
|
||||
content := asString(delta["content"])
|
||||
if content != "" {
|
||||
hasText = true
|
||||
}
|
||||
if strings.Contains(strings.ToLower(content), `"tool_calls"`) {
|
||||
rawLeak = true
|
||||
}
|
||||
}
|
||||
}
|
||||
cc.assert("tool_calls_delta_present", hasTool, "tool_calls delta missing")
|
||||
cc.assert("no_raw_tool_json_leak", !rawLeak, "raw tool_calls leaked")
|
||||
cc.assert("done_terminated", done, "expected [DONE]")
|
||||
if !(hasTool && hasText) {
|
||||
r.warnings = append(r.warnings, "toolcall mixed stream did not produce both text and tool_calls in this run (model-side behavior dependent)")
|
||||
}
|
||||
return nil
|
||||
}
|
||||
|
||||
func (r *Runner) caseSSEJSONIntegrity(ctx context.Context, cc *caseContext) error {
|
||||
openaiResp, err := cc.request(ctx, requestSpec{
|
||||
Method: http.MethodPost,
|
||||
Path: "/v1/chat/completions",
|
||||
Headers: map[string]string{
|
||||
"Authorization": "Bearer " + r.apiKey,
|
||||
},
|
||||
Body: map[string]any{
|
||||
"model": "deepseek-chat",
|
||||
"messages": []map[string]any{
|
||||
{"role": "user", "content": "输出一句话"},
|
||||
},
|
||||
"stream": true,
|
||||
},
|
||||
Stream: true,
|
||||
Retryable: false,
|
||||
})
|
||||
if err != nil {
|
||||
return err
|
||||
}
|
||||
cc.assert("openai_status_200", openaiResp.StatusCode == http.StatusOK, fmt.Sprintf("status=%d", openaiResp.StatusCode))
|
||||
badOpenAI := countMalformedSSEJSONLines(openaiResp.Body)
|
||||
cc.assert("openai_sse_json_valid", badOpenAI == 0, fmt.Sprintf("malformed=%d", badOpenAI))
|
||||
|
||||
anthropicResp, err := cc.request(ctx, requestSpec{
|
||||
Method: http.MethodPost,
|
||||
Path: "/anthropic/v1/messages",
|
||||
Headers: map[string]string{
|
||||
"Authorization": "Bearer " + r.apiKey,
|
||||
"anthropic-version": "2023-06-01",
|
||||
},
|
||||
Body: map[string]any{
|
||||
"model": "claude-sonnet-4-5",
|
||||
"messages": []map[string]any{
|
||||
{"role": "user", "content": "stream json integrity"},
|
||||
},
|
||||
"stream": true,
|
||||
},
|
||||
Stream: true,
|
||||
Retryable: false,
|
||||
})
|
||||
if err != nil {
|
||||
return err
|
||||
}
|
||||
cc.assert("anthropic_status_200", anthropicResp.StatusCode == http.StatusOK, fmt.Sprintf("status=%d", anthropicResp.StatusCode))
|
||||
badAnthropic := countMalformedSSEJSONLines(anthropicResp.Body)
|
||||
cc.assert("anthropic_sse_json_valid", badAnthropic == 0, fmt.Sprintf("malformed=%d", badAnthropic))
|
||||
return nil
|
||||
}
|
||||
|
||||
func (r *Runner) caseInvalidModel(ctx context.Context, cc *caseContext) error {
|
||||
resp, err := cc.requestOnce(ctx, requestSpec{
|
||||
Method: http.MethodPost,
|
||||
Path: "/v1/chat/completions",
|
||||
Headers: map[string]string{
|
||||
"Authorization": "Bearer " + r.apiKey,
|
||||
},
|
||||
Body: map[string]any{
|
||||
"model": "deepseek-not-exists",
|
||||
"messages": []map[string]any{
|
||||
{"role": "user", "content": "hi"},
|
||||
},
|
||||
"stream": false,
|
||||
},
|
||||
Retryable: false,
|
||||
}, 1)
|
||||
if err != nil {
|
||||
return err
|
||||
}
|
||||
cc.assert("status_503", resp.StatusCode == http.StatusServiceUnavailable, fmt.Sprintf("status=%d", resp.StatusCode))
|
||||
var m map[string]any
|
||||
_ = json.Unmarshal(resp.Body, &m)
|
||||
e, _ := m["error"].(map[string]any)
|
||||
cc.assert("error_type_service_unavailable", asString(e["type"]) == "service_unavailable_error", fmt.Sprintf("body=%s", string(resp.Body)))
|
||||
return nil
|
||||
}
|
||||
|
||||
func (r *Runner) caseMissingMessages(ctx context.Context, cc *caseContext) error {
|
||||
resp, err := cc.request(ctx, requestSpec{
|
||||
Method: http.MethodPost,
|
||||
Path: "/v1/chat/completions",
|
||||
Headers: map[string]string{
|
||||
"Authorization": "Bearer " + r.apiKey,
|
||||
},
|
||||
Body: map[string]any{
|
||||
"model": "deepseek-chat",
|
||||
"stream": false,
|
||||
},
|
||||
Retryable: true,
|
||||
})
|
||||
if err != nil {
|
||||
return err
|
||||
}
|
||||
cc.assert("status_400", resp.StatusCode == http.StatusBadRequest, fmt.Sprintf("status=%d", resp.StatusCode))
|
||||
var m map[string]any
|
||||
_ = json.Unmarshal(resp.Body, &m)
|
||||
e, _ := m["error"].(map[string]any)
|
||||
cc.assert("error_type_invalid_request", asString(e["type"]) == "invalid_request_error", fmt.Sprintf("body=%s", string(resp.Body)))
|
||||
return nil
|
||||
}
|
||||
|
||||
func (r *Runner) caseAdminUnauthorized(ctx context.Context, cc *caseContext) error {
|
||||
resp, err := cc.request(ctx, requestSpec{
|
||||
Method: http.MethodGet,
|
||||
Path: "/admin/config",
|
||||
Retryable: true,
|
||||
})
|
||||
if err != nil {
|
||||
return err
|
||||
}
|
||||
cc.assert("status_401", resp.StatusCode == http.StatusUnauthorized, fmt.Sprintf("status=%d", resp.StatusCode))
|
||||
return nil
|
||||
}
|
||||
|
||||
func (r *Runner) caseTokenRefreshManagedAccount(ctx context.Context, cc *caseContext) error {
|
||||
if len(r.configRaw.Accounts) == 0 {
|
||||
cc.assert("account_present", false, "no account in config")
|
||||
return nil
|
||||
}
|
||||
acc := r.configRaw.Accounts[0]
|
||||
id := strings.TrimSpace(acc.Email)
|
||||
if id == "" {
|
||||
id = strings.TrimSpace(acc.Mobile)
|
||||
}
|
||||
if id == "" {
|
||||
cc.assert("account_identifier", false, "first account has no identifier")
|
||||
return nil
|
||||
}
|
||||
if strings.TrimSpace(acc.Password) == "" {
|
||||
r.warnings = append(r.warnings, "token refresh edge case skipped strict check: first account password empty")
|
||||
cc.assert("account_password_present", true, "skipped strict refresh check due empty password")
|
||||
return nil
|
||||
}
|
||||
invalidToken := "invalid-testsuite-refresh-token-" + sanitizeID(r.runID)
|
||||
update := map[string]any{
|
||||
"keys": r.configRaw.Keys,
|
||||
"accounts": []map[string]any{
|
||||
{
|
||||
"email": acc.Email,
|
||||
"mobile": acc.Mobile,
|
||||
"password": acc.Password,
|
||||
"token": invalidToken,
|
||||
},
|
||||
},
|
||||
}
|
||||
updResp, err := cc.request(ctx, requestSpec{
|
||||
Method: http.MethodPost,
|
||||
Path: "/admin/config",
|
||||
Headers: map[string]string{
|
||||
"Authorization": "Bearer " + r.adminJWT,
|
||||
},
|
||||
Body: update,
|
||||
Retryable: true,
|
||||
})
|
||||
if err != nil {
|
||||
return err
|
||||
}
|
||||
cc.assert("update_config_status_200", updResp.StatusCode == http.StatusOK, fmt.Sprintf("status=%d", updResp.StatusCode))
|
||||
|
||||
chatResp, err := cc.request(ctx, requestSpec{
|
||||
Method: http.MethodPost,
|
||||
Path: "/v1/chat/completions",
|
||||
Headers: map[string]string{
|
||||
"Authorization": "Bearer " + r.apiKey,
|
||||
"X-Ds2-Target-Account": id,
|
||||
},
|
||||
Body: map[string]any{
|
||||
"model": "deepseek-chat",
|
||||
"messages": []map[string]any{
|
||||
{"role": "user", "content": "token refresh test"},
|
||||
},
|
||||
"stream": false,
|
||||
},
|
||||
Retryable: true,
|
||||
})
|
||||
if err != nil {
|
||||
return err
|
||||
}
|
||||
cc.assert("chat_status_200", chatResp.StatusCode == http.StatusOK, fmt.Sprintf("status=%d body=%s", chatResp.StatusCode, string(chatResp.Body)))
|
||||
|
||||
cfgResp, err := cc.request(ctx, requestSpec{
|
||||
Method: http.MethodGet,
|
||||
Path: "/admin/config",
|
||||
Headers: map[string]string{
|
||||
"Authorization": "Bearer " + r.adminJWT,
|
||||
},
|
||||
Retryable: true,
|
||||
})
|
||||
if err != nil {
|
||||
return err
|
||||
}
|
||||
var cfg map[string]any
|
||||
_ = json.Unmarshal(cfgResp.Body, &cfg)
|
||||
accounts, _ := cfg["accounts"].([]any)
|
||||
preview := ""
|
||||
hasToken := false
|
||||
for _, item := range accounts {
|
||||
m, _ := item.(map[string]any)
|
||||
e := asString(m["email"])
|
||||
mo := asString(m["mobile"])
|
||||
if e == acc.Email && mo == acc.Mobile {
|
||||
preview = asString(m["token_preview"])
|
||||
hasToken, _ = m["has_token"].(bool)
|
||||
break
|
||||
}
|
||||
}
|
||||
cc.assert("has_token_after_refresh", hasToken, fmt.Sprintf("config=%s", string(cfgResp.Body)))
|
||||
cc.assert("token_preview_changed_from_invalid", !strings.HasPrefix(preview, invalidToken[:20]), fmt.Sprintf("preview=%s invalid_prefix=%s", preview, invalidToken[:20]))
|
||||
return nil
|
||||
}
|
||||
|
||||
func (r *Runner) fetchQueueStatus(ctx context.Context, cc *caseContext) (map[string]any, error) {
|
||||
resp, err := cc.request(ctx, requestSpec{
|
||||
Method: http.MethodGet,
|
||||
Path: "/admin/queue/status",
|
||||
Headers: map[string]string{
|
||||
"Authorization": "Bearer " + r.adminJWT,
|
||||
},
|
||||
Retryable: true,
|
||||
})
|
||||
if err != nil {
|
||||
return nil, err
|
||||
}
|
||||
var m map[string]any
|
||||
if err := json.Unmarshal(resp.Body, &m); err != nil {
|
||||
return nil, err
|
||||
}
|
||||
return m, nil
|
||||
}
|
||||
|
||||
func countMalformedSSEJSONLines(body []byte) int {
|
||||
lines := strings.Split(string(body), "\n")
|
||||
bad := 0
|
||||
for _, line := range lines {
|
||||
line = strings.TrimSpace(line)
|
||||
if !strings.HasPrefix(line, "data:") {
|
||||
continue
|
||||
}
|
||||
payload := strings.TrimSpace(strings.TrimPrefix(line, "data:"))
|
||||
if payload == "" || payload == "[DONE]" {
|
||||
continue
|
||||
}
|
||||
var v any
|
||||
if err := json.Unmarshal([]byte(payload), &v); err != nil {
|
||||
bad++
|
||||
}
|
||||
}
|
||||
return bad
|
||||
}
|
||||
1640
internal/testsuite/runner.go
Normal file
1640
internal/testsuite/runner.go
Normal file
File diff suppressed because it is too large
Load Diff
37
internal/util/helpers.go
Normal file
37
internal/util/helpers.go
Normal file
@@ -0,0 +1,37 @@
|
||||
package util
|
||||
|
||||
import (
|
||||
"encoding/json"
|
||||
"net/http"
|
||||
)
|
||||
|
||||
// WriteJSON writes a JSON response with the given status code.
|
||||
// This is a shared helper to avoid duplicate writeJSON functions
|
||||
// in openai, claude, and admin packages.
|
||||
func WriteJSON(w http.ResponseWriter, status int, payload any) {
|
||||
w.Header().Set("Content-Type", "application/json")
|
||||
w.WriteHeader(status)
|
||||
_ = json.NewEncoder(w).Encode(payload)
|
||||
}
|
||||
|
||||
// ToBool loosely converts an interface value to bool.
|
||||
func ToBool(v any) bool {
|
||||
if b, ok := v.(bool); ok {
|
||||
return b
|
||||
}
|
||||
return false
|
||||
}
|
||||
|
||||
// IntFrom converts a JSON-decoded numeric value (float64, int, int64) to int.
|
||||
func IntFrom(v any) int {
|
||||
switch n := v.(type) {
|
||||
case float64:
|
||||
return int(n)
|
||||
case int:
|
||||
return n
|
||||
case int64:
|
||||
return int(n)
|
||||
default:
|
||||
return 0
|
||||
}
|
||||
}
|
||||
141
internal/util/messages.go
Normal file
141
internal/util/messages.go
Normal file
@@ -0,0 +1,141 @@
|
||||
package util
|
||||
|
||||
import (
|
||||
"regexp"
|
||||
"strings"
|
||||
|
||||
"ds2api/internal/config"
|
||||
)
|
||||
|
||||
var markdownImagePattern = regexp.MustCompile(`!\[(.*?)\]\((.*?)\)`)
|
||||
|
||||
const ClaudeDefaultModel = "claude-sonnet-4-5"
|
||||
|
||||
type Message struct {
|
||||
Role string `json:"role"`
|
||||
Content any `json:"content"`
|
||||
}
|
||||
|
||||
func MessagesPrepare(messages []map[string]any) string {
|
||||
type block struct {
|
||||
Role string
|
||||
Text string
|
||||
}
|
||||
processed := make([]block, 0, len(messages))
|
||||
for _, m := range messages {
|
||||
role, _ := m["role"].(string)
|
||||
text := normalizeContent(m["content"])
|
||||
processed = append(processed, block{Role: role, Text: text})
|
||||
}
|
||||
if len(processed) == 0 {
|
||||
return ""
|
||||
}
|
||||
merged := make([]block, 0, len(processed))
|
||||
for _, msg := range processed {
|
||||
if len(merged) > 0 && merged[len(merged)-1].Role == msg.Role {
|
||||
merged[len(merged)-1].Text += "\n\n" + msg.Text
|
||||
continue
|
||||
}
|
||||
merged = append(merged, msg)
|
||||
}
|
||||
parts := make([]string, 0, len(merged))
|
||||
for i, m := range merged {
|
||||
switch m.Role {
|
||||
case "assistant":
|
||||
parts = append(parts, "<|Assistant|>"+m.Text+"<|end▁of▁sentence|>")
|
||||
case "user", "system":
|
||||
if i > 0 {
|
||||
parts = append(parts, "<|User|>"+m.Text)
|
||||
} else {
|
||||
parts = append(parts, m.Text)
|
||||
}
|
||||
default:
|
||||
parts = append(parts, m.Text)
|
||||
}
|
||||
}
|
||||
out := strings.Join(parts, "")
|
||||
return markdownImagePattern.ReplaceAllString(out, `[${1}](${2})`)
|
||||
}
|
||||
|
||||
func normalizeContent(v any) string {
|
||||
switch x := v.(type) {
|
||||
case string:
|
||||
return x
|
||||
case []any:
|
||||
parts := make([]string, 0, len(x))
|
||||
for _, item := range x {
|
||||
m, ok := item.(map[string]any)
|
||||
if !ok {
|
||||
continue
|
||||
}
|
||||
if m["type"] == "text" {
|
||||
if txt, ok := m["text"].(string); ok {
|
||||
parts = append(parts, txt)
|
||||
}
|
||||
}
|
||||
}
|
||||
return strings.Join(parts, "\n")
|
||||
default:
|
||||
return ""
|
||||
}
|
||||
}
|
||||
|
||||
func ConvertClaudeToDeepSeek(claudeReq map[string]any, store *config.Store) map[string]any {
|
||||
messages, _ := claudeReq["messages"].([]any)
|
||||
model, _ := claudeReq["model"].(string)
|
||||
if model == "" {
|
||||
model = ClaudeDefaultModel
|
||||
}
|
||||
mapping := store.ClaudeMapping()
|
||||
dsModel := mapping["fast"]
|
||||
if dsModel == "" {
|
||||
dsModel = "deepseek-chat"
|
||||
}
|
||||
modelLower := strings.ToLower(model)
|
||||
if strings.Contains(modelLower, "opus") || strings.Contains(modelLower, "reasoner") || strings.Contains(modelLower, "slow") {
|
||||
if slow := mapping["slow"]; slow != "" {
|
||||
dsModel = slow
|
||||
}
|
||||
}
|
||||
convertedMessages := make([]any, 0, len(messages)+1)
|
||||
if system, ok := claudeReq["system"].(string); ok && system != "" {
|
||||
convertedMessages = append(convertedMessages, map[string]any{"role": "system", "content": system})
|
||||
}
|
||||
convertedMessages = append(convertedMessages, messages...)
|
||||
|
||||
out := map[string]any{"model": dsModel, "messages": convertedMessages}
|
||||
for _, k := range []string{"temperature", "top_p", "stream"} {
|
||||
if v, ok := claudeReq[k]; ok {
|
||||
out[k] = v
|
||||
}
|
||||
}
|
||||
if stopSeq, ok := claudeReq["stop_sequences"]; ok {
|
||||
out["stop"] = stopSeq
|
||||
}
|
||||
return out
|
||||
}
|
||||
|
||||
// EstimateTokens provides a rough token count approximation.
|
||||
// For ASCII text (English, code, etc.) we use ~4 chars per token.
|
||||
// For non-ASCII text (Chinese, Japanese, Korean, etc.) we use ~1.3 chars per token,
|
||||
// which better reflects typical BPE tokenizer behavior for CJK scripts.
|
||||
func EstimateTokens(text string) int {
|
||||
if text == "" {
|
||||
return 0
|
||||
}
|
||||
asciiChars := 0
|
||||
nonASCIIChars := 0
|
||||
for _, r := range text {
|
||||
if r < 128 {
|
||||
asciiChars++
|
||||
} else {
|
||||
nonASCIIChars++
|
||||
}
|
||||
}
|
||||
// ASCII: ~4 chars per token; non-ASCII (CJK): ~1.3 chars per token
|
||||
n := asciiChars/4 + (nonASCIIChars*10+7)/13
|
||||
if n < 1 {
|
||||
return 1
|
||||
}
|
||||
return n
|
||||
}
|
||||
69
internal/util/messages_test.go
Normal file
69
internal/util/messages_test.go
Normal file
@@ -0,0 +1,69 @@
|
||||
package util
|
||||
|
||||
import (
|
||||
"testing"
|
||||
|
||||
"ds2api/internal/config"
|
||||
)
|
||||
|
||||
func TestMessagesPrepareBasic(t *testing.T) {
|
||||
messages := []map[string]any{{"role": "user", "content": "Hello"}}
|
||||
got := MessagesPrepare(messages)
|
||||
if got == "" {
|
||||
t.Fatal("expected non-empty prompt")
|
||||
}
|
||||
if got != "Hello" {
|
||||
t.Fatalf("unexpected prompt: %q", got)
|
||||
}
|
||||
}
|
||||
|
||||
func TestMessagesPrepareRoles(t *testing.T) {
|
||||
messages := []map[string]any{
|
||||
{"role": "system", "content": "You are helper"},
|
||||
{"role": "user", "content": "Hi"},
|
||||
{"role": "assistant", "content": "Hello"},
|
||||
{"role": "user", "content": "How are you"},
|
||||
}
|
||||
got := MessagesPrepare(messages)
|
||||
if !contains(got, "<|Assistant|>") {
|
||||
t.Fatalf("expected assistant marker in %q", got)
|
||||
}
|
||||
if !contains(got, "<|User|>") {
|
||||
t.Fatalf("expected user marker in %q", got)
|
||||
}
|
||||
}
|
||||
|
||||
func TestConvertClaudeToDeepSeek(t *testing.T) {
|
||||
store := config.LoadStore()
|
||||
req := map[string]any{
|
||||
"model": "claude-opus-4-6",
|
||||
"messages": []any{map[string]any{"role": "user", "content": "Hi"}},
|
||||
"system": "You are helpful",
|
||||
"stream": true,
|
||||
}
|
||||
out := ConvertClaudeToDeepSeek(req, store)
|
||||
if out["model"] == "" {
|
||||
t.Fatal("expected mapped model")
|
||||
}
|
||||
msgs, ok := out["messages"].([]any)
|
||||
if !ok || len(msgs) == 0 {
|
||||
t.Fatal("expected messages")
|
||||
}
|
||||
first, _ := msgs[0].(map[string]any)
|
||||
if first["role"] != "system" {
|
||||
t.Fatalf("expected first message system, got %#v", first)
|
||||
}
|
||||
}
|
||||
|
||||
func contains(s, sub string) bool {
|
||||
return len(s) >= len(sub) && (s == sub || len(sub) == 0 || (len(s) > 0 && (indexOf(s, sub) >= 0)))
|
||||
}
|
||||
|
||||
func indexOf(s, sub string) int {
|
||||
for i := 0; i+len(sub) <= len(s); i++ {
|
||||
if s[i:i+len(sub)] == sub {
|
||||
return i
|
||||
}
|
||||
}
|
||||
return -1
|
||||
}
|
||||
317
internal/util/toolcalls.go
Normal file
317
internal/util/toolcalls.go
Normal file
@@ -0,0 +1,317 @@
|
||||
package util
|
||||
|
||||
import (
|
||||
"encoding/json"
|
||||
"regexp"
|
||||
"strings"
|
||||
|
||||
"github.com/google/uuid"
|
||||
)
|
||||
|
||||
var toolCallPattern = regexp.MustCompile(`\{\s*["']tool_calls["']\s*:\s*\[(.*?)\]\s*\}`)
|
||||
var fencedJSONPattern = regexp.MustCompile("(?s)```(?:json)?\\s*(.*?)\\s*```")
|
||||
|
||||
type ParsedToolCall struct {
|
||||
Name string `json:"name"`
|
||||
Input map[string]any `json:"input"`
|
||||
}
|
||||
|
||||
func ParseToolCalls(text string, availableToolNames []string) []ParsedToolCall {
|
||||
if strings.TrimSpace(text) == "" {
|
||||
return nil
|
||||
}
|
||||
|
||||
candidates := buildToolCallCandidates(text)
|
||||
var parsed []ParsedToolCall
|
||||
for _, candidate := range candidates {
|
||||
if tc := parseToolCallsPayload(candidate); len(tc) > 0 {
|
||||
parsed = tc
|
||||
break
|
||||
}
|
||||
}
|
||||
if len(parsed) == 0 {
|
||||
return nil
|
||||
}
|
||||
|
||||
allowed := map[string]struct{}{}
|
||||
for _, name := range availableToolNames {
|
||||
allowed[name] = struct{}{}
|
||||
}
|
||||
out := make([]ParsedToolCall, 0, len(parsed))
|
||||
for _, tc := range parsed {
|
||||
if tc.Name == "" {
|
||||
continue
|
||||
}
|
||||
if len(allowed) > 0 {
|
||||
if _, ok := allowed[tc.Name]; !ok {
|
||||
continue
|
||||
}
|
||||
}
|
||||
if tc.Input == nil {
|
||||
tc.Input = map[string]any{}
|
||||
}
|
||||
out = append(out, tc)
|
||||
}
|
||||
// If the model clearly emitted tool_calls JSON but all names are outside the
|
||||
// declared set, keep the parsed calls as a fallback so upper layers can still
|
||||
// intercept structured tool output instead of leaking raw JSON to users.
|
||||
if len(out) == 0 && len(parsed) > 0 {
|
||||
for _, tc := range parsed {
|
||||
if tc.Name == "" {
|
||||
continue
|
||||
}
|
||||
if tc.Input == nil {
|
||||
tc.Input = map[string]any{}
|
||||
}
|
||||
out = append(out, tc)
|
||||
}
|
||||
}
|
||||
return out
|
||||
}
|
||||
|
||||
func buildToolCallCandidates(text string) []string {
|
||||
trimmed := strings.TrimSpace(text)
|
||||
candidates := []string{trimmed}
|
||||
|
||||
// fenced code block candidates: ```json ... ```
|
||||
for _, match := range fencedJSONPattern.FindAllStringSubmatch(trimmed, -1) {
|
||||
if len(match) >= 2 {
|
||||
candidates = append(candidates, strings.TrimSpace(match[1]))
|
||||
}
|
||||
}
|
||||
|
||||
// best-effort extraction around "tool_calls" key in mixed text payloads.
|
||||
candidates = append(candidates, extractToolCallObjects(trimmed)...)
|
||||
|
||||
// best-effort object slice: from first '{' to last '}'
|
||||
first := strings.Index(trimmed, "{")
|
||||
last := strings.LastIndex(trimmed, "}")
|
||||
if first >= 0 && last > first {
|
||||
candidates = append(candidates, strings.TrimSpace(trimmed[first:last+1]))
|
||||
}
|
||||
|
||||
// legacy regex extraction fallback
|
||||
if m := toolCallPattern.FindStringSubmatch(trimmed); len(m) >= 2 {
|
||||
candidates = append(candidates, "{"+`"tool_calls":[`+m[1]+"]}")
|
||||
}
|
||||
|
||||
uniq := make([]string, 0, len(candidates))
|
||||
seen := map[string]struct{}{}
|
||||
for _, c := range candidates {
|
||||
if c == "" {
|
||||
continue
|
||||
}
|
||||
if _, ok := seen[c]; ok {
|
||||
continue
|
||||
}
|
||||
seen[c] = struct{}{}
|
||||
uniq = append(uniq, c)
|
||||
}
|
||||
return uniq
|
||||
}
|
||||
|
||||
func parseToolCallsPayload(payload string) []ParsedToolCall {
|
||||
var decoded any
|
||||
if err := json.Unmarshal([]byte(payload), &decoded); err != nil {
|
||||
return nil
|
||||
}
|
||||
switch v := decoded.(type) {
|
||||
case map[string]any:
|
||||
if tc, ok := v["tool_calls"]; ok {
|
||||
return parseToolCallList(tc)
|
||||
}
|
||||
if parsed, ok := parseToolCallItem(v); ok {
|
||||
return []ParsedToolCall{parsed}
|
||||
}
|
||||
case []any:
|
||||
return parseToolCallList(v)
|
||||
}
|
||||
return nil
|
||||
}
|
||||
|
||||
func parseToolCallList(v any) []ParsedToolCall {
|
||||
items, ok := v.([]any)
|
||||
if !ok {
|
||||
return nil
|
||||
}
|
||||
out := make([]ParsedToolCall, 0, len(items))
|
||||
for _, item := range items {
|
||||
m, ok := item.(map[string]any)
|
||||
if !ok {
|
||||
continue
|
||||
}
|
||||
if tc, ok := parseToolCallItem(m); ok {
|
||||
out = append(out, tc)
|
||||
}
|
||||
}
|
||||
if len(out) == 0 {
|
||||
return nil
|
||||
}
|
||||
return out
|
||||
}
|
||||
|
||||
func parseToolCallItem(m map[string]any) (ParsedToolCall, bool) {
|
||||
name, _ := m["name"].(string)
|
||||
inputRaw, hasInput := m["input"]
|
||||
if fn, ok := m["function"].(map[string]any); ok {
|
||||
if name == "" {
|
||||
name, _ = fn["name"].(string)
|
||||
}
|
||||
if !hasInput {
|
||||
if v, ok := fn["arguments"]; ok {
|
||||
inputRaw = v
|
||||
hasInput = true
|
||||
}
|
||||
}
|
||||
}
|
||||
if !hasInput {
|
||||
for _, key := range []string{"arguments", "args", "parameters", "params"} {
|
||||
if v, ok := m[key]; ok {
|
||||
inputRaw = v
|
||||
hasInput = true
|
||||
break
|
||||
}
|
||||
}
|
||||
}
|
||||
if strings.TrimSpace(name) == "" {
|
||||
return ParsedToolCall{}, false
|
||||
}
|
||||
return ParsedToolCall{
|
||||
Name: strings.TrimSpace(name),
|
||||
Input: parseToolCallInput(inputRaw),
|
||||
}, true
|
||||
}
|
||||
|
||||
// parseToolCallInput normalizes an arbitrary "arguments" value into a map.
// Strings are parsed as JSON objects (falling back to {"_raw": s} when they
// are not an object); other values are round-tripped through JSON; anything
// that cannot become an object yields an empty map. Never returns nil.
func parseToolCallInput(v any) map[string]any {
	switch val := v.(type) {
	case nil:
		return map[string]any{}
	case map[string]any:
		return val
	case string:
		trimmed := strings.TrimSpace(val)
		if trimmed == "" {
			return map[string]any{}
		}
		var obj map[string]any
		if json.Unmarshal([]byte(trimmed), &obj) == nil && obj != nil {
			return obj
		}
		// Not a JSON object: preserve the raw text so callers can inspect it.
		return map[string]any{"_raw": trimmed}
	default:
		encoded, err := json.Marshal(val)
		if err != nil {
			return map[string]any{}
		}
		var obj map[string]any
		if json.Unmarshal(encoded, &obj) == nil && obj != nil {
			return obj
		}
		return map[string]any{}
	}
}
|
||||
|
||||
func extractToolCallObjects(text string) []string {
|
||||
if text == "" {
|
||||
return nil
|
||||
}
|
||||
lower := strings.ToLower(text)
|
||||
out := []string{}
|
||||
offset := 0
|
||||
for {
|
||||
idx := strings.Index(lower[offset:], "tool_calls")
|
||||
if idx < 0 {
|
||||
break
|
||||
}
|
||||
idx += offset
|
||||
start := strings.LastIndex(text[:idx], "{")
|
||||
for start >= 0 {
|
||||
candidate, end, ok := extractJSONObject(text, start)
|
||||
if ok {
|
||||
// Move forward to avoid repeatedly matching the same object.
|
||||
offset = end
|
||||
out = append(out, strings.TrimSpace(candidate))
|
||||
break
|
||||
}
|
||||
start = strings.LastIndex(text[:start], "{")
|
||||
}
|
||||
if start < 0 {
|
||||
offset = idx + len("tool_calls")
|
||||
}
|
||||
}
|
||||
return out
|
||||
}
|
||||
|
||||
// extractJSONObject returns the balanced {...} substring of text starting at
// index start, the index just past its closing brace, and true on success.
// String literals delimited by single or double quotes are honored, including
// backslash escapes, so braces inside strings do not affect nesting depth.
// Returns false when start does not point at '{' or the object never closes.
func extractJSONObject(text string, start int) (string, int, bool) {
	if start < 0 || start >= len(text) || text[start] != '{' {
		return "", 0, false
	}
	var (
		depth   int
		inQuote byte
		escape  bool
	)
	for pos := start; pos < len(text); pos++ {
		c := text[pos]
		if inQuote != 0 {
			switch {
			case escape:
				escape = false
			case c == '\\':
				escape = true
			case c == inQuote:
				inQuote = 0
			}
			continue
		}
		switch c {
		case '"', '\'':
			inQuote = c
		case '{':
			depth++
		case '}':
			depth--
			if depth == 0 {
				return text[start : pos+1], pos + 1, true
			}
		}
	}
	return "", 0, false
}
|
||||
|
||||
func FormatOpenAIToolCalls(calls []ParsedToolCall) []map[string]any {
|
||||
out := make([]map[string]any, 0, len(calls))
|
||||
for _, c := range calls {
|
||||
args, _ := json.Marshal(c.Input)
|
||||
out = append(out, map[string]any{
|
||||
"id": "call_" + strings.ReplaceAll(uuid.NewString(), "-", ""),
|
||||
"type": "function",
|
||||
"function": map[string]any{
|
||||
"name": c.Name,
|
||||
"arguments": string(args),
|
||||
},
|
||||
})
|
||||
}
|
||||
return out
|
||||
}
|
||||
|
||||
func FormatOpenAIStreamToolCalls(calls []ParsedToolCall) []map[string]any {
|
||||
out := make([]map[string]any, 0, len(calls))
|
||||
for i, c := range calls {
|
||||
args, _ := json.Marshal(c.Input)
|
||||
out = append(out, map[string]any{
|
||||
"index": i,
|
||||
"id": "call_" + strings.ReplaceAll(uuid.NewString(), "-", ""),
|
||||
"type": "function",
|
||||
"function": map[string]any{
|
||||
"name": c.Name,
|
||||
"arguments": string(args),
|
||||
},
|
||||
})
|
||||
}
|
||||
return out
|
||||
}
|
||||
64
internal/util/toolcalls_test.go
Normal file
64
internal/util/toolcalls_test.go
Normal file
@@ -0,0 +1,64 @@
|
||||
package util
|
||||
|
||||
import "testing"
|
||||
|
||||
// TestParseToolCalls verifies that a tool_calls JSON object embedded in
// surrounding prose is detected and its name/arguments parsed.
func TestParseToolCalls(t *testing.T) {
	text := `prefix {"tool_calls":[{"name":"search","input":{"q":"golang"}}]} suffix`
	calls := ParseToolCalls(text, []string{"search"})
	if len(calls) != 1 {
		t.Fatalf("expected 1 call, got %d", len(calls))
	}
	if calls[0].Name != "search" {
		t.Fatalf("unexpected tool name: %s", calls[0].Name)
	}
	if calls[0].Input["q"] != "golang" {
		t.Fatalf("unexpected args: %#v", calls[0].Input)
	}
}
|
||||
|
||||
// TestParseToolCallsFromFencedJSON verifies parsing when the tool_calls JSON
// is wrapped in a markdown ```json fenced code block, as models often emit.
func TestParseToolCallsFromFencedJSON(t *testing.T) {
	text := "I will call tools now\n```json\n{\"tool_calls\":[{\"name\":\"search\",\"input\":{\"q\":\"news\"}}]}\n```"
	calls := ParseToolCalls(text, []string{"search"})
	if len(calls) != 1 {
		t.Fatalf("expected 1 call, got %d", len(calls))
	}
	if calls[0].Input["q"] != "news" {
		t.Fatalf("unexpected args: %#v", calls[0].Input)
	}
}
|
||||
|
||||
// TestParseToolCallsWithFunctionArgumentsString covers the OpenAI-style
// nested {"function":{"name","arguments"}} shape where arguments is a
// JSON-encoded string rather than an object.
func TestParseToolCallsWithFunctionArgumentsString(t *testing.T) {
	text := `{"tool_calls":[{"function":{"name":"get_weather","arguments":"{\"city\":\"beijing\"}"}}]}`
	calls := ParseToolCalls(text, []string{"get_weather"})
	if len(calls) != 1 {
		t.Fatalf("expected 1 call, got %d", len(calls))
	}
	if calls[0].Name != "get_weather" {
		t.Fatalf("unexpected tool name: %s", calls[0].Name)
	}
	if calls[0].Input["city"] != "beijing" {
		t.Fatalf("unexpected args: %#v", calls[0].Input)
	}
}
|
||||
|
||||
// TestParseToolCallsKeepsUnknownAsFallback verifies that a tool name absent
// from the allowed list is still returned rather than silently dropped.
func TestParseToolCallsKeepsUnknownAsFallback(t *testing.T) {
	text := `{"tool_calls":[{"name":"unknown","input":{}}]}`
	calls := ParseToolCalls(text, []string{"search"})
	if len(calls) != 1 {
		t.Fatalf("expected fallback 1 call, got %d", len(calls))
	}
	if calls[0].Name != "unknown" {
		t.Fatalf("unexpected name: %s", calls[0].Name)
	}
}
|
||||
|
||||
// TestFormatOpenAIToolCalls verifies the OpenAI tool_calls output shape:
// one entry per parsed call with a nested "function" object carrying the name.
func TestFormatOpenAIToolCalls(t *testing.T) {
	formatted := FormatOpenAIToolCalls([]ParsedToolCall{{Name: "search", Input: map[string]any{"q": "x"}}})
	if len(formatted) != 1 {
		t.Fatalf("expected 1, got %d", len(formatted))
	}
	fn, _ := formatted[0]["function"].(map[string]any)
	if fn["name"] != "search" {
		t.Fatalf("unexpected function name: %#v", fn)
	}
}
|
||||
103
internal/webui/build.go
Normal file
103
internal/webui/build.go
Normal file
@@ -0,0 +1,103 @@
|
||||
package webui
|
||||
|
||||
import (
|
||||
"context"
|
||||
"errors"
|
||||
"fmt"
|
||||
"os"
|
||||
"os/exec"
|
||||
"path/filepath"
|
||||
"strings"
|
||||
"time"
|
||||
|
||||
"ds2api/internal/config"
|
||||
)
|
||||
|
||||
const (
|
||||
defaultBuildTimeout = 5 * time.Minute
|
||||
)
|
||||
|
||||
// EnsureBuiltOnStartup builds the admin WebUI at process start when auto-build
// is enabled and no built index.html is already present. Build failures are
// logged as warnings only; the server continues to run without the UI.
func EnsureBuiltOnStartup() {
	if !shouldAutoBuild() {
		return
	}
	staticDir := resolveStaticAdminDir(config.StaticAdminDir())
	if hasBuiltUI(staticDir) {
		return
	}
	if err := buildWebUI(staticDir); err != nil {
		config.Logger.Warn("[webui] auto build failed", "error", err)
		return
	}
	// Re-check the output: npm can exit 0 without producing index.html.
	if hasBuiltUI(staticDir) {
		config.Logger.Info("[webui] auto build completed", "dir", staticDir)
		return
	}
	config.Logger.Warn("[webui] auto build finished but output missing", "dir", staticDir)
}
|
||||
|
||||
func shouldAutoBuild() bool {
|
||||
raw := strings.TrimSpace(os.Getenv("DS2API_AUTO_BUILD_WEBUI"))
|
||||
if raw == "" {
|
||||
return !config.IsVercel()
|
||||
}
|
||||
switch strings.ToLower(raw) {
|
||||
case "1", "true", "yes", "on":
|
||||
return true
|
||||
case "0", "false", "no", "off":
|
||||
return false
|
||||
default:
|
||||
return !config.IsVercel()
|
||||
}
|
||||
}
|
||||
|
||||
func hasBuiltUI(staticDir string) bool {
|
||||
if strings.TrimSpace(staticDir) == "" {
|
||||
return false
|
||||
}
|
||||
indexPath := filepath.Join(staticDir, "index.html")
|
||||
st, err := os.Stat(indexPath)
|
||||
return err == nil && !st.IsDir()
|
||||
}
|
||||
|
||||
// buildWebUI compiles the webui frontend into staticDir via npm. It runs
// `npm ci` first only when webui/node_modules is absent, then `npm run build`
// with Vite's --outDir pointed at staticDir. Both steps share one timeout
// (defaultBuildTimeout) and stream their output to this process's
// stdout/stderr. NOTE(review): relative paths ("webui", "webui/node_modules")
// assume the process CWD is the repository root — confirm for deployments
// that start the binary elsewhere.
func buildWebUI(staticDir string) error {
	if _, err := exec.LookPath("npm"); err != nil {
		return fmt.Errorf("npm not found in PATH: %w", err)
	}
	if strings.TrimSpace(staticDir) == "" {
		return errors.New("static admin dir is empty")
	}

	config.Logger.Info("[webui] static files missing, running npm build")
	ctx, cancel := context.WithTimeout(context.Background(), defaultBuildTimeout)
	defer cancel()

	// Install dependencies only when node_modules is missing entirely;
	// any other stat error is propagated.
	if _, err := os.Stat(filepath.Join("webui", "node_modules")); err != nil {
		if !os.IsNotExist(err) {
			return err
		}
		installCmd := exec.CommandContext(ctx, "npm", "ci", "--prefix", "webui")
		installCmd.Stdout = os.Stdout
		installCmd.Stderr = os.Stderr
		if err := installCmd.Run(); err != nil {
			// Distinguish a timeout from an ordinary npm failure.
			if errors.Is(ctx.Err(), context.DeadlineExceeded) {
				return fmt.Errorf("webui npm ci timed out after %s", defaultBuildTimeout)
			}
			return err
		}
	}

	if err := os.MkdirAll(staticDir, 0o755); err != nil {
		return err
	}
	cmd := exec.CommandContext(ctx, "npm", "run", "build", "--prefix", "webui", "--", "--outDir", staticDir, "--emptyOutDir")
	cmd.Stdout = os.Stdout
	cmd.Stderr = os.Stderr
	if err := cmd.Run(); err != nil {
		if errors.Is(ctx.Err(), context.DeadlineExceeded) {
			return fmt.Errorf("webui build timed out after %s", defaultBuildTimeout)
		}
		return err
	}
	return nil
}
|
||||
121
internal/webui/handler.go
Normal file
121
internal/webui/handler.go
Normal file
@@ -0,0 +1,121 @@
|
||||
package webui
|
||||
|
||||
import (
|
||||
"net/http"
|
||||
"os"
|
||||
"path/filepath"
|
||||
"strings"
|
||||
|
||||
"github.com/go-chi/chi/v5"
|
||||
|
||||
"ds2api/internal/config"
|
||||
)
|
||||
|
||||
// welcomeHTML is the minimal landing page served at "/"; it links to the
// admin panel, the models endpoint, and the project repository.
const welcomeHTML = `<!DOCTYPE html>
<html lang="zh-CN"><head><meta charset="UTF-8"><meta name="viewport" content="width=device-width, initial-scale=1.0"><title>DS2API</title>
<style>body{font-family:Inter,system-ui,sans-serif;background:#030712;color:#f9fafb;display:flex;min-height:100vh;align-items:center;justify-content:center;margin:0}a{color:#f59e0b;text-decoration:none}main{max-width:700px;padding:24px;text-align:center}h1{font-size:48px;margin:0 0 12px}.links{display:flex;gap:16px;justify-content:center;margin-top:20px;flex-wrap:wrap}</style>
</head><body><main><h1>DS2API</h1><p>DeepSeek to OpenAI & Claude Compatible API</p><div class="links"><a href="/admin">管理面板</a><a href="/v1/models">API 状态</a><a href="https://github.com/CJackHwang/ds2api" target="_blank">GitHub</a></div></main></body></html>`

// Handler serves the landing page and the built admin WebUI.
type Handler struct {
	// StaticDir is the directory holding the built admin assets.
	StaticDir string
}
|
||||
|
||||
// NewHandler constructs a Handler whose StaticDir is resolved from the
// configured admin directory and its common fallback locations.
func NewHandler() *Handler {
	return &Handler{StaticDir: resolveStaticAdminDir(config.StaticAdminDir())}
}
|
||||
|
||||
// RegisterRoutes mounts the landing page and the admin SPA entry route on r.
func RegisterRoutes(r chi.Router, h *Handler) {
	r.Get("/", h.index)
	r.Get("/admin", h.admin)
}
|
||||
|
||||
func (h *Handler) HandleAdminFallback(w http.ResponseWriter, r *http.Request) bool {
|
||||
if r.Method != http.MethodGet {
|
||||
return false
|
||||
}
|
||||
if !strings.HasPrefix(r.URL.Path, "/admin/") {
|
||||
return false
|
||||
}
|
||||
h.admin(w, r)
|
||||
return true
|
||||
}
|
||||
|
||||
func (h *Handler) index(w http.ResponseWriter, _ *http.Request) {
|
||||
w.Header().Set("Content-Type", "text/html; charset=utf-8")
|
||||
w.WriteHeader(http.StatusOK)
|
||||
_, _ = w.Write([]byte(welcomeHTML))
|
||||
}
|
||||
|
||||
func (h *Handler) admin(w http.ResponseWriter, r *http.Request) {
|
||||
staticDir := resolveStaticAdminDir(h.StaticDir)
|
||||
if fi, err := os.Stat(staticDir); err == nil && fi.IsDir() {
|
||||
h.serveFromDisk(w, r, staticDir)
|
||||
return
|
||||
}
|
||||
http.Error(w, "WebUI not built. Run `cd webui && npm run build` first.", http.StatusNotFound)
|
||||
}
|
||||
|
||||
func (h *Handler) serveFromDisk(w http.ResponseWriter, r *http.Request, staticDir string) {
|
||||
path := strings.TrimPrefix(r.URL.Path, "/admin")
|
||||
path = strings.TrimPrefix(path, "/")
|
||||
if path != "" && strings.Contains(path, ".") {
|
||||
full := filepath.Join(staticDir, filepath.Clean(path))
|
||||
if !strings.HasPrefix(full, staticDir) {
|
||||
http.NotFound(w, r)
|
||||
return
|
||||
}
|
||||
if _, err := os.Stat(full); err == nil {
|
||||
if strings.HasPrefix(path, "assets/") {
|
||||
w.Header().Set("Cache-Control", "public, max-age=31536000, immutable")
|
||||
} else {
|
||||
w.Header().Set("Cache-Control", "no-store, must-revalidate")
|
||||
}
|
||||
http.ServeFile(w, r, full)
|
||||
return
|
||||
}
|
||||
http.NotFound(w, r)
|
||||
return
|
||||
}
|
||||
index := filepath.Join(staticDir, "index.html")
|
||||
if _, err := os.Stat(index); err != nil {
|
||||
http.Error(w, "index.html not found", http.StatusNotFound)
|
||||
return
|
||||
}
|
||||
w.Header().Set("Cache-Control", "no-store, must-revalidate")
|
||||
http.ServeFile(w, r, index)
|
||||
}
|
||||
|
||||
// resolveStaticAdminDir picks the directory to serve the built admin UI from.
// When DS2API_STATIC_ADMIN_DIR is set, the preferred path is used verbatim
// (cleaned). Otherwise a priority-ordered candidate list is probed — the
// preferred path, ./static/admin relative to the CWD, relative to the
// executable and its parent, and common serverless bundle locations — and the
// first existing directory wins. If none exists, the cleaned preferred path
// is returned anyway so callers get a deterministic value.
func resolveStaticAdminDir(preferred string) string {
	if strings.TrimSpace(os.Getenv("DS2API_STATIC_ADMIN_DIR")) != "" {
		return filepath.Clean(preferred)
	}
	candidates := []string{preferred}
	if wd, err := os.Getwd(); err == nil {
		candidates = append(candidates, filepath.Join(wd, "static/admin"))
	}
	if exe, err := os.Executable(); err == nil {
		exeDir := filepath.Dir(exe)
		candidates = append(candidates,
			filepath.Join(exeDir, "static/admin"),
			filepath.Join(filepath.Dir(exeDir), "static/admin"),
		)
	}
	// Common serverless locations.
	candidates = append(candidates, "/var/task/static/admin", "/var/task/user/static/admin")

	// Deduplicate while preserving priority order; return the first existing
	// directory.
	seen := map[string]struct{}{}
	for _, c := range candidates {
		c = filepath.Clean(strings.TrimSpace(c))
		if c == "" {
			continue
		}
		if _, ok := seen[c]; ok {
			continue
		}
		seen[c] = struct{}{}
		if fi, err := os.Stat(c); err == nil && fi.IsDir() {
			return c
		}
	}
	return filepath.Clean(preferred)
}
|
||||
22
opencode.json.example
Normal file
22
opencode.json.example
Normal file
@@ -0,0 +1,22 @@
|
||||
{
|
||||
"$schema": "https://opencode.ai/config.json",
|
||||
"provider": {
|
||||
"ds2api": {
|
||||
"npm": "@ai-sdk/openai-compatible",
|
||||
"name": "DS2API",
|
||||
"options": {
|
||||
"baseURL": "http://localhost:5001/v1",
|
||||
"apiKey": "your-api-key"
|
||||
},
|
||||
"models": {
|
||||
"deepseek-chat": {
|
||||
"name": "DeepSeek Chat (DS2API)"
|
||||
},
|
||||
"deepseek-reasoner": {
|
||||
"name": "DeepSeek Reasoner (DS2API)"
|
||||
}
|
||||
}
|
||||
}
|
||||
},
|
||||
"model": "ds2api/deepseek-chat"
|
||||
}
|
||||
@@ -1,24 +0,0 @@
|
||||
# DS2API 依赖
|
||||
# 安装命令: pip install -r requirements.txt
|
||||
# Python 版本要求: >=3.9
|
||||
|
||||
# ===== Web 框架 =====
|
||||
fastapi>=0.110.0,<1.0.0
|
||||
uvicorn[standard]>=0.24.0,<1.0.0
|
||||
|
||||
# ===== HTTP 客户端 =====
|
||||
# curl_cffi: 支持 TLS 指纹模拟,绕过 Cloudflare 等防护
|
||||
curl_cffi>=0.7.0
|
||||
# httpx: 异步 HTTP 客户端,用于 Vercel API 调用
|
||||
httpx>=0.25.0
|
||||
|
||||
# ===== 模板引擎 =====
|
||||
jinja2>=3.1.0,<4.0.0
|
||||
|
||||
# ===== Tokenizer =====
|
||||
# 用于 token 计数(可选,不安装则使用估算方式)
|
||||
transformers>=4.39.0,<5.0.0
|
||||
|
||||
# ===== WASM 运行时 =====
|
||||
# 用于 DeepSeek PoW (Proof of Work) 计算
|
||||
wasmtime>=14.0.0
|
||||
@@ -1 +0,0 @@
|
||||
# DS2API Routes
|
||||
@@ -1,20 +0,0 @@
|
||||
# -*- coding: utf-8 -*-
|
||||
"""Admin 路由模块 - 合并所有子模块路由"""
|
||||
from fastapi import APIRouter
|
||||
|
||||
from .auth import router as auth_router, verify_admin, ADMIN_KEY
|
||||
from .config import router as config_router
|
||||
from .accounts import router as accounts_router
|
||||
from .vercel import router as vercel_router
|
||||
|
||||
# 创建主路由
|
||||
router = APIRouter(prefix="/admin", tags=["admin"])
|
||||
|
||||
# 包含所有子路由
|
||||
router.include_router(auth_router)
|
||||
router.include_router(config_router)
|
||||
router.include_router(accounts_router)
|
||||
router.include_router(vercel_router)
|
||||
|
||||
# 导出常用依赖
|
||||
__all__ = ["router", "verify_admin", "ADMIN_KEY"]
|
||||
@@ -1,342 +0,0 @@
|
||||
# -*- coding: utf-8 -*-
|
||||
"""Admin 账号管理模块 - 账号测试与导入"""
|
||||
import asyncio
|
||||
import json
|
||||
import base64
|
||||
|
||||
from fastapi import APIRouter, HTTPException, Request, Depends
|
||||
from fastapi.responses import JSONResponse
|
||||
|
||||
from core.config import CONFIG, save_config, logger, WASM_PATH
|
||||
from core.auth import init_account_queue, get_account_identifier
|
||||
from core.deepseek import (
|
||||
login_deepseek_via_account,
|
||||
DEEPSEEK_CREATE_SESSION_URL,
|
||||
DEEPSEEK_COMPLETION_URL,
|
||||
BASE_HEADERS,
|
||||
)
|
||||
from core.pow import compute_pow_answer
|
||||
from core.models import get_model_config
|
||||
from core.sse_parser import parse_sse_chunk_for_content
|
||||
|
||||
from .auth import verify_admin
|
||||
|
||||
router = APIRouter()
|
||||
|
||||
|
||||
# ----------------------------------------------------------------------
|
||||
# 账号 API 测试
|
||||
# ----------------------------------------------------------------------
|
||||
async def test_account_api(account: dict, model: str = "deepseek-chat", message: str = "") -> dict:
|
||||
"""测试单个账号的 API 调用能力
|
||||
|
||||
如果提供 message,会发送实际请求并返回 AI 回复;
|
||||
否则只快速测试创建会话。
|
||||
"""
|
||||
from curl_cffi import requests as cffi_requests
|
||||
import time
|
||||
|
||||
acc_id = get_account_identifier(account)
|
||||
result = {
|
||||
"account": acc_id,
|
||||
"success": False,
|
||||
"response_time": 0,
|
||||
"message": "",
|
||||
"model": model,
|
||||
}
|
||||
|
||||
start_time = time.time()
|
||||
|
||||
def _is_token_invalid(status_code: int, data: dict) -> bool:
|
||||
msg = (data.get("msg") or data.get("message") or "").lower()
|
||||
code = data.get("code")
|
||||
return status_code in {401, 403} or code in {40001, 40002, 40003} or "token" in msg or "unauthorized" in msg
|
||||
|
||||
def _create_session(token: str) -> dict:
|
||||
headers = {**BASE_HEADERS, "authorization": f"Bearer {token}"}
|
||||
try:
|
||||
session_resp = cffi_requests.post(
|
||||
DEEPSEEK_CREATE_SESSION_URL,
|
||||
headers=headers,
|
||||
json={"agent": "chat"},
|
||||
impersonate="safari15_3",
|
||||
timeout=15,
|
||||
)
|
||||
except Exception as e:
|
||||
return {"success": False, "message": f"请求异常: {e}", "status_code": 0, "data": {}}
|
||||
|
||||
try:
|
||||
session_data = session_resp.json()
|
||||
except Exception:
|
||||
session_data = {}
|
||||
finally:
|
||||
session_resp.close()
|
||||
|
||||
if session_resp.status_code == 200 and session_data.get("code") == 0:
|
||||
return {
|
||||
"success": True,
|
||||
"session_id": session_data.get("data", {}).get("biz_data", {}).get("id"),
|
||||
"status_code": session_resp.status_code,
|
||||
"data": session_data,
|
||||
}
|
||||
return {
|
||||
"success": False,
|
||||
"message": session_data.get("msg") or f"HTTP {session_resp.status_code}",
|
||||
"status_code": session_resp.status_code,
|
||||
"data": session_data,
|
||||
}
|
||||
|
||||
try:
|
||||
token = account.get("token", "").strip()
|
||||
session_result = None
|
||||
if token:
|
||||
session_result = _create_session(token)
|
||||
|
||||
if not token or (session_result and not session_result["success"] and _is_token_invalid(session_result["status_code"], session_result["data"])):
|
||||
try:
|
||||
account["token"] = ""
|
||||
login_deepseek_via_account(account)
|
||||
token = account.get("token", "")
|
||||
session_result = _create_session(token)
|
||||
except Exception as e:
|
||||
result["message"] = f"登录失败: {str(e)}"
|
||||
return result
|
||||
|
||||
if not session_result or not session_result["success"]:
|
||||
result["message"] = f"创建会话失败: {session_result['message'] if session_result else 'Unknown error'}"
|
||||
return result
|
||||
|
||||
session_id = session_result["session_id"]
|
||||
headers = {**BASE_HEADERS, "authorization": f"Bearer {token}"}
|
||||
|
||||
if not message.strip():
|
||||
result["success"] = True
|
||||
result["message"] = "API 测试成功(仅会话创建)"
|
||||
result["response_time"] = round((time.time() - start_time) * 1000)
|
||||
return result
|
||||
|
||||
pow_url = "https://chat.deepseek.com/api/v0/chat/create_pow_challenge"
|
||||
pow_resp = cffi_requests.post(
|
||||
pow_url,
|
||||
headers=headers,
|
||||
json={"target_path": "/api/v0/chat/completion"},
|
||||
timeout=30,
|
||||
impersonate="safari15_3",
|
||||
)
|
||||
|
||||
pow_data = pow_resp.json()
|
||||
if pow_data.get("code") != 0:
|
||||
result["message"] = f"获取 PoW 失败: {pow_data.get('msg')}"
|
||||
return result
|
||||
|
||||
challenge = pow_data["data"]["biz_data"]["challenge"]
|
||||
try:
|
||||
answer = compute_pow_answer(
|
||||
challenge["algorithm"],
|
||||
challenge["challenge"],
|
||||
challenge["salt"],
|
||||
challenge.get("difficulty", 144000),
|
||||
challenge.get("expire_at", 1680000000),
|
||||
challenge["signature"],
|
||||
challenge["target_path"],
|
||||
WASM_PATH,
|
||||
)
|
||||
except Exception as e:
|
||||
result["message"] = f"PoW 计算失败: {str(e)}"
|
||||
return result
|
||||
|
||||
pow_dict = {
|
||||
"algorithm": challenge["algorithm"],
|
||||
"challenge": challenge["challenge"],
|
||||
"salt": challenge["salt"],
|
||||
"answer": answer,
|
||||
"signature": challenge["signature"],
|
||||
"target_path": challenge["target_path"],
|
||||
}
|
||||
pow_str = json.dumps(pow_dict, separators=(",", ":"), ensure_ascii=False)
|
||||
pow_header = base64.b64encode(pow_str.encode("utf-8")).decode("utf-8").rstrip()
|
||||
|
||||
thinking_enabled, search_enabled = get_model_config(model)
|
||||
if thinking_enabled is None:
|
||||
thinking_enabled = False
|
||||
search_enabled = False
|
||||
|
||||
payload = {
|
||||
"chat_session_id": session_id,
|
||||
"prompt": f"<|User|>{message}",
|
||||
"ref_file_ids": [],
|
||||
"thinking_enabled": thinking_enabled,
|
||||
"search_enabled": search_enabled,
|
||||
}
|
||||
|
||||
completion_headers = {**headers, "x-ds-pow-response": pow_header}
|
||||
|
||||
completion_resp = cffi_requests.post(
|
||||
DEEPSEEK_COMPLETION_URL,
|
||||
headers=completion_headers,
|
||||
json=payload,
|
||||
impersonate="safari15_3",
|
||||
timeout=60,
|
||||
stream=True,
|
||||
)
|
||||
|
||||
if completion_resp.status_code != 200:
|
||||
result["message"] = f"请求失败: HTTP {completion_resp.status_code}"
|
||||
return result
|
||||
|
||||
thinking_parts = []
|
||||
content_parts = []
|
||||
current_fragment_type = "thinking" if thinking_enabled else "text"
|
||||
|
||||
for line in completion_resp.iter_lines():
|
||||
if not line:
|
||||
continue
|
||||
try:
|
||||
line_str = line.decode("utf-8")
|
||||
except:
|
||||
continue
|
||||
|
||||
if not line_str.startswith("data:"):
|
||||
continue
|
||||
|
||||
data_str = line_str[5:].strip()
|
||||
if data_str == "[DONE]":
|
||||
break
|
||||
|
||||
try:
|
||||
chunk = json.loads(data_str)
|
||||
# 使用共享的解析函数
|
||||
contents, is_finished, current_fragment_type = parse_sse_chunk_for_content(
|
||||
chunk, thinking_enabled, current_fragment_type
|
||||
)
|
||||
|
||||
if is_finished:
|
||||
break
|
||||
|
||||
for content, ctype in contents:
|
||||
if ctype == "thinking":
|
||||
thinking_parts.append(content)
|
||||
else:
|
||||
content_parts.append(content)
|
||||
except:
|
||||
continue
|
||||
|
||||
completion_resp.close()
|
||||
|
||||
result["success"] = True
|
||||
result["response_time"] = round((time.time() - start_time) * 1000)
|
||||
result["message"] = "".join(content_parts) or "(无回复内容)"
|
||||
if thinking_parts:
|
||||
result["thinking"] = "".join(thinking_parts)
|
||||
|
||||
except Exception as e:
|
||||
result["message"] = f"测试失败: {str(e)}"
|
||||
|
||||
return result
|
||||
|
||||
|
||||
@router.post("/accounts/test")
|
||||
async def test_single_account(request: Request, _: bool = Depends(verify_admin)):
|
||||
"""测试单个账号的 API 调用"""
|
||||
data = await request.json()
|
||||
identifier = data.get("identifier", "")
|
||||
model = data.get("model", "deepseek-chat")
|
||||
message = data.get("message", "")
|
||||
|
||||
if not identifier:
|
||||
raise HTTPException(status_code=400, detail="需要账号标识(email 或 mobile)")
|
||||
|
||||
account = None
|
||||
for acc in CONFIG.get("accounts", []):
|
||||
if acc.get("email") == identifier or acc.get("mobile") == identifier:
|
||||
account = acc
|
||||
break
|
||||
|
||||
if not account:
|
||||
raise HTTPException(status_code=404, detail="账号不存在")
|
||||
|
||||
result = await test_account_api(account, model, message)
|
||||
save_config(CONFIG)
|
||||
|
||||
return JSONResponse(content=result)
|
||||
|
||||
|
||||
@router.post("/accounts/test-all")
|
||||
async def test_all_accounts(request: Request, _: bool = Depends(verify_admin)):
|
||||
"""批量测试所有账号的 API 调用"""
|
||||
data = await request.json()
|
||||
model = data.get("model", "deepseek-chat")
|
||||
|
||||
accounts = CONFIG.get("accounts", [])
|
||||
if not accounts:
|
||||
return JSONResponse(content={
|
||||
"total": 0, "success": 0, "failed": 0, "results": [],
|
||||
})
|
||||
|
||||
results = []
|
||||
success_count = 0
|
||||
|
||||
for acc in accounts:
|
||||
result = await test_account_api(acc, model)
|
||||
results.append(result)
|
||||
if result["success"]:
|
||||
success_count += 1
|
||||
await asyncio.sleep(1)
|
||||
|
||||
save_config(CONFIG)
|
||||
|
||||
return JSONResponse(content={
|
||||
"total": len(accounts),
|
||||
"success": success_count,
|
||||
"failed": len(accounts) - success_count,
|
||||
"results": results,
|
||||
})
|
||||
|
||||
|
||||
# ----------------------------------------------------------------------
|
||||
# 批量导入
|
||||
# ----------------------------------------------------------------------
|
||||
@router.post("/import")
|
||||
async def batch_import(request: Request, _: bool = Depends(verify_admin)):
|
||||
"""批量导入 keys 和 accounts"""
|
||||
try:
|
||||
data = await request.json()
|
||||
imported_keys = 0
|
||||
imported_accounts = 0
|
||||
|
||||
if "keys" in data:
|
||||
for key in data["keys"]:
|
||||
if key not in CONFIG.get("keys", []):
|
||||
if "keys" not in CONFIG:
|
||||
CONFIG["keys"] = []
|
||||
CONFIG["keys"].append(key)
|
||||
imported_keys += 1
|
||||
|
||||
if "accounts" in data:
|
||||
existing_ids = set()
|
||||
for acc in CONFIG.get("accounts", []):
|
||||
existing_ids.add(acc.get("email", ""))
|
||||
existing_ids.add(acc.get("mobile", ""))
|
||||
|
||||
for acc in data["accounts"]:
|
||||
acc_id = acc.get("email", "") or acc.get("mobile", "")
|
||||
if acc_id and acc_id not in existing_ids:
|
||||
if "accounts" not in CONFIG:
|
||||
CONFIG["accounts"] = []
|
||||
CONFIG["accounts"].append(acc)
|
||||
existing_ids.add(acc_id)
|
||||
imported_accounts += 1
|
||||
|
||||
init_account_queue()
|
||||
save_config(CONFIG)
|
||||
|
||||
return JSONResponse(content={
|
||||
"success": True,
|
||||
"imported_keys": imported_keys,
|
||||
"imported_accounts": imported_accounts,
|
||||
})
|
||||
except json.JSONDecodeError:
|
||||
raise HTTPException(status_code=400, detail="无效的 JSON 格式")
|
||||
except Exception as e:
|
||||
logger.error(f"[batch_import] 错误: {e}")
|
||||
raise HTTPException(status_code=500, detail=str(e))
|
||||
@@ -1,155 +0,0 @@
|
||||
# -*- coding: utf-8 -*-
|
||||
"""Admin 认证模块 - JWT 和登录相关"""
|
||||
import base64
|
||||
import os
|
||||
import time
|
||||
import hashlib
|
||||
import hmac
|
||||
|
||||
from fastapi import APIRouter, HTTPException, Request, Depends
|
||||
from fastapi.responses import JSONResponse
|
||||
from fastapi.security import HTTPBearer, HTTPAuthorizationCredentials
|
||||
|
||||
from core.config import logger
|
||||
|
||||
router = APIRouter()
|
||||
security = HTTPBearer(auto_error=False)
|
||||
|
||||
# Admin Key 验证(默认值适用于开发/演示环境,生产环境请务必修改)
|
||||
ADMIN_KEY = os.getenv("DS2API_ADMIN_KEY", "your-admin-secret-key")
|
||||
|
||||
# JWT 配置
|
||||
JWT_SECRET = os.getenv("DS2API_JWT_SECRET", ADMIN_KEY or "ds2api-default-secret")
|
||||
JWT_EXPIRE_HOURS = int(os.getenv("DS2API_JWT_EXPIRE_HOURS", "24"))
|
||||
|
||||
|
||||
# ----------------------------------------------------------------------
|
||||
# JWT 工具函数(轻量实现,无需额外依赖)
|
||||
# ----------------------------------------------------------------------
|
||||
def _b64_encode(data: bytes) -> str:
|
||||
"""Base64 URL 安全编码"""
|
||||
return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")
|
||||
|
||||
def _b64_decode(data: str) -> bytes:
|
||||
"""Base64 URL 安全解码"""
|
||||
padding = 4 - len(data) % 4
|
||||
if padding != 4:
|
||||
data += "=" * padding
|
||||
return base64.urlsafe_b64decode(data)
|
||||
|
||||
def create_jwt_token(expire_hours: int = None) -> str:
|
||||
"""创建 JWT Token"""
|
||||
import json
|
||||
|
||||
if expire_hours is None:
|
||||
expire_hours = JWT_EXPIRE_HOURS
|
||||
|
||||
header = {"alg": "HS256", "typ": "JWT"}
|
||||
payload = {
|
||||
"iat": int(time.time()),
|
||||
"exp": int(time.time()) + (expire_hours * 3600),
|
||||
"role": "admin"
|
||||
}
|
||||
|
||||
header_b64 = _b64_encode(json.dumps(header, separators=(",", ":")).encode())
|
||||
payload_b64 = _b64_encode(json.dumps(payload, separators=(",", ":")).encode())
|
||||
|
||||
message = f"{header_b64}.{payload_b64}"
|
||||
signature = hmac.new(JWT_SECRET.encode(), message.encode(), hashlib.sha256).digest()
|
||||
signature_b64 = _b64_encode(signature)
|
||||
|
||||
return f"{message}.{signature_b64}"
|
||||
|
||||
def verify_jwt_token(token: str) -> dict:
|
||||
"""验证 JWT Token,返回 payload 或抛出异常"""
|
||||
import json
|
||||
|
||||
try:
|
||||
parts = token.split(".")
|
||||
if len(parts) != 3:
|
||||
raise ValueError("Invalid token format")
|
||||
|
||||
header_b64, payload_b64, signature_b64 = parts
|
||||
|
||||
# 验证签名
|
||||
message = f"{header_b64}.{payload_b64}"
|
||||
expected_sig = hmac.new(JWT_SECRET.encode(), message.encode(), hashlib.sha256).digest()
|
||||
actual_sig = _b64_decode(signature_b64)
|
||||
|
||||
if not hmac.compare_digest(expected_sig, actual_sig):
|
||||
raise ValueError("Invalid signature")
|
||||
|
||||
# 解析 payload
|
||||
payload = json.loads(_b64_decode(payload_b64))
|
||||
|
||||
# 验证过期时间
|
||||
if payload.get("exp", 0) < time.time():
|
||||
raise ValueError("Token expired")
|
||||
|
||||
return payload
|
||||
except Exception as e:
|
||||
raise ValueError(f"Token verification failed: {str(e)}")
|
||||
|
||||
|
||||
# ----------------------------------------------------------------------
|
||||
# 登录端点
|
||||
# ----------------------------------------------------------------------
|
||||
@router.post("/login")
|
||||
async def admin_login(request: Request):
|
||||
"""管理员登录,返回 JWT Token"""
|
||||
try:
|
||||
data = await request.json()
|
||||
except:
|
||||
data = {}
|
||||
|
||||
admin_key = data.get("admin_key", "")
|
||||
expire_hours = data.get("expire_hours", JWT_EXPIRE_HOURS)
|
||||
|
||||
if admin_key != ADMIN_KEY:
|
||||
raise HTTPException(status_code=401, detail="Invalid admin key")
|
||||
|
||||
token = create_jwt_token(expire_hours)
|
||||
return JSONResponse(content={
|
||||
"success": True,
|
||||
"token": token,
|
||||
"expires_in": expire_hours * 3600
|
||||
})
|
||||
|
||||
|
||||
@router.get("/verify")
|
||||
async def verify_token(credentials: HTTPAuthorizationCredentials = Depends(security)):
|
||||
"""验证当前 Token 是否有效"""
|
||||
if not credentials:
|
||||
raise HTTPException(status_code=401, detail="No credentials provided")
|
||||
|
||||
token = credentials.credentials
|
||||
try:
|
||||
payload = verify_jwt_token(token)
|
||||
return JSONResponse(content={
|
||||
"valid": True,
|
||||
"expires_at": payload.get("exp"),
|
||||
"remaining_seconds": max(0, payload.get("exp", 0) - int(time.time()))
|
||||
})
|
||||
except ValueError as e:
|
||||
raise HTTPException(status_code=401, detail=str(e))
|
||||
|
||||
|
||||
def verify_admin(credentials: HTTPAuthorizationCredentials = Depends(security)):
|
||||
"""验证 Admin 权限(支持 JWT 和直接 admin key)"""
|
||||
if not credentials:
|
||||
raise HTTPException(status_code=401, detail="Authentication required")
|
||||
|
||||
token = credentials.credentials
|
||||
|
||||
# 尝试 JWT 验证
|
||||
try:
|
||||
verify_jwt_token(token)
|
||||
return True
|
||||
except ValueError:
|
||||
pass
|
||||
|
||||
# 尝试直接 admin key
|
||||
if token == ADMIN_KEY:
|
||||
return True
|
||||
|
||||
raise HTTPException(status_code=401, detail="Invalid credentials")
|
||||
@@ -1,226 +0,0 @@
|
||||
# -*- coding: utf-8 -*-
|
||||
"""Admin 配置管理模块 - 配置、API Keys、账号管理"""
|
||||
import os
|
||||
|
||||
from fastapi import APIRouter, HTTPException, Request, Depends
|
||||
from fastapi.responses import JSONResponse
|
||||
|
||||
from core.config import CONFIG, save_config, logger
|
||||
from core.auth import init_account_queue, get_queue_status, get_account_identifier
|
||||
from core.deepseek import login_deepseek_via_account
|
||||
|
||||
from .auth import verify_admin
|
||||
|
||||
router = APIRouter()
|
||||
|
||||
# Vercel 预配置
|
||||
VERCEL_TOKEN = os.getenv("VERCEL_TOKEN", "")
|
||||
VERCEL_PROJECT_ID = os.getenv("VERCEL_PROJECT_ID", "")
|
||||
VERCEL_TEAM_ID = os.getenv("VERCEL_TEAM_ID", "")
|
||||
|
||||
|
||||
# ----------------------------------------------------------------------
|
||||
# Vercel 预配置信息
|
||||
# ----------------------------------------------------------------------
|
||||
@router.get("/vercel/config")
|
||||
async def get_vercel_config(_: bool = Depends(verify_admin)):
|
||||
"""获取预配置的 Vercel 信息(脱敏)"""
|
||||
return JSONResponse(content={
|
||||
"has_token": bool(VERCEL_TOKEN),
|
||||
"project_id": VERCEL_PROJECT_ID,
|
||||
"team_id": VERCEL_TEAM_ID or None,
|
||||
})
|
||||
|
||||
|
||||
# ----------------------------------------------------------------------
|
||||
# 配置管理
|
||||
# ----------------------------------------------------------------------
|
||||
@router.get("/config")
|
||||
async def get_config(_: bool = Depends(verify_admin)):
|
||||
"""获取当前配置(密码脱敏)"""
|
||||
safe_config = {
|
||||
"keys": CONFIG.get("keys", []),
|
||||
"accounts": [],
|
||||
"claude_mapping": CONFIG.get("claude_mapping", {}),
|
||||
}
|
||||
|
||||
for acc in CONFIG.get("accounts", []):
|
||||
safe_acc = {
|
||||
"email": acc.get("email", ""),
|
||||
"mobile": acc.get("mobile", ""),
|
||||
"has_password": bool(acc.get("password")),
|
||||
"has_token": bool(acc.get("token")),
|
||||
"token_preview": acc.get("token", "")[:20] + "..." if acc.get("token") else "",
|
||||
}
|
||||
safe_config["accounts"].append(safe_acc)
|
||||
|
||||
return JSONResponse(content=safe_config)
|
||||
|
||||
|
||||
@router.post("/config")
|
||||
async def update_config(request: Request, _: bool = Depends(verify_admin)):
|
||||
"""更新完整配置"""
|
||||
data = await request.json()
|
||||
|
||||
if "keys" in data:
|
||||
CONFIG["keys"] = data["keys"]
|
||||
|
||||
if "accounts" in data:
|
||||
# 保留原有密码和 token
|
||||
existing = {get_account_identifier(a): a for a in CONFIG.get("accounts", [])}
|
||||
for acc in data["accounts"]:
|
||||
acc_id = get_account_identifier(acc)
|
||||
if acc_id in existing:
|
||||
if not acc.get("password"):
|
||||
acc["password"] = existing[acc_id].get("password", "")
|
||||
if not acc.get("token"):
|
||||
acc["token"] = existing[acc_id].get("token", "")
|
||||
CONFIG["accounts"] = data["accounts"]
|
||||
init_account_queue()
|
||||
|
||||
if "claude_mapping" in data:
|
||||
CONFIG["claude_mapping"] = data["claude_mapping"]
|
||||
|
||||
save_config(CONFIG)
|
||||
return JSONResponse(content={"success": True, "message": "配置已更新"})
|
||||
|
||||
|
||||
# ----------------------------------------------------------------------
|
||||
# API Keys 管理
|
||||
# ----------------------------------------------------------------------
|
||||
@router.post("/keys")
|
||||
async def add_key(request: Request, _: bool = Depends(verify_admin)):
|
||||
"""添加 API Key"""
|
||||
data = await request.json()
|
||||
key = data.get("key", "").strip()
|
||||
|
||||
if not key:
|
||||
raise HTTPException(status_code=400, detail="Key 不能为空")
|
||||
|
||||
if key in CONFIG.get("keys", []):
|
||||
raise HTTPException(status_code=400, detail="Key 已存在")
|
||||
|
||||
if "keys" not in CONFIG:
|
||||
CONFIG["keys"] = []
|
||||
CONFIG["keys"].append(key)
|
||||
save_config(CONFIG)
|
||||
|
||||
return JSONResponse(content={"success": True, "total_keys": len(CONFIG["keys"])})
|
||||
|
||||
|
||||
@router.delete("/keys/{key}")
|
||||
async def delete_key(key: str, _: bool = Depends(verify_admin)):
|
||||
"""删除 API Key"""
|
||||
if key not in CONFIG.get("keys", []):
|
||||
raise HTTPException(status_code=404, detail="Key 不存在")
|
||||
|
||||
CONFIG["keys"].remove(key)
|
||||
save_config(CONFIG)
|
||||
return JSONResponse(content={"success": True, "total_keys": len(CONFIG["keys"])})
|
||||
|
||||
|
||||
# ----------------------------------------------------------------------
|
||||
# 账号管理
|
||||
# ----------------------------------------------------------------------
|
||||
@router.get("/accounts")
|
||||
async def list_accounts(
|
||||
page: int = 1,
|
||||
page_size: int = 10,
|
||||
_: bool = Depends(verify_admin)
|
||||
):
|
||||
"""获取账号列表(分页,倒序,密码脱敏)"""
|
||||
accounts = CONFIG.get("accounts", [])
|
||||
total = len(accounts)
|
||||
|
||||
# 倒序排列
|
||||
accounts = list(reversed(accounts))
|
||||
|
||||
# 计算分页
|
||||
page = max(1, page)
|
||||
page_size = max(1, min(100, page_size)) # 限制每页最多 100 条
|
||||
total_pages = (total + page_size - 1) // page_size if total > 0 else 1
|
||||
|
||||
start = (page - 1) * page_size
|
||||
end = start + page_size
|
||||
page_accounts = accounts[start:end]
|
||||
|
||||
# 脱敏处理
|
||||
safe_accounts = []
|
||||
for acc in page_accounts:
|
||||
safe_acc = {
|
||||
"email": acc.get("email", ""),
|
||||
"mobile": acc.get("mobile", ""),
|
||||
"has_password": bool(acc.get("password")),
|
||||
"has_token": bool(acc.get("token")),
|
||||
"token_preview": acc.get("token", "")[:20] + "..." if acc.get("token") else "",
|
||||
}
|
||||
safe_accounts.append(safe_acc)
|
||||
|
||||
return JSONResponse(content={
|
||||
"items": safe_accounts,
|
||||
"total": total,
|
||||
"page": page,
|
||||
"page_size": page_size,
|
||||
"total_pages": total_pages,
|
||||
})
|
||||
|
||||
|
||||
@router.post("/accounts")
|
||||
async def add_account(request: Request, _: bool = Depends(verify_admin)):
|
||||
"""添加账号"""
|
||||
data = await request.json()
|
||||
email = data.get("email", "").strip()
|
||||
mobile = data.get("mobile", "").strip()
|
||||
password = data.get("password", "").strip()
|
||||
token = data.get("token", "").strip()
|
||||
|
||||
if not email and not mobile:
|
||||
raise HTTPException(status_code=400, detail="需要 email 或 mobile")
|
||||
|
||||
# 检查是否已存在
|
||||
for acc in CONFIG.get("accounts", []):
|
||||
if email and acc.get("email") == email:
|
||||
raise HTTPException(status_code=400, detail="邮箱已存在")
|
||||
if mobile and acc.get("mobile") == mobile:
|
||||
raise HTTPException(status_code=400, detail="手机号已存在")
|
||||
|
||||
new_account = {}
|
||||
if email:
|
||||
new_account["email"] = email
|
||||
if mobile:
|
||||
new_account["mobile"] = mobile
|
||||
if password:
|
||||
new_account["password"] = password
|
||||
if token:
|
||||
new_account["token"] = token
|
||||
|
||||
if "accounts" not in CONFIG:
|
||||
CONFIG["accounts"] = []
|
||||
CONFIG["accounts"].append(new_account)
|
||||
init_account_queue()
|
||||
save_config(CONFIG)
|
||||
|
||||
return JSONResponse(content={"success": True, "total_accounts": len(CONFIG["accounts"])})
|
||||
|
||||
|
||||
@router.delete("/accounts/{identifier}")
|
||||
async def delete_account(identifier: str, _: bool = Depends(verify_admin)):
|
||||
"""删除账号(通过 email 或 mobile)"""
|
||||
accounts = CONFIG.get("accounts", [])
|
||||
for i, acc in enumerate(accounts):
|
||||
if acc.get("email") == identifier or acc.get("mobile") == identifier:
|
||||
accounts.pop(i)
|
||||
init_account_queue()
|
||||
save_config(CONFIG)
|
||||
return JSONResponse(content={"success": True, "total_accounts": len(accounts)})
|
||||
|
||||
raise HTTPException(status_code=404, detail="账号不存在")
|
||||
|
||||
|
||||
# ----------------------------------------------------------------------
|
||||
# 账号队列状态
|
||||
# ----------------------------------------------------------------------
|
||||
@router.get("/queue/status")
|
||||
async def get_account_queue_status(_: bool = Depends(verify_admin)):
|
||||
"""获取账号轮询队列状态"""
|
||||
return JSONResponse(content=get_queue_status())
|
||||
@@ -1,316 +0,0 @@
|
||||
# -*- coding: utf-8 -*-
|
||||
"""Admin Vercel 模块 - Vercel 同步和部署"""
|
||||
import asyncio
|
||||
import base64
|
||||
import hashlib
|
||||
import json
|
||||
import os
|
||||
import time as _time
|
||||
|
||||
import httpx
|
||||
from fastapi import APIRouter, HTTPException, Request, Depends
|
||||
from fastapi.responses import JSONResponse
|
||||
|
||||
from core.config import CONFIG, save_config, logger
|
||||
from core.auth import get_account_identifier, init_account_queue
|
||||
from core.deepseek import login_deepseek_via_account
|
||||
|
||||
from .auth import verify_admin
|
||||
|
||||
router = APIRouter()
|
||||
|
||||
# Vercel 预配置
|
||||
VERCEL_TOKEN = os.getenv("VERCEL_TOKEN", "")
|
||||
VERCEL_PROJECT_ID = os.getenv("VERCEL_PROJECT_ID", "")
|
||||
VERCEL_TEAM_ID = os.getenv("VERCEL_TEAM_ID", "")
|
||||
|
||||
|
||||
def _compute_config_hash() -> str:
|
||||
"""计算可同步配置的指纹哈希(仅包含 keys 和 accounts)"""
|
||||
syncable = {
|
||||
"keys": CONFIG.get("keys", []),
|
||||
"accounts": [
|
||||
{k: v for k, v in acc.items() if k != "token"}
|
||||
for acc in CONFIG.get("accounts", [])
|
||||
],
|
||||
}
|
||||
raw = json.dumps(syncable, sort_keys=True, ensure_ascii=False, separators=(",", ":"))
|
||||
return hashlib.md5(raw.encode("utf-8")).hexdigest()
|
||||
|
||||
|
||||
# ----------------------------------------------------------------------
|
||||
# API 测试(通过本地 API)
|
||||
# ----------------------------------------------------------------------
|
||||
@router.post("/test")
|
||||
async def test_api(request: Request, _: bool = Depends(verify_admin)):
|
||||
"""测试 API 调用"""
|
||||
try:
|
||||
data = await request.json()
|
||||
model = data.get("model", "deepseek-chat")
|
||||
message = data.get("message", "你好")
|
||||
api_key = data.get("api_key", "")
|
||||
|
||||
if not api_key:
|
||||
keys = CONFIG.get("keys", [])
|
||||
if not keys:
|
||||
raise HTTPException(status_code=400, detail="没有可用的 API Key")
|
||||
api_key = keys[0]
|
||||
|
||||
host = request.headers.get("host", "localhost:5001")
|
||||
scheme = "https" if "vercel" in host.lower() else "http"
|
||||
base_url = f"{scheme}://{host}"
|
||||
|
||||
async with httpx.AsyncClient(timeout=60.0) as client:
|
||||
response = await client.post(
|
||||
f"{base_url}/v1/chat/completions",
|
||||
headers={"Authorization": f"Bearer {api_key}"},
|
||||
json={
|
||||
"model": model,
|
||||
"messages": [{"role": "user", "content": message}],
|
||||
"stream": False,
|
||||
},
|
||||
)
|
||||
|
||||
return JSONResponse(content={
|
||||
"success": response.status_code == 200,
|
||||
"status_code": response.status_code,
|
||||
"response": response.json() if response.status_code == 200 else response.text,
|
||||
})
|
||||
except Exception as e:
|
||||
logger.error(f"[test_api] 错误: {e}")
|
||||
return JSONResponse(content={"success": False, "error": str(e)})
|
||||
|
||||
|
||||
# ----------------------------------------------------------------------
|
||||
# Vercel 同步
|
||||
# ----------------------------------------------------------------------
|
||||
@router.post("/vercel/sync")
|
||||
async def sync_to_vercel(request: Request, _: bool = Depends(verify_admin)):
|
||||
"""同步配置到 Vercel 并触发重新部署"""
|
||||
try:
|
||||
data = await request.json()
|
||||
vercel_token = data.get("vercel_token", "")
|
||||
project_id = data.get("project_id", "")
|
||||
team_id = data.get("team_id", "")
|
||||
auto_validate = data.get("auto_validate", True)
|
||||
save_vercel_credentials = data.get("save_credentials", True)
|
||||
|
||||
use_preconfig = vercel_token == "__USE_PRECONFIG__" or not vercel_token
|
||||
if use_preconfig:
|
||||
vercel_token = VERCEL_TOKEN
|
||||
if not project_id:
|
||||
project_id = VERCEL_PROJECT_ID
|
||||
if not team_id:
|
||||
team_id = VERCEL_TEAM_ID
|
||||
|
||||
if not vercel_token or not project_id:
|
||||
raise HTTPException(status_code=400, detail="需要 Vercel Token 和 Project ID")
|
||||
|
||||
# 自动验证账号
|
||||
validated_count = 0
|
||||
failed_accounts = []
|
||||
if auto_validate:
|
||||
accounts = CONFIG.get("accounts", [])
|
||||
for acc in accounts:
|
||||
acc_id = get_account_identifier(acc)
|
||||
if not acc.get("token", "").strip():
|
||||
try:
|
||||
logger.info(f"[sync_to_vercel] 自动验证账号: {acc_id}")
|
||||
login_deepseek_via_account(acc)
|
||||
validated_count += 1
|
||||
except Exception as e:
|
||||
logger.warning(f"[sync_to_vercel] 账号 {acc_id} 验证失败: {e}")
|
||||
failed_accounts.append(acc_id)
|
||||
await asyncio.sleep(0.5)
|
||||
|
||||
config_json = json.dumps(CONFIG, ensure_ascii=False, separators=(",", ":"))
|
||||
config_b64 = base64.b64encode(config_json.encode("utf-8")).decode("utf-8")
|
||||
|
||||
headers = {"Authorization": f"Bearer {vercel_token}"}
|
||||
base_url = "https://api.vercel.com"
|
||||
|
||||
async with httpx.AsyncClient(timeout=30.0) as client:
|
||||
params = {"teamId": team_id} if team_id else {}
|
||||
env_resp = await client.get(
|
||||
f"{base_url}/v9/projects/{project_id}/env",
|
||||
headers=headers,
|
||||
params=params,
|
||||
)
|
||||
|
||||
if env_resp.status_code != 200:
|
||||
raise HTTPException(status_code=env_resp.status_code, detail=f"获取环境变量失败: {env_resp.text}")
|
||||
|
||||
env_vars = env_resp.json().get("envs", [])
|
||||
existing_env = None
|
||||
for env in env_vars:
|
||||
if env.get("key") == "DS2API_CONFIG_JSON":
|
||||
existing_env = env
|
||||
break
|
||||
|
||||
if existing_env:
|
||||
env_id = existing_env["id"]
|
||||
update_resp = await client.patch(
|
||||
f"{base_url}/v9/projects/{project_id}/env/{env_id}",
|
||||
headers=headers,
|
||||
params=params,
|
||||
json={"value": config_b64},
|
||||
)
|
||||
if update_resp.status_code not in [200, 201]:
|
||||
raise HTTPException(status_code=update_resp.status_code, detail=f"更新环境变量失败: {update_resp.text}")
|
||||
else:
|
||||
create_resp = await client.post(
|
||||
f"{base_url}/v10/projects/{project_id}/env",
|
||||
headers=headers,
|
||||
params=params,
|
||||
json={
|
||||
"key": "DS2API_CONFIG_JSON",
|
||||
"value": config_b64,
|
||||
"type": "encrypted",
|
||||
"target": ["production", "preview"],
|
||||
},
|
||||
)
|
||||
if create_resp.status_code not in [200, 201]:
|
||||
raise HTTPException(status_code=create_resp.status_code, detail=f"创建环境变量失败: {create_resp.text}")
|
||||
|
||||
# 保存 Vercel 凭证
|
||||
saved_credentials = []
|
||||
if save_vercel_credentials and not use_preconfig:
|
||||
creds_to_save = [
|
||||
("VERCEL_TOKEN", vercel_token),
|
||||
("VERCEL_PROJECT_ID", project_id),
|
||||
]
|
||||
if team_id:
|
||||
creds_to_save.append(("VERCEL_TEAM_ID", team_id))
|
||||
|
||||
for key, value in creds_to_save:
|
||||
existing = None
|
||||
for env in env_vars:
|
||||
if env.get("key") == key:
|
||||
existing = env
|
||||
break
|
||||
|
||||
if existing:
|
||||
upd_resp = await client.patch(
|
||||
f"{base_url}/v9/projects/{project_id}/env/{existing['id']}",
|
||||
headers=headers,
|
||||
params=params,
|
||||
json={"value": value},
|
||||
)
|
||||
if upd_resp.status_code in [200, 201]:
|
||||
saved_credentials.append(key)
|
||||
else:
|
||||
crt_resp = await client.post(
|
||||
f"{base_url}/v10/projects/{project_id}/env",
|
||||
headers=headers,
|
||||
params=params,
|
||||
json={
|
||||
"key": key,
|
||||
"value": value,
|
||||
"type": "encrypted",
|
||||
"target": ["production", "preview"],
|
||||
},
|
||||
)
|
||||
if crt_resp.status_code in [200, 201]:
|
||||
saved_credentials.append(key)
|
||||
|
||||
# 触发重新部署
|
||||
project_resp = await client.get(
|
||||
f"{base_url}/v9/projects/{project_id}",
|
||||
headers=headers,
|
||||
params=params,
|
||||
)
|
||||
|
||||
if project_resp.status_code == 200:
|
||||
project_data = project_resp.json()
|
||||
repo = project_data.get("link", {})
|
||||
|
||||
if repo.get("type") == "github":
|
||||
deploy_resp = await client.post(
|
||||
f"{base_url}/v13/deployments",
|
||||
headers=headers,
|
||||
params=params,
|
||||
json={
|
||||
"name": project_id,
|
||||
"project": project_id,
|
||||
"target": "production",
|
||||
"gitSource": {
|
||||
"type": "github",
|
||||
"repoId": repo.get("repoId"),
|
||||
"ref": repo.get("productionBranch", "main"),
|
||||
},
|
||||
},
|
||||
)
|
||||
|
||||
if deploy_resp.status_code in [200, 201]:
|
||||
deploy_data = deploy_resp.json()
|
||||
# 记录同步哈希和时间
|
||||
CONFIG["_vercel_sync_hash"] = _compute_config_hash()
|
||||
CONFIG["_vercel_sync_time"] = int(_time.time())
|
||||
save_config(CONFIG)
|
||||
result = {
|
||||
"success": True,
|
||||
"message": "配置已同步,正在重新部署...",
|
||||
"deployment_url": deploy_data.get("url"),
|
||||
"validated_accounts": validated_count,
|
||||
}
|
||||
if failed_accounts:
|
||||
result["failed_accounts"] = failed_accounts
|
||||
if saved_credentials:
|
||||
result["saved_credentials"] = saved_credentials
|
||||
return JSONResponse(content=result)
|
||||
|
||||
# 环境变量已更新,但无法自动触发重新部署
|
||||
CONFIG["_vercel_sync_hash"] = _compute_config_hash()
|
||||
CONFIG["_vercel_sync_time"] = int(_time.time())
|
||||
save_config(CONFIG)
|
||||
result = {
|
||||
"success": True,
|
||||
"message": "配置已同步到 Vercel,请手动触发重新部署",
|
||||
"manual_deploy_required": True,
|
||||
"validated_accounts": validated_count,
|
||||
}
|
||||
if failed_accounts:
|
||||
result["failed_accounts"] = failed_accounts
|
||||
if saved_credentials:
|
||||
result["saved_credentials"] = saved_credentials
|
||||
return JSONResponse(content=result)
|
||||
|
||||
except HTTPException:
|
||||
raise
|
||||
except Exception as e:
|
||||
logger.error(f"[sync_to_vercel] 错误: {e}")
|
||||
raise HTTPException(status_code=500, detail=str(e))
|
||||
|
||||
|
||||
# ----------------------------------------------------------------------
|
||||
# 同步状态查询
|
||||
# ----------------------------------------------------------------------
|
||||
@router.get("/vercel/status")
|
||||
async def get_vercel_sync_status(_: bool = Depends(verify_admin)):
|
||||
"""检查当前配置与上次同步到 Vercel 的配置是否一致"""
|
||||
last_hash = CONFIG.get("_vercel_sync_hash", "")
|
||||
last_time = CONFIG.get("_vercel_sync_time", 0)
|
||||
current_hash = _compute_config_hash()
|
||||
|
||||
synced = bool(last_hash and last_hash == current_hash)
|
||||
|
||||
return JSONResponse(content={
|
||||
"synced": synced,
|
||||
"last_sync_time": last_time if last_time else None,
|
||||
"has_synced_before": bool(last_hash),
|
||||
})
|
||||
|
||||
|
||||
# ----------------------------------------------------------------------
|
||||
# 导出配置
|
||||
# ----------------------------------------------------------------------
|
||||
@router.get("/export")
|
||||
async def export_config(_: bool = Depends(verify_admin)):
|
||||
"""导出完整配置(JSON 和 Base64)"""
|
||||
config_json = json.dumps(CONFIG, ensure_ascii=False, separators=(",", ":"))
|
||||
config_b64 = base64.b64encode(config_json.encode("utf-8")).decode("utf-8")
|
||||
|
||||
return JSONResponse(content={
|
||||
"json": config_json,
|
||||
"base64": config_b64,
|
||||
})
|
||||
509
routes/claude.py
509
routes/claude.py
@@ -1,509 +0,0 @@
|
||||
# -*- coding: utf-8 -*-
|
||||
"""Claude API 路由"""
|
||||
import json
|
||||
import random
|
||||
import time
|
||||
|
||||
from curl_cffi import requests as cffi_requests
|
||||
from fastapi import APIRouter, HTTPException, Request
|
||||
from fastapi.responses import JSONResponse, StreamingResponse
|
||||
|
||||
from core.config import CONFIG, logger
|
||||
from core.auth import (
|
||||
determine_mode_and_token,
|
||||
get_auth_headers,
|
||||
)
|
||||
from core.deepseek import call_completion_endpoint
|
||||
from core.session_manager import (
|
||||
create_session,
|
||||
get_pow,
|
||||
cleanup_account,
|
||||
)
|
||||
from core.models import get_model_config, get_claude_models_response
|
||||
from core.sse_parser import (
|
||||
parse_deepseek_sse_line,
|
||||
parse_sse_chunk_for_content,
|
||||
extract_content_from_chunk,
|
||||
collect_deepseek_response,
|
||||
parse_tool_calls,
|
||||
)
|
||||
from core.constants import STREAM_IDLE_TIMEOUT
|
||||
from core.utils import estimate_tokens
|
||||
from core.messages import (
|
||||
messages_prepare,
|
||||
convert_claude_to_deepseek,
|
||||
CLAUDE_DEFAULT_MODEL,
|
||||
)
|
||||
|
||||
router = APIRouter()
|
||||
|
||||
|
||||
|
||||
# ----------------------------------------------------------------------
|
||||
# 通过 OpenAI 接口调用 Claude
|
||||
# ----------------------------------------------------------------------
|
||||
async def call_claude_via_openai(request: Request, claude_payload: dict):
|
||||
"""通过现有OpenAI接口调用Claude(实际调用DeepSeek)"""
|
||||
deepseek_payload = convert_claude_to_deepseek(claude_payload)
|
||||
|
||||
try:
|
||||
session_id = create_session(request)
|
||||
if not session_id:
|
||||
raise HTTPException(status_code=401, detail="invalid token.")
|
||||
|
||||
pow_resp = get_pow(request)
|
||||
if not pow_resp:
|
||||
raise HTTPException(
|
||||
status_code=401,
|
||||
detail="Failed to get PoW (invalid token or unknown error).",
|
||||
)
|
||||
|
||||
model = deepseek_payload.get("model", "deepseek-chat")
|
||||
messages = deepseek_payload.get("messages", [])
|
||||
|
||||
# 使用会话管理器获取模型配置
|
||||
thinking_enabled, search_enabled = get_model_config(model)
|
||||
if thinking_enabled is None:
|
||||
# 默认配置
|
||||
thinking_enabled = False
|
||||
search_enabled = False
|
||||
|
||||
final_prompt = messages_prepare(messages)
|
||||
|
||||
headers = {**get_auth_headers(request), "x-ds-pow-response": pow_resp}
|
||||
payload = {
|
||||
"chat_session_id": session_id,
|
||||
"parent_message_id": None,
|
||||
"prompt": final_prompt,
|
||||
"ref_file_ids": [],
|
||||
"thinking_enabled": thinking_enabled,
|
||||
"search_enabled": search_enabled,
|
||||
}
|
||||
|
||||
deepseek_resp = call_completion_endpoint(payload, headers, max_attempts=3)
|
||||
return deepseek_resp
|
||||
|
||||
except Exception as e:
|
||||
logger.error(f"[call_claude_via_openai] 调用失败: {e}")
|
||||
return None
|
||||
|
||||
|
||||
# ----------------------------------------------------------------------
|
||||
# Claude 路由:模型列表
|
||||
# ----------------------------------------------------------------------
|
||||
@router.get("/anthropic/v1/models")
|
||||
def list_claude_models():
|
||||
data = get_claude_models_response()
|
||||
return JSONResponse(content=data, status_code=200)
|
||||
|
||||
|
||||
# ----------------------------------------------------------------------
|
||||
# Claude 路由:/anthropic/v1/messages
|
||||
# ----------------------------------------------------------------------
|
||||
@router.post("/anthropic/v1/messages")
|
||||
async def claude_messages(request: Request):
|
||||
try:
|
||||
try:
|
||||
determine_mode_and_token(request)
|
||||
except HTTPException as exc:
|
||||
return JSONResponse(
|
||||
status_code=exc.status_code, content={"error": exc.detail}
|
||||
)
|
||||
except Exception as exc:
|
||||
logger.error(f"[claude_messages] determine_mode_and_token 异常: {exc}")
|
||||
return JSONResponse(
|
||||
status_code=500, content={"error": "Claude authentication failed."}
|
||||
)
|
||||
|
||||
req_data = await request.json()
|
||||
model = req_data.get("model")
|
||||
messages = req_data.get("messages", [])
|
||||
|
||||
if not model or not messages:
|
||||
raise HTTPException(
|
||||
status_code=400, detail="Request must include 'model' and 'messages'."
|
||||
)
|
||||
|
||||
# 标准化消息内容
|
||||
normalized_messages = []
|
||||
for message in messages:
|
||||
normalized_message = message.copy()
|
||||
if isinstance(message.get("content"), list):
|
||||
content_parts = []
|
||||
for content_block in message["content"]:
|
||||
if content_block.get("type") == "text" and "text" in content_block:
|
||||
content_parts.append(content_block["text"])
|
||||
elif content_block.get("type") == "tool_result":
|
||||
if "content" in content_block:
|
||||
content_parts.append(str(content_block["content"]))
|
||||
if content_parts:
|
||||
normalized_message["content"] = "\n".join(content_parts)
|
||||
elif isinstance(message.get("content"), list) and message["content"]:
|
||||
normalized_message["content"] = message["content"]
|
||||
else:
|
||||
normalized_message["content"] = ""
|
||||
normalized_messages.append(normalized_message)
|
||||
|
||||
tools_requested = req_data.get("tools") or []
|
||||
has_tools = len(tools_requested) > 0
|
||||
|
||||
payload = req_data.copy()
|
||||
payload["messages"] = normalized_messages.copy()
|
||||
|
||||
# 如果有工具定义,添加工具使用指导的系统消息
|
||||
if has_tools and not any(m.get("role") == "system" for m in payload["messages"]):
|
||||
tool_schemas = []
|
||||
for tool in tools_requested:
|
||||
tool_name = tool.get("name", "unknown")
|
||||
tool_desc = tool.get("description", "No description available")
|
||||
schema = tool.get("input_schema", {})
|
||||
|
||||
tool_info = f"Tool: {tool_name}\nDescription: {tool_desc}"
|
||||
if "properties" in schema:
|
||||
props = []
|
||||
required = schema.get("required", [])
|
||||
for prop_name, prop_info in schema["properties"].items():
|
||||
prop_type = prop_info.get("type", "string")
|
||||
is_req = " (required)" if prop_name in required else ""
|
||||
props.append(f" - {prop_name}: {prop_type}{is_req}")
|
||||
if props:
|
||||
tool_info += f"\nParameters:\n{chr(10).join(props)}"
|
||||
tool_schemas.append(tool_info)
|
||||
|
||||
system_message = {
|
||||
"role": "system",
|
||||
"content": f"""You are Claude, a helpful AI assistant. You have access to these tools:
|
||||
|
||||
{chr(10).join(tool_schemas)}
|
||||
|
||||
When you need to use tools, you can call multiple tools in a single response. Use this format:
|
||||
|
||||
{{"tool_calls": [
|
||||
{{"name": "tool1", "input": {{"param": "value"}}}},
|
||||
{{"name": "tool2", "input": {{"param": "value"}}}}
|
||||
]}}
|
||||
|
||||
IMPORTANT: You can call multiple tools in ONE response.
|
||||
|
||||
Remember: Output ONLY the JSON, no other text. The response must start with {{ and end with ]}}""",
|
||||
}
|
||||
payload["messages"].insert(0, system_message)
|
||||
|
||||
deepseek_resp = await call_claude_via_openai(request, payload)
|
||||
if not deepseek_resp:
|
||||
raise HTTPException(status_code=500, detail="Failed to get Claude response.")
|
||||
|
||||
if deepseek_resp.status_code != 200:
|
||||
deepseek_resp.close()
|
||||
return JSONResponse(
|
||||
status_code=500,
|
||||
content={"error": {"type": "api_error", "message": "Failed to get response"}},
|
||||
)
|
||||
|
||||
# 流式响应或普通响应
|
||||
if bool(req_data.get("stream", False)):
|
||||
|
||||
def claude_sse_stream():
|
||||
# 使用导入的常量(不再本地定义)
|
||||
try:
|
||||
message_id = f"msg_{int(time.time())}_{random.randint(1000, 9999)}"
|
||||
input_tokens = sum(len(str(m.get("content", ""))) for m in messages) // 4
|
||||
output_tokens = 0
|
||||
full_response_text = ""
|
||||
last_content_time = time.time()
|
||||
has_content = False
|
||||
|
||||
|
||||
for line in deepseek_resp.iter_lines():
|
||||
current_time = time.time()
|
||||
|
||||
# 智能超时检测
|
||||
if has_content and (current_time - last_content_time) > STREAM_IDLE_TIMEOUT:
|
||||
logger.warning(f"[claude_sse_stream] 智能超时: 已有内容但 {STREAM_IDLE_TIMEOUT}s 无新数据,强制结束")
|
||||
break
|
||||
|
||||
if not line:
|
||||
continue
|
||||
try:
|
||||
line_str = line.decode("utf-8")
|
||||
except Exception:
|
||||
continue
|
||||
|
||||
if line_str.startswith("data:"):
|
||||
data_str = line_str[5:].strip()
|
||||
if data_str == "[DONE]":
|
||||
break
|
||||
|
||||
try:
|
||||
chunk = json.loads(data_str)
|
||||
|
||||
# 检测内容审核/敏感词阻止
|
||||
if "error" in chunk or chunk.get("code") == "content_filter":
|
||||
logger.warning(f"[claude_sse_stream] 检测到内容过滤: {chunk}")
|
||||
break
|
||||
|
||||
if "v" in chunk and isinstance(chunk["v"], str):
|
||||
content = chunk["v"]
|
||||
# 检查是否是 FINISHED 状态
|
||||
if content == "FINISHED":
|
||||
break
|
||||
full_response_text += content
|
||||
if content:
|
||||
has_content = True
|
||||
last_content_time = current_time
|
||||
elif "v" in chunk and isinstance(chunk["v"], list):
|
||||
for item in chunk["v"]:
|
||||
if item.get("p") == "status" and item.get("v") == "FINISHED":
|
||||
break
|
||||
except (json.JSONDecodeError, KeyError):
|
||||
continue
|
||||
|
||||
# 发送Claude格式的事件
|
||||
message_start = {
|
||||
"type": "message_start",
|
||||
"message": {
|
||||
"id": message_id,
|
||||
"type": "message",
|
||||
"role": "assistant",
|
||||
"model": model,
|
||||
"content": [],
|
||||
"stop_reason": None,
|
||||
"stop_sequence": None,
|
||||
"usage": {"input_tokens": input_tokens, "output_tokens": 0},
|
||||
},
|
||||
}
|
||||
yield f"data: {json.dumps(message_start)}\n\n"
|
||||
|
||||
# 检查工具调用
|
||||
# 使用公共函数检测工具调用
|
||||
detected_tools = parse_tool_calls(full_response_text, tools_requested)
|
||||
|
||||
content_index = 0
|
||||
if detected_tools:
|
||||
stop_reason = "tool_use"
|
||||
for tool_info in detected_tools:
|
||||
tool_use_id = f"toolu_{int(time.time())}_{random.randint(1000, 9999)}_{content_index}"
|
||||
tool_name = tool_info["name"]
|
||||
tool_input = tool_info["input"]
|
||||
|
||||
yield f"data: {json.dumps({'type': 'content_block_start', 'index': content_index, 'content_block': {'type': 'tool_use', 'id': tool_use_id, 'name': tool_name, 'input': tool_input}})}\n\n"
|
||||
yield f"data: {json.dumps({'type': 'content_block_stop', 'index': content_index})}\n\n"
|
||||
|
||||
content_index += 1
|
||||
output_tokens += len(str(tool_input)) // 4
|
||||
else:
|
||||
stop_reason = "end_turn"
|
||||
if full_response_text:
|
||||
yield f"data: {json.dumps({'type': 'content_block_start', 'index': 0, 'content_block': {'type': 'text', 'text': ''}})}\n\n"
|
||||
yield f"data: {json.dumps({'type': 'content_block_delta', 'index': 0, 'delta': {'type': 'text_delta', 'text': full_response_text}})}\n\n"
|
||||
yield f"data: {json.dumps({'type': 'content_block_stop', 'index': 0})}\n\n"
|
||||
output_tokens += len(full_response_text) // 4
|
||||
|
||||
yield f"data: {json.dumps({'type': 'message_delta', 'delta': {'stop_reason': stop_reason, 'stop_sequence': None}, 'usage': {'output_tokens': output_tokens}})}\n\n"
|
||||
yield f"data: {json.dumps({'type': 'message_stop'})}\n\n"
|
||||
|
||||
except Exception as e:
|
||||
logger.error(f"[claude_sse_stream] 异常: {e}")
|
||||
error_event = {
|
||||
"type": "error",
|
||||
"error": {"type": "api_error", "message": f"Stream processing error: {str(e)}"},
|
||||
}
|
||||
yield f"data: {json.dumps(error_event)}\n\n"
|
||||
finally:
|
||||
try:
|
||||
deepseek_resp.close()
|
||||
except Exception:
|
||||
pass
|
||||
# 注意:不在此处调用 cleanup_account,由外层 finally 统一处理
|
||||
|
||||
return StreamingResponse(
|
||||
claude_sse_stream(),
|
||||
media_type="text/event-stream",
|
||||
headers={"Content-Type": "text/event-stream"},
|
||||
)
|
||||
else:
|
||||
# 非流式响应处理
|
||||
try:
|
||||
final_content = ""
|
||||
final_reasoning = ""
|
||||
|
||||
for line in deepseek_resp.iter_lines():
|
||||
if not line:
|
||||
continue
|
||||
try:
|
||||
line_str = line.decode("utf-8")
|
||||
except Exception as e:
|
||||
logger.warning(f"[claude_messages] 行解码失败: {e}")
|
||||
continue
|
||||
|
||||
if line_str.startswith("data:"):
|
||||
data_str = line_str[5:].strip()
|
||||
if data_str == "[DONE]":
|
||||
break
|
||||
|
||||
try:
|
||||
chunk = json.loads(data_str)
|
||||
if "v" in chunk:
|
||||
v_value = chunk["v"]
|
||||
if "p" in chunk and chunk.get("p") == "response/search_status":
|
||||
continue
|
||||
ptype = "text"
|
||||
if "p" in chunk and chunk.get("p") == "response/thinking_content":
|
||||
ptype = "thinking"
|
||||
elif "p" in chunk and chunk.get("p") == "response/content":
|
||||
ptype = "text"
|
||||
if isinstance(v_value, str):
|
||||
if ptype == "thinking":
|
||||
final_reasoning += v_value
|
||||
else:
|
||||
final_content += v_value
|
||||
elif isinstance(v_value, list):
|
||||
for item in v_value:
|
||||
if item.get("p") == "status" and item.get("v") == "FINISHED":
|
||||
break
|
||||
except json.JSONDecodeError as e:
|
||||
logger.warning(f"[claude_messages] JSON解析失败: {e}")
|
||||
continue
|
||||
except Exception as e:
|
||||
logger.warning(f"[claude_messages] chunk处理失败: {e}")
|
||||
continue
|
||||
|
||||
try:
|
||||
deepseek_resp.close()
|
||||
except Exception as e:
|
||||
logger.warning(f"[claude_messages] 关闭响应异常: {e}")
|
||||
|
||||
# 检查工具调用
|
||||
detected_tools = parse_tool_calls(final_content, tools_requested)
|
||||
|
||||
# 构造响应
|
||||
claude_response = {
|
||||
"id": f"msg_{int(time.time())}_{random.randint(1000, 9999)}",
|
||||
"type": "message",
|
||||
"role": "assistant",
|
||||
"model": model,
|
||||
"content": [],
|
||||
"stop_reason": "tool_use" if detected_tools else "end_turn",
|
||||
"stop_sequence": None,
|
||||
"usage": {
|
||||
"input_tokens": len(str(normalized_messages)) // 4,
|
||||
"output_tokens": (len(final_content) + len(final_reasoning)) // 4,
|
||||
},
|
||||
}
|
||||
|
||||
if final_reasoning:
|
||||
claude_response["content"].append({"type": "thinking", "thinking": final_reasoning})
|
||||
|
||||
if detected_tools:
|
||||
for i, tool_info in enumerate(detected_tools):
|
||||
tool_use_id = f"toolu_{int(time.time())}_{random.randint(1000, 9999)}_{i}"
|
||||
claude_response["content"].append({
|
||||
"type": "tool_use",
|
||||
"id": tool_use_id,
|
||||
"name": tool_info["name"],
|
||||
"input": tool_info["input"],
|
||||
})
|
||||
else:
|
||||
if final_content or not final_reasoning:
|
||||
claude_response["content"].append({
|
||||
"type": "text",
|
||||
"text": final_content or "抱歉,没有生成有效的响应内容。",
|
||||
})
|
||||
|
||||
return JSONResponse(content=claude_response, status_code=200)
|
||||
|
||||
except Exception as e:
|
||||
logger.error(f"[claude_messages] 非流式响应处理异常: {e}")
|
||||
try:
|
||||
deepseek_resp.close()
|
||||
except Exception as close_e:
|
||||
logger.warning(f"[claude_messages] 关闭响应异常2: {close_e}")
|
||||
return JSONResponse(
|
||||
status_code=500,
|
||||
content={"error": {"type": "api_error", "message": "Response processing error"}},
|
||||
)
|
||||
|
||||
except HTTPException as exc:
|
||||
return JSONResponse(
|
||||
status_code=exc.status_code,
|
||||
content={"error": {"type": "invalid_request_error", "message": exc.detail}},
|
||||
)
|
||||
except Exception as exc:
|
||||
logger.error(f"[claude_messages] 未知异常: {exc}")
|
||||
return JSONResponse(
|
||||
status_code=500,
|
||||
content={"error": {"type": "api_error", "message": "Internal Server Error"}},
|
||||
)
|
||||
finally:
|
||||
cleanup_account(request)
|
||||
|
||||
|
||||
# ----------------------------------------------------------------------
|
||||
# Claude 路由:/anthropic/v1/messages/count_tokens
|
||||
# ----------------------------------------------------------------------
|
||||
@router.post("/anthropic/v1/messages/count_tokens")
|
||||
async def claude_count_tokens(request: Request):
|
||||
try:
|
||||
try:
|
||||
determine_mode_and_token(request)
|
||||
except HTTPException as exc:
|
||||
return JSONResponse(status_code=exc.status_code, content={"error": exc.detail})
|
||||
except Exception as exc:
|
||||
logger.error(f"[claude_count_tokens] determine_mode_and_token 异常: {exc}")
|
||||
return JSONResponse(status_code=500, content={"error": "Claude authentication failed."})
|
||||
|
||||
req_data = await request.json()
|
||||
model = req_data.get("model")
|
||||
messages = req_data.get("messages", [])
|
||||
system = req_data.get("system", "")
|
||||
|
||||
if not model or not messages:
|
||||
raise HTTPException(
|
||||
status_code=400, detail="Request must include 'model' and 'messages'."
|
||||
)
|
||||
|
||||
input_tokens = 0
|
||||
|
||||
if system:
|
||||
input_tokens += estimate_tokens(system)
|
||||
|
||||
for message in messages:
|
||||
content = message.get("content", "")
|
||||
input_tokens += 2 # 角色标记
|
||||
|
||||
if isinstance(content, list):
|
||||
for content_block in content:
|
||||
if isinstance(content_block, dict):
|
||||
if content_block.get("type") == "text":
|
||||
input_tokens += estimate_tokens(content_block.get("text", ""))
|
||||
elif content_block.get("type") == "tool_result":
|
||||
input_tokens += estimate_tokens(content_block.get("content", ""))
|
||||
else:
|
||||
input_tokens += estimate_tokens(str(content_block))
|
||||
else:
|
||||
input_tokens += estimate_tokens(str(content_block))
|
||||
else:
|
||||
input_tokens += estimate_tokens(content)
|
||||
|
||||
tools = req_data.get("tools", [])
|
||||
if tools:
|
||||
for tool in tools:
|
||||
input_tokens += estimate_tokens(tool.get("name", ""))
|
||||
input_tokens += estimate_tokens(tool.get("description", ""))
|
||||
input_schema = tool.get("input_schema", {})
|
||||
input_tokens += estimate_tokens(json.dumps(input_schema, ensure_ascii=False))
|
||||
|
||||
response = {"input_tokens": max(1, input_tokens)}
|
||||
return JSONResponse(content=response, status_code=200)
|
||||
|
||||
except HTTPException as exc:
|
||||
return JSONResponse(
|
||||
status_code=exc.status_code,
|
||||
content={"error": {"type": "invalid_request_error", "message": exc.detail}},
|
||||
)
|
||||
except Exception as exc:
|
||||
logger.error(f"[claude_count_tokens] 未知异常: {exc}")
|
||||
return JSONResponse(
|
||||
status_code=500,
|
||||
content={"error": {"type": "api_error", "message": "Internal Server Error"}},
|
||||
)
|
||||
308
routes/home.py
308
routes/home.py
@@ -1,308 +0,0 @@
|
||||
# -*- coding: utf-8 -*-
|
||||
"""首页和 WebUI 路由"""
|
||||
import os
|
||||
from fastapi import APIRouter, Request
|
||||
from fastapi.responses import HTMLResponse, FileResponse
|
||||
|
||||
from core.config import STATIC_ADMIN_DIR
|
||||
|
||||
router = APIRouter()
|
||||
|
||||
# 首页 HTML(内嵌避免依赖模板目录)
|
||||
WELCOME_HTML = """<!DOCTYPE html>
|
||||
<html lang="zh-CN">
|
||||
<head>
|
||||
<meta charset="UTF-8">
|
||||
<meta name="viewport" content="width=device-width, initial-scale=1.0">
|
||||
<title>DS2API - DeepSeek to OpenAI API</title>
|
||||
<meta name="description" content="DS2API - 将 DeepSeek 网页版转换为 OpenAI 兼容 API">
|
||||
<link rel="preconnect" href="https://fonts.googleapis.com">
|
||||
<link rel="preconnect" href="https://fonts.gstatic.com" crossorigin>
|
||||
<link href="https://fonts.googleapis.com/css2?family=Inter:wght@400;500;600;700&family=Orbitron:wght@700&display=swap" rel="stylesheet">
|
||||
<link rel="icon" type="image/svg+xml" href="data:image/svg+xml,%3Csvg xmlns='http://www.w3.org/2000/svg' viewBox='0 0 100 100'%3E%3Cdefs%3E%3ClinearGradient id='g' x1='0%25' y1='0%25' x2='100%25' y2='100%25'%3E%3Cstop offset='0%25' stop-color='%23f59e0b'/%3E%3Cstop offset='100%25' stop-color='%23ef4444'/%3E%3C/linearGradient%3E%3C/defs%3E%3Crect rx='20' width='100' height='100' fill='url(%23g)'/%3E%3Ctext x='50' y='68' font-family='Arial,sans-serif' font-size='48' font-weight='bold' fill='white' text-anchor='middle'%3EDS%3C/text%3E%3C/svg%3E">
|
||||
<style>
|
||||
:root {
|
||||
--primary: #f59e0b;
|
||||
--primary-glow: rgba(245, 158, 11, 0.4);
|
||||
--secondary: #ef4444;
|
||||
--bg: #030712;
|
||||
--card-bg: rgba(255, 255, 255, 0.03);
|
||||
--card-border: rgba(255, 255, 255, 0.08);
|
||||
--text-main: #f9fafb;
|
||||
--text-dim: #9ca3af;
|
||||
}
|
||||
|
||||
* { margin: 0; padding: 0; box-sizing: border-box; }
|
||||
|
||||
body {
|
||||
font-family: 'Inter', -apple-system, BlinkMacSystemFont, sans-serif;
|
||||
background-color: var(--bg);
|
||||
color: var(--text-main);
|
||||
min-height: 100vh;
|
||||
overflow-x: hidden;
|
||||
display: flex;
|
||||
flex-direction: column;
|
||||
align-items: center;
|
||||
justify-content: center;
|
||||
position: relative;
|
||||
}
|
||||
|
||||
/* Animated Background */
|
||||
.bg-glow {
|
||||
position: fixed;
|
||||
top: 0;
|
||||
left: 0;
|
||||
width: 100%;
|
||||
height: 100%;
|
||||
z-index: -1;
|
||||
background:
|
||||
radial-gradient(circle at 20% 30%, rgba(245, 158, 11, 0.05) 0%, transparent 40%),
|
||||
radial-gradient(circle at 80% 70%, rgba(239, 68, 68, 0.05) 0%, transparent 40%);
|
||||
}
|
||||
|
||||
.blob {
|
||||
position: absolute;
|
||||
width: 400px;
|
||||
height: 400px;
|
||||
background: linear-gradient(135deg, var(--primary), var(--secondary));
|
||||
filter: blur(80px);
|
||||
opacity: 0.15;
|
||||
border-radius: 50%;
|
||||
z-index: -1;
|
||||
animation: move 20s infinite alternate;
|
||||
}
|
||||
|
||||
@keyframes move {
|
||||
from { transform: translate(-10%, -10%) scale(1); }
|
||||
to { transform: translate(10%, 10%) scale(1.1); }
|
||||
}
|
||||
|
||||
.container {
|
||||
width: 100%;
|
||||
max-width: 900px;
|
||||
padding: 2rem;
|
||||
text-align: center;
|
||||
}
|
||||
|
||||
.logo-section {
|
||||
margin-bottom: 3rem;
|
||||
animation: fadeInUp 0.8s ease-out;
|
||||
}
|
||||
|
||||
.logo {
|
||||
font-family: 'Orbitron', sans-serif;
|
||||
font-size: clamp(3rem, 10vw, 5rem);
|
||||
font-weight: 700;
|
||||
background: linear-gradient(135deg, var(--primary), var(--secondary));
|
||||
-webkit-background-clip: text;
|
||||
-webkit-text-fill-color: transparent;
|
||||
background-clip: text;
|
||||
letter-spacing: -2px;
|
||||
margin-bottom: 0.5rem;
|
||||
display: inline-block;
|
||||
}
|
||||
|
||||
.subtitle {
|
||||
color: var(--text-dim);
|
||||
font-size: 1.25rem;
|
||||
max-width: 600px;
|
||||
margin: 0 auto;
|
||||
line-height: 1.6;
|
||||
}
|
||||
|
||||
.actions {
|
||||
display: flex;
|
||||
gap: 1rem;
|
||||
justify-content: center;
|
||||
margin-bottom: 4rem;
|
||||
flex-wrap: wrap;
|
||||
animation: fadeInUp 0.8s ease-out 0.2s backwards;
|
||||
}
|
||||
|
||||
.btn {
|
||||
padding: 0.8rem 2rem;
|
||||
border-radius: 12px;
|
||||
text-decoration: none;
|
||||
font-weight: 600;
|
||||
font-size: 1rem;
|
||||
transition: all 0.3s cubic-bezier(0.4, 0, 0.2, 1);
|
||||
display: flex;
|
||||
align-items: center;
|
||||
gap: 0.5rem;
|
||||
}
|
||||
|
||||
.btn-primary {
|
||||
background: linear-gradient(135deg, var(--primary), var(--secondary));
|
||||
color: white;
|
||||
box-shadow: 0 4px 15px var(--primary-glow);
|
||||
}
|
||||
|
||||
.btn-primary:hover {
|
||||
transform: translateY(-3px) scale(1.02);
|
||||
box-shadow: 0 8px 25px var(--primary-glow);
|
||||
}
|
||||
|
||||
.btn-secondary {
|
||||
background: var(--card-bg);
|
||||
color: var(--text-main);
|
||||
border: 1px solid var(--card-border);
|
||||
backdrop-filter: blur(10px);
|
||||
}
|
||||
|
||||
.btn-secondary:hover {
|
||||
background: rgba(255, 255, 255, 0.08);
|
||||
border-color: rgba(255, 255, 255, 0.2);
|
||||
transform: translateY(-2px);
|
||||
}
|
||||
|
||||
.features-grid {
|
||||
display: grid;
|
||||
grid-template-columns: repeat(auto-fit, minmax(200px, 1fr));
|
||||
gap: 1.5rem;
|
||||
margin-top: 1rem;
|
||||
animation: fadeInUp 0.8s ease-out 0.4s backwards;
|
||||
}
|
||||
|
||||
.feature-card {
|
||||
background: var(--card-bg);
|
||||
border: 1px solid var(--card-border);
|
||||
border-radius: 16px;
|
||||
padding: 1.5rem;
|
||||
text-align: left;
|
||||
transition: all 0.3s ease;
|
||||
backdrop-filter: blur(8px);
|
||||
}
|
||||
|
||||
.feature-card:hover {
|
||||
border-color: rgba(245, 158, 11, 0.3);
|
||||
background: rgba(255, 255, 255, 0.05);
|
||||
transform: translateY(-5px);
|
||||
}
|
||||
|
||||
.feature-icon {
|
||||
font-size: 1.5rem;
|
||||
margin-bottom: 1rem;
|
||||
display: block;
|
||||
}
|
||||
|
||||
.feature-card h3 {
|
||||
font-size: 1.1rem;
|
||||
margin-bottom: 0.5rem;
|
||||
font-weight: 600;
|
||||
}
|
||||
|
||||
.feature-card p {
|
||||
color: var(--text-dim);
|
||||
font-size: 0.9rem;
|
||||
line-height: 1.5;
|
||||
}
|
||||
|
||||
footer {
|
||||
margin-top: 4rem;
|
||||
padding: 2rem;
|
||||
color: var(--text-dim);
|
||||
font-size: 0.875rem;
|
||||
animation: fadeInUp 0.8s ease-out 0.6s backwards;
|
||||
}
|
||||
|
||||
@keyframes fadeInUp {
|
||||
from { opacity: 0; transform: translateY(20px); }
|
||||
to { opacity: 1; transform: translateY(0); }
|
||||
}
|
||||
|
||||
@media (max-width: 640px) {
|
||||
.logo { font-size: 3.5rem; }
|
||||
.container { padding: 1.5rem; }
|
||||
}
|
||||
</style>
|
||||
</head>
|
||||
<body>
|
||||
<div class="bg-glow"></div>
|
||||
<div class="blob" style="top: 10%; left: 15%;"></div>
|
||||
<div class="blob" style="bottom: 10%; right: 15%; animation-delay: -5s;"></div>
|
||||
|
||||
<div class="container">
|
||||
<header class="logo-section">
|
||||
<div class="logo">DS2API</div>
|
||||
<p class="subtitle">DeepSeek to OpenAI & Claude Compatible API Interface</p>
|
||||
</header>
|
||||
|
||||
<div class="actions">
|
||||
<a href="/admin" class="btn btn-primary">
|
||||
<span>🎛️</span> 管理面板
|
||||
</a>
|
||||
<a href="/v1/models" class="btn btn-secondary">
|
||||
<span>📡</span> API 状态
|
||||
</a>
|
||||
<a href="https://github.com/CJackHwang/ds2api" class="btn btn-secondary" target="_blank">
|
||||
<span>📦</span> GitHub
|
||||
</a>
|
||||
</div>
|
||||
|
||||
<div class="features-grid">
|
||||
<div class="feature-card">
|
||||
<span class="feature-icon">🚀</span>
|
||||
<h3>全面兼容</h3>
|
||||
<p>完美适配 OpenAI 与 Claude API 格式,无缝集成现有工具。</p>
|
||||
</div>
|
||||
<div class="feature-card">
|
||||
<span class="feature-icon">⚖️</span>
|
||||
<h3>负载均衡</h3>
|
||||
<p>内置智能轮询机制,支持多账号并发,稳定高效。</p>
|
||||
</div>
|
||||
<div class="feature-card">
|
||||
<span class="feature-icon">🧠</span>
|
||||
<h3>深度思考</h3>
|
||||
<p>完整支持 推理过程输出,让思考可见。</p>
|
||||
</div>
|
||||
<div class="feature-card">
|
||||
<span class="feature-icon">🔍</span>
|
||||
<h3>联网搜索</h3>
|
||||
<p>集成 DeepSeek 原生搜索能力,获取最新实时资讯。</p>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<footer>
|
||||
<p>© 2026 DS2API Project. Designed for flexibility & performance.</p>
|
||||
</footer>
|
||||
</div>
|
||||
</body>
|
||||
</html>"""
|
||||
|
||||
|
||||
@router.get("/")
|
||||
def index(request: Request):
|
||||
return HTMLResponse(content=WELCOME_HTML)
|
||||
|
||||
|
||||
@router.get("/admin")
|
||||
@router.get("/admin/{path:path}")
|
||||
async def webui(request: Request, path: str = ""):
|
||||
"""提供 WebUI 静态文件"""
|
||||
# 检查 static/admin 目录是否存在
|
||||
if not os.path.isdir(STATIC_ADMIN_DIR):
|
||||
return HTMLResponse(
|
||||
content="<h1>WebUI not built</h1><p>Run <code>cd webui && npm run build</code> first.</p>",
|
||||
status_code=404
|
||||
)
|
||||
|
||||
# 如果请求的是具体文件(如 js, css)
|
||||
if path and "." in path:
|
||||
file_path = os.path.join(STATIC_ADMIN_DIR, path)
|
||||
if os.path.isfile(file_path):
|
||||
cache_control = "public, max-age=31536000, immutable"
|
||||
if path.startswith("assets/"):
|
||||
headers = {"Cache-Control": cache_control}
|
||||
else:
|
||||
headers = {"Cache-Control": "no-store, must-revalidate"}
|
||||
return FileResponse(file_path, headers=headers)
|
||||
return HTMLResponse(content="Not Found", status_code=404)
|
||||
|
||||
# 否则返回 index.html(SPA 路由)
|
||||
index_path = os.path.join(STATIC_ADMIN_DIR, "index.html")
|
||||
if os.path.isfile(index_path):
|
||||
headers = {"Cache-Control": "no-store, must-revalidate"}
|
||||
return FileResponse(index_path, headers=headers)
|
||||
|
||||
return HTMLResponse(content="index.html not found", status_code=404)
|
||||
|
||||
607
routes/openai.py
607
routes/openai.py
@@ -1,607 +0,0 @@
|
||||
# -*- coding: utf-8 -*-
|
||||
"""OpenAI 兼容路由"""
|
||||
import json
|
||||
import queue
|
||||
import random
|
||||
import re
|
||||
import threading
|
||||
import time
|
||||
|
||||
from curl_cffi import requests as cffi_requests
|
||||
from fastapi import APIRouter, HTTPException, Request
|
||||
from fastapi.responses import JSONResponse, StreamingResponse
|
||||
|
||||
from core.config import CONFIG, logger
|
||||
from core.auth import (
|
||||
determine_mode_and_token,
|
||||
get_auth_headers,
|
||||
release_account,
|
||||
)
|
||||
from core.deepseek import call_completion_endpoint
|
||||
from core.session_manager import (
|
||||
create_session,
|
||||
get_pow,
|
||||
cleanup_account,
|
||||
)
|
||||
from core.models import get_model_config, get_openai_models_response
|
||||
from core.sse_parser import (
|
||||
parse_deepseek_sse_line,
|
||||
parse_sse_chunk_for_content,
|
||||
extract_content_from_chunk,
|
||||
extract_content_recursive,
|
||||
should_filter_citation,
|
||||
parse_tool_calls,
|
||||
format_openai_tool_calls,
|
||||
)
|
||||
from core.constants import (
|
||||
KEEP_ALIVE_TIMEOUT,
|
||||
STREAM_IDLE_TIMEOUT,
|
||||
MAX_KEEPALIVE_COUNT,
|
||||
)
|
||||
from core.messages import messages_prepare
|
||||
|
||||
router = APIRouter()
|
||||
|
||||
# 预编译正则表达式(性能优化)
|
||||
_CITATION_PATTERN = re.compile(r"^\[citation:")
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
# ----------------------------------------------------------------------
|
||||
# 路由:/v1/models
|
||||
# ----------------------------------------------------------------------
|
||||
@router.get("/v1/models")
|
||||
def list_models():
|
||||
data = get_openai_models_response()
|
||||
return JSONResponse(content=data, status_code=200)
|
||||
|
||||
|
||||
# ----------------------------------------------------------------------
|
||||
# 路由:/v1/chat/completions
|
||||
# ----------------------------------------------------------------------
|
||||
@router.post("/v1/chat/completions")
|
||||
async def chat_completions(request: Request):
|
||||
try:
|
||||
# 处理 token 相关逻辑,若登录失败则直接返回错误响应
|
||||
try:
|
||||
determine_mode_and_token(request)
|
||||
except HTTPException as exc:
|
||||
return JSONResponse(
|
||||
status_code=exc.status_code, content={"error": exc.detail}
|
||||
)
|
||||
except Exception as exc:
|
||||
logger.error(f"[chat_completions] determine_mode_and_token 异常: {exc}")
|
||||
return JSONResponse(
|
||||
status_code=500, content={"error": "Account login failed."}
|
||||
)
|
||||
|
||||
req_data = await request.json()
|
||||
model = req_data.get("model")
|
||||
messages = req_data.get("messages", [])
|
||||
if not model or not messages:
|
||||
raise HTTPException(
|
||||
status_code=400, detail="Request must include 'model' and 'messages'."
|
||||
)
|
||||
|
||||
# 解析工具调用参数(OpenAI 格式)
|
||||
tools_requested = req_data.get("tools") or []
|
||||
has_tools = len(tools_requested) > 0
|
||||
|
||||
# 如果有工具定义,构建工具提示并注入到消息中
|
||||
messages_with_tools = messages.copy()
|
||||
if has_tools:
|
||||
tool_schemas = []
|
||||
for tool in tools_requested:
|
||||
# OpenAI 格式: {"type": "function", "function": {"name": ..., "description": ..., "parameters": ...}}
|
||||
func = tool.get("function", tool) # 兼容简化格式
|
||||
tool_name = func.get("name", "unknown")
|
||||
tool_desc = func.get("description", "No description available")
|
||||
schema = func.get("parameters", {})
|
||||
|
||||
tool_info = f"Tool: {tool_name}\nDescription: {tool_desc}"
|
||||
if "properties" in schema:
|
||||
props = []
|
||||
required = schema.get("required", [])
|
||||
for prop_name, prop_info in schema["properties"].items():
|
||||
prop_type = prop_info.get("type", "string")
|
||||
is_req = " (required)" if prop_name in required else ""
|
||||
props.append(f" - {prop_name}: {prop_type}{is_req}")
|
||||
if props:
|
||||
tool_info += f"\nParameters:\n" + "\n".join(props)
|
||||
tool_schemas.append(tool_info)
|
||||
|
||||
# 检查是否已有系统消息
|
||||
has_system = any(m.get("role") == "system" for m in messages_with_tools)
|
||||
tool_prompt = f"""You have access to these tools:
|
||||
|
||||
{chr(10).join(tool_schemas)}
|
||||
|
||||
When you need to use tools, output ONLY this JSON format (no other text):
|
||||
{{"tool_calls": [
|
||||
{{"name": "tool_name", "input": {{"param": "value"}}}}
|
||||
]}}
|
||||
|
||||
IMPORTANT: If calling tools, output ONLY the JSON. The response must start with {{ and end with }}"""
|
||||
|
||||
if has_system:
|
||||
# 追加到现有系统消息
|
||||
for i, m in enumerate(messages_with_tools):
|
||||
if m.get("role") == "system":
|
||||
messages_with_tools[i] = {
|
||||
"role": "system",
|
||||
"content": m.get("content", "") + "\n\n" + tool_prompt
|
||||
}
|
||||
break
|
||||
else:
|
||||
# 添加新的系统消息
|
||||
messages_with_tools.insert(0, {"role": "system", "content": tool_prompt})
|
||||
|
||||
# 使用会话管理器获取模型配置
|
||||
thinking_enabled, search_enabled = get_model_config(model)
|
||||
if thinking_enabled is None:
|
||||
raise HTTPException(
|
||||
status_code=503, detail=f"Model '{model}' is not available."
|
||||
)
|
||||
|
||||
# 使用 messages_prepare 函数构造最终 prompt(使用带工具提示的消息)
|
||||
final_prompt = messages_prepare(messages_with_tools)
|
||||
session_id = create_session(request)
|
||||
if not session_id:
|
||||
raise HTTPException(status_code=401, detail="invalid token.")
|
||||
|
||||
pow_resp = get_pow(request)
|
||||
if not pow_resp:
|
||||
raise HTTPException(
|
||||
status_code=401,
|
||||
detail="Failed to get PoW (invalid token or unknown error).",
|
||||
)
|
||||
|
||||
headers = {**get_auth_headers(request), "x-ds-pow-response": pow_resp}
|
||||
payload = {
|
||||
"chat_session_id": session_id,
|
||||
"parent_message_id": None,
|
||||
"prompt": final_prompt,
|
||||
"ref_file_ids": [],
|
||||
"thinking_enabled": thinking_enabled,
|
||||
"search_enabled": search_enabled,
|
||||
}
|
||||
|
||||
deepseek_resp = call_completion_endpoint(payload, headers, max_attempts=3)
|
||||
if not deepseek_resp:
|
||||
raise HTTPException(status_code=500, detail="Failed to get completion.")
|
||||
created_time = int(time.time())
|
||||
completion_id = f"{session_id}"
|
||||
|
||||
# 流式响应(SSE)或普通响应
|
||||
if bool(req_data.get("stream", False)):
|
||||
if deepseek_resp.status_code != 200:
|
||||
deepseek_resp.close()
|
||||
return JSONResponse(
|
||||
content=deepseek_resp.content, status_code=deepseek_resp.status_code
|
||||
)
|
||||
|
||||
def sse_stream():
|
||||
# 使用导入的常量(不再本地定义)
|
||||
try:
|
||||
final_text = ""
|
||||
final_thinking = ""
|
||||
first_chunk_sent = False
|
||||
result_queue = queue.Queue()
|
||||
last_send_time = time.time()
|
||||
last_content_time = time.time() # 最后收到有效内容的时间
|
||||
keepalive_count = 0 # 连续 keepalive 计数
|
||||
has_content = False # 是否收到过内容
|
||||
stream_finished = False # 是否已发送过结束标记
|
||||
|
||||
def process_data():
|
||||
"""处理 DeepSeek SSE 数据流 - 使用 sse_parser 模块"""
|
||||
nonlocal has_content
|
||||
current_fragment_type = "thinking" if thinking_enabled else "text"
|
||||
logger.info(f"[sse_stream] 开始处理数据流, session_id={session_id}")
|
||||
|
||||
try:
|
||||
for raw_line in deepseek_resp.iter_lines():
|
||||
# 解码行
|
||||
try:
|
||||
line = raw_line.decode("utf-8")
|
||||
except Exception as e:
|
||||
logger.warning(f"[sse_stream] 解码失败: {e}")
|
||||
result_queue.put({"choices": [{"index": 0, "delta": {"content": "解码失败,请稍候再试", "type": "text"}}]})
|
||||
result_queue.put(None)
|
||||
break
|
||||
|
||||
if not line:
|
||||
continue
|
||||
|
||||
if not line.startswith("data:"):
|
||||
continue
|
||||
|
||||
data_str = line[5:].strip()
|
||||
if data_str == "[DONE]":
|
||||
result_queue.put(None)
|
||||
break
|
||||
|
||||
try:
|
||||
chunk = json.loads(data_str)
|
||||
|
||||
# 检测内容审核/敏感词阻止
|
||||
if "error" in chunk or chunk.get("code") == "content_filter":
|
||||
logger.warning(f"[sse_stream] 检测到内容过滤: {chunk}")
|
||||
result_queue.put({"choices": [{"index": 0, "finish_reason": "content_filter"}]})
|
||||
result_queue.put(None)
|
||||
return
|
||||
|
||||
# 使用 sse_parser 模块解析内容
|
||||
contents, is_finished, new_fragment_type = parse_sse_chunk_for_content(
|
||||
chunk, thinking_enabled, current_fragment_type
|
||||
)
|
||||
current_fragment_type = new_fragment_type
|
||||
|
||||
if is_finished:
|
||||
result_queue.put({"choices": [{"index": 0, "finish_reason": "stop"}]})
|
||||
result_queue.put(None)
|
||||
return
|
||||
|
||||
# 处理提取的内容
|
||||
for content_text, content_type in contents:
|
||||
if content_text:
|
||||
has_content = True
|
||||
unified_chunk = {
|
||||
"choices": [{
|
||||
"index": 0,
|
||||
"delta": {"content": content_text, "type": content_type}
|
||||
}],
|
||||
"model": "",
|
||||
"chunk_token_usage": len(content_text) // 4,
|
||||
"created": 0,
|
||||
"message_id": -1,
|
||||
"parent_id": -1
|
||||
}
|
||||
result_queue.put(unified_chunk)
|
||||
|
||||
except Exception as e:
|
||||
logger.warning(f"[sse_stream] 无法解析: {data_str[:100]}, 错误: {e}")
|
||||
result_queue.put({"choices": [{"index": 0, "delta": {"content": "解析失败,请稍候再试", "type": "text"}}]})
|
||||
result_queue.put(None)
|
||||
break
|
||||
|
||||
except Exception as e:
|
||||
logger.warning(f"[sse_stream] 错误: {e}")
|
||||
result_queue.put({"choices": [{"index": 0, "delta": {"content": "服务器错误,请稍候再试", "type": "text"}}]})
|
||||
result_queue.put(None)
|
||||
finally:
|
||||
deepseek_resp.close()
|
||||
|
||||
|
||||
process_thread = threading.Thread(target=process_data)
|
||||
process_thread.start()
|
||||
|
||||
while True:
|
||||
current_time = time.time()
|
||||
|
||||
# 智能超时检测:如果已有内容且长时间无新数据,强制结束
|
||||
if has_content and (current_time - last_content_time) > STREAM_IDLE_TIMEOUT:
|
||||
logger.warning(f"[sse_stream] 智能超时: 已有内容但 {STREAM_IDLE_TIMEOUT}s 无新数据,强制结束")
|
||||
break
|
||||
|
||||
# 连续 keepalive 检测:如果已有内容且连续多次 keepalive,强制结束
|
||||
if has_content and keepalive_count >= MAX_KEEPALIVE_COUNT:
|
||||
logger.warning(f"[sse_stream] 智能超时: 连续 {MAX_KEEPALIVE_COUNT} 次 keepalive,强制结束")
|
||||
break
|
||||
|
||||
if current_time - last_send_time >= KEEP_ALIVE_TIMEOUT:
|
||||
yield ": keep-alive\n\n"
|
||||
last_send_time = current_time
|
||||
keepalive_count += 1
|
||||
continue
|
||||
|
||||
try:
|
||||
chunk = result_queue.get(timeout=0.05)
|
||||
keepalive_count = 0 # 重置 keepalive 计数
|
||||
|
||||
if chunk is None:
|
||||
prompt_tokens = len(final_prompt) // 4
|
||||
thinking_tokens = len(final_thinking) // 4
|
||||
completion_tokens = len(final_text) // 4
|
||||
usage = {
|
||||
"prompt_tokens": prompt_tokens,
|
||||
"completion_tokens": thinking_tokens + completion_tokens,
|
||||
"total_tokens": prompt_tokens + thinking_tokens + completion_tokens,
|
||||
"completion_tokens_details": {"reasoning_tokens": thinking_tokens},
|
||||
}
|
||||
|
||||
# 检测工具调用
|
||||
detected_tools = []
|
||||
finish_reason = "stop"
|
||||
if has_tools:
|
||||
detected_tools = parse_tool_calls(final_text, [{"name": t.get("function", t).get("name")} for t in tools_requested])
|
||||
if detected_tools:
|
||||
finish_reason = "tool_calls"
|
||||
|
||||
if detected_tools:
|
||||
# 发送工具调用响应
|
||||
tool_calls_data = format_openai_tool_calls(detected_tools)
|
||||
tool_chunk = {
|
||||
"id": completion_id,
|
||||
"object": "chat.completion.chunk",
|
||||
"created": created_time,
|
||||
"model": model,
|
||||
"choices": [{"delta": {"tool_calls": tool_calls_data}, "index": 0}],
|
||||
}
|
||||
yield f"data: {json.dumps(tool_chunk, ensure_ascii=False)}\n\n"
|
||||
|
||||
finish_chunk = {
|
||||
"id": completion_id,
|
||||
"object": "chat.completion.chunk",
|
||||
"created": created_time,
|
||||
"model": model,
|
||||
"choices": [{"delta": {}, "index": 0, "finish_reason": finish_reason}],
|
||||
"usage": usage,
|
||||
}
|
||||
yield f"data: {json.dumps(finish_chunk, ensure_ascii=False)}\n\n"
|
||||
yield "data: [DONE]\n\n"
|
||||
last_send_time = current_time
|
||||
stream_finished = True
|
||||
break
|
||||
|
||||
new_choices = []
|
||||
for choice in chunk.get("choices", []):
|
||||
delta = choice.get("delta", {})
|
||||
ctype = delta.get("type")
|
||||
ctext = delta.get("content", "")
|
||||
if choice.get("finish_reason") == "backend_busy":
|
||||
ctext = "服务器繁忙,请稍候再试"
|
||||
if choice.get("finish_reason") == "content_filter":
|
||||
# 内容过滤,正常结束
|
||||
pass
|
||||
if search_enabled and ctext.startswith("[citation:"):
|
||||
ctext = ""
|
||||
if ctype == "thinking":
|
||||
if thinking_enabled:
|
||||
final_thinking += ctext
|
||||
else:
|
||||
# 非 thinking 内容都作为普通文本处理(包括 ctype=None 或 "text")
|
||||
final_text += ctext
|
||||
delta_obj = {}
|
||||
if not first_chunk_sent:
|
||||
delta_obj["role"] = "assistant"
|
||||
first_chunk_sent = True
|
||||
if ctype == "thinking":
|
||||
if thinking_enabled:
|
||||
delta_obj["reasoning_content"] = ctext
|
||||
else:
|
||||
# 非 thinking 内容都作为 content 输出
|
||||
if ctext:
|
||||
delta_obj["content"] = ctext
|
||||
if delta_obj:
|
||||
new_choices.append({"delta": delta_obj, "index": choice.get("index", 0)})
|
||||
|
||||
if new_choices:
|
||||
last_content_time = current_time # 更新最后内容时间
|
||||
out_chunk = {
|
||||
"id": completion_id,
|
||||
"object": "chat.completion.chunk",
|
||||
"created": created_time,
|
||||
"model": model,
|
||||
"choices": new_choices,
|
||||
}
|
||||
yield f"data: {json.dumps(out_chunk, ensure_ascii=False)}\n\n"
|
||||
last_send_time = current_time
|
||||
except queue.Empty:
|
||||
continue
|
||||
|
||||
# 如果是超时退出且尚未发送结束标记,补发结束标记
|
||||
if has_content and not stream_finished:
|
||||
prompt_tokens = len(final_prompt) // 4
|
||||
thinking_tokens = len(final_thinking) // 4
|
||||
completion_tokens = len(final_text) // 4
|
||||
usage = {
|
||||
"prompt_tokens": prompt_tokens,
|
||||
"completion_tokens": thinking_tokens + completion_tokens,
|
||||
"total_tokens": prompt_tokens + thinking_tokens + completion_tokens,
|
||||
"completion_tokens_details": {"reasoning_tokens": thinking_tokens},
|
||||
}
|
||||
|
||||
# 检测工具调用
|
||||
detected_tools = []
|
||||
finish_reason = "stop"
|
||||
if has_tools:
|
||||
detected_tools = parse_tool_calls(final_text, [{"name": t.get("function", t).get("name")} for t in tools_requested])
|
||||
if detected_tools:
|
||||
finish_reason = "tool_calls"
|
||||
|
||||
if detected_tools:
|
||||
tool_calls_data = format_openai_tool_calls(detected_tools)
|
||||
tool_chunk = {
|
||||
"id": completion_id,
|
||||
"object": "chat.completion.chunk",
|
||||
"created": created_time,
|
||||
"model": model,
|
||||
"choices": [{"delta": {"tool_calls": tool_calls_data}, "index": 0}],
|
||||
}
|
||||
yield f"data: {json.dumps(tool_chunk, ensure_ascii=False)}\n\n"
|
||||
|
||||
finish_chunk = {
|
||||
"id": completion_id,
|
||||
"object": "chat.completion.chunk",
|
||||
"created": created_time,
|
||||
"model": model,
|
||||
"choices": [{"delta": {}, "index": 0, "finish_reason": finish_reason}],
|
||||
"usage": usage,
|
||||
}
|
||||
yield f"data: {json.dumps(finish_chunk, ensure_ascii=False)}\n\n"
|
||||
yield "data: [DONE]\n\n"
|
||||
|
||||
except Exception as e:
|
||||
logger.error(f"[sse_stream] 异常: {e}")
|
||||
# 注意:不在此处调用 cleanup_account,由外层 finally 统一处理
|
||||
|
||||
return StreamingResponse(
|
||||
sse_stream(),
|
||||
media_type="text/event-stream",
|
||||
headers={"Content-Type": "text/event-stream"},
|
||||
)
|
||||
else:
|
||||
# 非流式响应处理
|
||||
think_list = []
|
||||
text_list = []
|
||||
result = None
|
||||
|
||||
data_queue = queue.Queue()
|
||||
|
||||
def collect_data():
|
||||
nonlocal result
|
||||
current_fragment_type = "thinking" if thinking_enabled else "text"
|
||||
try:
|
||||
for raw_line in deepseek_resp.iter_lines():
|
||||
chunk = parse_deepseek_sse_line(raw_line)
|
||||
if not chunk:
|
||||
continue
|
||||
if chunk.get("type") == "done":
|
||||
data_queue.put(None)
|
||||
break
|
||||
try:
|
||||
contents, is_finished, new_fragment_type = parse_sse_chunk_for_content(
|
||||
chunk, thinking_enabled, current_fragment_type
|
||||
)
|
||||
current_fragment_type = new_fragment_type
|
||||
if is_finished:
|
||||
final_reasoning = "".join(think_list)
|
||||
final_content = "".join(text_list)
|
||||
prompt_tokens = len(final_prompt) // 4
|
||||
reasoning_tokens = len(final_reasoning) // 4
|
||||
completion_tokens = len(final_content) // 4
|
||||
|
||||
# 检测工具调用
|
||||
detected_tools = []
|
||||
finish_reason = "stop"
|
||||
if has_tools:
|
||||
detected_tools = parse_tool_calls(final_content, [{"name": t.get("function", t).get("name")} for t in tools_requested])
|
||||
if detected_tools:
|
||||
finish_reason = "tool_calls"
|
||||
|
||||
# 构建 message 对象
|
||||
message_obj = {
|
||||
"role": "assistant",
|
||||
"content": final_content if not detected_tools else None,
|
||||
}
|
||||
# 只有启用思考模式时才包含 reasoning_content
|
||||
if thinking_enabled and final_reasoning:
|
||||
message_obj["reasoning_content"] = final_reasoning
|
||||
# 添加工具调用
|
||||
if detected_tools:
|
||||
tool_calls_data = format_openai_tool_calls(detected_tools)
|
||||
message_obj["tool_calls"] = tool_calls_data
|
||||
message_obj["content"] = None
|
||||
|
||||
result = {
|
||||
"id": completion_id,
|
||||
"object": "chat.completion",
|
||||
"created": created_time,
|
||||
"model": model,
|
||||
"choices": [{
|
||||
"index": 0,
|
||||
"message": message_obj,
|
||||
"finish_reason": finish_reason,
|
||||
}],
|
||||
"usage": {
|
||||
"prompt_tokens": prompt_tokens,
|
||||
"completion_tokens": reasoning_tokens + completion_tokens,
|
||||
"total_tokens": prompt_tokens + reasoning_tokens + completion_tokens,
|
||||
"completion_tokens_details": {"reasoning_tokens": reasoning_tokens},
|
||||
},
|
||||
}
|
||||
data_queue.put("DONE")
|
||||
return
|
||||
|
||||
for content_text, content_type in contents:
|
||||
if should_filter_citation(content_text, search_enabled):
|
||||
continue
|
||||
if content_type == "thinking":
|
||||
think_list.append(content_text)
|
||||
else:
|
||||
text_list.append(content_text)
|
||||
except Exception as e:
|
||||
logger.warning(f"[collect_data] 无法解析: {chunk}, 错误: {e}")
|
||||
text_list.append("解析失败,请稍候再试")
|
||||
data_queue.put(None)
|
||||
break
|
||||
except Exception as e:
|
||||
logger.warning(f"[collect_data] 错误: {e}")
|
||||
text_list.append("处理失败,请稍候再试")
|
||||
data_queue.put(None)
|
||||
finally:
|
||||
deepseek_resp.close()
|
||||
if result is None:
|
||||
final_content = "".join(text_list)
|
||||
final_reasoning = "".join(think_list)
|
||||
prompt_tokens = len(final_prompt) // 4
|
||||
reasoning_tokens = len(final_reasoning) // 4
|
||||
completion_tokens = len(final_content) // 4
|
||||
|
||||
# 检测工具调用
|
||||
detected_tools = []
|
||||
finish_reason = "stop"
|
||||
if has_tools:
|
||||
detected_tools = parse_tool_calls(final_content, [{"name": t.get("function", t).get("name")} for t in tools_requested])
|
||||
if detected_tools:
|
||||
finish_reason = "tool_calls"
|
||||
|
||||
# 构建 message 对象
|
||||
message_obj = {
|
||||
"role": "assistant",
|
||||
"content": final_content if not detected_tools else None,
|
||||
}
|
||||
# 只有启用思考模式时才包含 reasoning_content
|
||||
if thinking_enabled and final_reasoning:
|
||||
message_obj["reasoning_content"] = final_reasoning
|
||||
# 添加工具调用
|
||||
if detected_tools:
|
||||
tool_calls_data = format_openai_tool_calls(detected_tools)
|
||||
message_obj["tool_calls"] = tool_calls_data
|
||||
message_obj["content"] = None
|
||||
|
||||
result = {
|
||||
"id": completion_id,
|
||||
"object": "chat.completion",
|
||||
"created": created_time,
|
||||
"model": model,
|
||||
"choices": [{
|
||||
"index": 0,
|
||||
"message": message_obj,
|
||||
"finish_reason": finish_reason,
|
||||
}],
|
||||
"usage": {
|
||||
"prompt_tokens": prompt_tokens,
|
||||
"completion_tokens": reasoning_tokens + completion_tokens,
|
||||
"total_tokens": prompt_tokens + reasoning_tokens + completion_tokens,
|
||||
},
|
||||
}
|
||||
data_queue.put("DONE")
|
||||
|
||||
collect_thread = threading.Thread(target=collect_data)
|
||||
collect_thread.start()
|
||||
|
||||
def generate():
    # Poll-based bridge: stream keep-alive padding while the background
    # collector thread builds the final (non-streaming) response body,
    # then emit that body as a single JSON chunk and stop.
    last_send_time = time.time()
    while True:
        current_time = time.time()
        # Periodically yield an empty chunk so idle proxies / load
        # balancers do not drop the connection during a long upstream
        # request (padding is invisible in the final concatenated body).
        if current_time - last_send_time >= KEEP_ALIVE_TIMEOUT:
            yield ""
            last_send_time = current_time
        # `result` is a closure variable assigned by the collector
        # thread; NOTE(review): cross-thread visibility relies on
        # CPython's GIL — confirm this is acceptable here.
        if not collect_thread.is_alive() and result is not None:
            yield json.dumps(result)
            break
        time.sleep(0.1)  # short poll interval; avoids a busy spin
|
||||
|
||||
return StreamingResponse(generate(), media_type="application/json")
|
||||
except HTTPException as exc:
|
||||
return JSONResponse(status_code=exc.status_code, content={"error": exc.detail})
|
||||
except Exception as exc:
|
||||
logger.error(f"[chat_completions] 未知异常: {exc}")
|
||||
return JSONResponse(status_code=500, content={"error": "Internal Server Error"})
|
||||
finally:
|
||||
cleanup_account(request)
|
||||
8
scripts/testsuite/run-live.sh
Executable file
8
scripts/testsuite/run-live.sh
Executable file
@@ -0,0 +1,8 @@
|
||||
#!/usr/bin/env bash
# Launch the live test suite from the repository root.
set -euo pipefail

# The repo root sits two directories above this script.
repo_root="$(cd "$(dirname "$0")/../.." && pwd)"
cd "$repo_root"

go run ./cmd/ds2api-tests "$@"
|
||||
|
||||
138
tests/README.md
138
tests/README.md
@@ -1,138 +0,0 @@
|
||||
# DS2API 测试文档
|
||||
|
||||
## 测试文件结构
|
||||
|
||||
```
|
||||
tests/
|
||||
├── __init__.py # 测试模块初始化
|
||||
├── test_unit.py # 单元测试(不依赖网络)
|
||||
├── test_all.py # API 集成测试
|
||||
├── test_accounts.py # 账号池测试
|
||||
└── run_tests.sh # 测试运行脚本
|
||||
```
|
||||
|
||||
## 快速开始
|
||||
|
||||
### 运行所有测试
|
||||
|
||||
```bash
|
||||
# 使用脚本
|
||||
./tests/run_tests.sh all
|
||||
|
||||
# 或直接运行
|
||||
python3 tests/test_unit.py # 单元测试
|
||||
python3 tests/test_all.py # API 测试
|
||||
```
|
||||
|
||||
### 运行单元测试
|
||||
|
||||
```bash
|
||||
python3 tests/test_unit.py
|
||||
```
|
||||
|
||||
测试内容:
|
||||
- 配置加载
|
||||
- 消息处理(`messages_prepare`)
|
||||
- WASM 缓存
|
||||
- 模型配置获取
|
||||
- 正则表达式模式
|
||||
- 流式响应解析
|
||||
- **工具调用解析**(`parse_tool_calls`)
|
||||
- **Token 估算**
|
||||
|
||||
### 运行 API 集成测试
|
||||
|
||||
```bash
|
||||
# 完整测试
|
||||
python3 tests/test_all.py
|
||||
|
||||
# 快速测试(跳过耗时测试)
|
||||
python3 tests/test_all.py --quick
|
||||
|
||||
# 指定端点
|
||||
python3 tests/test_all.py --endpoint http://your-server.com
|
||||
|
||||
# 详细输出
|
||||
python3 tests/test_all.py --verbose
|
||||
```
|
||||
|
||||
测试覆盖:
|
||||
|
||||
| 类别 | 测试项 |
|
||||
|-----|--------|
|
||||
| 基础 | 服务健康检查 |
|
||||
| OpenAI | 模型列表、非流式对话、流式对话、无效模型处理、认证错误、Reasoner 模式 |
|
||||
| Claude | 模型列表、非流式消息、流式消息、Token 计数 |
|
||||
| 高级 | 多轮对话、长输入处理 |
|
||||
| **工具调用** | OpenAI 工具调用(流式/非流式)、Claude 工具调用 |
|
||||
| **搜索模式** | OpenAI 搜索模式 |
|
||||
|
||||
### 运行账号测试
|
||||
|
||||
```bash
|
||||
# 测试所有账号登录
|
||||
python3 tests/test_accounts.py --login
|
||||
|
||||
# 测试账号轮换
|
||||
python3 tests/test_accounts.py --rotation
|
||||
|
||||
# 运行所有
|
||||
python3 tests/test_accounts.py --all
|
||||
```
|
||||
|
||||
## 配置
|
||||
|
||||
测试使用 `config.json` 中的配置:
|
||||
|
||||
```json
|
||||
{
|
||||
"keys": ["test-api-key-001"],
|
||||
"accounts": [
|
||||
{"email": "xxx@gmail.com", "password": "xxx", "token": ""}
|
||||
]
|
||||
}
|
||||
```
|
||||
|
||||
## 预期输出
|
||||
|
||||
### 单元测试
|
||||
|
||||
```
|
||||
Ran 32 tests in 9.0s
|
||||
OK
|
||||
```
|
||||
|
||||
### API 测试
|
||||
|
||||
```
|
||||
📊 测试报告
|
||||
总计: 18 个测试
|
||||
✅ 通过: 18
|
||||
❌ 失败: 0
|
||||
⏱️ 耗时: ~60s
|
||||
📈 通过率: 100.0%
|
||||
```
|
||||
|
||||
## 故障排除
|
||||
|
||||
### 服务未运行
|
||||
|
||||
```
|
||||
⚠️ 服务未运行,跳过其他测试
|
||||
```
|
||||
|
||||
解决:先启动服务 `python dev.py`
|
||||
|
||||
### 认证失败
|
||||
|
||||
```
|
||||
❌ 失败: 状态码: 401
|
||||
```
|
||||
|
||||
解决:检查 `config.json` 中的 API key 和账号配置
|
||||
|
||||
### 流式测试超时
|
||||
|
||||
可能是 DeepSeek API 响应慢,可以尝试:
|
||||
- 使用 `--quick` 模式
|
||||
- 增加测试超时时间
|
||||
@@ -1 +0,0 @@
|
||||
# DS2API 测试模块
|
||||
@@ -1,111 +0,0 @@
|
||||
#!/bin/bash
# DS2API test runner entry point.

set -e

# Always operate from the repository root (parent of this script's dir).
cd "$(dirname "$0")/.."

printf '%s\n' "=================================================="
printf '%s\n' " 🧪 DS2API 测试套件"
printf '%s\n' "=================================================="
printf '%s\n' ""

# ANSI color escape sequences (rendered later via `echo -e`).
GREEN='\033[0;32m'
RED='\033[0;31m'
YELLOW='\033[1;33m'
NC='\033[0m'
|
||||
|
||||
# 检查服务是否运行
|
||||
#######################################
# Probe the local DS2API service.
# Globals:   GREEN, RED, YELLOW, NC (read; color codes)
# Outputs:   status messages to stdout
# Returns:   0 when the service answers, 1 otherwise
#######################################
check_service() {
    echo -e "${YELLOW}检查服务状态...${NC}"
    # --max-time caps the probe at 5s so a wedged service cannot hang
    # the whole test runner (previously curl could block indefinitely).
    if curl -s --max-time 5 http://localhost:5001/ > /dev/null 2>&1; then
        echo -e "${GREEN}✅ 服务运行中${NC}"
        return 0
    else
        echo -e "${RED}❌ 服务未运行${NC}"
        echo "请先启动服务: python dev.py"
        return 1
    fi
}
|
||||
|
||||
# 运行单元测试
|
||||
#######################################
# Run the unit-test suite.
# Prefers pytest; falls back to executing the module directly when
# pytest is unavailable or fails to collect.
#######################################
run_unit_tests() {
    printf '\n'
    printf '%s\n' "=================================================="
    printf '%s\n' " 📋 单元测试"
    printf '%s\n' "=================================================="
    if ! python3 -m pytest tests/test_unit.py -v --tb=short 2>/dev/null; then
        python3 tests/test_unit.py
    fi
}
|
||||
|
||||
# 运行 API 测试
|
||||
#######################################
# Run the API integration tests, forwarding any extra CLI flags.
# Arguments: passed through to tests/test_all.py (e.g. --quick)
#######################################
run_api_tests() {
    printf '\n'
    printf '%s\n' "=================================================="
    printf '%s\n' " 🌐 API 集成测试"
    printf '%s\n' "=================================================="
    python3 tests/test_all.py "$@"
}
|
||||
|
||||
# 运行账号测试
|
||||
#######################################
# Run the full account-pool test suite (login + rotation).
#######################################
run_account_tests() {
    printf '\n'
    printf '%s\n' "=================================================="
    printf '%s\n' " 🔑 账号测试"
    printf '%s\n' "=================================================="
    python3 tests/test_accounts.py --all
}
|
||||
|
||||
# 显示帮助
|
||||
show_help() {
|
||||
echo "用法: $0 [选项]"
|
||||
echo ""
|
||||
echo "选项:"
|
||||
echo " unit 只运行单元测试"
|
||||
echo " api 只运行 API 测试"
|
||||
echo " api --quick 快速 API 测试"
|
||||
echo " accounts 只运行账号测试"
|
||||
echo " all 运行所有测试"
|
||||
echo " help 显示此帮助"
|
||||
echo ""
|
||||
echo "示例:"
|
||||
echo " $0 unit"
|
||||
echo " $0 api --quick"
|
||||
echo " $0 all"
|
||||
}
|
||||
|
||||
# Main dispatch: pick a mode (defaults to "all"), run the matching
# suite(s), then print a closing banner. Unknown modes show usage and
# exit non-zero. API tests only run when the local service is up.
mode="${1:-all}"
case "$mode" in
    unit)
        run_unit_tests
        ;;
    api)
        if check_service; then
            shift
            run_api_tests "$@"
        fi
        ;;
    accounts)
        run_account_tests
        ;;
    all)
        run_unit_tests
        printf '\n'
        if check_service; then
            run_api_tests --quick
        fi
        ;;
    help|--help|-h)
        show_help
        ;;
    *)
        printf '%s\n' "未知选项: $mode"
        show_help
        exit 1
        ;;
esac

printf '\n'
printf '%s\n' "=================================================="
printf '%s\n' " ✨ 测试完成"
printf '%s\n' "=================================================="
|
||||
Some files were not shown because too many files have changed in this diff Show More
Reference in New Issue
Block a user