Compare commits

...

47 Commits

Author SHA1 Message Date
CJACK.
c32fe30239 Merge pull request #408 from CJackHwang/dev
v4.4.0: streaming perf + Gemini thinking + DSML parser hardening
2026-05-03 07:37:14 +08:00
CJACK
03b2acfc9f docs: add DS2API project value note and link it from docs index
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-03 07:33:04 +08:00
CJACK
a299c7d1c4 refactor: remove thinking content from empty output validation logic to enforce stricter completion requirements 2026-05-03 06:59:20 +08:00
CJACK
51d3578465 fix: ensure CDATA parsing correctly tracks line offsets to preserve compact tool call content 2026-05-03 06:49:22 +08:00
CJACK
072ec57acd fix: improve CDATA parsing resilience by ignoring structural markers inside markdown fences within tool calls 2026-05-03 06:40:29 +08:00
CJACK
545ab0802f feat: extend DSML tag prefix to also recognize underscore-connected variants
Support `<dsml_tool_calls>`, `<dsml_invoke>`, `<dsml_parameter>` in
addition to the existing pipe, space, hyphen, and collapsed forms.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-03 05:39:49 +08:00
CJACK
a7522b4188 fix: retry thinking-only empty outputs, centralize reference marker stripping
- ValidateTurn no longer errors on thinking-only responses, deferring to
  ShouldRetryEmptyOutput which now also covers thinking-only outputs.
- Empty output retry uses multi-turn follow-up with a regeneration prompt
  suffix and parent_message_id in the same DeepSeek session.
- Centralize StripReferenceMarkersEnabled into textclean package to
  eliminate duplicated hardcoded booleans across 4 protocol handlers.
- Log a deprecation warning when the legacy "compat" config key is used.
- Document thinking-only retry and reference marker stripping in API.md.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-03 05:02:26 +08:00
CJACK
1286b02247 refactor: remove legacy compatibility configuration and UI components 2026-05-03 04:14:19 +08:00
CJACK
2f7cb473fc feat: support hyphenated DSML tag variants in tool-call parsing
Add compatibility for <dsml-tool-calls>/<dsml-invoke>/<dsml-parameter>
tag forms alongside the canonical pipe-prefixed DSML shell. Hyphenated
forms only activate when a DSML prefix is detected, preventing false
matches on bare XML lookalikes. Go and Node parsers aligned, with tests
covering here-doc CDATA, streaming sieve, and negative lookalike cases.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-03 03:09:10 +08:00
CJACK
ad80a57efa docs: add missing directory entries and package descriptions to architecture docs
Fill gaps identified in architecture audit: add artifacts/ and static/ to
directory tree, and document 7 auxiliary internal/ packages (textclean,
claudeconv, compat, rawsample, devcapture, util, version) in Section 3.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-03 02:51:19 +08:00
CJACK
5f110e6910 refactor: remove legacy history split configuration and integrate current input file handling into the completion runtime pipeline. 2026-05-03 01:50:50 +08:00
CJACK
7c0bc9ec0f feat: implement support for thinking blocks in Gemini API and enable thinking by default for supported models 2026-05-03 01:00:06 +08:00
CJACK
a901250de7 refactor: replace bufio.Scanner with bufio.Reader for SSE stream parsing and track emitted text to prevent redundant output blocks 2026-05-02 23:50:35 +08:00
CJACK
dc5bffdf89 refactor: centralize assistant turn semantics and stream accumulation into new assistantturn and completionruntime packages 2026-05-02 23:28:43 +08:00
CJACK
eccd8c957b fix: prevent continuation replay overlap by trimming redundant text from thinking and response streams 2026-05-02 21:34:36 +08:00
CJACK.
b1d0ee07c0 Merge pull request #406 from wyv202011y/perf/streaming-speed-optimization
perf(streaming): optimize TTFT and reduce buffering latency
2026-05-02 21:19:04 +08:00
CJACK
0156f6b45b Merge origin/dev into PR 406 2026-05-02 21:17:02 +08:00
CJACK
a9f46f5b25 chore: bump version to 4.3.1 2026-05-02 21:04:12 +08:00
CJACK
e7d6807c7c feat: emit empty completion chunk along with keep-alive heartbeat in chat stream 2026-05-02 20:54:10 +08:00
d407ccb773 perf(streaming): optimize TTFT and reduce buffering latency
Core changes:
- stream.go: New accumulation buffer architecture with scanner goroutine
  + select loop, MinChars=16, MaxWait=10ms, first-flush-immediate
- dedupe.go: Add TrimContinuationOverlapFromBuilder to avoid string copies
- claude/stream_runtime_core.go: Integrate toolstream for incremental text
- claude/stream_runtime_finalize.go: toolstream flush support
- stream_emitter.js: Reduce DeltaCoalescer thresholds (160->16 chars, 80->20ms)
- empty_retry: Add thinking-aware empty output detection
- Fix reasoning_content leak and finish_reason=null in edge cases
- Fix tail content truncation when max_tokens exceeded

Tests: sync test expectations with upstream for thinking content
2026-05-02 20:28:30 +08:00
CJACK
c8f7b6b371 refactor streaming accumulation and chat history UI 2026-05-02 20:15:38 +08:00
CJACK.
20d71f528a Merge pull request #404 from NgoQuocViet2001/ai/openai-file-retrieve
feat(openai): retrieve uploaded file metadata
2026-05-02 15:42:40 +08:00
NgoQuocViet2001
36d0239dc6 feat(openai): retrieve uploaded file metadata 2026-05-02 14:33:42 +07:00
CJACK.
e620752e2b Merge pull request #403 from VanceHud/main
修复了使用Zeabur部署会失败的问题
2026-05-02 14:10:18 +08:00
VanceHud
44cb27872c Merge branch 'CJackHwang:main' into main 2026-05-02 12:19:09 +08:00
CJACK
3d52040b3b merge: sync release smoke fix from dev 2026-05-02 04:23:02 +08:00
CJACK
049e40e5f1 fix: drop obsolete release smoke check 2026-05-02 04:19:23 +08:00
CJACK.
217e0e9f01 Merge pull request #398 from CJackHwang/dev
feat: enhanced input validation, tool call resilience & performance
2026-05-02 04:04:44 +08:00
CJACK
b3c54fcf3d chore: bump version to 4.3.0
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-02 03:55:36 +08:00
CJACK
1c38709d32 feat: add support for parsing loose JSON lists into arrays in tool call parameters 2026-05-02 03:26:43 +08:00
CJACK
28e800c670 chore: remove obsolete planning and gate target files 2026-05-02 02:40:57 +08:00
CJACK
4389e02b29 feat: implement sync.Pool for tiktoken encoding instances to optimize token counting performance 2026-05-02 02:31:24 +08:00
CJACK
e2756f800d feat: introduce JSON UTF-8 validation middleware and prepend output integrity guard system prompt to messages 2026-05-02 02:22:34 +08:00
CJACK
55abf64717 feat: add model type support for file uploads with automatic resolution and header propagation 2026-05-02 00:55:17 +08:00
CJACK
76ee2faa12 chore: bump version to 4.2.2 and update documentation to reflect improved release workflows, CI dependencies, and project structure 2026-05-01 23:44:07 +08:00
CJACK.
9c3696194c Merge pull request #393 from CJackHwang/dev
Merge pull request #391 from BigUncle/fix/vercel-admin-history-rewrite

Fix: add missing Vercel rewrite rules for admin API routes
2026-05-01 23:28:03 +08:00
CJACK
0bca6e2cee feat: implement context cancellation handling for chat and response stream runtimes to ensure clean termination without retries 2026-05-01 23:20:46 +08:00
CJACK.
934b40e572 Merge pull request #392 from wyv202011y/fix/timeout-and-context-cancel
fix: increase stream timeout constants for large-context models; guar…
2026-05-01 23:17:31 +08:00
CJACK
dd5a0c5213 refactor: update and standardize current input file continuation prompt instructions 2026-05-01 22:27:59 +08:00
CJACK
43402e7a26 refactor: rename history file constant from HISTORY.txt to DS2API_HISTORY.txt across codebase and tests 2026-05-01 22:05:45 +08:00
CJACK.
6373c001f5 Merge pull request #391 from BigUncle/fix/vercel-admin-history-rewrite
Fix: add missing Vercel rewrite rules for admin API routes
2026-05-01 21:45:14 +08:00
BigUncle
3430322e81 docs: add Vercel chat history read-only filesystem troubleshooting 2026-05-01 21:17:52 +08:00
CJACK
df1cfac9bc refactor: replace history transcript format with numbered sections and rename upload file to HISTORY.txt 2026-05-01 21:15:17 +08:00
706e68de23 fix: increase stream timeout constants for large-context models; guard against context-cancelled double-recording
- Increase StreamIdleTimeout from 90s to 300s and MaxKeepaliveCount from 10 to 40
  to prevent premature stream termination with DeepSeek V4 Pro (~50K token contexts)
- Add r.Context().Err() check after ConsumeSSE in empty_retry_runtime (chat + responses)
  to prevent historySession.error() from overwriting historySession.stopped()
  when the request context is cancelled

References:
- MaxKeepaliveCount=10 creates a 50s no-content timeout that kills the stream
  before DeepSeek V4 Pro can produce its first token with large contexts
- Hermes Agent reports 'No response from provider for 180s' because the
  underlying SSE connection was already terminated by ds2api at 50s
- Context cancellation path: OnContextDone -> stopped(), then finalize()
  with empty output -> retry -> error() overwrites stopped()
2026-05-01 21:11:36 +08:00
BigUncle
83b4c7bcad fix: add missing Vercel rewrite rules for admin API routes
/admin/chat-history, /admin/proxies, /admin/dev/raw-samples,
and /admin/dev/captures were falling through to the SPA fallback
(/admin/index.html), causing "Unexpected token '<'" JSON parse
errors on the frontend.
2026-05-01 20:50:12 +08:00
VanceHud
603801c542 Merge branch 'CJackHwang:main' into main 2026-05-01 17:18:00 +08:00
VanceHud
febd3ec83a Document Zeabur manual deployment 2026-05-01 14:29:49 +08:00
145 changed files with 6533 additions and 2636 deletions

View File

@@ -47,7 +47,6 @@ jobs:
- name: Release Blocking Gates
run: |
./tests/scripts/check-stage6-manual-smoke.sh
./tests/scripts/check-refactor-line-gate.sh
./tests/scripts/run-unit-all.sh

View File

@@ -33,6 +33,8 @@ Docs: [Overview](README.en.md) / [Architecture](docs/ARCHITECTURE.en.md) / [Depl
| Health probes | `GET /healthz`, `GET /readyz` |
| CORS | Enabled (uniformly covers `/v1/*`, `/anthropic/*`, `/v1beta/models/*`, and `/admin/*`; echoes the browser `Origin` when present, otherwise `*`; default allow-list includes `Content-Type`, `Authorization`, `X-API-Key`, `X-Ds2-Target-Account`, `X-Ds2-Source`, `X-Vercel-Protection-Bypass`, `X-Goog-Api-Key`, `Anthropic-Version`, `Anthropic-Beta`, and also accepts third-party preflight-requested headers such as `x-stainless-*`; `/v1/chat/completions` on Vercel Node Runtime matches the same behavior; internal-only `X-Ds2-Internal-Token` remains blocked) |
- All JSON request bodies must be valid UTF-8; malformed byte sequences are rejected on ingress with `400 invalid json`.
### 3.0 Adapter-Layer Notes
- OpenAI / Claude / Gemini protocols are now mounted on one shared `chi` router tree assembled in `internal/server/router.go`.
@@ -81,7 +83,7 @@ Two header formats accepted:
- Token is in `config.keys`**Managed account mode**: DS2API auto-selects an account via rotation
- Token is not in `config.keys`**Direct token mode**: treated as a DeepSeek token directly
**Optional header**: `X-Ds2-Target-Account: <email_or_mobile>` — Pin a specific managed account.
**Optional header**: `X-Ds2-Target-Account: <email_or_mobile>` — Pin a specific managed account; if the target account does not exist or the managed-account queue is exhausted, the request returns `429`, and current responses do not include `Retry-After`. If the account exists but login/refresh fails, the request returns the underlying `401` or upstream error.
Gemini-compatible clients can also send `x-goog-api-key`, `?key=`, or `?api_key=` as the caller credential source.
### Admin Endpoints (`/admin/*`)
@@ -109,6 +111,7 @@ Gemini-compatible clients can also send `x-goog-api-key`, `?key=`, or `?api_key=
| GET | `/v1/responses/{response_id}` | Business | Query stored response (in-memory TTL) |
| POST | `/v1/embeddings` | Business | OpenAI Embeddings API |
| POST | `/v1/files` | Business | OpenAI Files upload (multipart/form-data) |
| GET | `/v1/files/{file_id}` | Business | Retrieve uploaded file status |
| GET | `/anthropic/v1/models` | None | Claude model list |
| POST | `/anthropic/v1/messages` | Business | Claude messages |
| POST | `/anthropic/v1/messages/count_tokens` | Business | Claude token counting |
@@ -165,7 +168,7 @@ Gemini-compatible clients can also send `x-goog-api-key`, `?key=`, or `?api_key=
| PUT | `/admin/chat-history/settings` | Admin | Update conversation history retention limit |
| GET | `/admin/version` | Admin | Check current version and latest Release |
OpenAI `/v1/*` paths are canonical. For clients configured with the bare DS2API service URL, the same OpenAI handlers are also exposed through root shortcuts: `/models`, `/models/{id}`, `/chat/completions`, `/responses`, `/responses/{response_id}`, `/embeddings`, and `/files`.
OpenAI `/v1/*` paths are canonical. For clients configured with the bare DS2API service URL, the same OpenAI handlers are also exposed through root shortcuts: `/models`, `/models/{id}`, `/chat/completions`, `/responses`, `/responses/{response_id}`, `/embeddings`, `/files`, and `/files/{file_id}`.
---
@@ -198,10 +201,15 @@ No auth required. Returns the currently supported DeepSeek native model list.
"object": "list",
"data": [
{"id": "deepseek-v4-flash", "object": "model", "created": 1677610602, "owned_by": "deepseek", "permission": []},
{"id": "deepseek-v4-flash-nothinking", "object": "model", "created": 1677610602, "owned_by": "deepseek", "permission": []},
{"id": "deepseek-v4-pro", "object": "model", "created": 1677610602, "owned_by": "deepseek", "permission": []},
{"id": "deepseek-v4-pro-nothinking", "object": "model", "created": 1677610602, "owned_by": "deepseek", "permission": []},
{"id": "deepseek-v4-flash-search", "object": "model", "created": 1677610602, "owned_by": "deepseek", "permission": []},
{"id": "deepseek-v4-flash-search-nothinking", "object": "model", "created": 1677610602, "owned_by": "deepseek", "permission": []},
{"id": "deepseek-v4-pro-search", "object": "model", "created": 1677610602, "owned_by": "deepseek", "permission": []},
{"id": "deepseek-v4-vision", "object": "model", "created": 1677610602, "owned_by": "deepseek", "permission": []}
{"id": "deepseek-v4-pro-search-nothinking", "object": "model", "created": 1677610602, "owned_by": "deepseek", "permission": []},
{"id": "deepseek-v4-vision", "object": "model", "created": 1677610602, "owned_by": "deepseek", "permission": []},
{"id": "deepseek-v4-vision-nothinking", "object": "model", "created": 1677610602, "owned_by": "deepseek", "permission": []}
]
}
```
@@ -300,7 +308,7 @@ data: [DONE]
- When thinking is enabled, the stream may emit `delta.reasoning_content`
- Text emits `delta.content`
- Last chunk includes `finish_reason` and `usage`
- Token counting prefers pass-through from upstream DeepSeek SSE (`accumulated_token_usage` / `token_usage`), and only falls back to local estimation when upstream usage is absent
- Token counting prefers pass-through from upstream DeepSeek SSE (`accumulated_token_usage` / `token_usage`), and only falls back to local estimation when upstream usage is absent. Failed/interrupted endings (for example `response.failed`) may not include `usage`
#### Tool Calls
@@ -416,7 +424,7 @@ Business auth required. Returns OpenAI-compatible embeddings shape.
| `model` | string | ✅ | Supports native models + alias mapping |
| `input` | string/array | ✅ | Supports string, string array, token array |
> Requires `embeddings.provider`. Current supported values: `mock` / `deterministic` / `builtin`. If missing/unsupported, returns standard error shape with HTTP 501.
> Requires `embeddings.provider`. Current supported values: `mock` / `deterministic` / `builtin` (all three use the same local deterministic implementation). If missing/unsupported, returns standard error shape with HTTP 501.
### `POST /v1/files`
@@ -430,9 +438,13 @@ Business auth required. OpenAI Files-compatible upload endpoint; currently only
Constraints and behavior:
- `Content-Type` must be `multipart/form-data` (otherwise `400`).
- Total request size limit is `100 MiB` (over-limit returns `413`).
- Total request size limit is **100 MiB** (over-limit returns `413`).
- Success returns an OpenAI `file` object (`id/object/bytes/filename/purpose/status`, etc.) and includes `account_id` for source-account tracing.
### `GET /v1/files/{file_id}`
Business auth required. Retrieves the current DeepSeek upload status for a file and returns an OpenAI `file` object. Returns `404` when no matching file is found.
---
## Claude-Compatible API
@@ -484,6 +496,13 @@ anthropic-version: 2023-06-01
| `stream` | boolean | ❌ | Default `false` |
| `system` | string | ❌ | Optional system prompt |
| `tools` | array | ❌ | Claude tool schema |
| `thinking` | object | ❌ | Anthropic thinking config; translated into downstream reasoning control, and ignored by `-nothinking` models |
| `temperature` | number | ❌ | Passed through to the downstream bridge; if `temperature` and `top_p` are both present, `temperature` wins |
| `top_p` | number | ❌ | Passed through when `temperature` is absent |
| `stop_sequences` | array | ❌ | Passed through as downstream stop sequences |
| `tool_choice` | string/object | ❌ | Supports `auto` / `none` / `required` / `{"type":"function","name":"..."}` and is translated to downstream tool choice |
> Note: `thinking`, `temperature`, `top_p`, `stop_sequences`, and `tool_choice` are translated through the compatibility bridge. Final behavior still depends on the selected model and upstream support. When both `temperature` and `top_p` are present, `temperature` takes precedence.
#### Non-Stream Response
@@ -536,7 +555,7 @@ data: {"type":"message_stop"}
**Notes**:
- Models whose names contain `opus` / `reasoner` / `slow` stream `thinking_delta`
- Models that support thinking emit `thinking` blocks / `thinking_delta` by default; explicit thinking disablement or `-nothinking` models suppress them
- `signature_delta` is not emitted (DeepSeek does not provide verifiable thinking signatures)
- In `tools` mode, the stream avoids leaking raw tool JSON and does not force `input_json_delta`
@@ -582,6 +601,7 @@ Request body accepts Gemini-style `contents` / `tools`. Model names can use alia
Response uses Gemini-compatible fields, including:
- `candidates[].content.parts[].text`
- `candidates[].content.parts[].thought=true` for thinking output
- `candidates[].content.parts[].functionCall` (when tool call is produced)
- `usageMetadata` (`promptTokenCount` / `candidatesTokenCount` / `totalTokenCount`)
@@ -590,6 +610,7 @@ Response uses Gemini-compatible fields, including:
Returns SSE (`text/event-stream`), each chunk as `data: <json>`:
- regular text: incremental text chunks
- thinking: incremental chunks with `parts[].thought=true`
- `tools` mode: buffered and emitted as `functionCall` at finalize phase
- final chunk: includes `finishReason: "STOP"` and `usageMetadata`
- Token counting prefers pass-through from upstream DeepSeek SSE (`accumulated_token_usage` / `token_usage`), and only falls back to local estimation when upstream usage is absent
@@ -712,7 +733,6 @@ Reads runtime settings and status, including:
- `success`
- `admin` (`has_password_hash`, `jwt_expire_hours`, `jwt_valid_after_unix`, `default_password_warning`)
- `runtime` (`account_max_inflight`, `account_max_queue`, `global_max_inflight`, `token_refresh_interval_hours`)
- `compat` (`wide_input_strict_output`, `strip_reference_markers`)
- `responses` / `embeddings`
- `auto_delete` (`mode`: `none` / `single` / `all`; legacy `sessions=true` is still treated as `all`)
- `current_input_file` (`enabled` defaults to `true`, plus `min_chars`)
@@ -726,13 +746,11 @@ Hot-updates runtime settings. Supported fields:
- `admin.jwt_expire_hours`
- `runtime.account_max_inflight` / `runtime.account_max_queue` / `runtime.global_max_inflight` / `runtime.token_refresh_interval_hours`
- `compat.wide_input_strict_output` / `compat.strip_reference_markers`
- `responses.store_ttl_seconds`
- `embeddings.provider`
- `auto_delete.mode`
- `current_input_file.enabled` / `current_input_file.min_chars`
- `model_aliases`
- `history_split` is retained only for legacy config compatibility and no longer affects requests
- `toolcall` policy is fixed and is no longer writable through settings
### `POST /admin/settings/password`
@@ -756,9 +774,9 @@ Imports full config with:
The request can send config directly, or wrapped as `{"config": {...}, "mode":"merge"}`.
Query params `?mode=merge` / `?mode=replace` are also supported.
`replace` mode replaces the full config shape while preserving Vercel sync metadata. `merge` mode merges `keys`, `api_keys`, `accounts`, and `model_aliases`, and overwrites non-empty fields under `admin`, `runtime`, `responses`, and `embeddings`. Manage `compat`, `auto_delete`, and `current_input_file` via `/admin/settings` or the config file; `history_split` remains only for legacy compatibility; legacy `toolcall` fields are ignored.
`replace` mode replaces the full config shape while preserving Vercel sync metadata. `merge` mode merges `keys`, `api_keys`, `accounts`, and `model_aliases`, and overwrites non-empty fields under `admin`, `runtime`, `responses`, and `embeddings`. Manage `auto_delete` and `current_input_file` via `/admin/settings` or the config file; legacy `compat` and `toolcall` fields are ignored.
> Note: `merge` mode does not update `compat`, `auto_delete`, or `current_input_file`.
> Note: `merge` mode does not update `auto_delete` or `current_input_file`.
### `GET /admin/config/export`
@@ -1212,7 +1230,7 @@ Clients should handle HTTP status code plus `error` / `detail` fields.
| Code | Meaning |
| --- | --- |
| `401` | Authentication failed (invalid key/token, or expired admin JWT) |
| `429` | Too many requests (exceeded inflight + queue capacity) |
| `429` | Too many requests (exceeded inflight + queue capacity; current responses do not include `Retry-After`) |
| `503` | Model unavailable or upstream error |
---

39
API.md
View File

@@ -33,12 +33,16 @@
| 健康检查 | `GET /healthz``GET /readyz` |
| CORS | 已启用(统一覆盖 `/v1/*``/anthropic/*``/v1beta/models/*``/admin/*`;浏览器有 `Origin` 时回显该 Origin否则为 `*`;默认允许 `Content-Type`, `Authorization`, `X-API-Key`, `X-Ds2-Target-Account`, `X-Ds2-Source`, `X-Vercel-Protection-Bypass`, `X-Goog-Api-Key`, `Anthropic-Version`, `Anthropic-Beta`,并会放行预检里声明的第三方请求头,如 `x-stainless-*`Vercel 上 `/v1/chat/completions` 的 Node Runtime 也对齐相同行为;内部专用头 `X-Ds2-Internal-Token` 仍被拦截) |
- 所有 JSON 请求体都必须是合法 UTF-8非法字节序列会在入站阶段被拒绝为 `400 invalid json`
### 3.0 接口适配层说明
- OpenAI / Claude / Gemini 三套协议已统一挂在同一 `chi` 路由树上,由 `internal/server/router.go` 负责装配。
- 适配器层职责收敛为:**请求归一化 → DeepSeek 调用 → 协议形态渲染**,减少历史版本中“同能力多处实现”的分叉。
- Tool Calling 的解析策略在 Go 与 Node Runtime 间保持一致:推荐模型输出 DSML 外壳 `<|DSML|tool_calls>``<|DSML|invoke name="...">``<|DSML|parameter name="...">`;兼容层也接受 DSML wrapper 别名 `<dsml|tool_calls>``<|tool_calls>``<tool_calls>`、常见 DSML 分隔符漏写形态(如 `<|DSML tool_calls>`)、`DSML` 与工具标签名黏连的常见 typo`<DSMLtool_calls>`),以及旧式 canonical XML `<tool_calls>``<invoke name="...">``<parameter name="...">`。实现上采用窄容错结构扫描:只有 `tool_calls` wrapper 或可修复的缺失 opening wrapper 会进入工具路径,裸 `<invoke>` 不计为已支持语法;流式场景继续执行防泄漏筛分。若参数体本身是合法 JSON 字面量(如 `123``true``null`、数组或对象),会按结构化值输出,不再一律当作字符串;若 CDATA 偶发漏闭合,则会在最终 parse / flush 恢复阶段做窄修复,尽量保住已完整包裹的外层工具调用。
- `Admin API` 将配置与运行时策略分开:`/admin/config*` 管静态配置,`/admin/settings*` 管运行时行为。
- 当上游返回 thinking-only 响应(模型输出了推理链但无可见文本)时,非流式补全会自动重试一次:以多轮对话 follow-up 方式追加 prompt 后缀 `"Previous reply had no visible output. Please regenerate the visible final answer or tool call now."` 并设置 `parent_message_id` 在同一 DeepSeek session 内让模型重新输出;重试最大 1 次。
- 引用标记剥离strip reference markers当前为固定开启的运行时行为所有协议适配层统一生效。
---
@@ -81,7 +85,7 @@ Vercel 一键部署可先只填 `DS2API_ADMIN_KEY`,部署后在 `/admin` 导
- token 在 `config.keys` 中 → **托管账号模式**,自动轮询选择账号
- token 不在 `config.keys` 中 → **直通 token 模式**,直接作为 DeepSeek token 使用
**可选请求头**`X-Ds2-Target-Account: <email_or_mobile>` — 指定使用某个托管账号。
**可选请求头**`X-Ds2-Target-Account: <email_or_mobile>` — 指定使用某个托管账号;如果目标账号不存在,或管理账号队列已耗尽,相关业务请求会返回 `429`,当前不会附带 `Retry-After` 头。若账号存在但登录/刷新失败,则返回对应的 `401` 或上游错误
Gemini 兼容客户端还可以使用 `x-goog-api-key``?key=``?api_key=` 作为凭据来源。
### Admin 接口(`/admin/*`
@@ -109,6 +113,7 @@ Gemini 兼容客户端还可以使用 `x-goog-api-key`、`?key=` 或 `?api_key=`
| GET | `/v1/responses/{response_id}` | 业务 | 查询已生成 response内存 TTL |
| POST | `/v1/embeddings` | 业务 | OpenAI Embeddings 接口 |
| POST | `/v1/files` | 业务 | OpenAI Files 上传multipart/form-data |
| GET | `/v1/files/{file_id}` | 业务 | 查询已上传文件状态 |
| GET | `/anthropic/v1/models` | 无 | Claude 模型列表 |
| POST | `/anthropic/v1/messages` | 业务 | Claude 消息接口 |
| POST | `/anthropic/v1/messages/count_tokens` | 业务 | Claude token 计数 |
@@ -165,7 +170,7 @@ Gemini 兼容客户端还可以使用 `x-goog-api-key`、`?key=` 或 `?api_key=`
| PUT | `/admin/chat-history/settings` | Admin | 更新对话记录保留条数 |
| GET | `/admin/version` | Admin | 查询当前版本与最新 Release |
OpenAI `/v1/*` 仍是规范路径。对于只配置 DS2API 根地址的客户端,同一套 OpenAI handler 也通过根路径快捷路由暴露:`/models``/models/{id}``/chat/completions``/responses``/responses/{response_id}``/embeddings``/files`
OpenAI `/v1/*` 仍是规范路径。对于只配置 DS2API 根地址的客户端,同一套 OpenAI handler 也通过根路径快捷路由暴露:`/models``/models/{id}``/chat/completions``/responses``/responses/{response_id}``/embeddings``/files``/files/{file_id}`
---
@@ -307,7 +312,7 @@ data: [DONE]
- 开启 thinking 时会输出 `delta.reasoning_content`
- 普通文本输出 `delta.content`
- 最后一段包含 `finish_reason``usage`
- token 计数优先透传上游 DeepSeek SSE`accumulated_token_usage` / `token_usage`);仅在上游缺失时回退本地估算
- token 计数优先透传上游 DeepSeek SSE`accumulated_token_usage` / `token_usage`);仅在上游缺失时回退本地估算。失败/中断型结束(例如 `response.failed`)可能不会携带 `usage`
#### Tool Calls
@@ -424,7 +429,7 @@ data: [DONE]
| `model` | string | ✅ | 支持原生模型 + alias 自动映射 |
| `input` | string/array | ✅ | 支持字符串、字符串数组、token 数组 |
> 需配置 `embeddings.provider`。当前支持:`mock` / `deterministic` / `builtin`。未配置或不支持时返回标准错误结构HTTP 501
> 需配置 `embeddings.provider`。当前支持:`mock` / `deterministic` / `builtin`(三者都走同一套本地确定性实现)。未配置或不支持时返回标准错误结构HTTP 501
### `POST /v1/files`
@@ -438,9 +443,13 @@ data: [DONE]
约束与行为:
- 请求必须为 `multipart/form-data`,否则返回 `400`
- 请求体总大小上限 `100 MiB`(超限返回 `413`)。
- 请求体总大小上限 **100 MiB**(超限返回 `413`)。
- 成功返回 OpenAI `file` 对象(`id/object/bytes/filename/purpose/status` 等字段),并附带 `account_id` 便于定位来源账号。
### `GET /v1/files/{file_id}`
需要业务鉴权。查询 DeepSeek 上传文件的当前状态,并返回 OpenAI `file` 对象;未找到匹配文件时返回 `404`
---
## Claude 兼容接口
@@ -495,6 +504,13 @@ anthropic-version: 2023-06-01
| `stream` | boolean | ❌ | 默认 `false` |
| `system` | string | ❌ | 可选系统提示 |
| `tools` | array | ❌ | Claude tool 定义 |
| `thinking` | object | ❌ | Anthropic thinking 配置;会转译为下游 reasoning 控制,`-nothinking` 模型会忽略 |
| `temperature` | number | ❌ | 透传到下游;若同时提供 `top_p`,以 `temperature` 为准 |
| `top_p` | number | ❌ | 当未提供 `temperature` 时透传到下游 |
| `stop_sequences` | array | ❌ | 透传到下游停用序列 |
| `tool_choice` | string/object | ❌ | 支持 `auto` / `none` / `required` / `{"type":"function","name":"..."}`,并会转译为下游工具选择 |
> 说明:上述 `thinking`、`temperature`、`top_p`、`stop_sequences`、`tool_choice` 都会走兼容层转译;最终是否生效仍取决于当前模型和上游能力。`temperature` 与 `top_p` 同时存在时,`temperature` 优先。
#### 非流式响应
@@ -547,7 +563,7 @@ data: {"type":"message_stop"}
**说明**
- 默认模型会按各 surface 的既有规则输出 thinking / reasoning 相关增量
- 默认支持 thinking 的模型会输出 `thinking` block / `thinking_delta`;请求显式关闭 thinking 或使用 `-nothinking` 模型时不会输出
-`-nothinking` 后缀的模型会强制关闭 thinking即使请求显式传了 `thinking` / `reasoning` / `reasoning_effort` 也不会输出 `thinking_delta`
- 不会输出 `signature_delta`(上游 DeepSeek 未提供可验证签名)
- `tools` 场景优先避免泄露原始工具 JSON不强制发送 `input_json_delta`
@@ -594,6 +610,7 @@ data: {"type":"message_stop"}
响应为 Gemini 兼容结构,核心字段包括:
- `candidates[].content.parts[].text`
- `candidates[].content.parts[].thought=true`thinking 输出)
- `candidates[].content.parts[].functionCall`(工具调用时)
- `usageMetadata``promptTokenCount` / `candidatesTokenCount` / `totalTokenCount`
@@ -602,6 +619,7 @@ data: {"type":"message_stop"}
返回 SSE`text/event-stream`),每个 chunk 为一条 `data: <json>`
- 常规文本:持续返回增量文本 chunk
- thinking持续返回 `parts[].thought=true` 的增量 chunk
- `tools` 场景:会缓冲并在结束时输出 `functionCall` 结构
- 结束 chunk包含 `finishReason: "STOP"``usageMetadata`
- token 计数优先透传上游 DeepSeek SSE`accumulated_token_usage` / `token_usage`);仅在上游缺失时回退本地估算
@@ -724,7 +742,6 @@ data: {"type":"message_stop"}
- `success`
- `admin``has_password_hash``jwt_expire_hours``jwt_valid_after_unix``default_password_warning`
- `runtime``account_max_inflight``account_max_queue``global_max_inflight``token_refresh_interval_hours`
- `compat``wide_input_strict_output``strip_reference_markers`
- `responses` / `embeddings`
- `auto_delete``mode``none` / `single` / `all`;旧配置 `sessions=true` 仍按 `all` 处理)
- `current_input_file``enabled` 默认返回 `true``min_chars`
@@ -738,13 +755,11 @@ data: {"type":"message_stop"}
- `admin.jwt_expire_hours`
- `runtime.account_max_inflight` / `runtime.account_max_queue` / `runtime.global_max_inflight` / `runtime.token_refresh_interval_hours`
- `compat.wide_input_strict_output` / `compat.strip_reference_markers`
- `responses.store_ttl_seconds`
- `embeddings.provider`
- `auto_delete.mode`
- `current_input_file.enabled` / `current_input_file.min_chars`
- `model_aliases`
- `history_split` 仅作为旧配置兼容字段保留,不再影响请求处理
- `toolcall` 策略已固定,不再作为可写入字段
### `POST /admin/settings/password`
@@ -768,9 +783,9 @@ data: {"type":"message_stop"}
请求可直接传配置对象,或使用 `{"config": {...}, "mode":"merge"}` 包裹格式。
也支持在查询参数里传 `?mode=merge` / `?mode=replace`
`replace` 模式会按完整配置结构替换(保留 Vercel 同步元信息);`merge` 模式会合并 `keys``api_keys``accounts``model_aliases`,并覆盖 `admin``runtime``responses``embeddings` 中的非空字段。`compat``auto_delete``current_input_file` 建议通过 `/admin/settings` 或配置文件管理;`history_split` 仅保留为旧配置兼容字段;`toolcall` 相关字段会被忽略。
`replace` 模式会按完整配置结构替换(保留 Vercel 同步元信息);`merge` 模式会合并 `keys``api_keys``accounts``model_aliases`,并覆盖 `admin``runtime``responses``embeddings` 中的非空字段。`auto_delete``current_input_file` 建议通过 `/admin/settings` 或配置文件管理;`compat``toolcall` 相关字段会被忽略。
> 注意:`merge` 模式不会更新 `compat`、`auto_delete`、`current_input_file`。
> 注意:`merge` 模式不会更新 `auto_delete`、`current_input_file`。
### `GET /admin/config/export`
@@ -1226,7 +1241,7 @@ Gemini 路由使用 Google 风格错误结构:
| 状态码 | 说明 |
| --- | --- |
| `401` | 鉴权失败key/token 无效,或 Admin JWT 过期) |
| `429` | 请求过多(超出并发上限 + 等待队列) |
| `429` | 请求过多(超出并发上限 + 等待队列;当前不附带 `Retry-After` |
| `503` | 模型不可用或上游服务异常 |
---

View File

@@ -17,7 +17,7 @@
语言 / Language: [中文](README.MD) | [English](README.en.md)
将 DeepSeek Web 对话能力转换为 OpenAI、Claude 与 Gemini 兼容 API。后端 **Go 全量实现**,前端为 React WebUI 管理台(源码在 `webui/`,部署时自动构建到 `static/admin`)。
将 DeepSeek Web 对话能力转换为 OpenAI、Claude 与 Gemini 兼容 API。核心后端 **Go** 实现Vercel 流式桥接额外使用少量 Node Runtime,前端为 React WebUI 管理台(源码在 `webui/`,部署时自动构建到 `static/admin`)。
文档入口:[文档导航](docs/README.md) / [架构说明](docs/ARCHITECTURE.md) / [接口文档](API.md)
@@ -76,13 +76,14 @@ flowchart LR
subgraph Runtime["运行时核心能力"]
Compat["PromptCompat\n(API -> 网页纯文本上下文)"]
Chat["Chat / Responses Runtime\n(统一工具调用与流式语义)"]
Completion["Completion Runtime\n(Session / PoW / Completion)"]
Turn["AssistantTurn\n(输出语义归一)"]
Auth["Auth Resolver\n(API key / bearer / x-goog-api-key)"]
Pool["Account Pool + Queue\n(并发槽位 + 等待队列)"]
DSClient["DeepSeek Client\n(Session / Auth / Completion / Files)"]
Pow["PoW 实现\n(纯 Go)"]
Tool["Tool Sieve\n(Go/Node 语义对齐)"]
History["History Split\n(长历史文件化)"]
History["Current Input File\n(DS2API_HISTORY.txt)"]
end
end
@@ -94,18 +95,19 @@ flowchart LR
OA --> Compat
CA & GA --> Compat
Compat --> Chat
Compat -.长历史.-> History
Vercel -.Go prepare.-> Chat
Compat --> Completion
Completion -.完整上下文.-> History
Completion --> Turn
Vercel -.Go prepare.-> Completion
Vercel -.Node SSE.-> Tool
Chat --> Auth
Chat -.账号轮询.-> Pool
Chat -.工具调用解析.-> Tool
Chat -.PoW 计算.-> Pow
Completion --> Auth
Completion -.账号轮询.-> Pool
Completion -.工具调用解析.-> Tool
Completion -.PoW 计算.-> Pow
Auth --> DSClient
DSClient --> Upstream
Upstream --> DSClient
Chat --> Client
Turn --> Client
Vercel --> Client
```
@@ -119,7 +121,7 @@ flowchart LR
| 能力 | 说明 |
| --- | --- |
| OpenAI 兼容 | `GET /v1/models`、`GET /v1/models/{id}`、`POST /v1/chat/completions`、`POST /v1/responses`、`GET /v1/responses/{response_id}`、`POST /v1/embeddings`、`POST /v1/files` |
| OpenAI 兼容 | `GET /v1/models`、`GET /v1/models/{id}`、`POST /v1/chat/completions`、`POST /v1/responses`、`GET /v1/responses/{response_id}`、`POST /v1/embeddings`、`POST /v1/files`、`GET /v1/files/{file_id}` |
| Claude 兼容 | `GET /anthropic/v1/models`、`POST /anthropic/v1/messages`、`POST /anthropic/v1/messages/count_tokens`(及快捷路径 `/v1/messages`、`/messages` |
| Gemini 兼容 | `POST /v1beta/models/{model}:generateContent`、`POST /v1beta/models/{model}:streamGenerateContent`(及 `/v1/models/{model}:*` 路径) |
| 统一 CORS 兼容 | `/v1/*`、`/anthropic/*`、`/v1beta/models/*`、`/admin/*` 统一走同一套 CORS 策略Vercel 上 `/v1/chat/completions` 的 Node Runtime 也对齐相同放行规则,尽量减少第三方预检请求头限制 |
@@ -131,7 +133,7 @@ flowchart LR
| WebUI 管理台 | `/admin` 单页应用(中英文双语、深色模式,支持查看服务器端对话记录) |
| 运维探针 | `GET /healthz`(存活)、`GET /readyz`(就绪) |
OpenAI `/v1/*` 仍是推荐的规范路径;同时支持 `/models`、`/chat/completions`、`/responses`、`/embeddings`、`/files` 等根路径快捷路由,方便只配置 DS2API 根地址的第三方客户端。
OpenAI `/v1/*` 仍是推荐的规范路径;同时支持 `/models`、`/chat/completions`、`/responses`、`/embeddings`、`/files`、`/files/{file_id}` 等根路径快捷路由,方便只配置 DS2API 根地址的第三方客户端。
## 平台兼容矩阵
@@ -257,6 +259,10 @@ docker-compose logs -f
2. 部署完成后访问 `/admin`,使用 Zeabur 环境变量/模板指引中的 `DS2API_ADMIN_KEY` 登录。
3. 在管理台导入/编辑配置(会写入并持久化到 `/data/config.json`)。
Zeabur 首次空卷启动时可以没有 `/data/config.json`DS2API 会先使用空的文件模式配置启动,并在管理台首次保存时创建该文件。
不依赖模板手动部署时,在 Zeabur 中选择 GitHub 仓库服务Root Directory 保持 `/`,使用仓库根目录 `Dockerfile` 构建;添加持久卷 `/data`,设置 `PORT=5001`、`DS2API_ADMIN_KEY=你的强密钥`、`DS2API_CONFIG_PATH=/data/config.json`,然后暴露 HTTP 端口 `5001`。更完整步骤见 [docs/DEPLOY.md](docs/DEPLOY.md#不使用模板手动部署)。
说明Zeabur 使用仓库内 `Dockerfile` 直接构建时,不需要额外传入 `BUILD_VERSION`;镜像会优先读取该构建参数,未提供时自动回退到仓库根目录的 `VERSION` 文件。
### 方式三Vercel 部署
@@ -285,7 +291,7 @@ base64 < config.json | tr -d '\n'
### 方式四:本地源码运行
**前置要求**Go 1.26+Node.js `20.19+` 或 `22.12+`(仅在需要构建 WebUI 时)
**前置要求**Go 1.26+Node.js `20.19+` 或 `22.12+`(仅在需要构建 WebUI 时);同时确保 `npm` 可用,建议 `npm 10+`
```bash
# 1. 克隆仓库
@@ -304,7 +310,7 @@ go run ./cmd/ds2api
服务实际绑定:`0.0.0.0:5001`,因此同一局域网设备通常也可以通过你的内网 IP 访问。
> **WebUI 自动构建**:本地首次启动时,若 `static/admin` 不存在,会自动尝试执行 `npm ci`(仅在缺少依赖时)和 `npm run build -- --outDir static/admin --emptyOutDir`(需要本机有 Node.js。你也可以手动构建`./scripts/build-webui.sh`
> **WebUI 自动构建**:本地首次启动时,若 `static/admin` 不存在,会自动尝试执行 `npm ci`(仅在缺少依赖时)和 `npm run build -- --outDir static/admin --emptyOutDir`(需要本机有 Node.js 和 npm)。你也可以手动构建:`./scripts/build-webui.sh`
## 配置说明
@@ -317,8 +323,7 @@ go run ./cmd/ds2api
- `model_aliases`OpenAI / Claude / Gemini 共用的模型 alias 映射。
- `runtime`:账号并发、队列与 token 刷新策略,可通过 Admin Settings 热更新。
- `auto_delete.mode`:请求结束后的远端会话清理策略,支持 `none` / `single` / `all`。
- `history_split`:旧轮次拆分字段,已废弃并忽略,仅保留兼容旧配置
- `current_input_file`:唯一生效的独立拆分策略;默认开启且阈值为 `0`,触发时将完整上下文合并上传为 `history.txt` 上下文文件。
- `current_input_file`:全局生效的上下文拆分上传策略;默认开启且阈值为 `0`,触发时将完整上下文合并上传为 `DS2API_HISTORY.txt` 上下文文件
- 如果关闭 `current_input_file`,请求会直接透传,不上传拆分上下文文件。
- `thinking_injection`:默认开启;在最新 user 消息末尾追加思考增强提示词,提高高强度推理与工具调用前的思考稳定性;`prompt` 留空时使用内置默认提示词。
@@ -334,6 +339,7 @@ go run ./cmd/ds2api
| **直通 token 模式** | 传入 token 不在 `config.keys` 中时,直接作为 DeepSeek token 使用 |
可选请求头 `X-Ds2-Target-Account`:指定使用某个托管账号(值为 email 或 mobile
如果指定账号不存在,或者当前管理账号队列已满,请求会返回 `429`;当前 `429` 不附带 `Retry-After` 头。若账号存在但登录/刷新失败,则返回对应的鉴权错误。
Gemini 路由还可以使用 `x-goog-api-key`,或在没有认证头时使用 `?key=` / `?api_key=` 作为调用方凭据。
## 并发模型
@@ -346,7 +352,7 @@ Gemini 路由还可以使用 `x-goog-api-key`,或在没有认证头时使用 `
```
- 当 in-flight 槽位满时,请求进入等待队列,**不会立即 429**
- 超出总承载上限后才返回 `429 Too Many Requests`
- 超出总承载上限后才返回 `429 Too Many Requests`,当前响应不附带 `Retry-After`
- `GET /admin/queue/status` 返回实时并发状态
## Tool Call 适配
@@ -424,10 +430,10 @@ npm run build --prefix webui
工作流文件:`.github/workflows/release-artifacts.yml`
- **触发条件**:仅在 GitHub Release `published` 时触发(普通 push 不会触发)
- **构建产物**:多平台二进制包(`linux/amd64`、`linux/arm64`、`linux/armv7`、`darwin/amd64`、`darwin/arm64`、`windows/amd64`、`windows/arm64`+ `sha256sums.txt`
- **触发条件**默认仅在 GitHub Release `published` 时自动触发;也支持在 Actions 页面手动 `workflow_dispatch`,并填写 `release_tag` 复跑/补发
- **构建产物**:多平台二进制包(`linux/amd64`、`linux/arm64`、`linux/armv7`、`darwin/amd64`、`darwin/arm64`、`windows/amd64`、`windows/arm64`、Linux Docker 镜像导出包 + `sha256sums.txt`
- **容器镜像发布**:仅推送到 GHCR`ghcr.io/cjackhwang/ds2api`
- **每个压缩包包含**`ds2api` 可执行文件、`static/admin`、WASM 文件(同时支持内置 fallback、`config.example.json` 配置示例、README、LICENSE
- **每个二进制压缩包包含**`ds2api` 可执行文件、`static/admin`、`config.example.json`、`.env.example`、`README.MD`、`README.en.md`、`LICENSE`
## 免责声明

View File

@@ -16,7 +16,7 @@
Language: [中文](README.MD) | [English](README.en.md)
DS2API converts DeepSeek Web chat capability into OpenAI-compatible, Claude-compatible, and Gemini-compatible APIs. The backend is a **pure Go implementation**, with a React WebUI admin panel (source in `webui/`, build output auto-generated to `static/admin` during deployment).
DS2API converts DeepSeek Web chat capability into OpenAI-compatible, Claude-compatible, and Gemini-compatible APIs. The core backend is Go-based, with a small Node Runtime bridge used for Vercel streaming, and the React WebUI admin panel lives in `webui/` (build output auto-generated to `static/admin` during deployment).
Documentation entry: [Docs Index](docs/README.md) / [Architecture](docs/ARCHITECTURE.en.md) / [API Reference](API.en.md)
@@ -73,13 +73,14 @@ flowchart LR
subgraph Runtime["Runtime + Core Capabilities"]
Compat["PromptCompat\n(API -> web-chat plain text context)"]
Chat["Chat / Responses Runtime\n(unified tools + stream semantics)"]
Completion["Completion Runtime\n(session / PoW / completion)"]
Turn["AssistantTurn\n(output semantic normalization)"]
Auth["Auth Resolver\n(API key / bearer / x-goog-api-key)"]
Pool["Account Pool + Queue\n(in-flight slots + wait queue)"]
DSClient["DeepSeek Client\n(session / auth / completion / files)"]
Pow["PoW Solver\n(Pure Go)"]
Tool["Tool Sieve\n(Go/Node semantic parity)"]
History["History Split\n(long history as files)"]
History["Current Input File\n(DS2API_HISTORY.txt)"]
end
end
@@ -91,18 +92,19 @@ flowchart LR
OA --> Compat
CA & GA --> Compat
Compat --> Chat
Compat -.long history.-> History
Vercel -.Go prepare.-> Chat
Compat --> Completion
Completion -.full context.-> History
Completion --> Turn
Vercel -.Go prepare.-> Completion
Vercel -.Node SSE.-> Tool
Chat --> Auth
Chat -.account rotation.-> Pool
Chat -.tool-call parsing.-> Tool
Chat -.PoW solving.-> Pow
Completion --> Auth
Completion -.account rotation.-> Pool
Completion -.tool-call parsing.-> Tool
Completion -.PoW solving.-> Pow
Auth --> DSClient
DSClient --> Upstream
Upstream --> DSClient
Chat --> Client
Turn --> Client
Vercel --> Client
```
@@ -116,7 +118,7 @@ For the full module-by-module architecture and directory responsibilities, see [
| Capability | Details |
| --- | --- |
| OpenAI compatible | `GET /v1/models`, `GET /v1/models/{id}`, `POST /v1/chat/completions`, `POST /v1/responses`, `GET /v1/responses/{response_id}`, `POST /v1/embeddings`, `POST /v1/files` |
| OpenAI compatible | `GET /v1/models`, `GET /v1/models/{id}`, `POST /v1/chat/completions`, `POST /v1/responses`, `GET /v1/responses/{response_id}`, `POST /v1/embeddings`, `POST /v1/files`, `GET /v1/files/{file_id}` |
| Claude compatible | `GET /anthropic/v1/models`, `POST /anthropic/v1/messages`, `POST /anthropic/v1/messages/count_tokens` (plus shortcut paths `/v1/messages`, `/messages`) |
| Gemini compatible | `POST /v1beta/models/{model}:generateContent`, `POST /v1beta/models/{model}:streamGenerateContent` (plus `/v1/models/{model}:*` paths) |
| Unified CORS compatibility | `/v1/*`, `/anthropic/*`, `/v1beta/models/*`, and `/admin/*` share one CORS policy; on Vercel, the Node Runtime for `/v1/chat/completions` mirrors the same relaxed preflight behavior for third-party clients |
@@ -128,7 +130,7 @@ For the full module-by-module architecture and directory responsibilities, see [
| WebUI Admin Panel | SPA at `/admin` (bilingual Chinese/English, dark mode, with server-side conversation history) |
| Health Probes | `GET /healthz` (liveness), `GET /readyz` (readiness) |
OpenAI `/v1/*` routes remain canonical, and DS2API also accepts root shortcuts such as `/models`, `/chat/completions`, `/responses`, `/embeddings`, and `/files` for clients configured with the bare service URL.
OpenAI `/v1/*` routes remain canonical, and DS2API also accepts root shortcuts such as `/models`, `/chat/completions`, `/responses`, `/embeddings`, `/files`, and `/files/{file_id}` for clients configured with the bare service URL.
## Platform Compatibility Matrix
@@ -245,6 +247,10 @@ Rebuild after updates: `docker-compose up -d --build`
2. After deployment, open `/admin` and login with `DS2API_ADMIN_KEY` shown in Zeabur env/template instructions.
3. Import / edit config in Admin UI (it will be written and persisted to `/data/config.json`).
Fresh Zeabur volumes can start without `/data/config.json`; DS2API will boot with an empty file-backed config and create the file on the first Admin UI save.
For manual deployment without the template, create a Zeabur GitHub service, keep Root Directory as `/`, build with the repo-root `Dockerfile`, mount a persistent volume at `/data`, set `PORT=5001`, `DS2API_ADMIN_KEY=your-strong-secret`, and `DS2API_CONFIG_PATH=/data/config.json`, then expose HTTP port `5001`. See [docs/DEPLOY.en.md](docs/DEPLOY.en.md#manual-deployment-without-the-template) for the full guide.
Note: when Zeabur builds directly from the repo `Dockerfile`, you do not need to pass `BUILD_VERSION`. The image prefers that build arg when provided, and automatically falls back to the repo-root `VERSION` file when it is absent.
### Option 3: Vercel
@@ -305,8 +311,7 @@ Common fields:
- `model_aliases`: one shared alias map for OpenAI / Claude / Gemini model names.
- `runtime`: account concurrency, queueing, and token refresh behavior, hot-reloadable via Admin Settings.
- `auto_delete.mode`: remote session cleanup after each request, supporting `none` / `single` / `all`.
- `history_split`: legacy multi-turn history split field, now ignored and kept only for backward-compatible config loading.
- `current_input_file`: the only active split mode; it is enabled by default and uploads the full context as a `history.txt` context file once the character threshold is reached.
- `current_input_file`: the global context split/upload mode; it is enabled by default and uploads the full context as a `DS2API_HISTORY.txt` context file once the character threshold is reached.
- If you turn off `current_input_file`, requests pass through directly without uploading any split context file.
For the full environment variable list, see [docs/DEPLOY.en.md](docs/DEPLOY.en.md). For auth behavior, see [API.en.md](API.en.md#authentication).
@@ -409,10 +414,10 @@ npm run build --prefix webui
Workflow: `.github/workflows/release-artifacts.yml`
- **Trigger**: only on GitHub Release `published` (normal pushes do not trigger builds)
- **Outputs**: multi-platform archives (`linux/amd64`, `linux/arm64`, `linux/armv7`, `darwin/amd64`, `darwin/arm64`, `windows/amd64`, `windows/arm64`) + `sha256sums.txt`
- **Trigger**: by default only on GitHub Release `published`; you can also run it manually via `workflow_dispatch` and pass `release_tag` to rerun / backfill
- **Outputs**: multi-platform binary archives (`linux/amd64`, `linux/arm64`, `linux/armv7`, `darwin/amd64`, `darwin/arm64`, `windows/amd64`, `windows/arm64`), Linux Docker image export tarballs, and `sha256sums.txt`
- **Container publishing**: GHCR only (`ghcr.io/cjackhwang/ds2api`)
- **Each archive includes**: `ds2api` executable, `static/admin`, WASM file (with embedded fallback support), `config.example.json`-based config template, README, LICENSE
- **Each binary archive includes**: the `ds2api` executable, `static/admin`, `config.example.json`, `.env.example`, `README.MD`, `README.en.md`, and `LICENSE`
## Disclaimer

View File

@@ -1 +1 @@
4.2.1
4.4.0

View File

@@ -43,10 +43,6 @@
"gpt-5.3-codex": "deepseek-v4-pro",
"o3": "deepseek-v4-pro"
},
"compat": {
"wide_input_strict_output": true,
"strip_reference_markers": true
},
"responses": {
"store_ttl_seconds": 900
},

View File

@@ -15,6 +15,7 @@ ds2api/
│ └── workflows/ # GitHub Actions workflows
├── api/ # Serverless entrypoints (Vercel Go/Node)
├── app/ # Application-level handler assembly
├── artifacts/ # Debug artifacts (raw-stream-sim, stream-debug, etc.)
├── cmd/ # Executable entrypoints
│ ├── ds2api/ # Main service bootstrap
│ └── ds2api-tests/ # E2E testsuite CLI bootstrap
@@ -25,6 +26,8 @@ ds2api/
│ ├── chathistory/ # Server-side conversation history storage/query
│ ├── claudeconv/ # Claude message conversion helpers
│ ├── compat/ # Compatibility and regression helpers
│ ├── assistantturn/ # Upstream output to canonical assistant turn / stream event semantics
│ ├── completionruntime/ # Shared Go DeepSeek completion startup, non-stream collection, and retry
│ ├── config/ # Config loading/validation/hot reload
│ ├── deepseek/ # DeepSeek upstream client/protocol/transport
│ │ ├── client/ # Login/session/completion/upload/delete calls
@@ -38,13 +41,14 @@ ds2api/
│ │ ├── admin/ # Admin API root assembly and resource packages
│ │ ├── claude/ # Claude HTTP protocol adapter
│ │ ├── gemini/ # Gemini HTTP protocol adapter
│ │ ── openai/ # OpenAI HTTP surface
│ │ ├── chat/ # Chat Completions execution entrypoint
│ │ ├── responses/ # Responses API and response store
│ │ ├── files/ # Files API and inline-file preprocessing
│ │ ├── embeddings/ # Embeddings API
│ │ ├── history/ # OpenAI context file handling
│ │ └── shared/ # OpenAI HTTP errors/models/tool formatting
│ │ ── openai/ # OpenAI HTTP surface
│ │ ├── chat/ # Chat Completions execution entrypoint
│ │ ├── responses/ # Responses API and response store
│ │ ├── files/ # Files API and inline-file preprocessing
│ │ ├── embeddings/ # Embeddings API
│ │ ├── history/ # OpenAI context file handling
│ │ └── shared/ # OpenAI HTTP errors/models/tool formatting
│ │ └── requestbody/ # HTTP body reading and UTF-8/JSON validation helpers
│ ├── js/ # Node runtime related logic
│ │ ├── chat-stream/ # Node streaming bridge
│ │ ├── helpers/ # JS helper modules
@@ -61,13 +65,14 @@ ds2api/
│ ├── textclean/ # Text cleanup
│ ├── toolcall/ # Tool-call parsing and repair
│ ├── toolstream/ # Go streaming tool-call anti-leak and delta detection
│ ├── translatorcliproxy/ # Cross-protocol translation bridge
│ ├── translatorcliproxy/ # Vercel/fallback/test protocol translation bridge
│ ├── util/ # Shared utility helpers
│ ├── version/ # Version query/compare
│ └── webui/ # WebUI static hosting logic
├── plans/ # Stage plans and manual QA records
├── pow/ # PoW standalone implementation + benchmarks
├── scripts/ # Build/release helper scripts
├── static/ # Build artifacts (admin static resources)
├── tests/ # Test assets and scripts
│ ├── compat/ # Compatibility fixtures + expected outputs
│ │ ├── expected/ # Expected output samples
@@ -76,9 +81,9 @@ ds2api/
│ │ └── toolcalls/ # Tool-call fixtures
│ ├── node/ # Node unit tests
│ ├── raw_stream_samples/ # Upstream raw SSE samples
│ │ ├── content-filter-trigger-20260405-jwt3/ # Content-filter terminal sample
│ │ ├── continue-thinking-snapshot-replay-20260405/ # Continue-thinking sample
│ │ ├── guangzhou-weather-reasoner-search-20260404/ # Search/reference sample
│ │ ├── longtext-deepseek-v4-flash-20260429/ # Flash long-text/file-upload sample
│ │ ├── longtext-deepseek-v4-pro-20260429/ # Pro long-text/file-upload sample
│ │ ├── markdown-format-example-20260405/ # Markdown sample
│ │ └── markdown-format-example-20260405-spacefix/ # Space-fix sample
│ ├── scripts/ # Test entry scripts
@@ -91,6 +96,8 @@ ds2api/
├── features/ # Feature modules
│ ├── account/ # Account management page
│ ├── apiTester/ # API tester page
│ ├── chatHistory/ # Server-side conversation history page
│ ├── proxy/ # Proxy management page
│ ├── settings/ # Settings page
│ └── vercel/ # Vercel sync page
├── layout/ # Layout components
@@ -124,8 +131,11 @@ flowchart LR
subgraph RUNTIME[Shared runtime]
AUTH[internal/auth]
POOL[internal/account queue + concurrency]
CR[internal/completionruntime]
TURN[internal/assistantturn]
STREAM[internal/stream + internal/sse]
TOOL[internal/toolcall + internal/toolstream]
FMT[internal/format/openai + claude]
DS[internal/deepseek/client]
POW[pow + internal/deepseek/protocol]
end
@@ -151,16 +161,24 @@ flowchart LR
PC --> PROMPT
PC -.long history.-> HIST
PC --> AUTH
PC --> CR
NCS -.Go prepare/release.-> CHAT
NCS --> JS
JS --> TOOL
AUTH --> POOL
CHAT --> STREAM
RESP --> STREAM
CHAT --> CR
RESP --> CR
CA --> CR
GA --> CR
CR --> DS
CR --> STREAM
CR --> TURN
STREAM --> TURN
STREAM --> TOOL
POOL --> DS
TURN --> FMT
POOL --> CR
DS --> POW
DS --> U[DeepSeek upstream]
```
@@ -169,9 +187,12 @@ flowchart LR
- `internal/server`: router tree + middlewares (health, protocol routes, Admin/WebUI).
- `internal/httpapi/openai/*`: OpenAI HTTP surface split into chat, responses, files, embeddings, history, and shared packages; chat/responses share the promptcompat, stream, and toolcall semantics.
- `internal/httpapi/{claude,gemini}`: protocol wrappers that normalize into the same prompt compatibility semantics without duplicating upstream execution.
- `internal/httpapi/{claude,gemini}`: protocol adapters that normalize into the same prompt compatibility semantics; normal direct paths must share DeepSeek session/PoW/completion execution through `completionruntime`, while `translatorcliproxy` is reserved for Vercel prepare/release, missing-backend fallback, and regression tests.
- `internal/httpapi/requestbody`: shared HTTP body reading, JSON pre-validation, and UTF-8 error helpers across protocol adapters.
- `internal/promptcompat`: compatibility core for turning OpenAI/Claude/Gemini requests into DeepSeek web-chat plain-text context.
- `internal/translatorcliproxy`: structure translation between Claude/Gemini and OpenAI.
- `internal/assistantturn`: Go output-side canonical semantics, converting DeepSeek SSE collection results and stream finalization state into assistant turns and centralizing thinking, tool call, citation, usage, stop/error behavior.
- `internal/completionruntime`: shared Go completion execution helpers for DeepSeek session/PoW/call startup, non-stream collection, and empty-output retry; streaming paths use it to start upstream requests, continue to use `internal/stream` for real-time consumption, and use `assistantturn` during finalization.
- `internal/translatorcliproxy`: bridge compatibility layer for Claude/Gemini and OpenAI shape translation; it is not the main business protocol conversion center.
- `internal/deepseek/{client,protocol,transport}`: upstream requests, sessions, PoW adaptation, protocol constants, and transport details.
- `internal/js/chat-stream` + `api/chat-stream.js`: Vercel Node streaming bridge; Go prepare/release owns auth, account lease, and completion payload assembly, while Node relays real-time SSE with Go-aligned finalization and tool sieve semantics.
- `internal/stream` + `internal/sse`: Go stream parsing and incremental assembly.
@@ -180,6 +201,13 @@ flowchart LR
- `internal/chathistory`: server-side conversation history persistence, pagination, detail lookup, and retention policy.
- `internal/config`: config loading/validation + runtime settings hot-reload.
- `internal/account`: managed account pool, inflight slots, waiting queue.
- `internal/textclean`: text cleanup helpers, e.g. stripping `[reference: N]` markers.
- `internal/claudeconv`: Claude API request to DeepSeek format conversion.
- `internal/compat`: compatibility regression tests using SSE fixtures to verify output consistency.
- `internal/rawsample`: upstream raw response capture, read/write, and management.
- `internal/devcapture`: developer debug capture, storing HTTP request/response for troubleshooting.
- `internal/util`: cross-package utilities including JSON writing, type conversion, token counting, thinking parsing, etc.
- `internal/version`: version query and comparison, supporting build-time injection and runtime resolution.
## 4. WebUI Runtime Relation

View File

@@ -15,6 +15,7 @@ ds2api/
│ └── workflows/ # GitHub Actions 工作流
├── api/ # Serverless 入口Vercel Go/Node
├── app/ # 应用级 handler 装配层
├── artifacts/ # 调试产物raw-stream-sim, stream-debug 等)
├── cmd/ # 可执行程序入口
│ ├── ds2api/ # 主服务启动入口
│ └── ds2api-tests/ # E2E 测试集 CLI 入口
@@ -25,6 +26,8 @@ ds2api/
│ ├── chathistory/ # 服务器端对话记录存储与查询
│ ├── claudeconv/ # Claude 消息格式转换工具
│ ├── compat/ # 兼容性辅助与回归支持
│ ├── assistantturn/ # 上游输出到统一 assistant turn / stream event 的语义层
│ ├── completionruntime/ # Go 主路径共享 DeepSeek completion 启动、非流式收集与 retry
│ ├── config/ # 配置加载、校验、热更新
│ ├── deepseek/ # DeepSeek 上游 client/protocol/transport
│ │ ├── client/ # 登录、会话、completion、上传/删除等上游调用
@@ -38,13 +41,14 @@ ds2api/
│ │ ├── admin/ # Admin API 根装配与资源子包
│ │ ├── claude/ # Claude HTTP 协议适配
│ │ ├── gemini/ # Gemini HTTP 协议适配
│ │ ── openai/ # OpenAI HTTP surface
│ │ ├── chat/ # Chat Completions 执行入口
│ │ ├── responses/ # Responses API 与 response store
│ │ ├── files/ # Files API 与 inline file 预处理
│ │ ├── embeddings/ # Embeddings API
│ │ ├── history/ # OpenAI context file handling
│ │ └── shared/ # OpenAI HTTP 公共错误/模型/工具格式
│ │ ── openai/ # OpenAI HTTP surface
│ │ ├── chat/ # Chat Completions 执行入口
│ │ ├── responses/ # Responses API 与 response store
│ │ ├── files/ # Files API 与 inline file 预处理
│ │ ├── embeddings/ # Embeddings API
│ │ ├── history/ # OpenAI context file handling
│ │ └── shared/ # OpenAI HTTP 公共错误/模型/工具格式
│ │ └── requestbody/ # HTTP 请求体读取与 UTF-8/JSON 校验辅助
│ ├── js/ # Node Runtime 相关逻辑
│ │ ├── chat-stream/ # Node 流式输出桥接
│ │ ├── helpers/ # JS 辅助函数
@@ -61,13 +65,14 @@ ds2api/
│ ├── textclean/ # 文本清洗
│ ├── toolcall/ # 工具调用解析与修复
│ ├── toolstream/ # Go 流式 tool call 防泄漏与增量检测
│ ├── translatorcliproxy/ # 协议互转桥
│ ├── translatorcliproxy/ # Vercel/fallback/测试用协议互转桥
│ ├── util/ # 通用工具函数
│ ├── version/ # 版本查询/比较
│ └── webui/ # WebUI 静态托管相关逻辑
├── plans/ # 阶段计划与人工验收记录
├── pow/ # PoW 独立实现与基准
├── scripts/ # 构建/发布/辅助脚本
├── static/ # 构建产物admin 等静态资源)
├── tests/ # 测试资源与脚本
│ ├── compat/ # 兼容性夹具与期望输出
│ │ ├── expected/ # 预期结果样本
@@ -76,9 +81,9 @@ ds2api/
│ │ └── toolcalls/ # toolcall 夹具
│ ├── node/ # Node 单元测试
│ ├── raw_stream_samples/ # 上游原始 SSE 样本
│ │ ├── content-filter-trigger-20260405-jwt3/ # 风控终态样本
│ │ ├── continue-thinking-snapshot-replay-20260405/ # continue 样本
│ │ ├── guangzhou-weather-reasoner-search-20260404/ # 搜索+引用样本
│ │ ├── longtext-deepseek-v4-flash-20260429/ # flash 长文本/文件上传样本
│ │ ├── longtext-deepseek-v4-pro-20260429/ # pro 长文本/文件上传样本
│ │ ├── markdown-format-example-20260405/ # Markdown 样本
│ │ └── markdown-format-example-20260405-spacefix/ # 空格修复样本
│ ├── scripts/ # 测试脚本入口
@@ -91,6 +96,8 @@ ds2api/
├── features/ # 功能模块
│ ├── account/ # 账号管理页面
│ ├── apiTester/ # API 测试页面
│ ├── chatHistory/ # 服务器端对话记录页面
│ ├── proxy/ # 代理管理页面
│ ├── settings/ # 设置页面
│ └── vercel/ # Vercel 同步页面
├── layout/ # 布局组件
@@ -124,8 +131,11 @@ flowchart LR
subgraph RUNTIME[Shared runtime]
AUTH[internal/auth]
POOL[internal/account queue + concurrency]
CR[internal/completionruntime]
TURN[internal/assistantturn]
STREAM[internal/stream + internal/sse]
TOOL[internal/toolcall + internal/toolstream]
FMT[internal/format/openai + claude]
DS[internal/deepseek/client]
POW[pow + internal/deepseek/protocol]
end
@@ -151,16 +161,24 @@ flowchart LR
PC --> PROMPT
PC -.长历史.-> HIST
PC --> AUTH
PC --> CR
NCS -.Go prepare/release.-> CHAT
NCS --> JS
JS --> TOOL
AUTH --> POOL
CHAT --> STREAM
RESP --> STREAM
CHAT --> CR
RESP --> CR
CA --> CR
GA --> CR
CR --> DS
CR --> STREAM
CR --> TURN
STREAM --> TURN
STREAM --> TOOL
POOL --> DS
TURN --> FMT
POOL --> CR
DS --> POW
DS --> U[DeepSeek upstream]
```
@@ -169,9 +187,12 @@ flowchart LR
- `internal/server`路由树和中间件挂载健康检查、协议入口、Admin/WebUI
- `internal/httpapi/openai/*`OpenAI HTTP surface按 chat、responses、files、embeddings、history、shared 拆分chat/responses 共享 promptcompat、stream、toolcall 等核心语义。
- `internal/httpapi/{claude,gemini}`:协议输入输出适配,归一到同一套 prompt compatibility 语义,不重复实现上游调用逻辑
- `internal/httpapi/{claude,gemini}`:协议输入输出适配,归一到同一套 prompt compatibility 语义;正常直连路径必须通过 `completionruntime` 共享 DeepSeek session/PoW/completion 调用,`translatorcliproxy` 仅保留给 Vercel prepare/release、后端缺失 fallback 和回归测试
- `internal/httpapi/requestbody`跨协议复用的请求体读取、JSON 解码前置校验与 UTF-8 错误处理辅助。
- `internal/promptcompat`OpenAI/Claude/Gemini 请求到 DeepSeek 网页纯文本上下文的兼容内核。
- `internal/translatorcliproxy`Claude/Gemini 与 OpenAI 结构互转
- `internal/assistantturn`Go 输出侧统一语义层,把 DeepSeek SSE 收集结果和流式收尾状态归一成 assistant turn集中处理 thinking、tool call、citation、usage、stop/error 语义
- `internal/completionruntime`Go surface 共享的 completion 执行辅助,负责 DeepSeek session/PoW/call 启动、非流式 collect 和 empty-output retry流式路径复用它启动上游请求继续用 `internal/stream` 做实时消费,并在最终收尾阶段接入 `assistantturn`
- `internal/translatorcliproxy`Claude/Gemini 与 OpenAI 结构互转的桥接兼容层,不作为主业务协议转换中心。
- `internal/deepseek/{client,protocol,transport}`上游请求、会话、PoW 适配、协议常量与传输层。
- `internal/js/chat-stream` + `api/chat-stream.js`Vercel Node 流式桥Go prepare/release 管理鉴权、账号租约和 completion payloadNode 侧负责实时 SSE 转发并保持 Go 对齐的终结态和 tool sieve 语义。
- `internal/stream` + `internal/sse`Go 流式解析与增量处理。
@@ -180,6 +201,13 @@ flowchart LR
- `internal/chathistory`:服务器端对话记录持久化、分页、单条详情和保留策略。
- `internal/config`:配置加载、校验、运行时 settings 热更新。
- `internal/account`:托管账号池、并发槽位、等待队列。
- `internal/textclean`:文本清洗,移除 `[reference: N]` 标记等噪声。
- `internal/claudeconv`Claude API 请求到 DeepSeek 格式的协议转换。
- `internal/compat`:兼容性回归测试套件,用 SSE 夹具验证输出一致性。
- `internal/rawsample`:上游原始响应的采集、读写与管理。
- `internal/devcapture`:开发调试抓包,存储 HTTP 请求/响应用于问题排查。
- `internal/util`:跨包通用工具,含 JSON 写入、类型转换、token 计数、thinking 解析等。
- `internal/version`:版本号查询与比较,支持构建注入和运行时解析。
## 4. WebUI 与运行时关系

View File

@@ -36,7 +36,7 @@ go run ./cmd/ds2api
cd webui
# 2. Install dependencies
npm install
npm ci
# 3. Start dev server (hot reload)
npm run dev

View File

@@ -36,7 +36,7 @@ go run ./cmd/ds2api
cd webui
# 2. 安装依赖
npm install
npm ci
# 3. 启动开发服务器(热更新)
npm run dev

View File

@@ -64,8 +64,8 @@ Use `config.json` as the single source of truth:
Built-in GitHub Actions workflow: `.github/workflows/release-artifacts.yml`
- **Trigger**: only on Release `published` (no build on normal push)
- **Outputs**: multi-platform binary archives + `sha256sums.txt`
- **Trigger**: by default only on Release `published`; you can also run it manually via `workflow_dispatch` and pass `release_tag` to rerun / backfill
- **Outputs**: multi-platform binary archives, Linux Docker image export tarballs, and `sha256sums.txt`
- **Container publishing**: GHCR only (`ghcr.io/cjackhwang/ds2api`)
| Platform | Architecture | Format |
@@ -197,7 +197,7 @@ This repo includes a `zeabur.yaml` template for one-click deployment on Zeabur:
Notes:
- **Port**: DS2API listens on `5001` by default; the template sets `PORT=5001`.
- **Persistent config**: the template mounts `/data` and sets `DS2API_CONFIG_PATH=/data/config.json`. After importing config in Admin UI, it will be written and persisted to this path.
- **Persistent config**: the template mounts `/data` and sets `DS2API_CONFIG_PATH=/data/config.json`. On a fresh volume, DS2API starts with an empty file-backed config; after importing config in Admin UI, it will be written and persisted to this path.
- **`open /app/config.json: permission denied`**: this means the instance is trying to persist runtime tokens to a read-only path (commonly `/app` inside the image).
Recommended handling:
1. Set a writable path explicitly: `DS2API_CONFIG_PATH=/data/config.json` (and mount a persistent volume at `/data`);
@@ -206,6 +206,37 @@ Notes:
- **Build version**: Zeabur / regular `docker build` does not require `BUILD_VERSION` by default. The image prefers that build arg when provided, and automatically falls back to the repo-root `VERSION` file when it is absent.
- **First login**: after deployment, open `/admin` and login with `DS2API_ADMIN_KEY` shown in Zeabur env/template instructions (recommended: rotate to a strong secret after first login).
#### Manual Deployment Without The Template
If you do not want to use the `zeabur.yaml` one-click template, deploy directly from the repo root with Zeabur's GitHub integration:
1. Fork this repo, or push the code to your own GitHub repository.
2. In Zeabur Dashboard, create a Project, add a Service, then choose a GitHub/Git repository source.
3. Select the repository and branch. Keep Root Directory as `/`.
4. Use the Dockerfile build path. Zeabur auto-detects the repo-root `Dockerfile`; do not set `ZBPACK_IGNORE_DOCKERFILE=true`. If the UI asks for a Dockerfile name, enter `Dockerfile`.
5. Add a persistent volume in the Service settings and mount it at `/data`.
6. Configure environment variables:
| Variable | Recommended value | Description |
| --- | --- | --- |
| `PORT` | `5001` | Service listen port; keep it aligned with the exposed Zeabur HTTP port. |
| `DS2API_ADMIN_KEY` | Strong random string | Required admin login key. |
| `DS2API_CONFIG_PATH` | `/data/config.json` | Recommended persistent config path. |
| `LOG_LEVEL` | `INFO` | Optional log level. |
| `DS2API_CONFIG_JSON` | Raw JSON or Base64 JSON | Optional config bootstrap from env. |
| `DS2API_ENV_WRITEBACK` | `1` | Optional; enable only when using `DS2API_CONFIG_JSON` and you want the initial config written to `/data/config.json`. |
7. Expose HTTP port `5001`. The health check path can be `/healthz`.
8. After deployment, open `/admin`, login with `DS2API_ADMIN_KEY`, then import or edit config in Admin UI. A fresh volume does not need `/data/config.json` up front; the service boots first and creates the file on the first save.
Troubleshooting:
- **Startup log says `open /data/config.json: no such file or directory`**: make sure you deployed a version that includes the fresh-volume bootstrap fix, then redeploy the latest code.
- **`open /app/config.json: permission denied`**: the config path still points at the read-only image directory; mount `/data` and set `DS2API_CONFIG_PATH=/data/config.json`.
- **Config disappears after restart**: check that the `/data` persistent volume is mounted on this service. If you use `DS2API_CONFIG_JSON` but want Admin UI saves persisted, enable `DS2API_ENV_WRITEBACK=1`.
References: Zeabur's official [GitHub/Git integration](https://zeabur.com/docs/en-US/deploy/github), [Dockerfile deployment](https://zeabur.com/docs/en-US/deploy/dockerfile), and [Volumes](https://zeabur.com/docs/data-management/volumes) docs.
---
## 3. Vercel Deployment
@@ -271,6 +302,7 @@ VERCEL_TEAM_ID=team_xxxxxxxxxxxx # optional for personal accounts
| `VERCEL_TOKEN` | Vercel sync token | — |
| `VERCEL_PROJECT_ID` | Vercel project ID | — |
| `VERCEL_TEAM_ID` | Vercel team ID | — |
| `DS2API_CHAT_HISTORY_PATH` | Chat history storage path (must be set to `/tmp/chat_history.json` on Vercel, otherwise unavailable due to read-only filesystem) | `data/chat_history.json` |
| `DS2API_VERCEL_PROTECTION_BYPASS` | Deployment protection bypass for internal Node→Go calls | — |
### 3.4 Vercel Architecture
@@ -360,6 +392,22 @@ If API responses return Vercel HTML `Authentication Required`:
- **Option B**: Add `x-vercel-protection-bypass` header to requests
- **Option C**: Set `VERCEL_AUTOMATION_BYPASS_SECRET` (or `DS2API_VERCEL_PROTECTION_BYPASS`) for internal Node→Go calls
#### Chat History Unavailable (read-only file system)
```text
create chat history dir: mkdir /var/task/data: read-only file system
```
**Cause**: Vercel Serverless functions have a read-only filesystem (`/var/task`). Chat history fails because it cannot create directories there.
**Fix**: Add the following in Vercel Project Settings → Environment Variables:
```text
DS2API_CHAT_HISTORY_PATH=/tmp/chat_history.json
```
`/tmp` is the only writable directory in Vercel Serverless. Data is ephemeral (not persisted across cold starts), but the feature works within a single instance lifetime.
### 3.6 Build Artifacts Not Committed
- `static/admin` directory is not in Git
@@ -402,7 +450,7 @@ Or step by step:
```bash
cd webui
npm install
npm ci
npm run build
# Output goes to static/admin/
```

View File

@@ -64,8 +64,8 @@ cp config.example.json config.json
仓库内置 GitHub Actions 工作流:`.github/workflows/release-artifacts.yml`
- **触发条件**:仅在 Release `published`触发(普通 push 不会构建)
- **构建产物**:多平台二进制压缩包 + `sha256sums.txt`
- **触发条件**默认仅在 Release `published`自动触发;也支持在 Actions 页面手动 `workflow_dispatch`,并填写 `release_tag` 复跑/补发
- **构建产物**:多平台二进制压缩包、Linux Docker 镜像导出包 + `sha256sums.txt`
- **容器镜像发布**:仅发布到 GHCR`ghcr.io/cjackhwang/ds2api`
| 平台 | 架构 | 文件格式 |
@@ -197,7 +197,7 @@ healthcheck:
部署要点:
- **端口**:服务默认监听 `5001`,模板会固定设置 `PORT=5001`
- **配置持久化**:模板挂载卷 `/data`,并设置 `DS2API_CONFIG_PATH=/data/config.json`;在管理台导入配置后,会写入并持久化到该路径。
- **配置持久化**:模板挂载卷 `/data`,并设置 `DS2API_CONFIG_PATH=/data/config.json`首次空卷启动时会先使用空的文件模式配置,在管理台导入配置后,会写入并持久化到该路径。
- **`open /app/config.json: permission denied`**:说明当前实例在尝试把运行时 token 持久化到只读路径(常见于镜像内 `/app`)。
处理建议:
1. 显式设置可写路径:`DS2API_CONFIG_PATH=/data/config.json`(并挂载持久卷到 `/data`
@@ -206,6 +206,37 @@ healthcheck:
- **构建版本号**Zeabur / 普通 `docker build` 默认不需要传 `BUILD_VERSION`;镜像会优先使用该构建参数,未提供时自动回退到仓库根目录的 `VERSION` 文件。
- **首次登录**:部署完成后访问 `/admin`,使用 Zeabur 环境变量/模板指引中的 `DS2API_ADMIN_KEY` 登录(建议首次登录后自行更换为强密码)。
#### 不使用模板手动部署
如果你不想使用 `zeabur.yaml` 一键模板,可以直接用 Zeabur 的 GitHub 集成从仓库根目录构建:
1. Fork 本仓库,或把代码推送到你自己的 GitHub 仓库。
2. 在 Zeabur Dashboard 中创建 Project然后添加 Service选择 GitHub/Git 仓库来源。
3. 选择仓库与分支Root Directory 保持 `/`
4. 构建方式使用 Dockerfile。Zeabur 会自动检测仓库根目录的 `Dockerfile`;不要设置 `ZBPACK_IGNORE_DOCKERFILE=true`。如果界面要求填写 Dockerfile 名称,填写 `Dockerfile`
5. 在 Service 配置中添加持久卷,挂载目录填写 `/data`
6. 配置环境变量:
| 变量 | 推荐值 | 说明 |
| --- | --- | --- |
| `PORT` | `5001` | 服务监听端口,需要和 Zeabur 暴露的 HTTP 端口一致。 |
| `DS2API_ADMIN_KEY` | 强随机字符串 | 管理台登录密钥,必填。 |
| `DS2API_CONFIG_PATH` | `/data/config.json` | 配置持久化路径,建议必填。 |
| `LOG_LEVEL` | `INFO` | 可选,日志级别。 |
| `DS2API_CONFIG_JSON` | 原始 JSON 或 Base64 JSON | 可选,用于用环境变量初始化配置。 |
| `DS2API_ENV_WRITEBACK` | `1` | 可选;当设置了 `DS2API_CONFIG_JSON` 且希望首次启动后写入 `/data/config.json` 时再启用。 |
7. 暴露 HTTP 端口 `5001`,健康检查路径可填 `/healthz`
8. 部署完成后访问 `/admin`,用 `DS2API_ADMIN_KEY` 登录,然后在管理台导入或编辑配置。首次空卷可以没有 `/data/config.json`,服务会先启动,第一次保存时自动创建该文件。
常见问题:
- **启动日志出现 `open /data/config.json: no such file or directory`**:请确认已经部署包含“首次空卷启动”修复的版本,并重新部署最新代码。
- **出现 `open /app/config.json: permission denied`**:说明配置路径仍指向镜像内只读目录;设置持久卷 `/data`,并确认 `DS2API_CONFIG_PATH=/data/config.json`
- **管理台保存后重启配置丢失**:检查 `/data` 持久卷是否已挂载到当前服务;如果使用了 `DS2API_CONFIG_JSON`,但想让管理台保存落盘,请启用 `DS2API_ENV_WRITEBACK=1`
参考Zeabur 官方文档的 [GitHub/Git 集成](https://zeabur.com/docs/en-US/deploy/github)、[Dockerfile 部署](https://zeabur.com/docs/zh-CN/deploy/dockerfile) 与 [Volumes](https://zeabur.com/docs/data-management/volumes)。
---
## 三、Vercel 部署
@@ -271,6 +302,7 @@ VERCEL_TEAM_ID=team_xxxxxxxxxxxx # 个人账号可留空
| `VERCEL_TOKEN` | Vercel 同步 token | — |
| `VERCEL_PROJECT_ID` | Vercel 项目 ID | — |
| `VERCEL_TEAM_ID` | Vercel 团队 ID | — |
| `DS2API_CHAT_HISTORY_PATH` | Chat history 存储路径Vercel 上必须设为 `/tmp/chat_history.json`,否则因文件系统只读而不可用) | `data/chat_history.json` |
| `DS2API_VERCEL_PROTECTION_BYPASS` | 部署保护绕过密钥(内部 Node→Go 调用) | — |
### 3.3 运行时行为配置(通过 Admin API 设置)
@@ -370,6 +402,22 @@ No Output Directory named "public" found after the Build completed.
- **方案 B**:请求中添加 `x-vercel-protection-bypass`
- **方案 C**:设置 `VERCEL_AUTOMATION_BYPASS_SECRET`(或 `DS2API_VERCEL_PROTECTION_BYPASS`),仅影响内部 Node→Go 调用
#### Chat History 不可用read-only file system
```text
create chat history dir: mkdir /var/task/data: read-only file system
```
**原因**Vercel Serverless 函数的文件系统(`/var/task`为只读chat history 尝试在该路径下创建目录失败。
**解决**:在 Vercel Project Settings → Environment Variables 中添加:
```text
DS2API_CHAT_HISTORY_PATH=/tmp/chat_history.json
```
`/tmp` 是 Vercel Serverless 环境中唯一可写的目录。数据在函数冷启动之间不会持久化ephemeral但在单个实例生命周期内功能正常。
### 3.6 仓库不提交构建产物
- `static/admin` 目录不在 Git 中
@@ -412,7 +460,7 @@ go run ./cmd/ds2api
```bash
cd webui
npm install
npm ci
npm run build
# 产物输出到 static/admin/
```

View File

@@ -68,7 +68,7 @@ gofmt -w <changed-go-files>
3. 请求归一化:`internal/promptcompat` 或协议转换包。
4. 上游请求:`internal/deepseek/client`
5. 流式输出:`internal/stream``internal/sse``internal/toolstream`
6. 响应格式:`internal/format/*``internal/translatorcliproxy`
6. 响应格式:主路径看 `internal/assistantturn``internal/format/*``internal/translatorcliproxy` 只用于 Vercel/fallback/test 桥接
对话记录页面问题优先检查:

View File

@@ -1,7 +1,7 @@
# DeepSeek SSE 行为结构说明(第三方逆向版)
> 说明:本文基于当前仓库 `tests/raw_stream_samples/` 下全部 `upstream.stream.sse` 原始流样本整理而成,属于第三方逆向观察文档,不是官方协议。
> 当前 corpus 由 4 份原始流组成,覆盖搜索+引用、风控终态、Markdown 输出和空格敏感输出等行为。
> 当前 corpus 由 5 份原始流组成,覆盖长文本生成、文件上传上下文、continue 接续、Markdown 输出和空格敏感输出等行为。
> 补充:文末还会注明少量“当前实现已确认、但 corpus 尚未完整覆盖”的行为,例如长思考场景下的自动续写状态。
文档导航:[文档总索引](./README.md) / [测试指南](./TESTING.md) / [样本目录说明](../tests/raw_stream_samples/README.md)
@@ -12,8 +12,9 @@
| 样本 | 观察重点 |
| --- | --- |
| [guangzhou-weather-reasoner-search-20260404](../tests/raw_stream_samples/guangzhou-weather-reasoner-search-20260404/upstream.stream.sse) | 搜索+思考流程,包含 `reference:N` 引用标记与工具片段 |
| [content-filter-trigger-20260405-jwt3](../tests/raw_stream_samples/content-filter-trigger-20260405-jwt3/upstream.stream.sse) | `CONTENT_FILTER` 终态分支,包含拒答模板与 `ban_regenerate` |
| [longtext-deepseek-v4-flash-20260429](../tests/raw_stream_samples/longtext-deepseek-v4-flash-20260429/upstream.stream.sse) | DeepSeek V4 flash 长文本流,包含 current input file 上传后的 completion 样本 |
| [longtext-deepseek-v4-pro-20260429](../tests/raw_stream_samples/longtext-deepseek-v4-pro-20260429/upstream.stream.sse) | DeepSeek V4 pro 长文本流,包含文件上传上下文和较长 reasoning/content 输出 |
| [continue-thinking-snapshot-replay-20260405](../tests/raw_stream_samples/continue-thinking-snapshot-replay-20260405/upstream.stream.sse) | 多轮 `completion + continue` 原始流,用于验证接续思考去重 |
| [markdown-format-example-20260405](../tests/raw_stream_samples/markdown-format-example-20260405/upstream.stream.sse) | Markdown 输出的早期样本,用于观察 token 级输出形态 |
| [markdown-format-example-20260405-spacefix](../tests/raw_stream_samples/markdown-format-example-20260405-spacefix/upstream.stream.sse) | Markdown 输出修正样本,用于验证空格 chunk 必须保留 |
@@ -194,7 +195,7 @@ close
## 8. 终态行为
当前 corpus 里有两条很重要的终态分支
当前 corpus 直接覆盖正常完成和 continue 接续;当前实现还兼容 `CONTENT_FILTER` 风控终态,相关分支由协议观察与兼容性 fixture 继续守护
### 8.1 正常完成
@@ -208,7 +209,7 @@ close
### 8.2 风控终态
`content-filter-trigger-20260405-jwt3` 展示了另一种终态路径:
`CONTENT_FILTER` 不在当前 raw stream corpus 的目录样本中,但代码和兼容性测试仍按下面这种终态路径处理
1. 先继续输出一段正常正文。
2. 出现提示类 fragment例如 `TIP`

View File

@@ -16,13 +16,14 @@
### 专题文档
- [DS2API 项目价值说明](./project-value.md)
- [API -> 网页对话纯文本兼容主链路说明](./prompt-compatibility.md)
- [Tool Calling 统一语义](./toolcall-semantics.md)
- [DeepSeek SSE 行为结构说明(逆向观察)](./DeepSeekSSE行为结构说明-2026-04-05.md)
### 文档维护约定
- 文档更新必须以实际代码实现为依据:总路由装配看 `internal/server/router.go`,协议/resource 路由看 `internal/httpapi/*/**/routes.go``internal/httpapi/admin/handler.go`,配置默认值看 `internal/config/*`,模型/alias 看 `internal/config/models.go`prompt 兼容链路看 `docs/prompt-compatibility.md` 列出的代码入口。
- 文档更新必须以实际代码实现为依据:总路由装配看 `internal/server/router.go`,协议/resource 路由看 `internal/httpapi/**/handler*.go``internal/httpapi/admin/handler.go`,配置默认值看 `internal/config/*`,模型/alias 看 `internal/config/models.go`prompt 兼容链路看 `docs/prompt-compatibility.md` 列出的代码入口。
- `README.MD` / `README.en.md`:面向首次接触用户,保留“是什么 + 怎么快速跑起来”。
- `docs/ARCHITECTURE*.md`:面向开发者,集中维护项目结构、模块职责与调用链。
- `API*.md`:面向客户端接入者,聚焦接口行为、鉴权和示例。
@@ -47,13 +48,14 @@ Recommended reading order:
### Topical docs
- [DS2API project value note](./project-value.md)
- [API -> pure-text web-chat compatibility pipeline](./prompt-compatibility.md)
- [Tool-calling unified semantics](./toolcall-semantics.md)
- [DeepSeek SSE behavior notes (reverse-engineered)](./DeepSeekSSE行为结构说明-2026-04-05.md)
### Maintenance conventions
- Documentation updates must be grounded in the actual implementation: root routing lives in `internal/server/router.go`, protocol/resource routes live in `internal/httpapi/*/**/routes.go` and `internal/httpapi/admin/handler.go`, config defaults in `internal/config/*`, models/aliases in `internal/config/models.go`, and the prompt compatibility pipeline in the code entrypoints listed by `docs/prompt-compatibility.md`.
- Documentation updates must be grounded in the actual implementation: root routing lives in `internal/server/router.go`, protocol/resource routes live in `internal/httpapi/**/handler*.go` and `internal/httpapi/admin/handler.go`, config defaults in `internal/config/*`, models/aliases in `internal/config/models.go`, and the prompt compatibility pipeline in the code entrypoints listed by `docs/prompt-compatibility.md`.
- `README.MD` / `README.en.md`: onboarding-oriented (“what + quick start”).
- `docs/ARCHITECTURE*.md`: developer-oriented source of truth for module boundaries and execution flow.
- `API*.md`: integration-oriented behavior/contracts.

View File

@@ -60,11 +60,10 @@ npm run build --prefix webui
./tests/scripts/check-refactor-line-gate.sh
./tests/scripts/check-node-split-syntax.sh
./tests/scripts/check-cross-build.sh
# 历史阶段门禁:阶段 6 手工烟测签字检查(默认读取 plans/stage6-manual-smoke.md
./tests/scripts/check-stage6-manual-smoke.sh
```
说明:`plans/stage6-manual-smoke.md` 已移除,阶段 6 手工烟测不再作为当前 CI 或发布门禁。
### 端到端测试 | End-to-End Tests
```bash

119
docs/project-value.md Normal file
View File

@@ -0,0 +1,119 @@
# DS2API 项目价值说明
文档导航:[总览](../README.MD) / [文档索引](./README.md) / [接口文档](../API.md) / [兼容主链路](./prompt-compatibility.md) / [Tool Calling 语义](./toolcall-semantics.md)
> 本文用于说明 DS2API 的项目定位与长期价值。
> 它不是架构说明,也不是功能清单,而是从“网页能力如何稳定 API 化”这个角度解释本项目为什么成立。
## 1. 项目定位
DS2API 的定位不是“又一个 API 代理”,也不是训练工具。
它本质上是一个网页转 API 的兼容层:把 DeepSeek 网页对话侧可用的能力,整理成 OpenAI / Claude / Gemini 风格客户端可以接入的请求与响应形态。
本项目的核心价值在于:
1. 把 DeepSeek 网页对话能力 API 化。
2. 把不同客户端协议统一到同一套兼容入口。
3. 把网页侧会话、thinking、文件引用、流式输出等行为整理成客户端可消费的结果。
4. 为上层编程工具、自动化工具或外部编排器提供稳定后端。
## 2. 解决的问题
### 2.1 把网页能力变成可接入的 API 形态
网页侧能力可以直接对话,但标准客户端需要的是稳定的 API 契约。两者之间有一段天然差距:
- 输入格式不同
- 输出事件不同
- 流式语义不同
- 文件引用方式不同
- thinking 与正文的暴露方式不同
DS2API 通过 `promptcompat``completionruntime``assistantturn` 和各协议 renderer把这段差距收敛到一条可维护的主链路中
- 请求侧把 OpenAI / Claude / Gemini 消息归一成网页纯文本上下文。
- 上游侧按 DeepSeek 网页 completion 需要的 payload 发起会话。
- 输出侧把 DeepSeek SSE 收集或流式事件再渲染回各协议原生形态。
这才是本项目的主定义:把网页能力稳定转成 API 可消费形态。
### 2.2 不只是转发,而是兼容
普通转发只能把请求送出去无法处理协议语义之间的差异。DS2API 需要额外处理:
- 模型 alias 与 DeepSeek 原生模型的映射
- thinking / reasoning 开关与输出结构
- search 与 citation / reference 标记
- 文件上传、历史文件和 current input file
- 上游空输出、content filter、auto-continue、重试和 usage 估算
这些都不是“把 URL 改一下”能解决的事情。项目价值正是在这些细节里体现出来。
### 2.3 让外部工具链能挂上去
当用户把 DS2API 接到编程工具、自动化工具或第三方 SDK 时,很多请求会变成长链路任务:
- 读取文件
- 搜索上下文
- 修改代码
- 执行命令
- 继续修正
- 输出最终结果
DS2API 不直接定义这些外部工具链,但它提供了一个足够稳定的 API 底座,让这些工具链可以外挂在上面继续工作。
## 3. 工具调用的价值
工具调用不是 DS2API 成立的前提,但它是项目很重要的增强能力。
即使没有工具调用DS2API 仍然是网页转 API 兼容层;当请求包含工具能力时,项目会额外处理模型输出漂移、长参数和流式防泄漏等问题:
- 长脚本用 CDATA 保住原文
- 文件路径和命令参数不容易被转义打坏
- tool call 语法有统一的 DSML / canonical XML 处理
- 模型输出漂了也能宽匹配、自修正
- 流式场景能尽量不把工具块漏回普通文本
这使 DS2API 可以服务编程工具和 agent 类客户端,但项目主轴仍然是“网页能力 API 化”,不是把工具调用当作项目唯一卖点。
## 4. CDATA 的作用
CDATA 不是项目价值本身,但它是工具调用与长文本兼容中很实用的一部分。
对本项目这种场景来说CDATA 的作用很直接:
- 保护长文本不被转义破坏
- 保住脚本、命令、代码片段的原样性
- 让结构化参数和自由文本更稳定地共存
- 让历史内容更容易被原样回放和再处理
它的意义不是让协议显得更复杂,而是让内容更少在转写、解析和回放过程中坏掉。
## 5. 它不是什么
为了避免误解,需要明确项目边界:
- 不是官方 DeepSeek API。
- 不是训练平台。
- 不是人工标注系统。
- 不是独立评测工具。
- 不是简单反代。
DS2API 是兼容层。它的职责是把网页能力整理成 API 体验,并在必要时对工具、历史、文件和流式输出做兼容处理。
## 6. 长期价值
DS2API 的长期价值,不在某个单点功能,而在于它把多个难点放进了同一条可维护链路:
- 多协议入口
- DeepSeek 网页 completion 适配
- prompt 纯文本兼容
- thinking / search / file 引用处理
- Go / Node 流式输出对齐
- tool call 解析与防泄漏
- Admin / WebUI / 账号池 / 并发队列
如果要用一句话概括它的价值,可以写成:
**DS2API 的价值,是把 DeepSeek 网页能力稳定整理成标准客户端可以持续使用的 API 形态。**

View File

@@ -3,7 +3,7 @@
文档导航:[总览](../README.MD) / [架构说明](./ARCHITECTURE.md) / [接口文档](../API.md) / [测试指南](./TESTING.md)
> 本文档是 DS2API“把 OpenAI / Claude / Gemini 风格 API 请求兼容成 DeepSeek 网页对话纯文本上下文”的专项说明。
> 这是项目最重要的兼容产物之一。凡是修改消息标准化、tool prompt 注入、tool history 保留、文件引用、current input file / legacy history_split、下游 completion payload 组装等行为,都必须同步更新本文档。
> 这是项目最重要的兼容产物之一。凡是修改消息标准化、tool prompt 注入、tool history 保留、文件引用、current input file、下游 completion payload 组装等行为,都必须同步更新本文档。
## 1. 核心结论
@@ -45,9 +45,12 @@ DS2API 当前的核心思路,不是把客户端传来的 `messages`、`tools`
-> promptcompat 统一消息标准化
-> tool prompt 注入
-> DeepSeek 风格 prompt 拼装
-> 文件收集 / inline 上传 / current input fileOpenAI 链路)
-> 文件收集 / inline 上传OpenAI 文件链路)
-> current input filecompletion runtime 全局入口)
-> completion payload
-> 下游网页对话接口
-> assistantturn 输出语义归一Go 非流式 + 流式收尾)
-> 各协议 rendererOpenAI / Responses / Claude / Gemini
```
对应的关键代码入口:
@@ -72,6 +75,10 @@ DS2API 当前的核心思路,不是把客户端传来的 `messages`、`tools`
[internal/promptcompat/thinking_injection.go](../internal/promptcompat/thinking_injection.go)
- completion payload
[internal/promptcompat/standard_request.go](../internal/promptcompat/standard_request.go)
- Go 输出侧 assistant turn
[internal/assistantturn/turn.go](../internal/assistantturn/turn.go)
- Go completion runtime
[internal/completionruntime/nonstream.go](../internal/completionruntime/nonstream.go)
## 4. 下游真正收到的东西
@@ -101,8 +108,9 @@ DS2API 当前的核心思路,不是把客户端传来的 `messages`、`tools`
- 对外返回给客户端的 `prompt_tokens` / `input_tokens` / `promptTokenCount` 不再按“最后一条消息”或字符粗估近似返回,而是基于**完整上下文 prompt**做 tokenizer 计数;为了避免上下文实际超限但客户端误以为还能塞下,请求侧上下文 token 会额外保守上浮一点,宁可略大也不低估。
- 当前 `/v1/chat/completions` 业务路径仍是“每次请求新建一个远端 `chat_session_id`,并默认发送 `parent_message_id: null`”;因此 DS2API 对外默认表现为“新会话 + prompt 拼历史”,而不是复用 DeepSeek 原生会话树。
- 但 DeepSeek 远端本身支持同一 `chat_session_id` 的跨轮次持续对话。2026-04-27 已用项目内现有 DeepSeek client 做过一次不改业务代码的双轮实测:同一 `chat_session_id` 下,第 1 轮返回 `request_message_id=1` / `response_message_id=2` / 文本 `SESSION_TEST_ONE`;第 2 轮重新获取一次 PoW并发送 `parent_message_id=2` 后,成功返回 `request_message_id=3` / `response_message_id=4` / 文本 `SESSION_TEST_TWO`。这说明“同远端会话持续聊天”能力存在,且每轮需要携带正确的 parent/message 链接信息,同时重新获取对应轮次可用的 PoW。
- OpenAI Chat / Responses 原生走统一 OpenAI 标准化与 DeepSeek payload 组装Claude / Gemini 会尽量复用 OpenAI prompt/tool 语义,其中 Gemini 直接复用 `promptcompat.BuildOpenAIPromptForAdapter`Claude 消息接口在可代理场景会转换为 OpenAI chat 形态再执行
- 客户端传入的 thinking / reasoning 开关会被归一到下游 `thinking_enabled`。Gemini `generationConfig.thinkingConfig.thinkingBudget` 会翻译成同一套 thinking 开关;关闭时即使上游返回 `response/thinking_content`,兼容层也不会把它当作可见正文输出。若最终解析出的模型名带 `-nothinking` 后缀,则会无条件强制关闭 thinking优先级高于请求体中的 `thinking` / `reasoning` / `reasoning_effort`。Claude surface 在流式请求且未显式声明 `thinking` 时,仍按 Anthropic 语义默认关闭;但在非流式代理场景,兼容层会内部开启一次下游 thinking用于捕获“正文为空、工具调用落在 thinking 里”的情况,随后在回包前剥离用户不可见的 thinking block
- OpenAI Chat / Responses 原生走统一 OpenAI 标准化与 DeepSeek payload 组装Claude / Gemini 会尽量复用 OpenAI prompt/tool 语义,其中 Gemini 直接复用 `promptcompat.BuildOpenAIPromptForAdapter`。Go 主服务新增 `completionruntime` 启动层,统一执行 DeepSeek session/PoW/call输出侧新增 `assistantturn` 语义层:非流式 OpenAI Chat / Responses / Claude / Gemini 会把 DeepSeek SSE 收集结果先归一成同一份 assistant turn再分别渲染成各协议原生外形流式 OpenAI Chat / Responses / Claude / Gemini 继续保持各协议实时 SSE framing但最终收尾的 tool fallback、schema 归一、usage、empty-output / content-filter 错误语义同样由 `assistantturn` 判定。Claude / Gemini 的常规 Go 主路径不再依赖内部 `httptest` 转发到 OpenAI handler`translatorcliproxy` 仅保留用于 Vercel bridge、后端缺失 fallback 和回归测试,不作为主业务协议转换中心
- Vercel Node 流式路径本轮不迁移,仍使用现有 Node bridge / stream-tool-sieve 实现;后续若变更 Node 流式语义,需要按 `assistantturn` 的 Go canonical 输出语义同步对齐
- 客户端传入的 thinking / reasoning 开关会被归一到下游 `thinking_enabled`。Gemini `generationConfig.thinkingConfig.thinkingBudget` 会翻译成同一套 thinking 开关;关闭时即使上游返回 `response/thinking_content`,兼容层也不会把它当作可见正文输出。若最终解析出的模型名带 `-nothinking` 后缀,则会无条件强制关闭 thinking优先级高于请求体中的 `thinking` / `reasoning` / `reasoning_effort`。未显式关闭时,各 surface 会按解析后的 DeepSeek 模型默认能力开启 thinking并用各自协议的原生形态暴露OpenAI Chat 为 `reasoning_content`OpenAI Responses 为 `response.reasoning.delta` / `reasoning` contentClaude 为 `thinking` block / `thinking_delta`Gemini 为 `thought: true` part。
- 对 OpenAI Chat / Responses 的非流式收尾,如果最终可见正文为空,兼容层会优先尝试把思维链中的独立 DSML / XML 工具块当作真实工具调用解析出来。流式链路也会在收尾阶段做同样的 fallback 检测,但不会因为思维链内容去中途拦截或改写流式输出;真正的工具识别始终基于原始上游文本,而不是基于“已经做过可见输出清洗”的版本,因此即使最终可见层会剥离完整 leaked DSML / XML `tool_calls` wrapper、并抑制全空参数或无效 wrapper 块,也不会影响真实工具调用转成结构化 `tool_calls` / `function_call`。补发结果会作为本轮 assistant 的结构化 `tool_calls` / `function_call` 输出返回,而不是塞进 `content` 文本;如果客户端没有开启 thinking / reasoning思维链只用于检测不会作为 `reasoning_content` 或可见正文暴露。只有正文为空且思维链里也没有可执行工具调用时,才继续按空回复错误处理。
- OpenAI Chat / Responses 的空回复错误处理之前会默认做一次内部补偿重试:第一次上游完整结束后,如果最终可见正文为空、没有解析到工具调用、也没有已经向客户端流式发出工具调用,并且终止原因不是 `content_filter`,兼容层会复用同一个 `chat_session_id`、账号、token 与工具策略,把原始 completion `prompt` 追加固定后缀 `Previous reply had no visible output. Please regenerate the visible final answer or tool call now.` 后重新提交一次。重试遵循 DeepSeek 多轮对话协议:从第一次上游 SSE 流中提取 `response_message_id`,并在重试 payload 中设置 `parent_message_id` 为该值,使重试成为同一会话的后续轮次而非断裂的根消息;同时重新获取一次 PoW若 PoW 获取失败则回退到原始 PoW。该重试不会重新标准化消息、不会新建 session、不会切换账号也不会向流式客户端插入重试标记第二次 thinking / reasoning 会按正常增量直接接到第一次之后,并继续使用 overlap trim 去重。若第二次仍为空,终端错误码仍保持现有 `upstream_empty_output`;若任一尝试触发空 `content_filter`,不做补偿重试并保持 `content_filter` 错误。JS Vercel 运行时同样设置 `parent_message_id`,但因无法直接调用 PoW API 而复用原始 PoW。
@@ -117,6 +125,11 @@ OpenAI Chat / Responses 在标准化后、current input file 之前,会默认
- 普通请求会直接出现在最终 `prompt` 的最新 user block 末尾。
- 如果触发 current input file它会进入完整上下文文件中。
另外,`MessagesPrepareWithThinking` 还会在最终 prompt 的最前面预置一段固定的 system 级“输出完整性约束Output integrity guard
- 如果上游上下文、工具输出或解析后的文本出现乱码、损坏、部分解析、重复或其他畸形片段,不要模仿、不要回显,只输出给用户的正确内容。
- 这段约束位于普通 system / tool prompt 之前,因此是当前最终 prompt 里的最高优先级前置指令。
### 5.1 角色标记
最终 prompt 使用 DeepSeek 风格角色标记:
@@ -154,8 +167,8 @@ OpenAI Chat / Responses 在标准化后、current input file 之前,会默认
4. 把这整段内容并入 system prompt。
工具调用正例现在优先示范官方 DSML 风格:`<|DSML|tool_calls>``<|DSML|invoke name="...">``<|DSML|parameter name="...">`
兼容层仍接受旧式纯 `<tool_calls>` wrapper但提示词会优先要求模型输出官方 DSML 标签,并强调不能只输出 closing wrapper 而漏掉 opening tag。需要注意这是“兼容 DSML 外壳,内部仍以 XML 解析语义为准”,不是原生 DSML 全链路实现DSML 标签会在解析入口归一化回现有 XML 标签后继续走同一套 parser。
数组参数使用 `<item>...</item>` 子节点表示;当某个参数体只包含 item 子节点时Go / Node 解析器会把它还原成数组,避免 `questions` / `options` 这类 schema 中要求 array 的参数被误解析成 `{ "item": ... }` 对象。若模型把完整结构化 XML fragment 误包进 CDATA兼容层会在保护 `content` / `command` 等原文字段的前提下,尝试把非原文字段中的 CDATA XML fragment 还原成 object / array。不过如果 CDATA 只是单个平面的 XML/HTML 标签,例如 `<b>urgent</b>` 这种行内标记,兼容层会保留原始字符串,不会强行升成 object / array只有明显表示结构的 CDATA 片段,例如多兄弟节点、嵌套子节点或 `item` 列表,才会触发结构化恢复。
兼容层仍接受旧式纯 `<tool_calls>` wrapper并会容错若干 DSML 标签变体,包括短横线形式 `<dsml-tool-calls>` / `<dsml-invoke>` / `<dsml-parameter>`但提示词会优先要求模型输出官方 DSML 标签,并强调不能只输出 closing wrapper 而漏掉 opening tag。需要注意这是“兼容 DSML 外壳,内部仍以 XML 解析语义为准”,不是原生 DSML 全链路实现DSML 标签会在解析入口归一化回现有 XML 标签后继续走同一套 parser。
数组参数使用 `<item>...</item>` 子节点表示;当某个参数体只包含 item 子节点时Go / Node 解析器会把它还原成数组,避免 `questions` / `options` 这类 schema 中要求 array 的参数被误解析成 `{ "item": ... }` 对象。除此之外,解析器还会回收一些更松散的列表写法,例如 JSON array 字面量或逗号分隔的 JSON 项序列,只要它们足够明确;但 `<item>` 仍然是首选形态。若模型把完整结构化 XML fragment 误包进 CDATA兼容层会在保护 `content` / `command` 等原文字段的前提下,尝试把非原文字段中的 CDATA XML fragment 还原成 object / array。不过如果 CDATA 只是单个平面的 XML/HTML 标签,例如 `<b>urgent</b>` 这种行内标记,兼容层会保留原始字符串,不会强行升成 object / array只有明显表示结构的 CDATA 片段,例如多兄弟节点、嵌套子节点或 `item` 列表,才会触发结构化恢复。`command` / `content` 等长文本参数CDATA 内部的 Markdown fenced DSML / XML 示例会作为原文保护;示例里的 `]]></parameter>``</tool_calls>` 不会截断外层工具调用,解析器会继续等待围栏外真正的参数 / wrapper 结束标签。
Go 侧读取 DeepSeek SSE 时不再依赖 `bufio.Scanner` 的固定 2MiB 单行上限;当写文件类工具把很长的 `content` 放在单个 `data:` 行里返回时,非流式收集、流式解析和 auto-continue 透传都会保留完整行,再进入同一套工具解析与序列化流程。
在 assistant 最终回包阶段,如果某个 tool 参数在声明 schema 中明确是 `string`,兼容层会在把解析后的 `tool_calls` / `function_call` 重新序列化成 OpenAI / Responses / Claude 可见参数前,递归把该路径上的 number / bool / object / array 统一转成字符串;其中 object / array 会压成紧凑 JSON 字符串。这个保护只对 schema 明确声明为 string 的路径生效,不会改写本来就是 `number` / `boolean` / `object` / `array` 的参数。这样可以兼容 DeepSeek 输出了结构化片段、但上游客户端工具 schema 又严格要求字符串参数的场景(例如 `content``prompt``path``taskId` 等)。
工具 schema 的权威来源始终是**当前请求实际携带的 schema**,而不是同名工具在其他 runtimeClaude Code / OpenCode / Codex 等)里的默认印象。兼容层现在会同时兼容 OpenAI 风格 `function.parameters`、直接工具对象上的 `parameters` / `input_schema`、以及 camelCase 的 `inputSchema` / `schema`,并在最终输出阶段按这份请求内 schema 决定是保留 array/object还是仅对明确声明为 `string` 的路径做字符串化。该规则同样适用于 Claude 的流式收尾和 Vercel Node 流式 tool-call formatter避免不同 runtime 因 schema shape 差异而出现同名工具参数类型漂移。
@@ -238,6 +251,14 @@ OpenAI 文件相关实现:
- 文件 ID 收集:
[internal/promptcompat/file_refs.go](../internal/promptcompat/file_refs.go)
OpenAI 的文件上传现在不再是“只传文件本体”的通用路径,而是会先根据请求里的 `model` 解析出 DeepSeek 的上传类型,并把它透传到上传接口的 `x-model-type`。当前可见的上传类型就是 `default` / `expert` / `vision`,其中 vision 请求上传图片时必须带上 `vision`,否则下游容易退回到仅文本或 OCR 语义。这个模型类型会同时用于:
- `/v1/files` 这类独立文件上传入口
- Chat / Responses 的 inline 图片、附件上传
- current input file 触发时生成的 `DS2API_HISTORY.txt` 上下文文件
也就是说,文件上传和完成请求的 `model_type` 现在是一致的:完成 payload 里仍然是 `model_type`,上传文件则会在 DeepSeek 上传阶段携带同样的模型类型信息。
结论:
- “systemprompt 文字”在 prompt 里
@@ -247,11 +268,10 @@ OpenAI 文件相关实现:
## 9. 多轮历史为什么不会一直完整内联在 prompt
兼容层现在只保留 `current_input_file` 这一种拆分方式;旧的 `history_split` 已废弃,只保留为兼容旧配置的字段,不再参与请求处理
兼容层现在只保留 `current_input_file` 这一种拆分方式;旧的 `history_split` 配置字段已移除,读取旧配置时会忽略它且不会再写回
- `current_input_file` 默认开启;它用于把“完整上下文”合并进 `history.txt` 上下文文件。当最新 user turn 的纯文本长度达到 `current_input_file.min_chars`(默认 `0`)时,兼容层会上传一个文件名为 `history.txt` 的上下文文件,并在 live prompt 中只保留一个中性的 user 消息要求模型直接回答最新请求,不再暴露文件名或要求模型读取本地文件
- `current_input_file` 默认开启;它在统一 completion runtime 入口全局生效,用于把“完整上下文”合并进 `DS2API_HISTORY.txt` 上下文文件。当最新 user turn 的纯文本长度达到 `current_input_file.min_chars`(默认 `0`)时,runtime 会上传一个文件名为 `DS2API_HISTORY.txt` 的上下文文件。文件内容会先经过各协议入口的标准化,再序列化成按轮次编号的 `DS2API_HISTORY.txt` 风格 transcript带有 `# DS2API_HISTORY.txt` 标题和 `=== N. ROLE ===` 分段live prompt 中则会给出一个 continuation 语气的 user 消息,引导模型从 `DS2API_HISTORY.txt` 的最新状态继续推进,并直接回答最新请求,避免把任务拉回起点
- 如果 `current_input_file.enabled=false`,请求会直接透传,不上传任何拆分上下文文件。
- 旧的 `history_split.enabled` / `history_split.trigger_after_turns` 会被读取进配置对象以保持兼容,但不会触发拆分上传,也不会影响 `current_input_file` 的默认开启。
- 即使触发 `current_input_file` 后 live prompt 被缩短,对客户端回包里的上下文 token 统计,仍会沿用**拆分前的完整 prompt 语义**做计数,而不是按缩短后的占位 prompt 计算;否则会把真实上下文显著算小。
相关实现:
@@ -260,14 +280,27 @@ OpenAI 文件相关实现:
[internal/config/store_accessors.go](../internal/config/store_accessors.go)
- 当前输入转文件:
[internal/httpapi/openai/history/current_input_file.go](../internal/httpapi/openai/history/current_input_file.go)
- 旧历史拆分兼容壳
[internal/httpapi/openai/history/history_split.go](../internal/httpapi/openai/history/history_split.go)
- 全局 completion runtime 应用点
[internal/completionruntime/nonstream.go](../internal/completionruntime/nonstream.go)
当前输入转文件启用并触发时,上传文件的真实文件名是 `history.txt`,文件内容是完整 `messages` 上下文;它仍会先用 OpenAI 消息标准化和 DeepSeek 角色标记序列化,并直接作为 `history.txt` 的纯文本内容上传(不再注入文件边界标签):
当前输入转文件启用并触发时,上传文件的真实文件名是 `DS2API_HISTORY.txt`,文件内容是完整 `messages` 上下文;它会使用 OpenAI-compatible 的消息/transcript 序列化规则和 DeepSeek 角色标记,再按轮次编号成 `DS2API_HISTORY.txt` 风格的 transcript(不再注入文件边界标签):
```text
[uploaded filename]: history.txt
<begin▁of▁sentence><System>...<User>...<Assistant>...<Tool>...<User>...
[uploaded filename]: DS2API_HISTORY.txt
# DS2API_HISTORY.txt
Prior conversation history and tool progress.
=== 1. SYSTEM ===
...
=== 2. USER ===
...
=== 3. ASSISTANT ===
...
=== 4. TOOL ===
...
```
开启后,请求的 live prompt 不再直接内联完整上下文,而是保留一个 user role 的短提示,提示模型基于已提供上下文直接回答最新请求;上传后的 `file_id` 会进入 `ref_file_ids`
@@ -282,7 +315,7 @@ OpenAI 文件相关实现:
- Responses `instructions` 会 prepend 为 system message
- `tools` 会注入 system prompt
- `attachments` / `input_file` / inline 文件会进入 `ref_file_ids`
- current input file 主要在这条链路里生效,旧 `history_split` 仅作兼容字段保留
- current input file 在统一 completion runtime 入口全局生效
### 10.2 Claude Messages
@@ -319,7 +352,7 @@ OpenAI 文件相关实现:
```json
{
"prompt": "<begin▁of▁sentence><System>原 system / developer\n\nYou have access to these tools: ...<end▁of▁instructions><User>The current request and prior conversation context have already been provided. Answer the latest user request directly.<Assistant>",
"prompt": "<begin▁of▁sentence><System>原 system / developer\n\nYou have access to these tools: ...<end▁of▁instructions><User>Continue from the latest state in the attached DS2API_HISTORY.txt context. Treat it as the current working state and answer the latest user request directly.<Assistant>",
"ref_file_ids": [
"file-current-input-ignore",
"file-systemprompt",
@@ -334,7 +367,7 @@ OpenAI 文件相关实现:
- 大部分结构化语义被压进 `prompt`
- 文件保持文件
- 需要时把完整上下文拆进 `history.txt` 上下文文件
- 需要时把完整上下文拆进 `DS2API_HISTORY.txt` 上下文文件,并按轮次编号成 transcript
## 12. 修改时必须同步本文档的场景
@@ -347,8 +380,8 @@ OpenAI 文件相关实现:
- tool result 注入方式变更
- tool prompt 模板或 tool_choice 约束变更
- inline 文件上传 / 文件引用收集规则变更
- current input file 触发条件、上传格式、`history.txt` 包装格式变更
-`history_split` 兼容逻辑的读取、忽略或退化行为变更
- current input file 触发条件、上传格式、`DS2API_HISTORY.txt` transcript 结构变更
-`history_split` 字段忽略/清理行为变更
- completion payload 字段语义变更
- Claude / Gemini 对这套统一语义的复用关系变更
@@ -360,7 +393,8 @@ OpenAI 文件相关实现:
- `internal/promptcompat/tool_prompt.go`
- `internal/httpapi/openai/files/file_inline_upload.go`
- `internal/promptcompat/file_refs.go`
- `internal/httpapi/openai/history/history_split.go`
- `internal/httpapi/openai/history/current_input_file.go`
- `internal/completionruntime/nonstream.go`
- `internal/promptcompat/responses_input_normalize.go`
- `internal/httpapi/claude/standard_request.go`
- `internal/httpapi/claude/handler_utils.go`

View File

@@ -54,12 +54,13 @@
在流式链路中Go / Node 一致):
- DSML `<|DSML|tool_calls>` wrapper、基于固定本地标签名的 DSML 噪声容错形态、尾部管道符形态(如 `<|DSML|tool_calls|`)和 canonical `<tool_calls>` wrapper 都会进入结构化捕获
- DSML `<|DSML|tool_calls>` wrapper、短横线形式(如 `<dsml-tool-calls>` / `<dsml-invoke>` / `<dsml-parameter>`)、基于固定本地标签名的 DSML 噪声容错形态、尾部管道符形态(如 `<|DSML|tool_calls|`)和 canonical `<tool_calls>` wrapper 都会进入结构化捕获
- 如果流里直接从 invoke 开始,但后面补上了 closing wrapperGo 流式筛分也会按缺失 opening wrapper 的修复路径尝试恢复
- 已识别成功的工具调用不会再次回流到普通文本
- 不符合新格式的块不会执行,并继续按原样文本透传
- fenced code block反引号 `` ``` `` 和波浪线 `~~~`)中的 XML 示例始终按普通文本处理
- 支持嵌套围栏(如 4 反引号嵌套 3 反引号)和 CDATA 内围栏保护
-`command` / `content` 等长文本参数CDATA 内部如果包含 Markdown fenced DSML / XML 示例,即使示例里出现 `]]></parameter>` / `</tool_calls>` 这类看起来像外层结束标签的片段,也会继续按参数原文保留,直到真正位于围栏外的外层结束标签
- 如果模型把 `<![CDATA[` 打开后却没有闭合,流式扫描阶段仍会保守地继续缓冲,不会误把 CDATA 里的示例 XML 当成真实工具调用;在最终 parse / flush 恢复阶段,会对这类 loose CDATA 做窄修复,尽量保住外层已完整包裹的真实工具调用
- 当文本中 mention 了某种标签名(如 `<dsml|tool_calls>` 或 Markdown inline code 里的 `<|DSML|tool_calls>`而后面紧跟真正工具调用时sieve 会跳过不可解析的 mention 候选并继续匹配后续真实工具块,不会因 mention 导致工具调用丢失,也不会截断 mention 后的正文
- Go 侧 SSE 读取不再使用 `bufio.Scanner` 的固定 token 上限;单个 `data:` 行中包含很长的写文件参数时,非流式收集、流式解析与 auto-continue 透传都应保留完整行,再交给 tool parser 处理

View File

@@ -0,0 +1,64 @@
package assistantturn
import (
"ds2api/internal/httpapi/openai/shared"
"ds2api/internal/sse"
)
type StreamEventType string
const (
StreamEventTextDelta StreamEventType = "text_delta"
StreamEventThinkingDelta StreamEventType = "thinking_delta"
StreamEventToolCall StreamEventType = "tool_call"
StreamEventDone StreamEventType = "done"
StreamEventError StreamEventType = "error"
StreamEventPing StreamEventType = "ping"
)
type StreamEvent struct {
Type StreamEventType
Text string
Thinking string
ToolCall any
Error *OutputError
Usage *Usage
}
type Accumulator struct {
inner shared.StreamAccumulator
}
type AccumulatorOptions struct {
ThinkingEnabled bool
SearchEnabled bool
StripReferenceMarkers bool
}
func NewAccumulator(opts AccumulatorOptions) *Accumulator {
return &Accumulator{
inner: shared.StreamAccumulator{
ThinkingEnabled: opts.ThinkingEnabled,
SearchEnabled: opts.SearchEnabled,
StripReferenceMarkers: opts.StripReferenceMarkers,
},
}
}
func (a *Accumulator) Apply(parsed sse.LineResult) shared.StreamAccumulatorResult {
if a == nil {
return shared.StreamAccumulatorResult{}
}
return a.inner.Apply(parsed)
}
func (a *Accumulator) Snapshot() (rawText, text, rawThinking, thinking, detectionThinking string) {
if a == nil {
return "", "", "", "", ""
}
return a.inner.RawText.String(),
a.inner.Text.String(),
a.inner.RawThinking.String(),
a.inner.Thinking.String(),
a.inner.ToolDetectionThinking.String()
}

View File

@@ -0,0 +1,285 @@
package assistantturn
import (
"net/http"
"strings"
"ds2api/internal/httpapi/openai/shared"
"ds2api/internal/promptcompat"
"ds2api/internal/sse"
"ds2api/internal/toolcall"
"ds2api/internal/util"
)
type StopReason string
const (
StopReasonStop StopReason = "stop"
StopReasonToolCalls StopReason = "tool_calls"
StopReasonContentFilter StopReason = "content_filter"
StopReasonError StopReason = "error"
)
type Usage struct {
InputTokens int
OutputTokens int
ReasoningTokens int
TotalTokens int
}
type OutputError struct {
Status int
Message string
Code string
}
type Turn struct {
Model string
Prompt string
RawText string
RawThinking string
DetectionThinking string
Text string
Thinking string
ToolCalls []toolcall.ParsedToolCall
ParsedToolCalls toolcall.ToolCallParseResult
CitationLinks map[int]string
ContentFilter bool
ResponseMessageID int
StopReason StopReason
Usage Usage
Error *OutputError
}
type FinalizeOptions struct {
AlreadyEmittedToolCalls bool
}
type FinalOutcome struct {
FinishReason string
Error *OutputError
Usage Usage
HasToolCalls bool
HasVisibleText bool
HasVisibleOutput bool
ShouldFail bool
}
type BuildOptions struct {
Model string
Prompt string
RefFileTokens int
SearchEnabled bool
StripReferenceMarkers bool
ToolNames []string
ToolsRaw any
ToolChoice promptcompat.ToolChoicePolicy
}
type StreamSnapshot struct {
RawText string
VisibleText string
RawThinking string
VisibleThinking string
DetectionThinking string
ContentFilter bool
CitationLinks map[int]string
ResponseMessageID int
AlreadyEmittedCalls bool
AdditionalToolCalls []toolcall.ParsedToolCall
AlreadyEmittedToolRaw bool
}
func BuildTurnFromCollected(result sse.CollectResult, opts BuildOptions) Turn {
thinking := shared.CleanVisibleOutput(result.Thinking, opts.StripReferenceMarkers)
text := shared.CleanVisibleOutput(result.Text, opts.StripReferenceMarkers)
if opts.SearchEnabled {
text = shared.ReplaceCitationMarkersWithLinks(text, result.CitationLinks)
}
parsed := shared.DetectAssistantToolCalls(result.Text, text, result.Thinking, result.ToolDetectionThinking, opts.ToolNames)
calls := toolcall.NormalizeParsedToolCallsForSchemas(parsed.Calls, opts.ToolsRaw)
parsed.Calls = calls
stopReason := StopReasonStop
if result.ContentFilter {
stopReason = StopReasonContentFilter
}
if len(calls) > 0 {
stopReason = StopReasonToolCalls
}
turn := Turn{
Model: opts.Model,
Prompt: opts.Prompt,
RawText: result.Text,
RawThinking: result.Thinking,
DetectionThinking: result.ToolDetectionThinking,
Text: text,
Thinking: thinking,
ToolCalls: calls,
ParsedToolCalls: parsed,
CitationLinks: result.CitationLinks,
ContentFilter: result.ContentFilter,
ResponseMessageID: result.ResponseMessageID,
StopReason: stopReason,
}
turn.Usage = BuildUsage(opts.Model, opts.Prompt, thinking, text, opts.RefFileTokens)
turn.Error = ValidateTurn(turn, opts.ToolChoice)
if turn.Error != nil {
turn.StopReason = StopReasonError
}
return turn
}
func BuildTurnFromStreamSnapshot(snapshot StreamSnapshot, opts BuildOptions) Turn {
thinking := shared.CleanVisibleOutput(snapshot.VisibleThinking, opts.StripReferenceMarkers)
text := shared.CleanVisibleOutput(snapshot.VisibleText, opts.StripReferenceMarkers)
if opts.SearchEnabled {
text = shared.ReplaceCitationMarkersWithLinks(text, snapshot.CitationLinks)
}
parsed := shared.DetectAssistantToolCalls(snapshot.RawText, text, snapshot.RawThinking, snapshot.DetectionThinking, opts.ToolNames)
calls := parsed.Calls
if len(calls) == 0 && len(snapshot.AdditionalToolCalls) > 0 {
calls = snapshot.AdditionalToolCalls
}
calls = toolcall.NormalizeParsedToolCallsForSchemas(calls, opts.ToolsRaw)
parsed.Calls = calls
stopReason := StopReasonStop
if snapshot.ContentFilter {
stopReason = StopReasonContentFilter
}
if len(calls) > 0 || snapshot.AlreadyEmittedCalls || snapshot.AlreadyEmittedToolRaw {
stopReason = StopReasonToolCalls
}
turn := Turn{
Model: opts.Model,
Prompt: opts.Prompt,
RawText: snapshot.RawText,
RawThinking: snapshot.RawThinking,
DetectionThinking: snapshot.DetectionThinking,
Text: text,
Thinking: thinking,
ToolCalls: calls,
ParsedToolCalls: parsed,
CitationLinks: snapshot.CitationLinks,
ContentFilter: snapshot.ContentFilter,
ResponseMessageID: snapshot.ResponseMessageID,
StopReason: stopReason,
}
turn.Usage = BuildUsage(opts.Model, opts.Prompt, thinking, text, opts.RefFileTokens)
if !snapshot.AlreadyEmittedCalls && !snapshot.AlreadyEmittedToolRaw {
turn.Error = ValidateTurn(turn, opts.ToolChoice)
}
if turn.Error != nil && len(calls) == 0 {
turn.StopReason = StopReasonError
}
return turn
}
func BuildUsage(model, prompt, thinking, text string, refFileTokens int) Usage {
inputTokens := util.CountPromptTokens(prompt, model) + refFileTokens
reasoningTokens := util.CountOutputTokens(thinking, model)
outputTokens := reasoningTokens + util.CountOutputTokens(text, model)
return Usage{
InputTokens: inputTokens,
OutputTokens: outputTokens,
ReasoningTokens: reasoningTokens,
TotalTokens: inputTokens + outputTokens,
}
}
func ValidateTurn(turn Turn, policy promptcompat.ToolChoicePolicy) *OutputError {
if policy.IsRequired() && len(turn.ToolCalls) == 0 {
return &OutputError{
Status: http.StatusUnprocessableEntity,
Message: "tool_choice requires at least one valid tool call.",
Code: "tool_choice_violation",
}
}
if len(turn.ToolCalls) > 0 {
return nil
}
if strings.TrimSpace(turn.Text) != "" {
return nil
}
status, message, code := UpstreamEmptyOutputDetail(turn.ContentFilter, turn.Text, turn.Thinking)
return &OutputError{Status: status, Message: message, Code: code}
}
func UpstreamEmptyOutputDetail(contentFilter bool, text, thinking string) (int, string, string) {
_ = text
if contentFilter {
return http.StatusBadRequest, "Upstream content filtered the response and returned no output.", "content_filter"
}
if strings.TrimSpace(thinking) != "" {
return http.StatusTooManyRequests, "Upstream account hit a rate limit and returned reasoning without visible output.", "upstream_empty_output"
}
return http.StatusTooManyRequests, "Upstream account hit a rate limit and returned empty output.", "upstream_empty_output"
}
// ShouldRetryEmptyOutput returns true when the turn produced no visible text
// and has no tool calls or content filter. This includes thinking-only responses,
// where the model returned reasoning but no answer — a retry may yield text.
func ShouldRetryEmptyOutput(turn Turn, attempts, maxAttempts int) bool {
return attempts < maxAttempts &&
!turn.ContentFilter &&
len(turn.ToolCalls) == 0 &&
strings.TrimSpace(turn.Text) == ""
}
func FinalizeTurn(turn Turn, opts FinalizeOptions) FinalOutcome {
hasToolCalls := len(turn.ToolCalls) > 0 || opts.AlreadyEmittedToolCalls
hasVisibleText := strings.TrimSpace(turn.Text) != ""
hasVisibleThinking := strings.TrimSpace(turn.Thinking) != ""
err := turn.Error
if hasToolCalls {
err = nil
}
finishReason := FinishReason(turn)
if hasToolCalls {
finishReason = "tool_calls"
}
return FinalOutcome{
FinishReason: finishReason,
Error: err,
Usage: turn.Usage,
HasToolCalls: hasToolCalls,
HasVisibleText: hasVisibleText,
HasVisibleOutput: hasVisibleText || hasVisibleThinking || hasToolCalls,
ShouldFail: err != nil,
}
}
func OpenAIChatUsage(turn Turn) map[string]any {
return map[string]any{
"prompt_tokens": turn.Usage.InputTokens,
"completion_tokens": turn.Usage.OutputTokens,
"total_tokens": turn.Usage.TotalTokens,
"completion_tokens_details": map[string]any{
"reasoning_tokens": turn.Usage.ReasoningTokens,
},
}
}
func OpenAIResponsesUsage(turn Turn) map[string]any {
return map[string]any{
"input_tokens": turn.Usage.InputTokens,
"output_tokens": turn.Usage.OutputTokens,
"total_tokens": turn.Usage.TotalTokens,
}
}
func FinishReason(turn Turn) string {
switch turn.StopReason {
case StopReasonToolCalls:
return "tool_calls"
case StopReasonContentFilter:
return "content_filter"
default:
return "stop"
}
}

View File

@@ -0,0 +1,127 @@
package assistantturn
import (
"testing"
"ds2api/internal/promptcompat"
"ds2api/internal/sse"
)
func TestBuildTurnFromCollectedTextCitation(t *testing.T) {
turn := BuildTurnFromCollected(sse.CollectResult{
Text: "See [citation:1]",
CitationLinks: map[int]string{1: "https://example.com"},
}, BuildOptions{Model: "deepseek-v4-flash", Prompt: "prompt", SearchEnabled: true, StripReferenceMarkers: true})
if turn.Text != "See [1](https://example.com)" {
t.Fatalf("text mismatch: %q", turn.Text)
}
if turn.StopReason != StopReasonStop {
t.Fatalf("stop reason mismatch: %q", turn.StopReason)
}
if turn.Error != nil {
t.Fatalf("unexpected error: %#v", turn.Error)
}
}
func TestBuildTurnFromCollectedToolCall(t *testing.T) {
turn := BuildTurnFromCollected(sse.CollectResult{
Text: `<tool_calls><invoke name="Write"><parameter name="content">{"x":1}</parameter></invoke></tool_calls>`,
}, BuildOptions{
ToolNames: []string{"Write"},
ToolsRaw: []any{map[string]any{
"name": "Write",
"input_schema": map[string]any{
"type": "object",
"properties": map[string]any{
"content": map[string]any{"type": "string"},
},
},
}},
})
if len(turn.ToolCalls) != 1 {
t.Fatalf("expected one tool call, got %d", len(turn.ToolCalls))
}
if turn.StopReason != StopReasonToolCalls {
t.Fatalf("stop reason mismatch: %q", turn.StopReason)
}
if _, ok := turn.ToolCalls[0].Input["content"].(string); !ok {
t.Fatalf("expected content coerced to string, got %#v", turn.ToolCalls[0].Input["content"])
}
}
func TestBuildTurnFromCollectedThinkingOnlyIsEmptyOutput(t *testing.T) {
turn := BuildTurnFromCollected(sse.CollectResult{Thinking: "hidden"}, BuildOptions{})
if turn.Error == nil || turn.Error.Code != "upstream_empty_output" {
t.Fatalf("expected empty output error, got %#v", turn.Error)
}
}
func TestBuildTurnFromCollectedToolChoiceRequired(t *testing.T) {
turn := BuildTurnFromCollected(sse.CollectResult{Text: "hello"}, BuildOptions{
ToolChoice: promptcompat.ToolChoicePolicy{Mode: promptcompat.ToolChoiceRequired},
})
if turn.Error == nil || turn.Error.Code != "tool_choice_violation" {
t.Fatalf("expected tool choice violation, got %#v", turn.Error)
}
}
func TestBuildTurnFromStreamSnapshotUsesVisibleTextAndRawToolDetection(t *testing.T) {
turn := BuildTurnFromStreamSnapshot(StreamSnapshot{
RawText: `<tool_calls><invoke name="Write"><parameter name="content">{"x":1}</parameter></invoke></tool_calls>`,
VisibleText: "",
}, BuildOptions{
ToolNames: []string{"Write"},
ToolsRaw: []any{map[string]any{
"name": "Write",
"schema": map[string]any{
"type": "object",
"properties": map[string]any{
"content": map[string]any{"type": "string"},
},
},
}},
})
if len(turn.ToolCalls) != 1 {
t.Fatalf("expected stream snapshot tool call, got %d", len(turn.ToolCalls))
}
if _, ok := turn.ToolCalls[0].Input["content"].(string); !ok {
t.Fatalf("expected stream snapshot schema coercion, got %#v", turn.ToolCalls[0].Input["content"])
}
}
func TestBuildTurnFromStreamSnapshotAlreadyEmittedToolAvoidsEmptyError(t *testing.T) {
turn := BuildTurnFromStreamSnapshot(StreamSnapshot{AlreadyEmittedCalls: true}, BuildOptions{})
if turn.Error != nil {
t.Fatalf("unexpected empty-output error after emitted tool call: %#v", turn.Error)
}
if turn.StopReason != StopReasonToolCalls {
t.Fatalf("stop reason mismatch: %q", turn.StopReason)
}
}
func TestFinalizeTurnStopOutcome(t *testing.T) {
turn := BuildTurnFromCollected(sse.CollectResult{Text: "hello"}, BuildOptions{})
outcome := FinalizeTurn(turn, FinalizeOptions{})
if outcome.ShouldFail {
t.Fatalf("unexpected failure: %#v", outcome.Error)
}
if outcome.FinishReason != "stop" || !outcome.HasVisibleText || !outcome.HasVisibleOutput {
t.Fatalf("unexpected outcome: %#v", outcome)
}
}
func TestFinalizeTurnToolCallsOutcome(t *testing.T) {
turn := BuildTurnFromStreamSnapshot(StreamSnapshot{AlreadyEmittedCalls: true}, BuildOptions{})
outcome := FinalizeTurn(turn, FinalizeOptions{AlreadyEmittedToolCalls: true})
if outcome.ShouldFail || outcome.FinishReason != "tool_calls" || !outcome.HasToolCalls {
t.Fatalf("unexpected tool outcome: %#v", outcome)
}
}
func TestFinalizeTurnContentFilterOutcome(t *testing.T) {
turn := BuildTurnFromCollected(sse.CollectResult{ContentFilter: true}, BuildOptions{})
outcome := FinalizeTurn(turn, FinalizeOptions{})
if !outcome.ShouldFail || outcome.Error == nil || outcome.Error.Code != "content_filter" {
t.Fatalf("expected content filter failure, got %#v", outcome)
}
}

View File

@@ -0,0 +1,193 @@
package completionruntime
import (
"context"
"fmt"
"io"
"net/http"
"strings"
"ds2api/internal/assistantturn"
"ds2api/internal/auth"
"ds2api/internal/config"
dsclient "ds2api/internal/deepseek/client"
"ds2api/internal/httpapi/openai/history"
"ds2api/internal/httpapi/openai/shared"
"ds2api/internal/promptcompat"
"ds2api/internal/sse"
)
type DeepSeekCaller interface {
CreateSession(ctx context.Context, a *auth.RequestAuth, maxAttempts int) (string, error)
GetPow(ctx context.Context, a *auth.RequestAuth, maxAttempts int) (string, error)
UploadFile(ctx context.Context, a *auth.RequestAuth, req dsclient.UploadFileRequest, maxAttempts int) (*dsclient.UploadFileResult, error)
CallCompletion(ctx context.Context, a *auth.RequestAuth, payload map[string]any, powResp string, maxAttempts int) (*http.Response, error)
}
type Options struct {
StripReferenceMarkers bool
MaxAttempts int
RetryEnabled bool
RetryMaxAttempts int
CurrentInputFile history.CurrentInputConfigReader
}
type NonStreamResult struct {
SessionID string
Payload map[string]any
Turn assistantturn.Turn
Attempts int
}
type StartResult struct {
SessionID string
Payload map[string]any
Pow string
Response *http.Response
Request promptcompat.StandardRequest
}
func StartCompletion(ctx context.Context, ds DeepSeekCaller, a *auth.RequestAuth, stdReq promptcompat.StandardRequest, opts Options) (StartResult, *assistantturn.OutputError) {
maxAttempts := opts.MaxAttempts
if maxAttempts <= 0 {
maxAttempts = 3
}
var prepErr *assistantturn.OutputError
stdReq, prepErr = prepareCurrentInputFile(ctx, ds, a, stdReq, opts)
if prepErr != nil {
return StartResult{Request: stdReq}, prepErr
}
sessionID, err := ds.CreateSession(ctx, a, maxAttempts)
if err != nil {
return StartResult{Request: stdReq}, authOutputError(a)
}
pow, err := ds.GetPow(ctx, a, maxAttempts)
if err != nil {
return StartResult{SessionID: sessionID, Request: stdReq}, &assistantturn.OutputError{Status: http.StatusUnauthorized, Message: "Failed to get PoW (invalid token or unknown error).", Code: "error"}
}
payload := stdReq.CompletionPayload(sessionID)
resp, err := ds.CallCompletion(ctx, a, payload, pow, maxAttempts)
if err != nil {
return StartResult{SessionID: sessionID, Payload: payload, Pow: pow, Request: stdReq}, &assistantturn.OutputError{Status: http.StatusInternalServerError, Message: "Failed to get completion.", Code: "error"}
}
return StartResult{SessionID: sessionID, Payload: payload, Pow: pow, Response: resp, Request: stdReq}, nil
}
func prepareCurrentInputFile(ctx context.Context, ds DeepSeekCaller, a *auth.RequestAuth, stdReq promptcompat.StandardRequest, opts Options) (promptcompat.StandardRequest, *assistantturn.OutputError) {
if opts.CurrentInputFile == nil || stdReq.CurrentInputFileApplied {
return stdReq, nil
}
out, err := (history.Service{Store: opts.CurrentInputFile, DS: ds}).ApplyCurrentInputFile(ctx, a, stdReq)
if err != nil {
status, message := history.MapError(err)
return out, &assistantturn.OutputError{Status: status, Message: message, Code: "error"}
}
return out, nil
}
func ExecuteNonStreamWithRetry(ctx context.Context, ds DeepSeekCaller, a *auth.RequestAuth, stdReq promptcompat.StandardRequest, opts Options) (NonStreamResult, *assistantturn.OutputError) {
start, startErr := StartCompletion(ctx, ds, a, stdReq, opts)
if startErr != nil {
return NonStreamResult{SessionID: start.SessionID, Payload: start.Payload}, startErr
}
stdReq = start.Request
maxAttempts := opts.MaxAttempts
if maxAttempts <= 0 {
maxAttempts = 3
}
sessionID := start.SessionID
payload := start.Payload
pow := start.Pow
attempts := 0
currentResp := start.Response
usagePrompt := stdReq.PromptTokenText
accumulatedThinking := ""
accumulatedRawThinking := ""
accumulatedToolDetectionThinking := ""
for {
turn, outErr := collectAttempt(currentResp, stdReq, usagePrompt, opts)
if outErr != nil {
return NonStreamResult{SessionID: sessionID, Payload: payload, Attempts: attempts}, outErr
}
accumulatedThinking += sse.TrimContinuationOverlap(accumulatedThinking, turn.Thinking)
accumulatedRawThinking += sse.TrimContinuationOverlap(accumulatedRawThinking, turn.RawThinking)
accumulatedToolDetectionThinking += sse.TrimContinuationOverlap(accumulatedToolDetectionThinking, turn.DetectionThinking)
turn.Thinking = accumulatedThinking
turn.RawThinking = accumulatedRawThinking
turn.DetectionThinking = accumulatedToolDetectionThinking
turn = assistantturn.BuildTurnFromCollected(sse.CollectResult{
Text: turn.RawText,
Thinking: turn.RawThinking,
ToolDetectionThinking: turn.DetectionThinking,
ContentFilter: turn.ContentFilter,
CitationLinks: turn.CitationLinks,
ResponseMessageID: turn.ResponseMessageID,
}, buildOptions(stdReq, usagePrompt, opts))
retryMax := opts.RetryMaxAttempts
if retryMax <= 0 {
retryMax = shared.EmptyOutputRetryMaxAttempts()
}
if !opts.RetryEnabled || !assistantturn.ShouldRetryEmptyOutput(turn, attempts, retryMax) {
return NonStreamResult{SessionID: sessionID, Payload: payload, Turn: turn, Attempts: attempts}, turn.Error
}
attempts++
config.Logger.Info("[completion_runtime_empty_retry] attempting synthetic retry", "surface", stdReq.Surface, "stream", false, "retry_attempt", attempts, "parent_message_id", turn.ResponseMessageID)
retryPow, powErr := ds.GetPow(ctx, a, maxAttempts)
if powErr != nil {
config.Logger.Warn("[completion_runtime_empty_retry] retry PoW fetch failed, falling back to original PoW", "surface", stdReq.Surface, "retry_attempt", attempts, "error", powErr)
retryPow = pow
}
retryPayload := shared.ClonePayloadForEmptyOutputRetry(payload, turn.ResponseMessageID)
nextResp, err := ds.CallCompletion(ctx, a, retryPayload, retryPow, maxAttempts)
if err != nil {
return NonStreamResult{SessionID: sessionID, Payload: payload, Turn: turn, Attempts: attempts}, &assistantturn.OutputError{Status: http.StatusInternalServerError, Message: "Failed to get completion.", Code: "error"}
}
usagePrompt = shared.UsagePromptWithEmptyOutputRetry(usagePrompt, attempts)
currentResp = nextResp
}
}
func collectAttempt(resp *http.Response, stdReq promptcompat.StandardRequest, usagePrompt string, opts Options) (assistantturn.Turn, *assistantturn.OutputError) {
defer func() {
if err := resp.Body.Close(); err != nil {
config.Logger.Warn("[completion_runtime] response body close failed", "surface", stdReq.Surface, "error", err)
}
}()
if resp.StatusCode != http.StatusOK {
body, _ := io.ReadAll(resp.Body)
message := strings.TrimSpace(string(body))
if message == "" {
message = http.StatusText(resp.StatusCode)
}
return assistantturn.Turn{}, &assistantturn.OutputError{Status: resp.StatusCode, Message: message, Code: "error"}
}
result := sse.CollectStream(resp, stdReq.Thinking, false)
return assistantturn.BuildTurnFromCollected(result, buildOptions(stdReq, usagePrompt, opts)), nil
}
func buildOptions(stdReq promptcompat.StandardRequest, prompt string, opts Options) assistantturn.BuildOptions {
return assistantturn.BuildOptions{
Model: stdReq.ResponseModel,
Prompt: prompt,
RefFileTokens: stdReq.RefFileTokens,
SearchEnabled: stdReq.Search,
StripReferenceMarkers: opts.StripReferenceMarkers,
ToolNames: stdReq.ToolNames,
ToolsRaw: stdReq.ToolsRaw,
ToolChoice: stdReq.ToolChoice,
}
}
func authOutputError(a *auth.RequestAuth) *assistantturn.OutputError {
if a != nil && a.UseConfigToken {
return &assistantturn.OutputError{Status: http.StatusUnauthorized, Message: "Account token is invalid. Please re-login the account in admin.", Code: "error"}
}
return &assistantturn.OutputError{Status: http.StatusUnauthorized, Message: "Invalid token. If this should be a DS2API key, add it to config.keys first.", Code: "error"}
}
func Errorf(status int, format string, args ...any) *assistantturn.OutputError {
return &assistantturn.OutputError{Status: status, Message: fmt.Sprintf(format, args...), Code: "error"}
}

View File

@@ -0,0 +1,174 @@
package completionruntime
import (
"context"
"io"
"net/http"
"strings"
"testing"
"ds2api/internal/auth"
dsclient "ds2api/internal/deepseek/client"
"ds2api/internal/promptcompat"
)
type fakeDeepSeekCaller struct {
responses []*http.Response
payloads []map[string]any
uploads []dsclient.UploadFileRequest
}
type currentInputRuntimeConfig struct{}
func (currentInputRuntimeConfig) CurrentInputFileEnabled() bool { return true }
func (currentInputRuntimeConfig) CurrentInputFileMinChars() int { return 0 }
func (f *fakeDeepSeekCaller) CreateSession(context.Context, *auth.RequestAuth, int) (string, error) {
return "session-1", nil
}
func (f *fakeDeepSeekCaller) GetPow(context.Context, *auth.RequestAuth, int) (string, error) {
return "pow", nil
}
func (f *fakeDeepSeekCaller) UploadFile(_ context.Context, _ *auth.RequestAuth, req dsclient.UploadFileRequest, _ int) (*dsclient.UploadFileResult, error) {
f.uploads = append(f.uploads, req)
return &dsclient.UploadFileResult{ID: "file-runtime-1"}, nil
}
func (f *fakeDeepSeekCaller) CallCompletion(_ context.Context, _ *auth.RequestAuth, payload map[string]any, _ string, _ int) (*http.Response, error) {
f.payloads = append(f.payloads, payload)
if len(f.responses) == 0 {
return sseHTTPResponse(http.StatusOK, `data: {"p":"response/content","v":"fallback"}`), nil
}
resp := f.responses[0]
f.responses = f.responses[1:]
return resp, nil
}
func TestExecuteNonStreamWithRetryBuildsCanonicalTurn(t *testing.T) {
ds := &fakeDeepSeekCaller{responses: []*http.Response{sseHTTPResponse(
http.StatusOK,
`data: {"response_message_id":42,"p":"response/content","v":"<tool_calls><invoke name=\"Write\"><parameter name=\"content\">{\"x\":1}</parameter></invoke></tool_calls>"}`,
)}}
stdReq := promptcompat.StandardRequest{
Surface: "test",
ResponseModel: "deepseek-v4-flash",
PromptTokenText: "prompt",
FinalPrompt: "final prompt",
ToolNames: []string{"Write"},
ToolsRaw: []any{map[string]any{
"name": "Write",
"input_schema": map[string]any{
"type": "object",
"properties": map[string]any{
"content": map[string]any{"type": "string"},
},
},
}},
}
result, outErr := ExecuteNonStreamWithRetry(context.Background(), ds, &auth.RequestAuth{}, stdReq, Options{})
if outErr != nil {
t.Fatalf("unexpected output error: %#v", outErr)
}
if result.SessionID != "session-1" {
t.Fatalf("session mismatch: %q", result.SessionID)
}
if got := result.Turn.ResponseMessageID; got != 42 {
t.Fatalf("response message id mismatch: %d", got)
}
if len(result.Turn.ToolCalls) != 1 {
t.Fatalf("expected one tool call, got %d", len(result.Turn.ToolCalls))
}
if _, ok := result.Turn.ToolCalls[0].Input["content"].(string); !ok {
t.Fatalf("expected schema-normalized string argument, got %#v", result.Turn.ToolCalls[0].Input["content"])
}
if result.Turn.Usage.InputTokens == 0 || result.Turn.Usage.TotalTokens == 0 {
t.Fatalf("expected usage to be populated, got %#v", result.Turn.Usage)
}
}
func TestExecuteNonStreamWithRetryUsesParentMessageForEmptyRetry(t *testing.T) {
ds := &fakeDeepSeekCaller{responses: []*http.Response{
sseHTTPResponse(http.StatusOK, `data: {"response_message_id":77,"p":"response/status","v":"FINISHED"}`),
sseHTTPResponse(http.StatusOK, `data: {"response_message_id":78,"p":"response/content","v":"ok"}`),
}}
stdReq := promptcompat.StandardRequest{
Surface: "test",
ResponseModel: "deepseek-v4-flash",
PromptTokenText: "prompt",
FinalPrompt: "final prompt",
}
result, outErr := ExecuteNonStreamWithRetry(context.Background(), ds, &auth.RequestAuth{}, stdReq, Options{RetryEnabled: true})
if outErr != nil {
t.Fatalf("unexpected output error: %#v", outErr)
}
if result.Attempts != 1 {
t.Fatalf("expected one retry, got %d", result.Attempts)
}
if len(ds.payloads) != 2 {
t.Fatalf("expected two completion calls, got %d", len(ds.payloads))
}
if got := ds.payloads[1]["parent_message_id"]; got != 77 {
t.Fatalf("retry parent_message_id mismatch: %#v", got)
}
if result.Turn.Text != "ok" {
t.Fatalf("retry text mismatch: %q", result.Turn.Text)
}
}
func TestStartCompletionAppliesCurrentInputFileGlobally(t *testing.T) {
ds := &fakeDeepSeekCaller{responses: []*http.Response{sseHTTPResponse(http.StatusOK, `data: {"p":"response/content","v":"ok"}`)}}
stdReq := promptcompat.StandardRequest{
Surface: "test_adapter",
RequestedModel: "deepseek-v4-flash",
ResolvedModel: "deepseek-v4-flash",
ResponseModel: "deepseek-v4-flash",
PromptTokenText: "first user turn",
FinalPrompt: "first user turn",
Messages: []any{
map[string]any{"role": "user", "content": "first user turn"},
},
}
start, outErr := StartCompletion(context.Background(), ds, &auth.RequestAuth{DeepSeekToken: "token"}, stdReq, Options{
CurrentInputFile: currentInputRuntimeConfig{},
})
if outErr != nil {
t.Fatalf("unexpected output error: %#v", outErr)
}
if len(ds.uploads) != 1 {
t.Fatalf("expected current input upload, got %d", len(ds.uploads))
}
if got := ds.uploads[0].Filename; got != "DS2API_HISTORY.txt" {
t.Fatalf("upload filename=%q want DS2API_HISTORY.txt", got)
}
if len(ds.payloads) != 1 {
t.Fatalf("expected one completion payload, got %d", len(ds.payloads))
}
refIDs, _ := ds.payloads[0]["ref_file_ids"].([]any)
if len(refIDs) != 1 || refIDs[0] != "file-runtime-1" {
t.Fatalf("expected uploaded file id in ref_file_ids, got %#v", ds.payloads[0]["ref_file_ids"])
}
prompt, _ := ds.payloads[0]["prompt"].(string)
if !strings.Contains(prompt, "Continue from the latest state in the attached DS2API_HISTORY.txt context.") {
t.Fatalf("expected continuation prompt, got %q", prompt)
}
if !start.Request.CurrentInputFileApplied || !strings.Contains(start.Request.PromptTokenText, "# DS2API_HISTORY.txt") {
t.Fatalf("expected prepared request to carry current input file state, got %#v", start.Request)
}
}
func sseHTTPResponse(status int, lines ...string) *http.Response {
body := strings.Join(lines, "\n")
if !strings.HasSuffix(body, "\n") {
body += "\n"
}
return &http.Response{
StatusCode: status,
Header: make(http.Header),
Body: io.NopCloser(strings.NewReader(body)),
}
}

View File

@@ -35,9 +35,6 @@ func (c Config) MarshalJSON() ([]byte, error) {
if c.Runtime.AccountMaxInflight > 0 || c.Runtime.AccountMaxQueue > 0 || c.Runtime.GlobalMaxInflight > 0 || c.Runtime.TokenRefreshIntervalHours > 0 {
m["runtime"] = c.Runtime
}
if c.Compat.WideInputStrictOutput != nil || c.Compat.StripReferenceMarkers != nil {
m["compat"] = c.Compat
}
if c.Responses.StoreTTLSeconds > 0 {
m["responses"] = c.Responses
}
@@ -45,9 +42,6 @@ func (c Config) MarshalJSON() ([]byte, error) {
m["embeddings"] = c.Embeddings
}
m["auto_delete"] = c.AutoDelete
if c.HistorySplit.Enabled != nil || c.HistorySplit.TriggerAfterTurns != nil {
m["history_split"] = c.HistorySplit
}
if c.CurrentInputFile.Enabled != nil || c.CurrentInputFile.MinChars != 0 {
m["current_input_file"] = c.CurrentInputFile
}
@@ -103,8 +97,9 @@ func (c *Config) UnmarshalJSON(b []byte) error {
return fmt.Errorf("invalid field %q: %w", k, err)
}
case "compat":
if err := json.Unmarshal(v, &c.Compat); err != nil {
return fmt.Errorf("invalid field %q: %w", k, err)
// Removed field ignored instead of persisted.
if Logger != nil {
Logger.Warn("config key \"compat\" is deprecated and ignored; remove it from your configuration")
}
case "toolcall":
// Legacy field ignored. Toolcall policy is fixed and no longer configurable.
@@ -121,9 +116,7 @@ func (c *Config) UnmarshalJSON(b []byte) error {
return fmt.Errorf("invalid field %q: %w", k, err)
}
case "history_split":
if err := json.Unmarshal(v, &c.HistorySplit); err != nil {
return fmt.Errorf("invalid field %q: %w", k, err)
}
// Removed legacy split field is ignored instead of persisted.
case "current_input_file":
if err := json.Unmarshal(v, &c.CurrentInputFile); err != nil {
return fmt.Errorf("invalid field %q: %w", k, err)
@@ -160,17 +153,9 @@ func (c Config) Clone() Config {
ModelAliases: cloneStringMap(c.ModelAliases),
Admin: c.Admin,
Runtime: c.Runtime,
Compat: CompatConfig{
WideInputStrictOutput: cloneBoolPtr(c.Compat.WideInputStrictOutput),
StripReferenceMarkers: cloneBoolPtr(c.Compat.StripReferenceMarkers),
},
Responses: c.Responses,
Embeddings: c.Embeddings,
AutoDelete: c.AutoDelete,
HistorySplit: HistorySplitConfig{
Enabled: cloneBoolPtr(c.HistorySplit.Enabled),
TriggerAfterTurns: cloneIntPtr(c.HistorySplit.TriggerAfterTurns),
},
Responses: c.Responses,
Embeddings: c.Embeddings,
AutoDelete: c.AutoDelete,
CurrentInputFile: CurrentInputFileConfig{
Enabled: cloneBoolPtr(c.CurrentInputFile.Enabled),
MinChars: c.CurrentInputFile.MinChars,
@@ -208,14 +193,6 @@ func cloneBoolPtr(in *bool) *bool {
return &v
}
func cloneIntPtr(in *int) *int {
if in == nil {
return nil
}
v := *in
return &v
}
func parseConfigString(raw string) (Config, error) {
var cfg Config
candidates := []string{raw}

View File

@@ -15,11 +15,9 @@ type Config struct {
ModelAliases map[string]string `json:"model_aliases,omitempty"`
Admin AdminConfig `json:"admin,omitempty"`
Runtime RuntimeConfig `json:"runtime,omitempty"`
Compat CompatConfig `json:"compat,omitempty"`
Responses ResponsesConfig `json:"responses,omitempty"`
Embeddings EmbeddingsConfig `json:"embeddings,omitempty"`
AutoDelete AutoDeleteConfig `json:"auto_delete"`
HistorySplit HistorySplitConfig `json:"history_split"`
CurrentInputFile CurrentInputFileConfig `json:"current_input_file,omitempty"`
ThinkingInjection ThinkingInjectionConfig `json:"thinking_injection,omitempty"`
VercelSyncHash string `json:"_vercel_sync_hash,omitempty"`
@@ -142,11 +140,6 @@ func (c *Config) normalizeModelAliases() {
}
}
type CompatConfig struct {
WideInputStrictOutput *bool `json:"wide_input_strict_output,omitempty"`
StripReferenceMarkers *bool `json:"strip_reference_markers,omitempty"`
}
type AdminConfig struct {
PasswordHash string `json:"password_hash,omitempty"`
JWTExpireHours int `json:"jwt_expire_hours,omitempty"`
@@ -173,11 +166,6 @@ type AutoDeleteConfig struct {
Sessions bool `json:"sessions,omitempty"`
}
type HistorySplitConfig struct {
Enabled *bool `json:"enabled,omitempty"`
TriggerAfterTurns *int `json:"trigger_after_turns,omitempty"`
}
type CurrentInputFileConfig struct {
Enabled *bool `json:"enabled,omitempty"`
MinChars int `json:"min_chars,omitempty"`

View File

@@ -163,8 +163,6 @@ func TestLowerFunction(t *testing.T) {
// ─── Config.MarshalJSON / UnmarshalJSON roundtrip ────────────────────
func TestConfigJSONRoundtrip(t *testing.T) {
trueVal := true
falseVal := false
cfg := Config{
Keys: []string{"key1", "key2"},
Accounts: []Account{{Email: "user@example.com", Password: "pass", Token: "tok"}},
@@ -172,17 +170,9 @@ func TestConfigJSONRoundtrip(t *testing.T) {
AutoDelete: AutoDeleteConfig{
Mode: "single",
},
HistorySplit: HistorySplitConfig{
Enabled: &trueVal,
TriggerAfterTurns: func() *int { v := 2; return &v }(),
},
Runtime: RuntimeConfig{
TokenRefreshIntervalHours: 12,
},
Compat: CompatConfig{
WideInputStrictOutput: &trueVal,
StripReferenceMarkers: &falseVal,
},
VercelSyncHash: "hash123",
VercelSyncTime: 1234567890,
AdditionalFields: map[string]any{
@@ -215,18 +205,6 @@ func TestConfigJSONRoundtrip(t *testing.T) {
if decoded.AutoDelete.Mode != "single" {
t.Fatalf("unexpected auto delete mode: %#v", decoded.AutoDelete.Mode)
}
if decoded.HistorySplit.Enabled == nil || !*decoded.HistorySplit.Enabled {
t.Fatalf("unexpected history split enabled: %#v", decoded.HistorySplit.Enabled)
}
if decoded.HistorySplit.TriggerAfterTurns == nil || *decoded.HistorySplit.TriggerAfterTurns != 2 {
t.Fatalf("unexpected history split trigger_after_turns: %#v", decoded.HistorySplit.TriggerAfterTurns)
}
if decoded.Compat.WideInputStrictOutput == nil || !*decoded.Compat.WideInputStrictOutput {
t.Fatalf("unexpected compat wide_input_strict_output: %#v", decoded.Compat.WideInputStrictOutput)
}
if decoded.Compat.StripReferenceMarkers == nil || *decoded.Compat.StripReferenceMarkers {
t.Fatalf("unexpected compat strip_reference_markers: %#v", decoded.Compat.StripReferenceMarkers)
}
if decoded.VercelSyncHash != "hash123" {
t.Fatalf("unexpected vercel sync hash: %q", decoded.VercelSyncHash)
}
@@ -290,23 +268,31 @@ func TestConfigUnmarshalJSONIgnoresRemovedLegacyModelMappings(t *testing.T) {
}
}
func TestConfigUnmarshalJSONIgnoresRemovedHistorySplit(t *testing.T) {
raw := `{"keys":["k1"],"history_split":{"enabled":true,"trigger_after_turns":2}}`
var cfg Config
if err := json.Unmarshal([]byte(raw), &cfg); err != nil {
t.Fatalf("unmarshal error: %v", err)
}
if _, ok := cfg.AdditionalFields["history_split"]; ok {
t.Fatalf("expected removed legacy field not to persist in additional fields: %#v", cfg.AdditionalFields)
}
out, err := json.Marshal(cfg)
if err != nil {
t.Fatalf("marshal error: %v", err)
}
if strings.Contains(string(out), "history_split") {
t.Fatalf("expected removed history_split field not to marshal, got %s", out)
}
}
// ─── Config.Clone ────────────────────────────────────────────────────
func TestConfigCloneIsDeepCopy(t *testing.T) {
falseVal := false
trueVal := true
turns := 2
cfg := Config{
Keys: []string{"key1"},
Accounts: []Account{{Email: "user@test.com", Token: "token"}},
ModelAliases: map[string]string{"claude-sonnet-4-6": "deepseek-v4-flash"},
Compat: CompatConfig{
StripReferenceMarkers: &falseVal,
},
HistorySplit: HistorySplitConfig{
Enabled: &trueVal,
TriggerAfterTurns: &turns,
},
Keys: []string{"key1"},
Accounts: []Account{{Email: "user@test.com", Token: "token"}},
ModelAliases: map[string]string{"claude-sonnet-4-6": "deepseek-v4-flash"},
AdditionalFields: map[string]any{"custom": "value"},
}
@@ -316,15 +302,6 @@ func TestConfigCloneIsDeepCopy(t *testing.T) {
cfg.Keys[0] = "modified"
cfg.Accounts[0].Email = "modified@test.com"
cfg.ModelAliases["claude-sonnet-4-6"] = "modified-model"
if cfg.Compat.StripReferenceMarkers != nil {
*cfg.Compat.StripReferenceMarkers = true
}
if cfg.HistorySplit.Enabled != nil {
*cfg.HistorySplit.Enabled = false
}
if cfg.HistorySplit.TriggerAfterTurns != nil {
*cfg.HistorySplit.TriggerAfterTurns = 5
}
// Cloned should not be affected
if cloned.Keys[0] != "key1" {
@@ -336,15 +313,6 @@ func TestConfigCloneIsDeepCopy(t *testing.T) {
if cloned.ModelAliases["claude-sonnet-4-6"] != "deepseek-v4-flash" {
t.Fatalf("clone model aliases was affected: %#v", cloned.ModelAliases)
}
if cloned.Compat.StripReferenceMarkers == nil || *cloned.Compat.StripReferenceMarkers {
t.Fatalf("clone compat was affected: %#v", cloned.Compat.StripReferenceMarkers)
}
if cloned.HistorySplit.Enabled == nil || !*cloned.HistorySplit.Enabled {
t.Fatalf("clone history split enabled was affected: %#v", cloned.HistorySplit.Enabled)
}
if cloned.HistorySplit.TriggerAfterTurns == nil || *cloned.HistorySplit.TriggerAfterTurns != 2 {
t.Fatalf("clone history split trigger was affected: %#v", cloned.HistorySplit.TriggerAfterTurns)
}
}
func TestConfigCloneNilMaps(t *testing.T) {
@@ -483,53 +451,9 @@ func TestStoreFindAccountNotFound(t *testing.T) {
}
}
func TestStoreCompatWideInputStrictOutputDefaultTrue(t *testing.T) {
t.Setenv("DS2API_CONFIG_JSON", `{"keys":["k1"],"accounts":[]}`)
store := LoadStore()
if !store.CompatWideInputStrictOutput() {
t.Fatal("expected default wide_input_strict_output=true when unset")
}
}
func TestStoreCompatWideInputStrictOutputCanDisable(t *testing.T) {
t.Setenv("DS2API_CONFIG_JSON", `{"keys":["k1"],"accounts":[],"compat":{"wide_input_strict_output":false}}`)
store := LoadStore()
if store.CompatWideInputStrictOutput() {
t.Fatal("expected wide_input_strict_output=false when explicitly configured")
}
snap := store.Snapshot()
data, err := snap.MarshalJSON()
if err != nil {
t.Fatalf("marshal failed: %v", err)
}
var out map[string]any
if err := json.Unmarshal(data, &out); err != nil {
t.Fatalf("decode failed: %v", err)
}
rawCompat, ok := out["compat"].(map[string]any)
if !ok {
t.Fatalf("expected compat in marshaled output, got %#v", out)
}
if rawCompat["wide_input_strict_output"] != false {
t.Fatalf("expected explicit false in compat, got %#v", rawCompat)
}
}
func TestStoreCompatStripReferenceMarkersDefaultTrue(t *testing.T) {
t.Setenv("DS2API_CONFIG_JSON", `{"keys":["k1"],"accounts":[]}`)
store := LoadStore()
if !store.CompatStripReferenceMarkers() {
t.Fatal("expected default strip_reference_markers=true when unset")
}
}
func TestStoreCompatStripReferenceMarkersCanDisable(t *testing.T) {
func TestStoreIgnoresRemovedCompatConfig(t *testing.T) {
t.Setenv("DS2API_CONFIG_JSON", `{"keys":["k1"],"accounts":[],"compat":{"strip_reference_markers":false}}`)
store := LoadStore()
if store.CompatStripReferenceMarkers() {
t.Fatal("expected strip_reference_markers=false when explicitly configured")
}
snap := store.Snapshot()
data, err := snap.MarshalJSON()
@@ -540,12 +464,8 @@ func TestStoreCompatStripReferenceMarkersCanDisable(t *testing.T) {
if err := json.Unmarshal(data, &out); err != nil {
t.Fatalf("decode failed: %v", err)
}
rawCompat, ok := out["compat"].(map[string]any)
if !ok {
t.Fatalf("expected compat in marshaled output, got %#v", out)
}
if rawCompat["strip_reference_markers"] != false {
t.Fatalf("expected explicit false in compat, got %#v", rawCompat)
if _, ok := out["compat"]; ok {
t.Fatalf("expected removed compat field not to marshal, got %#v", out)
}
}

View File

@@ -144,6 +144,44 @@ func TestLoadStoreIgnoresLegacyConfigJSONEnv(t *testing.T) {
}
}
func TestExplicitMissingConfigPathBootstrapsEmptyFileBackedStore(t *testing.T) {
path := t.TempDir() + "/config.json"
t.Setenv("DS2API_CONFIG_JSON", "")
t.Setenv("DS2API_CONFIG_PATH", path)
store, err := LoadStoreWithError()
if err != nil {
t.Fatalf("expected missing explicit config path to bootstrap, got: %v", err)
}
if store.IsEnvBacked() {
t.Fatal("expected bootstrap store to be file-backed")
}
if store.ConfigPath() != path {
t.Fatalf("ConfigPath() = %q, want %q", store.ConfigPath(), path)
}
if len(store.Keys()) != 0 || len(store.Accounts()) != 0 {
t.Fatalf("expected empty bootstrap config, got keys=%d accounts=%d", len(store.Keys()), len(store.Accounts()))
}
if _, statErr := os.Stat(path); !errors.Is(statErr, os.ErrNotExist) {
t.Fatalf("expected bootstrap not to create config until first save, stat err=%v", statErr)
}
if err := store.Update(func(c *Config) error {
c.Keys = []string{"first-key"}
return nil
}); err != nil {
t.Fatalf("update should persist bootstrap config: %v", err)
}
content, err := os.ReadFile(path)
if err != nil {
t.Fatalf("expected first update to write config: %v", err)
}
if !strings.Contains(string(content), "first-key") {
t.Fatalf("expected saved config to contain first key, got: %s", content)
}
}
func TestEnvBackedStoreWritebackBootstrapsMissingConfigFile(t *testing.T) {
tmp, err := os.CreateTemp(t.TempDir(), "config-*.json")
if err != nil {

View File

@@ -52,11 +52,12 @@ func loadStore() (*Store, error) {
func loadConfig() (Config, bool, error) {
rawCfg := strings.TrimSpace(os.Getenv("DS2API_CONFIG_JSON"))
path := ConfigPath()
if rawCfg != "" {
cfg, err := parseConfigString(rawCfg)
if err != nil {
if !IsVercel() && envWritebackEnabled() {
if fileCfg, fileErr := loadConfigFromFile(ConfigPath()); fileErr == nil {
if fileCfg, fileErr := loadConfigFromFile(path); fileErr == nil {
return fileCfg, false, nil
}
}
@@ -67,7 +68,7 @@ func loadConfig() (Config, bool, error) {
if IsVercel() || !envWritebackEnabled() {
return cfg, true, err
}
content, fileErr := os.ReadFile(ConfigPath())
content, fileErr := os.ReadFile(path)
if fileErr == nil {
var fileCfg Config
if unmarshalErr := json.Unmarshal(content, &fileCfg); unmarshalErr == nil {
@@ -79,7 +80,7 @@ func loadConfig() (Config, bool, error) {
if validateErr := ValidateConfig(cfg); validateErr != nil {
return cfg, true, validateErr
}
if writeErr := writeConfigFile(ConfigPath(), cfg.Clone()); writeErr == nil {
if writeErr := writeConfigFile(path, cfg.Clone()); writeErr == nil {
return cfg, false, err
} else {
Logger.Warn("[config] env writeback bootstrap failed", "error", writeErr)
@@ -87,7 +88,7 @@ func loadConfig() (Config, bool, error) {
}
return cfg, true, err
}
cfg, err := loadConfigFromFile(ConfigPath())
cfg, err := loadConfigFromFile(path)
if err != nil {
if shouldTryLegacyContainerConfigPath() {
legacyPath := legacyContainerConfigPath()
@@ -100,6 +101,10 @@ func loadConfig() (Config, bool, error) {
// Vercel may start without writable/present config; keep in-memory bootstrap config.
return Config{}, true, nil
}
if shouldBootstrapMissingConfigFile(err) {
Logger.Warn("[config] config file missing; starting with empty file-backed config", "path", path)
return Config{}, false, nil
}
return Config{}, false, err
}
if IsVercel() {
@@ -109,6 +114,10 @@ func loadConfig() (Config, bool, error) {
return cfg, false, nil
}
func shouldBootstrapMissingConfigFile(err error) bool {
return errors.Is(err, os.ErrNotExist) && strings.TrimSpace(os.Getenv("DS2API_CONFIG_PATH")) != ""
}
func loadConfigFromFile(path string) (Config, error) {
content, err := os.ReadFile(path)
if err != nil {

View File

@@ -21,24 +21,6 @@ func (s *Store) ModelAliases() map[string]string {
return out
}
func (s *Store) CompatWideInputStrictOutput() bool {
s.mu.RLock()
defer s.mu.RUnlock()
if s.cfg.Compat.WideInputStrictOutput == nil {
return true
}
return *s.cfg.Compat.WideInputStrictOutput
}
func (s *Store) CompatStripReferenceMarkers() bool {
s.mu.RLock()
defer s.mu.RUnlock()
if s.cfg.Compat.StripReferenceMarkers == nil {
return true
}
return *s.cfg.Compat.StripReferenceMarkers
}
func (s *Store) ToolcallMode() string {
return "feature_match"
}
@@ -163,14 +145,6 @@ func (s *Store) AutoDeleteSessions() bool {
return s.AutoDeleteMode() != "none"
}
func (s *Store) HistorySplitEnabled() bool {
return false
}
func (s *Store) HistorySplitTriggerAfterTurns() int {
return 1
}
func (s *Store) CurrentInputFileEnabled() bool {
s.mu.RLock()
defer s.mu.RUnlock()

View File

@@ -2,21 +2,6 @@ package config
import "testing"
func TestStoreHistorySplitAccessors(t *testing.T) {
enabled := true
turns := 3
store := &Store{cfg: Config{HistorySplit: HistorySplitConfig{
Enabled: &enabled,
TriggerAfterTurns: &turns,
}}}
if store.HistorySplitEnabled() {
t.Fatal("expected history split to stay disabled")
}
if got := store.HistorySplitTriggerAfterTurns(); got != 1 {
t.Fatalf("history split trigger_after_turns=%d want=1", got)
}
}
func TestStoreCurrentInputFileAccessors(t *testing.T) {
store := &Store{cfg: Config{}}
if !store.CurrentInputFileEnabled() {
@@ -40,12 +25,6 @@ func TestStoreCurrentInputFileAccessors(t *testing.T) {
if got := store.CurrentInputFileMinChars(); got != 12345 {
t.Fatalf("current input file min_chars=%d want=12345", got)
}
historyEnabled := true
store.cfg.HistorySplit.Enabled = &historyEnabled
if !store.CurrentInputFileEnabled() {
t.Fatal("expected history split config to not suppress current input file mode")
}
}
func TestStoreThinkingInjectionAccessors(t *testing.T) {

View File

@@ -22,6 +22,9 @@ const (
var fileReadySleep = time.Sleep
// ErrUploadFileNotFound indicates that DeepSeek returned no matching uploaded file.
var ErrUploadFileNotFound = errors.New("uploaded file not found")
func (c *Client) waitForUploadedFile(ctx context.Context, a *auth.RequestAuth, result *UploadFileResult) error {
if result == nil || strings.TrimSpace(result.ID) == "" {
return nil
@@ -42,7 +45,7 @@ func (c *Client) waitForUploadedFile(ctx context.Context, a *auth.RequestAuth, r
return fmt.Errorf("waiting for file %s to become ready: %w", result.ID, err)
}
fetched, err := c.fetchUploadedFile(pollCtx, a, result.ID)
fetched, err := c.FetchUploadedFile(pollCtx, a, result.ID)
if err == nil && fetched != nil {
mergeUploadFileResults(result, fetched)
if isReadyUploadFileStatus(result.Status) {
@@ -65,7 +68,8 @@ func (c *Client) waitForUploadedFile(ctx context.Context, a *auth.RequestAuth, r
return fmt.Errorf("file %s did not become ready: %w", result.ID, lastErr)
}
func (c *Client) fetchUploadedFile(ctx context.Context, a *auth.RequestAuth, fileID string) (*UploadFileResult, error) {
// FetchUploadedFile returns metadata for an uploaded DeepSeek file by ID.
func (c *Client) FetchUploadedFile(ctx context.Context, a *auth.RequestAuth, fileID string) (*UploadFileResult, error) {
fileID = strings.TrimSpace(fileID)
if fileID == "" {
return nil, errors.New("file id is required")
@@ -92,7 +96,7 @@ func (c *Client) fetchUploadedFile(ctx context.Context, a *auth.RequestAuth, fil
result := extractFetchedUploadFileResult(resp, fileID)
if result == nil || strings.TrimSpace(result.ID) == "" {
return nil, errors.New("fetch files succeeded without matching file data")
return nil, ErrUploadFileNotFound
}
result.Raw = resp
return result, nil

View File

@@ -23,6 +23,7 @@ type UploadFileRequest struct {
Filename string
ContentType string
Purpose string
ModelType string
Data []byte
}
@@ -54,6 +55,7 @@ func (c *Client) UploadFile(ctx context.Context, a *auth.RequestAuth, req Upload
contentType = "application/octet-stream"
}
purpose := strings.TrimSpace(req.Purpose)
modelType := strings.ToLower(strings.TrimSpace(req.ModelType))
body, contentTypeHeader, err := buildUploadMultipartBody(filename, contentType, req.Data)
if err != nil {
return nil, err
@@ -64,6 +66,9 @@ func (c *Client) UploadFile(ctx context.Context, a *auth.RequestAuth, req Upload
"purpose": purpose,
"bytes": len(req.Data),
}
if modelType != "" {
capturePayload["model_type"] = modelType
}
captureSession := c.capture.Start("deepseek_upload_file", dsprotocol.DeepSeekUploadFileURL, a.AccountID, capturePayload)
attempts := 0
refreshed := false
@@ -81,6 +86,9 @@ func (c *Client) UploadFile(ctx context.Context, a *auth.RequestAuth, req Upload
}
headers := c.authHeaders(a.DeepSeekToken)
headers["Content-Type"] = contentTypeHeader
if modelType != "" {
headers["x-model-type"] = modelType
}
headers["x-ds-pow-response"] = powHeader
headers["x-file-size"] = strconv.Itoa(len(req.Data))
headers["x-thinking-enabled"] = "1"

View File

@@ -82,6 +82,7 @@ func TestUploadFileUsesUploadTargetPowAndMultipartHeaders(t *testing.T) {
var seenTargetPath string
var seenContentType string
var seenFileSize string
var seenModelType string
var seenBody string
call := 0
client := &Client{
@@ -96,6 +97,7 @@ func TestUploadFileUsesUploadTargetPowAndMultipartHeaders(t *testing.T) {
seenPow = req.Header.Get("x-ds-pow-response")
seenContentType = req.Header.Get("Content-Type")
seenFileSize = req.Header.Get("x-file-size")
seenModelType = req.Header.Get("x-model-type")
seenBody = string(bodyBytes)
return &http.Response{StatusCode: http.StatusOK, Header: make(http.Header), Body: io.NopCloser(strings.NewReader(uploadResponse)), Request: req}, nil
default:
@@ -112,6 +114,7 @@ func TestUploadFileUsesUploadTargetPowAndMultipartHeaders(t *testing.T) {
Filename: "demo.txt",
ContentType: "text/plain",
Purpose: "assistants",
ModelType: "vision",
Data: []byte("hello"),
}, 1)
if err != nil {
@@ -140,6 +143,9 @@ func TestUploadFileUsesUploadTargetPowAndMultipartHeaders(t *testing.T) {
if seenFileSize != "5" {
t.Fatalf("expected x-file-size=5, got %q", seenFileSize)
}
if seenModelType != "vision" {
t.Fatalf("expected x-model-type=vision, got %q", seenModelType)
}
if !strings.HasPrefix(seenContentType, "multipart/form-data; boundary=") {
t.Fatalf("expected multipart content type, got %q", seenContentType)
}

View File

@@ -159,6 +159,6 @@ func toStringSet(in []string) map[string]struct{} {
const (
KeepAliveTimeout = 5
StreamIdleTimeout = 90
MaxKeepaliveCount = 10
StreamIdleTimeout = 300
MaxKeepaliveCount = 40
)

View File

@@ -1,6 +1,7 @@
package claude
import (
"ds2api/internal/assistantturn"
"ds2api/internal/toolcall"
"fmt"
"time"
@@ -9,6 +10,47 @@ import (
"ds2api/internal/util"
)
func BuildMessageResponseFromTurn(messageID, model string, turn assistantturn.Turn, exposeThinking bool) map[string]any {
content := make([]map[string]any, 0, 4)
if exposeThinking && turn.Thinking != "" {
content = append(content, map[string]any{"type": "thinking", "thinking": turn.Thinking})
}
stopReason := "end_turn"
if len(turn.ToolCalls) > 0 {
stopReason = "tool_use"
for i, tc := range turn.ToolCalls {
content = append(content, map[string]any{
"type": "tool_use",
"id": fmt.Sprintf("toolu_%d_%d", time.Now().Unix(), i),
"name": tc.Name,
"input": tc.Input,
})
}
} else {
text := turn.Text
if text == "" && exposeThinking {
text = turn.Thinking
}
if text == "" {
text = "抱歉,没有生成有效的响应内容。"
}
content = append(content, map[string]any{"type": "text", "text": text})
}
return map[string]any{
"id": messageID,
"type": "message",
"role": "assistant",
"model": model,
"content": content,
"stop_reason": stopReason,
"stop_sequence": nil,
"usage": map[string]any{
"input_tokens": turn.Usage.InputTokens,
"output_tokens": turn.Usage.OutputTokens,
},
}
}
func BuildMessageResponse(messageID, model string, normalizedMessages []any, finalThinking, finalText string, toolNames []string) map[string]any {
detected := toolcall.ParseToolCalls(finalText, toolNames)
if len(detected) == 0 && finalText == "" && finalThinking != "" {

View File

@@ -208,9 +208,6 @@ func TestUpdateSettingsCurrentInputFile(t *testing.T) {
if !h.Store.CurrentInputFileEnabled() {
t.Fatal("expected current input file accessor to stay enabled")
}
if h.Store.HistorySplitEnabled() {
t.Fatal("expected history split accessor to stay disabled")
}
}
func TestUpdateSettingsCurrentInputFilePartialUpdatePreservesEnabled(t *testing.T) {

View File

@@ -21,11 +21,10 @@ func boolFrom(v any) bool {
}
}
func parseSettingsUpdateRequest(req map[string]any) (*config.AdminConfig, *config.RuntimeConfig, *config.CompatConfig, *config.ResponsesConfig, *config.EmbeddingsConfig, *config.AutoDeleteConfig, *config.CurrentInputFileConfig, *config.ThinkingInjectionConfig, map[string]string, error) {
func parseSettingsUpdateRequest(req map[string]any) (*config.AdminConfig, *config.RuntimeConfig, *config.ResponsesConfig, *config.EmbeddingsConfig, *config.AutoDeleteConfig, *config.CurrentInputFileConfig, *config.ThinkingInjectionConfig, map[string]string, error) {
var (
adminCfg *config.AdminConfig
runtimeCfg *config.RuntimeConfig
compatCfg *config.CompatConfig
respCfg *config.ResponsesConfig
embCfg *config.EmbeddingsConfig
autoDeleteCfg *config.AutoDeleteConfig
@@ -39,7 +38,7 @@ func parseSettingsUpdateRequest(req map[string]any) (*config.AdminConfig, *confi
if v, exists := raw["jwt_expire_hours"]; exists {
n := intFrom(v)
if err := config.ValidateIntRange("admin.jwt_expire_hours", n, 1, 720, true); err != nil {
return nil, nil, nil, nil, nil, nil, nil, nil, nil, err
return nil, nil, nil, nil, nil, nil, nil, nil, err
}
cfg.JWTExpireHours = n
}
@@ -51,56 +50,43 @@ func parseSettingsUpdateRequest(req map[string]any) (*config.AdminConfig, *confi
if v, exists := raw["account_max_inflight"]; exists {
n := intFrom(v)
if err := config.ValidateIntRange("runtime.account_max_inflight", n, 1, 256, true); err != nil {
return nil, nil, nil, nil, nil, nil, nil, nil, nil, err
return nil, nil, nil, nil, nil, nil, nil, nil, err
}
cfg.AccountMaxInflight = n
}
if v, exists := raw["account_max_queue"]; exists {
n := intFrom(v)
if err := config.ValidateIntRange("runtime.account_max_queue", n, 1, 200000, true); err != nil {
return nil, nil, nil, nil, nil, nil, nil, nil, nil, err
return nil, nil, nil, nil, nil, nil, nil, nil, err
}
cfg.AccountMaxQueue = n
}
if v, exists := raw["global_max_inflight"]; exists {
n := intFrom(v)
if err := config.ValidateIntRange("runtime.global_max_inflight", n, 1, 200000, true); err != nil {
return nil, nil, nil, nil, nil, nil, nil, nil, nil, err
return nil, nil, nil, nil, nil, nil, nil, nil, err
}
cfg.GlobalMaxInflight = n
}
if v, exists := raw["token_refresh_interval_hours"]; exists {
n := intFrom(v)
if err := config.ValidateIntRange("runtime.token_refresh_interval_hours", n, 1, 720, true); err != nil {
return nil, nil, nil, nil, nil, nil, nil, nil, nil, err
return nil, nil, nil, nil, nil, nil, nil, nil, err
}
cfg.TokenRefreshIntervalHours = n
}
if cfg.AccountMaxInflight > 0 && cfg.GlobalMaxInflight > 0 && cfg.GlobalMaxInflight < cfg.AccountMaxInflight {
return nil, nil, nil, nil, nil, nil, nil, nil, nil, fmt.Errorf("runtime.global_max_inflight must be >= runtime.account_max_inflight")
return nil, nil, nil, nil, nil, nil, nil, nil, fmt.Errorf("runtime.global_max_inflight must be >= runtime.account_max_inflight")
}
runtimeCfg = cfg
}
if raw, ok := req["compat"].(map[string]any); ok {
cfg := &config.CompatConfig{}
if v, exists := raw["wide_input_strict_output"]; exists {
b := boolFrom(v)
cfg.WideInputStrictOutput = &b
}
if v, exists := raw["strip_reference_markers"]; exists {
b := boolFrom(v)
cfg.StripReferenceMarkers = &b
}
compatCfg = cfg
}
if raw, ok := req["responses"].(map[string]any); ok {
cfg := &config.ResponsesConfig{}
if v, exists := raw["store_ttl_seconds"]; exists {
n := intFrom(v)
if err := config.ValidateIntRange("responses.store_ttl_seconds", n, 30, 86400, true); err != nil {
return nil, nil, nil, nil, nil, nil, nil, nil, nil, err
return nil, nil, nil, nil, nil, nil, nil, nil, err
}
cfg.StoreTTLSeconds = n
}
@@ -112,7 +98,7 @@ func parseSettingsUpdateRequest(req map[string]any) (*config.AdminConfig, *confi
if v, exists := raw["provider"]; exists {
p := strings.TrimSpace(fmt.Sprintf("%v", v))
if err := config.ValidateTrimmedString("embeddings.provider", p, false); err != nil {
return nil, nil, nil, nil, nil, nil, nil, nil, nil, err
return nil, nil, nil, nil, nil, nil, nil, nil, err
}
cfg.Provider = p
}
@@ -138,7 +124,7 @@ func parseSettingsUpdateRequest(req map[string]any) (*config.AdminConfig, *confi
if v, exists := raw["mode"]; exists {
mode := strings.ToLower(strings.TrimSpace(fmt.Sprintf("%v", v)))
if err := config.ValidateAutoDeleteMode(mode); err != nil {
return nil, nil, nil, nil, nil, nil, nil, nil, nil, err
return nil, nil, nil, nil, nil, nil, nil, nil, err
}
if mode == "" {
mode = "none"
@@ -160,12 +146,12 @@ func parseSettingsUpdateRequest(req map[string]any) (*config.AdminConfig, *confi
if v, exists := raw["min_chars"]; exists {
n := intFrom(v)
if err := config.ValidateIntRange("current_input_file.min_chars", n, 0, 100000000, true); err != nil {
return nil, nil, nil, nil, nil, nil, nil, nil, nil, err
return nil, nil, nil, nil, nil, nil, nil, nil, err
}
cfg.MinChars = n
}
if err := config.ValidateCurrentInputFileConfig(*cfg); err != nil {
return nil, nil, nil, nil, nil, nil, nil, nil, nil, err
return nil, nil, nil, nil, nil, nil, nil, nil, err
}
currentInputCfg = cfg
}
@@ -182,5 +168,5 @@ func parseSettingsUpdateRequest(req map[string]any) (*config.AdminConfig, *confi
thinkingInjCfg = cfg
}
return adminCfg, runtimeCfg, compatCfg, respCfg, embCfg, autoDeleteCfg, currentInputCfg, thinkingInjCfg, aliasMap, nil
return adminCfg, runtimeCfg, respCfg, embCfg, autoDeleteCfg, currentInputCfg, thinkingInjCfg, aliasMap, nil
}

View File

@@ -27,7 +27,6 @@ func (h *Handler) getSettings(w http.ResponseWriter, _ *http.Request) {
"global_max_inflight": h.Store.RuntimeGlobalMaxInflight(recommended),
"token_refresh_interval_hours": h.Store.RuntimeTokenRefreshIntervalHours(),
},
"compat": snap.Compat,
"responses": snap.Responses,
"embeddings": snap.Embeddings,
"auto_delete": snap.AutoDelete,

View File

@@ -17,7 +17,7 @@ func (h *Handler) updateSettings(w http.ResponseWriter, r *http.Request) {
return
}
adminCfg, runtimeCfg, compatCfg, responsesCfg, embeddingsCfg, autoDeleteCfg, currentInputCfg, thinkingInjCfg, aliasMap, err := parseSettingsUpdateRequest(req)
adminCfg, runtimeCfg, responsesCfg, embeddingsCfg, autoDeleteCfg, currentInputCfg, thinkingInjCfg, aliasMap, err := parseSettingsUpdateRequest(req)
if err != nil {
writeJSON(w, http.StatusBadRequest, map[string]any{"detail": err.Error()})
return
@@ -53,14 +53,6 @@ func (h *Handler) updateSettings(w http.ResponseWriter, r *http.Request) {
c.Runtime.TokenRefreshIntervalHours = runtimeCfg.TokenRefreshIntervalHours
}
}
if compatCfg != nil {
if compatCfg.WideInputStrictOutput != nil {
c.Compat.WideInputStrictOutput = compatCfg.WideInputStrictOutput
}
if compatCfg.StripReferenceMarkers != nil {
c.Compat.StripReferenceMarkers = compatCfg.StripReferenceMarkers
}
}
if responsesCfg != nil && responsesCfg.StoreTTLSeconds > 0 {
c.Responses.StoreTTLSeconds = responsesCfg.StoreTTLSeconds
}

View File

@@ -33,13 +33,10 @@ type ConfigStore interface {
RuntimeGlobalMaxInflight(defaultSize int) int
RuntimeTokenRefreshIntervalHours() int
AutoDeleteMode() string
HistorySplitEnabled() bool
HistorySplitTriggerAfterTurns() int
CurrentInputFileEnabled() bool
CurrentInputFileMinChars() int
ThinkingInjectionEnabled() bool
ThinkingInjectionPrompt() string
CompatStripReferenceMarkers() bool
AutoDeleteSessions() bool
}

View File

@@ -0,0 +1,85 @@
package claude
import (
"context"
"io"
"net/http"
"net/http/httptest"
"strings"
"testing"
"ds2api/internal/auth"
dsclient "ds2api/internal/deepseek/client"
)
type claudeCurrentInputAuth struct{}
func (claudeCurrentInputAuth) Determine(*http.Request) (*auth.RequestAuth, error) {
return &auth.RequestAuth{
DeepSeekToken: "direct-token",
CallerID: "caller:test",
TriedAccounts: map[string]bool{},
}, nil
}
func (claudeCurrentInputAuth) Release(*auth.RequestAuth) {}
type claudeCurrentInputDS struct {
uploads []dsclient.UploadFileRequest
payload map[string]any
}
func (d *claudeCurrentInputDS) CreateSession(context.Context, *auth.RequestAuth, int) (string, error) {
return "session-id", nil
}
func (d *claudeCurrentInputDS) GetPow(context.Context, *auth.RequestAuth, int) (string, error) {
return "pow", nil
}
func (d *claudeCurrentInputDS) UploadFile(_ context.Context, _ *auth.RequestAuth, req dsclient.UploadFileRequest, _ int) (*dsclient.UploadFileResult, error) {
d.uploads = append(d.uploads, req)
return &dsclient.UploadFileResult{ID: "file-claude-history"}, nil
}
func (d *claudeCurrentInputDS) CallCompletion(_ context.Context, _ *auth.RequestAuth, payload map[string]any, _ string, _ int) (*http.Response, error) {
d.payload = payload
return &http.Response{
StatusCode: http.StatusOK,
Header: make(http.Header),
Body: io.NopCloser(strings.NewReader("data: {\"p\":\"response/content\",\"v\":\"ok\"}\n")),
}, nil
}
func TestClaudeDirectAppliesCurrentInputFile(t *testing.T) {
ds := &claudeCurrentInputDS{}
h := &Handler{
Store: mockClaudeConfig{aliases: map[string]string{"claude-sonnet-4-6": "deepseek-v4-flash"}},
Auth: claudeCurrentInputAuth{},
DS: ds,
}
reqBody := `{"model":"claude-sonnet-4-6","messages":[{"role":"user","content":"hello from claude"}],"max_tokens":1024}`
req := httptest.NewRequest(http.MethodPost, "/v1/messages", strings.NewReader(reqBody))
req.Header.Set("Content-Type", "application/json")
rec := httptest.NewRecorder()
h.Messages(rec, req)
if rec.Code != http.StatusOK {
t.Fatalf("expected 200, got %d body=%s", rec.Code, rec.Body.String())
}
if len(ds.uploads) != 1 {
t.Fatalf("expected one current input upload, got %d", len(ds.uploads))
}
if ds.uploads[0].Filename != "DS2API_HISTORY.txt" {
t.Fatalf("unexpected upload filename: %q", ds.uploads[0].Filename)
}
refIDs, _ := ds.payload["ref_file_ids"].([]any)
if len(refIDs) != 1 || refIDs[0] != "file-claude-history" {
t.Fatalf("expected uploaded history ref id, got %#v", ds.payload["ref_file_ids"])
}
prompt, _ := ds.payload["prompt"].(string)
if !strings.Contains(prompt, "Continue from the latest state in the attached DS2API_HISTORY.txt context.") {
t.Fatalf("expected continuation prompt, got %q", prompt)
}
}

View File

@@ -17,12 +17,14 @@ type AuthResolver interface {
type DeepSeekCaller interface {
CreateSession(ctx context.Context, a *auth.RequestAuth, maxAttempts int) (string, error)
GetPow(ctx context.Context, a *auth.RequestAuth, maxAttempts int) (string, error)
UploadFile(ctx context.Context, a *auth.RequestAuth, req dsclient.UploadFileRequest, maxAttempts int) (*dsclient.UploadFileResult, error)
CallCompletion(ctx context.Context, a *auth.RequestAuth, payload map[string]any, powResp string, maxAttempts int) (*http.Response, error)
}
type ConfigReader interface {
ModelAliases() map[string]string
CompatStripReferenceMarkers() bool
CurrentInputFileEnabled() bool
CurrentInputFileMinChars() int
}
type OpenAIChatRunner interface {

View File

@@ -7,7 +7,8 @@ type mockClaudeConfig struct {
}
func (m mockClaudeConfig) ModelAliases() map[string]string { return m.aliases }
func (mockClaudeConfig) CompatStripReferenceMarkers() bool { return true }
func (mockClaudeConfig) CurrentInputFileEnabled() bool { return true }
func (mockClaudeConfig) CurrentInputFileMinChars() int { return 0 }
func TestNormalizeClaudeRequestUsesGlobalAliasMapping(t *testing.T) {
req := map[string]any{
@@ -27,11 +28,32 @@ func TestNormalizeClaudeRequestUsesGlobalAliasMapping(t *testing.T) {
if out.Standard.ResolvedModel != "deepseek-v4-pro-search" {
t.Fatalf("resolved model mismatch: got=%q", out.Standard.ResolvedModel)
}
if out.Standard.Thinking || !out.Standard.Search {
if !out.Standard.Thinking || !out.Standard.Search {
t.Fatalf("unexpected flags: thinking=%v search=%v", out.Standard.Thinking, out.Standard.Search)
}
}
func TestNormalizeClaudeRequestDisablesThinkingWhenRequested(t *testing.T) {
req := map[string]any{
"model": "claude-opus-4-6",
"messages": []any{
map[string]any{"role": "user", "content": "hello"},
},
"thinking": map[string]any{"type": "disabled"},
}
out, err := normalizeClaudeRequest(mockClaudeConfig{
aliases: map[string]string{
"claude-opus-4-6": "deepseek-v4-pro",
},
}, req)
if err != nil {
t.Fatalf("normalizeClaudeRequest error: %v", err)
}
if out.Standard.Thinking {
t.Fatalf("expected explicit Claude thinking disable to win")
}
}
func TestNormalizeClaudeRequestEnablesThinkingWhenRequested(t *testing.T) {
req := map[string]any{
"model": "claude-opus-4-6",

View File

@@ -3,12 +3,20 @@ package claude
import (
"bytes"
"encoding/json"
"errors"
"fmt"
"io"
"net/http"
"net/http/httptest"
"strings"
"time"
"ds2api/internal/auth"
"ds2api/internal/completionruntime"
"ds2api/internal/config"
claudefmt "ds2api/internal/format/claude"
"ds2api/internal/httpapi/requestbody"
"ds2api/internal/promptcompat"
streamengine "ds2api/internal/stream"
"ds2api/internal/translatorcliproxy"
"ds2api/internal/util"
@@ -20,20 +28,99 @@ func (h *Handler) Messages(w http.ResponseWriter, r *http.Request) {
if strings.TrimSpace(r.Header.Get("anthropic-version")) == "" {
r.Header.Set("anthropic-version", "2023-06-01")
}
if h.OpenAI == nil {
writeClaudeError(w, http.StatusInternalServerError, "OpenAI proxy backend unavailable.")
if isClaudeVercelProxyRequest(r) && h.proxyViaOpenAI(w, r, h.Store) {
return
}
if h.proxyViaOpenAI(w, r, h.Store) {
if h.Auth == nil || h.DS == nil {
if h.OpenAI != nil && h.proxyViaOpenAI(w, r, h.Store) {
return
}
writeClaudeError(w, http.StatusInternalServerError, "Claude runtime backend unavailable.")
return
}
writeClaudeError(w, http.StatusBadGateway, "Failed to proxy Claude request.")
if h.handleClaudeDirect(w, r) {
return
}
writeClaudeError(w, http.StatusBadGateway, "Failed to handle Claude request.")
}
func isClaudeVercelProxyRequest(r *http.Request) bool {
if r == nil || r.URL == nil {
return false
}
return strings.TrimSpace(r.URL.Query().Get("__stream_prepare")) == "1" ||
strings.TrimSpace(r.URL.Query().Get("__stream_release")) == "1"
}
func (h *Handler) handleClaudeDirect(w http.ResponseWriter, r *http.Request) bool {
raw, err := io.ReadAll(r.Body)
if err != nil {
if errors.Is(err, requestbody.ErrInvalidUTF8Body) {
writeClaudeError(w, http.StatusBadRequest, "invalid json")
} else {
writeClaudeError(w, http.StatusBadRequest, "invalid body")
}
return true
}
var req map[string]any
if err := json.Unmarshal(raw, &req); err != nil {
writeClaudeError(w, http.StatusBadRequest, "invalid json")
return true
}
norm, err := normalizeClaudeRequest(h.Store, req)
if err != nil {
writeClaudeError(w, http.StatusBadRequest, err.Error())
return true
}
exposeThinking := norm.Standard.Thinking
a, err := h.Auth.Determine(r)
if err != nil {
writeClaudeError(w, http.StatusUnauthorized, err.Error())
return true
}
defer h.Auth.Release(a)
if norm.Standard.Stream {
h.handleClaudeDirectStream(w, r, a, norm.Standard)
return true
}
result, outErr := completionruntime.ExecuteNonStreamWithRetry(r.Context(), h.DS, a, norm.Standard, completionruntime.Options{
StripReferenceMarkers: stripReferenceMarkersEnabled(),
RetryEnabled: true,
CurrentInputFile: h.Store,
})
if outErr != nil {
writeClaudeError(w, outErr.Status, outErr.Message)
return true
}
writeJSON(w, http.StatusOK, claudefmt.BuildMessageResponseFromTurn(
fmt.Sprintf("msg_%d", time.Now().UnixNano()),
norm.Standard.ResponseModel,
result.Turn,
exposeThinking,
))
return true
}
func (h *Handler) handleClaudeDirectStream(w http.ResponseWriter, r *http.Request, a *auth.RequestAuth, stdReq promptcompat.StandardRequest) {
start, outErr := completionruntime.StartCompletion(r.Context(), h.DS, a, stdReq, completionruntime.Options{
CurrentInputFile: h.Store,
})
if outErr != nil {
writeClaudeError(w, outErr.Status, outErr.Message)
return
}
streamReq := start.Request
h.handleClaudeStreamRealtime(w, r, start.Response, streamReq.ResponseModel, streamReq.Messages, streamReq.Thinking, streamReq.Search, streamReq.ToolNames, streamReq.ToolsRaw)
}
func (h *Handler) proxyViaOpenAI(w http.ResponseWriter, r *http.Request, store ConfigReader) bool {
raw, err := io.ReadAll(r.Body)
if err != nil {
writeClaudeError(w, http.StatusBadRequest, "invalid body")
if errors.Is(err, requestbody.ErrInvalidUTF8Body) {
writeClaudeError(w, http.StatusBadRequest, "invalid json")
} else {
writeClaudeError(w, http.StatusBadRequest, "invalid body")
}
return true
}
var req map[string]any
@@ -52,7 +139,7 @@ func (h *Handler) proxyViaOpenAI(w http.ResponseWriter, r *http.Request, store C
}
}
translatedReq := translatorcliproxy.ToOpenAI(sdktranslator.FormatClaude, translateModel, raw, stream)
translatedReq, exposeThinking := applyClaudeThinkingPolicyToOpenAIRequest(translatedReq, req, stream)
translatedReq, exposeThinking := applyClaudeThinkingPolicyToOpenAIRequest(translatedReq, req)
isVercelPrepare := strings.TrimSpace(r.URL.Query().Get("__stream_prepare")) == "1"
isVercelRelease := strings.TrimSpace(r.URL.Query().Get("__stream_release")) == "1"
@@ -127,7 +214,7 @@ func (h *Handler) proxyViaOpenAI(w http.ResponseWriter, r *http.Request, store C
return true
}
func applyClaudeThinkingPolicyToOpenAIRequest(translated []byte, original map[string]any, stream bool) ([]byte, bool) {
func applyClaudeThinkingPolicyToOpenAIRequest(translated []byte, original map[string]any) ([]byte, bool) {
req := map[string]any{}
if err := json.Unmarshal(translated, &req); err != nil {
return translated, false
@@ -137,7 +224,7 @@ func applyClaudeThinkingPolicyToOpenAIRequest(translated []byte, original map[st
if _, translatedHasOverride := util.ResolveThinkingOverride(req); translatedHasOverride {
return translated, false
}
enabled = !stream
enabled = true
}
typ := "disabled"
if enabled {
@@ -146,9 +233,9 @@ func applyClaudeThinkingPolicyToOpenAIRequest(translated []byte, original map[st
req["thinking"] = map[string]any{"type": typ}
out, err := json.Marshal(req)
if err != nil {
return translated, ok && enabled
return translated, enabled
}
return out, ok && enabled
return out, enabled
}
func stripClaudeThinkingBlocks(raw []byte) []byte {
@@ -203,7 +290,7 @@ func (h *Handler) handleClaudeStreamRealtime(w http.ResponseWriter, r *http.Requ
messages,
thinkingEnabled,
searchEnabled,
h.compatStripReferenceMarkers(),
stripReferenceMarkersEnabled(),
toolNames,
toolsRaw,
buildClaudePromptTokenText(messages, thinkingEnabled),

View File

@@ -8,6 +8,7 @@ import (
"ds2api/internal/config"
dsprotocol "ds2api/internal/deepseek/protocol"
"ds2api/internal/textclean"
"ds2api/internal/util"
)
@@ -21,11 +22,8 @@ type Handler struct {
OpenAI OpenAIChatRunner
}
func (h *Handler) compatStripReferenceMarkers() bool {
if h == nil || h.Store == nil {
return true
}
return h.Store.CompatStripReferenceMarkers()
func stripReferenceMarkersEnabled() bool {
return textclean.StripReferenceMarkersEnabled()
}
var (

View File

@@ -28,6 +28,18 @@ func makeClaudeSSEHTTPResponse(lines ...string) *http.Response {
}
}
func makeClaudeContentLine(t *testing.T, text string) string {
t.Helper()
line, err := json.Marshal(map[string]any{
"p": "response/content",
"v": text,
})
if err != nil {
t.Fatalf("marshal content line failed: %v", err)
}
return "data: " + string(line)
}
func parseClaudeFrames(t *testing.T, body string) []claudeFrame {
t.Helper()
chunks := strings.Split(body, "\n\n")
@@ -71,6 +83,17 @@ func findClaudeFrames(frames []claudeFrame, event string) []claudeFrame {
return out
}
func collectClaudeTextDeltas(frames []claudeFrame) string {
var combined strings.Builder
for _, f := range findClaudeFrames(frames, "content_block_delta") {
delta, _ := f.Payload["delta"].(map[string]any)
if delta["type"] == "text_delta" {
combined.WriteString(asString(delta["text"]))
}
}
return combined.String()
}
func TestHandleClaudeStreamRealtimeTextIncrementsWithEventHeaders(t *testing.T) {
h := &Handler{}
resp := makeClaudeSSEHTTPResponse(
@@ -96,8 +119,8 @@ func TestHandleClaudeStreamRealtimeTextIncrementsWithEventHeaders(t *testing.T)
frames := parseClaudeFrames(t, body)
deltas := findClaudeFrames(frames, "content_block_delta")
if len(deltas) < 2 {
t.Fatalf("expected at least 2 text deltas, got=%d body=%s", len(deltas), body)
if len(deltas) < 1 {
t.Fatalf("expected at least 1 text delta, got=%d body=%s", len(deltas), body)
}
combined := strings.Builder{}
for _, f := range deltas {
@@ -111,6 +134,52 @@ func TestHandleClaudeStreamRealtimeTextIncrementsWithEventHeaders(t *testing.T)
}
}
func TestHandleClaudeStreamRealtimeToolBufferedPlainTextDoesNotRepeatFinalText(t *testing.T) {
h := &Handler{}
want := "明白\n\nBash\nIN\npwd\nOUT\nok"
resp := makeClaudeSSEHTTPResponse(
makeClaudeContentLine(t, "明"),
makeClaudeContentLine(t, "白\n\nBash\nIN\npwd\n"),
makeClaudeContentLine(t, "OUT\nok"),
`data: [DONE]`,
)
rec := httptest.NewRecorder()
req := httptest.NewRequest(http.MethodPost, "/anthropic/v1/messages", nil)
h.handleClaudeStreamRealtime(rec, req, resp, "claude-sonnet-4-5", []any{map[string]any{"role": "user", "content": "use tool"}}, false, false, []string{"Bash"}, nil)
frames := parseClaudeFrames(t, rec.Body.String())
if got := collectClaudeTextDeltas(frames); got != want {
t.Fatalf("unexpected combined text: got %q want %q body=%s", got, want, rec.Body.String())
}
}
func TestHandleClaudeStreamRealtimeTrimsContinuationReplay(t *testing.T) {
h := &Handler{}
prefix := strings.Repeat("A", 40)
resp := makeClaudeSSEHTTPResponse(
`data: {"p":"response/content","v":"`+prefix+`"}`,
`data: {"p":"response/content","v":"`+prefix+` tail"}`,
`data: [DONE]`,
)
rec := httptest.NewRecorder()
req := httptest.NewRequest(http.MethodPost, "/anthropic/v1/messages", nil)
h.handleClaudeStreamRealtime(rec, req, resp, "claude-sonnet-4-5", []any{map[string]any{"role": "user", "content": "hi"}}, false, false, nil, nil)
frames := parseClaudeFrames(t, rec.Body.String())
combined := strings.Builder{}
for _, f := range findClaudeFrames(frames, "content_block_delta") {
delta, _ := f.Payload["delta"].(map[string]any)
if delta["type"] == "text_delta" {
combined.WriteString(asString(delta["text"]))
}
}
if got, want := combined.String(), prefix+" tail"; got != want {
t.Fatalf("unexpected combined text: got %q want %q body=%s", got, want, rec.Body.String())
}
}
func TestHandleClaudeStreamRealtimeThinkingDelta(t *testing.T) {
h := &Handler{}
resp := makeClaudeSSEHTTPResponse(

View File

@@ -14,7 +14,8 @@ type claudeProxyStoreStub struct {
func (s claudeProxyStoreStub) ModelAliases() map[string]string { return s.aliases }
func (claudeProxyStoreStub) CompatStripReferenceMarkers() bool { return true }
func (claudeProxyStoreStub) CurrentInputFileEnabled() bool { return true }
func (claudeProxyStoreStub) CurrentInputFileMinChars() int { return 0 }
type openAIProxyStub struct {
status int
@@ -166,7 +167,7 @@ func TestClaudeProxyViaOpenAIEnablesThinkingWhenRequested(t *testing.T) {
}
}
func TestClaudeProxyViaOpenAIKeepsStreamDefaultThinkingDisabled(t *testing.T) {
func TestClaudeProxyViaOpenAIEnablesStreamThinkingByDefault(t *testing.T) {
openAI := &openAIProxyCaptureStub{}
h := &Handler{
Store: claudeProxyStoreStub{aliases: map[string]string{"claude-sonnet-4-6": "deepseek-v4-flash"}},
@@ -178,12 +179,12 @@ func TestClaudeProxyViaOpenAIKeepsStreamDefaultThinkingDisabled(t *testing.T) {
h.Messages(rec, req)
thinking, _ := openAI.seenReq["thinking"].(map[string]any)
if thinking["type"] != "disabled" {
t.Fatalf("expected Claude stream default to keep downstream thinking disabled, got %#v", openAI.seenReq)
if thinking["type"] != "enabled" {
t.Fatalf("expected Claude stream default to enable downstream thinking, got %#v", openAI.seenReq)
}
}
func TestClaudeProxyViaOpenAIStripsThinkingBlocksFromNonStreamResponse(t *testing.T) {
func TestClaudeProxyViaOpenAIExposesThinkingBlocksByDefault(t *testing.T) {
body := `{"id":"chatcmpl_1","object":"chat.completion","created":1,"model":"claude-sonnet-4-5","choices":[{"index":0,"message":{"role":"assistant","content":null,"reasoning_content":"internal reasoning","tool_calls":[{"id":"call_1","type":"function","function":{"name":"search","arguments":"{\"q\":\"x\"}"}}]},"finish_reason":"tool_calls"}],"usage":{"prompt_tokens":1,"completion_tokens":1,"total_tokens":2}}`
h := &Handler{OpenAI: openAIProxyStub{status: 200, body: body}}
req := httptest.NewRequest(http.MethodPost, "/anthropic/v1/messages", strings.NewReader(`{"model":"claude-sonnet-4-5","messages":[{"role":"user","content":"hi"}],"stream":false}`))
@@ -195,14 +196,31 @@ func TestClaudeProxyViaOpenAIStripsThinkingBlocksFromNonStreamResponse(t *testin
t.Fatalf("unexpected status: %d body=%s", rec.Code, rec.Body.String())
}
got := rec.Body.String()
if strings.Contains(got, `"type":"thinking"`) {
t.Fatalf("expected converted Claude response to strip thinking block, got %s", got)
if !strings.Contains(got, `"type":"thinking"`) {
t.Fatalf("expected converted Claude response to expose thinking block, got %s", got)
}
if !strings.Contains(got, `"tool_use"`) {
t.Fatalf("expected converted Claude response to preserve tool_use, got %s", got)
}
}
func TestClaudeProxyViaOpenAIStripsThinkingBlocksWhenDisabled(t *testing.T) {
body := `{"id":"chatcmpl_1","object":"chat.completion","created":1,"model":"claude-sonnet-4-5","choices":[{"index":0,"message":{"role":"assistant","content":"ok","reasoning_content":"internal reasoning"},"finish_reason":"stop"}],"usage":{"prompt_tokens":1,"completion_tokens":1,"total_tokens":2}}`
h := &Handler{OpenAI: openAIProxyStub{status: 200, body: body}}
req := httptest.NewRequest(http.MethodPost, "/anthropic/v1/messages", strings.NewReader(`{"model":"claude-sonnet-4-5","messages":[{"role":"user","content":"hi"}],"thinking":{"type":"disabled"},"stream":false}`))
rec := httptest.NewRecorder()
h.Messages(rec, req)
if rec.Code != http.StatusOK {
t.Fatalf("unexpected status: %d body=%s", rec.Code, rec.Body.String())
}
got := rec.Body.String()
if strings.Contains(got, `"type":"thinking"`) {
t.Fatalf("expected disabled thinking to strip thinking block, got %s", got)
}
}
func TestClaudeProxyTranslatesInlineImageToOpenAIDataURL(t *testing.T) {
openAI := &openAIProxyCaptureStub{}
h := &Handler{OpenAI: openAI}

View File

@@ -32,11 +32,11 @@ func normalizeClaudeRequest(store ConfigReader, req map[string]any) (claudeNorma
dsPayload := convertClaudeToDeepSeek(payload, store)
dsModel, _ := dsPayload["model"].(string)
_, searchEnabled, ok := config.GetModelConfig(dsModel)
defaultThinkingEnabled, searchEnabled, ok := config.GetModelConfig(dsModel)
if !ok {
searchEnabled = false
}
thinkingEnabled := util.ResolveThinkingEnabled(req, false)
thinkingEnabled := util.ResolveThinkingEnabled(req, defaultThinkingEnabled)
if config.IsNoThinkingModel(dsModel) {
thinkingEnabled = false
}

View File

@@ -8,6 +8,8 @@ import (
"ds2api/internal/sse"
streamengine "ds2api/internal/stream"
"ds2api/internal/toolcall"
"ds2api/internal/toolstream"
)
type claudeStreamRuntime struct {
@@ -30,11 +32,18 @@ type claudeStreamRuntime struct {
thinking strings.Builder
text strings.Builder
sieve toolstream.State
rawText strings.Builder
rawThinking strings.Builder
toolDetectionThinking strings.Builder
toolCallsDetected bool
nextBlockIndex int
thinkingBlockOpen bool
thinkingBlockIndex int
textBlockOpen bool
textBlockIndex int
textEmitted bool
ended bool
upstreamErr string
}
@@ -84,8 +93,28 @@ func (s *claudeStreamRuntime) onParsed(parsed sse.LineResult) streamengine.Parse
}
contentSeen := false
for _, p := range parsed.ToolDetectionThinkingParts {
trimmed := sse.TrimContinuationOverlapFromBuilder(&s.toolDetectionThinking, p.Text)
if trimmed != "" {
s.toolDetectionThinking.WriteString(trimmed)
}
}
for _, p := range parsed.Parts {
cleanedText := cleanVisibleOutput(p.Text, s.stripReferenceMarkers)
var rawTrimmed string
if p.Type == "thinking" {
rawTrimmed = sse.TrimContinuationOverlapFromBuilder(&s.rawThinking, p.Text)
} else {
rawTrimmed = sse.TrimContinuationOverlapFromBuilder(&s.rawText, p.Text)
}
if rawTrimmed == "" {
continue
}
if p.Type == "thinking" {
s.rawThinking.WriteString(rawTrimmed)
} else {
s.rawText.WriteString(rawTrimmed)
}
cleanedText := cleanVisibleOutput(rawTrimmed, s.stripReferenceMarkers)
if cleanedText == "" {
continue
}
@@ -98,7 +127,7 @@ func (s *claudeStreamRuntime) onParsed(parsed sse.LineResult) streamengine.Parse
if !s.thinkingEnabled {
continue
}
trimmed := sse.TrimContinuationOverlap(s.thinking.String(), cleanedText)
trimmed := sse.TrimContinuationOverlapFromBuilder(&s.thinking, cleanedText)
if trimmed == "" {
continue
}
@@ -128,44 +157,80 @@ func (s *claudeStreamRuntime) onParsed(parsed sse.LineResult) streamengine.Parse
continue
}
trimmed := sse.TrimContinuationOverlap(s.text.String(), cleanedText)
if trimmed == "" {
continue
}
s.text.WriteString(trimmed)
if s.bufferToolContent {
if hasUnclosedCodeFence(s.text.String()) {
continue
s.text.WriteString(cleanedText)
if !s.bufferToolContent {
s.closeThinkingBlock()
if !s.textBlockOpen {
s.textBlockIndex = s.nextBlockIndex
s.nextBlockIndex++
s.send("content_block_start", map[string]any{
"type": "content_block_start",
"index": s.textBlockIndex,
"content_block": map[string]any{
"type": "text",
"text": "",
},
})
s.textBlockOpen = true
}
continue
}
s.closeThinkingBlock()
if !s.textBlockOpen {
s.textBlockIndex = s.nextBlockIndex
s.nextBlockIndex++
s.send("content_block_start", map[string]any{
"type": "content_block_start",
s.send("content_block_delta", map[string]any{
"type": "content_block_delta",
"index": s.textBlockIndex,
"content_block": map[string]any{
"type": "text",
"text": "",
"delta": map[string]any{
"type": "text_delta",
"text": cleanedText,
},
})
s.textBlockOpen = true
s.textEmitted = true
continue
}
events := toolstream.ProcessChunk(&s.sieve, rawTrimmed, s.toolNames)
for _, evt := range events {
if len(evt.ToolCalls) > 0 {
s.closeTextBlock()
s.toolCallsDetected = true
normalized := toolcall.NormalizeParsedToolCallsForSchemas(evt.ToolCalls, s.toolsRaw)
for _, tc := range normalized {
idx := s.nextBlockIndex
s.nextBlockIndex++
s.sendToolUseBlock(idx, tc)
}
continue
}
if evt.Content == "" {
continue
}
cleaned := cleanVisibleOutput(evt.Content, s.stripReferenceMarkers)
if cleaned == "" || (s.searchEnabled && sse.IsCitation(cleaned)) {
continue
}
s.closeThinkingBlock()
if !s.textBlockOpen {
s.textBlockIndex = s.nextBlockIndex
s.nextBlockIndex++
s.send("content_block_start", map[string]any{
"type": "content_block_start",
"index": s.textBlockIndex,
"content_block": map[string]any{
"type": "text",
"text": "",
},
})
s.textBlockOpen = true
}
s.send("content_block_delta", map[string]any{
"type": "content_block_delta",
"index": s.textBlockIndex,
"delta": map[string]any{
"type": "text_delta",
"text": cleaned,
},
})
s.textEmitted = true
}
s.send("content_block_delta", map[string]any{
"type": "content_block_delta",
"index": s.textBlockIndex,
"delta": map[string]any{
"type": "text_delta",
"text": trimmed,
},
})
}
return streamengine.ParsedDecision{ContentSeen: contentSeen}
}
func hasUnclosedCodeFence(text string) bool {
return strings.Count(text, "```")%2 == 1
}

View File

@@ -1,13 +1,15 @@
package claude
import (
"ds2api/internal/assistantturn"
"ds2api/internal/sse"
"ds2api/internal/toolcall"
"ds2api/internal/toolstream"
"encoding/json"
"fmt"
"time"
streamengine "ds2api/internal/stream"
"ds2api/internal/util"
)
func (s *claudeStreamRuntime) closeThinkingBlock() {
@@ -34,6 +36,32 @@ func (s *claudeStreamRuntime) closeTextBlock() {
s.textBlockIndex = -1
}
func (s *claudeStreamRuntime) sendToolUseBlock(idx int, tc toolcall.ParsedToolCall) {
s.send("content_block_start", map[string]any{
"type": "content_block_start",
"index": idx,
"content_block": map[string]any{
"type": "tool_use",
"id": fmt.Sprintf("toolu_%d_%d", time.Now().Unix(), idx),
"name": tc.Name,
"input": map[string]any{},
},
})
inputBytes, _ := json.Marshal(tc.Input)
s.send("content_block_delta", map[string]any{
"type": "content_block_delta",
"index": idx,
"delta": map[string]any{
"type": "input_json_delta",
"partial_json": string(inputBytes),
},
})
s.send("content_block_stop", map[string]any{
"type": "content_block_stop",
"index": idx,
})
}
func (s *claudeStreamRuntime) finalize(stopReason string) {
if s.ended {
return
@@ -41,49 +69,83 @@ func (s *claudeStreamRuntime) finalize(stopReason string) {
s.ended = true
s.closeThinkingBlock()
s.closeTextBlock()
finalThinking := s.thinking.String()
finalText := cleanVisibleOutput(s.text.String(), s.stripReferenceMarkers)
if s.bufferToolContent {
detected := toolcall.ParseStandaloneToolCalls(finalText, s.toolNames)
if len(detected) == 0 && finalText == "" && finalThinking != "" {
detected = toolcall.ParseStandaloneToolCalls(finalThinking, s.toolNames)
}
if len(detected) > 0 {
detected = toolcall.NormalizeParsedToolCallsForSchemas(detected, s.toolsRaw)
stopReason = "tool_use"
for i, tc := range detected {
idx := s.nextBlockIndex + i
s.send("content_block_start", map[string]any{
"type": "content_block_start",
"index": idx,
"content_block": map[string]any{
"type": "tool_use",
"id": fmt.Sprintf("toolu_%d_%d", time.Now().Unix(), idx),
"name": tc.Name,
"input": map[string]any{},
},
})
inputBytes, _ := json.Marshal(tc.Input)
for _, evt := range toolstream.Flush(&s.sieve, s.toolNames) {
if len(evt.ToolCalls) > 0 {
s.closeTextBlock()
s.toolCallsDetected = true
normalized := toolcall.NormalizeParsedToolCallsForSchemas(evt.ToolCalls, s.toolsRaw)
for _, tc := range normalized {
idx := s.nextBlockIndex
s.nextBlockIndex++
s.sendToolUseBlock(idx, tc)
}
continue
}
if evt.Content != "" {
cleaned := cleanVisibleOutput(evt.Content, s.stripReferenceMarkers)
if cleaned == "" || (s.searchEnabled && sse.IsCitation(cleaned)) {
continue
}
if !s.textBlockOpen {
s.textBlockIndex = s.nextBlockIndex
s.nextBlockIndex++
s.send("content_block_start", map[string]any{
"type": "content_block_start",
"index": s.textBlockIndex,
"content_block": map[string]any{
"type": "text",
"text": "",
},
})
s.textBlockOpen = true
}
s.send("content_block_delta", map[string]any{
"type": "content_block_delta",
"index": idx,
"index": s.textBlockIndex,
"delta": map[string]any{
"type": "input_json_delta",
"partial_json": string(inputBytes),
"type": "text_delta",
"text": cleaned,
},
})
s.send("content_block_stop", map[string]any{
"type": "content_block_stop",
"index": idx,
})
s.textEmitted = true
}
s.nextBlockIndex += len(detected)
} else if finalText != "" {
}
}
s.closeTextBlock()
turn := assistantturn.BuildTurnFromStreamSnapshot(assistantturn.StreamSnapshot{
RawText: s.rawText.String(),
VisibleText: s.text.String(),
RawThinking: s.rawThinking.String(),
VisibleThinking: s.thinking.String(),
DetectionThinking: s.toolDetectionThinking.String(),
AlreadyEmittedCalls: s.toolCallsDetected,
AlreadyEmittedToolRaw: s.toolCallsDetected,
}, assistantturn.BuildOptions{
Model: s.model,
Prompt: s.promptTokenText,
SearchEnabled: s.searchEnabled,
StripReferenceMarkers: s.stripReferenceMarkers,
ToolNames: s.toolNames,
ToolsRaw: s.toolsRaw,
})
finalText := turn.Text
outcome := assistantturn.FinalizeTurn(turn, assistantturn.FinalizeOptions{
AlreadyEmittedToolCalls: s.toolCallsDetected,
})
if s.bufferToolContent && !s.toolCallsDetected {
if len(turn.ToolCalls) > 0 {
stopReason = "tool_use"
for _, tc := range turn.ToolCalls {
idx := s.nextBlockIndex
s.nextBlockIndex++
s.sendToolUseBlock(idx, tc)
}
} else if finalText != "" && !s.textEmitted {
idx := s.nextBlockIndex
s.nextBlockIndex++
s.send("content_block_start", map[string]any{
@@ -102,6 +164,7 @@ func (s *claudeStreamRuntime) finalize(stopReason string) {
"text": finalText,
},
})
s.textEmitted = true
s.send("content_block_stop", map[string]any{
"type": "content_block_stop",
"index": idx,
@@ -109,7 +172,10 @@ func (s *claudeStreamRuntime) finalize(stopReason string) {
}
}
outputTokens := util.CountOutputTokens(finalThinking, s.model) + util.CountOutputTokens(finalText, s.model)
if outcome.HasToolCalls {
stopReason = "tool_use"
}
s.send("message_delta", map[string]any{
"type": "message_delta",
"delta": map[string]any{
@@ -117,7 +183,7 @@ func (s *claudeStreamRuntime) finalize(stopReason string) {
"stop_sequence": nil,
},
"usage": map[string]any{
"output_tokens": outputTokens,
"output_tokens": outcome.Usage.OutputTokens,
},
})
s.send("message_stop", map[string]any{"type": "message_stop"})

View File

@@ -23,7 +23,8 @@ type streamStatusClaudeStoreStub struct{}
func (streamStatusClaudeStoreStub) ModelAliases() map[string]string { return nil }
func (streamStatusClaudeStoreStub) CompatStripReferenceMarkers() bool { return true }
func (streamStatusClaudeStoreStub) CurrentInputFileEnabled() bool { return true }
func (streamStatusClaudeStoreStub) CurrentInputFileMinChars() int { return 0 }
func captureClaudeStatusMiddleware(statuses *[]int) func(http.Handler) http.Handler {
return func(next http.Handler) http.Handler {

View File

@@ -33,6 +33,9 @@ func normalizeGeminiRequest(store ConfigReader, routeModel string, req map[strin
toolsRaw := convertGeminiTools(req["tools"])
finalPrompt, toolNames := promptcompat.BuildOpenAIPromptForAdapter(messagesRaw, toolsRaw, "", thinkingEnabled)
if len(toolNames) == 0 && len(toolsRaw) > 0 {
toolNames = []string{"__any_tool__"}
}
passThrough := collectGeminiPassThrough(req)
return promptcompat.StandardRequest{
@@ -42,6 +45,7 @@ func normalizeGeminiRequest(store ConfigReader, routeModel string, req map[strin
ResponseModel: requestedModel,
Messages: messagesRaw,
PromptTokenText: finalPrompt,
ToolsRaw: toolsRaw,
FinalPrompt: finalPrompt,
ToolNames: toolNames,
Stream: stream,

View File

@@ -17,12 +17,14 @@ type AuthResolver interface {
type DeepSeekCaller interface {
CreateSession(ctx context.Context, a *auth.RequestAuth, maxAttempts int) (string, error)
GetPow(ctx context.Context, a *auth.RequestAuth, maxAttempts int) (string, error)
UploadFile(ctx context.Context, a *auth.RequestAuth, req dsclient.UploadFileRequest, maxAttempts int) (*dsclient.UploadFileResult, error)
CallCompletion(ctx context.Context, a *auth.RequestAuth, payload map[string]any, powResp string, maxAttempts int) (*http.Response, error)
}
type ConfigReader interface {
ModelAliases() map[string]string
CompatStripReferenceMarkers() bool
CurrentInputFileEnabled() bool
CurrentInputFileMinChars() int
}
type OpenAIChatRunner interface {

View File

@@ -2,8 +2,8 @@ package gemini
import (
"bytes"
"ds2api/internal/toolcall"
"encoding/json"
"errors"
"io"
"net/http"
"net/http/httptest"
@@ -11,7 +11,13 @@ import (
"github.com/go-chi/chi/v5"
"ds2api/internal/assistantturn"
"ds2api/internal/auth"
"ds2api/internal/completionruntime"
"ds2api/internal/httpapi/requestbody"
"ds2api/internal/promptcompat"
"ds2api/internal/sse"
"ds2api/internal/toolcall"
"ds2api/internal/translatorcliproxy"
"ds2api/internal/util"
@@ -19,20 +25,94 @@ import (
)
func (h *Handler) handleGenerateContent(w http.ResponseWriter, r *http.Request, stream bool) {
if h.OpenAI == nil {
writeGeminiError(w, http.StatusInternalServerError, "OpenAI proxy backend unavailable.")
if isGeminiVercelProxyRequest(r) && h.proxyViaOpenAI(w, r, stream) {
return
}
if h.proxyViaOpenAI(w, r, stream) {
if h.Auth == nil || h.DS == nil {
if h.OpenAI != nil && h.proxyViaOpenAI(w, r, stream) {
return
}
writeGeminiError(w, http.StatusInternalServerError, "Gemini runtime backend unavailable.")
return
}
writeGeminiError(w, http.StatusBadGateway, "Failed to proxy Gemini request.")
if h.handleGeminiDirect(w, r, stream) {
return
}
writeGeminiError(w, http.StatusBadGateway, "Failed to handle Gemini request.")
}
func isGeminiVercelProxyRequest(r *http.Request) bool {
if r == nil || r.URL == nil {
return false
}
return strings.TrimSpace(r.URL.Query().Get("__stream_prepare")) == "1" ||
strings.TrimSpace(r.URL.Query().Get("__stream_release")) == "1"
}
func (h *Handler) handleGeminiDirect(w http.ResponseWriter, r *http.Request, stream bool) bool {
raw, err := io.ReadAll(r.Body)
if err != nil {
if errors.Is(err, requestbody.ErrInvalidUTF8Body) {
writeGeminiError(w, http.StatusBadRequest, "invalid json")
} else {
writeGeminiError(w, http.StatusBadRequest, "invalid body")
}
return true
}
routeModel := strings.TrimSpace(chi.URLParam(r, "model"))
var req map[string]any
if err := json.Unmarshal(raw, &req); err != nil {
writeGeminiError(w, http.StatusBadRequest, "invalid json")
return true
}
stdReq, err := normalizeGeminiRequest(h.Store, routeModel, req, stream)
if err != nil {
writeGeminiError(w, http.StatusBadRequest, err.Error())
return true
}
a, err := h.Auth.Determine(r)
if err != nil {
writeGeminiError(w, http.StatusUnauthorized, err.Error())
return true
}
defer h.Auth.Release(a)
if stream {
h.handleGeminiDirectStream(w, r, a, stdReq)
return true
}
result, outErr := completionruntime.ExecuteNonStreamWithRetry(r.Context(), h.DS, a, stdReq, completionruntime.Options{
StripReferenceMarkers: stripReferenceMarkersEnabled(),
RetryEnabled: true,
CurrentInputFile: h.Store,
})
if outErr != nil {
writeGeminiError(w, outErr.Status, outErr.Message)
return true
}
writeJSON(w, http.StatusOK, buildGeminiGenerateContentResponseFromTurn(result.Turn))
return true
}
func (h *Handler) handleGeminiDirectStream(w http.ResponseWriter, r *http.Request, a *auth.RequestAuth, stdReq promptcompat.StandardRequest) {
start, outErr := completionruntime.StartCompletion(r.Context(), h.DS, a, stdReq, completionruntime.Options{
CurrentInputFile: h.Store,
})
if outErr != nil {
writeGeminiError(w, outErr.Status, outErr.Message)
return
}
streamReq := start.Request
h.handleStreamGenerateContent(w, r, start.Response, streamReq.ResponseModel, streamReq.PromptTokenText, streamReq.Thinking, streamReq.Search, streamReq.ToolNames, streamReq.ToolsRaw)
}
func (h *Handler) proxyViaOpenAI(w http.ResponseWriter, r *http.Request, stream bool) bool {
raw, err := io.ReadAll(r.Body)
if err != nil {
writeGeminiError(w, http.StatusBadRequest, "invalid body")
if errors.Is(err, requestbody.ErrInvalidUTF8Body) {
writeGeminiError(w, http.StatusBadRequest, "invalid json")
} else {
writeGeminiError(w, http.StatusBadRequest, "invalid body")
}
return true
}
routeModel := strings.TrimSpace(chi.URLParam(r, "model"))
@@ -214,7 +294,7 @@ func (h *Handler) handleNonStreamGenerateContent(w http.ResponseWriter, resp *ht
}
result := sse.CollectStream(resp, thinkingEnabled, true)
stripReferenceMarkers := h.compatStripReferenceMarkers()
stripReferenceMarkers := stripReferenceMarkersEnabled()
writeJSON(w, http.StatusOK, buildGeminiGenerateContentResponse(
model,
finalPrompt,
@@ -244,6 +324,60 @@ func buildGeminiGenerateContentResponse(model, finalPrompt, finalThinking, final
}
}
func buildGeminiGenerateContentResponseFromTurn(turn assistantturn.Turn) map[string]any {
parts := buildGeminiPartsFromTurn(turn)
return map[string]any{
"candidates": []map[string]any{
{
"index": 0,
"content": map[string]any{
"role": "model",
"parts": parts,
},
"finishReason": "STOP",
},
},
"modelVersion": turn.Model,
"usageMetadata": map[string]any{
"promptTokenCount": turn.Usage.InputTokens,
"candidatesTokenCount": turn.Usage.OutputTokens,
"totalTokenCount": turn.Usage.TotalTokens,
},
}
}
func buildGeminiPartsFromTurn(turn assistantturn.Turn) []map[string]any {
thinkingPart := func() []map[string]any {
if turn.Thinking == "" {
return nil
}
return []map[string]any{{"text": turn.Thinking, "thought": true}}
}
if len(turn.ToolCalls) > 0 {
parts := thinkingPart()
if parts == nil {
parts = make([]map[string]any, 0, len(turn.ToolCalls))
}
for _, tc := range turn.ToolCalls {
parts = append(parts, map[string]any{
"functionCall": map[string]any{
"name": tc.Name,
"args": tc.Input,
},
})
}
return parts
}
parts := thinkingPart()
if turn.Text != "" {
parts = append(parts, map[string]any{"text": turn.Text})
}
if len(parts) == 0 {
parts = append(parts, map[string]any{"text": ""})
}
return parts
}
//nolint:unused // retained for native Gemini non-stream handling path.
func buildGeminiUsage(model, finalPrompt, finalThinking, finalText string) map[string]any {
promptTokens := util.CountPromptTokens(finalPrompt, model)
@@ -262,8 +396,17 @@ func buildGeminiPartsFromFinal(finalText, finalThinking string, toolNames []stri
if len(detected) == 0 && finalThinking != "" {
detected = toolcall.ParseToolCalls(finalThinking, toolNames)
}
thinkingPart := func() []map[string]any {
if finalThinking == "" {
return nil
}
return []map[string]any{{"text": finalThinking, "thought": true}}
}
if len(detected) > 0 {
parts := make([]map[string]any, 0, len(detected))
parts := thinkingPart()
if parts == nil {
parts = make([]map[string]any, 0, len(detected))
}
for _, tc := range detected {
parts = append(parts, map[string]any{
"functionCall": map[string]any{
@@ -275,9 +418,12 @@ func buildGeminiPartsFromFinal(finalText, finalThinking string, toolNames []stri
return parts
}
text := finalText
if text == "" {
text = finalThinking
parts := thinkingPart()
if finalText != "" {
parts = append(parts, map[string]any{"text": finalText})
}
return []map[string]any{{"text": text}}
if len(parts) == 0 {
parts = append(parts, map[string]any{"text": ""})
}
return parts
}

View File

@@ -5,6 +5,7 @@ import (
"github.com/go-chi/chi/v5"
"ds2api/internal/textclean"
"ds2api/internal/util"
)
@@ -18,11 +19,8 @@ type Handler struct {
}
//nolint:unused // used by native Gemini stream/non-stream runtime helpers.
func (h *Handler) compatStripReferenceMarkers() bool {
if h == nil || h.Store == nil {
return true
}
return h.Store.CompatStripReferenceMarkers()
func stripReferenceMarkersEnabled() bool {
return textclean.StripReferenceMarkersEnabled()
}
func RegisterRoutes(r chi.Router, h *Handler) {

View File

@@ -7,13 +7,14 @@ import (
"strings"
"time"
"ds2api/internal/assistantturn"
dsprotocol "ds2api/internal/deepseek/protocol"
"ds2api/internal/sse"
streamengine "ds2api/internal/stream"
)
//nolint:unused // retained for native Gemini stream handling path.
func (h *Handler) handleStreamGenerateContent(w http.ResponseWriter, r *http.Request, resp *http.Response, model, finalPrompt string, thinkingEnabled, searchEnabled bool, toolNames []string) {
func (h *Handler) handleStreamGenerateContent(w http.ResponseWriter, r *http.Request, resp *http.Response, model, finalPrompt string, thinkingEnabled, searchEnabled bool, toolNames []string, toolsRaw any) {
defer func() { _ = resp.Body.Close() }()
if resp.StatusCode != http.StatusOK {
body, _ := io.ReadAll(resp.Body)
@@ -28,7 +29,7 @@ func (h *Handler) handleStreamGenerateContent(w http.ResponseWriter, r *http.Req
rc := http.NewResponseController(w)
_, canFlush := w.(http.Flusher)
runtime := newGeminiStreamRuntime(w, rc, canFlush, model, finalPrompt, thinkingEnabled, searchEnabled, h.compatStripReferenceMarkers(), toolNames)
runtime := newGeminiStreamRuntime(w, rc, canFlush, model, finalPrompt, thinkingEnabled, searchEnabled, stripReferenceMarkersEnabled(), toolNames, toolsRaw)
initialType := "text"
if thinkingEnabled {
@@ -64,9 +65,11 @@ type geminiStreamRuntime struct {
bufferContent bool
stripReferenceMarkers bool
toolNames []string
toolsRaw any
thinking strings.Builder
text strings.Builder
accumulator *assistantturn.Accumulator
contentFilter bool
responseMessageID int
}
//nolint:unused // retained for native Gemini stream handling path.
@@ -80,6 +83,7 @@ func newGeminiStreamRuntime(
searchEnabled bool,
stripReferenceMarkers bool,
toolNames []string,
toolsRaw any,
) *geminiStreamRuntime {
return &geminiStreamRuntime{
w: w,
@@ -92,6 +96,12 @@ func newGeminiStreamRuntime(
bufferContent: len(toolNames) > 0,
stripReferenceMarkers: stripReferenceMarkers,
toolNames: toolNames,
toolsRaw: toolsRaw,
accumulator: assistantturn.NewAccumulator(assistantturn.AccumulatorOptions{
ThinkingEnabled: thinkingEnabled,
SearchEnabled: searchEnabled,
StripReferenceMarkers: stripReferenceMarkers,
}),
}
}
@@ -111,35 +121,39 @@ func (s *geminiStreamRuntime) onParsed(parsed sse.LineResult) streamengine.Parse
if !parsed.Parsed {
return streamengine.ParsedDecision{}
}
if parsed.ResponseMessageID > 0 {
s.responseMessageID = parsed.ResponseMessageID
}
if parsed.ContentFilter || parsed.ErrorMessage != "" || parsed.Stop {
if parsed.ContentFilter {
s.contentFilter = true
}
return streamengine.ParsedDecision{Stop: true}
}
contentSeen := false
for _, p := range parsed.Parts {
cleanedText := cleanVisibleOutput(p.Text, s.stripReferenceMarkers)
if cleanedText == "" {
continue
}
if p.Type != "thinking" && s.searchEnabled && sse.IsCitation(cleanedText) {
continue
}
contentSeen = true
accumulated := s.accumulator.Apply(parsed)
for _, p := range accumulated.Parts {
if p.Type == "thinking" {
if s.thinkingEnabled {
trimmed := sse.TrimContinuationOverlap(s.thinking.String(), cleanedText)
if trimmed == "" {
continue
}
s.thinking.WriteString(trimmed)
if p.VisibleText == "" || s.bufferContent {
continue
}
s.sendChunk(map[string]any{
"candidates": []map[string]any{
{
"index": 0,
"content": map[string]any{
"role": "model",
"parts": []map[string]any{{"text": p.VisibleText, "thought": true}},
},
},
},
"modelVersion": s.model,
})
continue
}
trimmed := sse.TrimContinuationOverlap(s.text.String(), cleanedText)
if trimmed == "" {
if p.RawText == "" || p.CitationOnly || p.VisibleText == "" {
continue
}
s.text.WriteString(trimmed)
if s.bufferContent {
continue
}
@@ -149,23 +163,39 @@ func (s *geminiStreamRuntime) onParsed(parsed sse.LineResult) streamengine.Parse
"index": 0,
"content": map[string]any{
"role": "model",
"parts": []map[string]any{{"text": trimmed}},
"parts": []map[string]any{{"text": p.VisibleText}},
},
},
},
"modelVersion": s.model,
})
}
return streamengine.ParsedDecision{ContentSeen: contentSeen}
return streamengine.ParsedDecision{ContentSeen: accumulated.ContentSeen}
}
//nolint:unused // retained for native Gemini stream handling path.
func (s *geminiStreamRuntime) finalize() {
finalThinking := s.thinking.String()
finalText := cleanVisibleOutput(s.text.String(), s.stripReferenceMarkers)
rawText, text, rawThinking, thinking, detectionThinking := s.accumulator.Snapshot()
turn := assistantturn.BuildTurnFromStreamSnapshot(assistantturn.StreamSnapshot{
RawText: rawText,
VisibleText: text,
RawThinking: rawThinking,
VisibleThinking: thinking,
DetectionThinking: detectionThinking,
ContentFilter: s.contentFilter,
ResponseMessageID: s.responseMessageID,
}, assistantturn.BuildOptions{
Model: s.model,
Prompt: s.finalPrompt,
SearchEnabled: s.searchEnabled,
StripReferenceMarkers: s.stripReferenceMarkers,
ToolNames: s.toolNames,
ToolsRaw: s.toolsRaw,
})
outcome := assistantturn.FinalizeTurn(turn, assistantturn.FinalizeOptions{})
if s.bufferContent {
parts := buildGeminiPartsFromFinal(finalText, finalThinking, s.toolNames)
parts := buildGeminiPartsFromTurn(turn)
s.sendChunk(map[string]any{
"candidates": []map[string]any{
{
@@ -193,7 +223,11 @@ func (s *geminiStreamRuntime) finalize() {
"finishReason": "STOP",
},
},
"modelVersion": s.model,
"usageMetadata": buildGeminiUsage(s.model, s.finalPrompt, finalThinking, finalText),
"modelVersion": s.model,
"usageMetadata": map[string]any{
"promptTokenCount": outcome.Usage.InputTokens,
"candidatesTokenCount": outcome.Usage.OutputTokens,
"totalTokenCount": outcome.Usage.TotalTokens,
},
})
}

View File

@@ -13,12 +13,14 @@ import (
"github.com/go-chi/chi/v5"
"ds2api/internal/auth"
dsclient "ds2api/internal/deepseek/client"
)
type testGeminiConfig struct{}
func (testGeminiConfig) ModelAliases() map[string]string { return nil }
func (testGeminiConfig) CompatStripReferenceMarkers() bool { return true }
func (testGeminiConfig) ModelAliases() map[string]string { return nil }
func (testGeminiConfig) CurrentInputFileEnabled() bool { return true }
func (testGeminiConfig) CurrentInputFileMinChars() int { return 0 }
type testGeminiAuth struct {
a *auth.RequestAuth
@@ -44,22 +46,31 @@ func (testGeminiAuth) Release(_ *auth.RequestAuth) {}
//nolint:unused // reserved test double for native Gemini DS-call path coverage.
type testGeminiDS struct {
resp *http.Response
err error
resp *http.Response
err error
uploadCalls []dsclient.UploadFileRequest
payloads []map[string]any
}
//nolint:unused // reserved test double for native Gemini DS-call path coverage.
func (m testGeminiDS) CreateSession(_ context.Context, _ *auth.RequestAuth, _ int) (string, error) {
func (m *testGeminiDS) CreateSession(_ context.Context, _ *auth.RequestAuth, _ int) (string, error) {
return "session-id", nil
}
//nolint:unused // reserved test double for native Gemini DS-call path coverage.
func (m testGeminiDS) GetPow(_ context.Context, _ *auth.RequestAuth, _ int) (string, error) {
func (m *testGeminiDS) GetPow(_ context.Context, _ *auth.RequestAuth, _ int) (string, error) {
return "pow", nil
}
//nolint:unused // reserved test double for native Gemini DS-call path coverage.
func (m testGeminiDS) CallCompletion(_ context.Context, _ *auth.RequestAuth, _ map[string]any, _ string, _ int) (*http.Response, error) {
func (m *testGeminiDS) UploadFile(_ context.Context, _ *auth.RequestAuth, req dsclient.UploadFileRequest, _ int) (*dsclient.UploadFileResult, error) {
m.uploadCalls = append(m.uploadCalls, req)
return &dsclient.UploadFileResult{ID: "file-gemini-history"}, nil
}
//nolint:unused // reserved test double for native Gemini DS-call path coverage.
func (m *testGeminiDS) CallCompletion(_ context.Context, _ *auth.RequestAuth, payload map[string]any, _ string, _ int) (*http.Response, error) {
m.payloads = append(m.payloads, payload)
if m.err != nil {
return nil, m.err
}
@@ -123,6 +134,46 @@ func makeGeminiUpstreamResponse(lines ...string) *http.Response {
}
}
func TestGeminiDirectAppliesCurrentInputFile(t *testing.T) {
ds := &testGeminiDS{
resp: makeGeminiUpstreamResponse(`data: {"p":"response/content","v":"ok"}`),
}
h := &Handler{
Store: testGeminiConfig{},
Auth: testGeminiAuth{},
DS: ds,
}
reqBody := `{"contents":[{"role":"user","parts":[{"text":"hello from gemini"}]}]}`
req := httptest.NewRequest(http.MethodPost, "/v1beta/models/gemini-2.5-pro:generateContent", strings.NewReader(reqBody))
req.Header.Set("Content-Type", "application/json")
rec := httptest.NewRecorder()
r := chi.NewRouter()
RegisterRoutes(r, h)
r.ServeHTTP(rec, req)
if rec.Code != http.StatusOK {
t.Fatalf("expected 200, got %d body=%s", rec.Code, rec.Body.String())
}
if len(ds.uploadCalls) != 1 {
t.Fatalf("expected one current input upload, got %d", len(ds.uploadCalls))
}
if ds.uploadCalls[0].Filename != "DS2API_HISTORY.txt" {
t.Fatalf("unexpected upload filename: %q", ds.uploadCalls[0].Filename)
}
if len(ds.payloads) != 1 {
t.Fatalf("expected one completion payload, got %d", len(ds.payloads))
}
refIDs, _ := ds.payloads[0]["ref_file_ids"].([]any)
if len(refIDs) != 1 || refIDs[0] != "file-gemini-history" {
t.Fatalf("expected uploaded history ref id, got %#v", ds.payloads[0]["ref_file_ids"])
}
prompt, _ := ds.payloads[0]["prompt"].(string)
if !strings.Contains(prompt, "Continue from the latest state in the attached DS2API_HISTORY.txt context.") {
t.Fatalf("expected continuation prompt, got %q", prompt)
}
}
func TestGeminiRoutesRegistered(t *testing.T) {
h := &Handler{
Store: testGeminiConfig{},
@@ -257,6 +308,56 @@ func TestStreamGenerateContentEmitsSSE(t *testing.T) {
}
}
func TestNativeStreamGenerateContentEmitsThoughtParts(t *testing.T) {
h := &Handler{}
resp := makeGeminiUpstreamResponse(
`data: {"p":"response/thinking_content","v":"think"}`,
`data: {"p":"response/content","v":"answer"}`,
`data: [DONE]`,
)
rec := httptest.NewRecorder()
req := httptest.NewRequest(http.MethodPost, "/v1beta/models/gemini-2.5-pro:streamGenerateContent", nil)
h.handleStreamGenerateContent(rec, req, resp, "gemini-2.5-pro", "prompt", true, false, nil, nil)
frames := extractGeminiSSEFrames(t, rec.Body.String())
if len(frames) < 2 {
t.Fatalf("expected thought and text stream frames, body=%s", rec.Body.String())
}
var gotThought, gotText string
for _, frame := range frames {
for _, part := range geminiPartsFromFrame(frame) {
if part["thought"] == true {
gotThought += asString(part["text"])
} else {
gotText += asString(part["text"])
}
}
}
if gotThought != "think" {
t.Fatalf("expected thought part, got %q body=%s", gotThought, rec.Body.String())
}
if !strings.Contains(gotText, "answer") {
t.Fatalf("expected text part answer, got %q body=%s", gotText, rec.Body.String())
}
}
func TestBuildGeminiPartsFromFinalIncludesThoughtPart(t *testing.T) {
parts := buildGeminiPartsFromFinal("answer", "think", nil)
if len(parts) != 2 {
t.Fatalf("expected thought + answer parts, got %#v", parts)
}
if parts[0]["thought"] != true || parts[0]["text"] != "think" {
t.Fatalf("expected first part to be thought, got %#v", parts[0])
}
if _, ok := parts[1]["thought"]; ok {
t.Fatalf("expected second part to be visible text, got %#v", parts[1])
}
if parts[1]["text"] != "answer" {
t.Fatalf("expected answer text, got %#v", parts[1])
}
}
func TestGeminiProxyTranslatesInlineImageToOpenAIDataURL(t *testing.T) {
openAI := &geminiOpenAISuccessStub{}
h := &Handler{Store: testGeminiConfig{}, OpenAI: openAI}
@@ -396,3 +497,21 @@ func extractGeminiSSEFrames(t *testing.T, body string) []map[string]any {
}
return out
}
func geminiPartsFromFrame(frame map[string]any) []map[string]any {
candidates, _ := frame["candidates"].([]any)
if len(candidates) == 0 {
return nil
}
c0, _ := candidates[0].(map[string]any)
content, _ := c0["content"].(map[string]any)
rawParts, _ := content["parts"].([]any)
parts := make([]map[string]any, 0, len(rawParts))
for _, raw := range rawParts {
part, _ := raw.(map[string]any)
if part != nil {
parts = append(parts, part)
}
}
return parts
}

View File

@@ -57,7 +57,7 @@ func blockChatHistoryDetailDir(t *testing.T, detailDir string) func() {
func TestChatCompletionsNonStreamPersistsHistory(t *testing.T) {
historyStore := newTestChatHistoryStore(t)
h := &Handler{
Store: mockOpenAIConfig{wideInput: true},
Store: mockOpenAIConfig{},
Auth: streamStatusAuthStub{},
DS: streamStatusDSStub{resp: makeOpenAISSEHTTPResponse(`data: {"p":"response/content","v":"hello world"}`, `data: [DONE]`)},
ChatHistory: historyStore,
@@ -216,7 +216,7 @@ func TestHandleStreamContextCancelledMarksHistoryStopped(t *testing.T) {
func TestChatCompletionsSkipsAdminWebUISource(t *testing.T) {
historyStore := newTestChatHistoryStore(t)
h := &Handler{
Store: mockOpenAIConfig{wideInput: true},
Store: mockOpenAIConfig{},
Auth: streamStatusAuthStub{},
DS: streamStatusDSStub{resp: makeOpenAISSEHTTPResponse(`data: {"p":"response/content","v":"hello world"}`, `data: [DONE]`)},
ChatHistory: historyStore,
@@ -248,7 +248,7 @@ func TestChatCompletionsSkipsHistoryWhenDisabled(t *testing.T) {
t.Fatalf("disable history store failed: %v", err)
}
h := &Handler{
Store: mockOpenAIConfig{wideInput: true},
Store: mockOpenAIConfig{},
Auth: streamStatusAuthStub{},
DS: streamStatusDSStub{resp: makeOpenAISSEHTTPResponse(`data: {"p":"response/content","v":"hello world"}`, `data: [DONE]`)},
ChatHistory: historyStore,
@@ -278,7 +278,6 @@ func TestChatCompletionsCurrentInputFilePersistsNeutralPrompt(t *testing.T) {
ds := &inlineUploadDSStub{}
h := &Handler{
Store: mockOpenAIConfig{
wideInput: true,
currentInputEnabled: true,
},
Auth: streamStatusAuthStub{},
@@ -311,16 +310,16 @@ func TestChatCompletionsCurrentInputFilePersistsNeutralPrompt(t *testing.T) {
if len(ds.uploadCalls) != 1 {
t.Fatalf("expected current input upload to happen, got %d", len(ds.uploadCalls))
}
if ds.uploadCalls[0].Filename != "history.txt" {
t.Fatalf("expected history.txt upload, got %q", ds.uploadCalls[0].Filename)
if ds.uploadCalls[0].Filename != "DS2API_HISTORY.txt" {
t.Fatalf("expected DS2API_HISTORY.txt upload, got %q", ds.uploadCalls[0].Filename)
}
if full.HistoryText != string(ds.uploadCalls[0].Data) {
t.Fatalf("expected uploaded current input file to be persisted in history text")
}
if len(full.Messages) != 1 {
t.Fatalf("expected neutral prompt to be the only persisted message, got %#v", full.Messages)
t.Fatalf("expected continuation prompt to be the only persisted message, got %#v", full.Messages)
}
if !strings.Contains(full.Messages[0].Content, "Answer the latest user request directly.") {
t.Fatalf("expected neutral prompt to be persisted, got %#v", full.Messages[0])
if !strings.Contains(full.Messages[0].Content, "Continue from the latest state in the attached DS2API_HISTORY.txt context.") {
t.Fatalf("expected continuation prompt to be persisted, got %#v", full.Messages[0])
}
}

View File

@@ -5,7 +5,10 @@ import (
"net/http"
"strings"
"ds2api/internal/assistantturn"
openaifmt "ds2api/internal/format/openai"
"ds2api/internal/httpapi/openai/shared"
"ds2api/internal/promptcompat"
"ds2api/internal/sse"
streamengine "ds2api/internal/stream"
"ds2api/internal/toolstream"
@@ -23,6 +26,7 @@ type chatStreamRuntime struct {
refFileTokens int
toolNames []string
toolsRaw any
toolChoice promptcompat.ToolChoicePolicy
thinkingEnabled bool
searchEnabled bool
@@ -34,15 +38,11 @@ type chatStreamRuntime struct {
toolCallsEmitted bool
toolCallsDoneEmitted bool
toolSieve toolstream.State
streamToolCallIDs map[int]string
streamToolNames map[int]string
rawThinking strings.Builder
thinking strings.Builder
toolDetectionThinking strings.Builder
rawText strings.Builder
text strings.Builder
responseMessageID int
toolSieve toolstream.State
streamToolCallIDs map[int]string
streamToolNames map[int]string
accumulator shared.StreamAccumulator
responseMessageID int
finalThinking string
finalText string
@@ -92,6 +92,7 @@ func newChatStreamRuntime(
stripReferenceMarkers bool,
toolNames []string,
toolsRaw any,
toolChoice promptcompat.ToolChoicePolicy,
bufferToolContent bool,
emitEarlyToolDeltas bool,
) *chatStreamRuntime {
@@ -105,6 +106,7 @@ func newChatStreamRuntime(
finalPrompt: finalPrompt,
toolNames: toolNames,
toolsRaw: toolsRaw,
toolChoice: toolChoice,
thinkingEnabled: thinkingEnabled,
searchEnabled: searchEnabled,
stripReferenceMarkers: stripReferenceMarkers,
@@ -112,6 +114,11 @@ func newChatStreamRuntime(
emitEarlyToolDeltas: emitEarlyToolDeltas,
streamToolCallIDs: map[int]string{},
streamToolNames: map[int]string{},
accumulator: shared.StreamAccumulator{
ThinkingEnabled: thinkingEnabled,
SearchEnabled: searchEnabled,
StripReferenceMarkers: stripReferenceMarkers,
},
}
}
@@ -120,7 +127,13 @@ func (s *chatStreamRuntime) sendKeepAlive() {
return
}
_, _ = s.w.Write([]byte(": keep-alive\n\n"))
_ = s.rc.Flush()
s.sendChunk(openaifmt.BuildChatStreamChunk(
s.completionID,
s.created,
s.model,
[]map[string]any{},
nil,
))
}
func (s *chatStreamRuntime) sendChunk(v any) {
@@ -173,6 +186,15 @@ func (s *chatStreamRuntime) sendFailedChunk(status int, message, code string) {
s.sendDone()
}
func (s *chatStreamRuntime) markContextCancelled() {
s.finalErrorStatus = 499
s.finalErrorMessage = "request context cancelled"
s.finalErrorCode = string(streamengine.StopReasonContextCancelled)
s.finalThinking = s.accumulator.Thinking.String()
s.finalText = cleanVisibleOutput(s.accumulator.Text.String(), s.stripReferenceMarkers)
s.finalFinishReason = string(streamengine.StopReasonContextCancelled)
}
func (s *chatStreamRuntime) resetStreamToolCallState() {
s.streamToolCallIDs = map[int]string{}
s.streamToolNames = map[int]string{}
@@ -182,16 +204,34 @@ func (s *chatStreamRuntime) finalize(finishReason string, deferEmptyOutput bool)
s.finalErrorStatus = 0
s.finalErrorMessage = ""
s.finalErrorCode = ""
finalThinking := s.thinking.String()
finalToolDetectionThinking := s.toolDetectionThinking.String()
finalText := cleanVisibleOutput(s.text.String(), s.stripReferenceMarkers)
s.finalThinking = finalThinking
s.finalText = finalText
detected := detectAssistantToolCalls(s.rawText.String(), finalText, s.rawThinking.String(), finalToolDetectionThinking, s.toolNames)
if len(detected.Calls) > 0 && !s.toolCallsDoneEmitted {
finishReason = "tool_calls"
finalThinking := s.accumulator.Thinking.String()
finalToolDetectionThinking := s.accumulator.ToolDetectionThinking.String()
finalText := s.accumulator.Text.String()
turn := assistantturn.BuildTurnFromStreamSnapshot(assistantturn.StreamSnapshot{
RawText: s.accumulator.RawText.String(),
VisibleText: finalText,
RawThinking: s.accumulator.RawThinking.String(),
VisibleThinking: finalThinking,
DetectionThinking: finalToolDetectionThinking,
ContentFilter: finishReason == "content_filter",
ResponseMessageID: s.responseMessageID,
AlreadyEmittedCalls: s.toolCallsEmitted,
AlreadyEmittedToolRaw: s.toolCallsDoneEmitted,
}, assistantturn.BuildOptions{
Model: s.model,
Prompt: s.finalPrompt,
RefFileTokens: s.refFileTokens,
SearchEnabled: s.searchEnabled,
StripReferenceMarkers: s.stripReferenceMarkers,
ToolNames: s.toolNames,
ToolsRaw: s.toolsRaw,
ToolChoice: s.toolChoice,
})
s.finalThinking = turn.Thinking
s.finalText = turn.Text
if len(turn.ToolCalls) > 0 && !s.toolCallsDoneEmitted {
s.sendDelta(map[string]any{
"tool_calls": formatFinalStreamToolCallsWithStableIDs(detected.Calls, s.streamToolCallIDs, s.toolsRaw),
"tool_calls": formatFinalStreamToolCallsWithStableIDs(turn.ToolCalls, s.streamToolCallIDs, s.toolsRaw),
})
s.toolCallsEmitted = true
s.toolCallsDoneEmitted = true
@@ -200,7 +240,6 @@ func (s *chatStreamRuntime) finalize(finishReason string, deferEmptyOutput bool)
for _, evt := range toolstream.Flush(&s.toolSieve, s.toolNames) {
if len(evt.ToolCalls) > 0 {
batch.flush()
finishReason = "tool_calls"
s.toolCallsEmitted = true
s.toolCallsDoneEmitted = true
s.sendDelta(map[string]any{
@@ -220,11 +259,11 @@ func (s *chatStreamRuntime) finalize(finishReason string, deferEmptyOutput bool)
batch.flush()
}
if len(detected.Calls) > 0 || s.toolCallsEmitted {
finishReason = "tool_calls"
}
if len(detected.Calls) == 0 && !s.toolCallsEmitted && strings.TrimSpace(finalText) == "" {
status, message, code := upstreamEmptyOutputDetail(finishReason == "content_filter", finalText, finalThinking)
outcome := assistantturn.FinalizeTurn(turn, assistantturn.FinalizeOptions{
AlreadyEmittedToolCalls: s.toolCallsEmitted || s.toolCallsDoneEmitted,
})
if outcome.ShouldFail {
status, message, code := outcome.Error.Status, outcome.Error.Message, outcome.Error.Code
if deferEmptyOutput {
s.finalErrorStatus = status
s.finalErrorMessage = message
@@ -234,14 +273,14 @@ func (s *chatStreamRuntime) finalize(finishReason string, deferEmptyOutput bool)
s.sendFailedChunk(status, message, code)
return true
}
usage := openaifmt.BuildChatUsageForModel(s.model, s.finalPrompt, finalThinking, finalText, s.refFileTokens)
s.finalFinishReason = finishReason
usage := assistantturn.OpenAIChatUsage(turn)
s.finalFinishReason = outcome.FinishReason
s.finalUsage = usage
s.sendChunk(openaifmt.BuildChatStreamChunk(
s.completionID,
s.created,
s.model,
[]map[string]any{openaifmt.BuildChatStreamFinishChoice(0, finishReason)},
[]map[string]any{openaifmt.BuildChatStreamFinishChoice(0, outcome.FinishReason)},
usage,
))
s.sendDone()
@@ -256,7 +295,7 @@ func (s *chatStreamRuntime) onParsed(parsed sse.LineResult) streamengine.ParsedD
s.responseMessageID = parsed.ResponseMessageID
}
if parsed.ContentFilter {
if strings.TrimSpace(s.text.String()) == "" {
if strings.TrimSpace(s.accumulator.Text.String()) == "" {
return streamengine.ParsedDecision{Stop: true, StopReason: streamengine.StopReason("content_filter")}
}
return streamengine.ParsedDecision{Stop: true, StopReason: streamengine.StopReasonHandlerRequested}
@@ -268,98 +307,65 @@ func (s *chatStreamRuntime) onParsed(parsed sse.LineResult) streamengine.ParsedD
return streamengine.ParsedDecision{Stop: true, StopReason: streamengine.StopReasonHandlerRequested}
}
contentSeen := false
batch := chatDeltaBatch{runtime: s}
for _, p := range parsed.ToolDetectionThinkingParts {
trimmed := sse.TrimContinuationOverlap(s.toolDetectionThinking.String(), p.Text)
if trimmed != "" {
s.toolDetectionThinking.WriteString(trimmed)
}
}
for _, p := range parsed.Parts {
accumulated := s.accumulator.Apply(parsed)
for _, p := range accumulated.Parts {
if p.Type == "thinking" {
rawTrimmed := sse.TrimContinuationOverlap(s.rawThinking.String(), p.Text)
if rawTrimmed != "" {
s.rawThinking.WriteString(rawTrimmed)
contentSeen = true
}
if s.thinkingEnabled {
cleanedText := cleanVisibleOutput(rawTrimmed, s.stripReferenceMarkers)
if cleanedText == "" {
continue
}
trimmed := sse.TrimContinuationOverlap(s.thinking.String(), cleanedText)
if trimmed == "" {
continue
}
s.thinking.WriteString(trimmed)
batch.append("reasoning_content", trimmed)
}
batch.append("reasoning_content", p.VisibleText)
continue
}
if p.RawText == "" {
continue
}
if p.CitationOnly {
continue
}
if !s.bufferToolContent {
batch.append("content", p.VisibleText)
} else {
rawTrimmed := sse.TrimContinuationOverlap(s.rawText.String(), p.Text)
if rawTrimmed == "" {
continue
}
s.rawText.WriteString(rawTrimmed)
contentSeen = true
cleanedText := cleanVisibleOutput(rawTrimmed, s.stripReferenceMarkers)
if s.searchEnabled && sse.IsCitation(cleanedText) {
continue
}
trimmed := sse.TrimContinuationOverlap(s.text.String(), cleanedText)
if trimmed != "" {
s.text.WriteString(trimmed)
}
if !s.bufferToolContent {
if trimmed == "" {
events := toolstream.ProcessChunk(&s.toolSieve, p.RawText, s.toolNames)
for _, evt := range events {
if len(evt.ToolCallDeltas) > 0 {
if !s.emitEarlyToolDeltas {
continue
}
filtered := filterIncrementalToolCallDeltasByAllowed(evt.ToolCallDeltas, s.streamToolNames)
if len(filtered) == 0 {
continue
}
formatted := formatIncrementalStreamToolCallDeltas(filtered, s.streamToolCallIDs)
if len(formatted) == 0 {
continue
}
batch.flush()
tcDelta := map[string]any{
"tool_calls": formatted,
}
s.toolCallsEmitted = true
s.sendDelta(tcDelta)
continue
}
batch.append("content", trimmed)
} else {
events := toolstream.ProcessChunk(&s.toolSieve, rawTrimmed, s.toolNames)
for _, evt := range events {
if len(evt.ToolCallDeltas) > 0 {
if !s.emitEarlyToolDeltas {
continue
}
filtered := filterIncrementalToolCallDeltasByAllowed(evt.ToolCallDeltas, s.streamToolNames)
if len(filtered) == 0 {
continue
}
formatted := formatIncrementalStreamToolCallDeltas(filtered, s.streamToolCallIDs)
if len(formatted) == 0 {
continue
}
batch.flush()
tcDelta := map[string]any{
"tool_calls": formatted,
}
s.toolCallsEmitted = true
s.sendDelta(tcDelta)
if len(evt.ToolCalls) > 0 {
batch.flush()
s.toolCallsEmitted = true
s.toolCallsDoneEmitted = true
tcDelta := map[string]any{
"tool_calls": formatFinalStreamToolCallsWithStableIDs(evt.ToolCalls, s.streamToolCallIDs, s.toolsRaw),
}
s.sendDelta(tcDelta)
s.resetStreamToolCallState()
continue
}
if evt.Content != "" {
cleaned := cleanVisibleOutput(evt.Content, s.stripReferenceMarkers)
if cleaned == "" || (s.searchEnabled && sse.IsCitation(cleaned)) {
continue
}
if len(evt.ToolCalls) > 0 {
batch.flush()
s.toolCallsEmitted = true
s.toolCallsDoneEmitted = true
tcDelta := map[string]any{
"tool_calls": formatFinalStreamToolCallsWithStableIDs(evt.ToolCalls, s.streamToolCallIDs, s.toolsRaw),
}
s.sendDelta(tcDelta)
s.resetStreamToolCallState()
continue
}
if evt.Content != "" {
cleaned := cleanVisibleOutput(evt.Content, s.stripReferenceMarkers)
if cleaned == "" || (s.searchEnabled && sse.IsCitation(cleaned)) {
continue
}
batch.append("content", cleaned)
}
batch.append("content", cleaned)
}
}
}
}
batch.flush()
return streamengine.ParsedDecision{ContentSeen: contentSeen}
return streamengine.ParsedDecision{ContentSeen: accumulated.ContentSeen}
}

View File

@@ -0,0 +1,87 @@
package chat
import (
"net/http"
"net/http/httptest"
"strings"
"testing"
"time"
"ds2api/internal/promptcompat"
)
func TestChatStreamKeepAliveEmitsEmptyChoiceDataFrame(t *testing.T) {
rec := httptest.NewRecorder()
runtime := newChatStreamRuntime(
rec,
http.NewResponseController(rec),
true,
"chatcmpl-test",
time.Now().Unix(),
"deepseek-v4-flash",
"prompt",
false,
false,
true,
nil,
nil,
promptcompat.DefaultToolChoicePolicy(),
false,
false,
)
runtime.sendKeepAlive()
body := rec.Body.String()
if !strings.Contains(body, ": keep-alive\n\n") {
t.Fatalf("expected keep-alive comment, got %q", body)
}
frames, done := parseSSEDataFrames(t, body)
if done {
t.Fatalf("keep-alive must not emit [DONE], body=%q", body)
}
if len(frames) != 1 {
t.Fatalf("expected one data frame, got %d body=%q", len(frames), body)
}
if got := asString(frames[0]["id"]); got != "chatcmpl-test" {
t.Fatalf("expected completion id to be preserved, got %q", got)
}
if got := asString(frames[0]["object"]); got != "chat.completion.chunk" {
t.Fatalf("expected chat chunk object, got %q", got)
}
choices, _ := frames[0]["choices"].([]any)
if len(choices) != 0 {
t.Fatalf("expected empty choices heartbeat, got %#v", choices)
}
}
func TestChatStreamFinalizeEnforcesRequiredToolChoice(t *testing.T) {
rec := httptest.NewRecorder()
runtime := newChatStreamRuntime(
rec,
http.NewResponseController(rec),
true,
"chatcmpl-test",
time.Now().Unix(),
"deepseek-v4-flash",
"prompt",
false,
false,
true,
[]string{"Write"},
nil,
promptcompat.ToolChoicePolicy{Mode: promptcompat.ToolChoiceRequired},
true,
false,
)
if !runtime.finalize("stop", false) {
t.Fatalf("expected terminal error to be written")
}
if runtime.finalErrorCode != "tool_choice_violation" {
t.Fatalf("expected tool_choice_violation, got %q body=%s", runtime.finalErrorCode, rec.Body.String())
}
if !strings.Contains(rec.Body.String(), "tool_choice requires") {
t.Fatalf("expected tool choice error in stream body, got %s", rec.Body.String())
}
}

View File

@@ -7,10 +7,12 @@ import (
"strings"
"time"
"ds2api/internal/assistantturn"
"ds2api/internal/auth"
"ds2api/internal/config"
dsprotocol "ds2api/internal/deepseek/protocol"
openaifmt "ds2api/internal/format/openai"
"ds2api/internal/promptcompat"
"ds2api/internal/sse"
streamengine "ds2api/internal/stream"
)
@@ -26,6 +28,7 @@ type chatNonStreamResult struct {
body map[string]any
finishReason string
responseMessageID int
outputError *assistantturn.OutputError
}
func (h *Handler) handleNonStreamWithRetry(w http.ResponseWriter, ctx context.Context, a *auth.RequestAuth, resp *http.Response, payload map[string]any, pow, completionID, model, finalPrompt string, refFileTokens int, thinkingEnabled, searchEnabled bool, toolNames []string, toolsRaw any, historySession *chatHistorySession) {
@@ -86,35 +89,40 @@ func (h *Handler) collectChatNonStreamAttempt(w http.ResponseWriter, resp *http.
return chatNonStreamResult{}, false
}
result := sse.CollectStream(resp, thinkingEnabled, true)
stripReferenceMarkers := h.compatStripReferenceMarkers()
finalThinking := cleanVisibleOutput(result.Thinking, stripReferenceMarkers)
finalText := cleanVisibleOutput(result.Text, stripReferenceMarkers)
if searchEnabled {
finalText = replaceCitationMarkersWithLinks(finalText, result.CitationLinks)
}
detected := detectAssistantToolCalls(result.Text, finalText, result.Thinking, result.ToolDetectionThinking, toolNames)
respBody := openaifmt.BuildChatCompletionWithToolCalls(completionID, model, usagePrompt, finalThinking, finalText, detected.Calls, toolsRaw)
turn := assistantturn.BuildTurnFromCollected(result, assistantturn.BuildOptions{
Model: model,
Prompt: usagePrompt,
SearchEnabled: searchEnabled,
StripReferenceMarkers: stripReferenceMarkersEnabled(),
ToolNames: toolNames,
ToolsRaw: toolsRaw,
})
respBody := openaifmt.BuildChatCompletionWithToolCalls(completionID, model, usagePrompt, turn.Thinking, turn.Text, turn.ToolCalls, toolsRaw)
return chatNonStreamResult{
rawThinking: result.Thinking,
rawText: result.Text,
thinking: finalThinking,
thinking: turn.Thinking,
toolDetectionThinking: result.ToolDetectionThinking,
text: finalText,
text: turn.Text,
contentFilter: result.ContentFilter,
detectedCalls: len(detected.Calls),
detectedCalls: len(turn.ToolCalls),
body: respBody,
finishReason: chatFinishReason(respBody),
responseMessageID: result.ResponseMessageID,
outputError: turn.Error,
}, true
}
func (h *Handler) finishChatNonStreamResult(w http.ResponseWriter, result chatNonStreamResult, attempts int, usagePrompt string, refFileTokens int, historySession *chatHistorySession) {
if result.detectedCalls == 0 && shouldWriteUpstreamEmptyOutputError(result.text) {
if result.detectedCalls == 0 && strings.TrimSpace(result.text) == "" {
status, message, code := upstreamEmptyOutputDetail(result.contentFilter, result.text, result.thinking)
if result.outputError != nil {
status, message, code = result.outputError.Status, result.outputError.Message, result.outputError.Code
}
if historySession != nil {
historySession.error(status, message, code, result.thinking, result.text)
}
writeUpstreamEmptyOutputError(w, result.text, result.thinking, result.contentFilter)
writeOpenAIErrorWithCode(w, status, message, code)
config.Logger.Info("[openai_empty_retry] terminal empty output", "surface", "chat.completions", "stream", false, "retry_attempts", attempts, "success_source", "none", "content_filter", result.contentFilter)
return
}
@@ -146,8 +154,8 @@ func shouldRetryChatNonStream(result chatNonStreamResult, attempts int) bool {
strings.TrimSpace(result.text) == ""
}
func (h *Handler) handleStreamWithRetry(w http.ResponseWriter, r *http.Request, a *auth.RequestAuth, resp *http.Response, payload map[string]any, pow, completionID, model, finalPrompt string, refFileTokens int, thinkingEnabled, searchEnabled bool, toolNames []string, toolsRaw any, historySession *chatHistorySession) {
streamRuntime, initialType, ok := h.prepareChatStreamRuntime(w, resp, completionID, model, finalPrompt, refFileTokens, thinkingEnabled, searchEnabled, toolNames, toolsRaw, historySession)
func (h *Handler) handleStreamWithRetry(w http.ResponseWriter, r *http.Request, a *auth.RequestAuth, resp *http.Response, payload map[string]any, pow, completionID, model, finalPrompt string, refFileTokens int, thinkingEnabled, searchEnabled bool, toolNames []string, toolsRaw any, toolChoice promptcompat.ToolChoicePolicy, historySession *chatHistorySession) {
streamRuntime, initialType, ok := h.prepareChatStreamRuntime(w, resp, completionID, model, finalPrompt, refFileTokens, thinkingEnabled, searchEnabled, toolNames, toolsRaw, toolChoice, historySession)
if !ok {
return
}
@@ -189,7 +197,7 @@ func (h *Handler) handleStreamWithRetry(w http.ResponseWriter, r *http.Request,
}
}
func (h *Handler) prepareChatStreamRuntime(w http.ResponseWriter, resp *http.Response, completionID, model, finalPrompt string, refFileTokens int, thinkingEnabled, searchEnabled bool, toolNames []string, toolsRaw any, historySession *chatHistorySession) (*chatStreamRuntime, string, bool) {
func (h *Handler) prepareChatStreamRuntime(w http.ResponseWriter, resp *http.Response, completionID, model, finalPrompt string, refFileTokens int, thinkingEnabled, searchEnabled bool, toolNames []string, toolsRaw any, toolChoice promptcompat.ToolChoicePolicy, historySession *chatHistorySession) (*chatStreamRuntime, string, bool) {
if resp.StatusCode != http.StatusOK {
defer func() { _ = resp.Body.Close() }()
body, _ := io.ReadAll(resp.Body)
@@ -214,7 +222,8 @@ func (h *Handler) prepareChatStreamRuntime(w http.ResponseWriter, resp *http.Res
}
streamRuntime := newChatStreamRuntime(
w, rc, canFlush, completionID, time.Now().Unix(), model, finalPrompt,
thinkingEnabled, searchEnabled, h.compatStripReferenceMarkers(), toolNames, toolsRaw,
thinkingEnabled, searchEnabled, stripReferenceMarkersEnabled(), toolNames, toolsRaw,
toolChoice,
len(toolNames) > 0, h.toolcallFeatureMatchEnabled() && h.toolcallEarlyEmitHighConfidence(),
)
streamRuntime.refFileTokens = refFileTokens
@@ -237,7 +246,7 @@ func (h *Handler) consumeChatStreamAttempt(r *http.Request, resp *http.Response,
OnParsed: func(parsed sse.LineResult) streamengine.ParsedDecision {
decision := streamRuntime.onParsed(parsed)
if historySession != nil {
historySession.progress(streamRuntime.thinking.String(), streamRuntime.text.String())
historySession.progress(streamRuntime.accumulator.Thinking.String(), streamRuntime.accumulator.Text.String())
}
return decision
},
@@ -247,11 +256,15 @@ func (h *Handler) consumeChatStreamAttempt(r *http.Request, resp *http.Response,
}
},
OnContextDone: func() {
streamRuntime.markContextCancelled()
if historySession != nil {
historySession.stopped(streamRuntime.thinking.String(), streamRuntime.text.String(), string(streamengine.StopReasonContextCancelled))
historySession.stopped(streamRuntime.accumulator.Thinking.String(), streamRuntime.accumulator.Text.String(), string(streamengine.StopReasonContextCancelled))
}
},
})
if streamRuntime.finalErrorCode == string(streamengine.StopReasonContextCancelled) {
return true, false
}
terminalWritten := streamRuntime.finalize(finalReason, allowDeferEmpty && finalReason != "content_filter")
if terminalWritten {
recordChatStreamHistory(streamRuntime, historySession)
@@ -265,7 +278,7 @@ func recordChatStreamHistory(streamRuntime *chatStreamRuntime, historySession *c
return
}
if streamRuntime.finalErrorMessage != "" {
historySession.error(streamRuntime.finalErrorStatus, streamRuntime.finalErrorMessage, streamRuntime.finalErrorCode, streamRuntime.thinking.String(), streamRuntime.text.String())
historySession.error(streamRuntime.finalErrorStatus, streamRuntime.finalErrorMessage, streamRuntime.finalErrorCode, streamRuntime.accumulator.Thinking.String(), streamRuntime.accumulator.Text.String())
return
}
historySession.success(http.StatusOK, streamRuntime.finalThinking, streamRuntime.finalText, streamRuntime.finalFinishReason, streamRuntime.finalUsage)
@@ -274,7 +287,7 @@ func recordChatStreamHistory(streamRuntime *chatStreamRuntime, historySession *c
func failChatStreamRetry(streamRuntime *chatStreamRuntime, historySession *chatHistorySession, status int, message, code string) {
streamRuntime.sendFailedChunk(status, message, code)
if historySession != nil {
historySession.error(status, message, code, streamRuntime.thinking.String(), streamRuntime.text.String())
historySession.error(status, message, code, streamRuntime.accumulator.Thinking.String(), streamRuntime.accumulator.Text.String())
}
}
@@ -283,6 +296,10 @@ func logChatStreamTerminal(streamRuntime *chatStreamRuntime, attempts int) {
if attempts > 0 {
source = "synthetic_retry"
}
if streamRuntime.finalErrorCode == string(streamengine.StopReasonContextCancelled) {
config.Logger.Info("[openai_empty_retry] terminal cancelled", "surface", "chat.completions", "stream", true, "retry_attempts", attempts, "error_code", streamRuntime.finalErrorCode)
return
}
if streamRuntime.finalErrorMessage != "" {
config.Logger.Info("[openai_empty_retry] terminal empty output", "surface", "chat.completions", "stream", true, "retry_attempts", attempts, "success_source", "none", "error_code", streamRuntime.finalErrorCode)
return

View File

@@ -0,0 +1,87 @@
package chat
import (
"context"
"net/http"
"net/http/httptest"
"testing"
"time"
"ds2api/internal/chathistory"
"ds2api/internal/promptcompat"
"ds2api/internal/stream"
)
func TestConsumeChatStreamAttemptMarksContextCancelledState(t *testing.T) {
historyStore := newTestChatHistoryStore(t)
entry, err := historyStore.Start(chathistory.StartParams{
CallerID: "caller:test",
Model: "deepseek-v4-flash",
Stream: true,
UserInput: "hello",
})
if err != nil {
t.Fatalf("start history failed: %v", err)
}
session := &chatHistorySession{
store: historyStore,
entryID: entry.ID,
startedAt: time.Now(),
lastPersist: time.Now(),
finalPrompt: "prompt",
}
ctx, cancel := context.WithCancel(context.Background())
cancel()
req := httptest.NewRequest(http.MethodPost, "/v1/chat/completions", nil).WithContext(ctx)
rec := httptest.NewRecorder()
streamRuntime := newChatStreamRuntime(
rec,
http.NewResponseController(rec),
true,
"cid-cancelled",
time.Now().Unix(),
"deepseek-v4-flash",
"prompt",
false,
false,
true,
nil,
nil,
promptcompat.DefaultToolChoicePolicy(),
false,
false,
)
resp := makeOpenAISSEHTTPResponse(
`data: {"p":"response/content","v":"hello"}`,
`data: [DONE]`,
)
h := &Handler{}
terminalWritten, retryable := h.consumeChatStreamAttempt(req, resp, streamRuntime, "text", false, session, true)
if !terminalWritten || retryable {
t.Fatalf("expected cancelled attempt to terminate without retry, got terminalWritten=%v retryable=%v", terminalWritten, retryable)
}
if got, want := streamRuntime.finalErrorCode, string(stream.StopReasonContextCancelled); got != want {
t.Fatalf("expected cancelled final error code %q, got %q", want, got)
}
if streamRuntime.finalErrorMessage == "" {
t.Fatalf("expected cancelled final error message to be preserved")
}
snapshot, err := historyStore.Snapshot()
if err != nil {
t.Fatalf("snapshot failed: %v", err)
}
if len(snapshot.Items) != 1 {
t.Fatalf("expected one history item, got %d", len(snapshot.Items))
}
full, err := historyStore.Get(snapshot.Items[0].ID)
if err != nil {
t.Fatalf("get detail failed: %v", err)
}
if full.Status != "stopped" {
t.Fatalf("expected stopped status, got %#v", full)
}
}

View File

@@ -12,6 +12,7 @@ import (
"ds2api/internal/httpapi/openai/history"
"ds2api/internal/httpapi/openai/shared"
"ds2api/internal/promptcompat"
"ds2api/internal/textclean"
"ds2api/internal/toolcall"
"ds2api/internal/toolstream"
)
@@ -35,11 +36,8 @@ type streamLease struct {
ExpiresAt time.Time
}
func (h *Handler) compatStripReferenceMarkers() bool {
if h == nil {
return true
}
return shared.CompatStripReferenceMarkers(h.Store)
func stripReferenceMarkersEnabled() bool {
return textclean.StripReferenceMarkersEnabled()
}
func (h *Handler) applyCurrentInputFile(ctx context.Context, a *auth.RequestAuth, stdReq promptcompat.StandardRequest) (promptcompat.StandardRequest, error) {
@@ -80,6 +78,10 @@ func writeOpenAIError(w http.ResponseWriter, status int, message string) {
shared.WriteOpenAIError(w, status, message)
}
func writeOpenAIErrorWithCode(w http.ResponseWriter, status int, message, code string) {
shared.WriteOpenAIErrorWithCode(w, status, message, code)
}
func openAIErrorType(status int) string {
return shared.OpenAIErrorType(status)
}
@@ -104,22 +106,10 @@ func cleanVisibleOutput(text string, stripReferenceMarkers bool) string {
return shared.CleanVisibleOutput(text, stripReferenceMarkers)
}
func replaceCitationMarkersWithLinks(text string, links map[int]string) string {
return shared.ReplaceCitationMarkersWithLinks(text, links)
}
func shouldWriteUpstreamEmptyOutputError(text string) bool {
return shared.ShouldWriteUpstreamEmptyOutputError(text)
}
func upstreamEmptyOutputDetail(contentFilter bool, text, thinking string) (int, string, string) {
return shared.UpstreamEmptyOutputDetail(contentFilter, text, thinking)
}
func writeUpstreamEmptyOutputError(w http.ResponseWriter, text, thinking string, contentFilter bool) bool {
return shared.WriteUpstreamEmptyOutputError(w, text, thinking, contentFilter)
}
func emptyOutputRetryEnabled() bool {
return shared.EmptyOutputRetryEnabled()
}

View File

@@ -8,7 +8,9 @@ import (
"strings"
"time"
"ds2api/internal/assistantturn"
"ds2api/internal/auth"
"ds2api/internal/completionruntime"
"ds2api/internal/config"
dsprotocol "ds2api/internal/deepseek/protocol"
openaifmt "ds2api/internal/format/openai"
@@ -76,44 +78,44 @@ func (h *Handler) ChatCompletions(w http.ResponseWriter, r *http.Request) {
}
historySession := startChatHistory(h.ChatHistory, r, a, stdReq)
sessionID, err = h.DS.CreateSession(r.Context(), a, 3)
if err != nil {
if a.UseConfigToken {
if !stdReq.Stream {
result, outErr := completionruntime.ExecuteNonStreamWithRetry(r.Context(), h.DS, a, stdReq, completionruntime.Options{
StripReferenceMarkers: stripReferenceMarkersEnabled(),
RetryEnabled: true,
CurrentInputFile: h.Store,
})
sessionID = result.SessionID
if outErr != nil {
if historySession != nil {
historySession.error(http.StatusUnauthorized, "Account token is invalid. Please re-login the account in admin.", "error", "", "")
historySession.error(outErr.Status, outErr.Message, outErr.Code, result.Turn.Thinking, result.Turn.Text)
}
writeOpenAIError(w, http.StatusUnauthorized, "Account token is invalid. Please re-login the account in admin.")
} else {
if historySession != nil {
historySession.error(http.StatusUnauthorized, "Invalid token. If this should be a DS2API key, add it to config.keys first.", "error", "", "")
}
writeOpenAIError(w, http.StatusUnauthorized, "Invalid token. If this should be a DS2API key, add it to config.keys first.")
writeOpenAIErrorWithCode(w, outErr.Status, outErr.Message, outErr.Code)
return
}
return
}
pow, err := h.DS.GetPow(r.Context(), a, 3)
if err != nil {
respBody := openaifmt.BuildChatCompletionWithToolCalls(result.SessionID, stdReq.ResponseModel, result.Turn.Prompt, result.Turn.Thinking, result.Turn.Text, result.Turn.ToolCalls, stdReq.ToolsRaw)
respBody["usage"] = assistantturn.OpenAIChatUsage(result.Turn)
finishReason := assistantturn.FinalizeTurn(result.Turn, assistantturn.FinalizeOptions{}).FinishReason
if historySession != nil {
historySession.error(http.StatusUnauthorized, "Failed to get PoW (invalid token or unknown error).", "error", "", "")
historySession.success(http.StatusOK, result.Turn.Thinking, result.Turn.Text, finishReason, assistantturn.OpenAIChatUsage(result.Turn))
}
writeOpenAIError(w, http.StatusUnauthorized, "Failed to get PoW (invalid token or unknown error).")
writeJSON(w, http.StatusOK, respBody)
return
}
payload := stdReq.CompletionPayload(sessionID)
resp, err := h.DS.CallCompletion(r.Context(), a, payload, pow, 3)
if err != nil {
start, outErr := completionruntime.StartCompletion(r.Context(), h.DS, a, stdReq, completionruntime.Options{
CurrentInputFile: h.Store,
})
sessionID = start.SessionID
if outErr != nil {
if historySession != nil {
historySession.error(http.StatusInternalServerError, "Failed to get completion.", "error", "", "")
historySession.error(outErr.Status, outErr.Message, outErr.Code, "", "")
}
writeOpenAIError(w, http.StatusInternalServerError, "Failed to get completion.")
writeOpenAIErrorWithCode(w, outErr.Status, outErr.Message, outErr.Code)
return
}
refFileTokens := stdReq.RefFileTokens
if stdReq.Stream {
h.handleStreamWithRetry(w, r, a, resp, payload, pow, sessionID, stdReq.ResponseModel, stdReq.PromptTokenText, refFileTokens, stdReq.Thinking, stdReq.Search, stdReq.ToolNames, stdReq.ToolsRaw, historySession)
return
}
h.handleNonStreamWithRetry(w, r.Context(), a, resp, payload, pow, sessionID, stdReq.ResponseModel, stdReq.PromptTokenText, refFileTokens, stdReq.Thinking, stdReq.Search, stdReq.ToolNames, stdReq.ToolsRaw, historySession)
streamReq := start.Request
refFileTokens := streamReq.RefFileTokens
h.handleStreamWithRetry(w, r, a, start.Response, start.Payload, start.Pow, sessionID, streamReq.ResponseModel, streamReq.PromptTokenText, refFileTokens, streamReq.Thinking, streamReq.Search, streamReq.ToolNames, streamReq.ToolsRaw, streamReq.ToolChoice, historySession)
}
func (h *Handler) autoDeleteRemoteSession(ctx context.Context, a *auth.RequestAuth, sessionID string) {
@@ -161,33 +163,29 @@ func (h *Handler) handleNonStream(w http.ResponseWriter, resp *http.Response, co
}
result := sse.CollectStream(resp, thinkingEnabled, true)
stripReferenceMarkers := h.compatStripReferenceMarkers()
finalThinking := cleanVisibleOutput(result.Thinking, stripReferenceMarkers)
finalText := cleanVisibleOutput(result.Text, stripReferenceMarkers)
if searchEnabled {
finalText = replaceCitationMarkersWithLinks(finalText, result.CitationLinks)
}
detected := detectAssistantToolCalls(result.Text, finalText, result.Thinking, result.ToolDetectionThinking, toolNames)
if shouldWriteUpstreamEmptyOutputError(finalText) && len(detected.Calls) == 0 {
status, message, code := upstreamEmptyOutputDetail(result.ContentFilter, finalText, finalThinking)
turn := assistantturn.BuildTurnFromCollected(result, assistantturn.BuildOptions{
Model: model,
Prompt: finalPrompt,
RefFileTokens: refFileTokens,
SearchEnabled: searchEnabled,
StripReferenceMarkers: stripReferenceMarkersEnabled(),
ToolNames: toolNames,
ToolsRaw: toolsRaw,
ToolChoice: promptcompat.DefaultToolChoicePolicy(),
})
outcome := assistantturn.FinalizeTurn(turn, assistantturn.FinalizeOptions{})
if outcome.ShouldFail {
status, message, code := outcome.Error.Status, outcome.Error.Message, outcome.Error.Code
if historySession != nil {
historySession.error(status, message, code, finalThinking, finalText)
historySession.error(status, message, code, turn.Thinking, turn.Text)
}
writeUpstreamEmptyOutputError(w, finalText, finalThinking, result.ContentFilter)
writeOpenAIErrorWithCode(w, status, message, code)
return
}
respBody := openaifmt.BuildChatCompletionWithToolCalls(completionID, model, finalPrompt, finalThinking, finalText, detected.Calls, toolsRaw)
if refFileTokens > 0 {
addRefFileTokensToUsage(respBody, refFileTokens)
}
finishReason := "stop"
if choices, ok := respBody["choices"].([]map[string]any); ok && len(choices) > 0 {
if fr, _ := choices[0]["finish_reason"].(string); strings.TrimSpace(fr) != "" {
finishReason = fr
}
}
respBody := openaifmt.BuildChatCompletionWithToolCalls(completionID, model, finalPrompt, turn.Thinking, turn.Text, turn.ToolCalls, toolsRaw)
respBody["usage"] = assistantturn.OpenAIChatUsage(turn)
if historySession != nil {
historySession.success(http.StatusOK, finalThinking, finalText, finishReason, openaifmt.BuildChatUsageForModel(model, finalPrompt, finalThinking, finalText, refFileTokens))
historySession.success(http.StatusOK, turn.Thinking, turn.Text, outcome.FinishReason, assistantturn.OpenAIChatUsage(turn))
}
writeJSON(w, http.StatusOK, respBody)
}
@@ -215,7 +213,7 @@ func (h *Handler) handleStream(w http.ResponseWriter, r *http.Request, resp *htt
created := time.Now().Unix()
bufferToolContent := len(toolNames) > 0
emitEarlyToolDeltas := h.toolcallFeatureMatchEnabled() && h.toolcallEarlyEmitHighConfidence()
stripReferenceMarkers := h.compatStripReferenceMarkers()
stripReferenceMarkers := stripReferenceMarkersEnabled()
initialType := "text"
if thinkingEnabled {
initialType = "thinking"
@@ -234,6 +232,7 @@ func (h *Handler) handleStream(w http.ResponseWriter, r *http.Request, resp *htt
stripReferenceMarkers,
toolNames,
toolsRaw,
promptcompat.DefaultToolChoicePolicy(),
bufferToolContent,
emitEarlyToolDeltas,
)
@@ -254,7 +253,7 @@ func (h *Handler) handleStream(w http.ResponseWriter, r *http.Request, resp *htt
OnParsed: func(parsed sse.LineResult) streamengine.ParsedDecision {
decision := streamRuntime.onParsed(parsed)
if historySession != nil {
historySession.progress(streamRuntime.thinking.String(), streamRuntime.text.String())
historySession.progress(streamRuntime.accumulator.Thinking.String(), streamRuntime.accumulator.Text.String())
}
return decision
},
@@ -268,14 +267,14 @@ func (h *Handler) handleStream(w http.ResponseWriter, r *http.Request, resp *htt
return
}
if streamRuntime.finalErrorMessage != "" {
historySession.error(streamRuntime.finalErrorStatus, streamRuntime.finalErrorMessage, streamRuntime.finalErrorCode, streamRuntime.thinking.String(), streamRuntime.text.String())
historySession.error(streamRuntime.finalErrorStatus, streamRuntime.finalErrorMessage, streamRuntime.finalErrorCode, streamRuntime.accumulator.Thinking.String(), streamRuntime.accumulator.Text.String())
return
}
historySession.success(http.StatusOK, streamRuntime.finalThinking, streamRuntime.finalText, streamRuntime.finalFinishReason, streamRuntime.finalUsage)
},
OnContextDone: func() {
if historySession != nil {
historySession.stopped(streamRuntime.thinking.String(), streamRuntime.text.String(), string(streamengine.StopReasonContextCancelled))
historySession.stopped(streamRuntime.accumulator.Thinking.String(), streamRuntime.accumulator.Text.String(), string(streamengine.StopReasonContextCancelled))
}
},
})

View File

@@ -75,7 +75,6 @@ func TestChatCompletionsAutoDeleteModes(t *testing.T) {
}
h := &Handler{
Store: mockOpenAIConfig{
wideInput: true,
autoDeleteMode: tc.mode,
},
Auth: streamStatusAuthStub{},
@@ -123,7 +122,6 @@ func TestAutoDeleteRemoteSessionIgnoresCanceledParentContext(t *testing.T) {
ds := &autoDeleteCtxDSStub{}
h := &Handler{
Store: mockOpenAIConfig{
wideInput: true,
autoDeleteMode: "single",
},
DS: ds,

View File

@@ -12,25 +12,18 @@ import (
type mockOpenAIConfig struct {
aliases map[string]string
wideInput bool
autoDeleteMode string
toolMode string
earlyEmit string
responsesTTL int
embedProv string
historySplitEnabled bool
historySplitTurns int
currentInputEnabled bool
currentInputMin int
thinkingInjection *bool
thinkingPrompt string
}
func (m mockOpenAIConfig) ModelAliases() map[string]string { return m.aliases }
func (m mockOpenAIConfig) CompatWideInputStrictOutput() bool {
return m.wideInput
}
func (m mockOpenAIConfig) CompatStripReferenceMarkers() bool { return true }
func (m mockOpenAIConfig) ModelAliases() map[string]string { return m.aliases }
func (m mockOpenAIConfig) ToolcallMode() string { return m.toolMode }
func (m mockOpenAIConfig) ToolcallEarlyEmitConfidence() string { return m.earlyEmit }
func (m mockOpenAIConfig) ResponsesStoreTTLSeconds() int { return m.responsesTTL }
@@ -41,14 +34,7 @@ func (m mockOpenAIConfig) AutoDeleteMode() string {
}
return m.autoDeleteMode
}
func (m mockOpenAIConfig) AutoDeleteSessions() bool { return false }
func (m mockOpenAIConfig) HistorySplitEnabled() bool { return m.historySplitEnabled }
func (m mockOpenAIConfig) HistorySplitTriggerAfterTurns() int {
if m.historySplitTurns <= 0 {
return 1
}
return m.historySplitTurns
}
func (m mockOpenAIConfig) AutoDeleteSessions() bool { return false }
func (m mockOpenAIConfig) CurrentInputFileEnabled() bool { return m.currentInputEnabled }
func (m mockOpenAIConfig) CurrentInputFileMinChars() int {
return m.currentInputMin

View File

@@ -94,7 +94,6 @@ func TestHandleVercelStreamPrepareAppliesCurrentInputFile(t *testing.T) {
ds := &inlineUploadDSStub{}
h := &Handler{
Store: mockOpenAIConfig{
wideInput: true,
currentInputEnabled: true,
},
Auth: streamStatusAuthStub{},
@@ -130,8 +129,8 @@ func TestHandleVercelStreamPrepareAppliesCurrentInputFile(t *testing.T) {
t.Fatalf("expected payload object, got %#v", body["payload"])
}
promptText, _ := payload["prompt"].(string)
if !strings.Contains(promptText, "Answer the latest user request directly.") {
t.Fatalf("expected neutral prompt, got %s", promptText)
if !strings.Contains(promptText, "Continue from the latest state in the attached DS2API_HISTORY.txt context.") {
t.Fatalf("expected continuation prompt, got %s", promptText)
}
if strings.Contains(promptText, "first user turn") || strings.Contains(promptText, "latest user turn") {
t.Fatalf("expected original turns hidden from prompt, got %s", promptText)
@@ -151,7 +150,6 @@ func TestHandleVercelStreamPrepareMapsCurrentInputFileManagedAuthFailureTo401(t
}
h := &Handler{
Store: mockOpenAIConfig{
wideInput: true,
currentInputEnabled: true,
},
Auth: streamStatusManagedAuthStub{},

View File

@@ -109,13 +109,10 @@ func (h *Handler) handleVercelStreamPrepare(w http.ResponseWriter, r *http.Reque
"final_prompt": stdReq.FinalPrompt,
"thinking_enabled": stdReq.Thinking,
"search_enabled": stdReq.Search,
"compat": map[string]any{
"strip_reference_markers": h.compatStripReferenceMarkers(),
},
"tool_names": stdReq.ToolNames,
"deepseek_token": a.DeepSeekToken,
"pow_header": powHeader,
"payload": payload,
"tool_names": stdReq.ToolNames,
"deepseek_token": a.DeepSeekToken,
"pow_header": powHeader,
"payload": payload,
})
}

View File

@@ -1,6 +1,7 @@
package openai
import (
"strings"
"testing"
"ds2api/internal/promptcompat"
@@ -8,25 +9,18 @@ import (
type mockOpenAIConfig struct {
aliases map[string]string
wideInput bool
autoDeleteMode string
toolMode string
earlyEmit string
responsesTTL int
embedProv string
historySplitEnabled bool
historySplitTurns int
currentInputEnabled bool
currentInputMin int
thinkingInjection *bool
thinkingPrompt string
}
func (m mockOpenAIConfig) ModelAliases() map[string]string { return m.aliases }
func (m mockOpenAIConfig) CompatWideInputStrictOutput() bool {
return m.wideInput
}
func (m mockOpenAIConfig) CompatStripReferenceMarkers() bool { return true }
func (m mockOpenAIConfig) ModelAliases() map[string]string { return m.aliases }
func (m mockOpenAIConfig) ToolcallMode() string { return m.toolMode }
func (m mockOpenAIConfig) ToolcallEarlyEmitConfidence() string { return m.earlyEmit }
func (m mockOpenAIConfig) ResponsesStoreTTLSeconds() int { return m.responsesTTL }
@@ -37,14 +31,7 @@ func (m mockOpenAIConfig) AutoDeleteMode() string {
}
return m.autoDeleteMode
}
func (m mockOpenAIConfig) AutoDeleteSessions() bool { return false }
func (m mockOpenAIConfig) HistorySplitEnabled() bool { return m.historySplitEnabled }
func (m mockOpenAIConfig) HistorySplitTriggerAfterTurns() int {
if m.historySplitTurns <= 0 {
return 1
}
return m.historySplitTurns
}
func (m mockOpenAIConfig) AutoDeleteSessions() bool { return false }
func (m mockOpenAIConfig) CurrentInputFileEnabled() bool { return m.currentInputEnabled }
func (m mockOpenAIConfig) CurrentInputFileMinChars() int {
return m.currentInputMin
@@ -62,7 +49,6 @@ func TestNormalizeOpenAIChatRequestWithConfigInterface(t *testing.T) {
aliases: map[string]string{
"my-model": "deepseek-v4-flash-search",
},
wideInput: true,
}
req := map[string]any{
"model": "my-model",
@@ -81,7 +67,7 @@ func TestNormalizeOpenAIChatRequestWithConfigInterface(t *testing.T) {
}
func TestNormalizeOpenAIChatRequestDisablesThinkingForNoThinkingModel(t *testing.T) {
cfg := mockOpenAIConfig{wideInput: true}
cfg := mockOpenAIConfig{}
req := map[string]any{
"model": "deepseek-v4-pro-nothinking",
"messages": []any{map[string]any{"role": "user", "content": "hello"}},
@@ -102,28 +88,22 @@ func TestNormalizeOpenAIChatRequestDisablesThinkingForNoThinkingModel(t *testing
}
}
func TestNormalizeOpenAIResponsesRequestWideInputPolicyFromInterface(t *testing.T) {
func TestNormalizeOpenAIResponsesRequestAlwaysAcceptsWideInput(t *testing.T) {
req := map[string]any{
"model": "deepseek-v4-flash",
"input": "hi",
}
_, err := promptcompat.NormalizeOpenAIResponsesRequest(mockOpenAIConfig{
aliases: map[string]string{},
wideInput: false,
}, req, "")
if err == nil {
t.Fatal("expected error when wide input is disabled and only input is provided")
}
out, err := promptcompat.NormalizeOpenAIResponsesRequest(mockOpenAIConfig{
aliases: map[string]string{},
wideInput: true,
aliases: map[string]string{},
}, req, "")
if err != nil {
t.Fatalf("unexpected error when wide input is enabled: %v", err)
t.Fatalf("unexpected error for wide input request: %v", err)
}
if out.Surface != "openai_responses" {
t.Fatalf("unexpected surface: %q", out.Surface)
}
if !strings.Contains(out.FinalPrompt, "<User>hi") {
t.Fatalf("unexpected final prompt: %q", out.FinalPrompt)
}
}

View File

@@ -94,6 +94,9 @@ func TestPreprocessInlineFileInputsReplacesDataURLAndCollectsRefFileIDs(t *testi
if len(ds.uploadCalls) != 1 {
t.Fatalf("expected 1 upload, got %d", len(ds.uploadCalls))
}
if ds.uploadCalls[0].ModelType != "default" {
t.Fatalf("expected default model type when request omits model, got %q", ds.uploadCalls[0].ModelType)
}
if ds.lastCtx != ctx {
t.Fatalf("expected upload to use request context")
}
@@ -148,8 +151,8 @@ func TestPreprocessInlineFileInputsDeduplicatesIdenticalPayloads(t *testing.T) {
func TestChatCompletionsUploadsInlineFilesBeforeCompletion(t *testing.T) {
ds := &inlineUploadDSStub{}
h := &openAITestSurface{Store: mockOpenAIConfig{wideInput: true}, Auth: streamStatusAuthStub{}, DS: ds}
reqBody := `{"model":"deepseek-v4-flash","messages":[{"role":"user","content":[{"type":"input_text","text":"hi"},{"type":"image_url","image_url":{"url":"data:image/png;base64,QUJDRA=="}}]}],"stream":false}`
h := &openAITestSurface{Store: mockOpenAIConfig{}, Auth: streamStatusAuthStub{}, DS: ds}
reqBody := `{"model":"deepseek-v4-vision","messages":[{"role":"user","content":[{"type":"input_text","text":"hi"},{"type":"image_url","image_url":{"url":"data:image/png;base64,QUJDRA=="}}]}],"stream":false}`
req := httptest.NewRequest(http.MethodPost, "/v1/chat/completions", strings.NewReader(reqBody))
req.Header.Set("Authorization", "Bearer direct-token")
req.Header.Set("Content-Type", "application/json")
@@ -163,6 +166,9 @@ func TestChatCompletionsUploadsInlineFilesBeforeCompletion(t *testing.T) {
if len(ds.uploadCalls) != 1 {
t.Fatalf("expected 1 upload call, got %d", len(ds.uploadCalls))
}
if ds.uploadCalls[0].ModelType != "vision" {
t.Fatalf("expected vision model type for vision request, got %q", ds.uploadCalls[0].ModelType)
}
if ds.completionReq == nil {
t.Fatal("expected completion payload to be captured")
}
@@ -174,10 +180,10 @@ func TestChatCompletionsUploadsInlineFilesBeforeCompletion(t *testing.T) {
func TestResponsesUploadsInlineFilesBeforeCompletion(t *testing.T) {
ds := &inlineUploadDSStub{}
h := &openAITestSurface{Store: mockOpenAIConfig{wideInput: true}, Auth: streamStatusAuthStub{}, DS: ds}
h := &openAITestSurface{Store: mockOpenAIConfig{}, Auth: streamStatusAuthStub{}, DS: ds}
r := chi.NewRouter()
registerOpenAITestRoutes(r, h)
reqBody := `{"model":"deepseek-v4-flash","input":[{"role":"user","content":[{"type":"input_text","text":"hi"},{"type":"input_image","image_url":{"url":"data:image/png;base64,QUJDRA=="}}]}],"stream":false}`
reqBody := `{"model":"deepseek-v4-pro","input":[{"role":"user","content":[{"type":"input_text","text":"hi"},{"type":"input_image","image_url":{"url":"data:image/png;base64,QUJDRA=="}}]}],"stream":false}`
req := httptest.NewRequest(http.MethodPost, "/v1/responses", strings.NewReader(reqBody))
req.Header.Set("Authorization", "Bearer direct-token")
req.Header.Set("Content-Type", "application/json")
@@ -191,6 +197,9 @@ func TestResponsesUploadsInlineFilesBeforeCompletion(t *testing.T) {
if len(ds.uploadCalls) != 1 {
t.Fatalf("expected 1 upload call, got %d", len(ds.uploadCalls))
}
if ds.uploadCalls[0].ModelType != "expert" {
t.Fatalf("expected expert model type for pro request, got %q", ds.uploadCalls[0].ModelType)
}
refIDs, _ := ds.completionReq["ref_file_ids"].([]any)
if len(refIDs) != 1 || refIDs[0] != "file-inline-1" {
t.Fatalf("unexpected completion ref_file_ids: %#v", ds.completionReq["ref_file_ids"])
@@ -199,7 +208,7 @@ func TestResponsesUploadsInlineFilesBeforeCompletion(t *testing.T) {
func TestChatCompletionsInlineUploadFailureReturnsBadRequest(t *testing.T) {
ds := &inlineUploadDSStub{}
h := &openAITestSurface{Store: mockOpenAIConfig{wideInput: true}, Auth: streamStatusAuthStub{}, DS: ds}
h := &openAITestSurface{Store: mockOpenAIConfig{}, Auth: streamStatusAuthStub{}, DS: ds}
reqBody := `{"model":"deepseek-v4-flash","messages":[{"role":"user","content":[{"type":"image_url","image_url":{"url":"data:image/png;base64,%%%"}}]}],"stream":false}`
req := httptest.NewRequest(http.MethodPost, "/v1/chat/completions", strings.NewReader(reqBody))
req.Header.Set("Authorization", "Bearer direct-token")
@@ -218,7 +227,7 @@ func TestChatCompletionsInlineUploadFailureReturnsBadRequest(t *testing.T) {
func TestChatCompletionsInlineUploadLimitReturnsBadRequest(t *testing.T) {
ds := &inlineUploadDSStub{}
h := &openAITestSurface{Store: mockOpenAIConfig{wideInput: true}, Auth: streamStatusAuthStub{}, DS: ds}
h := &openAITestSurface{Store: mockOpenAIConfig{}, Auth: streamStatusAuthStub{}, DS: ds}
content := []any{map[string]any{"type": "input_text", "text": "hi"}}
for i := 0; i < 51; i++ {
content = append(content, map[string]any{
@@ -257,7 +266,7 @@ func TestChatCompletionsInlineUploadLimitReturnsBadRequest(t *testing.T) {
func TestResponsesInlineUploadFailureReturnsInternalServerError(t *testing.T) {
ds := &inlineUploadDSStub{uploadErr: errors.New("boom")}
h := &openAITestSurface{Store: mockOpenAIConfig{wideInput: true}, Auth: streamStatusAuthStub{}, DS: ds}
h := &openAITestSurface{Store: mockOpenAIConfig{}, Auth: streamStatusAuthStub{}, DS: ds}
r := chi.NewRouter()
registerOpenAITestRoutes(r, h)
reqBody := `{"model":"deepseek-v4-flash","input":[{"role":"user","content":[{"type":"image_url","image_url":{"url":"data:image/png;base64,QUJDRA=="}}]}],"stream":false}`
@@ -280,7 +289,7 @@ func TestVercelPrepareUploadsInlineFilesBeforeLeasePayload(t *testing.T) {
t.Setenv("VERCEL", "1")
t.Setenv("DS2API_VERCEL_INTERNAL_SECRET", "stream-secret")
ds := &inlineUploadDSStub{}
h := &openAITestSurface{Store: mockOpenAIConfig{wideInput: true}, Auth: streamStatusAuthStub{}, DS: ds}
h := &openAITestSurface{Store: mockOpenAIConfig{}, Auth: streamStatusAuthStub{}, DS: ds}
r := chi.NewRouter()
registerOpenAITestRoutes(r, h)
reqBody := `{"model":"deepseek-v4-flash","messages":[{"role":"user","content":[{"type":"input_text","text":"hi"},{"type":"image_url","image_url":{"url":"data:image/png;base64,QUJDRA=="}}]}],"stream":true}`

View File

@@ -12,6 +12,7 @@ import (
"strings"
"ds2api/internal/auth"
"ds2api/internal/config"
dsclient "ds2api/internal/deepseek/client"
"ds2api/internal/httpapi/openai/shared"
"ds2api/internal/promptcompat"
@@ -42,6 +43,7 @@ type inlineUploadState struct {
ctx context.Context
handler *Handler
auth *auth.RequestAuth
modelType string
uploadedByID map[string]string
uploadCount int
inlineFileBytes int
@@ -58,10 +60,19 @@ func (h *Handler) PreprocessInlineFileInputs(ctx context.Context, a *auth.Reques
if h == nil || h.DS == nil || len(req) == 0 {
return nil
}
modelType := "default"
if requestedModel, ok := req["model"].(string); ok {
if resolvedModel, ok := config.ResolveModel(h.Store, requestedModel); ok {
if resolvedType, ok := config.GetModelType(resolvedModel); ok {
modelType = resolvedType
}
}
}
state := &inlineUploadState{
ctx: ctx,
handler: h,
auth: a,
modelType: modelType,
uploadedByID: map[string]string{},
}
for _, key := range []string{"messages", "input", "attachments"} {
@@ -174,6 +185,7 @@ func (s *inlineUploadState) uploadInlineFile(file inlineDecodedFile) (string, er
result, err := s.handler.DS.UploadFile(s.ctx, s.auth, dsclient.UploadFileRequest{
Filename: file.Filename,
ContentType: contentType,
ModelType: s.modelType,
Data: file.Data,
}, 3)
if err != nil {

View File

@@ -1,13 +1,18 @@
package files
import (
"context"
"errors"
"io"
"net/http"
"strings"
"time"
"github.com/go-chi/chi/v5"
"ds2api/internal/auth"
"ds2api/internal/chathistory"
"ds2api/internal/config"
dsclient "ds2api/internal/deepseek/client"
"ds2api/internal/httpapi/openai/shared"
)
@@ -21,6 +26,10 @@ type Handler struct {
ChatHistory *chathistory.Store
}
type fileFetcher interface {
FetchUploadedFile(ctx context.Context, a *auth.RequestAuth, fileID string) (*dsclient.UploadFileResult, error)
}
func (h *Handler) UploadFile(w http.ResponseWriter, r *http.Request) {
a, err := h.Auth.Determine(r)
if err != nil {
@@ -66,10 +75,12 @@ func (h *Handler) UploadFile(w http.ResponseWriter, r *http.Request) {
if contentType == "" && len(data) > 0 {
contentType = http.DetectContentType(data)
}
modelType := resolveUploadModelType(h.Store, r)
result, err := h.DS.UploadFile(r.Context(), a, dsclient.UploadFileRequest{
Filename: header.Filename,
ContentType: contentType,
Purpose: strings.TrimSpace(r.FormValue("purpose")),
ModelType: modelType,
Data: data,
}, 3)
if err != nil {
@@ -82,6 +93,70 @@ func (h *Handler) UploadFile(w http.ResponseWriter, r *http.Request) {
shared.WriteJSON(w, http.StatusOK, buildOpenAIFileObject(result))
}
func (h *Handler) RetrieveFile(w http.ResponseWriter, r *http.Request) {
a, err := h.Auth.Determine(r)
if err != nil {
status := http.StatusUnauthorized
detail := err.Error()
if err == auth.ErrNoAccount {
status = http.StatusTooManyRequests
}
shared.WriteOpenAIError(w, status, detail)
return
}
defer h.Auth.Release(a)
fileID := strings.TrimSpace(chi.URLParam(r, "file_id"))
if fileID == "" {
shared.WriteOpenAIError(w, http.StatusBadRequest, "file_id is required")
return
}
fetcher, ok := h.DS.(fileFetcher)
if !ok {
shared.WriteOpenAIError(w, http.StatusNotImplemented, "file retrieval is not available")
return
}
result, err := fetcher.FetchUploadedFile(r.Context(), a, fileID)
if err != nil {
if errors.Is(err, dsclient.ErrUploadFileNotFound) {
shared.WriteOpenAIError(w, http.StatusNotFound, "file not found")
return
}
shared.WriteOpenAIError(w, http.StatusInternalServerError, "Failed to retrieve file.")
return
}
if result != nil && result.AccountID == "" {
result.AccountID = a.AccountID
}
shared.WriteJSON(w, http.StatusOK, buildOpenAIFileObject(result))
}
func resolveUploadModelType(store shared.ConfigReader, r *http.Request) string {
for _, candidate := range []string{r.FormValue("model_type"), r.Header.Get("X-Model-Type")} {
if modelType := normalizeUploadModelType(candidate); modelType != "" {
return modelType
}
}
requestedModel := strings.TrimSpace(r.FormValue("model"))
if requestedModel != "" {
if resolvedModel, ok := config.ResolveModel(store, requestedModel); ok {
if modelType, ok := config.GetModelType(resolvedModel); ok {
return modelType
}
}
}
return "default"
}
func normalizeUploadModelType(raw string) string {
switch strings.ToLower(strings.TrimSpace(raw)) {
case "default", "expert", "vision":
return strings.ToLower(strings.TrimSpace(raw))
default:
return ""
}
}
func buildOpenAIFileObject(result *dsclient.UploadFileResult) map[string]any {
if result == nil {
obj := map[string]any{

View File

@@ -43,6 +43,7 @@ func (managedFilesAuthStub) Release(_ *auth.RequestAuth) {}
type filesRouteDSStub struct {
lastReq dsclient.UploadFileRequest
upload *dsclient.UploadFileResult
fetched *dsclient.UploadFileResult
err error
}
@@ -65,6 +66,16 @@ func (m *filesRouteDSStub) UploadFile(_ context.Context, _ *auth.RequestAuth, re
return &dsclient.UploadFileResult{ID: "file-123", Filename: req.Filename, Bytes: int64(len(req.Data)), Purpose: req.Purpose, Status: "uploaded"}, nil
}
func (m *filesRouteDSStub) FetchUploadedFile(_ context.Context, _ *auth.RequestAuth, fileID string) (*dsclient.UploadFileResult, error) {
if m.err != nil {
return nil, m.err
}
if m.fetched != nil {
return m.fetched, nil
}
return &dsclient.UploadFileResult{ID: fileID, Filename: "notes.txt", Bytes: 11, Purpose: "assistants", Status: "processed"}, nil
}
func (m *filesRouteDSStub) CallCompletion(_ context.Context, _ *auth.RequestAuth, _ map[string]any, _ string, _ int) (*http.Response, error) {
return nil, errors.New("not implemented")
}
@@ -77,7 +88,7 @@ func (m *filesRouteDSStub) DeleteAllSessionsForToken(_ context.Context, _ string
return nil
}
func newMultipartUploadRequest(t *testing.T, purpose string, filename string, data []byte) *http.Request {
func newMultipartUploadRequest(t *testing.T, purpose string, filename string, data []byte, model string) *http.Request {
t.Helper()
var body bytes.Buffer
writer := multipart.NewWriter(&body)
@@ -86,6 +97,11 @@ func newMultipartUploadRequest(t *testing.T, purpose string, filename string, da
t.Fatalf("write purpose failed: %v", err)
}
}
if model != "" {
if err := writer.WriteField("model", model); err != nil {
t.Fatalf("write model failed: %v", err)
}
}
part, err := writer.CreateFormFile("file", filename)
if err != nil {
t.Fatalf("create form file failed: %v", err)
@@ -104,11 +120,11 @@ func newMultipartUploadRequest(t *testing.T, purpose string, filename string, da
func TestFilesRouteUploadSuccess(t *testing.T) {
ds := &filesRouteDSStub{}
h := &openAITestSurface{Store: mockOpenAIConfig{wideInput: true}, Auth: streamStatusAuthStub{}, DS: ds}
h := &openAITestSurface{Store: mockOpenAIConfig{}, Auth: streamStatusAuthStub{}, DS: ds}
r := chi.NewRouter()
registerOpenAITestRoutes(r, h)
req := newMultipartUploadRequest(t, "assistants", "notes.txt", []byte("hello world"))
req := newMultipartUploadRequest(t, "assistants", "notes.txt", []byte("hello world"), "deepseek-v4-vision")
rec := httptest.NewRecorder()
r.ServeHTTP(rec, req)
@@ -121,6 +137,9 @@ func TestFilesRouteUploadSuccess(t *testing.T) {
if ds.lastReq.Purpose != "assistants" {
t.Fatalf("expected purpose assistants, got %q", ds.lastReq.Purpose)
}
if ds.lastReq.ModelType != "vision" {
t.Fatalf("expected vision model type, got %q", ds.lastReq.ModelType)
}
if string(ds.lastReq.Data) != "hello world" {
t.Fatalf("unexpected uploaded data: %q", string(ds.lastReq.Data))
}
@@ -141,11 +160,11 @@ func TestFilesRouteUploadSuccess(t *testing.T) {
func TestFilesRouteUploadIncludesAccountIDForManagedAccount(t *testing.T) {
ds := &filesRouteDSStub{}
h := &openAITestSurface{Store: mockOpenAIConfig{wideInput: true}, Auth: managedFilesAuthStub{}, DS: ds}
h := &openAITestSurface{Store: mockOpenAIConfig{}, Auth: managedFilesAuthStub{}, DS: ds}
r := chi.NewRouter()
registerOpenAITestRoutes(r, h)
req := newMultipartUploadRequest(t, "assistants", "notes.txt", []byte("hello world"))
req := newMultipartUploadRequest(t, "assistants", "notes.txt", []byte("hello world"), "deepseek-v4-vision")
rec := httptest.NewRecorder()
r.ServeHTTP(rec, req)
@@ -161,8 +180,56 @@ func TestFilesRouteUploadIncludesAccountIDForManagedAccount(t *testing.T) {
}
}
func TestFilesRouteRetrieveSuccess(t *testing.T) {
ds := &filesRouteDSStub{fetched: &dsclient.UploadFileResult{
ID: "file-123",
Filename: "notes.txt",
Bytes: 11,
Purpose: "assistants",
Status: "processed",
}}
h := &openAITestSurface{Store: mockOpenAIConfig{}, Auth: managedFilesAuthStub{}, DS: ds}
r := chi.NewRouter()
registerOpenAITestRoutes(r, h)
req := httptest.NewRequest(http.MethodGet, "/v1/files/file-123", nil)
req.Header.Set("Authorization", "Bearer direct-token")
rec := httptest.NewRecorder()
r.ServeHTTP(rec, req)
if rec.Code != http.StatusOK {
t.Fatalf("expected 200, got %d body=%s", rec.Code, rec.Body.String())
}
var out map[string]any
if err := json.Unmarshal(rec.Body.Bytes(), &out); err != nil {
t.Fatalf("decode response failed: %v body=%s", err, rec.Body.String())
}
if out["id"] != "file-123" || out["filename"] != "notes.txt" || out["status"] != "processed" {
t.Fatalf("unexpected file object: %#v", out)
}
if out["account_id"] != "acct-123" {
t.Fatalf("expected account_id acct-123, got %#v", out["account_id"])
}
}
func TestFilesRouteRetrieveNotFound(t *testing.T) {
ds := &filesRouteDSStub{err: dsclient.ErrUploadFileNotFound}
h := &openAITestSurface{Store: mockOpenAIConfig{}, Auth: streamStatusAuthStub{}, DS: ds}
r := chi.NewRouter()
registerOpenAITestRoutes(r, h)
req := httptest.NewRequest(http.MethodGet, "/v1/files/missing-file", nil)
req.Header.Set("Authorization", "Bearer direct-token")
rec := httptest.NewRecorder()
r.ServeHTTP(rec, req)
if rec.Code != http.StatusNotFound {
t.Fatalf("expected 404, got %d body=%s", rec.Code, rec.Body.String())
}
}
func TestFilesRouteRejectsNonMultipart(t *testing.T) {
h := &openAITestSurface{Store: mockOpenAIConfig{wideInput: true}, Auth: streamStatusAuthStub{}, DS: &filesRouteDSStub{}}
h := &openAITestSurface{Store: mockOpenAIConfig{}, Auth: streamStatusAuthStub{}, DS: &filesRouteDSStub{}}
r := chi.NewRouter()
registerOpenAITestRoutes(r, h)
@@ -178,7 +245,7 @@ func TestFilesRouteRejectsNonMultipart(t *testing.T) {
}
func TestFilesRouteRequiresFileField(t *testing.T) {
h := &openAITestSurface{Store: mockOpenAIConfig{wideInput: true}, Auth: streamStatusAuthStub{}, DS: &filesRouteDSStub{}}
h := &openAITestSurface{Store: mockOpenAIConfig{}, Auth: streamStatusAuthStub{}, DS: &filesRouteDSStub{}}
r := chi.NewRouter()
registerOpenAITestRoutes(r, h)

View File

@@ -7,6 +7,7 @@ import (
"strings"
"ds2api/internal/auth"
"ds2api/internal/config"
dsclient "ds2api/internal/deepseek/client"
"ds2api/internal/httpapi/openai/shared"
"ds2api/internal/promptcompat"
@@ -18,8 +19,22 @@ const (
currentInputPurpose = "assistants"
)
type CurrentInputConfigReader interface {
CurrentInputFileEnabled() bool
CurrentInputFileMinChars() int
}
type CurrentInputUploader interface {
UploadFile(ctx context.Context, a *auth.RequestAuth, req dsclient.UploadFileRequest, maxAttempts int) (*dsclient.UploadFileResult, error)
}
type Service struct {
Store CurrentInputConfigReader
DS CurrentInputUploader
}
func (s Service) ApplyCurrentInputFile(ctx context.Context, a *auth.RequestAuth, stdReq promptcompat.StandardRequest) (promptcompat.StandardRequest, error) {
if s.DS == nil || s.Store == nil || a == nil || !s.Store.CurrentInputFileEnabled() {
if stdReq.CurrentInputFileApplied || s.DS == nil || s.Store == nil || a == nil || !s.Store.CurrentInputFileEnabled() {
return stdReq, nil
}
threshold := s.Store.CurrentInputFileMinChars()
@@ -35,10 +50,15 @@ func (s Service) ApplyCurrentInputFile(ctx context.Context, a *auth.RequestAuth,
if strings.TrimSpace(fileText) == "" {
return stdReq, errors.New("current user input file produced empty transcript")
}
modelType := "default"
if resolvedType, ok := config.GetModelType(stdReq.ResolvedModel); ok {
modelType = resolvedType
}
result, err := s.DS.UploadFile(ctx, a, dsclient.UploadFileRequest{
Filename: currentInputFilename,
ContentType: currentInputContentType,
Purpose: currentInputPurpose,
ModelType: modelType,
Data: []byte(fileText),
}, 3)
if err != nil {
@@ -62,7 +82,7 @@ func (s Service) ApplyCurrentInputFile(ctx context.Context, a *auth.RequestAuth,
stdReq.RefFileIDs = prependUniqueRefFileID(stdReq.RefFileIDs, fileID)
stdReq.FinalPrompt, stdReq.ToolNames = promptcompat.BuildOpenAIPrompt(messages, stdReq.ToolsRaw, "", stdReq.ToolChoice, stdReq.Thinking)
// Token accounting must reflect the actual downstream context:
// the uploaded history.txt file content + the neutral live prompt.
// the uploaded DS2API_HISTORY.txt file content + the continuation live prompt.
stdReq.PromptTokenText = fileText + "\n" + stdReq.FinalPrompt
return stdReq, nil
}
@@ -87,5 +107,22 @@ func latestUserInputForFile(messages []any) (int, string) {
}
func currentInputFilePrompt() string {
return "The current request and prior conversation context have already been provided. Answer the latest user request directly."
return "Continue from the latest state in the attached DS2API_HISTORY.txt context. Treat it as the current working state and answer the latest user request directly."
}
func prependUniqueRefFileID(existing []string, fileID string) []string {
fileID = strings.TrimSpace(fileID)
if fileID == "" {
return existing
}
out := make([]string, 0, len(existing)+1)
out = append(out, fileID)
for _, id := range existing {
trimmed := strings.TrimSpace(id)
if trimmed == "" || strings.EqualFold(trimmed, fileID) {
continue
}
out = append(out, trimmed)
}
return out
}

View File

@@ -1,90 +0,0 @@
package history
import (
"context"
"strings"
"ds2api/internal/auth"
"ds2api/internal/httpapi/openai/shared"
"ds2api/internal/promptcompat"
)
type Service struct {
Store shared.ConfigReader
DS shared.DeepSeekCaller
}
// Apply is retained for legacy compatibility only. The active split path is
// current input file handling in ApplyCurrentInputFile.
func (s Service) Apply(ctx context.Context, a *auth.RequestAuth, stdReq promptcompat.StandardRequest) (promptcompat.StandardRequest, error) {
return stdReq, nil
}
func SplitOpenAIHistoryMessages(messages []any, triggerAfterTurns int) ([]any, []any) {
if triggerAfterTurns <= 0 {
triggerAfterTurns = 1
}
lastUserIndex := -1
userTurns := 0
for i, raw := range messages {
msg, ok := raw.(map[string]any)
if !ok {
continue
}
role := strings.ToLower(strings.TrimSpace(shared.AsString(msg["role"])))
if role != "user" {
continue
}
userTurns++
lastUserIndex = i
}
if userTurns <= triggerAfterTurns || lastUserIndex < 0 {
return messages, nil
}
promptMessages := make([]any, 0, len(messages)-lastUserIndex)
historyMessages := make([]any, 0, lastUserIndex)
for i, raw := range messages {
msg, ok := raw.(map[string]any)
if !ok {
if i >= lastUserIndex {
promptMessages = append(promptMessages, raw)
} else {
historyMessages = append(historyMessages, raw)
}
continue
}
role := strings.ToLower(strings.TrimSpace(shared.AsString(msg["role"])))
switch role {
case "system", "developer":
promptMessages = append(promptMessages, raw)
default:
if i >= lastUserIndex {
promptMessages = append(promptMessages, raw)
} else {
historyMessages = append(historyMessages, raw)
}
}
}
if len(promptMessages) == 0 {
return messages, nil
}
return promptMessages, historyMessages
}
func prependUniqueRefFileID(existing []string, fileID string) []string {
fileID = strings.TrimSpace(fileID)
if fileID == "" {
return existing
}
out := make([]string, 0, len(existing)+1)
out = append(out, fileID)
for _, id := range existing {
trimmed := strings.TrimSpace(id)
if trimmed == "" || strings.EqualFold(trimmed, fileID) {
continue
}
out = append(out, trimmed)
}
return out
}

View File

@@ -61,57 +61,34 @@ func (streamStatusManagedAuthStub) DetermineCaller(_ *http.Request) (*auth.Reque
func (streamStatusManagedAuthStub) Release(_ *auth.RequestAuth) {}
func TestBuildOpenAICurrentInputContextTranscriptUsesInjectedFileWrapper(t *testing.T) {
_, historyMessages := splitOpenAIHistoryMessages(historySplitTestMessages(), 1)
transcript := buildOpenAICurrentInputContextTranscript(historyMessages)
func TestBuildOpenAICurrentInputContextTranscriptUsesNumberedHistorySections(t *testing.T) {
transcript := buildOpenAICurrentInputContextTranscript(historySplitTestMessages())
if strings.Contains(transcript, "[file content end]") || strings.Contains(transcript, "[file content begin]") || strings.Contains(transcript, "[file name]:") {
t.Fatalf("expected plain transcript without file wrapper tags, got %q", transcript)
t.Fatalf("expected transcript without file wrapper tags, got %q", transcript)
}
if !strings.Contains(transcript, "<begin▁of▁sentence>") {
t.Fatalf("expected serialized conversation markers, got %q", transcript)
if !strings.Contains(transcript, "# DS2API_HISTORY.txt") {
t.Fatalf("expected history transcript header, got %q", transcript)
}
if !strings.Contains(transcript, "first user turn") || !strings.Contains(transcript, "tool result") {
t.Fatalf("expected historical turns preserved, got %q", transcript)
if !strings.Contains(transcript, "Prior conversation history and tool progress.") {
t.Fatalf("expected history transcript description, got %q", transcript)
}
if !strings.Contains(transcript, "[reasoning_content]") || !strings.Contains(transcript, "hidden reasoning") {
t.Fatalf("expected reasoning block preserved, got %q", transcript)
}
if !strings.Contains(transcript, "<|DSML|tool_calls>") {
t.Fatalf("expected tool calls preserved, got %q", transcript)
}
}
func TestSplitOpenAIHistoryMessagesUsesLatestUserTurn(t *testing.T) {
messages := []any{
map[string]any{"role": "system", "content": "system instructions"},
map[string]any{"role": "user", "content": "first user turn"},
map[string]any{"role": "assistant", "content": "first assistant turn"},
map[string]any{"role": "user", "content": "middle user turn"},
map[string]any{"role": "assistant", "content": "middle assistant turn"},
map[string]any{"role": "user", "content": "latest user turn"},
}
promptMessages, historyMessages := splitOpenAIHistoryMessages(messages, 1)
if len(promptMessages) == 0 || len(historyMessages) == 0 {
t.Fatalf("expected both prompt and history messages, got prompt=%d history=%d", len(promptMessages), len(historyMessages))
}
promptText, _ := promptcompat.BuildOpenAIPrompt(promptMessages, nil, "", defaultToolChoicePolicy(), true)
if !strings.Contains(promptText, "latest user turn") {
t.Fatalf("expected latest user turn in prompt, got %s", promptText)
}
if strings.Contains(promptText, "middle user turn") {
t.Fatalf("expected middle user turn to be moved into history, got %s", promptText)
}
historyText := buildOpenAICurrentInputContextTranscript(historyMessages)
if !strings.Contains(historyText, "middle user turn") {
t.Fatalf("expected middle user turn in split history, got %s", historyText)
}
if strings.Contains(historyText, "latest user turn") {
t.Fatalf("expected latest user turn to remain live, got %s", historyText)
for _, want := range []string{
"=== 1. SYSTEM ===",
"=== 2. USER ===",
"=== 3. ASSISTANT ===",
"=== 4. TOOL ===",
"=== 5. USER ===",
"first user turn",
"tool result",
"latest user turn",
"[reasoning_content]",
"hidden reasoning",
"<|DSML|tool_calls>",
} {
if !strings.Contains(transcript, want) {
t.Fatalf("expected transcript to contain %q, got %q", want, transcript)
}
}
}
@@ -119,7 +96,6 @@ func TestApplyCurrentInputFileSkipsShortInputWhenThresholdNotReached(t *testing.
ds := &inlineUploadDSStub{}
h := &openAITestSurface{
Store: mockOpenAIConfig{
wideInput: true,
currentInputEnabled: true,
currentInputMin: 10,
},
@@ -152,7 +128,6 @@ func TestApplyThinkingInjectionAppendsLatestUserPrompt(t *testing.T) {
ds := &inlineUploadDSStub{}
h := &openAITestSurface{
Store: mockOpenAIConfig{
wideInput: true,
thinkingInjection: boolPtr(true),
},
DS: ds,
@@ -184,7 +159,6 @@ func TestApplyThinkingInjectionUsesCustomPrompt(t *testing.T) {
ds := &inlineUploadDSStub{}
h := &openAITestSurface{
Store: mockOpenAIConfig{
wideInput: true,
thinkingInjection: boolPtr(true),
thinkingPrompt: "custom thinking format",
},
@@ -214,13 +188,12 @@ func TestApplyCurrentInputFileDisabledPassThrough(t *testing.T) {
ds := &inlineUploadDSStub{}
h := &openAITestSurface{
Store: mockOpenAIConfig{
wideInput: true,
currentInputEnabled: false,
},
DS: ds,
}
req := map[string]any{
"model": "deepseek-v4-flash",
"model": "deepseek-v4-vision",
"messages": historySplitTestMessages(),
}
stdReq, err := promptcompat.NormalizeOpenAIChatRequest(h.Store, req, "")
@@ -243,11 +216,10 @@ func TestApplyCurrentInputFileDisabledPassThrough(t *testing.T) {
}
}
func TestApplyCurrentInputFileUploadsFirstTurnWithInjectedWrapper(t *testing.T) {
func TestApplyCurrentInputFileUploadsFirstTurnWithNumberedHistoryTranscript(t *testing.T) {
ds := &inlineUploadDSStub{}
h := &openAITestSurface{
Store: mockOpenAIConfig{
wideInput: true,
currentInputEnabled: true,
currentInputMin: 10,
thinkingInjection: boolPtr(true),
@@ -273,15 +245,21 @@ func TestApplyCurrentInputFileUploadsFirstTurnWithInjectedWrapper(t *testing.T)
t.Fatalf("expected 1 current input upload, got %d", len(ds.uploadCalls))
}
upload := ds.uploadCalls[0]
if upload.Filename != "history.txt" {
if upload.Filename != "DS2API_HISTORY.txt" {
t.Fatalf("unexpected upload filename: %q", upload.Filename)
}
uploadedText := string(upload.Data)
if strings.Contains(uploadedText, "[file content end]") || strings.Contains(uploadedText, "[file content begin]") || strings.Contains(uploadedText, "[file name]:") {
t.Fatalf("expected uploaded transcript without file wrapper tags, got %q", uploadedText)
}
if !strings.Contains(uploadedText, "<begin▁of▁sentence><User>first turn content that is long enough") {
t.Fatalf("expected serialized current user turn markers, got %q", uploadedText)
for _, want := range []string{
"# DS2API_HISTORY.txt",
"=== 1. USER ===",
"first turn content that is long enough",
} {
if !strings.Contains(uploadedText, want) {
t.Fatalf("expected uploaded transcript to contain %q, got %q", want, uploadedText)
}
}
if !strings.Contains(uploadedText, promptcompat.ThinkingInjectionMarker) {
t.Fatalf("expected thinking injection in current input file, got %q", uploadedText)
@@ -290,11 +268,11 @@ func TestApplyCurrentInputFileUploadsFirstTurnWithInjectedWrapper(t *testing.T)
if strings.Contains(out.FinalPrompt, "first turn content that is long enough") {
t.Fatalf("expected current input text to be replaced in live prompt, got %s", out.FinalPrompt)
}
if strings.Contains(out.FinalPrompt, "CURRENT_USER_INPUT.txt") || strings.Contains(out.FinalPrompt, "history.txt") || strings.Contains(out.FinalPrompt, "Read that file") {
if strings.Contains(out.FinalPrompt, "CURRENT_USER_INPUT.txt") || strings.Contains(out.FinalPrompt, "Read that file") {
t.Fatalf("expected live prompt not to instruct file reads, got %s", out.FinalPrompt)
}
if !strings.Contains(out.FinalPrompt, "Answer the latest user request directly.") {
t.Fatalf("expected neutral continuation instruction in live prompt, got %s", out.FinalPrompt)
if !strings.Contains(out.FinalPrompt, "Continue from the latest state in the attached DS2API_HISTORY.txt context.") {
t.Fatalf("expected continuation-oriented prompt in live prompt, got %s", out.FinalPrompt)
}
if len(out.RefFileIDs) != 1 || out.RefFileIDs[0] != "file-inline-1" {
t.Fatalf("expected current input file id in ref_file_ids, got %#v", out.RefFileIDs)
@@ -302,13 +280,15 @@ func TestApplyCurrentInputFileUploadsFirstTurnWithInjectedWrapper(t *testing.T)
if !strings.Contains(out.PromptTokenText, "first turn content that is long enough") {
t.Fatalf("expected prompt token text to preserve original full context, got %q", out.PromptTokenText)
}
if !strings.Contains(out.PromptTokenText, "# DS2API_HISTORY.txt") || !strings.Contains(out.PromptTokenText, "=== 1. USER ===") {
t.Fatalf("expected prompt token text to include numbered history transcript, got %q", out.PromptTokenText)
}
}
func TestApplyCurrentInputFilePreservesFullContextPromptForTokenCounting(t *testing.T) {
ds := &inlineUploadDSStub{}
h := &openAITestSurface{
Store: mockOpenAIConfig{
wideInput: true,
currentInputEnabled: true,
currentInputMin: 0,
thinkingInjection: boolPtr(true),
@@ -316,7 +296,7 @@ func TestApplyCurrentInputFilePreservesFullContextPromptForTokenCounting(t *test
DS: ds,
}
req := map[string]any{
"model": "deepseek-v4-flash",
"model": "deepseek-v4-vision",
"messages": historySplitTestMessages(),
}
stdReq, err := promptcompat.NormalizeOpenAIChatRequest(h.Store, req, "")
@@ -337,10 +317,13 @@ func TestApplyCurrentInputFilePreservesFullContextPromptForTokenCounting(t *test
t.Fatalf("expected prompt token text to contain file context with full conversation, got %q", out.PromptTokenText)
}
if strings.Contains(out.PromptTokenText, "[file content end]") || strings.Contains(out.PromptTokenText, "[file name]:") {
t.Fatalf("expected prompt token text to use raw transcript without wrapper tags, got %q", out.PromptTokenText)
t.Fatalf("expected prompt token text to omit file wrapper tags, got %q", out.PromptTokenText)
}
if !strings.Contains(out.PromptTokenText, "Answer the latest user request directly.") {
t.Fatalf("expected prompt token text to also include neutral live prompt, got %q", out.PromptTokenText)
if !strings.Contains(out.PromptTokenText, "# DS2API_HISTORY.txt") || !strings.Contains(out.PromptTokenText, "=== 1. SYSTEM ===") {
t.Fatalf("expected prompt token text to include numbered history transcript, got %q", out.PromptTokenText)
}
if !strings.Contains(out.PromptTokenText, "Continue from the latest state in the attached DS2API_HISTORY.txt context.") {
t.Fatalf("expected prompt token text to also include continuation prompt, got %q", out.PromptTokenText)
}
if strings.Contains(out.FinalPrompt, "first user turn") || strings.Contains(out.FinalPrompt, "latest user turn") {
t.Fatalf("expected live prompt to hide original turns, got %q", out.FinalPrompt)
@@ -351,7 +334,6 @@ func TestApplyCurrentInputFileUploadsFullContextFile(t *testing.T) {
ds := &inlineUploadDSStub{}
h := &openAITestSurface{
Store: mockOpenAIConfig{
wideInput: true,
currentInputEnabled: true,
currentInputMin: 0,
thinkingInjection: boolPtr(true),
@@ -359,7 +341,7 @@ func TestApplyCurrentInputFileUploadsFullContextFile(t *testing.T) {
DS: ds,
}
req := map[string]any{
"model": "deepseek-v4-flash",
"model": "deepseek-v4-vision",
"messages": historySplitTestMessages(),
}
stdReq, err := promptcompat.NormalizeOpenAIChatRequest(h.Store, req, "")
@@ -378,20 +360,23 @@ func TestApplyCurrentInputFileUploadsFullContextFile(t *testing.T) {
t.Fatalf("expected one current input upload, got %d", len(ds.uploadCalls))
}
upload := ds.uploadCalls[0]
if upload.Filename != "history.txt" {
t.Fatalf("expected history.txt upload, got %q", upload.Filename)
if upload.Filename != "DS2API_HISTORY.txt" {
t.Fatalf("expected DS2API_HISTORY.txt upload, got %q", upload.Filename)
}
if upload.ModelType != "vision" {
t.Fatalf("expected vision model type for vision request, got %q", upload.ModelType)
}
uploadedText := string(upload.Data)
for _, want := range []string{"system instructions", "first user turn", "hidden reasoning", "tool result", "latest user turn", promptcompat.ThinkingInjectionMarker} {
for _, want := range []string{"# DS2API_HISTORY.txt", "=== 1. SYSTEM ===", "=== 2. USER ===", "=== 3. ASSISTANT ===", "=== 4. TOOL ===", "=== 5. USER ===", "system instructions", "first user turn", "hidden reasoning", "tool result", "latest user turn", promptcompat.ThinkingInjectionMarker} {
if !strings.Contains(uploadedText, want) {
t.Fatalf("expected full context file to contain %q, got %q", want, uploadedText)
}
}
if strings.Contains(out.FinalPrompt, "first user turn") || strings.Contains(out.FinalPrompt, "latest user turn") || strings.Contains(out.FinalPrompt, "CURRENT_USER_INPUT.txt") || strings.Contains(out.FinalPrompt, "history.txt") || strings.Contains(out.FinalPrompt, "Read that file") {
t.Fatalf("expected live prompt to use only a neutral continuation instruction, got %s", out.FinalPrompt)
if strings.Contains(out.FinalPrompt, "first user turn") || strings.Contains(out.FinalPrompt, "latest user turn") || strings.Contains(out.FinalPrompt, "CURRENT_USER_INPUT.txt") || strings.Contains(out.FinalPrompt, "Read that file") {
t.Fatalf("expected live prompt to use only a continuation instruction, got %s", out.FinalPrompt)
}
if !strings.Contains(out.FinalPrompt, "Answer the latest user request directly.") {
t.Fatalf("expected neutral continuation instruction in live prompt, got %s", out.FinalPrompt)
if !strings.Contains(out.FinalPrompt, "Continue from the latest state in the attached DS2API_HISTORY.txt context.") {
t.Fatalf("expected continuation-oriented prompt in live prompt, got %s", out.FinalPrompt)
}
}
@@ -399,7 +384,6 @@ func TestApplyCurrentInputFileCarriesHistoryText(t *testing.T) {
ds := &inlineUploadDSStub{}
h := &openAITestSurface{
Store: mockOpenAIConfig{
wideInput: true,
currentInputEnabled: true,
},
DS: ds,
@@ -423,13 +407,15 @@ func TestApplyCurrentInputFileCarriesHistoryText(t *testing.T) {
if out.HistoryText != string(ds.uploadCalls[0].Data) {
t.Fatalf("expected current input file flow to preserve uploaded text in history, got %q", out.HistoryText)
}
if !strings.Contains(out.HistoryText, "# DS2API_HISTORY.txt") || !strings.Contains(out.HistoryText, "=== 1. SYSTEM ===") {
t.Fatalf("expected history text to use numbered transcript format, got %q", out.HistoryText)
}
}
func TestChatCompletionsCurrentInputFileUploadsContextAndKeepsNeutralPrompt(t *testing.T) {
ds := &inlineUploadDSStub{}
h := &openAITestSurface{
Store: mockOpenAIConfig{
wideInput: true,
currentInputEnabled: true,
},
Auth: streamStatusAuthStub{},
@@ -454,7 +440,7 @@ func TestChatCompletionsCurrentInputFileUploadsContextAndKeepsNeutralPrompt(t *t
t.Fatalf("expected 1 upload call, got %d", len(ds.uploadCalls))
}
upload := ds.uploadCalls[0]
if upload.Filename != "history.txt" {
if upload.Filename != "DS2API_HISTORY.txt" {
t.Fatalf("unexpected upload filename: %q", upload.Filename)
}
if upload.Purpose != "assistants" {
@@ -462,7 +448,10 @@ func TestChatCompletionsCurrentInputFileUploadsContextAndKeepsNeutralPrompt(t *t
}
historyText := string(upload.Data)
if strings.Contains(historyText, "[file content end]") || strings.Contains(historyText, "[file content begin]") || strings.Contains(historyText, "[file name]:") {
t.Fatalf("expected plain history transcript without wrapper tags, got %s", historyText)
t.Fatalf("expected history transcript without file wrapper tags, got %s", historyText)
}
if !strings.Contains(historyText, "# DS2API_HISTORY.txt") || !strings.Contains(historyText, "=== 1. SYSTEM ===") {
t.Fatalf("expected history transcript to use numbered sections, got %s", historyText)
}
if !strings.Contains(historyText, "latest user turn") {
t.Fatalf("expected full context to include latest turn, got %s", historyText)
@@ -471,8 +460,8 @@ func TestChatCompletionsCurrentInputFileUploadsContextAndKeepsNeutralPrompt(t *t
t.Fatal("expected completion payload to be captured")
}
promptText, _ := ds.completionReq["prompt"].(string)
if !strings.Contains(promptText, "Answer the latest user request directly.") {
t.Fatalf("expected neutral completion prompt, got %s", promptText)
if !strings.Contains(promptText, "Continue from the latest state in the attached DS2API_HISTORY.txt context.") {
t.Fatalf("expected continuation-oriented prompt, got %s", promptText)
}
if strings.Contains(promptText, "first user turn") || strings.Contains(promptText, "latest user turn") {
t.Fatalf("expected prompt to hide original turns, got %s", promptText)
@@ -497,7 +486,6 @@ func TestResponsesCurrentInputFileUploadsContextAndKeepsNeutralPrompt(t *testing
ds := &inlineUploadDSStub{}
h := &openAITestSurface{
Store: mockOpenAIConfig{
wideInput: true,
currentInputEnabled: true,
},
Auth: streamStatusAuthStub{},
@@ -523,12 +511,16 @@ func TestResponsesCurrentInputFileUploadsContextAndKeepsNeutralPrompt(t *testing
if len(ds.uploadCalls) != 1 {
t.Fatalf("expected 1 upload call, got %d", len(ds.uploadCalls))
}
historyText := string(ds.uploadCalls[0].Data)
if !strings.Contains(historyText, "# DS2API_HISTORY.txt") || !strings.Contains(historyText, "=== 1. SYSTEM ===") {
t.Fatalf("expected uploaded history text to use numbered transcript format, got %s", historyText)
}
if ds.completionReq == nil {
t.Fatal("expected completion payload to be captured")
}
promptText, _ := ds.completionReq["prompt"].(string)
if !strings.Contains(promptText, "Answer the latest user request directly.") {
t.Fatalf("expected neutral completion prompt, got %s", promptText)
if !strings.Contains(promptText, "Continue from the latest state in the attached DS2API_HISTORY.txt context.") {
t.Fatalf("expected continuation-oriented prompt, got %s", promptText)
}
if strings.Contains(promptText, "first user turn") || strings.Contains(promptText, "latest user turn") {
t.Fatalf("expected prompt to hide original turns, got %s", promptText)
@@ -551,7 +543,6 @@ func TestChatCompletionsCurrentInputFileMapsManagedAuthFailureTo401(t *testing.T
}
h := &openAITestSurface{
Store: mockOpenAIConfig{
wideInput: true,
currentInputEnabled: true,
},
Auth: streamStatusManagedAuthStub{},
@@ -583,7 +574,6 @@ func TestResponsesCurrentInputFileMapsDirectAuthFailureTo401(t *testing.T) {
}
h := &openAITestSurface{
Store: mockOpenAIConfig{
wideInput: true,
currentInputEnabled: true,
},
Auth: streamStatusAuthStub{},
@@ -615,7 +605,6 @@ func TestChatCompletionsCurrentInputFileUploadFailureReturnsInternalServerError(
ds := &inlineUploadDSStub{uploadErr: errors.New("boom")}
h := &openAITestSurface{
Store: mockOpenAIConfig{
wideInput: true,
currentInputEnabled: true,
},
Auth: streamStatusAuthStub{},
@@ -644,7 +633,6 @@ func TestCurrentInputFileWorksAcrossAutoDeleteModes(t *testing.T) {
ds := &inlineUploadDSStub{}
h := &openAITestSurface{
Store: mockOpenAIConfig{
wideInput: true,
autoDeleteMode: mode,
currentInputEnabled: true,
},
@@ -669,21 +657,21 @@ func TestCurrentInputFileWorksAcrossAutoDeleteModes(t *testing.T) {
if len(ds.uploadCalls) != 1 {
t.Fatalf("expected current input upload for mode=%s, got %d", mode, len(ds.uploadCalls))
}
historyText := string(ds.uploadCalls[0].Data)
if !strings.Contains(historyText, "# DS2API_HISTORY.txt") || !strings.Contains(historyText, "=== 1. SYSTEM ===") {
t.Fatalf("expected uploaded history text to use numbered transcript format, got %s", historyText)
}
if ds.completionReq == nil {
t.Fatalf("expected completion payload for mode=%s", mode)
}
promptText, _ := ds.completionReq["prompt"].(string)
if !strings.Contains(promptText, "Answer the latest user request directly.") || strings.Contains(promptText, "first user turn") || strings.Contains(promptText, "latest user turn") {
if !strings.Contains(promptText, "Continue from the latest state in the attached DS2API_HISTORY.txt context.") || strings.Contains(promptText, "first user turn") || strings.Contains(promptText, "latest user turn") {
t.Fatalf("unexpected prompt for mode=%s: %s", mode, promptText)
}
})
}
}
func defaultToolChoicePolicy() promptcompat.ToolChoicePolicy {
return promptcompat.DefaultToolChoicePolicy()
}
func boolPtr(v bool) *bool {
return &v
}

View File

@@ -1,7 +1,6 @@
package responses
import (
"context"
"io"
"net/http"
"strings"
@@ -10,128 +9,10 @@ import (
"ds2api/internal/auth"
"ds2api/internal/config"
dsprotocol "ds2api/internal/deepseek/protocol"
openaifmt "ds2api/internal/format/openai"
"ds2api/internal/promptcompat"
"ds2api/internal/sse"
streamengine "ds2api/internal/stream"
"ds2api/internal/toolcall"
)
type responsesNonStreamResult struct {
rawThinking string
rawText string
thinking string
toolDetectionThinking string
text string
contentFilter bool
parsed toolcall.ToolCallParseResult
body map[string]any
responseMessageID int
}
func (h *Handler) handleResponsesNonStreamWithRetry(w http.ResponseWriter, ctx context.Context, a *auth.RequestAuth, resp *http.Response, payload map[string]any, pow, owner, responseID, model, finalPrompt string, refFileTokens int, thinkingEnabled, searchEnabled bool, toolNames []string, toolsRaw any, toolChoice promptcompat.ToolChoicePolicy, traceID string) {
attempts := 0
currentResp := resp
usagePrompt := finalPrompt
accumulatedThinking := ""
accumulatedRawThinking := ""
accumulatedToolDetectionThinking := ""
for {
result, ok := h.collectResponsesNonStreamAttempt(w, currentResp, responseID, model, usagePrompt, thinkingEnabled, searchEnabled, toolNames, toolsRaw)
if !ok {
return
}
accumulatedThinking += sse.TrimContinuationOverlap(accumulatedThinking, result.thinking)
accumulatedRawThinking += sse.TrimContinuationOverlap(accumulatedRawThinking, result.rawThinking)
accumulatedToolDetectionThinking += sse.TrimContinuationOverlap(accumulatedToolDetectionThinking, result.toolDetectionThinking)
result.thinking = accumulatedThinking
result.rawThinking = accumulatedRawThinking
result.toolDetectionThinking = accumulatedToolDetectionThinking
result.parsed = detectAssistantToolCalls(result.rawText, result.text, result.rawThinking, result.toolDetectionThinking, toolNames)
result.body = openaifmt.BuildResponseObjectWithToolCalls(responseID, model, usagePrompt, result.thinking, result.text, result.parsed.Calls, toolsRaw)
if refFileTokens > 0 {
addRefFileTokensToUsage(result.body, refFileTokens)
}
if !shouldRetryResponsesNonStream(result, attempts) {
h.finishResponsesNonStreamResult(w, result, attempts, owner, responseID, toolChoice, traceID)
return
}
attempts++
config.Logger.Info("[openai_empty_retry] attempting synthetic retry", "surface", "responses", "stream", false, "retry_attempt", attempts, "parent_message_id", result.responseMessageID)
retryPow, powErr := h.DS.GetPow(ctx, a, 3)
if powErr != nil {
config.Logger.Warn("[openai_empty_retry] retry PoW fetch failed, falling back to original PoW", "surface", "responses", "stream", false, "retry_attempt", attempts, "error", powErr)
retryPow = pow
}
nextResp, err := h.DS.CallCompletion(ctx, a, clonePayloadForEmptyOutputRetry(payload, result.responseMessageID), retryPow, 3)
if err != nil {
writeOpenAIError(w, http.StatusInternalServerError, "Failed to get completion.")
config.Logger.Warn("[openai_empty_retry] retry request failed", "surface", "responses", "stream", false, "retry_attempt", attempts, "error", err)
return
}
usagePrompt = usagePromptWithEmptyOutputRetry(usagePrompt, attempts)
currentResp = nextResp
}
}
func (h *Handler) collectResponsesNonStreamAttempt(w http.ResponseWriter, resp *http.Response, responseID, model, usagePrompt string, thinkingEnabled, searchEnabled bool, toolNames []string, toolsRaw any) (responsesNonStreamResult, bool) {
defer func() { _ = resp.Body.Close() }()
if resp.StatusCode != http.StatusOK {
body, _ := io.ReadAll(resp.Body)
writeOpenAIError(w, resp.StatusCode, strings.TrimSpace(string(body)))
return responsesNonStreamResult{}, false
}
result := sse.CollectStream(resp, thinkingEnabled, false)
stripReferenceMarkers := h.compatStripReferenceMarkers()
sanitizedThinking := cleanVisibleOutput(result.Thinking, stripReferenceMarkers)
sanitizedText := cleanVisibleOutput(result.Text, stripReferenceMarkers)
if searchEnabled {
sanitizedText = replaceCitationMarkersWithLinks(sanitizedText, result.CitationLinks)
}
textParsed := detectAssistantToolCalls(result.Text, sanitizedText, result.Thinking, result.ToolDetectionThinking, toolNames)
responseObj := openaifmt.BuildResponseObjectWithToolCalls(responseID, model, usagePrompt, sanitizedThinking, sanitizedText, textParsed.Calls, toolsRaw)
return responsesNonStreamResult{
rawThinking: result.Thinking,
rawText: result.Text,
thinking: sanitizedThinking,
toolDetectionThinking: result.ToolDetectionThinking,
text: sanitizedText,
contentFilter: result.ContentFilter,
parsed: textParsed,
body: responseObj,
responseMessageID: result.ResponseMessageID,
}, true
}
func (h *Handler) finishResponsesNonStreamResult(w http.ResponseWriter, result responsesNonStreamResult, attempts int, owner, responseID string, toolChoice promptcompat.ToolChoicePolicy, traceID string) {
if len(result.parsed.Calls) == 0 && writeUpstreamEmptyOutputError(w, result.text, result.thinking, result.contentFilter) {
config.Logger.Info("[openai_empty_retry] terminal empty output", "surface", "responses", "stream", false, "retry_attempts", attempts, "success_source", "none", "content_filter", result.contentFilter)
return
}
logResponsesToolPolicyRejection(traceID, toolChoice, result.parsed, "text")
if toolChoice.IsRequired() && len(result.parsed.Calls) == 0 {
writeOpenAIErrorWithCode(w, http.StatusUnprocessableEntity, "tool_choice requires at least one valid tool call.", "tool_choice_violation")
return
}
h.getResponseStore().put(owner, responseID, result.body)
writeJSON(w, http.StatusOK, result.body)
source := "first_attempt"
if attempts > 0 {
source = "synthetic_retry"
}
config.Logger.Info("[openai_empty_retry] completed", "surface", "responses", "stream", false, "retry_attempts", attempts, "success_source", source)
}
func shouldRetryResponsesNonStream(result responsesNonStreamResult, attempts int) bool {
return emptyOutputRetryEnabled() &&
attempts < emptyOutputRetryMaxAttempts() &&
!result.contentFilter &&
len(result.parsed.Calls) == 0 &&
strings.TrimSpace(result.text) == ""
}
func (h *Handler) handleResponsesStreamWithRetry(w http.ResponseWriter, r *http.Request, a *auth.RequestAuth, resp *http.Response, payload map[string]any, pow, owner, responseID, model, finalPrompt string, refFileTokens int, thinkingEnabled, searchEnabled bool, toolNames []string, toolsRaw any, toolChoice promptcompat.ToolChoicePolicy, traceID string) {
streamRuntime, initialType, ok := h.prepareResponsesStreamRuntime(w, resp, owner, responseID, model, finalPrompt, refFileTokens, thinkingEnabled, searchEnabled, toolNames, toolsRaw, toolChoice, traceID)
if !ok {
@@ -193,7 +74,7 @@ func (h *Handler) prepareResponsesStreamRuntime(w http.ResponseWriter, resp *htt
}
streamRuntime := newResponsesStreamRuntime(
w, rc, canFlush, responseID, model, finalPrompt, thinkingEnabled, searchEnabled,
h.compatStripReferenceMarkers(), toolNames, toolsRaw, len(toolNames) > 0,
stripReferenceMarkersEnabled(), toolNames, toolsRaw, len(toolNames) > 0,
h.toolcallFeatureMatchEnabled() && h.toolcallEarlyEmitHighConfidence(),
toolChoice, traceID, func(obj map[string]any) {
h.getResponseStore().put(owner, responseID, obj)
@@ -222,7 +103,13 @@ func (h *Handler) consumeResponsesStreamAttempt(r *http.Request, resp *http.Resp
finalReason = "content_filter"
}
},
OnContextDone: func() {
streamRuntime.markContextCancelled()
},
})
if streamRuntime.finalErrorCode == string(streamengine.StopReasonContextCancelled) {
return true, false
}
terminalWritten := streamRuntime.finalize(finalReason, allowDeferEmpty && finalReason != "content_filter")
if terminalWritten {
return true, false
@@ -235,6 +122,10 @@ func logResponsesStreamTerminal(streamRuntime *responsesStreamRuntime, attempts
if attempts > 0 {
source = "synthetic_retry"
}
if streamRuntime.finalErrorCode == string(streamengine.StopReasonContextCancelled) {
config.Logger.Info("[openai_empty_retry] terminal cancelled", "surface", "responses", "stream", true, "retry_attempts", attempts, "error_code", streamRuntime.finalErrorCode)
return
}
if streamRuntime.failed {
config.Logger.Info("[openai_empty_retry] terminal empty output", "surface", "responses", "stream", true, "retry_attempts", attempts, "success_source", "none", "error_code", streamRuntime.finalErrorCode)
return

View File

@@ -0,0 +1,70 @@
package responses
import (
"context"
"io"
"net/http"
"net/http/httptest"
"strings"
"testing"
"ds2api/internal/promptcompat"
"ds2api/internal/stream"
)
func makeResponsesOpenAISSEHTTPResponse(lines ...string) *http.Response {
body := strings.Join(lines, "\n")
if !strings.HasSuffix(body, "\n") {
body += "\n"
}
return &http.Response{
StatusCode: http.StatusOK,
Header: make(http.Header),
Body: io.NopCloser(strings.NewReader(body)),
}
}
func TestConsumeResponsesStreamAttemptMarksContextCancelledState(t *testing.T) {
ctx, cancel := context.WithCancel(context.Background())
cancel()
req := httptest.NewRequest(http.MethodPost, "/v1/responses", nil).WithContext(ctx)
rec := httptest.NewRecorder()
streamRuntime := newResponsesStreamRuntime(
rec,
http.NewResponseController(rec),
true,
"resp-cancelled",
"deepseek-v4-flash",
"prompt",
false,
false,
true,
nil,
nil,
false,
false,
promptcompat.DefaultToolChoicePolicy(),
"",
nil,
)
resp := makeResponsesOpenAISSEHTTPResponse(
`data: {"p":"response/content","v":"hello"}`,
`data: [DONE]`,
)
h := &Handler{}
terminalWritten, retryable := h.consumeResponsesStreamAttempt(req, resp, streamRuntime, "text", false, true)
if !terminalWritten || retryable {
t.Fatalf("expected cancelled attempt to terminate without retry, got terminalWritten=%v retryable=%v", terminalWritten, retryable)
}
if !streamRuntime.failed {
t.Fatalf("expected cancelled response stream to be marked failed")
}
if got, want := streamRuntime.finalErrorCode, string(stream.StopReasonContextCancelled); got != want {
t.Fatalf("expected cancelled final error code %q, got %q", want, got)
}
if streamRuntime.finalErrorMessage == "" {
t.Fatalf("expected cancelled final error message to be preserved")
}
}

View File

@@ -11,7 +11,7 @@ import (
"ds2api/internal/httpapi/openai/history"
"ds2api/internal/httpapi/openai/shared"
"ds2api/internal/promptcompat"
"ds2api/internal/toolcall"
"ds2api/internal/textclean"
"ds2api/internal/toolstream"
)
@@ -29,11 +29,8 @@ type Handler struct {
responses *responseStore
}
func (h *Handler) compatStripReferenceMarkers() bool {
if h == nil {
return true
}
return shared.CompatStripReferenceMarkers(h.Store)
func stripReferenceMarkersEnabled() bool {
return textclean.StripReferenceMarkersEnabled()
}
func (h *Handler) applyCurrentInputFile(ctx context.Context, a *auth.RequestAuth, stdReq promptcompat.StandardRequest) (promptcompat.StandardRequest, error) {
@@ -98,18 +95,6 @@ func cleanVisibleOutput(text string, stripReferenceMarkers bool) string {
return shared.CleanVisibleOutput(text, stripReferenceMarkers)
}
func replaceCitationMarkersWithLinks(text string, links map[int]string) string {
return shared.ReplaceCitationMarkersWithLinks(text, links)
}
func upstreamEmptyOutputDetail(contentFilter bool, text, thinking string) (int, string, string) {
return shared.UpstreamEmptyOutputDetail(contentFilter, text, thinking)
}
func writeUpstreamEmptyOutputError(w http.ResponseWriter, text, thinking string, contentFilter bool) bool {
return shared.WriteUpstreamEmptyOutputError(w, text, thinking, contentFilter)
}
func emptyOutputRetryEnabled() bool {
return shared.EmptyOutputRetryEnabled()
}
@@ -129,7 +114,3 @@ func usagePromptWithEmptyOutputRetry(originalPrompt string, retryAttempts int) s
func filterIncrementalToolCallDeltasByAllowed(deltas []toolstream.ToolCallDelta, seenNames map[int]string) []toolstream.ToolCallDelta {
return shared.FilterIncrementalToolCallDeltasByAllowed(deltas, seenNames)
}
func detectAssistantToolCalls(rawText, visibleText, exposedThinking, detectionThinking string, toolNames []string) toolcall.ToolCallParseResult {
return shared.DetectAssistantToolCalls(rawText, visibleText, exposedThinking, detectionThinking, toolNames)
}

View File

@@ -11,7 +11,9 @@ import (
"github.com/go-chi/chi/v5"
"github.com/google/uuid"
"ds2api/internal/assistantturn"
"ds2api/internal/auth"
"ds2api/internal/completionruntime"
"ds2api/internal/config"
dsprotocol "ds2api/internal/deepseek/protocol"
openaifmt "ds2api/internal/format/openai"
@@ -92,34 +94,35 @@ func (h *Handler) Responses(w http.ResponseWriter, r *http.Request) {
return
}
sessionID, err := h.DS.CreateSession(r.Context(), a, 3)
if err != nil {
if a.UseConfigToken {
writeOpenAIError(w, http.StatusUnauthorized, "Account token is invalid. Please re-login the account in admin.")
} else {
writeOpenAIError(w, http.StatusUnauthorized, "Invalid token. If this should be a DS2API key, add it to config.keys first.")
responseID := "resp_" + strings.ReplaceAll(uuid.NewString(), "-", "")
if !stdReq.Stream {
result, outErr := completionruntime.ExecuteNonStreamWithRetry(r.Context(), h.DS, a, stdReq, completionruntime.Options{
StripReferenceMarkers: stripReferenceMarkersEnabled(),
RetryEnabled: true,
CurrentInputFile: h.Store,
})
if outErr != nil {
writeOpenAIErrorWithCode(w, outErr.Status, outErr.Message, outErr.Code)
return
}
return
}
pow, err := h.DS.GetPow(r.Context(), a, 3)
if err != nil {
writeOpenAIError(w, http.StatusUnauthorized, "Failed to get PoW (invalid token or unknown error).")
return
}
payload := stdReq.CompletionPayload(sessionID)
resp, err := h.DS.CallCompletion(r.Context(), a, payload, pow, 3)
if err != nil {
writeOpenAIError(w, http.StatusInternalServerError, "Failed to get completion.")
responseObj := openaifmt.BuildResponseObjectWithToolCalls(responseID, stdReq.ResponseModel, result.Turn.Prompt, result.Turn.Thinking, result.Turn.Text, result.Turn.ToolCalls, stdReq.ToolsRaw)
responseObj["usage"] = assistantturn.OpenAIResponsesUsage(result.Turn)
h.getResponseStore().put(owner, responseID, responseObj)
writeJSON(w, http.StatusOK, responseObj)
return
}
responseID := "resp_" + strings.ReplaceAll(uuid.NewString(), "-", "")
refFileTokens := stdReq.RefFileTokens
if stdReq.Stream {
h.handleResponsesStreamWithRetry(w, r, a, resp, payload, pow, owner, responseID, stdReq.ResponseModel, stdReq.PromptTokenText, refFileTokens, stdReq.Thinking, stdReq.Search, stdReq.ToolNames, stdReq.ToolsRaw, stdReq.ToolChoice, traceID)
start, outErr := completionruntime.StartCompletion(r.Context(), h.DS, a, stdReq, completionruntime.Options{
CurrentInputFile: h.Store,
})
if outErr != nil {
writeOpenAIErrorWithCode(w, outErr.Status, outErr.Message, outErr.Code)
return
}
h.handleResponsesNonStreamWithRetry(w, r.Context(), a, resp, payload, pow, owner, responseID, stdReq.ResponseModel, stdReq.PromptTokenText, refFileTokens, stdReq.Thinking, stdReq.Search, stdReq.ToolNames, stdReq.ToolsRaw, stdReq.ToolChoice, traceID)
streamReq := start.Request
refFileTokens := streamReq.RefFileTokens
h.handleResponsesStreamWithRetry(w, r, a, start.Response, start.Payload, start.Pow, owner, responseID, streamReq.ResponseModel, streamReq.PromptTokenText, refFileTokens, streamReq.Thinking, streamReq.Search, streamReq.ToolNames, streamReq.ToolsRaw, streamReq.ToolChoice, traceID)
}
func (h *Handler) handleResponsesNonStream(w http.ResponseWriter, resp *http.Response, owner, responseID, model, finalPrompt string, refFileTokens int, thinkingEnabled, searchEnabled bool, toolNames []string, toolsRaw any, toolChoice promptcompat.ToolChoicePolicy, traceID string) {
@@ -130,28 +133,26 @@ func (h *Handler) handleResponsesNonStream(w http.ResponseWriter, resp *http.Res
return
}
result := sse.CollectStream(resp, thinkingEnabled, true)
stripReferenceMarkers := h.compatStripReferenceMarkers()
sanitizedThinking := cleanVisibleOutput(result.Thinking, stripReferenceMarkers)
sanitizedText := cleanVisibleOutput(result.Text, stripReferenceMarkers)
if searchEnabled {
sanitizedText = replaceCitationMarkersWithLinks(sanitizedText, result.CitationLinks)
}
textParsed := detectAssistantToolCalls(result.Text, sanitizedText, result.Thinking, result.ToolDetectionThinking, toolNames)
if len(textParsed.Calls) == 0 && writeUpstreamEmptyOutputError(w, sanitizedText, sanitizedThinking, result.ContentFilter) {
return
}
logResponsesToolPolicyRejection(traceID, toolChoice, textParsed, "text")
callCount := len(textParsed.Calls)
if toolChoice.IsRequired() && callCount == 0 {
writeOpenAIErrorWithCode(w, http.StatusUnprocessableEntity, "tool_choice requires at least one valid tool call.", "tool_choice_violation")
turn := assistantturn.BuildTurnFromCollected(result, assistantturn.BuildOptions{
Model: model,
Prompt: finalPrompt,
RefFileTokens: refFileTokens,
SearchEnabled: searchEnabled,
StripReferenceMarkers: stripReferenceMarkersEnabled(),
ToolNames: toolNames,
ToolsRaw: toolsRaw,
ToolChoice: toolChoice,
})
logResponsesToolPolicyRejection(traceID, toolChoice, turn.ParsedToolCalls, "text")
outcome := assistantturn.FinalizeTurn(turn, assistantturn.FinalizeOptions{})
if outcome.ShouldFail {
writeOpenAIErrorWithCode(w, outcome.Error.Status, outcome.Error.Message, outcome.Error.Code)
return
}
responseObj := openaifmt.BuildResponseObjectWithToolCalls(responseID, model, finalPrompt, sanitizedThinking, sanitizedText, textParsed.Calls, toolsRaw)
if refFileTokens > 0 {
addRefFileTokensToUsage(responseObj, refFileTokens)
}
responseObj := openaifmt.BuildResponseObjectWithToolCalls(responseID, model, finalPrompt, turn.Thinking, turn.Text, turn.ToolCalls, toolsRaw)
responseObj["usage"] = assistantturn.OpenAIResponsesUsage(turn)
h.getResponseStore().put(owner, responseID, responseObj)
writeJSON(w, http.StatusOK, responseObj)
}
@@ -176,7 +177,7 @@ func (h *Handler) handleResponsesStream(w http.ResponseWriter, r *http.Request,
}
bufferToolContent := len(toolNames) > 0
emitEarlyToolDeltas := h.toolcallFeatureMatchEnabled() && h.toolcallEarlyEmitHighConfidence()
stripReferenceMarkers := h.compatStripReferenceMarkers()
stripReferenceMarkers := stripReferenceMarkersEnabled()
streamRuntime := newResponsesStreamRuntime(
w,

View File

@@ -1,12 +1,14 @@
package responses
import (
"ds2api/internal/assistantturn"
"ds2api/internal/toolcall"
"net/http"
"strings"
"ds2api/internal/config"
openaifmt "ds2api/internal/format/openai"
"ds2api/internal/httpapi/openai/shared"
"ds2api/internal/promptcompat"
"ds2api/internal/sse"
streamengine "ds2api/internal/stream"
@@ -36,31 +38,27 @@ type responsesStreamRuntime struct {
toolCallsEmitted bool
toolCallsDoneEmitted bool
sieve toolstream.State
rawThinking strings.Builder
thinking strings.Builder
toolDetectionThinking strings.Builder
rawText strings.Builder
text strings.Builder
visibleText strings.Builder
responseMessageID int
streamToolCallIDs map[int]string
functionItemIDs map[int]string
functionOutputIDs map[int]int
functionArgs map[int]string
functionDone map[int]bool
functionAdded map[int]bool
functionNames map[int]string
messageItemID string
messageOutputID int
nextOutputID int
messageAdded bool
messagePartAdded bool
sequence int
failed bool
finalErrorStatus int
finalErrorMessage string
finalErrorCode string
sieve toolstream.State
accumulator shared.StreamAccumulator
visibleText strings.Builder
responseMessageID int
streamToolCallIDs map[int]string
functionItemIDs map[int]string
functionOutputIDs map[int]int
functionArgs map[int]string
functionDone map[int]bool
functionAdded map[int]bool
functionNames map[int]string
messageItemID string
messageOutputID int
nextOutputID int
messageAdded bool
messagePartAdded bool
sequence int
failed bool
finalErrorStatus int
finalErrorMessage string
finalErrorCode string
persistResponse func(obj map[string]any)
}
@@ -108,6 +106,11 @@ func newResponsesStreamRuntime(
toolChoice: toolChoice,
traceID: traceID,
persistResponse: persistResponse,
accumulator: shared.StreamAccumulator{
ThinkingEnabled: thinkingEnabled,
SearchEnabled: searchEnabled,
StripReferenceMarkers: stripReferenceMarkers,
},
}
}
@@ -139,6 +142,13 @@ func (s *responsesStreamRuntime) failResponse(status int, message, code string)
s.sendDone()
}
func (s *responsesStreamRuntime) markContextCancelled() {
s.failed = true
s.finalErrorStatus = 499
s.finalErrorMessage = "request context cancelled"
s.finalErrorCode = string(streamengine.StopReasonContextCancelled)
}
func (s *responsesStreamRuntime) finalize(finishReason string, deferEmptyOutput bool) bool {
s.failed = false
s.finalErrorStatus = 0
@@ -148,11 +158,31 @@ func (s *responsesStreamRuntime) finalize(finishReason string, deferEmptyOutput
s.processToolStreamEvents(toolstream.Flush(&s.sieve, s.toolNames), true, true)
}
finalThinking := s.thinking.String()
finalToolDetectionThinking := s.toolDetectionThinking.String()
finalText := cleanVisibleOutput(s.text.String(), s.stripReferenceMarkers)
textParsed := detectAssistantToolCalls(s.rawText.String(), finalText, s.rawThinking.String(), finalToolDetectionThinking, s.toolNames)
detected := textParsed.Calls
finalThinking := s.accumulator.Thinking.String()
finalToolDetectionThinking := s.accumulator.ToolDetectionThinking.String()
finalText := s.accumulator.Text.String()
turn := assistantturn.BuildTurnFromStreamSnapshot(assistantturn.StreamSnapshot{
RawText: s.accumulator.RawText.String(),
VisibleText: finalText,
RawThinking: s.accumulator.RawThinking.String(),
VisibleThinking: finalThinking,
DetectionThinking: finalToolDetectionThinking,
ContentFilter: finishReason == "content_filter",
ResponseMessageID: s.responseMessageID,
AlreadyEmittedCalls: s.toolCallsEmitted,
AlreadyEmittedToolRaw: s.toolCallsDoneEmitted,
}, assistantturn.BuildOptions{
Model: s.model,
Prompt: s.finalPrompt,
RefFileTokens: s.refFileTokens,
SearchEnabled: s.searchEnabled,
StripReferenceMarkers: s.stripReferenceMarkers,
ToolNames: s.toolNames,
ToolsRaw: s.toolsRaw,
ToolChoice: s.toolChoice,
})
textParsed := turn.ParsedToolCalls
detected := turn.ToolCalls
s.logToolPolicyRejections(textParsed)
if len(detected) > 0 {
@@ -164,12 +194,11 @@ func (s *responsesStreamRuntime) finalize(finishReason string, deferEmptyOutput
s.closeMessageItem()
if s.toolChoice.IsRequired() && len(detected) == 0 {
s.failResponse(http.StatusUnprocessableEntity, "tool_choice requires at least one valid tool call.", "tool_choice_violation")
return true
}
if len(detected) == 0 && strings.TrimSpace(finalText) == "" {
status, message, code := upstreamEmptyOutputDetail(finishReason == "content_filter", finalText, finalThinking)
outcome := assistantturn.FinalizeTurn(turn, assistantturn.FinalizeOptions{
AlreadyEmittedToolCalls: s.toolCallsEmitted || s.toolCallsDoneEmitted,
})
if outcome.ShouldFail {
status, message, code := outcome.Error.Status, outcome.Error.Message, outcome.Error.Code
if deferEmptyOutput {
s.finalErrorStatus = status
s.finalErrorMessage = message
@@ -181,7 +210,7 @@ func (s *responsesStreamRuntime) finalize(finishReason string, deferEmptyOutput
}
s.closeIncompleteFunctionItems()
obj := s.buildCompletedResponseObject(finalThinking, finalText, detected)
obj := s.buildCompletedResponseObject(turn.Thinking, turn.Text, detected)
if s.persistResponse != nil {
s.persistResponse(obj)
}
@@ -221,62 +250,27 @@ func (s *responsesStreamRuntime) onParsed(parsed sse.LineResult) streamengine.Pa
return streamengine.ParsedDecision{Stop: true}
}
contentSeen := false
batch := responsesDeltaBatch{runtime: s}
for _, p := range parsed.ToolDetectionThinkingParts {
trimmed := sse.TrimContinuationOverlap(s.toolDetectionThinking.String(), p.Text)
if trimmed != "" {
s.toolDetectionThinking.WriteString(trimmed)
}
}
for _, p := range parsed.Parts {
accumulated := s.accumulator.Apply(parsed)
for _, p := range accumulated.Parts {
if p.Type == "thinking" {
rawTrimmed := sse.TrimContinuationOverlap(s.rawThinking.String(), p.Text)
if rawTrimmed != "" {
s.rawThinking.WriteString(rawTrimmed)
contentSeen = true
}
if !s.thinkingEnabled {
continue
}
cleanedText := cleanVisibleOutput(rawTrimmed, s.stripReferenceMarkers)
if cleanedText == "" {
continue
}
trimmed := sse.TrimContinuationOverlap(s.thinking.String(), cleanedText)
if trimmed == "" {
continue
}
s.thinking.WriteString(trimmed)
batch.append("reasoning", trimmed)
batch.append("reasoning", p.VisibleText)
continue
}
rawTrimmed := sse.TrimContinuationOverlap(s.rawText.String(), p.Text)
if rawTrimmed == "" {
if p.RawText == "" {
continue
}
s.rawText.WriteString(rawTrimmed)
contentSeen = true
cleanedText := cleanVisibleOutput(rawTrimmed, s.stripReferenceMarkers)
if s.searchEnabled && sse.IsCitation(cleanedText) {
if p.CitationOnly {
continue
}
trimmed := sse.TrimContinuationOverlap(s.text.String(), cleanedText)
if trimmed != "" {
s.text.WriteString(trimmed)
}
if !s.bufferToolContent {
if trimmed == "" {
continue
}
batch.append("text", trimmed)
batch.append("text", p.VisibleText)
continue
}
batch.flush()
s.processToolStreamEvents(toolstream.ProcessChunk(&s.sieve, rawTrimmed, s.toolNames), true, true)
s.processToolStreamEvents(toolstream.ProcessChunk(&s.sieve, p.RawText, s.toolNames), true, true)
}
batch.flush()
return streamengine.ParsedDecision{ContentSeen: contentSeen}
return streamengine.ParsedDecision{ContentSeen: accumulated.ContentSeen}
}

View File

@@ -35,16 +35,12 @@ type DeepSeekCaller interface {
type ConfigReader interface {
ModelAliases() map[string]string
CompatWideInputStrictOutput() bool
CompatStripReferenceMarkers() bool
ToolcallMode() string
ToolcallEarlyEmitConfidence() string
ResponsesStoreTTLSeconds() int
EmbeddingsProvider() string
AutoDeleteMode() string
AutoDeleteSessions() bool
HistorySplitEnabled() bool
HistorySplitTriggerAfterTurns() int
CurrentInputFileEnabled() bool
CurrentInputFileMinChars() int
ThinkingInjectionEnabled() bool
@@ -58,13 +54,6 @@ type Deps struct {
ChatHistory *chathistory.Store
}
func CompatStripReferenceMarkers(store ConfigReader) bool {
if store == nil {
return true
}
return store.CompatStripReferenceMarkers()
}
var WriteJSON = util.WriteJSON
var _ AuthResolver = (*auth.Resolver)(nil)

View File

@@ -0,0 +1,104 @@
package shared
import (
"strings"
"ds2api/internal/sse"
)
type StreamAccumulator struct {
ThinkingEnabled bool
SearchEnabled bool
StripReferenceMarkers bool
RawThinking strings.Builder
Thinking strings.Builder
ToolDetectionThinking strings.Builder
RawText strings.Builder
Text strings.Builder
}
type StreamPartDelta struct {
Type string
RawText string
VisibleText string
CitationOnly bool
}
type StreamAccumulatorResult struct {
ContentSeen bool
Parts []StreamPartDelta
}
func (a *StreamAccumulator) Apply(parsed sse.LineResult) StreamAccumulatorResult {
out := StreamAccumulatorResult{}
for _, p := range parsed.ToolDetectionThinkingParts {
trimmed := sse.TrimContinuationOverlapFromBuilder(&a.ToolDetectionThinking, p.Text)
if trimmed != "" {
a.ToolDetectionThinking.WriteString(trimmed)
}
}
for _, p := range parsed.Parts {
if p.Type == "thinking" {
delta := a.applyThinkingPart(p.Text)
if delta.RawText != "" {
out.ContentSeen = true
}
if delta.RawText != "" || delta.VisibleText != "" {
out.Parts = append(out.Parts, delta)
}
continue
}
delta := a.applyTextPart(p.Text)
if delta.RawText != "" {
out.ContentSeen = true
}
if delta.RawText != "" || delta.VisibleText != "" || delta.CitationOnly {
out.Parts = append(out.Parts, delta)
}
}
return out
}
func (a *StreamAccumulator) applyThinkingPart(text string) StreamPartDelta {
rawTrimmed := sse.TrimContinuationOverlapFromBuilder(&a.RawThinking, text)
if rawTrimmed != "" {
a.RawThinking.WriteString(rawTrimmed)
}
delta := StreamPartDelta{Type: "thinking", RawText: rawTrimmed}
if !a.ThinkingEnabled || rawTrimmed == "" {
return delta
}
cleanedText := CleanVisibleOutput(rawTrimmed, a.StripReferenceMarkers)
if cleanedText == "" {
return delta
}
trimmed := sse.TrimContinuationOverlapFromBuilder(&a.Thinking, cleanedText)
if trimmed == "" {
return delta
}
a.Thinking.WriteString(trimmed)
delta.VisibleText = trimmed
return delta
}
func (a *StreamAccumulator) applyTextPart(text string) StreamPartDelta {
rawTrimmed := sse.TrimContinuationOverlapFromBuilder(&a.RawText, text)
if rawTrimmed == "" {
return StreamPartDelta{Type: "text"}
}
a.RawText.WriteString(rawTrimmed)
delta := StreamPartDelta{Type: "text", RawText: rawTrimmed}
cleanedText := CleanVisibleOutput(rawTrimmed, a.StripReferenceMarkers)
if a.SearchEnabled && sse.IsCitation(cleanedText) {
delta.CitationOnly = true
return delta
}
trimmed := sse.TrimContinuationOverlapFromBuilder(&a.Text, cleanedText)
if trimmed == "" {
return delta
}
a.Text.WriteString(trimmed)
delta.VisibleText = trimmed
return delta
}

View File

@@ -0,0 +1,97 @@
package shared
import (
"testing"
"ds2api/internal/sse"
)
func TestStreamAccumulatorAppliesThinkingAndTextDedupe(t *testing.T) {
acc := StreamAccumulator{ThinkingEnabled: true, StripReferenceMarkers: true}
thinkingPrefix := "this is a long thinking snapshot prefix used by DeepSeek continue replay"
textPrefix := "this is a long visible answer snapshot prefix used by DeepSeek continue replay"
first := acc.Apply(sse.LineResult{
Parsed: true,
Parts: []sse.ContentPart{
{Type: "thinking", Text: thinkingPrefix},
{Type: "text", Text: textPrefix},
},
})
second := acc.Apply(sse.LineResult{
Parsed: true,
Parts: []sse.ContentPart{
{Type: "thinking", Text: thinkingPrefix + " next"},
{Type: "text", Text: textPrefix + " world"},
},
})
if !first.ContentSeen || !second.ContentSeen {
t.Fatalf("expected both chunks to mark content seen")
}
if got := acc.RawThinking.String(); got != thinkingPrefix+" next" {
t.Fatalf("raw thinking = %q", got)
}
if got := acc.Thinking.String(); got != thinkingPrefix+" next" {
t.Fatalf("thinking = %q", got)
}
if got := acc.RawText.String(); got != textPrefix+" world" {
t.Fatalf("raw text = %q", got)
}
if got := acc.Text.String(); got != textPrefix+" world" {
t.Fatalf("text = %q", got)
}
if got := second.Parts[0].VisibleText; got != " next" {
t.Fatalf("thinking delta = %q", got)
}
if got := second.Parts[1].VisibleText; got != " world" {
t.Fatalf("text delta = %q", got)
}
}
func TestStreamAccumulatorKeepsHiddenThinkingForToolDetection(t *testing.T) {
acc := StreamAccumulator{ThinkingEnabled: false, StripReferenceMarkers: true}
result := acc.Apply(sse.LineResult{
Parsed: true,
Parts: []sse.ContentPart{
{Type: "thinking", Text: "<tool_calls></tool_calls>"},
},
ToolDetectionThinkingParts: []sse.ContentPart{
{Type: "thinking", Text: "detect"},
{Type: "thinking", Text: " tools"},
},
})
if !result.ContentSeen {
t.Fatalf("expected hidden thinking to count as upstream content")
}
if got := acc.RawThinking.String(); got != "<tool_calls></tool_calls>" {
t.Fatalf("raw thinking = %q", got)
}
if got := acc.Thinking.String(); got != "" {
t.Fatalf("visible thinking = %q", got)
}
if got := acc.ToolDetectionThinking.String(); got != "detect tools" {
t.Fatalf("tool detection thinking = %q", got)
}
}
func TestStreamAccumulatorSuppressesCitationTextWhenSearchEnabled(t *testing.T) {
acc := StreamAccumulator{SearchEnabled: true, StripReferenceMarkers: true}
result := acc.Apply(sse.LineResult{
Parsed: true,
Parts: []sse.ContentPart{{Type: "text", Text: "[citation:1]"}},
})
if !result.ContentSeen {
t.Fatalf("expected citation chunk to mark upstream content")
}
if len(result.Parts) != 1 || !result.Parts[0].CitationOnly {
t.Fatalf("expected citation-only delta, got %#v", result.Parts)
}
if got := acc.RawText.String(); got != "[citation:1]" {
t.Fatalf("raw text = %q", got)
}
if got := acc.Text.String(); got != "" {
t.Fatalf("visible text = %q", got)
}
}

View File

@@ -1,9 +1,12 @@
package shared
import "net/http"
import (
"net/http"
"strings"
)
func ShouldWriteUpstreamEmptyOutputError(text string) bool {
return text == ""
func ShouldWriteUpstreamEmptyOutputError(text, thinking string) bool {
return strings.TrimSpace(text) == ""
}
func UpstreamEmptyOutputDetail(contentFilter bool, text, thinking string) (int, string, string) {
@@ -18,7 +21,7 @@ func UpstreamEmptyOutputDetail(contentFilter bool, text, thinking string) (int,
}
func WriteUpstreamEmptyOutputError(w http.ResponseWriter, text, thinking string, contentFilter bool) bool {
if !ShouldWriteUpstreamEmptyOutputError(text) {
if !ShouldWriteUpstreamEmptyOutputError(text, thinking) {
return false
}
status, message, code := UpstreamEmptyOutputDetail(contentFilter, text, thinking)

View File

@@ -135,7 +135,7 @@ func captureStatusMiddleware(statuses *[]int) func(http.Handler) http.Handler {
func TestChatCompletionsStreamStatusCapturedAs200(t *testing.T) {
statuses := make([]int, 0, 1)
h := &openAITestSurface{
Store: mockOpenAIConfig{wideInput: true},
Store: mockOpenAIConfig{},
Auth: streamStatusAuthStub{},
DS: streamStatusDSStub{resp: makeOpenAISSEHTTPResponse(`data: {"p":"response/content","v":"hello"}`, "data: [DONE]")},
}
@@ -164,7 +164,7 @@ func TestChatCompletionsStreamStatusCapturedAs200(t *testing.T) {
func TestResponsesStreamStatusCapturedAs200(t *testing.T) {
statuses := make([]int, 0, 1)
h := &openAITestSurface{
Store: mockOpenAIConfig{wideInput: true},
Store: mockOpenAIConfig{},
Auth: streamStatusAuthStub{},
DS: streamStatusDSStub{resp: makeOpenAISSEHTTPResponse(`data: {"p":"response/content","v":"hello"}`, "data: [DONE]")},
}
@@ -193,7 +193,7 @@ func TestResponsesStreamStatusCapturedAs200(t *testing.T) {
func TestChatCompletionsStreamContentFilterStopsNormallyWithoutLeak(t *testing.T) {
statuses := make([]int, 0, 1)
h := &openAITestSurface{
Store: mockOpenAIConfig{wideInput: true},
Store: mockOpenAIConfig{},
Auth: streamStatusAuthStub{},
DS: streamStatusDSStub{resp: makeOpenAISSEHTTPResponse(
`data: {"p":"response/content","v":"合法前缀"}`,
@@ -243,7 +243,7 @@ func TestChatCompletionsStreamContentFilterStopsNormallyWithoutLeak(t *testing.T
func TestChatCompletionsStreamEmitsFailureFrameWhenUpstreamOutputEmpty(t *testing.T) {
statuses := make([]int, 0, 1)
h := &openAITestSurface{
Store: mockOpenAIConfig{wideInput: true},
Store: mockOpenAIConfig{},
Auth: streamStatusAuthStub{},
DS: streamStatusDSStub{resp: makeOpenAISSEHTTPResponse("data: [DONE]")},
}
@@ -289,7 +289,7 @@ func TestChatCompletionsStreamRetriesEmptyOutputOnSameSession(t *testing.T) {
makeOpenAISSEHTTPResponse(`data: {"p":"response/content","v":"visible"}`, "data: [DONE]"),
}}
h := &openAITestSurface{
Store: mockOpenAIConfig{wideInput: true},
Store: mockOpenAIConfig{},
Auth: streamStatusAuthStub{},
DS: ds,
}
@@ -345,11 +345,11 @@ func TestChatCompletionsStreamRetriesEmptyOutputOnSameSession(t *testing.T) {
func TestChatCompletionsNonStreamRetriesThinkingOnlyOutput(t *testing.T) {
ds := &streamStatusDSSeqStub{resps: []*http.Response{
makeOpenAISSEHTTPResponse(`data: {"response_message_id":99,"p":"response/thinking_content","v":"plan"}`, "data: [DONE]"),
makeOpenAISSEHTTPResponse(`data: {"response_message_id":99}`, "data: [DONE]"),
makeOpenAISSEHTTPResponse(`data: {"p":"response/content","v":"visible"}`, "data: [DONE]"),
}}
h := &openAITestSurface{
Store: mockOpenAIConfig{wideInput: true},
Store: mockOpenAIConfig{},
Auth: streamStatusAuthStub{},
DS: ds,
}
@@ -380,9 +380,6 @@ func TestChatCompletionsNonStreamRetriesThinkingOnlyOutput(t *testing.T) {
if asString(message["content"]) != "visible" {
t.Fatalf("expected retry visible content, got %#v", message)
}
if !strings.Contains(asString(message["reasoning_content"]), "plan") {
t.Fatalf("expected first-attempt reasoning to be preserved, got %#v", message)
}
}
func TestChatCompletionsContentFilterDoesNotRetry(t *testing.T) {
@@ -391,7 +388,7 @@ func TestChatCompletionsContentFilterDoesNotRetry(t *testing.T) {
makeOpenAISSEHTTPResponse(`data: {"p":"response/content","v":"visible"}`, "data: [DONE]"),
}}
h := &openAITestSurface{
Store: mockOpenAIConfig{wideInput: true},
Store: mockOpenAIConfig{},
Auth: streamStatusAuthStub{},
DS: ds,
}
@@ -413,7 +410,7 @@ func TestChatCompletionsContentFilterDoesNotRetry(t *testing.T) {
func TestResponsesStreamUsageIgnoresBatchAccumulatedTokenUsage(t *testing.T) {
statuses := make([]int, 0, 1)
h := &openAITestSurface{
Store: mockOpenAIConfig{wideInput: true},
Store: mockOpenAIConfig{},
Auth: streamStatusAuthStub{},
DS: streamStatusDSStub{resp: makeOpenAISSEHTTPResponse(
`data: {"p":"response/content","v":"hello"}`,
@@ -464,7 +461,7 @@ func TestResponsesStreamRetriesThinkingOnlyOutput(t *testing.T) {
makeOpenAISSEHTTPResponse(`data: {"p":"response/content","v":"visible"}`, "data: [DONE]"),
}}
h := &openAITestSurface{
Store: mockOpenAIConfig{wideInput: true},
Store: mockOpenAIConfig{},
Auth: streamStatusAuthStub{},
DS: ds,
}
@@ -499,11 +496,11 @@ func TestResponsesStreamRetriesThinkingOnlyOutput(t *testing.T) {
func TestResponsesNonStreamRetriesThinkingOnlyOutput(t *testing.T) {
ds := &streamStatusDSSeqStub{resps: []*http.Response{
makeOpenAISSEHTTPResponse(`data: {"response_message_id":88,"p":"response/thinking_content","v":"plan"}`, "data: [DONE]"),
makeOpenAISSEHTTPResponse(`data: {"response_message_id":88}`, "data: [DONE]"),
makeOpenAISSEHTTPResponse(`data: {"p":"response/content","v":"visible"}`, "data: [DONE]"),
}}
h := &openAITestSurface{
Store: mockOpenAIConfig{wideInput: true},
Store: mockOpenAIConfig{},
Auth: streamStatusAuthStub{},
DS: ds,
}
@@ -540,16 +537,16 @@ func TestResponsesNonStreamRetriesThinkingOnlyOutput(t *testing.T) {
if len(content) == 0 {
t.Fatalf("expected content entries, got %#v", item)
}
reasoning, _ := content[0].(map[string]any)
if asString(reasoning["type"]) != "reasoning" || !strings.Contains(asString(reasoning["text"]), "plan") {
t.Fatalf("expected preserved reasoning entry, got %#v", content)
textEntry, _ := content[0].(map[string]any)
if asString(textEntry["type"]) != "output_text" || asString(textEntry["text"]) != "visible" {
t.Fatalf("expected visible text entry, got %#v", content)
}
}
func TestResponsesNonStreamUsageIgnoresPromptAndOutputTokenUsage(t *testing.T) {
statuses := make([]int, 0, 1)
h := &openAITestSurface{
Store: mockOpenAIConfig{wideInput: true},
Store: mockOpenAIConfig{},
Auth: streamStatusAuthStub{},
DS: streamStatusDSStub{resp: makeOpenAISSEHTTPResponse(
`data: {"p":"response/content","v":"ok"}`,

View File

@@ -104,13 +104,10 @@ func registerOpenAITestRoutes(r chi.Router, h *openAITestSurface) {
r.Post("/v1/responses", h.responsesHandler().Responses)
r.Get("/v1/responses/{response_id}", h.responsesHandler().GetResponseByID)
r.Post("/v1/files", h.filesHandler().UploadFile)
r.Get("/v1/files/{file_id}", h.filesHandler().RetrieveFile)
r.Post("/v1/embeddings", h.embeddingsHandler().Embeddings)
}
func splitOpenAIHistoryMessages(messages []any, triggerAfterTurns int) ([]any, []any) {
return history.SplitOpenAIHistoryMessages(messages, triggerAfterTurns)
}
func buildOpenAICurrentInputContextTranscript(messages []any) string {
return promptcompat.BuildOpenAICurrentInputContextTranscript(messages)
}

View File

@@ -0,0 +1,134 @@
package requestbody
import (
"bytes"
"errors"
"io"
"mime"
"net/http"
"strings"
"unicode/utf8"
)
var (
ErrInvalidUTF8Body = errors.New("invalid utf-8 request body")
errRequestBodyTooLarge = errors.New("request body too large")
)
const maxJSONUTF8ValidationSize = 100 << 20
// ValidateJSONUTF8 validates complete JSON request bodies before downstream
// decoders can silently replace malformed UTF-8 or stop before trailing bytes.
func ValidateJSONUTF8(next http.Handler) http.Handler {
if next == nil {
return http.HandlerFunc(func(http.ResponseWriter, *http.Request) {})
}
return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
if shouldValidateJSONBody(r) {
r.Body = validateAndReplayBody(r.Body)
}
next.ServeHTTP(w, r)
})
}
func shouldValidateJSONBody(r *http.Request) bool {
if r == nil || r.Body == nil {
return false
}
path := ""
if r.URL != nil {
path = r.URL.Path
}
return isJSONContentType(r.Header.Get("Content-Type")) || isKnownJSONRequestPath(r.Method, path)
}
func isJSONContentType(raw string) bool {
raw = strings.TrimSpace(raw)
if raw == "" {
return false
}
mediaType, _, err := mime.ParseMediaType(raw)
if err != nil {
mediaType = raw
}
mediaType = strings.ToLower(strings.TrimSpace(mediaType))
return strings.Contains(mediaType, "json")
}
func isKnownJSONRequestPath(method, path string) bool {
switch strings.ToUpper(strings.TrimSpace(method)) {
case http.MethodPost, http.MethodPut, http.MethodPatch, http.MethodDelete:
default:
return false
}
path = strings.TrimSpace(path)
if path == "" {
return false
}
switch {
case path == "/v1/chat/completions" || path == "/chat/completions":
return true
case path == "/v1/responses" || path == "/responses":
return true
case path == "/v1/embeddings" || path == "/embeddings":
return true
case path == "/anthropic/v1/messages" || path == "/v1/messages" || path == "/messages":
return true
case path == "/anthropic/v1/messages/count_tokens" || path == "/v1/messages/count_tokens" || path == "/messages/count_tokens":
return true
case strings.HasPrefix(path, "/v1beta/models/") || strings.HasPrefix(path, "/v1/models/"):
return strings.Contains(path, ":generateContent") || strings.Contains(path, ":streamGenerateContent")
case strings.HasPrefix(path, "/admin/"):
return true
default:
return false
}
}
func validateAndReplayBody(body io.ReadCloser) io.ReadCloser {
if body == nil {
return body
}
raw, err := io.ReadAll(io.LimitReader(body, maxJSONUTF8ValidationSize+1))
if err != nil {
return &errorReadCloser{err: err, closer: body}
}
if len(raw) > maxJSONUTF8ValidationSize {
return &errorReadCloser{err: errRequestBodyTooLarge, closer: body}
}
if !utf8.Valid(raw) {
return &errorReadCloser{err: ErrInvalidUTF8Body, closer: body}
}
return &replayReadCloser{Reader: bytes.NewReader(raw), closer: body}
}
type replayReadCloser struct {
*bytes.Reader
closer io.Closer
}
func (r *replayReadCloser) Close() error {
if r == nil || r.closer == nil {
return nil
}
return r.closer.Close()
}
type errorReadCloser struct {
err error
closer io.Closer
}
func (r *errorReadCloser) Read([]byte) (int, error) {
if r == nil || r.err == nil {
return 0, io.EOF
}
return 0, r.err
}
func (r *errorReadCloser) Close() error {
if r == nil || r.closer == nil {
return nil
}
return r.closer.Close()
}

View File

@@ -0,0 +1,158 @@
package requestbody
import (
"bytes"
"encoding/json"
"io"
"net/http"
"net/http/httptest"
"strings"
"testing"
)
type singleByteReadCloser struct {
data []byte
pos int
}
func (r *singleByteReadCloser) Read(p []byte) (int, error) {
if r.pos >= len(r.data) {
return 0, io.EOF
}
p[0] = r.data[r.pos]
r.pos++
return 1, nil
}
func (r *singleByteReadCloser) Close() error {
return nil
}
func TestValidateJSONUTF8AllowsSplitMultibyteRunes(t *testing.T) {
body := []byte(`{"text":"你好"}`)
handler := ValidateJSONUTF8(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
var req map[string]any
if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
t.Fatalf("unexpected decode error: %v", err)
}
w.WriteHeader(http.StatusNoContent)
}))
req := httptest.NewRequest(http.MethodPost, "/v1/chat/completions", &singleByteReadCloser{data: body})
req.Header.Set("Content-Type", "application/json")
rec := httptest.NewRecorder()
handler.ServeHTTP(rec, req)
if rec.Code != http.StatusNoContent {
t.Fatalf("expected 204 for valid utf-8 json, got %d body=%q", rec.Code, rec.Body.String())
}
}
func TestValidateJSONUTF8RejectsInvalidBytesBeforeJSONDecode(t *testing.T) {
body := append([]byte(`{"text":"`), 0xff)
body = append(body, []byte(`"}`)...)
handler := ValidateJSONUTF8(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
var req map[string]any
if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
w.WriteHeader(http.StatusBadRequest)
_, _ = w.Write([]byte(err.Error()))
return
}
w.WriteHeader(http.StatusOK)
}))
req := httptest.NewRequest(http.MethodPost, "/v1/chat/completions", bytes.NewReader(body))
req.Header.Set("Content-Type", "application/json; charset=utf-8")
rec := httptest.NewRecorder()
handler.ServeHTTP(rec, req)
if rec.Code != http.StatusBadRequest {
t.Fatalf("expected 400 for invalid utf-8 json, got %d body=%q", rec.Code, rec.Body.String())
}
if !strings.Contains(strings.ToLower(rec.Body.String()), "invalid utf-8") {
t.Fatalf("expected utf-8 validation error, got %q", rec.Body.String())
}
}
func TestValidateJSONUTF8RejectsInvalidBytesWithoutJSONContentTypeOnKnownPath(t *testing.T) {
body := append([]byte(`{"text":"`), 0xff)
body = append(body, []byte(`"}`)...)
handler := ValidateJSONUTF8(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
var req map[string]any
if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
w.WriteHeader(http.StatusBadRequest)
_, _ = w.Write([]byte(err.Error()))
return
}
w.WriteHeader(http.StatusOK)
}))
req := httptest.NewRequest(http.MethodPost, "/v1/chat/completions", bytes.NewReader(body))
req.Header.Set("Content-Type", "text/plain")
rec := httptest.NewRecorder()
handler.ServeHTTP(rec, req)
if rec.Code != http.StatusBadRequest {
t.Fatalf("expected 400 for invalid utf-8 json, got %d body=%q", rec.Code, rec.Body.String())
}
if !strings.Contains(strings.ToLower(rec.Body.String()), "invalid utf-8") {
t.Fatalf("expected utf-8 validation error, got %q", rec.Body.String())
}
}
func TestValidateJSONUTF8RejectsTrailingInvalidBytesAfterJSONValue(t *testing.T) {
body := append([]byte(`{"text":"ok"}`), 0xff)
handler := ValidateJSONUTF8(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
var req map[string]any
if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
w.WriteHeader(http.StatusBadRequest)
_, _ = w.Write([]byte(err.Error()))
return
}
w.WriteHeader(http.StatusOK)
}))
req := httptest.NewRequest(http.MethodPost, "/v1/chat/completions", bytes.NewReader(body))
req.Header.Set("Content-Type", "application/json")
rec := httptest.NewRecorder()
handler.ServeHTTP(rec, req)
if rec.Code != http.StatusBadRequest {
t.Fatalf("expected 400 for trailing invalid utf-8, got %d body=%q", rec.Code, rec.Body.String())
}
if !strings.Contains(strings.ToLower(rec.Body.String()), "invalid utf-8") {
t.Fatalf("expected utf-8 validation error, got %q", rec.Body.String())
}
}
func TestIsJSONContentType(t *testing.T) {
for _, raw := range []string{
"application/json",
"application/json; charset=utf-8",
"application/problem+json",
"application/vnd.api+json",
} {
if !isJSONContentType(raw) {
t.Fatalf("expected %q to be recognized as json", raw)
}
}
for _, raw := range []string{
"multipart/form-data; boundary=abc",
"text/plain",
"application/octet-stream",
} {
if isJSONContentType(raw) {
t.Fatalf("expected %q not to be recognized as json", raw)
}
}
}
func TestIsKnownJSONRequestPathIncludesGeminiStream(t *testing.T) {
if !isKnownJSONRequestPath(http.MethodPost, "/v1beta/models/gemini-pro:streamGenerateContent") {
t.Fatal("expected Gemini stream generate path to be recognized as json")
}
}

View File

@@ -1,7 +1,7 @@
'use strict';
const MIN_DELTA_FLUSH_CHARS = 160;
const MAX_DELTA_FLUSH_WAIT_MS = 80;
const MIN_DELTA_FLUSH_CHARS = 16;
const MAX_DELTA_FLUSH_WAIT_MS = 20;
function createChatCompletionEmitter({ res, sessionID, created, model, isClosed }) {
let firstChunkSent = false;

View File

@@ -17,7 +17,6 @@ const {
resolveToolcallPolicy,
formatIncrementalToolCallDeltas,
filterIncrementalToolCallDeltasByAllowed,
boolDefaultTrue,
resetStreamToolCallState,
} = require('./toolcall_policy');
const { createChatCompletionEmitter, createDeltaCoalescer } = require('./stream_emitter');
@@ -58,7 +57,7 @@ async function handleVercelStream(req, res, rawBody, payload) {
const toolPolicy = resolveToolcallPolicy(prep.body, payload.tools);
const toolNames = toolPolicy.toolNames;
const emitEarlyToolDeltas = toolPolicy.emitEarlyToolDeltas;
const stripReferenceMarkers = boolDefaultTrue(prep.body.compat && prep.body.compat.strip_reference_markers);
const stripReferenceMarkers = true;
if (!model || !leaseID || !deepseekToken || !initialPowHeader || !completionPayload) {
writeOpenAIError(res, 500, 'invalid vercel prepare response');

View File

@@ -2,7 +2,12 @@
const CDATA_PATTERN = /^<!\[CDATA\[([\s\S]*?)]]>$/i;
const XML_ATTR_PATTERN = /\b([a-z0-9_:-]+)\s*=\s*("([^"]*)"|'([^']*)')/gi;
const TOOL_MARKUP_NAMES = ['tool_calls', 'invoke', 'parameter'];
const TOOL_MARKUP_NAMES = [
{ raw: 'tool_calls', canonical: 'tool_calls' },
{ raw: 'tool-calls', canonical: 'tool_calls', dsmlOnly: true },
{ raw: 'invoke', canonical: 'invoke' },
{ raw: 'parameter', canonical: 'parameter' },
];
const {
toStringSafe,
@@ -437,7 +442,7 @@ function scanToolMarkupTagAt(text, start) {
const prefix = consumeToolMarkupNamePrefix(raw, lower, i);
i = prefix.next;
const dsmlLike = prefix.dsmlLike;
const { name, len } = matchToolMarkupName(lower, i);
const { name, len } = matchToolMarkupName(lower, i, dsmlLike);
if (!name) {
return null;
}
@@ -610,24 +615,31 @@ function consumeToolMarkupNamePrefixOnce(raw, lower, idx) {
return { next: idx + 1, ok: true };
}
if (lower.startsWith('dsml', idx)) {
return { next: idx + 'dsml'.length, ok: true };
let next = idx + 'dsml'.length;
if (next < raw.length && raw[next] === '-') {
next += 1;
}
return { next, ok: true };
}
return { next: idx, ok: false };
}
function hasToolMarkupNamePrefix(lowerTail) {
for (const name of TOOL_MARKUP_NAMES) {
if (lowerTail.startsWith(name) || name.startsWith(lowerTail)) {
if (lowerTail.startsWith(name.raw) || name.raw.startsWith(lowerTail)) {
return true;
}
}
return false;
}
function matchToolMarkupName(lower, start) {
function matchToolMarkupName(lower, start, dsmlLike) {
for (const name of TOOL_MARKUP_NAMES) {
if (lower.startsWith(name, start)) {
return { name, len: name.length };
if (name.dsmlOnly && !dsmlLike) {
continue;
}
if (lower.startsWith(name.raw, start)) {
return { name: name.canonical, len: name.raw.length };
}
}
return { name: '', len: 0 };
@@ -846,10 +858,18 @@ function parseMarkupValue(raw, paramName = '') {
if (cdata.ok) {
const literal = parseJSONLiteralValue(cdata.value);
if (literal.ok) {
const literalArray = coerceArrayValue(literal.value, paramName);
if (literalArray.ok) {
return literalArray.value;
}
return literal.value;
}
const structured = parseStructuredCDATAParameterValue(paramName, cdata.value);
return structured.ok ? structured.value : cdata.value;
if (structured.ok) {
return structured.value;
}
const looseArray = parseLooseJSONArrayValue(cdata.value, paramName);
return looseArray.ok ? looseArray.value : cdata.value;
}
const s = toStringSafe(extractRawTagValue(raw)).trim();
if (!s) {
@@ -862,8 +882,14 @@ function parseMarkupValue(raw, paramName = '') {
return nested;
}
if (nested && typeof nested === 'object') {
const nestedArray = coerceArrayValue(nested, paramName);
if (nestedArray.ok) {
return nestedArray.value;
}
if (isOnlyRawValue(nested)) {
return toStringSafe(nested._raw);
const rawValue = toStringSafe(nested._raw);
const looseArray = parseLooseJSONArrayValue(rawValue, paramName);
return looseArray.ok ? looseArray.value : rawValue;
}
return nested;
}
@@ -871,8 +897,16 @@ function parseMarkupValue(raw, paramName = '') {
const literal = parseJSONLiteralValue(s);
if (literal.ok) {
const literalArray = coerceArrayValue(literal.value, paramName);
if (literalArray.ok) {
return literalArray.value;
}
return literal.value;
}
const looseArray = parseLooseJSONArrayValue(s, paramName);
if (looseArray.ok) {
return looseArray.value;
}
return s;
}
@@ -1008,6 +1042,226 @@ function parseJSONLiteralValue(raw) {
}
}
function parseLooseJSONArrayValue(raw, paramName = '') {
if (preservesCDATAStringParameter(paramName)) {
return { ok: false, value: null };
}
const s = toStringSafe(raw).trim();
if (!s) {
return { ok: false, value: null };
}
const candidate = parseLooseJSONArrayCandidate(s, paramName);
if (candidate.ok) {
return candidate;
}
const segments = splitTopLevelJSONValues(s);
if (segments.length < 2) {
return { ok: false, value: null };
}
const out = [];
for (const segment of segments) {
const parsed = parseLooseArrayElementValue(segment);
if (!parsed.ok) {
return { ok: false, value: null };
}
out.push(parsed.value);
}
return { ok: true, value: out };
}
function parseLooseJSONArrayCandidate(raw, paramName = '') {
const parsed = parseLooseArrayElementValue(raw);
if (!parsed.ok) {
return { ok: false, value: null };
}
return coerceArrayValue(parsed.value, paramName);
}
function parseLooseArrayElementValue(raw) {
const s = toStringSafe(raw).trim();
if (!s) {
return { ok: false, value: null };
}
const literal = parseJSONLiteralValue(s);
if (literal.ok) {
return literal;
}
const repairedBackslashes = repairInvalidJSONBackslashes(s);
if (repairedBackslashes !== s) {
try {
const parsed = JSON.parse(repairedBackslashes);
return { ok: true, value: parsed };
} catch (_err) {
// Fall through.
}
}
const repairedLoose = repairLooseJSON(s);
if (repairedLoose !== s) {
try {
const parsed = JSON.parse(repairedLoose);
return { ok: true, value: parsed };
} catch (_err) {
// Fall through.
}
}
if (s.includes('<') && s.includes('>')) {
const parsed = parseMarkupInput(s);
if (Array.isArray(parsed)) {
return { ok: true, value: parsed };
}
if (parsed && typeof parsed === 'object') {
return { ok: true, value: parsed };
}
}
return { ok: false, value: null };
}
function coerceArrayValue(value, paramName = '') {
if (Array.isArray(value)) {
return { ok: true, value };
}
if (!value || typeof value !== 'object') {
return { ok: false, value: null };
}
const keys = Object.keys(value);
if (keys.length !== 1) {
return { ok: false, value: null };
}
if (Object.prototype.hasOwnProperty.call(value, 'item')) {
const items = value.item;
const nested = coerceArrayValue(items, '');
return nested.ok ? nested : { ok: true, value: [items] };
}
if (paramName && Object.prototype.hasOwnProperty.call(value, paramName)) {
const nested = coerceArrayValue(value[paramName], '');
if (nested.ok) {
return nested;
}
}
return { ok: false, value: null };
}
function splitTopLevelJSONValues(raw) {
const s = toStringSafe(raw).trim();
if (!s) {
return [];
}
const values = [];
let start = 0;
let depth = 0;
let inString = false;
let escaped = false;
for (let i = 0; i < s.length; i += 1) {
const ch = s[i];
if (inString) {
if (escaped) {
escaped = false;
continue;
}
if (ch === '\\') {
escaped = true;
continue;
}
if (ch === '"') {
inString = false;
}
continue;
}
if (ch === '"') {
inString = true;
continue;
}
if (ch === '{' || ch === '[') {
depth += 1;
continue;
}
if (ch === '}' || ch === ']') {
if (depth > 0) {
depth -= 1;
}
continue;
}
if (ch === ',' && depth === 0) {
const segment = s.slice(start, i).trim();
if (!segment) {
return [];
}
values.push(segment);
start = i + 1;
}
}
const last = s.slice(start).trim();
if (!last) {
return [];
}
values.push(last);
return values.length > 1 ? values : [];
}
function repairInvalidJSONBackslashes(s) {
if (!s || !s.includes('\\')) {
return s;
}
let out = '';
for (let i = 0; i < s.length; i += 1) {
const ch = s[i];
if (ch !== '\\') {
out += ch;
continue;
}
if (i + 1 < s.length) {
const next = s[i + 1];
if ('"\\/bfnrt'.includes(next)) {
out += `\\${next}`;
i += 1;
continue;
}
if (next === 'u' && i + 5 < s.length) {
let isHex = true;
for (let j = 1; j <= 4; j += 1) {
const r = s[i + 1 + j];
if (!/[0-9a-fA-F]/.test(r)) {
isHex = false;
break;
}
}
if (isHex) {
out += `\\u${s.slice(i + 2, i + 6)}`;
i += 5;
continue;
}
}
}
out += '\\\\';
}
return out;
}
function repairLooseJSON(s) {
const raw = toStringSafe(s).trim();
if (!raw) {
return raw;
}
let out = raw.replace(/([{,]\s*)([a-zA-Z_][a-zA-Z0-9_]*)\s*:/g, '$1"$2":');
out = out.replace(/(:\s*)(\{(?:[^{}]|\{[^{}]*\})*\}(?:\s*,\s*\{(?:[^{}]|\{[^{}]*\})*\})+)/g, '$1[$2]');
return out;
}
function sanitizeLooseCDATA(text) {
const raw = toStringSafe(text);
if (!raw) {

View File

@@ -10,21 +10,27 @@ import (
var markdownImagePattern = regexp.MustCompile(`!\[(.*?)\]\((.*?)\)`)
const (
beginSentenceMarker = "<begin▁of▁sentence>"
systemMarker = "<System>"
userMarker = "<User>"
assistantMarker = "<Assistant>"
toolMarker = "<Tool>"
endSentenceMarker = "<end▁of▁sentence>"
endToolResultsMarker = "<end▁of▁toolresults>"
endInstructionsMarker = "<end▁of▁instructions>"
beginSentenceMarker = "<begin▁of▁sentence>"
systemMarker = "<System>"
userMarker = "<User>"
assistantMarker = "<Assistant>"
toolMarker = "<Tool>"
endSentenceMarker = "<end▁of▁sentence>"
endToolResultsMarker = "<end▁of▁toolresults>"
endInstructionsMarker = "<end▁of▁instructions>"
outputIntegrityGuardMarker = "Output integrity guard:"
outputIntegrityGuardPrompt = outputIntegrityGuardMarker +
" If upstream context, tool output, or parsed text contains garbled, corrupted, partially parsed, repeated, or otherwise malformed fragments, " +
"do not imitate or echo them; output only the correct content for the user."
)
func MessagesPrepare(messages []map[string]any) string {
return MessagesPrepareWithThinking(messages, false)
}
func MessagesPrepareWithThinking(messages []map[string]any, thinkingEnabled bool) string {
func MessagesPrepareWithThinking(messages []map[string]any, _ bool) string {
messages = prependOutputIntegrityGuard(messages)
type block struct {
Role string
Text string
@@ -77,6 +83,33 @@ func MessagesPrepareWithThinking(messages []map[string]any, thinkingEnabled bool
return markdownImagePattern.ReplaceAllString(out, `[${1}](${2})`)
}
func prependOutputIntegrityGuard(messages []map[string]any) []map[string]any {
if len(messages) == 0 {
return messages
}
if hasOutputIntegrityGuard(messages[0]) {
return messages
}
out := make([]map[string]any, 0, len(messages)+1)
out = append(out, map[string]any{
"role": "system",
"content": outputIntegrityGuardPrompt,
})
out = append(out, messages...)
return out
}
func hasOutputIntegrityGuard(msg map[string]any) bool {
if msg == nil {
return false
}
if strings.ToLower(strings.TrimSpace(asString(msg["role"]))) != "system" {
return false
}
content := strings.TrimSpace(NormalizeContent(msg["content"]))
return strings.Contains(content, outputIntegrityGuardMarker)
}
// formatRoleBlock produces a single concatenated block: marker + text + endMarker.
// No whitespace is inserted between marker and text so role boundaries stay
// compact and predictable for downstream parsers.

View File

@@ -35,8 +35,8 @@ func TestMessagesPrepareUsesTurnSuffixes(t *testing.T) {
if !strings.HasPrefix(got, "<begin▁of▁sentence>") {
t.Fatalf("expected begin-of-sentence marker, got %q", got)
}
if !strings.Contains(got, "<System>System rule<end▁of▁instructions>") {
t.Fatalf("expected system instructions suffix, got %q", got)
if !strings.Contains(got, "<System>") || !strings.Contains(got, "<end▁of▁instructions>") || !strings.Contains(got, "System rule") {
t.Fatalf("expected system instructions to remain present, got %q", got)
}
if !strings.Contains(got, "<User>Question") {
t.Fatalf("expected user question, got %q", got)
@@ -49,6 +49,23 @@ func TestMessagesPrepareUsesTurnSuffixes(t *testing.T) {
}
}
func TestMessagesPreparePrependsOutputIntegrityGuard(t *testing.T) {
messages := []map[string]any{
{"role": "system", "content": "System rule"},
{"role": "user", "content": "Question"},
}
got := MessagesPrepare(messages)
if !strings.HasPrefix(got, beginSentenceMarker+systemMarker+outputIntegrityGuardPrompt) {
t.Fatalf("expected output integrity guard to be prepended, got %q", got)
}
if !strings.Contains(got, outputIntegrityGuardPrompt+"\n\nSystem rule") {
t.Fatalf("expected output integrity guard to precede system prompt content, got %q", got)
}
if !strings.Contains(got, "<User>Question") {
t.Fatalf("expected user question after guard, got %q", got)
}
}
func TestNormalizeContentArrayFallsBackToContentWhenTextEmpty(t *testing.T) {
got := NormalizeContent([]any{
map[string]any{"type": "text", "text": "", "content": "from-content"},

View File

@@ -1,35 +1,108 @@
package promptcompat
import (
"fmt"
"strings"
"ds2api/internal/prompt"
)
const CurrentInputContextFilename = "history.txt"
const CurrentInputContextFilename = "DS2API_HISTORY.txt"
const historyTranscriptTitle = "# DS2API_HISTORY.txt"
const historyTranscriptSummary = "Prior conversation history and tool progress."
func BuildOpenAIHistoryTranscript(messages []any) string {
return buildOpenAIInjectedFileTranscript(messages)
return buildOpenAIHistoryTranscript(messages)
}
func BuildOpenAICurrentUserInputTranscript(text string) string {
if strings.TrimSpace(text) == "" {
return ""
}
return BuildOpenAICurrentInputContextTranscript([]any{
return buildOpenAIHistoryTranscript([]any{
map[string]any{"role": "user", "content": text},
})
}
func BuildOpenAICurrentInputContextTranscript(messages []any) string {
return buildOpenAIInjectedFileTranscript(messages)
return buildOpenAIHistoryTranscript(messages)
}
func buildOpenAIInjectedFileTranscript(messages []any) string {
normalized := NormalizeOpenAIMessagesForPrompt(messages, "")
transcript := strings.TrimSpace(prompt.MessagesPrepare(normalized))
func buildOpenAIHistoryTranscript(messages []any) string {
if len(messages) == 0 {
return ""
}
var b strings.Builder
b.WriteString(historyTranscriptTitle)
b.WriteString("\n")
b.WriteString(historyTranscriptSummary)
b.WriteString("\n\n")
entry := 0
for _, raw := range messages {
msg, ok := raw.(map[string]any)
if !ok {
continue
}
role := normalizeOpenAIRoleForPrompt(strings.ToLower(strings.TrimSpace(asString(msg["role"]))))
content := strings.TrimSpace(buildOpenAIHistoryEntry(role, msg))
if content == "" {
continue
}
entry++
fmt.Fprintf(&b, "=== %d. %s ===\n%s\n\n", entry, strings.ToUpper(roleLabelForHistory(role)), content)
}
transcript := strings.TrimSpace(b.String())
if transcript == "" {
return ""
}
return transcript
return transcript + "\n"
}
func buildOpenAIHistoryEntry(role string, msg map[string]any) string {
switch role {
case "assistant":
return strings.TrimSpace(buildAssistantContentForPrompt(msg))
case "tool", "function":
return strings.TrimSpace(buildToolHistoryContent(msg))
case "system", "user":
return strings.TrimSpace(NormalizeOpenAIContentForPrompt(msg["content"]))
default:
return strings.TrimSpace(NormalizeOpenAIContentForPrompt(msg["content"]))
}
}
func buildToolHistoryContent(msg map[string]any) string {
content := strings.TrimSpace(NormalizeOpenAIContentForPrompt(msg["content"]))
parts := make([]string, 0, 2)
if name := strings.TrimSpace(asString(msg["name"])); name != "" {
parts = append(parts, "name="+name)
}
if callID := strings.TrimSpace(asString(msg["tool_call_id"])); callID != "" {
parts = append(parts, "tool_call_id="+callID)
}
header := ""
if len(parts) > 0 {
header = "[" + strings.Join(parts, " ") + "]"
}
switch {
case header != "" && content != "":
return header + "\n" + content
case header != "":
return header
default:
return content
}
}
func roleLabelForHistory(role string) string {
role = strings.ToLower(strings.TrimSpace(role))
switch role {
case "function":
return "tool"
case "":
return "unknown"
default:
return role
}
}

View File

@@ -88,6 +88,38 @@ func TestBuildOpenAIFinalPrompt_VercelPreparePathKeepsFinalAnswerInstruction(t *
}
}
func TestBuildOpenAIFinalPromptPrependsOutputIntegrityGuard(t *testing.T) {
messages := []any{
map[string]any{"role": "system", "content": "You are helpful"},
map[string]any{"role": "user", "content": "请调用工具"},
}
tools := []any{
map[string]any{
"type": "function",
"function": map[string]any{
"name": "search",
"description": "search docs",
"parameters": map[string]any{
"type": "object",
},
},
},
}
finalPrompt, _ := buildOpenAIFinalPrompt(messages, tools, "", false)
guardIdx := strings.Index(finalPrompt, "Output integrity guard")
toolIdx := strings.Index(finalPrompt, "TOOL CALL FORMAT")
if guardIdx < 0 {
t.Fatalf("expected output integrity guard in final prompt, got: %q", finalPrompt)
}
if toolIdx < 0 {
t.Fatalf("expected tool instructions in final prompt, got: %q", finalPrompt)
}
if guardIdx > toolIdx {
t.Fatalf("expected output integrity guard to precede tool instructions, got: %q", finalPrompt)
}
}
func TestBuildOpenAIFinalPromptReadLikeToolIncludesCacheGuard(t *testing.T) {
messages := []any{
map[string]any{"role": "user", "content": "请读取文件"},

View File

@@ -10,7 +10,6 @@ import (
type ConfigReader interface {
ModelAliases() map[string]string
CompatWideInputStrictOutput() bool
}
func NormalizeOpenAIChatRequest(store ConfigReader, req map[string]any, traceID string) (StandardRequest, error) {
@@ -74,17 +73,7 @@ func NormalizeOpenAIResponsesRequest(store ConfigReader, req map[string]any, tra
thinkingEnabled = false
}
// Keep width-control as an explicit policy hook even if current default is true.
allowWideInput := true
if store != nil {
allowWideInput = store.CompatWideInputStrictOutput()
}
var messagesRaw []any
if allowWideInput {
messagesRaw = ResponsesMessagesFromRequest(req)
} else if msgs, ok := req["messages"].([]any); ok && len(msgs) > 0 {
messagesRaw = msgs
}
messagesRaw := ResponsesMessagesFromRequest(req)
if len(messagesRaw) == 0 {
return StandardRequest{}, fmt.Errorf("request must include 'input' or 'messages'")
}

View File

@@ -27,6 +27,7 @@ import (
"ds2api/internal/httpapi/openai/files"
"ds2api/internal/httpapi/openai/responses"
"ds2api/internal/httpapi/openai/shared"
"ds2api/internal/httpapi/requestbody"
"ds2api/internal/webui"
)
@@ -75,6 +76,7 @@ func NewApp() (*App, error) {
r.Use(filteredLogger())
r.Use(middleware.Recoverer)
r.Use(cors)
r.Use(requestbody.ValidateJSONUTF8)
r.Use(timeout(0))
healthzHandler := func(w http.ResponseWriter, _ *http.Request) {
@@ -97,6 +99,7 @@ func NewApp() (*App, error) {
r.Post("/v1/responses", responsesHandler.Responses)
r.Get("/v1/responses/{response_id}", responsesHandler.GetResponseByID)
r.Post("/v1/files", filesHandler.UploadFile)
r.Get("/v1/files/{file_id}", filesHandler.RetrieveFile)
r.Post("/v1/embeddings", embeddingsHandler.Embeddings)
// Root OpenAI aliases support clients configured with the bare DS2API service URL.
r.Get("/models", modelsHandler.ListModels)
@@ -105,6 +108,7 @@ func NewApp() (*App, error) {
r.Post("/responses", responsesHandler.Responses)
r.Get("/responses/{response_id}", responsesHandler.GetResponseByID)
r.Post("/files", filesHandler.UploadFile)
r.Get("/files/{file_id}", filesHandler.RetrieveFile)
r.Post("/embeddings", embeddingsHandler.Embeddings)
claude.RegisterRoutes(r, claudeHandler)
gemini.RegisterRoutes(r, geminiHandler)

Some files were not shown because too many files have changed in this diff Show More