mirror of
https://github.com/CJackHwang/ds2api.git
synced 2026-05-10 11:17:41 +08:00
feat: implement managed-account rotation on 429 empty-output completion retries
This commit is contained in:
@@ -42,6 +42,7 @@ Docs: [Overview](README.en.md) / [Architecture](docs/ARCHITECTURE.en.md) / [Depl
|
||||
- Adapter responsibilities are streamlined to: **request normalization → DeepSeek invocation → protocol-shaped rendering**, reducing legacy split-logic paths.
|
||||
- Tool-calling semantics are aligned between Go and Node runtime: models should output the DSML shell `<|DSML|tool_calls>` → `<|DSML|invoke name="...">` → `<|DSML|parameter name="...">`; DS2API also accepts legacy canonical XML `<tool_calls>` → `<invoke name="...">` → `<parameter name="...">`. DSML is normalized back to XML at the parser entry, so internal parsing remains XML-based, with stream-time anti-leak filtering.
|
||||
- `Admin API` separates static config from runtime policy: `/admin/config*` for configuration state, `/admin/settings*` for runtime behavior.
|
||||
- When upstream returns a thinking-only response with no visible text, the Go main path for both streaming and non-streaming completions retries once in the same DeepSeek session: it appends the prompt suffix `"Previous reply had no visible output. Please regenerate the visible final answer or tool call now."` and sets `parent_message_id`. If that same-account retry would still end as `429 upstream_empty_output`, managed-account mode switches to the next available account, creates a fresh session, and retries the original payload once before returning 429.
|
||||
|
||||
---
|
||||
|
||||
@@ -84,7 +85,7 @@ Two header formats accepted:
|
||||
- Token is in `config.keys` → **Managed account mode**: DS2API auto-selects an account via rotation
|
||||
- Token is not in `config.keys` → **Direct token mode**: treated as a DeepSeek token directly
|
||||
|
||||
**Optional header**: `X-Ds2-Target-Account: <email_or_mobile>` — Pin a specific managed account; if the target account does not exist or the managed-account queue is exhausted, the request returns `429`, and current responses do not include `Retry-After`. If the account exists but login/refresh fails, the request returns the underlying `401` or upstream error.
|
||||
**Optional header**: `X-Ds2-Target-Account: <email_or_mobile>` — Pin a specific managed account; if the target account does not exist or the managed-account queue is exhausted, the request returns `429`, and current responses do not include `Retry-After`. If the account exists but login/refresh fails, the request returns the underlying `401` or upstream error. Without a pinned target, managed-account completion requests try one alternate-account fresh retry before returning an empty-output 429; pinned-target requests and requests with no other available account do not switch.
|
||||
Gemini-compatible clients can also send `x-goog-api-key`, `?key=`, or `?api_key=` as the caller credential source.
|
||||
|
||||
### Admin Endpoints (`/admin/*`)
|
||||
@@ -1258,7 +1259,7 @@ Clients should handle HTTP status code plus `error` / `detail` fields.
|
||||
| Code | Meaning |
|
||||
| --- | --- |
|
||||
| `401` | Authentication failed (invalid key/token, or expired admin JWT) |
|
||||
| `429` | Too many requests (exceeded inflight + queue capacity; current responses do not include `Retry-After`) |
|
||||
| `429` | Too many requests (exceeded inflight + queue capacity, or upstream thinking-only output with no visible answer; managed-account mode first tries one alternate-account fresh retry; current responses do not include `Retry-After`) |
|
||||
| `503` | Model unavailable or upstream error |
|
||||
|
||||
---
|
||||
|
||||
Reference in New Issue
Block a user