Mirror of https://github.com/CJackHwang/ds2api.git, synced 2026-05-17 22:55:10 +08:00
feat: implement managed-account rotation on 429 empty-output completion retries
@@ -336,6 +336,7 @@ For business endpoints (`/v1/*`, `/anthropic/*`, Gemini routes), DS2API supports
| **Direct token** | If the token is not in `config.keys`, DS2API treats it as a DeepSeek token directly |
Optional header `X-Ds2-Target-Account`: pins a specific managed account (the value is the account's email or mobile number).
When no target account is pinned and a completion would still end as `429 upstream_empty_output` after the same-account empty-output retry, managed-account mode switches to the next available account, creates a fresh session, and retries the original payload once.
Gemini routes also accept `x-goog-api-key`, or `?key=` / `?api_key=` when no auth header is present.
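
The retry order described above (same-account retry first, then one alternate-account retry when no account is pinned) can be sketched roughly as follows. This is a minimal illustration, not DS2API's actual code: `Result`, `is_empty_output`, `complete_with_rotation`, and the `send` callback are hypothetical names, and "next available account" is simplified to "first other account in the list".

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass
class Result:
    status: int
    code: str = ""
    text: str = ""


def is_empty_output(result: Result) -> bool:
    # The failure this retry path targets: 429 upstream_empty_output.
    return result.status == 429 and result.code == "upstream_empty_output"


def complete_with_rotation(
    payload: dict,
    accounts: Sequence[str],
    send: Callable[[str, dict], Result],
    pinned: Optional[str] = None,
) -> Result:
    """Hypothetical sketch. `send(account, payload)` stands in for one
    upstream completion attempt; for the alternate account it is assumed
    to run on a freshly created session."""
    current = pinned if pinned is not None else accounts[0]

    result = send(current, payload)
    if not is_empty_output(result):
        return result

    # Same-account empty-output compensation retry.
    result = send(current, payload)
    if not is_empty_output(result) or pinned is not None:
        # Success, a different error, or a pinned account: no rotation.
        return result

    # Managed-account mode: one retry of the original payload on another account.
    for candidate in accounts:
        if candidate != current:
            return send(candidate, payload)
    return result  # no alternate account available, the final 429 stands
```

With a pinned account the function stops after the same-account retry, which matches the "when no target account is pinned" condition in the text.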
## Concurrency Model
@@ -349,6 +350,7 @@ Queue limit = DS2API_ACCOUNT_MAX_QUEUE (default = recommended concurrency)
- When inflight slots are full, requests enter a waiting queue — **no immediate 429**
- 429 is returned only when total load exceeds inflight + queue capacity
- An empty-output 429 on a completion first gets the same-account compensation retry; in managed-account mode, DS2API then makes one retry on an alternate account with a fresh session before returning the final 429
- `GET /admin/queue/status` returns real-time concurrency state
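
The admission rule in the first two bullets (queue when inflight slots are full, reject only when the queue is full too) can be illustrated with a small counter model. This is a sketch under assumptions: `AccountGate`, `admit`, and `release` are hypothetical names, the limits are treated as a single pair of totals, and real waiting and wake-up logic is left out.

```python
from dataclasses import dataclass


@dataclass
class AccountGate:
    max_inflight: int   # total inflight slots (assumed single pool)
    max_queue: int      # queue limit, e.g. from DS2API_ACCOUNT_MAX_QUEUE
    inflight: int = 0
    queued: int = 0

    def admit(self) -> str:
        # A free inflight slot: the request runs immediately.
        if self.inflight < self.max_inflight:
            self.inflight += 1
            return "run"
        # Slots are full: wait in the queue instead of failing right away.
        if self.queued < self.max_queue:
            self.queued += 1
            return "queued"
        # Load exceeds inflight + queue capacity: only now is 429 returned.
        return "rejected_429"

    def release(self) -> None:
        # A finished request frees its slot; a waiter, if any, takes it over,
        # so inflight stays the same and the queue shrinks by one.
        if self.queued > 0:
            self.queued -= 1
        elif self.inflight > 0:
            self.inflight -= 1
```

For example, with `AccountGate(max_inflight=2, max_queue=2)` the first two requests run, the next two queue, and only the fifth is rejected.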
## Tool Call Adaptation