feat: implement managed-account rotation on 429 empty-output completion retries

This commit is contained in:
CJACK
2026-05-10 00:41:45 +08:00
parent 3cc7f469f3
commit ddd42e532e
15 changed files with 362 additions and 20 deletions

View File

@@ -336,6 +336,7 @@ For business endpoints (`/v1/*`, `/anthropic/*`, Gemini routes), DS2API supports
| **Direct token** | If the token is not in `config.keys`, DS2API treats it as a DeepSeek token directly |
Optional header `X-Ds2-Target-Account`: Pin a specific managed account (value is email or mobile).
When no target account is pinned, if a completion would end as `429 upstream_empty_output` after the same-account empty-output retry, managed-account mode switches to the next available account, creates a fresh session, and retries the original payload once.
Gemini routes also accept `x-goog-api-key`, or `?key=` / `?api_key=` when no auth header is present.
## Concurrency Model
@@ -349,6 +350,7 @@ Queue limit = DS2API_ACCOUNT_MAX_QUEUE (default = recommended concurrency)
- When inflight slots are full, requests enter a waiting queue — **no immediate 429**
- 429 is returned only when total load exceeds inflight + queue capacity
- Completion empty-output 429s first get the same-account compensation retry; managed-account mode also tries one alternate-account fresh retry before returning the final 429
- `GET /admin/queue/status` returns real-time concurrency state
## Tool Call Adaptation