feat: Introduce hybrid streaming for Vercel deployments using a Go prepare endpoint and Node.js stream handler to mitigate buffering.

This commit is contained in:
CJACK
2026-02-16 21:56:01 +08:00
parent d668465734
commit d70a0acaa8
10 changed files with 783 additions and 28 deletions

View File

@@ -66,6 +66,11 @@ DS2API_ADMIN_KEY=admin
# Default: enabled on local/Docker, disabled on Vercel.
# DS2API_AUTO_BUILD_WEBUI=true
# Internal auth secret used by the Vercel hybrid streaming path
# (Go prepare endpoint <-> Node stream function).
# Optional: falls back to DS2API_ADMIN_KEY when unset.
# DS2API_VERCEL_INTERNAL_SECRET=change-me
# ---------------------------------------------------------------
# Vercel sync integration (optional)
# ---------------------------------------------------------------

View File

@@ -67,6 +67,9 @@ Notes:
- Rewrites and cache headers: `vercel.json`
- Build stage runs `npm ci --prefix webui && npm run build --prefix webui` automatically
- `vercel.json` routes `/admin/assets/*` and the `/admin` page to static output, while `/admin/*` APIs still go to `api/index`
- To mitigate Go Runtime streaming buffering, `/v1/chat/completions` on Vercel is routed to `api/chat-stream.js` (Node Runtime)
- `api/chat-stream.js` automatically falls back to the Go entry for non-stream requests or requests with `tools` (internal `__go=1`)
- `api/chat-stream.js` is data-path only (stream relay + SSE conversion); auth/account/session/PoW preparation still comes from an internal Go prepare endpoint (enabled on Vercel only)
Minimum environment variables:
@@ -82,6 +85,7 @@ Optional:
- `DS2API_ACCOUNT_CONCURRENCY` (alias of the same setting)
- `DS2API_ACCOUNT_MAX_QUEUE` (waiting queue limit, default=`recommended_concurrency`)
- `DS2API_ACCOUNT_QUEUE_SIZE` (alias of the same setting)
- `DS2API_VERCEL_INTERNAL_SECRET` (optional internal auth secret for Vercel hybrid streaming path; falls back to `DS2API_ADMIN_KEY` when unset)
Recommended concurrency is computed dynamically as `account_count * per_account_inflight_limit` (default is `account_count * 2`).
When inflight slots are full, requests are queued first; with default queue size, 429 typically starts around `account_count * 4`.
@@ -139,14 +143,19 @@ If you see:
No Output Directory named "public" found after the Build completed.
```
Vercel is validating frontend output against `public`, while this repo emits WebUI assets to `static/admin`.
Vercel is validating frontend output against `public`. This repo builds WebUI into `static/admin`, and uses the parent directory `static` as Vercel output root.
`vercel.json` now explicitly sets:
```json
"outputDirectory": "static/admin"
"outputDirectory": "static"
```
If you manually changed Output Directory in Project Settings, set it to `static/admin` (or clear it and let repo config apply).
If you manually changed Output Directory in Project Settings, set it to `static` (or clear it and let repo config apply).
Vercel streaming note (important):
- Vercel Go Runtime applies platform-level buffering, so this repo uses a hybrid path on Vercel (`Go prepare + Node stream`) to restore real-time SSE behavior.
- This adaptation is Vercel-only; local and Docker remain pure Go.
## 4. Reverse Proxy (Nginx)

View File

@@ -67,6 +67,9 @@ docker-compose up -d --build
- 路由与缓存头:`vercel.json`
- 构建阶段会自动执行 `npm ci --prefix webui && npm run build --prefix webui`
- `vercel.json` 已将 `/admin/assets/*``/admin` 页面走静态产物,`/admin/*` API 仍走 `api/index`
- 为缓解 Go Runtime 的流式缓冲,`/v1/chat/completions` 在 Vercel 上会优先走 `api/chat-stream.js`Node Runtime
- `api/chat-stream.js` 对非流式请求或 `tools` 请求会自动回退到 Go 入口(内部 `__go=1`
- `api/chat-stream.js` 仅负责流式数据转发与 SSE 转换鉴权、账号选择、会话创建、PoW 计算仍由 Go 内部 prepare 接口完成(仅 Vercel 启用)
至少配置环境变量:
@@ -82,6 +85,7 @@ docker-compose up -d --build
- `DS2API_ACCOUNT_CONCURRENCY`(同上别名)
- `DS2API_ACCOUNT_MAX_QUEUE`(等待队列上限,默认=`recommended_concurrency`
- `DS2API_ACCOUNT_QUEUE_SIZE`(同上别名)
- `DS2API_VERCEL_INTERNAL_SECRET`可选Vercel 混合流式链路内部鉴权;未设置时回退使用 `DS2API_ADMIN_KEY`
并发建议值会动态按 `账号数量 × 每账号并发上限` 计算(默认即 `账号数量 × 2`)。
当 in-flight 满时,请求先进入等待队列;默认队列上限等于建议并发值,因此默认 429 阈值约为 `账号数量 × 4`
@@ -138,13 +142,18 @@ Error: Command failed: go build -ldflags -s -w -o .../bootstrap .../main__vc__go
No Output Directory named "public" found after the Build completed.
```
说明 Vercel 正在按 `public` 校验前端产物目录。当前仓库前端产物目录是 `static/admin``vercel.json` 显式配置
说明 Vercel 正在按 `public` 校验前端产物目录。当前仓库会将 WebUI 构建到 `static/admin``vercel.json` 使用上级目录 `static` 作为输出根目录
```json
"outputDirectory": "static/admin"
"outputDirectory": "static"
```
若你在项目设置里手动改过 Output Directory请同步改为 `static/admin` 或清空让仓库配置生效。
若你在项目设置里手动改过 Output Directory请同步改为 `static` 或清空让仓库配置生效。
Vercel 流式说明(重要):
- Vercel 的 Go Runtime 存在平台层响应缓冲,因此本项目在 Vercel 上采用“Go prepare + Node stream”的混合链路来恢复实时 SSE。
- 该适配只在 Vercel 生效;本地与 Docker 仍走纯 Go 链路。
## 4. 反向代理Nginx

View File

@@ -87,6 +87,8 @@ docker-compose logs -f
- 入口:`api/index.go`
- 路由重写:`vercel.json`
- `vercel.json` 会在构建阶段自动执行 `npm ci --prefix webui && npm run build --prefix webui`
- `/v1/chat/completions` 在 Vercel 上默认走 `api/chat-stream.js`Node Runtime以保证实时 SSE
- `api/chat-stream.js` 仅负责流式数据转发;鉴权、账号选择、会话/PoW 准备仍由 Go 内部 prepare 接口处理
- 至少配置:
- `DS2API_ADMIN_KEY`
- `DS2API_CONFIG_JSON`JSON 字符串或 Base64
@@ -166,6 +168,7 @@ cp config.example.json config.json
| `DS2API_WASM_PATH` | PoW wasm 文件路径 |
| `DS2API_STATIC_ADMIN_DIR` | 管理台静态文件目录 |
| `DS2API_AUTO_BUILD_WEBUI` | 启动时缺失 WebUI 时是否自动执行 npm build默认本地开启Vercel 关闭) |
| `DS2API_VERCEL_INTERNAL_SECRET` | Vercel 混合流式链路内部鉴权密钥(可选;未设置时回退用 `DS2API_ADMIN_KEY` |
| `VERCEL_TOKEN` | Vercel 同步 token可选 |
| `VERCEL_PROJECT_ID` | Vercel 项目 ID可选 |
| `VERCEL_TEAM_ID` | Vercel 团队 ID可选 |

View File

@@ -87,6 +87,8 @@ docker-compose logs -f
- Entrypoint: `api/index.go`
- Rewrites: `vercel.json`
- `vercel.json` runs `npm ci --prefix webui && npm run build --prefix webui` during build
- `/v1/chat/completions` is routed to `api/chat-stream.js` (Node Runtime) on Vercel to preserve real-time SSE
- `api/chat-stream.js` is data-path only; auth/account/session/PoW preparation still comes from an internal Go prepare endpoint
- Minimum env vars:
- `DS2API_ADMIN_KEY`
- `DS2API_CONFIG_JSON` (raw JSON or Base64)
@@ -166,6 +168,7 @@ cp config.example.json config.json
| `DS2API_WASM_PATH` | PoW wasm path |
| `DS2API_STATIC_ADMIN_DIR` | Admin static assets dir |
| `DS2API_AUTO_BUILD_WEBUI` | Auto run npm build on startup when WebUI assets are missing (default: enabled locally, disabled on Vercel) |
| `DS2API_VERCEL_INTERNAL_SECRET` | Internal auth secret for Vercel hybrid streaming path (optional; falls back to `DS2API_ADMIN_KEY` if unset) |
| `VERCEL_TOKEN` | Vercel sync token (optional) |
| `VERCEL_PROJECT_ID` | Vercel project ID (optional) |
| `VERCEL_TEAM_ID` | Vercel team ID (optional) |

543
api/chat-stream.js Normal file
View File

@@ -0,0 +1,543 @@
'use strict';
const DEEPSEEK_COMPLETION_URL = 'https://chat.deepseek.com/api/v0/chat/completion';
const BASE_HEADERS = {
Host: 'chat.deepseek.com',
'User-Agent': 'DeepSeek/1.6.11 Android/35',
Accept: 'application/json',
'Content-Type': 'application/json',
'x-client-platform': 'android',
'x-client-version': '1.6.11',
'x-client-locale': 'zh_CN',
'accept-charset': 'UTF-8',
};
const SKIP_PATTERNS = [
'quasi_status',
'elapsed_secs',
'token_usage',
'pending_fragment',
'conversation_mode',
'fragments/-1/status',
'fragments/-2/status',
'fragments/-3/status',
];
module.exports = async function handler(req, res) {
setCorsHeaders(res);
if (req.method === 'OPTIONS') {
res.statusCode = 204;
res.end();
return;
}
if (req.method !== 'POST') {
writeOpenAIError(res, 405, 'method not allowed');
return;
}
const rawBody = await readRawBody(req);
let payload;
try {
payload = JSON.parse(rawBody.toString('utf8') || '{}');
} catch (_err) {
writeOpenAIError(res, 400, 'invalid json');
return;
}
// Keep all non-stream behavior on Go side to avoid compatibility regressions.
if (!toBool(payload.stream) || (Array.isArray(payload.tools) && payload.tools.length > 0)) {
await proxyToGo(req, res, rawBody);
return;
}
const prep = await fetchStreamPrepare(req, rawBody);
if (!prep.ok) {
relayPreparedFailure(res, prep);
return;
}
const model = asString(prep.body.model) || asString(payload.model);
const sessionID = asString(prep.body.session_id) || `chatcmpl-${Date.now()}`;
const deepseekToken = asString(prep.body.deepseek_token);
const powHeader = asString(prep.body.pow_header);
const completionPayload = prep.body.payload && typeof prep.body.payload === 'object' ? prep.body.payload : null;
const finalPrompt = asString(prep.body.final_prompt);
const thinkingEnabled = toBool(prep.body.thinking_enabled);
const searchEnabled = toBool(prep.body.search_enabled);
if (!model || !deepseekToken || !powHeader || !completionPayload) {
writeOpenAIError(res, 500, 'invalid vercel prepare response');
return;
}
const completionRes = await fetch(DEEPSEEK_COMPLETION_URL, {
method: 'POST',
headers: {
...BASE_HEADERS,
authorization: `Bearer ${deepseekToken}`,
'x-ds-pow-response': powHeader,
},
body: JSON.stringify(completionPayload),
});
if (!completionRes.ok || !completionRes.body) {
const detail = await safeReadText(completionRes);
writeOpenAIError(res, 500, detail ? `Failed to get completion: ${detail}` : 'Failed to get completion.');
return;
}
res.statusCode = 200;
res.setHeader('Content-Type', 'text/event-stream');
res.setHeader('Cache-Control', 'no-cache, no-transform');
res.setHeader('Connection', 'keep-alive');
res.setHeader('X-Accel-Buffering', 'no');
if (typeof res.flushHeaders === 'function') {
res.flushHeaders();
}
const created = Math.floor(Date.now() / 1000);
let firstChunkSent = false;
let currentType = thinkingEnabled ? 'thinking' : 'text';
let thinkingText = '';
let outputText = '';
const decoder = new TextDecoder();
const reader = completionRes.body.getReader();
let buffered = '';
const sendFrame = (obj) => {
res.write(`data: ${JSON.stringify(obj)}\n\n`);
if (typeof res.flush === 'function') {
res.flush();
}
};
const finish = (reason) => {
sendFrame({
id: sessionID,
object: 'chat.completion.chunk',
created,
model,
choices: [{ delta: {}, index: 0, finish_reason: reason }],
usage: buildUsage(finalPrompt, thinkingText, outputText),
});
res.write('data: [DONE]\n\n');
res.end();
};
try {
// eslint-disable-next-line no-constant-condition
while (true) {
const { value, done } = await reader.read();
if (done) {
break;
}
buffered += decoder.decode(value, { stream: true });
const lines = buffered.split('\n');
buffered = lines.pop() || '';
for (const rawLine of lines) {
const line = rawLine.trim();
if (!line.startsWith('data:')) {
continue;
}
const dataStr = line.slice(5).trim();
if (!dataStr) {
continue;
}
if (dataStr === '[DONE]') {
finish('stop');
return;
}
let chunk;
try {
chunk = JSON.parse(dataStr);
} catch (_err) {
continue;
}
if (chunk.error || chunk.code === 'content_filter') {
finish('content_filter');
return;
}
const parsed = parseChunkForContent(chunk, thinkingEnabled, currentType);
currentType = parsed.newType;
if (parsed.finished) {
finish('stop');
return;
}
for (const p of parsed.parts) {
if (!p.text) {
continue;
}
if (searchEnabled && isCitation(p.text)) {
continue;
}
const delta = {};
if (!firstChunkSent) {
delta.role = 'assistant';
firstChunkSent = true;
}
if (p.type === 'thinking') {
thinkingText += p.text;
delta.reasoning_content = p.text;
} else {
outputText += p.text;
delta.content = p.text;
}
sendFrame({
id: sessionID,
object: 'chat.completion.chunk',
created,
model,
choices: [{ delta, index: 0 }],
});
}
}
}
finish('stop');
} catch (_err) {
finish('stop');
}
};
function setCorsHeaders(res) {
res.setHeader('Access-Control-Allow-Origin', '*');
res.setHeader('Access-Control-Allow-Credentials', 'true');
res.setHeader('Access-Control-Allow-Methods', 'GET, POST, OPTIONS, PUT, DELETE');
res.setHeader('Access-Control-Allow-Headers', 'Content-Type, Authorization, X-API-Key, X-Ds2-Target-Account');
}
function header(req, key) {
if (!req || !req.headers) {
return '';
}
return asString(req.headers[key.toLowerCase()]);
}
async function readRawBody(req) {
if (Buffer.isBuffer(req.body)) {
return req.body;
}
if (typeof req.body === 'string') {
return Buffer.from(req.body);
}
if (req.body && typeof req.body === 'object') {
return Buffer.from(JSON.stringify(req.body));
}
const chunks = [];
for await (const chunk of req) {
chunks.push(Buffer.isBuffer(chunk) ? chunk : Buffer.from(chunk));
}
return Buffer.concat(chunks);
}
async function fetchStreamPrepare(req, rawBody) {
const proto = asString(header(req, 'x-forwarded-proto')) || 'https';
const host = asString(header(req, 'host'));
const url = new URL(`${proto}://${host}${req.url || '/v1/chat/completions'}`);
url.searchParams.set('__go', '1');
url.searchParams.set('__stream_prepare', '1');
const upstream = await fetch(url.toString(), {
method: 'POST',
headers: {
authorization: asString(header(req, 'authorization')),
'x-api-key': asString(header(req, 'x-api-key')),
'x-ds2-target-account': asString(header(req, 'x-ds2-target-account')),
'x-ds2-internal-token': internalSecret(),
'content-type': asString(header(req, 'content-type')) || 'application/json',
},
body: rawBody,
});
const text = await upstream.text();
let body = {};
try {
body = JSON.parse(text || '{}');
} catch (_err) {
body = {};
}
return {
ok: upstream.ok,
status: upstream.status,
contentType: upstream.headers.get('content-type') || 'application/json',
text,
body,
};
}
function relayPreparedFailure(res, prep) {
res.statusCode = prep.status || 500;
res.setHeader('Content-Type', prep.contentType || 'application/json');
if (prep.text) {
res.end(prep.text);
return;
}
writeOpenAIError(res, prep.status || 500, 'vercel prepare failed');
}
async function safeReadText(resp) {
if (!resp) {
return '';
}
try {
const text = await resp.text();
return text.trim();
} catch (_err) {
return '';
}
}
function internalSecret() {
return asString(process.env.DS2API_VERCEL_INTERNAL_SECRET) || asString(process.env.DS2API_ADMIN_KEY) || 'admin';
}
function parseChunkForContent(chunk, thinkingEnabled, currentType) {
if (!chunk || typeof chunk !== 'object' || !Object.prototype.hasOwnProperty.call(chunk, 'v')) {
return { parts: [], finished: false, newType: currentType };
}
const pathValue = asString(chunk.p);
if (shouldSkipPath(pathValue)) {
return { parts: [], finished: false, newType: currentType };
}
if (pathValue === 'response/status' && asString(chunk.v) === 'FINISHED') {
return { parts: [], finished: true, newType: currentType };
}
let newType = currentType;
const parts = [];
if (pathValue === 'response/fragments' && asString(chunk.o).toUpperCase() === 'APPEND' && Array.isArray(chunk.v)) {
for (const frag of chunk.v) {
if (!frag || typeof frag !== 'object') {
continue;
}
const fragType = asString(frag.type).toUpperCase();
const content = asString(frag.content);
if (!content) {
continue;
}
if (fragType === 'THINK' || fragType === 'THINKING') {
newType = 'thinking';
parts.push({ text: content, type: 'thinking' });
} else if (fragType === 'RESPONSE') {
newType = 'text';
parts.push({ text: content, type: 'text' });
} else {
parts.push({ text: content, type: 'text' });
}
}
}
if (pathValue === 'response' && Array.isArray(chunk.v)) {
for (const item of chunk.v) {
if (!item || typeof item !== 'object') {
continue;
}
if (item.p === 'fragments' && item.o === 'APPEND' && Array.isArray(item.v)) {
for (const frag of item.v) {
const fragType = asString(frag && frag.type).toUpperCase();
if (fragType === 'THINK' || fragType === 'THINKING') {
newType = 'thinking';
} else if (fragType === 'RESPONSE') {
newType = 'text';
}
}
}
}
}
let partType = 'text';
if (pathValue === 'response/thinking_content') {
partType = 'thinking';
} else if (pathValue === 'response/content') {
partType = 'text';
} else if (pathValue.includes('response/fragments') && pathValue.includes('/content')) {
partType = newType;
} else if (!pathValue && thinkingEnabled) {
partType = newType;
}
const val = chunk.v;
if (typeof val === 'string') {
if (val === 'FINISHED' && (!pathValue || pathValue === 'status')) {
return { parts: [], finished: true, newType };
}
if (val) {
parts.push({ text: val, type: partType });
}
return { parts, finished: false, newType };
}
if (Array.isArray(val)) {
for (const entry of val) {
if (typeof entry === 'string') {
if (entry) {
parts.push({ text: entry, type: partType });
}
continue;
}
if (!entry || typeof entry !== 'object') {
continue;
}
if (asString(entry.p) === 'status' && asString(entry.v) === 'FINISHED') {
return { parts: [], finished: true, newType };
}
const content = asString(entry.content);
if (!content) {
continue;
}
const t = asString(entry.type).toUpperCase();
if (t === 'THINK' || t === 'THINKING') {
parts.push({ text: content, type: 'thinking' });
} else if (t === 'RESPONSE') {
parts.push({ text: content, type: 'text' });
} else {
parts.push({ text: content, type: partType });
}
}
return { parts, finished: false, newType };
}
if (val && typeof val === 'object') {
const resp = val.response && typeof val.response === 'object' ? val.response : val;
if (Array.isArray(resp.fragments)) {
for (const frag of resp.fragments) {
if (!frag || typeof frag !== 'object') {
continue;
}
const content = asString(frag.content);
if (!content) {
continue;
}
const t = asString(frag.type).toUpperCase();
if (t === 'THINK' || t === 'THINKING') {
newType = 'thinking';
parts.push({ text: content, type: 'thinking' });
} else if (t === 'RESPONSE') {
newType = 'text';
parts.push({ text: content, type: 'text' });
} else {
parts.push({ text: content, type: partType });
}
}
}
}
return { parts, finished: false, newType };
}
function shouldSkipPath(pathValue) {
if (pathValue === 'response/search_status') {
return true;
}
for (const p of SKIP_PATTERNS) {
if (pathValue.includes(p)) {
return true;
}
}
return false;
}
function isCitation(text) {
return asString(text).trim().startsWith('[citation:');
}
function buildUsage(prompt, thinking, output) {
const promptTokens = estimateTokens(prompt);
const reasoningTokens = estimateTokens(thinking);
const completionTokens = estimateTokens(output);
return {
prompt_tokens: promptTokens,
completion_tokens: reasoningTokens + completionTokens,
total_tokens: promptTokens + reasoningTokens + completionTokens,
completion_tokens_details: {
reasoning_tokens: reasoningTokens,
},
};
}
function estimateTokens(text) {
const t = asString(text);
if (!t) {
return 0;
}
const n = Math.floor(Array.from(t).length / 4);
return n < 1 ? 1 : n;
}
async function proxyToGo(req, res, rawBody) {
const proto = asString(header(req, 'x-forwarded-proto')) || 'https';
const host = asString(header(req, 'host'));
const url = new URL(`${proto}://${host}${req.url || '/v1/chat/completions'}`);
url.searchParams.set('__go', '1');
const upstream = await fetch(url.toString(), {
method: 'POST',
headers: {
authorization: asString(header(req, 'authorization')),
'x-api-key': asString(header(req, 'x-api-key')),
'x-ds2-target-account': asString(header(req, 'x-ds2-target-account')),
'content-type': asString(header(req, 'content-type')) || 'application/json',
},
body: rawBody,
});
res.statusCode = upstream.status;
upstream.headers.forEach((value, key) => {
if (key.toLowerCase() === 'content-length') {
return;
}
res.setHeader(key, value);
});
const bytes = Buffer.from(await upstream.arrayBuffer());
res.end(bytes);
}
function writeOpenAIError(res, status, message) {
res.statusCode = status;
res.setHeader('Content-Type', 'application/json');
res.end(
JSON.stringify({
error: {
message,
type: openAIErrorType(status),
},
}),
);
}
function openAIErrorType(status) {
switch (status) {
case 400:
return 'invalid_request_error';
case 401:
return 'authentication_error';
case 403:
return 'permission_error';
case 429:
return 'rate_limit_error';
case 503:
return 'service_unavailable_error';
default:
return status >= 500 ? 'api_error' : 'invalid_request_error';
}
}
function toBool(v) {
return v === true;
}
function asString(v) {
if (typeof v === 'string') {
return v.trim();
}
if (Array.isArray(v)) {
return asString(v[0]);
}
if (v == null) {
return '';
}
return String(v).trim();
}

View File

@@ -230,19 +230,21 @@ func collectDeepSeek(resp *http.Response, thinkingEnabled bool) (string, string)
func (h *Handler) writeClaudeStream(w http.ResponseWriter, r *http.Request, model string, messages []any, fullText string, detected []util.ParsedToolCall) {
w.Header().Set("Content-Type", "text/event-stream")
w.Header().Set("Cache-Control", "no-cache")
w.Header().Set("Cache-Control", "no-cache, no-transform")
w.Header().Set("Connection", "keep-alive")
flusher, hasFlusher := w.(http.Flusher)
if !hasFlusher {
config.Logger.Warn("[claude_stream] response writer does not support flush; falling back to buffered SSE")
w.Header().Set("X-Accel-Buffering", "no")
rc := http.NewResponseController(w)
canFlush := rc.Flush() == nil
if !canFlush {
config.Logger.Warn("[claude_stream] response writer does not support flush; streaming may be buffered")
}
send := func(v any) {
b, _ := json.Marshal(v)
_, _ = w.Write([]byte("data: "))
_, _ = w.Write(b)
_, _ = w.Write([]byte("\n\n"))
if hasFlusher {
flusher.Flush()
if canFlush {
_ = rc.Flush()
}
}
messageID := fmt.Sprintf("msg_%d", time.Now().UnixNano())

View File

@@ -3,10 +3,12 @@ package openai
import (
"bufio"
"context"
"crypto/subtle"
"encoding/json"
"fmt"
"io"
"net/http"
"os"
"strings"
"time"
@@ -35,6 +37,11 @@ func (h *Handler) ListModels(w http.ResponseWriter, _ *http.Request) {
}
func (h *Handler) ChatCompletions(w http.ResponseWriter, r *http.Request) {
if isVercelStreamPrepareRequest(r) {
h.handleVercelStreamPrepare(w, r)
return
}
a, err := h.Auth.Determine(r)
if err != nil {
status := http.StatusUnauthorized
@@ -106,6 +113,98 @@ func (h *Handler) ChatCompletions(w http.ResponseWriter, r *http.Request) {
h.handleNonStream(w, r.Context(), resp, sessionID, model, finalPrompt, thinkingEnabled, searchEnabled, toolNames)
}
func (h *Handler) handleVercelStreamPrepare(w http.ResponseWriter, r *http.Request) {
if !config.IsVercel() {
http.NotFound(w, r)
return
}
internalSecret := vercelInternalSecret()
internalToken := strings.TrimSpace(r.Header.Get("X-Ds2-Internal-Token"))
if internalSecret == "" || subtle.ConstantTimeCompare([]byte(internalToken), []byte(internalSecret)) != 1 {
writeOpenAIError(w, http.StatusUnauthorized, "unauthorized internal request")
return
}
a, err := h.Auth.Determine(r)
if err != nil {
status := http.StatusUnauthorized
if err == auth.ErrNoAccount {
status = http.StatusTooManyRequests
}
writeOpenAIError(w, status, err.Error())
return
}
defer h.Auth.Release(a)
r = r.WithContext(auth.WithAuth(r.Context(), a))
var req map[string]any
if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
writeOpenAIError(w, http.StatusBadRequest, "invalid json")
return
}
if !toBool(req["stream"]) {
writeOpenAIError(w, http.StatusBadRequest, "stream must be true")
return
}
if tools, ok := req["tools"].([]any); ok && len(tools) > 0 {
writeOpenAIError(w, http.StatusBadRequest, "tools are not supported by vercel stream prepare")
return
}
model, _ := req["model"].(string)
messagesRaw, _ := req["messages"].([]any)
if model == "" || len(messagesRaw) == 0 {
writeOpenAIError(w, http.StatusBadRequest, "Request must include 'model' and 'messages'.")
return
}
thinkingEnabled, searchEnabled, ok := config.GetModelConfig(model)
if !ok {
writeOpenAIError(w, http.StatusServiceUnavailable, fmt.Sprintf("Model '%s' is not available.", model))
return
}
messages := normalizeMessages(messagesRaw)
finalPrompt := util.MessagesPrepare(messages)
sessionID, err := h.DS.CreateSession(r.Context(), a, 3)
if err != nil {
if a.UseConfigToken {
writeOpenAIError(w, http.StatusUnauthorized, "Account token is invalid. Please re-login the account in admin.")
} else {
writeOpenAIError(w, http.StatusUnauthorized, "Invalid token. If this should be a DS2API key, add it to config.keys first.")
}
return
}
powHeader, err := h.DS.GetPow(r.Context(), a, 3)
if err != nil {
writeOpenAIError(w, http.StatusUnauthorized, "Failed to get PoW (invalid token or unknown error).")
return
}
if strings.TrimSpace(a.DeepSeekToken) == "" {
writeOpenAIError(w, http.StatusUnauthorized, "Invalid token. If this should be a DS2API key, add it to config.keys first.")
return
}
payload := map[string]any{
"chat_session_id": sessionID,
"parent_message_id": nil,
"prompt": finalPrompt,
"ref_file_ids": []any{},
"thinking_enabled": thinkingEnabled,
"search_enabled": searchEnabled,
}
writeJSON(w, http.StatusOK, map[string]any{
"session_id": sessionID,
"model": model,
"final_prompt": finalPrompt,
"thinking_enabled": thinkingEnabled,
"search_enabled": searchEnabled,
"deepseek_token": a.DeepSeekToken,
"pow_header": powHeader,
"payload": payload,
})
}
func (h *Handler) handleNonStream(w http.ResponseWriter, ctx context.Context, resp *http.Response, completionID, model, finalPrompt string, thinkingEnabled, searchEnabled bool, toolNames []string) {
defer resp.Body.Close()
if resp.StatusCode != http.StatusOK {
@@ -191,11 +290,13 @@ func (h *Handler) handleStream(w http.ResponseWriter, r *http.Request, resp *htt
return
}
w.Header().Set("Content-Type", "text/event-stream")
w.Header().Set("Cache-Control", "no-cache")
w.Header().Set("Cache-Control", "no-cache, no-transform")
w.Header().Set("Connection", "keep-alive")
flusher, hasFlusher := w.(http.Flusher)
if !hasFlusher {
config.Logger.Warn("[stream] response writer does not support flush; falling back to buffered SSE")
w.Header().Set("X-Accel-Buffering", "no")
rc := http.NewResponseController(w)
canFlush := rc.Flush() == nil
if !canFlush {
config.Logger.Warn("[stream] response writer does not support flush; streaming may be buffered")
}
lines := make(chan []byte, 128)
@@ -232,14 +333,14 @@ func (h *Handler) handleStream(w http.ResponseWriter, r *http.Request, resp *htt
_, _ = w.Write([]byte("data: "))
_, _ = w.Write(b)
_, _ = w.Write([]byte("\n\n"))
if hasFlusher {
flusher.Flush()
if canFlush {
_ = rc.Flush()
}
}
sendDone := func() {
_, _ = w.Write([]byte("data: [DONE]\n\n"))
if hasFlusher {
flusher.Flush()
if canFlush {
_ = rc.Flush()
}
}
@@ -316,9 +417,9 @@ func (h *Handler) handleStream(w http.ResponseWriter, r *http.Request, resp *htt
finalize("stop")
return
}
if hasFlusher {
if canFlush {
_, _ = w.Write([]byte(": keep-alive\n\n"))
flusher.Flush()
_ = rc.Flush()
}
case line, ok := <-lines:
if !ok {
@@ -483,3 +584,20 @@ func openAIErrorType(status int) string {
return "invalid_request_error"
}
}
func isVercelStreamPrepareRequest(r *http.Request) bool {
if r == nil {
return false
}
return strings.TrimSpace(r.URL.Query().Get("__stream_prepare")) == "1"
}
func vercelInternalSecret() string {
if v := strings.TrimSpace(os.Getenv("DS2API_VERCEL_INTERNAL_SECRET")); v != "" {
return v
}
if v := strings.TrimSpace(os.Getenv("DS2API_ADMIN_KEY")); v != "" {
return v
}
return "admin"
}

View File

@@ -0,0 +1,44 @@
package openai
import (
"net/http/httptest"
"testing"
)
func TestIsVercelStreamPrepareRequest(t *testing.T) {
req := httptest.NewRequest("POST", "/v1/chat/completions?__stream_prepare=1", nil)
if !isVercelStreamPrepareRequest(req) {
t.Fatalf("expected prepare request to be detected")
}
req2 := httptest.NewRequest("POST", "/v1/chat/completions", nil)
if isVercelStreamPrepareRequest(req2) {
t.Fatalf("expected non-prepare request")
}
}
func TestVercelInternalSecret(t *testing.T) {
t.Run("prefer explicit secret", func(t *testing.T) {
t.Setenv("DS2API_VERCEL_INTERNAL_SECRET", "stream-secret")
t.Setenv("DS2API_ADMIN_KEY", "admin-fallback")
if got := vercelInternalSecret(); got != "stream-secret" {
t.Fatalf("expected explicit secret, got %q", got)
}
})
t.Run("fallback to admin key", func(t *testing.T) {
t.Setenv("DS2API_VERCEL_INTERNAL_SECRET", "")
t.Setenv("DS2API_ADMIN_KEY", "admin-fallback")
if got := vercelInternalSecret(); got != "admin-fallback" {
t.Fatalf("expected admin key fallback, got %q", got)
}
})
t.Run("default admin when env missing", func(t *testing.T) {
t.Setenv("DS2API_VERCEL_INTERNAL_SECRET", "")
t.Setenv("DS2API_ADMIN_KEY", "")
if got := vercelInternalSecret(); got != "admin" {
t.Fatalf("expected default admin fallback, got %q", got)
}
})
}

View File

@@ -1,8 +1,27 @@
{
"version": 2,
"buildCommand": "npm ci --prefix webui && npm run build --prefix webui",
"outputDirectory": "static/admin",
"outputDirectory": "static",
"functions": {
"api/chat-stream.js": {
"includeFiles": "**/sha3_wasm_bg.7b9ca65ddd.wasm"
}
},
"rewrites": [
{
"source": "/v1/chat/completions",
"has": [
{
"type": "query",
"key": "__go"
}
],
"destination": "/api/index"
},
{
"source": "/v1/chat/completions",
"destination": "/api/chat-stream"
},
{
"source": "/admin/login",
"destination": "/api/index"
@@ -44,16 +63,16 @@
"destination": "/api/index"
},
{
"source": "/admin/assets/(.*)",
"destination": "/assets/$1"
"source": "/admin",
"destination": "/admin/index.html"
},
{
"source": "/admin",
"destination": "/index.html"
"source": "/admin/assets/(.*)",
"destination": "/admin/assets/$1"
},
{
"source": "/admin/(.*)",
"destination": "/index.html"
"destination": "/admin/index.html"
},
{
"source": "/(.*)",