Promote raw stream replay into standalone simulator tool and add SSE field doc

This commit is contained in:
CJACK.
2026-04-03 10:18:48 +08:00
parent fe43f1e6ee
commit a28e833f33
13 changed files with 1364 additions and 0 deletions


@@ -0,0 +1,28 @@
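# Raw stream sample directory
This directory holds samples of **real upstream raw SSE streams** for local simulation tests and parser adaptation.
## Directory layout
Each sample lives in its own subdirectory:
- `meta.json`: sample metadata (question, model, capture time, notes)
- `upstream.stream.sse`: the complete raw SSE text (`event:` / `data:` lines)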
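The layout above can be walked with a few lines of Python. This is only a sketch of one way to discover samples, not the repository's own loader: the function name `discover_samples` and the choice to skip incomplete directories silently are assumptions made for illustration.

```python
import json
from pathlib import Path

# The two files every sample directory is expected to contain.
REQUIRED_FILES = ("meta.json", "upstream.stream.sse")


def discover_samples(root):
    """Return (sample_id, meta, raw_sse_text) for each complete sample under root."""
    samples = []
    for sample_dir in sorted(Path(root).iterdir()):
        if not sample_dir.is_dir():
            continue
        if not all((sample_dir / name).is_file() for name in REQUIRED_FILES):
            continue  # hypothetical policy: skip incomplete samples
        meta = json.loads((sample_dir / "meta.json").read_text(encoding="utf-8"))
        raw = (sample_dir / "upstream.stream.sse").read_text(encoding="utf-8")
        samples.append((sample_dir.name, meta, raw))
    return samples
```

Sorting the directory entries keeps the replay order stable between runs, which makes archived reports easier to compare.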
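## Adding a sample
1. Capture one real request (enabling `DS2API_DEV_PACKET_CAPTURE=1` is recommended).
2. Create a new `<sample-id>/` directory and put `meta.json` and `upstream.stream.sse` in it.
3. Run the standalone simulator tool (other test scripts can call it too):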
```bash
./tests/scripts/run-raw-stream-sim.sh
```
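The tool walks every sample in this directory automatically, replays each one in its original stream order, and checks the following:
- Upstream `status=FINISHED` fragments are never emitted as body text (leak prevention).
- The `response/status=FINISHED` end-of-stream signal is detected correctly.
- An archivable JSON report is written to `artifacts/raw-stream-sim/`.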
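The checks listed above can be sketched in Python as follows. This is an illustration under stated assumptions, not the simulator's actual implementation: it assumes each `data:` line carries a JSON object with a path key `p` and a value key `v`, that a chunk without `p` continues the previous path, and that body text arrives on paths ending in `/content`. The names `replay` and `verify_sample` are hypothetical.

```python
import json


def iter_sse_data(raw):
    """Yield the payload of each `data:` line in stream order."""
    for line in raw.replace("\r\n", "\n").split("\n"):
        if line.startswith("data:"):
            yield line[len("data:"):].strip()


def replay(raw):
    """Replay one sample; return (emitted_text, end_signal_seen)."""
    output, finished, path = [], False, None
    for data in iter_sse_data(raw):
        try:
            chunk = json.loads(data)
        except json.JSONDecodeError:
            continue  # non-JSON payloads are ignored in this sketch
        if not isinstance(chunk, dict):
            continue
        # Assumption: a chunk without "p" continues the previous path.
        path = chunk.get("p", path)
        value = chunk.get("v")
        if path == "response/status" and value == "FINISHED":
            finished = True  # end-of-stream signal, never body text
        elif isinstance(value, str) and path and path.endswith("/content"):
            output.append(value)
    return "".join(output), finished


def verify_sample(sample_id, raw):
    """Build a JSON-serializable report for one sample."""
    text, finished = replay(raw)
    leaked = "FINISHED" in text
    return {
        "sample_id": sample_id,
        "finished_seen": finished,
        "leaked_finished": leaked,
        "output_chars": len(text),
        "passed": finished and not leaked,
    }
```

The report is a plain dictionary, so `json.dumps(verify_sample(sample_id, raw))` yields something that could be archived next to the other reports.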
> Note: samples may contain the body text of search results and citation information. Do not include sensitive accounts or keys.


@@ -0,0 +1,55 @@
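# Sample analysis (Guangzhou weather / deepseek-reasoner-search)
- Sample source: raw upstream SSE capture from `/admin/dev/captures`
- Capture time (UTC): 2026-04-03 01:28:50
- Raw byte count: 41043
- Occurrences of the string `FINISHED`: 24
- Number of JSON `data:` chunks: 420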
## Event distribution
- `ready`: 1
- `update_session`: 2
- `finish`: 1
## Most frequent paths (top 12)
- `response/fragments/-1/content`: 13
- `response/fragments/-1`: 9
- `response`: 5
- `response/has_pending_fragment`: 4
- `response/fragments/-1/elapsed_secs`: 3
- `response/fragments/-5/status`: 2
- `response/fragments/-6/status`: 2
- `response/fragments/-3/status`: 2
- `response/fragments/-1/status`: 2
- `response/fragments/-4/status`: 2
- `response/fragments/-2/status`: 2
- `response/fragments/-5/results`: 1
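Statistics of this kind can be recomputed from a raw sample with a short script. The sketch below is not the script that produced the numbers above; it assumes one JSON object per `data:` line with the path stored under a key named `p`, and it counts only chunks that name a path explicitly. The function name `summarize` is hypothetical.

```python
import json
from collections import Counter


def summarize(raw):
    """Count event names, JSON data chunks and explicit paths in raw SSE text."""
    events, paths, chunks = Counter(), Counter(), 0
    for line in raw.replace("\r\n", "\n").split("\n"):
        if line.startswith("event:"):
            events[line[len("event:"):].strip()] += 1
        elif line.startswith("data:"):
            try:
                chunk = json.loads(line[len("data:"):])
            except json.JSONDecodeError:
                continue  # only JSON payloads count as chunks
            chunks += 1
            # Assumption: the path lives under the key "p".
            if isinstance(chunk, dict) and isinstance(chunk.get("p"), str):
                paths[chunk["p"]] += 1
    return {
        "finished_occurrences": raw.count("FINISHED"),
        "json_data_chunks": chunks,
        "events": dict(events),
        "top_paths": paths.most_common(12),
    }
```

Counting `FINISHED` on the raw text rather than on parsed values catches every occurrence, including any that sit inside nested structures.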
## Main leak sources
The status paths below frequently carry `v=FINISHED`. A parser that passes them through as ordinary text leaks `FINISHEDFINISHED...` into the output:
- `response/fragments/-5/status`: 2
- `response/fragments/-6/status`: 2
- `response/fragments/-3/status`: 2
- `response/fragments/-1/status`: 2
- `response/fragments/-4/status`: 2
- `response/fragments/-2/status`: 2
- `response/fragments/-14/status`: 1
- `response/fragments/-12/status`: 1
- `response/fragments/-10/status`: 1
- `response/fragments/-9/status`: 1
- `response/fragments/-8/status`: 1
- `response/fragments/-7/status`: 1
- `response/fragments/-11/status`: 1
- `response/fragments/-16/status`: 1
- `response/fragments/-13/status`: 1
- `response/fragments/-15/status`: 1
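To make the failure mode concrete, here is a deliberately naive pass-through written in Python. It is an illustration only, not the project's parser: the chunk shape (a JSON object with `p` and `v` keys) is an assumption, and the three chunks are invented for the demonstration rather than taken from the sample.

```python
import json


def naive_passthrough(data_payloads):
    """Concatenate every string value, ignoring which path it belongs to."""
    out = []
    for payload in data_payloads:
        chunk = json.loads(payload)
        if isinstance(chunk.get("v"), str):
            out.append(chunk["v"])
    return "".join(out)


# Invented chunks: one piece of body text followed by two fragment status updates.
chunks = [
    '{"p": "response/fragments/-1/content", "v": "Sunny"}',
    '{"p": "response/fragments/-1/status", "v": "FINISHED"}',
    '{"p": "response/fragments/-2/status", "v": "FINISHED"}',
]
print(naive_passthrough(chunks))  # prints "SunnyFINISHEDFINISHED"
```

Because the status values are plain strings, nothing distinguishes them from body text unless the parser looks at the path.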
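## Adaptation recommendations
1. Skip `response/fragments/<index>/status` for every index, not only `-1/-2/-3`.
2. Keep `response/status=FINISHED` for end-of-stream detection, but never emit it as body text.
3. In the sample simulation tests, assert on every sample that `FINISHED` is never output.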
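The first two recommendations amount to classifying each path before any text is emitted. A minimal Python sketch, assuming fragment indices are plain integers such as `-1` or `-16`; the function name `classify` and the labels `"end"`, `"skip"` and `"other"` are hypothetical:

```python
import re

# Matches response/fragments/<index>/status for any integer index.
FRAGMENT_STATUS = re.compile(r"^response/fragments/-?\d+/status$")


def classify(path, value):
    """Return "end", "skip" or "other" for one (path, value) pair."""
    if path == "response/status":
        # Used only for end-of-stream detection, never emitted as text.
        return "end" if value == "FINISHED" else "skip"
    if FRAGMENT_STATUS.match(path):
        return "skip"  # every index, not just -1/-2/-3
    return "other"  # left to the rest of the parser


assert classify("response/fragments/-16/status", "FINISHED") == "skip"
assert classify("response/status", "FINISHED") == "end"
assert classify("response/fragments/-1/content", "FINISHED") == "other"
```

Matching on a pattern instead of a fixed list of indices is what covers paths such as `response/fragments/-16/status`, which a hard-coded `-1/-2/-3` check would miss.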


@@ -0,0 +1,25 @@
{
  "sample_id": "guangzhou-weather-reasoner-search-20260403",
  "captured_at_utc": "2026-04-03T01:28:50Z",
  "request": {
    "model": "deepseek-reasoner-search",
    "stream": true,
    "messages": [
      {
        "role": "user",
        "content": "广州天气"
      }
    ],
    "thinking_enabled": true,
    "search_enabled": true
  },
  "capture": {
    "label": "deepseek_completion",
    "url": "https://chat.deepseek.com/api/v0/chat/completion",
    "status_code": 200,
    "response_bytes": 41043,
    "contains_finished_token": true,
    "finished_token_count": 24
  },
  "notes": "Captured from upstream DeepSeek SSE via /admin/dev/captures with packet capture enabled. Account ID removed."
}
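The `capture` fields above can be cross-checked against the stream file when a sample is added. The following Python sketch assumes that `response_bytes` is the byte length of `upstream.stream.sse` and that `finished_token_count` counts occurrences of the string `FINISHED` in it; neither is confirmed by the metadata itself, and `check_meta` is a hypothetical helper.

```python
def check_meta(meta, raw_bytes):
    """Return a list of mismatches between meta["capture"] and the raw stream."""
    capture = meta.get("capture", {})
    count = raw_bytes.count(b"FINISHED")
    problems = []
    # Assumption: response_bytes is the size of the stream file in bytes.
    if capture.get("response_bytes") != len(raw_bytes):
        problems.append("response_bytes does not match the stream size")
    # Assumption: finished_token_count counts the literal string FINISHED.
    if capture.get("finished_token_count") != count:
        problems.append("finished_token_count does not match the stream")
    if capture.get("contains_finished_token") != (count > 0):
        problems.append("contains_finished_token does not match the stream")
    return problems
```

An empty list means the metadata and the stream agree under these assumptions.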

File diff suppressed because one or more lines are too long