ds2api

mirror of https://github.com/CJackHwang/ds2api.git synced 2026-05-05 08:55:28 +08:00

Author	SHA1	Message	Date
CJACK	112bedb05d	refactor: differentiate reference marker handling between stream and non-stream modes - Stream: strip both [citation:N] and [reference:N] markers to prevent leaking partial link metadata during incremental output - Non-stream: convert citation/reference markers to Markdown links for Claude Messages, Gemini generateContent, and OpenAI Chat/Responses - Remove StripReferenceMarkers option from call sites; behavior is now determined automatically by stream vs non-stream context - Extend JS runtime stripReferenceMarkersText() to also match [citation:N] - Add tests for streaming marker stripping and non-stream link conversion Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-03 17:53:49 +08:00
CJACK	c099a6f7bf	feat: add unified response history session management across Claude, Gemini, and OpenAI API backends Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-03 17:24:38 +08:00
CJACK	a7522b4188	fix: retry thinking-only empty outputs, centralize reference marker stripping - ValidateTurn no longer errors on thinking-only responses, deferring to ShouldRetryEmptyOutput which now also covers thinking-only outputs. - Empty output retry uses multi-turn follow-up with a regeneration prompt suffix and parent_message_id in the same DeepSeek session. - Centralize StripReferenceMarkersEnabled into textclean package to eliminate duplicated hardcoded booleans across 4 protocol handlers. - Log a deprecation warning when the legacy "compat" config key is used. - Document thinking-only retry and reference marker stripping in API.md. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-03 05:02:26 +08:00
CJACK	1286b02247	refactor: remove legacy compatibility configuration and UI components	2026-05-03 04:14:19 +08:00
CJACK	5f110e6910	refactor: remove legacy history split configuration and integrate current input file handling into the completion runtime pipeline.	2026-05-03 01:50:50 +08:00
CJACK	dc5bffdf89	refactor: centralize assistant turn semantics and stream accumulation into new assistantturn and completionruntime packages	2026-05-02 23:28:43 +08:00
CJACK	0156f6b45b	Merge origin/dev into PR 406	2026-05-02 21:17:02 +08:00
王	d407ccb773	perf(streaming): optimize TTFT and reduce buffering latency Core changes: - stream.go: New accumulation buffer architecture with scanner goroutine + select loop, MinChars=16, MaxWait=10ms, first-flush-immediate - dedupe.go: Add TrimContinuationOverlapFromBuilder to avoid string copies - claude/stream_runtime_core.go: Integrate toolstream for incremental text - claude/stream_runtime_finalize.go: toolstream flush support - stream_emitter.js: Reduce DeltaCoalescer thresholds (160->16 chars, 80->20ms) - empty_retry: Add thinking-aware empty output detection - Fix reasoning_content leak and finish_reason=null in edge cases - Fix tail content truncation when max_tokens exceeded Tests: sync test expectations with upstream for thinking content	2026-05-02 20:28:30 +08:00
CJACK	c8f7b6b371	refactor streaming accumulation and chat history UI	2026-05-02 20:15:38 +08:00
CJACK	0bca6e2cee	feat: implement context cancellation handling for chat and response stream runtimes to ensure clean termination without retries	2026-05-01 23:20:46 +08:00
王	706e68de23	fix: increase stream timeout constants for large-context models; guard against context-cancelled double-recording - Increase StreamIdleTimeout from 90s to 300s and MaxKeepaliveCount from 10 to 40 to prevent premature stream termination with DeepSeek V4 Pro (~50K token contexts) - Add r.Context().Err() check after ConsumeSSE in empty_retry_runtime (chat + responses) to prevent historySession.error() from overwriting historySession.stopped() when the request context is cancelled References: - MaxKeepaliveCount=10 creates a 50s no-content timeout that kills the stream before DeepSeek V4 Pro can produce its first token with large contexts - Hermes Agent reports 'No response from provider for 180s' because the underlying SSE connection was already terminated by ds2api at 50s - Context cancellation path: OnContextDone -> stopped(), then finalize() with empty output -> retry -> error() overwrites stopped()	2026-05-01 21:11:36 +08:00
CJACK	2671298439	fix: coalesce small stream deltas to prevent character swallowing; add read-tool cache guard Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-01 13:53:27 +08:00
CJACK.	7c3ff6ee7e	Merge pull request #374 from shern-point/feat/full-context-file-token-accounting Feat/full context file token accounting	2026-04-30 02:12:55 +08:00
CJACK.	63e62fd1b0	Merge pull request #372 from shern-point/feat/accurate-context-token-length Feat/accurate context token length	2026-04-30 02:11:32 +08:00
shern-point	6a778e0d35	feat: include inline-uploaded file tokens in context token accounting Track byte sizes of inline-uploaded files during PreprocessInlineFileInputs and convert them to conservative token estimates (bytes/3). RefFileTokens is threaded through StandardRequest into all OpenAI chat/responses usage builders so returned prompt_tokens/input_tokens reflect the full upstream context cost including attached files.	2026-04-30 01:42:51 +08:00
shern-point	415a2359ad	feat: route OpenAI responses usage through preserved prompt text Use the stored full-context prompt text for responses accounting so neutral placeholder prompts do not underreport returned input token counts.	2026-04-30 00:45:31 +08:00
CJACK.	33f6fef015	Fix tool-call fallback on sanitized empty text and remove history wrapper tags	2026-04-29 23:04:45 +08:00
MiY	241334c658	Fix stream compatibility and vision model exposure	2026-04-29 20:23:13 +08:00
shern-point	72c8e7e9f9	test: cover responses string-protected tool arguments Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent) Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>	2026-04-28 13:46:43 +08:00
shern-point	b9c8e90d98	refactor: thread tool schemas through responses tool outputs Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent) Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>	2026-04-28 13:46:06 +08:00
CJACK	28bb85ad63	refactor: replace history_split with current_input_file configuration Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-27 23:36:56 +08:00
CJACK	b82bc1311a	fix: use parent_message_id and fresh PoW headers for empty-output retry and continue Previously retry/continue requests reused the initial PoW header and lacked parent_message_id, causing them to land as disconnected root messages in the DeepSeek session instead of proper follow-up turns. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-27 21:31:51 +08:00
CJACK	0378d8c0a9	feat: add empty-output retry and Vercel auto-continue support - Auto-retry Chat/Responses streams once when upstream output is empty but not content-filtered, reusing session/token/PoW and appending a regeneration suffix to the prompt - Wire DeepSeek continue API into Vercel streams for multi-round thinking output exhaustion - Defer empty-output errors in stream finalizers to enable synthetic retry; only surface failure when the retry budget is exhausted - Track content_filter stops to avoid retry on filtered outputs - Add comprehensive tests for stream/non-stream retry, Responses retry, and content_filter no-retry - Update prompt-compatibility.md documentation Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-27 18:00:52 +08:00
CJACK	40d5e3ebb5	test DSML	2026-04-27 00:21:26 +08:00
CJACK.	4048c3784b	Merge pull request #320 from adnxx1wsx/main fix: fallback claude non-stream tool calls from thinking	2026-04-26 17:54:05 +08:00
MiY	a505f2cb96	fix: fallback tool calls from thinking on empty output	2026-04-26 17:45:12 +08:00
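CJACK	c09a4b51a5	feat: add thinking injection configuration support and extend settings management and frontend interaction. Add thinking injection logic to the promptcompat and OpenAI shared layers, improve encoding/decoding and validation in the configuration system, and update the settings management API and frontend UI. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-26 13:35:20 +08:00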
CJACK	abc96a37d8	refactor backend API structure	2026-04-26 06:58:20 +08:00

28 Commits
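
Commit 112bedb05d describes two behaviors for [citation:N] and [reference:N] markers: stripped in stream mode, converted to Markdown links in non-stream mode. The following is a minimal Go sketch of that split, not the repository's implementation. The function names StripMarkers and LinkMarkers, the urls slice that maps marker index N to a link target, and the choice to drop markers with an unknown index are all assumptions made for illustration; the commit log does not say where link targets come from.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// markerPattern matches the two marker forms named in the commit message:
// [citation:N] and [reference:N], capturing N.
var markerPattern = regexp.MustCompile(`\[(?:citation|reference):(\d+)\]`)

// StripMarkers (hypothetical name) removes every marker. This mirrors the
// stream-mode behavior, where partial link metadata must not leak.
func StripMarkers(text string) string {
	return markerPattern.ReplaceAllString(text, "")
}

// LinkMarkers (hypothetical name) rewrites each marker as a Markdown link.
// Assumption: marker index N is 1-based and maps to urls[N-1]. Markers whose
// index has no matching URL are dropped.
func LinkMarkers(text string, urls []string) string {
	return markerPattern.ReplaceAllStringFunc(text, func(m string) string {
		sub := markerPattern.FindStringSubmatch(m)
		n, err := strconv.Atoi(sub[1])
		if err != nil || n < 1 || n > len(urls) {
			return ""
		}
		return fmt.Sprintf("[%d](%s)", n, urls[n-1])
	})
}

func main() {
	urls := []string{"https://example.com/a"}
	text := "Go is fast[citation:1]."

	fmt.Println(StripMarkers(text))      // stream mode
	fmt.Println(LinkMarkers(text, urls)) // non-stream mode
}
```

Keeping both paths on one shared pattern means a new marker form only has to be added in one place, which is in the spirit of the centralization described in a7522b4188.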