ds2api

mirror of https://github.com/CJackHwang/ds2api.git synced 2026-05-11 03:37:40 +08:00

Author	SHA1	Message	Date
CJACK	0b05915bb6	Merge branch 'pr-474' into dev # Conflicts: # internal/httpapi/openai/chat/handler.go # internal/httpapi/openai/chat/vercel_prepare_test.go # internal/httpapi/openai/chat/vercel_stream.go	2026-05-10 16:28:36 +08:00
CJACK	eaeb403fda	feat: align Go/Node DSML tool-call parsing drift tolerance and update API docs Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-05-10 16:17:46 +08:00
CJACK	cee8757d14	revert: replace fullwidth pipe ｜ with halfwidth | in DSML tool markup PR #460 introduced fullwidth pipe characters (｜) in DSML tool call formatting to improve parsing robustness, but models exposed to these fullwidth pipes in system prompts exhibit significantly higher rates of tool output hallucinations. Reverting to halfwidth pipes (|) drastically reduces tokenizer/perplexity-driven hallucinations while retaining the existing confusable-hardening in the parser. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-10 15:18:54 +08:00
ds2api-bot	45590d6748	style: fix gofmt formatting in vercel_prepare_test.go	2026-05-10 07:08:50 +00:00
Your Name	196e3c46f6	feat(toolcall): harden confusable candidate spans	2026-05-10 09:27:30 +07:00
ds2api-bot	df6859bddc	fix(vercel): enable auto-delete session on Vercel stream release The "delete current conversation" feature was not working on Vercel deployment because the stream flow uses a separate lease mechanism. The session_id created during prepare phase was not preserved for deletion when the stream ends. Changes: - Add SessionID field to streamLease struct to preserve session_id - Pass session_id to holdStreamLease during prepare - Modify releaseStreamLease to return auth and session_id - Call autoDeleteRemoteSession in handleVercelStreamRelease when releasing a lease with auto-delete mode enabled Closes #vercel-auto-delete	2026-05-10 02:05:05 +00:00
CJACK	77b6d83266	feat: expand tool-call parsing resilience, refine model alias resolution, and update API documentation	2026-05-10 01:35:43 +08:00
CJACK	ddd42e532e	feat: implement managed-account rotation on 429 empty-output completion retries	2026-05-10 00:41:45 +08:00
CJACK	7c66742a19	refactor: unify empty-output retry logic into shared completionruntime package and normalize protocol adapter boundary.	2026-05-10 00:10:53 +08:00
CJACK	067cf465bb	feat: integrate reasoning content into assistant tool-call messages and improve tool markup parsing for prompt compatibility	2026-05-09 23:16:07 +08:00
waiwai	1e00e482a6	fix(toolcall): eliminate strings.ToLower panics from Unicode case folding Replace all strings.ToLower usage with ASCII case-insensitive matching (hasASCIIPrefixFoldAt, indexASCIIFold, hasDSMLPrefix) to prevent slice bounds errors when Unicode characters change byte length after case folding (e.g., Turkish İ U+0130 → i + combining dot: 2 bytes → 3 bytes). Root cause: code created a strings.ToLower(text) copy, found byte positions in that copy, then used those positions to slice the original text — byte offsets that were valid in the lowercased copy became out-of-bounds in the original when case folding changed byte lengths. Files changed: - toolcalls_scan.go: remove 5 lower usages, add hasDSMLPrefix - toolcalls_parse_markup.go: remove 3 lower usages, add indexASCIIFold - toolcalls_markup.go: SanitizeLooseCDATA lower removal - toolcalls_parse.go: updateCDATAStateForStrip lower removal - tool_prompt.go: align DSML pipe characters with tool call spec - tool_prompt_test.go: fix pre-existing test character mismatch	2026-05-09 15:05:51 +08:00
CJACK.	657b9379ed	test(docs): assert ollama show id field and document ollama endpoints	2026-05-08 01:11:35 +08:00
Dinh Nguyen	d0d61a5d77	Update ollama api test	2026-05-07 14:23:12 +07:00
dinhnn	ffef451f7a	Fixbug test typing	2026-05-07 13:48:03 +07:00
Dinh Nguyen	a68a79e087	Add ollama api for copilot support	2026-05-07 09:41:46 +07:00
NgoQuocViet2001	4315b424bf	fix(openai): keep stream heartbeat choice-free	2026-05-05 21:13:38 +07:00
NgoQuocViet2001	76884c0d94	feat(admin): remember Vercel sync credentials	2026-05-04 21:28:02 +07:00
CJACK	112bedb05d	refactor: differentiate reference marker handling between stream and non-stream modes - Stream: strip both [citation:N] and [reference:N] markers to prevent leaking partial link metadata during incremental output - Non-stream: convert citation/reference markers to Markdown links for Claude Messages, Gemini generateContent, and OpenAI Chat/Responses - Remove StripReferenceMarkers option from call sites; behavior is now determined automatically by stream vs non-stream context - Extend JS runtime stripReferenceMarkersText() to also match [citation:N] - Add tests for streaming marker stripping and non-stream link conversion Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-03 17:53:49 +08:00
CJACK	c099a6f7bf	feat: add unified response history session management across Claude, Gemini, and OpenAI API backends Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-03 17:24:38 +08:00
CJACK	5e55cf36d8	refactor: prioritize raw model output in chat history archiving to ensure accurate capture of tool call and thinking markup	2026-05-03 15:44:17 +08:00
CJACK	a299c7d1c4	refactor: remove thinking content from empty output validation logic to enforce stricter completion requirements	2026-05-03 06:59:20 +08:00
CJACK	a7522b4188	fix: retry thinking-only empty outputs, centralize reference marker stripping - ValidateTurn no longer errors on thinking-only responses, deferring to ShouldRetryEmptyOutput which now also covers thinking-only outputs. - Empty output retry uses multi-turn follow-up with a regeneration prompt suffix and parent_message_id in the same DeepSeek session. - Centralize StripReferenceMarkersEnabled into textclean package to eliminate duplicated hardcoded booleans across 4 protocol handlers. - Log a deprecation warning when the legacy "compat" config key is used. - Document thinking-only retry and reference marker stripping in API.md. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-03 05:02:26 +08:00
CJACK	1286b02247	refactor: remove legacy compatibility configuration and UI components	2026-05-03 04:14:19 +08:00
CJACK	5f110e6910	refactor: remove legacy history split configuration and integrate current input file handling into the completion runtime pipeline.	2026-05-03 01:50:50 +08:00
CJACK	7c0bc9ec0f	feat: implement support for thinking blocks in Gemini API and enable thinking by default for supported models	2026-05-03 01:00:06 +08:00
CJACK	a901250de7	refactor: replace bufio.Scanner with bufio.Reader for SSE stream parsing and track emitted text to prevent redundant output blocks	2026-05-02 23:50:35 +08:00
CJACK	dc5bffdf89	refactor: centralize assistant turn semantics and stream accumulation into new assistantturn and completionruntime packages	2026-05-02 23:28:43 +08:00
CJACK	eccd8c957b	fix: prevent continuation replay overlap by trimming redundant text from thinking and response streams	2026-05-02 21:34:36 +08:00
CJACK	0156f6b45b	Merge origin/dev into PR 406	2026-05-02 21:17:02 +08:00
CJACK	e7d6807c7c	feat: emit empty completion chunk along with keep-alive heartbeat in chat stream	2026-05-02 20:54:10 +08:00
王	d407ccb773	perf(streaming): optimize TTFT and reduce buffering latency Core changes: - stream.go: New accumulation buffer architecture with scanner goroutine + select loop, MinChars=16, MaxWait=10ms, first-flush-immediate - dedupe.go: Add TrimContinuationOverlapFromBuilder to avoid string copies - claude/stream_runtime_core.go: Integrate toolstream for incremental text - claude/stream_runtime_finalize.go: toolstream flush support - stream_emitter.js: Reduce DeltaCoalescer thresholds (160->16 chars, 80->20ms) - empty_retry: Add thinking-aware empty output detection - Fix reasoning_content leak and finish_reason=null in edge cases - Fix tail content truncation when max_tokens exceeded Tests: sync test expectations with upstream for thinking content	2026-05-02 20:28:30 +08:00
CJACK	c8f7b6b371	refactor streaming accumulation and chat history UI	2026-05-02 20:15:38 +08:00
NgoQuocViet2001	36d0239dc6	feat(openai): retrieve uploaded file metadata	2026-05-02 14:33:42 +07:00
CJACK	e2756f800d	feat: introduce JSON UTF-8 validation middleware and prepend output integrity guard system prompt to messages	2026-05-02 02:22:34 +08:00
CJACK	55abf64717	feat: add model type support for file uploads with automatic resolution and header propagation	2026-05-02 00:55:17 +08:00
CJACK	0bca6e2cee	feat: implement context cancellation handling for chat and response stream runtimes to ensure clean termination without retries	2026-05-01 23:20:46 +08:00
CJACK.	934b40e572	Merge pull request #392 from wyv202011y/fix/timeout-and-context-cancel fix: increase stream timeout constants for large-context models; guar…	2026-05-01 23:17:31 +08:00
CJACK	dd5a0c5213	refactor: update and standardize current input file continuation prompt instructions	2026-05-01 22:27:59 +08:00
CJACK	43402e7a26	refactor: rename history file constant from HISTORY.txt to DS2API_HISTORY.txt across codebase and tests	2026-05-01 22:05:45 +08:00
CJACK	df1cfac9bc	refactor: replace history transcript format with numbered sections and rename upload file to HISTORY.txt	2026-05-01 21:15:17 +08:00
王	706e68de23	fix: increase stream timeout constants for large-context models; guard against context-cancelled double-recording - Increase StreamIdleTimeout from 90s to 300s and MaxKeepaliveCount from 10 to 40 to prevent premature stream termination with DeepSeek V4 Pro (~50K token contexts) - Add r.Context().Err() check after ConsumeSSE in empty_retry_runtime (chat + responses) to prevent historySession.error() from overwriting historySession.stopped() when the request context is cancelled References: - MaxKeepaliveCount=10 creates a 50s no-content timeout that kills the stream before DeepSeek V4 Pro can produce its first token with large contexts - Hermes Agent reports 'No response from provider for 180s' because the underlying SSE connection was already terminated by ds2api at 50s - Context cancellation path: OnContextDone -> stopped(), then finalize() with empty output -> retry -> error() overwrites stopped()	2026-05-01 21:11:36 +08:00
CJACK	2671298439	fix: coalesce small stream deltas to prevent character swallowing; add read-tool cache guard Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-01 13:53:27 +08:00
CJACK	92e321fe2c	Fix character-swallowing issue	2026-05-01 01:31:48 +08:00
CJACK.	95b7665643	Merge branch 'dev' into codex/run-all-tests-and-fix-failures	2026-04-30 02:39:18 +08:00
CJACK.	966f21211d	Fix nil-session guard in chat history test	2026-04-30 02:31:06 +08:00
NgoQuocViet2001	7dc3af40b2	feat(openai): add root route aliases	2026-04-30 01:24:53 +07:00
CJACK.	2f6b5ffda0	Fix current-input token text test expectation	2026-04-30 02:22:17 +08:00
CJACK.	7c3ff6ee7e	Merge pull request #374 from shern-point/feat/full-context-file-token-accounting Feat/full context file token accounting	2026-04-30 02:12:55 +08:00
CJACK.	63e62fd1b0	Merge pull request #372 from shern-point/feat/accurate-context-token-length Feat/accurate context token length	2026-04-30 02:11:32 +08:00
shern-point	6a778e0d35	feat: include inline-uploaded file tokens in context token accounting Track byte sizes of inline-uploaded files during PreprocessInlineFileInputs and convert them to conservative token estimates (bytes/3). RefFileTokens is threaded through StandardRequest into all OpenAI chat/responses usage builders so returned prompt_tokens/input_tokens reflect the full upstream context cost including attached files.	2026-04-30 01:42:51 +08:00
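
Note on 6a778e0d35: the bytes/3 token estimate described in that commit message can be sketched as follows. This is a minimal illustration only, not the repository's code: the function names estimateFileTokens and promptTokens are hypothetical, and rounding up is an assumption (the message says only "bytes/3").

```go
package main

import "fmt"

// estimateFileTokens is a hypothetical helper that turns a file's byte size
// into a token estimate using the bytes/3 ratio named in the commit message.
// Rounding up (rather than truncating) is an assumption of this sketch.
func estimateFileTokens(sizeBytes int) int {
	if sizeBytes <= 0 {
		return 0
	}
	return (sizeBytes + 2) / 3
}

// promptTokens is a hypothetical helper that adds the per-file estimates to
// the token count of the text prompt, so the reported usage covers the
// attached files as well.
func promptTokens(textTokens int, fileSizes []int) int {
	total := textTokens
	for _, size := range fileSizes {
		total += estimateFileTokens(size)
	}
	return total
}

func main() {
	// A 120-token prompt with a 3000-byte and a 10-byte inline file.
	fmt.Println(promptTokens(120, []int{3000, 10}))
}
```

Dividing by 3 charges roughly one token per three bytes, so a 3000-byte attachment adds about 1000 tokens to the reported prompt size.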


90 Commits
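
Note on 1e00e482a6: the failure mode in that commit message comes from finding byte offsets in a strings.ToLower copy and then slicing the original text with them. A minimal sketch of the ASCII-only alternative follows. The helper names mirror those in the commit message, but the signatures and bodies here are assumptions, not the repository's implementation.

```go
package main

import "fmt"

// lowerASCII lowercases only A-Z. Every other byte, including each byte of a
// multi-byte UTF-8 sequence, is returned unchanged, so offsets stay valid.
func lowerASCII(b byte) byte {
	if b >= 'A' && b <= 'Z' {
		return b + ('a' - 'A')
	}
	return b
}

// hasASCIIPrefixFoldAt reports whether text contains prefix at byte offset i,
// ignoring ASCII case. (Signature assumed for this sketch.)
func hasASCIIPrefixFoldAt(text string, i int, prefix string) bool {
	if i < 0 || len(text)-i < len(prefix) {
		return false
	}
	for j := 0; j < len(prefix); j++ {
		if lowerASCII(text[i+j]) != lowerASCII(prefix[j]) {
			return false
		}
	}
	return true
}

// indexASCIIFold returns the byte offset of the first ASCII-case-insensitive
// match of needle in text, or -1. The offset refers to text itself, so it is
// always safe to use for slicing text. (Signature assumed for this sketch.)
func indexASCIIFold(text, needle string) int {
	for i := 0; i+len(needle) <= len(text); i++ {
		if hasASCIIPrefixFoldAt(text, i, needle) {
			return i
		}
	}
	return -1
}

func main() {
	// The leading characters are non-ASCII; a lowercased copy of this string
	// may have a different byte length than the original.
	text := "İİ<TOOL_CALLS>"
	if i := indexASCIIFold(text, "<tool_calls>"); i >= 0 {
		fmt.Println(i, text[i:])
	}
}
```

Because the search never builds a case-folded copy, the returned offset always indexes the original string, which is the property the fix relies on.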