13 Commits

Author SHA1 Message Date
CJACK
112bedb05d refactor: differentiate reference marker handling between stream and non-stream modes
- Stream: strip both [citation:N] and [reference:N] markers to prevent
  leaking partial link metadata during incremental output
- Non-stream: convert citation/reference markers to Markdown links for
  Claude Messages, Gemini generateContent, and OpenAI Chat/Responses
- Remove StripReferenceMarkers option from call sites; behavior is now
  determined automatically by stream vs non-stream context
- Extend JS runtime stripReferenceMarkersText() to also match [citation:N]
- Add tests for streaming marker stripping and non-stream link conversion

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-03 17:53:49 +08:00
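
A minimal Go sketch of the stream vs non-stream split described in 112bedb05d. The names `StripMarkers`, `LinkMarkers`, and the `urls` map are hypothetical and are not taken from the repository. The link format `[N](url)` is also an assumption, since the commit message does not say how converted links are rendered.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// markerRe matches [citation:N] and [reference:N] and captures N.
var markerRe = regexp.MustCompile(`\[(?:citation|reference):(\d+)\]`)

// StripMarkers removes every marker. Hypothetical helper for the stream path,
// where a partially emitted link could otherwise leak into the output.
func StripMarkers(s string) string {
	return markerRe.ReplaceAllString(s, "")
}

// LinkMarkers rewrites each marker as a Markdown link. Hypothetical helper
// for the non-stream path. urls maps marker number to target URL (assumed
// shape); markers with no known URL are dropped.
func LinkMarkers(s string, urls map[int]string) string {
	return markerRe.ReplaceAllStringFunc(s, func(m string) string {
		sub := markerRe.FindStringSubmatch(m)
		n, err := strconv.Atoi(sub[1])
		if err != nil {
			return ""
		}
		u, ok := urls[n]
		if !ok {
			return ""
		}
		return fmt.Sprintf("[%d](%s)", n, u)
	})
}

func main() {
	text := "Go is compiled[citation:1] and garbage collected[reference:2]."
	urls := map[int]string{1: "https://example.com/a", 2: "https://example.com/b"}
	fmt.Println(StripMarkers(text))      // stream mode: markers removed
	fmt.Println(LinkMarkers(text, urls)) // non-stream mode: markers become links
}
```

Choosing the function by context, rather than by a per-call option, matches the commit's note that behavior is now determined automatically by stream vs non-stream mode.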
CJACK
c099a6f7bf feat: add unified response history session management across Claude, Gemini, and OpenAI API backends
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-03 17:24:38 +08:00
CJACK
a7522b4188 fix: retry thinking-only empty outputs, centralize reference marker stripping
- ValidateTurn no longer errors on thinking-only responses, deferring to
  ShouldRetryEmptyOutput which now also covers thinking-only outputs.
- Empty output retry uses multi-turn follow-up with a regeneration prompt
  suffix and parent_message_id in the same DeepSeek session.
- Centralize StripReferenceMarkersEnabled into textclean package to
  eliminate duplicated hardcoded booleans across 4 protocol handlers.
- Log a deprecation warning when the legacy "compat" config key is used.
- Document thinking-only retry and reference marker stripping in API.md.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-03 05:02:26 +08:00
CJACK
1286b02247 refactor: remove legacy compatibility configuration and UI components 2026-05-03 04:14:19 +08:00
CJACK
5f110e6910 refactor: remove legacy history split configuration and integrate current input file handling into the completion runtime pipeline. 2026-05-03 01:50:50 +08:00
CJACK
7c0bc9ec0f feat: implement support for thinking blocks in Gemini API and enable thinking by default for supported models 2026-05-03 01:00:06 +08:00
CJACK
dc5bffdf89 refactor: centralize assistant turn semantics and stream accumulation into new assistantturn and completionruntime packages 2026-05-02 23:28:43 +08:00
d407ccb773 perf(streaming): optimize TTFT and reduce buffering latency
Core changes:
- stream.go: New accumulation buffer architecture with scanner goroutine
  + select loop, MinChars=16, MaxWait=10ms, first-flush-immediate
- dedupe.go: Add TrimContinuationOverlapFromBuilder to avoid string copies
- claude/stream_runtime_core.go: Integrate toolstream for incremental text
- claude/stream_runtime_finalize.go: toolstream flush support
- stream_emitter.js: Reduce DeltaCoalescer thresholds (160->16 chars, 80->20ms)
- empty_retry: Add thinking-aware empty output detection
- Fix reasoning_content leak and finish_reason=null in edge cases
- Fix tail content truncation when max_tokens exceeded

Tests: sync test expectations with upstream for thinking content
2026-05-02 20:28:30 +08:00
CJACK
e2756f800d feat: introduce JSON UTF-8 validation middleware and prepend output integrity guard system prompt to messages 2026-05-02 02:22:34 +08:00
shern-point
4b4f097006 feat: use model-aware prompt counting in Gemini paths
Preserve Gemini prompt token text during normalization and remove the hardcoded DeepSeek model from native Gemini usage helpers.
2026-04-30 00:46:05 +08:00
CJACK
1602c3a43c lint 2026-04-27 13:48:55 +08:00
CJACK
90ce595325 chore: update project files 2026-04-27 02:09:11 +08:00
CJACK
abc96a37d8 refactor backend API structure 2026-04-26 06:58:20 +08:00