4 Commits

Author SHA1 Message Date
CJACK
0156f6b45b Merge origin/dev into PR 406 2026-05-02 21:17:02 +08:00
d407ccb773 perf(streaming): optimize TTFT and reduce buffering latency
Core changes:
- stream.go: New accumulation buffer architecture with scanner goroutine
  + select loop, MinChars=16, MaxWait=10ms, first-flush-immediate
- dedupe.go: Add TrimContinuationOverlapFromBuilder to avoid string copies
- claude/stream_runtime_core.go: Integrate toolstream for incremental text
- claude/stream_runtime_finalize.go: toolstream flush support
- stream_emitter.js: Reduce DeltaCoalescer thresholds (160->16 chars, 80->20ms)
- empty_retry: Add thinking-aware empty output detection
- Fix reasoning_content leak and finish_reason=null in edge cases
- Fix tail content truncation when max_tokens exceeded

Tests: sync test expectations with upstream for thinking content
2026-05-02 20:28:30 +08:00
CJACK
49012a227c feat: implement trimContinuationOverlap utility to remove redundant stream prefixes and add associated tests. 2026-04-06 02:23:28 +08:00
CJACK
4d36afea4c 修复接续流的增量bug 2026-04-06 02:01:41 +08:00