revert: replace fullwidth pipe | with halfwidth | in DSML tool markup

PR #460 introduced fullwidth pipe characters (|) in DSML tool call formatting
to improve parsing robustness, but models exposed to these fullwidth pipes in
system prompts exhibit significantly higher rates of tool output hallucinations.
Reverting to halfwidth pipes (|) drastically reduces tokenizer/perplexity-driven
hallucinations while retaining the existing confusable-hardening in the parser.
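
The retained parser hardening can be pictured roughly as follows. This is a
minimal sketch, not code from this commit: the helper names `normalizePipes`
and `hasMarker` and the marker name `tool_calls` are hypothetical, and the
sketch assumes the hardening amounts to folding the fullwidth vertical line
(U+FF5C) to the ASCII pipe (U+007C) before markers are matched.

```javascript
// Hypothetical sketch: emit halfwidth pipes in prompts, but still accept
// fullwidth pipes on input by folding them before matching markers.
const FULLWIDTH_PIPE = '\uFF5C'; // fullwidth vertical line
const HALFWIDTH_PIPE = '\u007C'; // ASCII pipe

// Replace every fullwidth pipe with its halfwidth counterpart.
function normalizePipes(text) {
  return text.split(FULLWIDTH_PIPE).join(HALFWIDTH_PIPE);
}

// Hypothetical marker check: true if the text contains <|name|>
// written with either pipe form.
function hasMarker(text, name) {
  return normalizePipes(text).includes(`<|${name}|>`);
}
```

Under these assumptions the formatter only ever writes the halfwidth form,
while model output that still contains the fullwidth form is recognized.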

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
CJACK
2026-05-10 15:18:54 +08:00
parent 3beb31309f
commit cee8757d14
45 changed files with 725 additions and 342 deletions

@@ -646,7 +646,7 @@ test('parseChunkForContent strips citation and reference markers from fragment c
 test('parseChunkForContent strips leaked thought control markers from content', () => {
   const chunk = {
     p: 'response/content',
-    v: '<▁of▁thought>A<| of_thought |>B<| end_of_thought |>C',
+    v: '<|▁of▁thought|>A<| of_thought |>B<| end_of_thought |>C',
   };
   const parsed = parseChunkForContent(chunk, false, 'text');
   assert.equal(parsed.finished, false);