revert: replace fullwidth pipe ｜ with halfwidth | in DSML tool markup

PR #460 introduced fullwidth pipe characters (｜, U+FF5C) in DSML tool call
formatting to improve parsing robustness, but models exposed to these fullwidth
pipes in system prompts hallucinate tool output at a significantly higher rate.
Reverting to halfwidth pipes (|, U+007C) drastically reduces these tokenizer-
and perplexity-driven hallucinations while retaining the parser's existing
confusable hardening.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
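
The split described in the message (emit halfwidth pipes, keep accepting fullwidth ones when parsing) can be illustrated with a minimal sketch. It is not the repository's parser: `normalizePipes`, `emitMarker`, and `parseMarkers` are hypothetical names introduced only for this example, and the real confusable hardening may cover more characters than the single fullwidth pipe handled here.

```javascript
// Hypothetical helpers for illustration only; not taken from this repository.
const FULLWIDTH_PIPE = '\uFF5C'; // ｜ FULLWIDTH VERTICAL LINE
const HALFWIDTH_PIPE = '|';      // | VERTICAL LINE (U+007C)

// Parse side: fold the fullwidth confusable into the halfwidth pipe
// so that both spellings of a marker are accepted.
function normalizePipes(text) {
  return text.split(FULLWIDTH_PIPE).join(HALFWIDTH_PIPE);
}

// Emit side: always write markers with halfwidth pipes, which is the
// form this commit restores in prompts.
function emitMarker(name) {
  return `<${HALFWIDTH_PIPE}${name}${HALFWIDTH_PIPE}>`;
}

// Collect marker names such as "User" from text that may mix both pipe widths.
function parseMarkers(text) {
  const names = [];
  for (const match of normalizePipes(text).matchAll(/<\|([^|<>]+)\|>/g)) {
    names.push(match[1]);
  }
  return names;
}
```

Under these assumptions, `parseMarkers` treats `<｜User｜>` and `<|User|>` identically, while `emitMarker('User')` only ever produces the halfwidth form.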
CJACK
2026-05-10 15:18:54 +08:00
parent 3beb31309f
commit cee8757d14
45 changed files with 725 additions and 342 deletions

@@ -18,9 +18,9 @@ test('chat history strict parser merges current input file placeholder', async (
content: 'Continue from the latest state in the attached DS2API_HISTORY.txt context. Treat it as the current working state and answer the latest user request directly.',
}],
history_text: [
-'<begin▁of▁sentence>',
-'<User>hello',
-'<Assistant>hi<end▁of▁sentence>',
+'<|begin▁of▁sentence|>',
+'<|User|>hello',
+'<|Assistant|>hi<|end▁of▁sentence|>',
].join(''),
};
@@ -43,9 +43,9 @@ test('chat history strict parser inserts history after system messages', async (
{ role: 'user', content: 'latest' },
],
history_text: [
-'<begin▁of▁sentence>',
-'<User>old',
-'<Assistant>done<end▁of▁sentence>',
+'<|begin▁of▁sentence|>',
+'<|User|>old',
+'<|Assistant|>done<|end▁of▁sentence|>',
].join(''),
};