feat: add tokenizer-based token counting utilities

Use go-tiktoken with embedded vocabularies for accurate BPE token counting. CountPromptTokens applies conservative padding so that the reported context token count slightly overestimates the real value rather than undercounting it.
shern-point
2026-04-30 00:44:11 +08:00
parent 52558838ef
commit bd41c8a90c
3 changed files with 96 additions and 0 deletions

5
go.mod

@@ -10,6 +10,11 @@ require (
github.com/router-for-me/CLIProxyAPI/v6 v6.9.14
)
require (
github.com/dlclark/regexp2 v1.11.5 // indirect
github.com/hupe1980/go-tiktoken v0.0.10 // indirect
)
require (
github.com/klauspost/compress v1.18.5 // indirect
github.com/sirupsen/logrus v1.9.4 // indirect