This repository has been archived by the owner on Nov 30, 2024. It is now read-only.

add tokenize_is_approximate() method to TokenizerEnv trait
mmoskal committed Nov 25, 2024
1 parent f4c1f04 commit e22222c
Showing 1 changed file with 7 additions and 0 deletions.
7 changes: 7 additions & 0 deletions core/src/toktree.rs
@@ -161,6 +161,13 @@ pub trait TokenizerEnv: Send {
    fn eos_token(&self) -> TokenId {
        self.tok_trie().eos_token()
    }

    /// If this returns true, this tokenizer may return non-canonical tokenizations
    /// and should generally not be used for forcing tokens.
    /// Typically, it will just use TokTrie::greedy_tokenize().
    fn tokenize_is_approximate(&self) -> bool {
        false
    }
}

pub type TokEnv = Arc<dyn TokenizerEnv + Sync + 'static>;
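
The doc comment on `tokenize_is_approximate()` implies that callers should check the flag before forcing tokens. A minimal sketch of such a check follows. It uses a simplified stand-in for `TokenizerEnv`, not the real trait from `core/src/toktree.rs`; `ExactEnv`, `GreedyEnv`, `tokens_to_force`, and the byte-per-token `tokenize_bytes` bodies are hypothetical names and behavior chosen for illustration.

```rust
use std::sync::Arc;

type TokenId = u32;

// Simplified stand-in for the trait in the diff: only what this sketch needs.
trait TokenizerEnv: Send {
    // Assumed tokenization entry point for this sketch.
    fn tokenize_bytes(&self, s: &[u8]) -> Vec<TokenId>;

    // Default mirrors the commit: tokenization is exact unless overridden.
    fn tokenize_is_approximate(&self) -> bool {
        false
    }
}

type TokEnv = Arc<dyn TokenizerEnv + Sync + 'static>;

// Hypothetical environment with canonical tokenization; keeps the default.
struct ExactEnv;

impl TokenizerEnv for ExactEnv {
    fn tokenize_bytes(&self, s: &[u8]) -> Vec<TokenId> {
        // Toy scheme: one token per byte.
        s.iter().map(|&b| b as TokenId).collect()
    }
}

// Hypothetical environment that only has a greedy fallback, so it opts in.
struct GreedyEnv;

impl TokenizerEnv for GreedyEnv {
    fn tokenize_bytes(&self, s: &[u8]) -> Vec<TokenId> {
        s.iter().map(|&b| b as TokenId).collect()
    }

    fn tokenize_is_approximate(&self) -> bool {
        true
    }
}

// Hypothetical caller: returns tokens that are safe to force, or None when
// the tokenizer may produce non-canonical tokenizations.
fn tokens_to_force(env: &TokEnv, text: &str) -> Option<Vec<TokenId>> {
    if env.tokenize_is_approximate() {
        None
    } else {
        Some(env.tokenize_bytes(text.as_bytes()))
    }
}
```

Because the method has a default body returning `false`, existing implementors need no changes; only environments that rely on an approximate tokenizer have to override it.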