Comments about "hidden" and the Recognizer trait in parser.rs (#32)
* Various parser.rs comments

* Update parser/src/earley/parser.rs

* Update parser/src/earley/parser.rs

---------

Co-authored-by: Jeffrey Kegler <[email protected]>
Co-authored-by: Michał Moskal <[email protected]>
3 people authored Oct 30, 2024
1 parent 4b3645c commit ae5e2a7
Showing 1 changed file with 10 additions and 0 deletions.
10 changes: 10 additions & 0 deletions parser/src/earley/parser.rs
@@ -1625,6 +1625,12 @@ impl BiasComputer for DefaultBiasComputer {
}
}

// Processing of the parser and the lexer is heavily interlocked.
// The 'Recognizer' trait is used as the interface for this.
// See the documentation for TokTrie in README.md and implementation.md:
// https://github.com/microsoft/toktrie
// and
// https://github.com/microsoft/toktrie/blob/main/implementation.md .
impl<'a> Recognizer for ParserRecognizer<'a> {
#[inline(always)]
fn pop_bytes(&mut self, num: usize) {
@@ -1729,6 +1735,10 @@ impl Parser {
self.state.captures = std::mem::take(&mut other.state.captures);
}

// The "hidden" feature must be supported for historical reasons.
// It is used for 'gen(stop="foo")'. The result of this 'gen'
// must not include 'foo', even though the LLM generated 'foo'.
// The bytes in 'foo' are therefore said to be "hidden".
pub fn hidden_start(&self) -> usize {
let mut shared = self.shared.lock().unwrap();
self.state.hidden_start(&mut shared)
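
To illustrate the "hidden" idea described in the comment above, here is a minimal standalone sketch. It is not the code in parser.rs: the helper names `hidden_start_sketch` and `visible_bytes` are hypothetical, and the sketch assumes the stop string is a plain byte suffix of what the model generated, which is simpler than the real parser and lexer state.

```rust
/// Hypothetical helper (not part of parser.rs): given all bytes the model
/// produced for one `gen`, return the index at which the hidden bytes begin.
/// If the output does not end with the stop string, nothing is hidden and
/// the returned index equals the length of the output.
fn hidden_start_sketch(generated: &[u8], stop: &[u8]) -> usize {
    if !stop.is_empty() && generated.ends_with(stop) {
        generated.len() - stop.len()
    } else {
        generated.len()
    }
}

/// Hypothetical helper: the part of the output that the result of `gen`
/// should contain, i.e. everything before the hidden bytes.
fn visible_bytes<'a>(generated: &'a [u8], stop: &[u8]) -> &'a [u8] {
    &generated[..hidden_start_sketch(generated, stop)]
}

fn main() {
    // The model generated the stop string "foo" at the end of its output.
    let generated = b"The answer is 42foo";
    let stop = b"foo";

    let start = hidden_start_sketch(generated, stop);
    let shown = visible_bytes(generated, stop);

    // The trailing "foo" was generated but is excluded from the result.
    println!("hidden bytes start at index {}", start);
    println!("visible result: {}", String::from_utf8_lossy(shown));
}
```

In this sketch the bytes of 'foo' were produced by the model, but they sit at or after the returned index and so are left out of the visible result.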