better readme
mmoskal committed Nov 8, 2024
1 parent af36f80 commit 3aea593
Showing 1 changed file with 44 additions and 11 deletions.
55 changes: 44 additions & 11 deletions README.md
@@ -1,21 +1,54 @@
# Low-level Guidance (llguidance)

This controller implements a context-free grammar parser with Earley's algorithm
on top of a lexer which uses [derivatives of regular expressions](https://github.com/microsoft/derivre).

It's to be used by next-generation [Guidance](https://github.com/guidance-ai/guidance) grammars.
See how it works in [plan.md](./plan.md).
This library implements constrained decoding (also called constrained sampling or
structured outputs) for Large Language Models (LLMs).
It can enforce an arbitrary context-free grammar on the output of an LLM
and is fast: on the order of 1ms of CPU time per token
for a tokenizer with 100k tokens, with negligible startup costs.

The following grammar formats are supported:
- `llguidance` - [internal (JSON-based) format](./parser/src/api.rs)
- regular expressions (following Rust regex crate [syntax](https://docs.rs/regex/latest/regex/#syntax))
- a large subset of JSON schemas
- context-free grammars in (a [subset](./parser/src/lark/README.md) of) [Lark](https://github.com/lark-parser/lark) format

The internal format is the most powerful and can be generated by the following libraries:
- [Guidance](https://github.com/guidance-ai/guidance) (Python)
- [guidance.ts](https://github.com/mmoskal/guidance-ts) (TypeScript)
- hopefully more to come!

This is now available in the `main` branch of Guidance.
Guidance PR: https://github.com/guidance-ai/guidance/pull/951
The library can be used from:
- [Rust](./parser/README.md), [sample](./sample_parser/src/sample_parser.rs)
- [C and C++](./parser/llguidance.h), [sample](./c_sample/c_sample.cpp)
- [Python](./python/llguidance/_lib.pyi)

The library is currently integrated in:
- [Guidance](https://github.com/guidance-ai/guidance) - library for interacting with LLMs;
uses either llama.cpp or HF Transformers
- [LLGTRT](https://github.com/guidance-ai/llgtrt) - OpenAI-compatible REST server using NVIDIA's [TensorRT-LLM](https://github.com/NVIDIA/TensorRT-LLM)

Integration is ongoing in:
- onnxruntime-genai - [draft PR](https://github.com/microsoft/onnxruntime-genai/pull/1038)
- mistral.rs - [preliminary PR](https://github.com/EricLBuehler/mistral.rs/pull/899)
- llama.cpp - [branch](https://github.com/mmoskal/llama.cpp/tree/llg);
note that llama.cpp is fully integrated in Guidance above
via Python bindings

Given a context-free grammar, a tokenizer, and a prefix of tokens,
llguidance computes a token mask: the set of tokens from the tokenizer
that, when appended to the current prefix, can lead to a valid string in
the language of the grammar.
Computing a mask takes on the order of 1ms of single-core CPU time
for a tokenizer with 100k tokens.
While this depends on the exact grammar, it holds e.g. for grammars resulting from JSON schemas.
There is also no significant startup cost.
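
To make the notion of a token mask concrete, here is a minimal sketch in plain Python. It does not use the llguidance API: the toy language (a bracketed, non-empty, comma-separated list of integers), the hand-written state machine, and the names `step`, `run`, and `compute_mask` are all assumptions made for illustration. The naive loop over the vocabulary is also not meant to reflect how llguidance achieves its speed.

```python
from typing import List, Optional

# Toy language (an assumption for this sketch): "[1,22,3]" style lists.
# States of a hand-written automaton; None stands for the dead state.
START, AFTER_OPEN, IN_NUM, AFTER_COMMA, DONE = range(5)
DIGITS = "0123456789"


def step(state: Optional[int], ch: str) -> Optional[int]:
    """Advance by one character; None means no valid string can follow."""
    if state == START:
        return AFTER_OPEN if ch == "[" else None
    if state in (AFTER_OPEN, AFTER_COMMA):
        return IN_NUM if ch in DIGITS else None
    if state == IN_NUM:
        if ch in DIGITS:
            return IN_NUM
        if ch == ",":
            return AFTER_COMMA
        if ch == "]":
            return DONE
    return None  # DONE accepts no further characters


def run(state: Optional[int], text: str) -> Optional[int]:
    """Feed a whole string through the automaton, stopping at the dead state."""
    for ch in text:
        state = step(state, ch)
        if state is None:
            return None
    return state


def compute_mask(vocab: List[str], prefix_tokens: List[int]) -> List[bool]:
    """mask[i] is True if token i can extend the prefix towards a valid string."""
    state = run(START, "".join(vocab[t] for t in prefix_tokens))
    if state is None:
        raise ValueError("prefix is not a viable prefix of the language")
    return [run(state, tok) is not None for tok in vocab]


vocab = ["[", "]", ",", "1", "12", "[1", ",2", "]]", "a", "1,"]
mask = compute_mask(vocab, prefix_tokens=[5])  # the prefix decodes to "[1"
allowed = [tok for tok, ok in zip(vocab, mask) if ok]
print(allowed)  # [']', ',', '1', '12', ',2', '1,']
```

During sampling, tokens whose mask entry is `False` would be excluded before the next token is chosen. Note that a token is judged as a whole string, so `]]` is rejected even though its first character is acceptable.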

The library implements a context-free grammar parser with Earley's algorithm
on top of a lexer which uses [derivatives of regular expressions](https://github.com/microsoft/derivre).
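
As a rough illustration of what "derivatives of regular expressions" means, the sketch below implements textbook Brzozowski derivatives in Python. It is not the derivre implementation; the class names and the `nullable`, `deriv`, and `matches` helpers are hypothetical, and the sketch omits the simplification of intermediate expressions that a practical implementation would need to keep them small.

```python
from dataclasses import dataclass


class Regex:
    """Base class for regular-expression nodes."""


@dataclass(frozen=True)
class Empty(Regex):  # matches no string at all
    pass


@dataclass(frozen=True)
class Eps(Regex):  # matches only the empty string
    pass


@dataclass(frozen=True)
class Char(Regex):  # matches a single character
    c: str


@dataclass(frozen=True)
class Cat(Regex):  # concatenation: a followed by b
    a: Regex
    b: Regex


@dataclass(frozen=True)
class Alt(Regex):  # alternation: a or b
    a: Regex
    b: Regex


@dataclass(frozen=True)
class Star(Regex):  # zero or more repetitions of a
    a: Regex


def nullable(r: Regex) -> bool:
    """True if r matches the empty string."""
    if isinstance(r, (Eps, Star)):
        return True
    if isinstance(r, (Empty, Char)):
        return False
    if isinstance(r, Cat):
        return nullable(r.a) and nullable(r.b)
    if isinstance(r, Alt):
        return nullable(r.a) or nullable(r.b)
    raise TypeError(f"unknown node: {r!r}")


def deriv(r: Regex, ch: str) -> Regex:
    """Regex matching every s such that r matches ch + s."""
    if isinstance(r, (Empty, Eps)):
        return Empty()
    if isinstance(r, Char):
        return Eps() if r.c == ch else Empty()
    if isinstance(r, Alt):
        return Alt(deriv(r.a, ch), deriv(r.b, ch))
    if isinstance(r, Star):
        return Cat(deriv(r.a, ch), r)
    if isinstance(r, Cat):
        left = Cat(deriv(r.a, ch), r.b)
        # If a can match "", ch may also be consumed by b.
        return Alt(left, deriv(r.b, ch)) if nullable(r.a) else left
    raise TypeError(f"unknown node: {r!r}")


def matches(r: Regex, text: str) -> bool:
    """Take one derivative per character, then test for the empty string."""
    for ch in text:
        r = deriv(r, ch)
    return nullable(r)


# (ab)*c
pattern = Cat(Star(Cat(Char("a"), Char("b"))), Char("c"))
print(matches(pattern, "ababc"), matches(pattern, "abac"))  # True False
```

The appeal of this formulation for a lexer is that the state after reading some input is itself a regular expression, so matching can proceed one character at a time without building an automaton up front.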

Grammars are normally [JSON-serialized](./parser/src/api.rs).
The following libraries produce llguidance grammars:

- [guidance](https://github.com/guidance-ai/guidance) (Python)
- [guidance.ts](https://github.com/mmoskal/guidance-ts) (TypeScript)
- hopefully more to come!

## Building

- [install Rust](https://www.rust-lang.org/tools/install), version 1.75 or later