[Cherry Pick] Refactor of perplexity computation #1399

Closed · wants to merge 4 commits

Conversation

dbogunowicz (Contributor)

Cherry-pick of #1197 into release 1.6.

dbogunowicz and others added 4 commits on November 2, 2023 at 11:22
* initial commit to unblock derek

* ready for review

* add unit test
* Add input_tokens as optional output

* Refactor Perplexity class to only compute perplexity. All other task-specific processing is handled elsewhere

* Simplify perplexity evaluation. Evaluation takes place at batch size 1 only, so there is no need to consider batched execution. In addition, use input_tokens from the generation pipeline

* Split wikitext at regular intervals of the same length as the sequence length (a chunking sketch appears at the end of this post)

* Add argument for accumulation of negative log likelihood (an accumulation sketch appears just after this commit list)

* Accumulate likelihood for wikitext

* Simplification

* Add support for wikitext-style ppl evaluation

* Compute each batch immediately instead of storing logits until the compute method. This drastically reduces memory requirements

* Remove torch dependency

* Move split of dataset into helper function

* Quality fixes

* Remove debugging prints

* Remove debugging prints

* Incorporate fixes for kv-cache

* Include doc string for accumulate

* Add support for the trust_remote_code argument

* Add support for c4

* add a missing include_prompt_logits param

* Remove unnecessary capping at sequence length (it's incorrect for cached models)

* Simplify processing for concatenated datasets

* Fix kv cache update

* Fix kv cache update

* Quality fixes

* remove batch size from pipeline instantiation

* Rename to wikitext2

* Remove trust_remote_code argument

* Remove use_deepsparse_cache argument

* Change padding of output to left in order to match padding of input ids and attention mask

* Allow trust_remote_code to be passed as argument (in some cases tokenizer can be defined by custom code)

* Move process_concatenated_datasets to helpers file

* Added support for max_text_length to speed up processing of long datasets

* Rebase w/ main

* Rebase w/ main

* Fix typo

* Rebase

* Use max_length instead of max_new_tokens

* Rebase

* Added typing and docstring

* Added typing and docstring

* Define concatenated datasets

* Add warning about batch-size not being a supported argument for some datasets

* Add unit test for pipeline and generation in ppl eval

* Add lifecycle in docstring

* Add copyright

* Style fixes

* Quality fixes

* Quality fixes

* Quality fixes

* Quality fixes

* Quality fixes

* Quality fixes

* Quality fixes

* Quality fixes

* Quality fixes

* Quality fixes

* Rebase

* Rebase

* Re-add unit test

* Style fix

* Update unit test

* Update unit test
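
For reference, a minimal sketch of the batch-at-a-time negative-log-likelihood accumulation described above. This is an illustration only, not the actual deepsparse implementation: the class name, the `accumulate` flag, and the `add_batch`/`compute` methods loosely mirror the commit messages, and the per-token log probabilities are assumed to be computed elsewhere from the pipeline's logits.

```python
# Illustrative only: names mirror the commit messages, not the real code.
import numpy as np


class Perplexity:
    """Accumulates negative log likelihood one batch at a time.

    With accumulate=True (the wikitext-style mode mentioned above), NLL is
    summed over every predicted token in the whole dataset and exponentiated
    once at the end; otherwise a per-sample perplexity is averaged.
    """

    def __init__(self, accumulate: bool = True):
        self.accumulate = accumulate
        self.neg_log_likelihood = 0.0
        self.token_count = 0
        self.per_sample_ppl = []

    def add_batch(self, token_log_probs: np.ndarray) -> None:
        # token_log_probs: log p(token_t | tokens_<t) for one sequence,
        # shape (num_predicted_tokens,). Folding each batch in here, instead
        # of storing raw logits until compute(), keeps memory usage flat.
        if self.accumulate:
            self.neg_log_likelihood += -float(token_log_probs.sum())
            self.token_count += token_log_probs.size
        else:
            self.per_sample_ppl.append(float(np.exp(-token_log_probs.mean())))

    def compute(self) -> float:
        if self.accumulate:
            return float(np.exp(self.neg_log_likelihood / self.token_count))
        return float(np.mean(self.per_sample_ppl))


if __name__ == "__main__":
    metric = Perplexity(accumulate=True)
    metric.add_batch(np.log(np.array([0.5, 0.25, 0.125])))
    print(f"perplexity: {metric.compute():.3f}")  # prints 4.000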

---------

Co-authored-by: dbogunowicz <[email protected]>
Co-authored-by: Damian <[email protected]>
Co-authored-by: Benjamin Fineran <[email protected]>
Co-authored-by: Rahul Tuli <[email protected]>
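
A second illustrative sketch, this time of the "split at regular intervals of the sequence length" step used for concatenated datasets such as wikitext2 and c4. The helper name `split_concatenated_text` and the `max_text_length` cap are assumptions standing in for the helper this PR moves into the helpers file.

```python
# Hypothetical sketch; the real helper lives in the project's helpers file.
from typing import List, Optional


def split_concatenated_text(
    token_ids: List[int],
    sequence_length: int,
    max_text_length: Optional[int] = None,
) -> List[List[int]]:
    """Split one long token stream into non-overlapping chunks of
    sequence_length tokens, each then evaluated at batch size 1."""
    if max_text_length is not None:
        # Optional cap to speed up processing of very long datasets.
        token_ids = token_ids[:max_text_length]
    # The ragged final chunk is kept here; an implementation might drop it.
    return [
        token_ids[start : start + sequence_length]
        for start in range(0, len(token_ids), sequence_length)
    ]


if __name__ == "__main__":
    chunks = split_concatenated_text(list(range(10)), sequence_length=4)
    print(chunks)  # [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]
```

Because every chunk of the concatenated stream contributes to a single accumulated NLL, this pairs with `accumulate=True` in the sketch above.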