Releases: TransformerLensOrg/TransformerLens
v1.19.0
A small update that fixes a reported bug and adds support for ai-forever models.
What's Changed
- Add support for ai-forever/mGPT model by @SeuperHakkerJa in #606
- moved enable hook functionality to separate functions and tested new functions by @bryce13950 in #613
Full Changelog: v1.18.0...v1.19.0
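The hook refactor in #613 splits hook enabling into separate functions. As a rough illustration of the hook-point pattern involved (a hypothetical minimal class, not the actual TransformerLens `HookPoint` internals):

```python
# Minimal sketch of a hook point: callers register functions that transform
# a value as it passes through, and can later remove them. Hypothetical code,
# not the TransformerLens implementation.
class HookPoint:
    def __init__(self):
        self.hooks = []

    def add_hook(self, fn):
        # Enabling a hook is its own function, mirroring the #613 refactor.
        self.hooks.append(fn)

    def remove_hooks(self):
        self.hooks = []

    def __call__(self, value):
        for fn in self.hooks:
            value = fn(value)
        return value

hp = HookPoint()
hp.add_hook(lambda x: x * 2)   # e.g. double an activation
out = hp(3)                    # hook applied
hp.remove_hooks()
assert hp(3) == 3              # hooks removed, value passes through unchanged
```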
v1.18.0
Very important release for those using Gemma models. A recent upstream change caused the TransformerLens implementation to become outdated; this release fixes that issue and includes a number of cumulative changes and bug fixes. The only API change in this release is that you can now override `trust_remote_code` in `from_pretrained`. Thanks to all who contributed to this release!
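The #597 change lets `from_pretrained` kwargs force `trust_remote_code`. A self-contained sketch of the override pattern (the function name and the `Qwen/` default are illustrative assumptions, not TransformerLens internals):

```python
# Hypothetical helper showing kwargs-based override of trust_remote_code:
# some architectures default to trusting remote code, but an explicit
# caller-supplied kwarg always wins.
def resolve_trust_remote_code(model_name, **from_pretrained_kwargs):
    default = model_name.startswith("Qwen/")  # illustrative default only
    return from_pretrained_kwargs.get("trust_remote_code", default)

# Caller explicitly forces the flag on, as #597 enables:
forced = resolve_trust_remote_code("some-org/some-model", trust_remote_code=True)
assert forced is True
```

In actual use this corresponds to passing the kwarg through, e.g. `HookedTransformer.from_pretrained(model_name, trust_remote_code=True)` (check the current API docs for details).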
What's Changed
- reworked CI to publish code coverage report by @bryce13950 in #559
- Resolve SAE CI Test failures by @bryce13950 in #560
- Ci coverage location by @bryce13950 in #561
- Ci full coverage by @bryce13950 in #562
- moved coverage report download by @bryce13950 in #564
- Revert "moved coverage report download (#564)" by @bryce13950 in #565
- Othello ci by @bryce13950 in #567
- moved report to static section by @bryce13950 in #566
- Fix broken HookedSAETransformer demo links by @ckkissane in #572
- Fix Pos Slice Issue by @hannamw in #578
- Hf secret by @bryce13950 in #552
- updated pull request template to account for new dev branch by @bryce13950 in #581
- updated PR template to add a note about merging from different branches by @bryce13950 in #583
- updated repo URL throughout the project by @bryce13950 in #580
- Fix docs badge in README by @ArthurConmy in #585
- added debug step by @bryce13950 in #568
- Update Gemma to reflect upstream HF changes by @cmathw in #596
- allow user to force trust_remote_code=true via from_pretrained kwargs by @Butanium in #597
Full Changelog: v1.17.0...v1.18.0
v1.17.0: New feature: HookedSAETransformer!
v1.16.0
Lots of feature additions (thanks @joelburget for Llama support, and @sheikheddy for Llama-2-70b-chat-hf support!), and also a very helpful bugfix from @wesg52. Thanks to all contributors, especially new contributors!
What's Changed
- Add support for Llama-2-70b-chat-hf by @sheikheddy in #525
- Update loading_from_pretrained.py by @jbloomAus in #529
- Bugfix: pytest import by @tkukurin in #532
- Remove non-existing parameter from decompose_resid() documentation by @VasilGeorgiev39 in #504
- Add `@overload` to `FactoredMatrix.__{,r}matmul__` by @JasonGross in #512
- Improve documentation for abstract attribute by @Felhof in #508
- Add pos_slice to run_with_cache by @VasilGeorgiev39 in #465
- Add Support for Yi-6B and Yi-34B by @collingray in #494
- updated docs to account for additional test suites by @bryce13950 in #533
- Bugfix: remove redundant assert checks by @tkukurin in #534
- Speed up !pip install transformer-lens in colab by @pavanyellow in #510
- Add Xavier and Kaiming Initializations by @Chanlaw in #537
- chore: fixing type errors and enabling mypy by @chanind in #516
- Add Mixtral by @collingray in #521
- Standardize black line length to 100, in line with other project settings by @Chanlaw in #538
- Refactor hook_points by @VasilGeorgiev39 in #505
- Fix split_qkv_input for grouped query attention by @wesg52 in #520
- locked attribution patching to 1.1.1 by @bryce13950 in #541
- Demo no position fix by @bryce13950 in #544
- Othello colab fix by @bryce13950 in #545
- Fixed Santa Coder demo by @bryce13950 in #546
- Hf token auth by @bryce13950 in #550
- Fixed device being set to cpu:0 instead of cpu by @Butanium in #551
- Add support for Llama 3 (and Llama-2-70b-hf) by @joelburget in #549
- Loading of huggingface 4-bit quantized Llama by @coolvision in #486
- removed duplicate rearrange block by @bryce13950 in #555
- Bert demo ci by @bryce13950 in #556
New Contributors
- @sheikheddy made their first contribution in #525
- @tkukurin made their first contribution in #532
- @VasilGeorgiev39 made their first contribution in #504
- @JasonGross made their first contribution in #512
- @pavanyellow made their first contribution in #510
- @Chanlaw made their first contribution in #537
- @chanind made their first contribution in #516
- @wesg52 made their first contribution in #520
- @Butanium made their first contribution in #551
- @coolvision made their first contribution in #486
Full Changelog: v1.15.0...v1.16.0
v1.15.0: Gemma, Qwen, Phi!
What's Changed
- Support Phi Models by @cmathw in #484
- Remove redundant MLP bias assignment by @adamkarvonen in #485
- add qwen1.5 models by @andyrdt in #507
- Support Gemma Models by @cmathw in #511
- make tests pass mps by @jbloomAus in #528
Full Changelog: v1.14.0...v1.15.0
v1.14.0
What's Changed
- Implement RMS Layer Norm folding by @collingray in #489
- Cap Mistral's context length at 2k by @collingray in #495
New Contributors
- @collingray made their first contribution in #489
Full Changelog: v1.13.0...v1.14.0
v1.13.0
What's Changed
- Add support for CodeLlama-7b by @YuhengHuang42 in #469
- Make LLaMA 2 loadable directly from HF by @andyrdt in #458
- Fixes #371: LLAMA load on CUDA. Expected all tensors to be on the sam… by @artkpv in #461
- Extending Support for Additional Bloom Models (up to 7b) by @SeuperHakkerJa in #447
- Support mistral 7 b by @Felhof in #443
New Contributors
- @YuhengHuang42 made their first contribution in #469
- @andyrdt made their first contribution in #458
- @artkpv made their first contribution in #461
Full Changelog: v1.12.1...v1.13.0
v1.12.1
Adds Qwen. Thanks @Aaquib111 and @andyrdt!
What's Changed
- Closes #478: Adding the Qwen family of models by @Aaquib111 in #477
- Add a function to convert nanogpt weights by @adamkarvonen in #475
New Contributors
- @Aaquib111 made their first contribution in #477
- @adamkarvonen made their first contribution in #475
Full Changelog: v1.12.0...v1.12.1
v1.12.0
v1.11.0
Thanks to @obalcells and @andyrdt, Llama-2 models should now have `1e-4` atol logit errors rather than `1e0` errors!
We also now force PyTorch 2 to be >= 2.1.1, thanks to a PyTorch issue on MPS that @jettjaniak pointed out. Thanks all!
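The tolerance claim above is the kind of check `torch.allclose` performs on tensors; a pure-Python sketch with illustrative (not real) logit values:

```python
# Element-wise absolute-tolerance comparison, as torch.allclose does for
# tensors. The logit values below are made up for illustration.
def allclose(a, b, atol):
    return all(abs(x - y) <= atol for x, y in zip(a, b))

hf_logits = [1.00005, -2.30004, 0.49996]  # reference implementation
tl_logits = [1.00000, -2.30000, 0.50000]  # TransformerLens after the fix

# Agreement within ~1e-4 atol, but not within a much tighter 1e-6:
assert allclose(hf_logits, tl_logits, atol=1e-4)
assert not allclose(hf_logits, tl_logits, atol=1e-6)
```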
What's Changed
- Fix Grokking Notebook by @ArthurConmy in #450
- Fixed current CI issues with accuracy failing for Pythia model by @bryce13950 in #451
- Fixing Llama2 numerical errors by @obalcells in #456
- Pin PyTorch2 to be at least 2.1.1 by @ArthurConmy in #457
New Contributors
- @obalcells made their first contribution in #456
Full Changelog: v1.10.0...v1.11.0