
Extending Support for Additional Bloom Models (up to 7b) #447

Merged · 3 commits · Jan 17, 2024

Conversation

SeuperHakkerJa
Contributor

@SeuperHakkerJa SeuperHakkerJa commented Nov 14, 2023

Description

Draft Pull Request: Extending Support for Additional Bloom Models Beyond 560m, Continuing from #434

Type of change


  • New feature (non-breaking change which adds functionality)
  • This change requires a documentation update

Checklist:

  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • I have not rewritten tests relating to key interfaces which would affect backward compatibility

@SeuperHakkerJa
Contributor Author

SeuperHakkerJa commented Nov 18, 2023

GPT Neo is failing the acceptance test, but I'm unable to replicate this error in my local environment:

=================================== FAILURES ===================================
_____________________________ test_dtypes[dtype1] ______________________________

dtype = torch.float32

    @pytest.mark.parametrize("dtype", [torch.float64, torch.float32])
    def test_dtypes(dtype):
>       check_dtype(dtype, margin=5e-5)

tests/acceptance/test_hooked_transformer.py:238: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
tests/acceptance/test_hooked_transformer.py:226: in check_dtype
    check_performance(model, hf_model, margin)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

tl_model = HookedTransformer(
  (embed): Embed()
  (hook_embed): HookPoint()
  (blocks): ModuleList(
    (0-5): 6 x TransformerBl...(ln_final): LayerNormPre(
    (hook_scale): HookPoint()
    (hook_normalized): HookPoint()
  )
  (unembed): Unembed()
)
hf_model = GPTNeoXForCausalLM(
  (gpt_neox): GPTNeoXModel(
    (embed_in): Embedding(50304, 512)
    (emb_dropout): Dropout(p=0.0...512,), eps=1e-05, elementwise_affine=True)
  )
  (embed_out): Linear(in_features=512, out_features=50304, bias=False)
)
margin = 5e-05

    def check_performance(tl_model, hf_model, margin):
        """
        Check that the TransformerLens model and the HuggingFace have
        approximately the same confidence in the expected answer.
        """
        prompt = " Unable"
        tokens = tl_model.tokenizer(prompt, return_tensors="pt")["input_ids"].to(
            "cuda" if torch.cuda.is_available() else "cpu"
        )
    
        expected_token = tl_model.tokenizer.encode(" to")[
            0
        ]  # Assume this is the expected token to predict
    
        tl_logits = tl_model(tokens, prepend_bos=False)[0, -1].float()
        hf_logits = hf_model(tokens).logits[0, -1].float()
        tl_prob = torch.softmax(tl_logits, dim=-1)[expected_token].item()
        hf_prob = torch.softmax(hf_logits, dim=-1)[expected_token].item()
>       assert tl_prob + margin > hf_prob
E       assert (0.4650971293449402 + 5e-05) > 0.465149462223053

tests/acceptance/test_hooked_transformer.py:204: AssertionError
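The two probabilities above differ by roughly 5e-5, just past the one-sided margin, so the test is sensitive to which model happens to score marginally higher. A minimal sketch of a symmetric relative-tolerance comparison (the function name and tolerance are illustrative, not part of the repository's test suite):

```python
import torch

def probs_close(tl_logits, hf_logits, token_id, rtol=1e-4):
    """Compare the probability each model assigns to token_id,
    using a symmetric relative tolerance rather than a one-sided margin."""
    tl_prob = torch.softmax(tl_logits.float(), dim=-1)[token_id].item()
    hf_prob = torch.softmax(hf_logits.float(), dim=-1)[token_id].item()
    return abs(tl_prob - hf_prob) <= rtol * max(tl_prob, hf_prob)

torch.manual_seed(0)
base = torch.randn(50304)  # vocab-sized logit vector, as in the trace

# Float32-rounding-scale noise passes; a genuine logit shift does not.
noisy = base + torch.randn(50304) * 1e-6
shifted = base.clone()
shifted[281] += 0.1
```

A check like this (or `torch.testing.assert_close` on the full logit vectors) would fail only on real divergence, not on run-to-run float32 noise.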

@SeuperHakkerJa SeuperHakkerJa marked this pull request as ready for review December 13, 2023 20:20
Collaborator

@alan-cooney alan-cooney left a comment

Thanks for adding this!

@alan-cooney alan-cooney merged commit a5147ba into TransformerLensOrg:main Jan 17, 2024
8 checks passed
2 participants