
Extending Support for Additional Bloom Models (up to 7b) #447

Merged · 3 commits · Jan 17, 2024

Conversation

SeuperHakkerJa
Contributor

@SeuperHakkerJa SeuperHakkerJa commented Nov 14, 2023

Description

Draft Pull Request: Extending Support for Additional Bloom Models Beyond 560m, Continuing from #434

Type of change


  • New feature (non-breaking change which adds functionality)
  • This change requires a documentation update

Checklist:

  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • I have not rewritten tests relating to key interfaces which would affect backward compatibility

@SeuperHakkerJa
Contributor Author

SeuperHakkerJa commented Nov 18, 2023

GPT Neo is failing the acceptance test, but I'm unable to replicate this error in my local environment:

=================================== FAILURES ===================================
_____________________________ test_dtypes[dtype1] ______________________________

dtype = torch.float32

    @pytest.mark.parametrize("dtype", [torch.float64, torch.float32])
    def test_dtypes(dtype):
>       check_dtype(dtype, margin=5e-5)

tests/acceptance/test_hooked_transformer.py:238: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
tests/acceptance/test_hooked_transformer.py:226: in check_dtype
    check_performance(model, hf_model, margin)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

tl_model = HookedTransformer(
  (embed): Embed()
  (hook_embed): HookPoint()
  (blocks): ModuleList(
    (0-5): 6 x TransformerBl...(ln_final): LayerNormPre(
    (hook_scale): HookPoint()
    (hook_normalized): HookPoint()
  )
  (unembed): Unembed()
)
hf_model = GPTNeoXForCausalLM(
  (gpt_neox): GPTNeoXModel(
    (embed_in): Embedding(50304, 512)
    (emb_dropout): Dropout(p=0.0...512,), eps=1e-05, elementwise_affine=True)
  )
  (embed_out): Linear(in_features=512, out_features=50304, bias=False)
)
margin = 5e-05

    def check_performance(tl_model, hf_model, margin):
        """
        Check that the TransformerLens model and the HuggingFace have
        approximately the same confidence in the expected answer.
        """
        prompt = " Unable"
        tokens = tl_model.tokenizer(prompt, return_tensors="pt")["input_ids"].to(
            "cuda" if torch.cuda.is_available() else "cpu"
        )
    
        expected_token = tl_model.tokenizer.encode(" to")[
            0
        ]  # Assume this is the expected token to predict
    
        tl_logits = tl_model(tokens, prepend_bos=False)[0, -1].float()
        hf_logits = hf_model(tokens).logits[0, -1].float()
        tl_prob = torch.softmax(tl_logits, dim=-1)[expected_token].item()
        hf_prob = torch.softmax(hf_logits, dim=-1)[expected_token].item()
>       assert tl_prob + margin > hf_prob
E       assert (0.4650971293449402 + 5e-05) > 0.465149462223053

tests/acceptance/test_hooked_transformer.py:204: AssertionError
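The two probabilities above differ by roughly 5e-5, just past the one-sided margin, so the test is sensitive to which model happens to score marginally higher. A minimal sketch of a symmetric relative-tolerance comparison (the function name and tolerance are illustrative, not part of the repository's test suite):

```python
import torch

def probs_close(tl_logits, hf_logits, token_id, rtol=1e-4):
    """Compare the probability each model assigns to token_id,
    using a symmetric relative tolerance rather than a one-sided margin."""
    tl_prob = torch.softmax(tl_logits.float(), dim=-1)[token_id].item()
    hf_prob = torch.softmax(hf_logits.float(), dim=-1)[token_id].item()
    return abs(tl_prob - hf_prob) <= rtol * max(tl_prob, hf_prob)

torch.manual_seed(0)
base = torch.randn(50304)  # vocab-sized logit vector, as in the trace

# Float32-rounding-scale noise passes; a genuine logit shift does not.
noisy = base + torch.randn(50304) * 1e-6
shifted = base.clone()
shifted[281] += 0.1
```

A check like this (or `torch.testing.assert_close` on the full logit vectors) would fail only on real divergence, not on run-to-run float32 noise.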

@SeuperHakkerJa SeuperHakkerJa marked this pull request as ready for review December 13, 2023 20:20
Collaborator

@alan-cooney alan-cooney left a comment

Thanks for adding this!

@alan-cooney alan-cooney merged commit a5147ba into TransformerLensOrg:main Jan 17, 2024
8 checks passed
2 participants