mean_resizing = True does not work with mixed/meta initialization #3719

Open
wants to merge 2 commits into
base: main
jdchang1
What does this PR do?

Transformers recently added a mean_resizing argument to resize_token_embeddings, and it defaults to True. This breaks mixed initialization in downstream training tasks that need to add tokens to Composer Hugging Face models. This PR sets the value to False for now rather than relying on the default of True.

@jdchang1 requested a review from a team as a code owner on November 20, 2024, 19:50
@jdchang1 requested a review from mvpatel2000 on November 20, 2024, 19:50
@@ -167,7 +167,7 @@ def _check_tokenizer_and_maybe_resize_embeddings(self, allow_embedding_resizing:
     f' This would cause an error during training.'
     f' Resizing the model embeddings to {len(self.tokenizer)} from {self.config.vocab_size}.',
 )
-self.model.resize_token_embeddings(len(self.tokenizer))
+self.model.resize_token_embeddings(len(self.tokenizer), mean_resizing=False)
you'll need to gate on transformers version, or inspect the args to the func before passing this in or something i think
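
One way to do the "inspect the args" option suggested above is sketched here. It is only an illustration, not the change that was made in this PR: the helper name mean_resizing_kwargs and the two stand-in functions are hypothetical, and the stand-ins merely imitate an older and a newer resize_token_embeddings signature so the snippet runs without transformers installed.

```python
import inspect


def mean_resizing_kwargs(resize_fn):
    """Return {'mean_resizing': False} only if resize_fn accepts that argument.

    Hypothetical helper: it looks at the callable's signature, so it needs
    no knowledge of which transformers release introduced the argument.
    """
    params = inspect.signature(resize_fn).parameters
    if 'mean_resizing' in params:
        return {'mean_resizing': False}
    return {}


# Stand-ins for an older and a newer signature (illustrative only).
def old_resize(new_num_tokens=None, pad_to_multiple_of=None):
    return (new_num_tokens, None)


def new_resize(new_num_tokens=None, pad_to_multiple_of=None, mean_resizing=True):
    return (new_num_tokens, mean_resizing)


# The keyword is passed only where the callable declares it.
old_result = old_resize(10, **mean_resizing_kwargs(old_resize))
new_result = new_resize(10, **mean_resizing_kwargs(new_resize))
```

At the call site in the diff, the same idea would presumably look like self.model.resize_token_embeddings(len(self.tokenizer), **mean_resizing_kwargs(self.model.resize_token_embeddings)). Signature inspection avoids hard-coding a version number, though a version gate would work as well.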

@dakinggg dakinggg left a comment

For posterity/if we want to fix this, could you explain why the mean_resizing doesn't work with meta but the old version does?

@mvpatel2000 mvpatel2000 left a comment

LGTM besides Daniel's comment which will fix tests

3 participants