Archive embedding size #105

Open
bjherger opened this issue Dec 29, 2018 · 0 comments

Archive the vocab size, for use when creating the embedding layer. A transient issue can occur when the maximum vocab index isn't seen in the training data set, so the embedding vectorizer ends up with a larger vocab than the embedding matrix.
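
A minimal sketch of how the mismatch can surface (all names and values here are hypothetical, not from this repo's code):

```python
import numpy as np

vectorizer_vocab_size = 10           # full vocab known to the vectorizer
train_indices = np.array([1, 4, 6])  # largest index seen in training is 6

# Current behavior: the embedding matrix is sized from the training data,
# so it has only 7 rows even though the vectorizer can emit indices up to 9.
embedding_matrix = np.random.rand(train_indices.max() + 1, 8)

try:
    _ = embedding_matrix[9]  # a valid vectorizer index unseen in training
except IndexError as err:
    print('Out-of-range embedding lookup:', err)
```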

Current state

  • Embedding matrix pulls its vocab size from the largest vocab index seen in the training set

Future state

  • Embedding matrix pulls its vocab size from the transformation pipeline, or from another value explicitly set by the transformation pipeline (see the sketch below)
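
A minimal sketch of the proposed future state, assuming a hypothetical `TransformationPipeline` that archives the full vocab size at fit time, with the Keras `Embedding` layer built from that archived value:

```python
from tensorflow.keras.layers import Embedding


class TransformationPipeline:
    """Hypothetical stand-in for the transformation pipeline."""

    def __init__(self, vocab):
        self.vocab = vocab
        # Archived explicitly at fit time; +1 leaves room for a padding index.
        # The value no longer depends on which indices training happened to see.
        self.vocab_size = len(vocab) + 1


pipeline = TransformationPipeline(vocab=['<UNK>', 'cat', 'dog', 'fish'])

# Future behavior: input_dim comes from the pipeline, not from
# max(training indices) + 1, so every vectorizer index has a row.
embedding = Embedding(input_dim=pipeline.vocab_size, output_dim=8)
```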