LanguageCrossEntropy logs NaN when running bash pruning.sh #34
Comments
It's weird to me why it happens. Have you tried the original setup with 7b domains? Does it cause problems? Meanwhile, I will try out the two-domain setup once I get some compute ready.
Hi @xiamengzhou, the NaN happens in the first batch when calculating [...]. The environment I use is the same as yours, except that [...].
Could you try the processed data I have here: https://drive.google.com/drive/folders/1WPIRx2NGkNBDswqZZh-hwI1h-QiKVCuN
Hi @xiamengzhou! Thanks for your reply.
Hi @xiamengzhou! However, during normal training (updating L_prune), the NaN still happens for the same reason (missing data from some sub-datasets), but L_prune can still be updated.
Hi! It's normal to get NaN for some batches when the sampled batch does not contain data for a specific domain, usually because the sampling ratio for that domain is low.
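The mechanism behind this can be sketched in a few lines: a per-domain metric averages token losses over the tokens belonging to that domain, so when a batch happens to contain zero tokens for a domain, the average degenerates and the logged value is NaN. This is a minimal illustrative sketch (the function name and numbers are made up for this example, not taken from the repo's implementation):

```python
import math

def per_domain_loss(token_losses, token_domains, domain):
    """Average cross-entropy over the tokens belonging to one domain.

    Mirrors how a per-domain metric averages: sum of losses for the
    domain divided by the token count. When the sampled batch has no
    tokens for the domain, the average is undefined and reported as
    NaN -- a logging artifact, not a training failure.
    """
    losses = [l for l, d in zip(token_losses, token_domains) if d == domain]
    if not losses:
        return float("nan")  # no tokens for this domain in the batch
    return sum(losses) / len(losses)

# A toy batch sampled with a low "github" ratio: no github tokens drawn.
token_losses = [2.1, 1.8, 2.4, 2.0]
token_domains = ["book", "book", "book", "book"]

print(per_domain_loss(token_losses, token_domains, "book"))
print(math.isnan(per_domain_loss(token_losses, token_domains, "github")))  # True
```

Since only the per-domain average is NaN, the total batch loss over the tokens that are present stays finite, which is why L_prune can still be updated.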
I have an issue. I used the two datasets you provided: [book, github].
My mds_sample_redpajama looks like this: [...]
And I fixed pruning.sh: [...]
Then, when I train the model, this still happens: [...]