Hello! I trained the model on the WMT16 dataset and modified the parameters to the following values.
The main modifications were dim and seq_len; in addition, I changed learning_step to 120000 to improve the results.
But I still got very poor results.
I wonder: when I change these parameters, do I have to change other parameters along with them?
When I trained the model with your original parameters, the results were not good enough because of dim and seq_len, but they were still better than the current results.
zkzhou126 changed the title from "training on wmt16" to "If there is any rule to modify the parameters" on Jan 26, 2024.
Hi,
Many hyper-parameters can affect the final results, including bsz, seq_len, dim, steps, and the tokenizer. Other techniques, such as self-conditioning and length prediction, may also help training.
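To make the interaction concrete, here is a minimal sketch of how these knobs might be grouped in one place. The names and values are illustrative assumptions, not this repository's actual config; the point is that changing dim or seq_len usually means revisiting the others rather than tuning each in isolation.

```python
# Hypothetical hyper-parameter sketch (illustrative names/values, not the repo's real config).
config = dict(
    bsz=512,             # a larger dim or seq_len often needs a larger effective batch
                         # (or gradient accumulation) to keep training stable
    seq_len=128,         # must cover the tokenized length distribution of WMT16 pairs;
                         # too short truncates targets, too long wastes compute
    dim=512,             # model width; scaling it up usually calls for more steps and
                         # a revisited learning-rate schedule
    steps=120_000,       # more steps alone rarely fix a mismatched bsz/dim/lr combination
    lr=1e-4,             # a common heuristic: lower the LR when bsz shrinks or dim grows
    tokenizer="bert-base-uncased",  # vocabulary size affects embedding cost and the
                                    # seq_len actually needed to cover the data
    self_conditioning=True,   # feed the previous denoising estimate back as extra input
    length_prediction=True,   # predict the target length instead of relying on padding
)

if __name__ == "__main__":
    for k, v in config.items():
        print(f"{k}: {v}")
```

In short, treat bsz, seq_len, dim, lr, and steps as a coupled group: when you enlarge one, check whether the others still make sense together.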