## [1.13.0]

### Fixed
- Transformer models no longer silently ignore `--num-embed`. As a result, an error is now raised if `--num-embed != --transformer-model-size` (see the sketch after this list).
- Fixed attention in upper layers (`--rnn-attention-in-upper-layers`), which was previously not passed correctly to the decoder.
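For illustration, a minimal sketch of the consistency check described above. The function name and error wording are hypothetical; only the two CLI flags come from the entry:

```python
def check_transformer_embed_size(num_embed: int, model_size: int) -> None:
    """Hypothetical sketch of the new validation: transformer embeddings
    feed layers of width --transformer-model-size, so the two sizes must
    agree instead of --num-embed being silently ignored."""
    if num_embed != model_size:
        raise ValueError(
            "--num-embed (%d) must equal --transformer-model-size (%d) "
            "for transformer models." % (num_embed, model_size))
```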
### Removed
- Removed RNN parameter (un-)packing and support for FusedRNNCells (removed the `--use-fused-rnns` flag). These were unused, not correctly initialized, and performed worse than regular RNN cells; they also made the code considerably more complex. RNN models trained with previous versions are no longer compatible.
- Removed the lexical biasing functionality (Arthur et al., 2016) (removed the `--lexical-bias` and `--learn-lexical-bias` arguments).
## [1.12.2]

### Changed
- Updated to MXNet 0.12.1, which includes an important bug fix for CPU decoding.
## [1.12.1]

### Changed
- Removed the dependency on the `sacrebleu` pip package; it is now imported directly from `contrib/`.
## [1.12.0]

### Changed
- Transformers now always apply the linear output transformation after combining attention heads, even when input and output depth do not differ (see the sketch below).
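As an illustration, a minimal NumPy sketch of the output transformation in question; shapes and names are assumptions, not Sockeye's code:

```python
import numpy as np

def combine_heads(head_outputs: list, w_o: np.ndarray) -> np.ndarray:
    """Toy sketch: concatenate per-head attention outputs and always
    apply the output projection w_o, even when the concatenated depth
    already equals the model depth."""
    concat = np.concatenate(head_outputs, axis=-1)  # (batch, seq, heads * d_head)
    return concat @ w_o                             # (batch, seq, d_model)
```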
## [1.11.2]

### Fixed
- Fixed a bug where vocabulary slice padding defaulted to the CPU context, which affected decoding on GPUs with very small vocabularies (see the sketch below).
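For context, the general failure mode in MXNet (a hypothetical snippet, not the actual fix): arrays created without an explicit `ctx` are allocated on the CPU, so padding data later combined with GPU-resident arrays must be created in the same context:

```python
import mxnet as mx

ctx = mx.gpu(0)
# Buggy pattern: defaults to mx.cpu(0) and mismatches GPU-resident data.
pad_cpu = mx.nd.zeros((1, 8))
# Fixed pattern: allocate the padding in the decoding context.
pad_gpu = mx.nd.zeros((1, 8), ctx=ctx)
```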
## [1.11.1]

### Fixed
- Fixed an issue with the use of `ignore` in `CrossEntropyMetric::cross_entropy_smoothed`, which affected runs with the Eve optimizer and label smoothing (a sketch of the intended computation follows). Thanks @kobenaxie for reporting.
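For reference, a self-contained NumPy sketch of label-smoothed cross entropy with ignored (padding) positions; the names and the exact smoothing scheme are assumptions, not Sockeye's implementation:

```python
import numpy as np

def cross_entropy_smoothed(probs, labels, alpha, ignore_id=0):
    """Toy sketch: spread alpha over non-target classes, give the true
    label 1 - alpha, and let positions labeled ignore_id contribute no
    loss (the "ignore" handling the fix above concerns)."""
    n, vocab_size = probs.shape
    smoothed = np.full_like(probs, alpha / (vocab_size - 1))
    smoothed[np.arange(n), labels] = 1.0 - alpha
    token_loss = -(smoothed * np.log(np.maximum(probs, 1e-10))).sum(axis=1)
    mask = labels != ignore_id
    return token_loss[mask].sum() / max(mask.sum(), 1)
```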
## [1.11.0]

### Added
- Lexicon-based target vocabulary restriction for faster decoding: a new CLI for top-k lexicon creation, `sockeye.lexicon`, and a new translate CLI argument, `--restrict-lexicon` (see the sketch below).
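To illustrate the idea (a toy sketch, not Sockeye's implementation; `topk_lexicon` and `always_allowed` are hypothetical names), the decoder scores only target ids that some source token maps to in a top-k lexicon, plus a small always-allowed set:

```python
def restricted_target_vocab(source_ids, topk_lexicon, always_allowed):
    """Toy sketch of lexicon-based vocabulary restriction: the union of
    the top-k translations of each source token, plus special ids such
    as BOS/EOS/UNK, gives the reduced target vocabulary to decode over."""
    allowed = set(always_allowed)
    for sid in source_ids:
        allowed.update(topk_lexicon.get(sid, ()))
    return sorted(allowed)

# Example: restrict to ids translatable from source tokens [4, 17].
vocab = restricted_target_vocab([4, 17],
                                {4: (7, 9), 17: (9, 12)},
                                always_allowed=(0, 1, 2))
# -> [0, 1, 2, 7, 9, 12]
```

Decoding over this reduced vocabulary shrinks the output softmax, which is where much of the speedup comes from.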