## [1.13.0]

### Fixed
- Transformer models no longer silently ignore `--num-embed`. As a result, an error is now raised if `--num-embed != --transformer-model-size` (see the sketch after this list).
- Fixed attention in upper layers (`--rnn-attention-in-upper-layers`), which was previously not passed correctly to the decoder.
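For illustration, a minimal sketch of the consistency check described above. The function name and error wording are hypothetical; only the two CLI flags come from the entry:

```python
def check_transformer_embed_size(num_embed: int, model_size: int) -> None:
    """Hypothetical sketch of the new validation: transformer embeddings
    feed layers of width --transformer-model-size, so the two sizes must
    agree instead of --num-embed being silently ignored."""
    if num_embed != model_size:
        raise ValueError(
            "--num-embed (%d) must equal --transformer-model-size (%d) "
            "for transformer models." % (num_embed, model_size))
```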
### Removed
- Removed RNN parameter (un-)packing and support for FusedRNNCells (removed the `--use-fused-rnns` flag). These were unused, not correctly initialized, and performed worse than regular RNN cells; they also made the code considerably more complex. RNN models trained with previous versions are no longer compatible.
- Removed the lexical biasing functionality (Arthur et al., 2016) (removed the `--lexical-bias` and `--learn-lexical-bias` arguments).
## [1.12.2]

### Changed
- Updated to MXNet 0.12.1, which includes an important bug fix for CPU decoding.
## [1.12.1]

### Changed
- Removed the dependency on the `sacrebleu` pip package; it is now imported directly from `contrib/`.
## [1.12.0]

### Changed
- Transformers now always apply the linear output transformation after combining attention heads, even when input and output depth do not differ (see the sketch below).
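As an illustration, a minimal NumPy sketch of the output transformation in question; shapes and names are assumptions, not Sockeye's code:

```python
import numpy as np

def combine_heads(head_outputs: list, w_o: np.ndarray) -> np.ndarray:
    """Toy sketch: concatenate per-head attention outputs and always
    apply the output projection w_o, even when the concatenated depth
    already equals the model depth."""
    concat = np.concatenate(head_outputs, axis=-1)  # (batch, seq, heads * d_head)
    return concat @ w_o                             # (batch, seq, d_model)
```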
## [1.11.2]

### Fixed
- Fixed a bug where vocabulary slice padding defaulted to the CPU context, which affected decoding on GPUs with very small vocabularies (see the sketch below).
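For context, the general failure mode in MXNet (a hypothetical snippet, not the actual fix): arrays created without an explicit `ctx` are allocated on the CPU, so padding data later combined with GPU-resident arrays must be created in the same context:

```python
import mxnet as mx

ctx = mx.gpu(0)
# Buggy pattern: defaults to mx.cpu(0) and mismatches GPU-resident data.
pad_cpu = mx.nd.zeros((1, 8))
# Fixed pattern: allocate the padding in the decoding context.
pad_gpu = mx.nd.zeros((1, 8), ctx=ctx)
```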
## [1.11.1]

### Fixed
- Fixed an issue with the use of `ignore` in `CrossEntropyMetric::cross_entropy_smoothed`, which affected runs with the Eve optimizer and label smoothing (a sketch of the intended computation follows). Thanks @kobenaxie for reporting.
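For reference, a self-contained NumPy sketch of label-smoothed cross entropy with ignored (padding) positions; the names and the exact smoothing scheme are assumptions, not Sockeye's implementation:

```python
import numpy as np

def cross_entropy_smoothed(probs, labels, alpha, ignore_id=0):
    """Toy sketch: spread alpha over non-target classes, give the true
    label 1 - alpha, and let positions labeled ignore_id contribute no
    loss (the "ignore" handling the fix above concerns)."""
    n, vocab_size = probs.shape
    smoothed = np.full_like(probs, alpha / (vocab_size - 1))
    smoothed[np.arange(n), labels] = 1.0 - alpha
    token_loss = -(smoothed * np.log(np.maximum(probs, 1e-10))).sum(axis=1)
    mask = labels != ignore_id
    return token_loss[mask].sum() / max(mask.sum(), 1)
```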
## [1.11.0]

### Added
- Lexicon-based target vocabulary restriction for faster decoding: a new CLI for top-k lexicon creation, `sockeye.lexicon`, and a new translate CLI argument, `--restrict-lexicon` (see the sketch below).
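To illustrate the idea (a toy sketch, not Sockeye's implementation; `topk_lexicon` and `always_allowed` are hypothetical names), the decoder scores only target ids that some source token maps to in a top-k lexicon, plus a small always-allowed set:

```python
def restricted_target_vocab(source_ids, topk_lexicon, always_allowed):
    """Toy sketch of lexicon-based vocabulary restriction: the union of
    the top-k translations of each source token, plus special ids such
    as BOS/EOS/UNK, gives the reduced target vocabulary to decode over."""
    allowed = set(always_allowed)
    for sid in source_ids:
        allowed.update(topk_lexicon.get(sid, ()))
    return sorted(allowed)

# Example: restrict to ids translatable from source tokens [4, 17].
vocab = restricted_target_vocab([4, 17],
                                {4: (7, 9), 17: (9, 12)},
                                always_allowed=(0, 1, 2))
# -> [0, 1, 2, 7, 9, 12]
```

Decoding over this reduced vocabulary shrinks the output softmax, which is where much of the speedup comes from.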