This repository has been archived by the owner on Aug 10, 2023. It is now read-only.
Pre-Release v0.1.1
The bias in MultiHeadAttn is removed in this release.
Only the parameters of the trained model are saved, rather than the state dict. A model trained with v0.1.0 cannot be loaded as fine_tune_m without additional conversion.
Support for RNMT is attempted, but the recurrent model is slow because it parallelizes less efficiently.
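
The conversion needed for v0.1.0 checkpoints is not described in these notes. As a rough sketch only, the snippet below assumes the new format is a list of parameter values in the order the model declares its parameters, which is an assumption rather than something the release states. The function name `state_dict_to_param_list` and all entry names are hypothetical, and plain lists stand in for tensors so the example runs without any dependency.

```python
from collections import OrderedDict


def state_dict_to_param_list(state_dict, param_names):
    """Return the values of `state_dict` in the order given by `param_names`.

    Entries that the new model does not declare as parameters
    (for example removed biases or buffers) are dropped.
    """
    missing = [name for name in param_names if name not in state_dict]
    if missing:
        raise KeyError("state dict lacks parameters: " + ", ".join(missing))
    return [state_dict[name] for name in param_names]


# Toy stand-in for a v0.1.0 checkpoint; all names are hypothetical.
old_checkpoint = OrderedDict([
    ("enc.attn.weight", [0.1, 0.2]),
    ("enc.attn.bias", [0.0]),       # bias entry the new model no longer has
    ("enc.pos_table", [1.0, 2.0]),  # buffer, not a trainable parameter
    ("dec.out.weight", [0.3]),
])

# Parameter names of the new model, in declaration order (assumed).
new_param_names = ["enc.attn.weight", "dec.out.weight"]

converted = state_dict_to_param_list(old_checkpoint, new_param_names)
```

The function only uses mapping lookups, so it should work unchanged on a real PyTorch state dict, with `new_param_names` taken from `[name for name, _ in model.named_parameters()]`. Whether the resulting list matches what fine_tune_m expects would need to be checked against the repository's own loading code.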