This repository has been archived by the owner on Aug 10, 2023. It is now read-only.

Releases: hfxunlp/transformer

v0.2.8

02 Apr 05:14
Pre-release

In this release, we:
support relative position in self-attention;
add cnfg/hyp.py and support cached data compression;
support the Swish activation function (a minimal sketch follows this list) and several optimizers (optm/);
implement the context-aware Transformer proposed in Improving the Transformer Translation Model with Document-Level Context;
replace the serialization backend from torch to utils.h5serial, which is based on h5py.
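
The repository's own Swish module lives in its activation/optimizer code and is not reproduced here; the following is only a minimal, self-contained sketch of the Swish function, f(x) = x * sigmoid(x), used as a drop-in replacement for ReLU.

```python
import torch
import torch.nn as nn

class Swish(nn.Module):
    """Swish activation: f(x) = x * sigmoid(x)."""
    def forward(self, x):
        return x * torch.sigmoid(x)

# Drop-in replacement for ReLU inside a feed-forward block.
ffn = nn.Sequential(nn.Linear(512, 2048), Swish(), nn.Linear(2048, 512))
out = ffn(torch.randn(8, 512))
```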

v0.2.7

25 Dec 12:20
Pre-release

Bye 2019 ;-)

v0.2.6

25 Oct 10:42
Pre-release

In this release, we:
clean the code by reducing common lines into utils;
fix bugs (transformer.SC and the inconsistent order of ranking);
add an argument to disable the bias of the second linear layer of PositionwiseFFN (see the sketch after this list);
add parallel.parallelMTFP, which accelerates fine-tuning when part of the parameters are frozen;
add loss.NLLLoss as an additional option for ranking.
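
A minimal sketch of what the new argument does, assuming a hypothetical `proj_bias` flag rather than the project's actual argument name: it simply toggles the bias of the second (output) linear layer of the position-wise feed-forward network.

```python
import torch
import torch.nn as nn

class PositionwiseFFN(nn.Module):
    # `proj_bias` is a hypothetical name for the new argument; it toggles
    # the bias of the second (output) linear layer only.
    def __init__(self, isize, hsize, dropout=0.1, proj_bias=False):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(isize, hsize),
            nn.ReLU(inplace=True),
            nn.Dropout(dropout),
            nn.Linear(hsize, isize, bias=proj_bias),
        )

    def forward(self, x):
        return self.net(x)

ffn = PositionwiseFFN(512, 2048)  # the second linear layer has no bias
out = ffn(torch.randn(4, 16, 512))
```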

v0.2.5

12 Sep 07:51
Pre-release

In this release, we:
update the implementation of parallel modules to fix the synchronization of gradient state across GPUs and to support torch.jit;
add support for FP16 training based on APEX (illustrated below), which is only useful on newer GPUs;
set the computation order of Transformer v2 as the default, since it normally performs better on large datasets;
add experimental support for sentential context under transformer/SC/.
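
The project's training scripts are not shown here; the following is a generic APEX mixed-precision sketch, assuming NVIDIA APEX is installed, where the model, optimizer and loss are placeholders rather than this repository's actual objects.

```python
import torch
from apex import amp  # requires NVIDIA APEX: https://github.com/NVIDIA/apex

# Placeholder model/optimizer/criterion, not the project's actual objects.
model = torch.nn.Linear(512, 512).cuda()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
criterion = torch.nn.MSELoss()

# opt_level "O1" enables automatic mixed precision.
model, optimizer = amp.initialize(model, optimizer, opt_level="O1")

src = torch.randn(8, 512, device="cuda")
tgt = torch.randn(8, 512, device="cuda")

optimizer.zero_grad()
loss = criterion(model(src), tgt)
# Scale the loss so FP16 gradients do not underflow.
with amp.scale_loss(loss, optimizer) as scaled_loss:
    scaled_loss.backward()
optimizer.step()
```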

v0.2.4

10 Jul 10:19
Pre-release

fix the training script;
apply a new parameter initialization method (fully managed by this project rather than relying on PyTorch defaults); a generic sketch follows.
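
The project's own initialization scheme is not documented in these notes; the sketch below only illustrates the general pattern of taking over parameter initialization instead of using PyTorch defaults, with Xavier-uniform as an assumed placeholder scheme.

```python
import math
import torch.nn as nn

def init_model_params(model):
    # Placeholder scheme (Xavier-uniform weights, zero biases); the method
    # actually used by the project may differ.
    for p in model.parameters():
        if p.dim() > 1:
            nn.init.xavier_uniform_(p, gain=1.0 / math.sqrt(2.0))
        else:
            nn.init.zeros_(p)

model = nn.Sequential(nn.Linear(512, 2048), nn.ReLU(), nn.Linear(2048, 512))
init_model_params(model)
```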

v0.2.3

24 Apr 08:33
Pre-release

support sorting very large training sets with limited memory (a generic sketch follows this list);
fix ensembling on GPU and the embedding tool;
remove sortgrad.
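
The project's own sorting tool is not reproduced here; the following is a generic external merge-sort sketch with bounded memory, where the file names, the sort key and the chunk size are all placeholder assumptions.

```python
import heapq
import tempfile
from contextlib import ExitStack
from itertools import islice

def external_sort(src_path, dst_path, key=len, chunk_lines=1_000_000):
    """Sort the lines of a large file with bounded memory."""
    chunk_files = []
    with open(src_path, encoding="utf-8") as src:
        while True:
            chunk = list(islice(src, chunk_lines))
            if not chunk:
                break
            # Sort each chunk in memory and spill it to a temporary file.
            chunk.sort(key=key)
            tmp = tempfile.TemporaryFile(mode="w+", encoding="utf-8")
            tmp.writelines(chunk)
            tmp.seek(0)
            chunk_files.append(tmp)
    with ExitStack() as stack, open(dst_path, "w", encoding="utf-8") as dst:
        for tmp in chunk_files:
            stack.enter_context(tmp)
        # Merge the per-chunk sorted streams lazily.
        dst.writelines(heapq.merge(*chunk_files, key=key))

# Example (placeholder file names): sort a source-side corpus by line length.
# external_sort("train.src", "train.sorted.src", key=len)
```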

v0.2.2

02 Apr 09:39
Compare
Choose a tag to compare
v0.2.2 Pre-release
Pre-release

fix the mask cache behavior of decoders;
small enhancements to the vocabulary size tool, typing and the README.

v0.2.1

20 Mar 08:58
Pre-release

fix single-GPU selection in multi-GPU configurations;
simplify Multi-Head Attention;
update the default settings for the WMT 14 EN -> DE task.

v0.2.0

19 Mar 10:50
Pre-release

Update the documentation / add references;
First official release.

Pre-Release v0.1.7

11 Mar 16:16
Pre-release

Set the Transformer model from the original paper as the default for better performance.