Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
pass batch_dim_idx to deepspeed sequence parallel distributed attention for supporting batch size larger than 1 (microsoft#433)

Co-authored-by: Jinghan Yao <[email protected]>