buaahsh | bc140c65bb | fx bert moe | 2023-03-05 07:43:58 +00:00 |
| shumingma | 32cb51ae38 | v0.1.2 | 2023-03-04 01:11:34 -08:00 |
| Shuming Ma | 27d818674f | Merge pull request #19 from buaahsh/patch-1: Update README.md | 2023-03-04 10:55:52 +08:00 |
| Shaohan Huang | cbdbc1dfc8 | Update README.md | 2023-03-04 07:37:01 +08:00 |
| shumingma | 20c1e6c611 | Bert MoE | 2023-03-02 02:54:19 -08:00 |
| shumingma | 0cb9695501 | Remove inplace | 2023-01-18 22:44:26 -08:00 |
| shumingma | 9f105b591d | Support Pytorch LayerNorm | 2023-01-16 20:17:28 -08:00 |
| Shuming Ma | 82f140a6c4 | Merge pull request #12 from microsoft/bsz: Batch size first | 2023-01-16 20:07:52 +08:00 |
| shumingma | 1a5d2c26fe | Batch size first | 2023-01-05 01:19:51 -08:00 |
| Shuming Ma | 776b070d68 | Merge pull request #11 from microsoft/xpos: Adding the official implementation of Xpos (https://arxiv.org/abs/2212.10554) | 2023-01-05 11:08:13 +08:00 |
| shumingma | 9d968a24ed | Update XPos | 2023-01-03 22:54:24 -08:00 |
| shumingma | f9d98f4b68 | Add XPOS | 2022-12-29 20:48:43 -08:00 |
| shumingma | aa36203042 | Fix multiway checkpointing | 2022-12-27 22:32:02 -08:00 |
| gitnlp | 22438a8525 | Update README.md | 2022-12-23 08:26:08 +08:00 |
| Shuming Ma | 21ed0056d7 | Merge pull request #9 from MatthewChang/fix_output_projection_decoder: fix a bug that overrides the default constructed output_projection | 2022-12-22 11:19:25 +08:00 |
| Matthew Chang | adcd995595 | fix a bug which overrides the default constructed output_projection when none is passed in | 2022-12-21 16:24:44 -06:00 |
| shumingma | 7e12b582e4 | Support latest fairseq | 2022-12-15 03:44:15 -08:00 |
| shumingma | 2518ea030c | Fix example fsdp | 2022-12-08 04:20:27 -08:00 |
| Shuming Ma | 6d62bbbf67 | Merge pull request #8 from buaahsh/main: don't need attn weight in decoder | 2022-12-06 19:06:26 +08:00 |
| buaahsh | 2005ab1f26 | don't need attn weight in decoder | 2022-12-06 18:31:17 +08:00 |
| shumingma | be167b3dda | Add an example for vocab | 2022-12-01 20:40:09 -08:00 |
| shumingma | 7b29d32f03 | Remove unused parameters | 2022-11-29 21:36:03 -08:00 |
| Shuming Ma | 5adbe971cf | Merge pull request #5 from kashif/typo: fix typo | 2022-11-29 18:00:13 +08:00 |
| Kashif Rasul | e8be99f8f1 | fix typo | 2022-11-29 10:48:56 +01:00 |
| Shuming Ma | 559b5fdf56 | Merge pull request #4 from kashif/kashif-patch-1: remove lambda | 2022-11-29 12:21:53 +08:00 |
| Kashif Rasul | c69aba2a73 | fix call to activation_fn | 2022-11-29 00:11:38 +01:00 |
| Kashif Rasul | be14bc23a1 | typo | 2022-11-29 00:11:02 +01:00 |
| Kashif Rasul | e7d5ec2ad7 | remove lambda | 2022-11-29 00:02:26 +01:00 |
| gitnlp | c0ad46d7b8 | Update README.md | 2022-11-28 22:29:46 +08:00 |
| gitnlp | 800ea8d39f | Update README.md | 2022-11-27 22:45:31 +08:00 |
| gitnlp | 8dd8055826 | Update README.md | 2022-11-27 22:38:02 +08:00 |
| shumingma | 7eca1a531c | Code reformatting | 2022-11-26 09:01:02 -08:00 |
| shumingma | 1354614d44 | Update config file | 2022-11-26 08:15:08 -08:00 |
| shumingma | 994e4665a2 | flake8 lint checks | 2022-11-26 08:10:15 -08:00 |
| shumingma | 4714557e89 | Update features section | 2022-11-24 20:42:10 -08:00 |
| shumingma | 5cbb7980a9 | Add features section | 2022-11-24 01:06:46 -08:00 |
| Li Dong | afd9094fb5 | Merge pull request #1 from buaahsh/main: Fix decoder_embed_dim in Fairseq example | 2022-11-24 15:54:50 +08:00 |
| Shaohan Huang | bdf759f116 | decoder_embed_dim -> args.decoder_embed_dim | 2022-11-24 14:30:39 +08:00 |
| Li Dong | 1fce6ee98b | Update README.md | 2022-11-24 13:51:25 +08:00 |
| shumingma | be3cf93e84 | Add paper link | 2022-11-23 21:44:52 -08:00 |
| shumingma | 05636d0eb4 | change pic path | 2022-11-23 20:33:29 -08:00 |
| shumingma | 79284f5e8a | Merge branch 'main' of https://github.com/microsoft/torchscale into main | 2022-11-23 20:31:00 -08:00 |
| Li Dong | 78f6e8a205 | Update README.md: xmoe bibtex | 2022-11-23 20:29:19 -08:00 |
| gitnlp | 5042ce960d | Update README.md | 2022-11-23 20:28:29 -08:00 |
| shumingma | ec24e55f6a | update pic path | 2022-11-23 20:25:12 -08:00 |
| Li Dong | 660a291402 | Update README.md: xmoe bibtex | 2022-11-24 11:40:38 +08:00 |
| gitnlp | 51abba7c8b | Update README.md | 2022-11-24 09:29:34 +08:00 |
| shumingma | 65fe50f466 | update copyright | 2022-11-23 08:36:55 -08:00 |
| shumingma | ede048831f | torchscale released | 2022-11-23 08:21:58 -08:00 |
| shumingma | 41f6ee5687 | Update README.md | 2022-11-17 01:18:20 -08:00 |
|