From 51abba7c8be0b6f37bfac1df820535f2320d21f7 Mon Sep 17 00:00:00 2001
From: gitnlp <36983436+gitnlp@users.noreply.github.com>
Date: Thu, 24 Nov 2022 09:29:34 +0800
Subject: [PATCH 1/2] Update README.md
---
README.md | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/README.md b/README.md
index 9e5c774..1d2a6fb 100644
--- a/README.md
+++ b/README.md
@@ -5,8 +5,8 @@
-TorchScale is a PyTorch library that allows researchers and developeres to scale up Transformers efficiently and effectively.
-It has the implemetention of fundamental research to improve modeling generality and capability, as well as training stability and efficiency of scaling Transformers.
+TorchScale is a PyTorch library that allows researchers and developers to scale up Transformers efficiently and effectively.
+It has the implementation of fundamental research to improve modeling generality and capability, as well as training stability and efficiency of scaling Transformers.
- Stability - [**DeepNet**](https://arxiv.org/abs/2203.00555): scaling Transformers to 1,000 Layers and beyond
- Generality - [**Foundation Transformers (Magneto)**](https://arxiv.org/abs/2210.06423)
@@ -192,4 +192,4 @@ This project may contain trademarks or logos for projects, products, or services
trademarks or logos is subject to and must follow
[Microsoft's Trademark & Brand Guidelines](https://www.microsoft.com/en-us/legal/intellectualproperty/trademarks/usage/general).
Use of Microsoft trademarks or logos in modified versions of this project must not cause confusion or imply Microsoft sponsorship.
-Any use of third-party trademarks or logos are subject to those third-party's policies.
\ No newline at end of file
+Any use of third-party trademarks or logos are subject to those third-party's policies.
From 660a2914029e2224669ad30ae0461a7b72b25b4c Mon Sep 17 00:00:00 2001
From: Li Dong
Date: Thu, 24 Nov 2022 11:40:38 +0800
Subject: [PATCH 2/2] Update README.md
Update the xmoe BibTeX entry to the NeurIPS 2022 version
---
README.md | 21 ++++++---------------
1 file changed, 6 insertions(+), 15 deletions(-)
diff --git a/README.md b/README.md
index 1d2a6fb..f4bd087 100644
--- a/README.md
+++ b/README.md
@@ -154,21 +154,12 @@ If you find this repository useful, please consider citing our work:
```
```
-@article{xmoe,
- author = {Zewen Chi and
- Li Dong and
- Shaohan Huang and
- Damai Dai and
- Shuming Ma and
- Barun Patra and
- Saksham Singhal and
- Payal Bajaj and
- Xia Song and
- Furu Wei},
- title = {On the Representation Collapse of Sparse Mixture of Experts},
- journal = {CoRR},
- volume = {abs/2204.09179},
- year = {2022}
+@inproceedings{xmoe,
+ title={On the Representation Collapse of Sparse Mixture of Experts},
+ author={Zewen Chi and Li Dong and Shaohan Huang and Damai Dai and Shuming Ma and Barun Patra and Saksham Singhal and Payal Bajaj and Xia Song and Xian-Ling Mao and Heyan Huang and Furu Wei},
+ booktitle={Advances in Neural Information Processing Systems},
+ year={2022},
+ url={https://openreview.net/forum?id=mWaYC6CZf5}
}
```