Update README.md
README.md (changed)

@@ -25,7 +25,7 @@ The table below lists the 8B/8B-Chat model that has completed training on 1.1T tokens

| Model Name | Description | #Param |Huggingface |
|----------------|-------------------------------------------------|----------|-------------|
-| **OpenMoE-8B(1.1T)** | 8B MoE with comparable FLOPs of a
+| **OpenMoE-8B(1.1T)** | 8B MoE with comparable FLOPs of a 2B LLaMA(No SFT) |8B |[Link](https://huggingface.co/OrionZheng/openmoe-8b) |
| **OpenMoE-8B-Chat (1.1T+SFT)** | OpenMoE-8B-1.1T supervised finetuned on the [WildChat GPT-4 Subset](https://huggingface.co/datasets/allenai/WildChat-nontoxic) |8B |[Link](https://huggingface.co/OrionZheng/openmoe-8b-chat) |

@@ -34,11 +34,11 @@ Besides, we also provide all our intermediate checkpoints(base, 8B, 34B) for research

| Model Name | Description | #Param |Huggingface |
|----------------|-------------------------------------------------|----------|-------------|
| **OpenMoE-34B-200B** | 34B MoE with comparable FLOPs of a 7B LLaMA(No SFT) |34B |[Link](https://huggingface.co/OrionZheng/openmoe-34b-200B) |
-| OpenMoE-8B-200B | 8B MoE with comparable FLOPs of a
-| OpenMoE-8B-400B | 8B MoE with comparable FLOPs of a
-| OpenMoE-8B-600B | 8B MoE with comparable FLOPs of a
-| OpenMoE-8B-800B | 8B MoE with comparable FLOPs of a
-| OpenMoE-8B-1T | 8B MoE with comparable FLOPs of a
+| OpenMoE-8B-200B | 8B MoE with comparable FLOPs of a 2B LLaMA(No SFT) |8B |[Link](https://huggingface.co/OrionZheng/openmoe-8b-200B) |
+| OpenMoE-8B-400B | 8B MoE with comparable FLOPs of a 2B LLaMA(No SFT) |8B |[Link](https://huggingface.co/OrionZheng/openmoe-8b-400B) |
+| OpenMoE-8B-600B | 8B MoE with comparable FLOPs of a 2B LLaMA(No SFT) |8B |[Link](https://huggingface.co/OrionZheng/openmoe-8b-600B) |
+| OpenMoE-8B-800B | 8B MoE with comparable FLOPs of a 2B LLaMA(No SFT) |8B |[Link](https://huggingface.co/OrionZheng/openmoe-8b-800B) |
+| OpenMoE-8B-1T | 8B MoE with comparable FLOPs of a 2B LLaMA(No SFT) |8B |[Link](https://huggingface.co/OrionZheng/openmoe-8b-1T) |
| OpenMoE-base(128B) | A small MoE model for debugging only |637M |[Link](https://huggingface.co/OrionZheng/openmoe-base) |
| OpenLLaMA-base(128B) | A dense counter-part of OpenMoE-base |310M |[Link](https://huggingface.co/fuzhao/OpenLLaMA_Base) |
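
The tables above only list the checkpoints and their Hugging Face links. As a quick orientation, the sketch below shows one way such a checkpoint might be loaded for inference. It is not part of this diff: it assumes the Hub repos include tokenizer files and a remote-code `transformers` implementation (hence `trust_remote_code=True`), and the prompt is purely illustrative.

```python
# Hedged sketch: loading one of the checkpoints listed above with Hugging Face
# transformers. Assumes (not confirmed by this diff) that the repo ships its own
# modeling code, so trust_remote_code=True is passed to both loaders.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "OrionZheng/openmoe-8b-chat"  # repo id taken from the table above

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

prompt = "What is a mixture-of-experts language model?"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```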