JunHowie committed
Commit d7af091 · verified · 1 parent: e739e6c

Update README.md

Files changed (1): README.md (+14 −14)
README.md CHANGED
@@ -13,17 +13,17 @@ base_model:
 - Qwen/Qwen3-Coder-480B-A35B-Instruct
 base_model_relation: quantized
 ---
-# 通义千问3-Coder-480B-A35B-Instruct-GPTQ-Int4-Int8Mix
-基础型 [Qwen/Qwen3-Coder-480B-A35B-Instruct](https://www.modelscope.cn/models/Qwen/Qwen3-Coder-480B-A35B-Instruct)
+# Qwen3-Coder-480B-A35B-Instruct-GPTQ-Int4-Int8Mix
+Base model [Qwen/Qwen3-Coder-480B-A35B-Instruct](https://huggingface.co/Qwen/Qwen3-Coder-480B-A35B-Instruct)
 
 
-### 【Vllm 单机8卡启动命令】
-<i>注: 8卡启动一定要跟`--enable-expert-parallel` 否则该模型专家张量TP整除除不尽;4卡则不需要。 </i>
+### 【vLLM Launch Command for 8 GPUs (Single Node)】
+<i>Note: When launching with 8 GPUs, `--enable-expert-parallel` must be specified; otherwise the model's expert tensors cannot be evenly split across tensor-parallel ranks. This option is not required for 4-GPU setups.</i>
 ```
 CONTEXT_LENGTH=32768 # 262144
 
 vllm serve \
-tclf90/Qwen3-Coder-480B-A35B-Instruct-GPTQ-Int4-Int8Mix \
+QuantTrio/Qwen3-Coder-480B-A35B-Instruct-GPTQ-Int4-Int8Mix \
 --served-model-name Qwen3-Coder-480B-A35B-Instruct-GPTQ-Int4-Int8Mix \
 --enable-expert-parallel \
 --swap-space 16 \
@@ -38,35 +38,35 @@ vllm serve \
 --port 8000
 ```
 
-### 【依赖】
+### 【Dependencies】
 
 ```
 vllm>=0.9.2
 ```
 
-### 【模型更新日期】
+### 【Model Update History】
 ```
 2025-07-24
-1. 首次commit
+1. Initial commit
 ```
 
-### 【模型列表】
+### 【Model Files】
 
-| 文件大小 | 最近更新时间 |
+| File Size | Last Updated |
 |---------|--------------|
 | `261GB` | `2025-07-24` |
 
 
 
-### 【模型下载】
+### 【Model Download】
 
 ```python
-from modelscope import snapshot_download
-snapshot_download('tclf90/Qwen3-Coder-480B-A35B-Instruct-GPTQ-Int4-Int8Mix', cache_dir="本地路径")
+from huggingface_hub import snapshot_download
+snapshot_download('QuantTrio/Qwen3-Coder-480B-A35B-Instruct-GPTQ-Int4-Int8Mix', cache_dir="your_local_path")
 ```
 
 
-### 【介绍】
+### 【Description】
 
 # Qwen3-Coder-480B-A35B-Instruct
 <a href="https://chat.qwen.ai/" target="_blank" style="margin: 2px;">
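The `vllm serve` command in the updated README exposes an OpenAI-compatible HTTP API on the configured `--port 8000`. As a minimal sketch (not part of the commit), this builds a request payload for the `/v1/chat/completions` route that vLLM's server implements; the model name must match the README's `--served-model-name`, and the prompt is an arbitrary placeholder:

```python
import json

# Payload for vLLM's OpenAI-compatible chat endpoint; "model" must equal
# the --served-model-name value from the README's launch command.
payload = {
    "model": "Qwen3-Coder-480B-A35B-Instruct-GPTQ-Int4-Int8Mix",
    "messages": [
        {"role": "user", "content": "Write a function that reverses a string."}
    ],
    "max_tokens": 256,
}

# Once the server is up, POST this with any HTTP client, e.g.:
#   curl http://localhost:8000/v1/chat/completions \
#        -H "Content-Type: application/json" \
#        -d "$(python this_script.py)"
print(json.dumps(payload, indent=2))
```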