XuebinWang committed on
Commit b614283 · verified · 1 Parent(s): e59de3c

update model with several fixings (#5)


- remove outdated files and update accuracy numbers (ddd719bed59f5138a919973e2c2d1d5f7dc5f555)

Files changed (1)
  1. README.md +3 -3
README.md CHANGED
@@ -12,7 +12,7 @@ base_model: mistralai/Mixtral-8x7B-Instruct-v0.1
 
 ## Quantization Strategy
 
-- ***Quantized Layers***: All linear layers excluding "lm_head", "*.gate"
+- ***Quantized Layers***: All linear layers excluding `lm_head`, `*.gate`
 - ***Weight***: Auto Mixed Precision quantized by Quark; each weight is assigned one of the following candidate schemes:
   - FP8 symmetric per-tensor
   - OCP Microscaling (MX) FP4
@@ -39,9 +39,9 @@ The quantization evaluation results are conducted in pseudo-quantization mode, w
 | Quant scheme | arc challenge (↑)<br>(acc) | | gsm8k (↑)<br>(strict-match) | | mmlu (↑)<br>(acc) | | wikitext (↓)<br>(word_perplexity) | | winogrande (↑)<br>(acc) | |
 |--------------|-----------------------------|-------|-----------------------------|-------|-------------------|-------|-----------------------------------|-------|--------------------------|-------|
 | | absolute value | recovery rate | absolute value | recovery rate | absolute value | recovery rate | absolute value | recovery rate | absolute value | recovery rate |
-| **FP16** | 0.6271 | 100.0% | 0.6384 | 100.0% | 0.6893 | 100.0% | 5.7259 | 100.0% | 0.7664 | 100.0% |
+| **BF16** | 0.6271 | 100.0% | 0.6384 | 100.0% | 0.6893 | 100.0% | 5.7259 | 100.0% | 0.7664 | 100.0% |
 | **FP8** | 0.6220 | 99.2% | 0.6384 | 100.0% | 0.6816 | 98.9% | 5.8092 | 98.6% | 0.7830 | 102.2% |
-| <span style="background-color:#5afc65; font-weight:bold;">AMP</span> | 0.6237 | 99.5% | 0.6133 | 96.1% | 0.6765 | 98.1% | 6.2020 | 92.3% | 0.7569 | 98.8% |
+| <span style="background-color:#5afc65; font-weight:bold;">AMP</span> | 0.6195 | 98.8% | 0.6277 | 98.3% | 0.6806 | 98.7% | 6.2026 | 92.3% | 0.7719 | 100.7% |
 | **MXFP4** | 0.5922 | 94.4% | 0.4845 | 75.9% | 0.6364 | 92.3% | 7.0935 | 80.7% | 0.7474 | 97.5% |
 
 #### License
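The README text being diffed evaluates in "pseudo-quantization mode", i.e. weights are quantized and immediately dequantized so the rest of the model runs in full precision. A minimal sketch of the FP8-symmetric-per-tensor branch of that round trip is below; the function name and the simplified E4M3 rounding (no subnormal or NaN handling) are illustrative assumptions, not Quark's actual implementation:

```python
import math

def fp8_e4m3_pseudo_quant(weights):
    """Quantize-dequantize a weight list with a symmetric per-tensor
    scale onto an approximate FP8 E4M3 grid (simplified: ignores
    subnormals and NaN encodings)."""
    max_e4m3 = 448.0  # largest finite E4M3 magnitude
    # One shared scale maps the largest |w| onto the FP8 max value.
    scale = max(abs(w) for w in weights) / max_e4m3
    out = []
    for w in weights:
        x = w / scale
        mant, exp = math.frexp(x)          # x = mant * 2**exp, |mant| in [0.5, 1)
        mant = round(mant * 16.0) / 16.0   # keep 3 mantissa bits (+ implicit bit)
        x = max(-max_e4m3, min(max_e4m3, math.ldexp(mant, exp)))
        out.append(x * scale)              # dequantize back to full precision
    return out
```

Comparing `out` against the input weights gives the per-tensor rounding error that pseudo-quantization injects; the benchmark deltas in the table above are the downstream effect of exactly this kind of perturbation.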