update model with several fixings (#5)
- remove outdated files and update accuracy numbers (ddd719bed59f5138a919973e2c2d1d5f7dc5f555)
README.md
CHANGED
@@ -12,7 +12,7 @@ base_model: mistralai/Mixtral-8x7B-Instruct-v0.1
 
 - ## Quantization Stragegy
 
-- ***Quantized Layers***: All linear layers excluding
+- ***Quantized Layers***: All linear layers excluding `lm_head`, `*.gate`
 - ***Weight***: Auto Mixed Precision quantized by Quark, each weight has either quantization scheme in candidates of
 - FP8 symmetric per-tensor
 - OCP Microscaling (MX) FP4
@@ -39,9 +39,9 @@ The quantization evaluation results are conducted in pseudo-quantization mode, w
 | Quant scheme | arc challenge (β)<br>(acc) | | gsm8k (β)<br>(strict-match) | | mmlu (β)<br>(acc) | | wikitext (β)<br>(word_perplexity) | | winogrande (β)<br>(acc) | |
 |--------------|-----------------------------|-------|-----------------------------|-------|-------------------|-------|-----------------------------------|-------|--------------------------|-------|
 | | absolute value | recovery rate | absolute value | recovery rate | absolute value | recovery rate | absolute value | recovery rate | absolute value | recovery rate |
-| **
+| **BF16** | 0.6271 | 100.0% | 0.6384 | 100.0% | 0.6893 | 100.0% | 5.7259 | 100.0% | 0.7664 | 100.0% |
 | **FP8** | 0.6220 | 99.2% | 0.6384 | 100.0% | 0.6816 | 98.9% | 5.8092 | 98.6% | 0.7830 | 102.2% |
-| <span style="background-color:#5afc65; font-weight:bold;">AMP</span> | 0.
+| <span style="background-color:#5afc65; font-weight:bold;">AMP</span> | 0.6195 | 98.8% | 0.6277 | 98.3% | 0.6806 | 98.7% | 6.2026 | 92.3% | 0.7719 | 100.7% |
 | **MXFP4** | 0.5922 | 94.4% | 0.4845 | 75.9% | 0.6364 | 92.3% | 7.0935 | 80.7% | 0.7474 | 97.5% |
 
 #### License
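The diff does not define the "recovery rate" columns of the updated table. The published numbers are consistent with a simple ratio against the BF16 row: quantized / baseline for the accuracy-style metrics, and baseline / quantized for wikitext word perplexity, where lower is better. The following is a minimal Python sketch under that assumption; the function and dictionary names are illustrative and are not part of Quark or the README.

```python
def recovery_rate(baseline: float, quantized: float, lower_is_better: bool = False) -> float:
    """Return the recovery rate in percent, rounded to one decimal place.

    Assumed definition (not stated in the diff):
      - higher-is-better metrics: quantized / baseline
      - lower-is-better metrics (perplexity): baseline / quantized
    """
    ratio = baseline / quantized if lower_is_better else quantized / baseline
    return round(100.0 * ratio, 1)


# Absolute values copied from the BF16 and AMP rows of the table.
BF16 = {"arc_challenge": 0.6271, "gsm8k": 0.6384, "mmlu": 0.6893,
        "wikitext": 5.7259, "winogrande": 0.7664}
AMP = {"arc_challenge": 0.6195, "gsm8k": 0.6277, "mmlu": 0.6806,
       "wikitext": 6.2026, "winogrande": 0.7719}

# Only wikitext reports perplexity; every other column is an accuracy-style score.
LOWER_IS_BETTER = {"wikitext"}

amp_recovery = {
    task: recovery_rate(BF16[task], AMP[task], task in LOWER_IS_BETTER)
    for task in BF16
}
print(amp_recovery)
```

Under this assumed definition the computed values agree with the percentages listed in the AMP row, which also explains how a rate above 100% (winogrande) can occur when the quantized model scores slightly higher than the baseline.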