Commit 5c39353 (verified) by Qubitium · parent e121bb6 · Update README.md
---
license: apache-2.0
base_model:
- meta-llama/Llama-3.1-8B-Instruct
tags:
- gptqmodel
- gptq
- v2
---

## Simple Llama 3.1 8B-Instruct model quantized with GPTQ v2 using 256 rows of C2/en calibration data

This is not a production-ready quantized model; it exists to evaluate GPTQ v1 vs GPTQ v2 in post-quantization comparisons.

The GPTQ v1 counterpart is hosted at: https://huggingface.co/ModelCloud/GPTQ-v1-Llama-3.1-8B-Instruct

## Eval script using GPTQModel (main branch) with the Marlin kernel + lm-eval (main branch)

```py
# eval
import tempfile

# imports assumed from GPTQModel (main branch) and lm-eval (main branch)
from gptqmodel import GPTQModel
from gptqmodel.utils.eval import EVAL
from lm_eval.utils import make_table

QUANT_SAVE_PATH = "..."  # path to this quantized model

with tempfile.TemporaryDirectory() as tmp_dir:
    results = GPTQModel.eval(
        QUANT_SAVE_PATH,
        tasks=[EVAL.LM_EVAL.ARC_CHALLENGE, EVAL.LM_EVAL.GSM8K_PLATINUM_COT],
        apply_chat_template=True,
        random_seed=898,
        output_path=tmp_dir,
    )

    print(make_table(results))
    if "groups" in results:
        print(make_table(results, "groups"))
```

| Tasks              | Version | Filter           | n-shot | Metric      |   |  Value |   | Stderr |
|--------------------|--------:|------------------|-------:|-------------|---|-------:|---|-------:|
| arc_challenge      |       1 | none             |      0 | acc         | ↑ | 0.5034 | ± | 0.0146 |
|                    |         | none             |      0 | acc_norm    | ↑ | 0.5068 | ± | 0.0146 |
| gsm8k_platinum_cot |       3 | flexible-extract |      8 | exact_match | ↑ | 0.7601 | ± | 0.0123 |
|                    |         | strict-match     |      8 | exact_match | ↑ | 0.5211 | ± | 0.0144 |
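
Since this quant exists for a v1-vs-v2 comparison, here is a minimal sketch (not part of the model card's methodology) of judging whether two lm-eval scores differ meaningfully, using a two-sample z-score built from the reported standard errors. The v1 value below is a placeholder, not a measured number — substitute the score from the GPTQ v1 model card.

```python
import math

def z_score_diff(acc_a: float, se_a: float, acc_b: float, se_b: float) -> float:
    """z-score for the difference between two independent accuracy
    estimates, each reported with its standard error."""
    return (acc_a - acc_b) / math.sqrt(se_a ** 2 + se_b ** 2)

# v2 arc_challenge acc from the table above; 0.4900 is a placeholder
# standing in for the v1 score.
z = z_score_diff(0.5034, 0.0146, 0.4900, 0.0146)
print(f"z = {z:.2f}")  # |z| < 1.96 means not significant at p < 0.05
```

With stderr around 0.0146 on both sides, accuracy gaps of roughly 4 points or less on arc_challenge fall inside the noise, which is worth keeping in mind when reading v1-vs-v2 tables.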