kmouratidis committed on
Commit 3c1cd8c · verified · 1 parent: 9ade1a5

Update README.md

Files changed (1)
  1. README.md +9 -2
README.md CHANGED
@@ -3,7 +3,7 @@ license: apache-2.0
 base_model:
 - Qwen/Qwen3-32B
 ---
-# Qwen3-32B-AWQ-w4-GEMM-sc
+# Qwen3-32B-AWQ-GEMM-sc
 
 Original Model: https://huggingface.co/Qwen/Qwen3-32B
 
@@ -48,4 +48,11 @@ model.quantize(tokenizer, quant_config=quant_config)
 quant_path = './Qwen3-32B-AWQ-4bit-GEMM-sc'
 model.save_quantized(quant_path)
 tokenizer.save_pretrained(quant_path)
-```
+```
+
+## Final notes
+
+The quant appears to be significantly degraded. I'm trying one more quantization
+with 128 samples, a different dataset (HuggingFaceTB/cosmopedia-100k), and a longer
+max sequence length (40960). It will be ready in a few hours, and I'll upload it here:
+https://huggingface.co/kmouratidis/Qwen3-32B-AWQ-GEMM-lc
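
Note (not part of the commit): the re-quantization described in the final notes could be sketched roughly as below. This is an assumption-laden sketch, not the author's actual script. It assumes a recent AutoAWQ release whose `quantize` accepts `calib_data`, `split`, `text_column`, `max_calib_samples`, and `max_calib_seq_len`, and that the dataset exposes a `train` split with a `text` column. The helper name `requantize` and the `CALIB_KWARGS` dict are hypothetical, and `quant_config` is left as a parameter because the README's original config is not shown in this diff.

```python
# Calibration settings taken from the "Final notes" text; the keyword names
# are assumed to match AutoAWQ's `quantize` signature in recent releases.
CALIB_KWARGS = {
    "calib_data": "HuggingFaceTB/cosmopedia-100k",  # dataset named in the notes
    "split": "train",                               # assumption: split name
    "text_column": "text",                          # assumption: column name
    "max_calib_samples": 128,                       # "128 samples"
    "max_calib_seq_len": 40960,                     # "max sequence length (40960)"
}


def requantize(model_path, quant_path, quant_config):
    """Hypothetical helper: quantize `model_path` and save to `quant_path`."""
    # Imported lazily so the settings above can be inspected without the
    # heavy dependencies installed.
    from awq import AutoAWQForCausalLM
    from transformers import AutoTokenizer

    model = AutoAWQForCausalLM.from_pretrained(model_path)
    tokenizer = AutoTokenizer.from_pretrained(model_path)

    # Same call as in the README, plus the calibration overrides.
    model.quantize(tokenizer, quant_config=quant_config, **CALIB_KWARGS)

    model.save_quantized(quant_path)
    tokenizer.save_pretrained(quant_path)
```

A call would look like `requantize("Qwen/Qwen3-32B", "./Qwen3-32B-AWQ-GEMM-lc", quant_config)`, reusing whatever `quant_config` the README defines; the output path here is only inferred from the repository name in the notes.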