This model was generated using DWQ (distilled weight quantization) to bring the quality of the 4-bit quantization closer to 8-bit without increasing model size. Quantization was performed with mlx-lm version 0.26.3 using `--bits 4 --learning-rate 1e-7 --batch-size 1 --group-size 16`.
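To run the model, the standard mlx-lm Python API should work; the sketch below assumes the model is published on the Hugging Face Hub, and the repo id shown is a placeholder:

```python
# Minimal inference sketch with mlx-lm; "user/model-4bit-DWQ" is a
# placeholder, substitute this repository's actual id.
from mlx_lm import load, generate

model, tokenizer = load("user/model-4bit-DWQ")

prompt = "Explain DWQ quantization in one sentence."

# Apply the chat template if the tokenizer ships one (assumed here).
if tokenizer.chat_template is not None:
    messages = [{"role": "user", "content": prompt}]
    prompt = tokenizer.apply_chat_template(messages, add_generation_prompt=True)

response = generate(model, tokenizer, prompt=prompt, verbose=True)
```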