# Model Card
- Base model: meta-llama/Llama-2-70b-hf
- Quantization method: BlockLDLQ with GuidedQuant Hessian
- Target bit-width: 2
- Backend kernel: QTIP kernel (HYB variant)
- Calibration data: RedPajama (1024 sequences, 4096 tokens each)
- Calibration objective: Next-token prediction
- num_groups (for GuidedQuant Hessian): 2
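A quick back-of-the-envelope check of what the 2-bit target means for storage. This is an illustrative sketch, not a measurement: it counts raw weight bits only and ignores quantization metadata such as scales and codebooks, which add some overhead in practice.

```python
def quantized_weight_gib(n_params: float, bits_per_weight: float) -> float:
    """Rough weight-storage footprint in GiB, counting raw weight bits only."""
    return n_params * bits_per_weight / 8 / 2**30

# Llama-2-70B: ~70e9 parameters.
two_bit = quantized_weight_gib(70e9, 2)   # ~16.3 GiB of weights at 2 bits
fp16 = quantized_weight_gib(70e9, 16)     # ~130.4 GiB at FP16
```

At 2 bits per weight the model's weight tensors fit comfortably on a single high-memory GPU, which is the usual motivation for this bit-width.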
# How to run
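The card does not spell out the inference entry point, so the following is a minimal sketch assuming the checkpoint loads through the standard `transformers` API. `trust_remote_code=True` is assumed because custom-kernel quantized checkpoints typically ship their own modeling code; check the repository for the exact supported workflow.

```python
MODEL_ID = "jusjinuk/Llama-2-70b-hf-2bit-GuidedQuant-QTIP"

def load_model(device_map: str = "auto"):
    # Imported lazily so the sketch can be read without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    # trust_remote_code is assumed here: it allows the repo's custom
    # QTIP dequantization code (if any) to be loaded alongside the weights.
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        device_map=device_map,
        trust_remote_code=True,
    )
    return tokenizer, model
```

Usage would then be the ordinary `model.generate(...)` loop on the returned pair.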
# References