Nvidia AWQs + GPTQs
Quantised using vllm-project/llm-compressor with the following recipe:

```python
from llmcompressor.modifiers.quantization import GPTQModifier

recipe = GPTQModifier(targets="Linear", scheme="W4A16", ignore=["lm_head"])
```
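For context, a minimal sketch of how a recipe like this is typically applied with llm-compressor's `oneshot` entrypoint. The calibration dataset, sequence length, sample count, and output directory below are illustrative assumptions, not the exact settings used for these quants, and the `oneshot` import path may differ between llm-compressor releases:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from llmcompressor import oneshot  # older releases: from llmcompressor.transformers import oneshot
from llmcompressor.modifiers.quantization import GPTQModifier

MODEL_ID = "Qwen/Qwen2.5-32B"
SAVE_DIR = "Qwen2.5-32B-W4A16-GPTQ"  # illustrative output directory

model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype="auto", device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)

# W4A16 GPTQ: 4-bit weights, 16-bit activations, applied to all Linear layers except lm_head
recipe = GPTQModifier(targets="Linear", scheme="W4A16", ignore=["lm_head"])

# One-shot post-training quantisation over a small calibration set (values here are assumptions)
oneshot(
    model=model,
    dataset="open_platypus",
    recipe=recipe,
    max_seq_length=2048,
    num_calibration_samples=512,
)

# Save in compressed-tensors format so the checkpoint loads directly in vLLM
model.save_pretrained(SAVE_DIR, save_compressed=True)
tokenizer.save_pretrained(SAVE_DIR)
```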
Base model: Qwen/Qwen2.5-32B