# OpenReasoning-Nemotron-14B-AWQ

## Method
Quantised using [vllm-project/llm-compressor](https://github.com/vllm-project/llm-compressor) with the following recipe:
```python
from llmcompressor.modifiers.awq import AWQModifier

# W4A16 asymmetric AWQ on all Linear layers; lm_head is kept in full precision.
recipe = [
    AWQModifier(ignore=["lm_head"], scheme="W4A16_ASYM", targets=["Linear"]),
]
```
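For context, here is a minimal sketch of how a recipe like this is typically applied with llm-compressor's `oneshot` entry point. The calibration dataset, sample count, and sequence length below are illustrative assumptions, not the settings used to produce this checkpoint:

```python
from llmcompressor import oneshot
from llmcompressor.modifiers.awq import AWQModifier

recipe = [
    AWQModifier(ignore=["lm_head"], scheme="W4A16_ASYM", targets=["Linear"]),
]

oneshot(
    model="nvidia/OpenReasoning-Nemotron-14B",   # FP16 source model
    dataset="open_platypus",                     # assumed calibration set
    recipe=recipe,
    output_dir="OpenReasoning-Nemotron-14B-AWQ",
    max_seq_length=2048,                         # assumed
    num_calibration_samples=256,                 # assumed
)
```

The resulting compressed-tensors checkpoint can then be served with vLLM, which should pick up the quantisation config automatically:

```python
from vllm import LLM

llm = LLM(model="cpatonn/OpenReasoning-Nemotron-14B-AWQ")
print(llm.generate("Hello")[0].outputs[0].text)
```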
## Model tree for cpatonn/OpenReasoning-Nemotron-14B-AWQ

- Base model: [Qwen/Qwen2.5-14B](https://huggingface.co/Qwen/Qwen2.5-14B)
  - Finetuned: [Qwen/Qwen2.5-14B-Instruct](https://huggingface.co/Qwen/Qwen2.5-14B-Instruct)
    - Finetuned: [nvidia/OpenReasoning-Nemotron-14B](https://huggingface.co/nvidia/OpenReasoning-Nemotron-14B)