OpenReasoning-Nemotron-7B-AWQ

Method

Quantised with vllm-project/llm-compressor using the following recipe:

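from llmcompressor.modifiers.awq import AWQModifier

# AWQ with 4-bit asymmetric weights and 16-bit activations on every Linear layer;
# lm_head is left unquantised.
recipe = [
    AWQModifier(ignore=["lm_head"], scheme="W4A16_ASYM", targets=["Linear"]),
]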
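To make the `W4A16_ASYM` scheme more concrete, here is a minimal pure-Python sketch of asymmetric 4-bit quantisation for a single group of weights. It is an illustration only, not llm-compressor's implementation: the helper names `quantize_group_asym` and `dequantize_group` are hypothetical, the min/max range estimate is an assumption, and AWQ's activation-aware rescaling of channels is not shown.

```python
def quantize_group_asym(weights, num_bits=4):
    """Map one group of float weights to unsigned integer codes.

    Hypothetical helper: plain min/max asymmetric quantisation with a
    scale and an integer zero point, as an illustration of the idea.
    """
    qmin, qmax = 0, 2 ** num_bits - 1           # 0..15 for 4 bits
    w_min = min(min(weights), 0.0)              # keep 0.0 inside the range
    w_max = max(max(weights), 0.0)
    scale = (w_max - w_min) / (qmax - qmin)
    if scale == 0.0:                            # all-zero group
        scale = 1.0
    zero_point = round(-w_min / scale)
    zero_point = max(qmin, min(qmax, zero_point))
    codes = [
        max(qmin, min(qmax, round(w / scale) + zero_point))
        for w in weights
    ]
    return codes, scale, zero_point


def dequantize_group(codes, scale, zero_point):
    """Recover approximate float weights from the integer codes."""
    return [(q - zero_point) * scale for q in codes]
```

Calling `quantize_group_asym([-0.5, -0.1, 0.0, 0.3, 1.0])` returns the integer codes together with the group's scale and zero point, and `dequantize_group` maps them back to approximate weights. Activations stay in 16-bit floating point, which is what the `A16` part of the scheme name refers to.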

Safetensors

Model size: 2B params

Tensor type: BF16, I64, I32

Base model

Qwen/Qwen2.5-7B
