Models: 2,220

Active filters: compressed-tensors

nm-testing/Llama-3-8B-Instruct-trans-w4a16-mock_calib_fquant

8B • Updated 5 days ago • 11

BCCard/Qwen3-Coder-480B-A35B-Instruct-FP8-Dynamic

480B • Updated 5 days ago • 1

aravindpai/Qwen3-0.6B-W4A16-G128

Text Generation • 0.2B • Updated 5 days ago • 4

aravindpai/Qwen3-1.7B-W4A16-G128

Text Generation • 0.5B • Updated 5 days ago • 3

aravindpai/Qwen3-4B-W4A16-G128

Text Generation • 0.9B • Updated 5 days ago • 3

aravindpai/Qwen3-8B-W4A16-G128

Text Generation • 2B • Updated 5 days ago • 2

Ba2han/gemma3-4b-TRD2-w8a8

Image-Text-to-Text • Updated 5 days ago • 17

pazc/Agatha-111B-v1-awq-asym

21B • Updated 4 days ago • 2

abhishekchohan/KAT-V1-40B-W4A16

7B • Updated 4 days ago • 106

Sinensis/Harbinger-24B-FP8-Dynamic

Text Generation • 24B • Updated 4 days ago • 1

BCCard/Qwen3-235B-A22B-Thinking-2507-FP8-Dynamic

235B • Updated about 13 hours ago • 2

warshanks/Qwen3-1.7B-abliterated-AWQ

Text Generation • 0.8B • Updated 3 days ago • 18

warshanks/Qwen3-8B-abliterated-AWQ

Text Generation • 2B • Updated 3 days ago • 20

ChangyuLiu/DeepSeek-R1-Distill-Qwen-1.5B-GPTQ_W8A8_G128

2B • Updated 3 days ago • 3

ChangyuLiu/DeepSeek-R1-Distill-Qwen-1.5B-GPTQ_FP8_DYNAMIC_G128

2B • Updated 2 days ago • 4

ChangyuLiu/DeepSeek-R1-Distill-Qwen-7B-GPTQ_W8A8_G128

8B • Updated 2 days ago • 6

cpatonn/Llama-3_3-Nemotron-Super-49B-v1_5-AWQ

Text Generation • 8B • Updated 3 days ago • 132

cpatonn/Llama-3_3-Nemotron-Super-49B-v1_5-GPTQ-8bit

Text Generation • 14B • Updated 3 days ago • 11

cpatonn/KAT-V1-40B-GPTQ-8bit

Text Generation • 11B • Updated 3 days ago • 4

ChangyuLiu/DeepSeek-R1-Distill-Llama-8B-GPTQ_W8A8_G128

8B • Updated 2 days ago • 3

RedHatAI/SmolLM3-3B-FP8-dynamic

3B • Updated 2 days ago • 4

cpatonn/UIGEN-X-32B-0727-AWQ

Text Generation • 6B • Updated 2 days ago • 1

BCCard/Qwen3-235B-A22B-Thinking-2507-quantized.w4a16

Updated 1 day ago • 28

arnepa/Llama-3_3-Nemotron-Super-49B-v1_5-W8A8-Dynamic

50B • Updated 1 day ago • 2

warshanks/bernie0.1-AWQ

1B • Updated about 17 hours ago

nm-testing/granite-20b-code-instruct-8k-quantized.w4a16

3B • Updated about 16 hours ago

qingy2024/UIGEN-X-8B-QAT-FFT-AWQ

2B • Updated about 14 hours ago

ramblingpolymath/Qwen3-30B-A3B-2507-W8A8

31B • Updated about 12 hours ago

ChangyuLiu/DeepSeek-R1-Distill-Qwen-32B-GPTQ_W8A8_G128

33B • Updated about 3 hours ago

weiweiz1/Llama-3.2-1B-Instruct-NVFP4-W4A4-RTN

0.8B • Updated about 1 hour ago