Models: 3,049

Active filters: compressed-tensors

joedonino/unsloth_qwen25vl7b_product_descriptionv1_fp8

Image-to-Text • 8B • Updated Jul 16 • 3

weiweiz1/DeepSeek-R1-NVFP4-autoround

Updated Aug 4 • 7

krickwix/Llama-3.1-70B-Instruct-W8A8-Dynamic-Per-Token

71B • Updated Jul 16 • 4

cuongpp/gemma-3-12b-it-GPTQ-4bit

Image-Text-to-Text • 3B • Updated Jul 16 • 68

krickwix/Qwen3-30B-A3B-FP8-Dynamic

31B • Updated Jul 16 • 3

Ba2han/Gemma3-TR-DatasetCreator-w8a8

Image-Text-to-Text • 5B • Updated Jul 16 • 3

nm-testing/Qwen3-0.6B-FP8-BLOCK

0.6B • Updated Jul 16 • 1

weiweiz1/DeepSeek-V2-Lite-NVFP4-autoround

9B • Updated Jul 23 • 1

Ba2han/Gemma3-TR-DatasetCreatorv3-test2

Image-Text-to-Text • 4B • Updated Jul 17 • 3

wangqia0309/Cydonia-24B-v2-FP8-KV

24B • Updated Jul 17 • 1.76k

VAmblardPEReN/mistralai_Mistral-Small-3.2-24B-Instruct-2506-GPTQ

4B • Updated Jul 17 • 53

joedonino/unsloth_qwen25vl7b_product_descriptionv2_fp8

Image-to-Text • 8B • Updated Jul 17 • 1

chengjiyao/Qwen2-1.5B-Instruct-FP8

2B • Updated Jul 17

chengjiyao/Qwen3-1.7B-FP8-KV

2B • Updated Jul 17 • 3

warshanks/Dolphin-Mistral-24B-Venice-Edition-AWQ

4B • Updated Jul 17 • 123 • 1

nm-testing/Meta-Llama-3-8B-Instruct-transformed-w4a16

2B • Updated Jul 17 • 4

ludis/L3.3-70B-Magnum-Diamond-W8A8

71B • Updated Jul 18 • 3

weiweiz1/DeepSeek-R1-NVFP4-RTN

Updated Jul 31 • 3

warshanks/Lucy-128k-AWQ

Text Generation • 0.8B • Updated Jul 18 • 8

Ba2han/gemma3-turkv4-w8a8

Image-Text-to-Text • Updated Jul 18 • 7

warshanks/Lucy-AWQ

Text Generation • 0.8B • Updated Jul 18 • 4

JimmyFoxx/Qwen2.5-VL-32B-Instruct-FP8-Dynamic

33B • Updated Jul 18 • 3

GusPuffy/BlackSheep-24B-GPTQ

Text Generation • 4B • Updated Jul 19 • 14

cpatonn/OpenReasoning-Nemotron-32B-W8A8-INT8-Dynamic

33B • Updated Jul 19 • 3

cpatonn/OpenReasoning-Nemotron-14B-AWQ

3B • Updated Jul 19

cpatonn/OpenReasoning-Nemotron-7B-AWQ

2B • Updated Jul 19 • 9

jiangchengchengNLP/L3.3-MS-Nevoria-70b-NVFP4A16

Text Generation • 41B • Updated Jul 19 • 2

jiangchengchengNLP/Mistral-Small-3.2-24B-Instruct-W8A8

24B • Updated Jul 20 • 107

abhishekchohan/OpenReasoning-Nemotron-32B-W4A16

6B • Updated Jul 19 • 1.99k

abhishekchohan/cmaesar-32B-br-preview-W4A16

6B • Updated Jul 19 • 3