Models (301)

Active filters: torchao

medmekk/Llama-3.2-1B-ao-autoquant-1

Text Generation • Updated Apr 22

medmekk/Llama-3.2-1B-ao-float8wo-2

Text Generation • Updated Apr 22

medmekk/Llama-3.2-1B-ao-float8wo-3

Text Generation • Updated Apr 22

medmekk/Llama-3.2-1B-ao-int8wo-gs256

Text Generation • Updated Apr 22

medmekk/Llama-3.2-1B-ao-int4wo-gs128

Text Generation • Updated Apr 22

medmekk/Qwen2.5-0.5B-Instruct-ao-float8wo

Text Generation • Updated Apr 22 • 1

medmekk/Llama-3.2-1B-ao-int4wo-gs256

Text Generation • Updated Apr 22

medmekk/Qwen2.5-VL-7B-Instruct-ao-float8wo

medmekk/Qwen2.5-VL-7B-Instruct-ao-int8wo

medmekk/Llama-3.1-8B-Instruct-ao-int8wo

Text Generation • Updated Apr 24 • 1

medmekk/Qwen2.5-VL-7B-Instruct-ao-int8da8w8

medmekk/Llama-3.1-8B-Instruct-ao-autoquant

Text Generation • Updated Apr 24

medmekk/Llama-3.1-8B-Instruct-ao-int4wo-gs128

Text Generation • Updated Apr 24 • 1

medmekk/Llama-3.1-8B-Instruct-ao-float8wo

Text Generation • Updated Apr 24

medmekk/Llama-3.1-8B-Instruct-ao-float8da8w8

Text Generation • Updated Apr 24 • 2

medmekk/Llama-3.1-8B-Instruct-ao-int8da8w8

Text Generation • Updated Apr 24

medmekk/Llama-3.1-8B-Instruct-ao-float8da8w8-2

Text Generation • Updated Apr 24

medmekk/Llama-3.1-8B-Instruct-ao-int4wo-gs32

Text Generation • Updated Apr 24 • 1

medmekk/Llama-3.1-8B-Instruct-ao-int4wo-gs16

Text Generation • Updated Apr 24

Erland/vanilla-340M-4096-model-AO-W4

Text Generation • Updated May 21

irresistiblegrace97/TinyLlama-1.1B-Chat-v1.0-torchao-int4_weight_only-gs_4096

Erland/softpick-340M-4096-model-AO-W4

Text Generation • Updated May 21

Erland/softpick-340M-4096-model-AO-W4A4

Text Generation • Updated May 21

Erland/vanilla-340M-4096-model-AO-W4A4

Text Generation • Updated May 21

irresistiblegrace97/tinyllama.gguf

jerryzh168/opt-125m-int4wo

Text Generation • Updated Apr 25

pytorch/Qwen3-8B-INT4

Text Generation • Updated 6 days ago • 88 • 2

pytorch/Qwen3-32B-FP8

Text Generation • Updated 6 days ago • 75

jerryzh168/opt-125m-int4wo-per-module

Text Generation • Updated May 29 • 2.84k

pytorch/Qwen3-4B-INT8-INT4

Text Generation • Updated Sep 11 • 1.72k • 2