Edit Models filters

Apps

Docker Model Runner

Inference Providers

OVHcloud AI Endpoints

HF Inference API

Misc

Inference Endpoints

text-generation-inference

4-bit precision

8-bit precision

text-embeddings-inference

Mixture of Experts

Carbon Emissions

Models

96

Full-text search

Active filters: GPTQ

QuantTrio/Seed-OSS-36B-Instruct-GPTQ-Int8

Text Generation • 11B • Updated Sep 15 • 151 • 3

QuantTrio/Seed-OSS-36B-Instruct-GPTQ-Int4

Text Generation • 6B • Updated Sep 15 • 96 • 5

DanielAWrightGabrielAI/pygmalion-7b-4bit-128g-cuda-2048Token

Text Generation • Updated May 18, 2023 • 22 • 15

mlabonne/gpt2-GPTQ-4bit

Text Generation • Updated Jul 8, 2023 • 41 • 1

CalderaAI/13B-Ouroboros-GPTQ4bit-128g-CUDA

Text Generation • Updated Jul 20, 2023 • 9

daedalus314/Griffin-3B-GPTQ

Text Generation • 0.7B • Updated Sep 8, 2023 • 11

Sanrove/gpt2-GPTQ-4b

Text Generation • Updated Sep 24, 2023 • 6

daedalus314/Marx-3B-V2-GPTQ

Text Generation • Updated Oct 12, 2023 • 8

TKDKid1000/pythia-2.8b-deduped-GPTQ

Text Generation • Updated Oct 25, 2023 • 16

Trelis/Yi-34B-200K-Llamafied-chat-SFT-function-calling-v2-GPTQ

Text Generation • Updated Nov 20, 2023

Inferless/deciLM-7B-GPTQ

Text Generation • Updated Jan 4, 2024 • 11 • 1

Inferless/SOLAR-10.7B-Instruct-v1.0-GPTQ

Text Generation • Updated Jan 4, 2024 • 11 • 2

Inferless/Mixtral-8x7B-v0.1-int8-GPTQ

Text Generation • Updated Jan 25, 2024 • 13 • 2

Masterjp123/SnowyRP-FinalV1-L2-13B-GPTQ

Text Generation • Updated Apr 4, 2024 • 10 • 4

bigquant/Senku-70B-GPTQ-4bit

Text Generation • Updated Feb 26, 2024 • 6 • 1

twhoool02/Llama-2-7b-hf-AutoGPTQ

Text Generation • 1B • Updated Apr 3, 2024 • 16

Dmitriy007/rugpt2_gen_news-gptq-4bit

Text Generation • 51.7M • Updated Feb 28, 2024 • 17

SwastikM/Llama-2-7B-Chat-text2code

Text Generation • Updated May 19, 2024 • 29 • 4

adriabama06/Llama-3.2-1B-Instruct-GPTQ-8bit-128g

Text Generation • 0.5B • Updated Jan 2 • 14 • 1

NightForger/saiga_nemo_12b-GPTQ

Text Generation • Updated Oct 28 • 21

NaomiBTW/L3-8B-Lunaris-v1-GPTQ

Text Generation • Updated Nov 11, 2024 • 2

GusPuffy/Llama-3.1-70B-ArliAI-RPMax-v1.3-GPTQ

11B • Updated Jul 19 • 10

iSolver-AI/test123-quantized.w4a16

Image-to-Text • Updated Aug 1 • 9

AXERA-TECH/DeepSeek-R1-Distill-Qwen-1.5B-GPTQ-Int4

Updated Feb 19 • 25 • 1

AXERA-TECH/DeepSeek-R1-Distill-Qwen-7B-GPTQ-Int4

Updated Feb 17 • 15 • 1

AXERA-TECH/Qwen2.5-1.5B-Instruct-GPTQ-Int4

Text Generation • Updated Apr 1 • 24

AXERA-TECH/Qwen2.5-3B-Instruct-GPTQ-Int4

Updated Apr 22 • 14

AXERA-TECH/Qwen2.5-0.5B-Instruct-GPTQ-Int4

Text Generation • Updated Sep 28 • 11

AXERA-TECH/Qwen2.5-7B-Instruct-GPTQ-Int4

Updated Feb 17 • 8

RedHatAI/DeepSeek-R1-quantized.w4a16

Text Generation • Updated Sep 19 • 30 • 7