Models: 562

Active filters: 2302.13971

ariakhosh/a7

Updated May 31, 2024

ariakhosh/a8

Updated May 31, 2024

ariakhosh/a9

Updated May 31, 2024

ariakhosh/a10

Updated May 31, 2024

QuantFactory/Ahma-3B-GGUF

Text Generation • 4B • Updated Jul 2 • 649 • 2

RichardErkhov/allenai_-_OLMo-1B-hf-gguf

1B • Updated Jun 22, 2024 • 252

Finnish-NLP/Ahma-3B-Instruct

Text Generation • 4B • Updated Dec 30, 2024 • 38 • 4

mozilla-ai/OLMo-7B-0424-llamafile

Updated Mar 31 • 507 • 3

RichardErkhov/wang7776_-_vicuna-7b-v1.3-sparsity-10-gguf

7B • Updated Jul 25, 2024 • 80

RichardErkhov/garage-bAInd_-_Platypus-30B-gguf

33B • Updated Jul 26, 2024 • 150

RichardErkhov/Finnish-NLP_-_llama-3b-finnish-gguf

4B • Updated Jul 30, 2024 • 359

RichardErkhov/wang7776_-_vicuna-7b-v1.3-attention-sparsity-20-gguf

7B • Updated Aug 1, 2024 • 150

Ian332/Helper_Bob

Text Classification • 8B • Updated Aug 20, 2024 • 2 • 3

RichardErkhov/wang7776_-_vicuna-7b-v1.3-sparsity-20-gguf

7B • Updated Aug 18, 2024 • 73

RichardErkhov/dfurman_-_LLaMA-13B-gguf

13B • Updated Aug 18, 2024 • 54

RichardErkhov/Finnish-NLP_-_Ahma-3B-gguf

4B • Updated Aug 19, 2024 • 531

RichardErkhov/Finnish-NLP_-_Ahma-3B-Instruct-gguf

4B • Updated Aug 22, 2024 • 210

RichardErkhov/wang7776_-_vicuna-7b-v1.3-attention-sparsity-30-gguf

7B • Updated Sep 3, 2024 • 88

QuantFactory/mpt-7b-GGUF

7B • Updated Sep 10, 2024 • 236 • 1

RichardErkhov/allenai_-_OLMo-7B-Twin-2T-hf-gguf

7B • Updated Sep 18, 2024 • 165

RichardErkhov/xzyao_-_openllama-3b-chat-gguf

3B • Updated Oct 2, 2024 • 57

RichardErkhov/allenai_-_OLMo-1B-hf-4bits

0.7B • Updated Oct 6, 2024 • 4

RichardErkhov/allenai_-_OLMo-1B-hf-8bits

1B • Updated Oct 6, 2024 • 6

mav23/open_llama_3b-GGUF

3B • Updated Oct 15, 2024 • 67

RichardErkhov/garage-bAInd_-_SuperPlatty-30B-gguf

33B • Updated Oct 21, 2024 • 43

RichardErkhov/garage-bAInd_-_GPlatty-30B-gguf

33B • Updated Oct 21, 2024 • 58

QuantFactory/plamo-13b-GGUF

Text Generation • 13B • Updated Oct 21, 2024 • 95 • 2

deouron/nlp_llm_001

0.3B • Updated Oct 30, 2024 • 7

mav23/OLMo-1B-hf-GGUF

1B • Updated Oct 30, 2024 • 82

somedeadinsidespy/SleeplessLLaMa

Updated Nov 10, 2024