Models: 30

Active filters: mlp_speculator

ibm-ai-platform/llama-13b-accelerator

0.8B • Updated May 15, 2024 • 9 • 3

ibm-ai-platform/codellama-13b-accelerator

2B • Updated Jul 17, 2024 • 1

ibm-research/granite-7b-lab-accelerator

1B • Updated May 15, 2024 • 4 • 3

ibm-ai-platform/granite-7b-lab-accelerator

1B • Updated May 21, 2024 • 8

ibm-ai-platform/llama3-8b-accelerator

3B • Updated May 15, 2024 • 8 • 18

ibm-granite/granite-7b-instruct-accelerator

1B • Updated May 20, 2024 • 18 • 1

ibm-granite/granite-20b-code-instruct-accelerator

Updated Oct 7, 2024 • 12 • 3

ibm-granite/granite-8b-code-instruct-accelerator

2B • Updated May 29, 2024 • 10 • 1

cecibas/llama-13b-accelerator

0.8B • Updated Jun 8, 2024 • 4

ibm-granite/granite-3b-code-instruct-accelerator

Updated Jul 10, 2024 • 12 • 1

ibm-ai-platform/codellama-34b-accelerator

Updated Jul 17, 2024 • 8

ibm-ai-platform/llama-160m-accelerator

0.2B • Updated Jul 24, 2024 • 41 • 1

ibm-ai-platform/llama2-70b-accelerator

Updated Jul 26, 2024 • 7 • 1

ibm-ai-platform/llama3-70b-accelerator

2B • Updated Aug 29, 2024 • 25 • 6

ibm-granite/granite-34b-code-instruct-accelerator

Updated Jul 24, 2024 • 7

ibm-granite/granite-3.0-8b-instruct-accelerator

Updated Oct 16, 2024 • 7 • 2

Snowflake/Arctic-LSTM-Speculator-Llama-3.1-70B-Instruct

Updated Sep 3, 2025 • 29

Snowflake/Arctic-LSTM-Speculator-Llama-3.1-8B-Instruct

Updated Sep 3, 2025 • 65 • 2

Snowflake/Arctic-LSTM-Speculator-Qwen2.5-32B-Instruct

Updated Sep 3, 2025 • 8 • 3

Snowflake/Arctic-LSTM-Speculator-Llama-3.3-70B-Instruct

Updated Sep 3, 2025 • 10

jacksonkek/Arctic-LSTM-Speculator-Gemma-3-12B-Text-Only

Updated May 12, 2025 • 3

sfc-gh-goliaro/arctic-speculator-vicuna-7b-v1.3

Updated Jun 17, 2025 • 2

sfc-gh-goliaro/arctic-speculator-5-heads-vicuna-7b-v1.3

Updated Jun 26, 2025

sfc-gh-goliaro/arctic-speculator-8-heads-vicuna-7b-v1.3

Updated Jun 27, 2025 • 1

Snowflake/Arctic-LSTM-Speculator-gpt-oss-20b

Updated Sep 3, 2025 • 20 • 5

Snowflake/Arctic-LSTM-Speculator-gpt-oss-120b

Updated Sep 3, 2025 • 78 • 5

K-Compression/Arctic-LSTM-Speculator-HyperCLOVAX-SEED-Think-14B

Updated Sep 5, 2025 • 4

nebius/MLP-Speculator-Llama-3.1-8B-Instruct

Text Generation • 0.2B • Updated 12 days ago • 37

rescommons/Arctic-MLP-Speculator-Llama-3.2-1B

Updated Mar 10 • 1

rescommons/Arctic-LSTM-Speculator-Llama-3.2-1B-Instruct

Updated Mar 10 • 1