Getting Started with Inference Providers

Hugging Face Inference Providers unifies 15+ inference partners under a single, OpenAI‑compatible endpoint.
Move from prototype to production with the same unified API and no infrastructure to manage.

Hugging Face Inference Partners

  • Cerebras
  • Novita
  • Nebius AI
  • Featherless AI
  • Fireworks
  • Together AI
  • Groq
  • Hyperbolic
  • Cohere
  • fal
  • Nscale
  • SambaNova
  • Replicate
  • HF Inference API

Your first LLM call

Let's make your first inference request to an LLM, using moonshotai/Kimi-K2-Instruct. The example reads a Hugging Face access token from the HF_TOKEN environment variable.

import os
from openai import OpenAI

client = OpenAI(
    base_url="https://router.huggingface.co/v1",
    api_key=os.environ["HF_TOKEN"],
)

completion = client.chat.completions.create(
    model="moonshotai/Kimi-K2-Instruct",
    messages=[
        {
            "role": "user",
            "content": ""
        }
    ],
)

# print just the assistant's reply text
print(completion.choices[0].message.content)

Generate an image

Next, let's generate an image using black-forest-labs/FLUX.1-dev.

import os
from huggingface_hub import InferenceClient

client = InferenceClient(
    provider="together",
    api_key=os.environ["HF_TOKEN"],
)

# output is a PIL.Image object
image = client.text_to_image(
    "",
    model="black-forest-labs/FLUX.1-dev",
)

Start using Inference Providers today

You can browse compatible models and run inference directly in their model card widgets.

Get PRO to instantly get 20x more included monthly credits and unlock pay-as-you-go billing!