Getting Started with Inference Providers

Hugging Face Inference Providers unifies 15+ inference partners under a single, OpenAI‑compatible endpoint.
Move from prototype to production with the same unified API and no infrastructure to manage.

Hugging Face Inference Partners

  • Cerebras
  • Novita
  • Nebius AI
  • Featherless AI
  • Fireworks
  • Together AI
  • Groq
  • Hyperbolic
  • Cohere
  • fal
  • Nscale
  • SambaNova
  • Replicate
  • HF Inference API

Your first LLM call

Let's make your first inference request to an LLM, using moonshotai/Kimi-K2-Instruct. The example reads a Hugging Face access token from the HF_TOKEN environment variable.

import os
from openai import OpenAI

client = OpenAI(
    base_url="https://router.huggingface.co/v1",
    api_key=os.environ["HF_TOKEN"],
)

completion = client.chat.completions.create(
    model="moonshotai/Kimi-K2-Instruct",
    messages=[
        {
            "role": "user",
            "content": ""
        }
    ],
)

# print just the assistant's reply text
print(completion.choices[0].message.content)

Generate an image

Next, let's generate an image using black-forest-labs/FLUX.1-dev.

import os
from huggingface_hub import InferenceClient

client = InferenceClient(
    provider="together",
    api_key=os.environ["HF_TOKEN"],
)

# output is a PIL.Image object
image = client.text_to_image(
    "",
    model="black-forest-labs/FLUX.1-dev",
)

Start using Inference Providers today

You can browse compatible models and run inference directly in their model card widgets.

Get PRO to instantly get 20x more included monthly credits and unlock pay-as-you-go billing!