llama.cpp Quantizations of gpt-oss-120b

Original model: the F16 GGUF from unsloth/gpt-oss-120b-GGUF.

The MXFP4_MOE quant was made with the update from llama.cpp PR #15091.

MXFP4_MOE : 59.02 GiB (4.34 BPW)
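
As a rough sanity check, the file size and bits-per-weight (BPW) figure above can be turned back into an approximate parameter count. The sketch below assumes that BPW is simply total file bits divided by the number of weights and that GiB means 2^30 bytes; the helper name `estimate_params` is hypothetical and not part of any library. Since a GGUF file also carries metadata, the result is only an estimate.

```python
GIB = 2**30  # bytes per GiB (assumption: the size is reported in binary units)

def estimate_params(size_gib: float, bits_per_weight: float) -> float:
    """Approximate parameter count implied by a file size and a BPW figure."""
    total_bits = size_gib * GIB * 8
    return total_bits / bits_per_weight

# Figures taken from the MXFP4_MOE line above.
params = estimate_params(59.02, 4.34)
print(f"about {params / 1e9:.1f}B parameters")
```

The estimate should land close to the 117B parameter count reported for the model, which suggests the size and BPW figures are mutually consistent.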


Download (Example)

# !pip install huggingface_hub hf_transfer
import os

# Enable hf_transfer for faster downloads; set this before importing huggingface_hub.
os.environ["HF_HUB_ENABLE_HF_TRANSFER"] = "1"

from huggingface_hub import snapshot_download

# Fetch only the MXFP4_MOE files from the repository.
snapshot_download(
    repo_id="bobchenyx/gpt-oss-120b-GGUF",
    local_dir="bobchenyx/gpt-oss-120b-GGUF",
    allow_patterns=["*MXFP4_MOE*"],
)
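
Once the download finishes, the next step is usually to locate the GGUF file to hand to llama.cpp. The sketch below is one way to do that, under two assumptions: the files end in `.gguf`, and if the quant is split into shards they follow the usual `-00001-of-0000N.gguf` naming, so a lexicographic sort puts the first shard first. The helper `find_first_gguf` is hypothetical, and the actual file names in this repository are not stated here.

```python
from pathlib import Path
from typing import Optional

def find_first_gguf(local_dir: str, pattern: str = "*MXFP4_MOE*.gguf") -> Optional[Path]:
    """Return the first matching GGUF file under local_dir, or None if none exist.

    Assumes shard names sort lexicographically, so the first shard comes first.
    """
    matches = sorted(Path(local_dir).rglob(pattern))
    return matches[0] if matches else None

model_path = find_first_gguf("bobchenyx/gpt-oss-120b-GGUF")
if model_path is None:
    print("No MXFP4_MOE GGUF file found; check the download directory.")
else:
    print(f"Model file: {model_path}")
```

The resulting path can then be passed to llama.cpp through its `-m` option; for a split model, pointing at the first shard is normally enough.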

Format: GGUF
Model size: 117B params
Architecture: gpt-oss