📌 Overview

A 4-bit MLX-quantized version of Qwen3-30B-A6B, optimized for efficient inference with the MLX library and designed to handle long-context tasks (192k tokens) with reduced resource usage. It retains the core capabilities of Qwen3 while enabling deployment on edge devices.
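
MLX-quantized checkpoints like this one are typically run with the `mlx-lm` package. The sketch below is a minimal, hedged example of that workflow; the repo id is a placeholder (this card does not state the actual Hub path), and running it requires Apple silicon plus a download of the model weights.

```python
# Minimal inference sketch using the mlx-lm package (pip install mlx-lm).
# "<user>/Qwen3-30B-A6B-4bit-mlx" is a placeholder, not this model's actual repo id.
from mlx_lm import load, generate

# Load the 4-bit quantized weights and the tokenizer from the Hub.
model, tokenizer = load("<user>/Qwen3-30B-A6B-4bit-mlx")

prompt = "Explain 4-bit quantization in one sentence."
text = generate(model, tokenizer, prompt=prompt, max_tokens=128)
print(text)
```

Because the weights are already quantized to 4 bits, `load` reads them directly; no extra quantization step is needed at inference time.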

Downloads last month: 13
Model size: 5B params (Safetensors)
Tensor types: BF16 · U32