📌 Overview

A 4-bit MLX-quantized version of Qwen3-30B-A6B, optimized for efficient inference with the MLX library and designed to handle long-context tasks (192k tokens) with reduced resource usage. It retains the core capabilities of Qwen3 while enabling deployment on edge devices.
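
MLX-quantized checkpoints like this one are typically run with the `mlx-lm` package. The sketch below is a minimal, hedged example of that workflow; the repo id is a placeholder (this card does not state the actual Hub path), and running it requires Apple silicon plus a download of the model weights.

```python
# Minimal inference sketch using the mlx-lm package (pip install mlx-lm).
# "<user>/Qwen3-30B-A6B-4bit-mlx" is a placeholder, not this model's actual repo id.
from mlx_lm import load, generate

# Load the 4-bit quantized weights and the tokenizer from the Hub.
model, tokenizer = load("<user>/Qwen3-30B-A6B-4bit-mlx")

prompt = "Explain 4-bit quantization in one sentence."
text = generate(model, tokenizer, prompt=prompt, max_tokens=128)
print(text)
```

Because the weights are already quantized to 4 bits, `load` reads them directly; no extra quantization step is needed at inference time.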

Downloads last month: 13
Model size: 5B params (Safetensors)
Tensor types: BF16 · U32