This model is based on the Gemma3 architecture (4 billion parameters) and has been quantized using a mixed-precision approach to balance memory efficiency and inference speed while preserving accuracy.

Quantization Types Used (Q8_0-MIXED):

- Full precision (float32): 205 tensors
- 8-bit quantization (q8_0): 47 tensors
- 3-bit quantization (q3_K): 48 tensors
- 4-bit quantization (q4_K): 47 tensors
- 5-bit quantization (q5_K): 49 tensors
- 6-bit quantization (q6_K): 48 tensors

Model size: approximately 2.33 GiB
Average bits per weight (BPW): 5.16
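
As a quick sanity check, the BPW figure follows directly from the file size and parameter count reported on this card. A minimal sketch, assuming 1 GiB = 1024³ bytes:

```python
# Recompute average bits per weight (BPW) from the figures above.
size_gib = 2.33   # reported file size in GiB (1 GiB = 1024**3 bytes)
params = 3.88e9   # reported parameter count

size_bits = size_gib * 1024**3 * 8
print(f"{size_bits / params:.2f} BPW")  # -> 5.16, matching the card
```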

GGUF Metadata:

- Model size: 3.88B params
- Architecture: gemma3
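
Any llama.cpp-compatible runtime can load this GGUF file. Below is a minimal sketch using the llama-cpp-python bindings; the filename and generation parameters are assumptions, so substitute the actual .gguf file from this repository:

```python
from llama_cpp import Llama  # pip install llama-cpp-python

# Hypothetical filename; use the actual .gguf file shipped in this repo.
llm = Llama(
    model_path="Gemma-3-Gaia-PT-BR-4b-it.Q8_0-MIXED.gguf",
    n_ctx=4096,  # context window size
)

# Portuguese prompt, since this is a PT-BR instruction-tuned model.
out = llm("Explique brevemente o que é quantização de modelos.", max_tokens=128)
print(out["choices"][0]["text"])
```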