Useful for learning Portuguese - RTX 3060 12GB
This model is based on the Gemma 3 architecture (4 billion parameters) and has been quantized with a mixed-precision scheme that balances memory use and inference speed while preserving accuracy.
Quantization Types Used:
Q8_0-MiXED
- Full precision (float32): 205 tensors
- 8-bit quantization (q8_0): 47 tensors
- 3-bit quantization (q3_K): 48 tensors
- 4-bit quantization (q4_K): 47 tensors
- 5-bit quantization (q5_K): 49 tensors
- 6-bit quantization (q6_K): 48 tensors

Model size: approximately 2.33 GiB
Average bits per weight (BPW): 5.16
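A minimal sketch (not part of the card) of how the per-type tensor counts, model size, and average BPW above can be verified from the GGUF file itself. It assumes the `gguf` Python package (from the llama.cpp repository) is installed and that each tensor entry exposes `tensor_type`, `n_elements`, and `n_bytes`; the file name is a hypothetical placeholder.

```python
# Sketch: recompute tensor counts, size, and average BPW from a local GGUF file.
from collections import Counter

from gguf import GGUFReader

reader = GGUFReader("gemma-3-4b-pt-q8_0-mixed.gguf")  # hypothetical file name

counts = Counter(t.tensor_type.name for t in reader.tensors)   # tensors per quant type
total_bytes = sum(int(t.n_bytes) for t in reader.tensors)      # on-disk tensor bytes
total_weights = sum(int(t.n_elements) for t in reader.tensors) # total weight count

print("Tensors per quantization type:", dict(counts))
print(f"Model size: {total_bytes / 2**30:.2f} GiB")
print(f"Average BPW: {8 * total_bytes / total_weights:.2f}")
```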
Base model: google/gemma-3-4b-pt
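A minimal usage sketch, assuming the quantized GGUF has been downloaded locally and llama-cpp-python is installed with CUDA support. At roughly 2.33 GiB the model fits comfortably on a 12 GB card such as the RTX 3060 with all layers offloaded; the file name and prompt are placeholders.

```python
# Sketch: run the quantized model with llama-cpp-python, fully offloaded to the GPU.
from llama_cpp import Llama

llm = Llama(
    model_path="gemma-3-4b-pt-q8_0-mixed.gguf",  # hypothetical local file name
    n_gpu_layers=-1,  # offload every layer to the GPU
    n_ctx=4096,       # context window; adjust as needed
)

# The base model is a pretrained (not instruction-tuned) checkpoint,
# so a plain completion prompt is used here.
out = llm("Translate to Portuguese: Good morning, how are you?", max_tokens=64)
print(out["choices"][0]["text"])
```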