Update README.md
README.md
CHANGED
@@ -6,3 +6,22 @@ tags:
 - ik_llama.cpp
 ---
+## `ik_llama.cpp` quantizations of DeepSeek-V3-0324
+
+Quantized using `ik_llama.cpp` build = 3788 (4622fadc)
+
+NOTE: These quants **MUST** be run using the `llama.cpp` fork, [ik_llama.cpp](https://github.com/ikawrakow/ik_llama.cpp)
+
+Credits to @ubergarm for his DeepSeek quant recipes, on which these quants were based.
+
+| name | file size | quant type | bpw (bits per weight) |
+| --- | --- | --- | --- |
+| DeepSeek-V3-0324-IQ4_KT | 322.355 GiB | `IQ4_KT` (97.5%) / `Q8_0` (2.5%) | 4.127 |
+| DeepSeek-V3-0324-IQ4_XS_R8 | 340.764 GiB | `IQ4_XS_R8` (97.5%) / `Q8_0` (2.5%) | 4.362 |
+| DeepSeek-V3-0324-D-IQ4_KS_R4 | 366.762 GiB | `IQ4_KS_R4` (65%) / `IQ5_KS_R4` (32.5%) / `Q8_0` (2.5%) | 4.695 |
+| DeepSeek-V3-0324-D-Q4_K_R4 | 412.131 GiB | `Q4_K_R4` (65%) / `Q6_K_R4` (32.5%) / `Q8_0` (2.5%) | 5.276 |
+| DeepSeek-V3-0324-Q8_0_R8 | 664.295 GiB | `Q8_0_R8` (100%) | 8.504 |
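The file size and bpw columns in the table can be cross-checked against each other. The sketch below is not part of the README. It assumes that "GiB" means 2^30 bytes and that bpw is total file bits divided by the number of weights; the names `QUANTS`, `implied_param_count`, and `implied_counts` are hypothetical and introduced only for illustration. Under those assumptions, every row works out to roughly 671 billion weights, which is consistent with the total parameter count reported for DeepSeek-V3.

```python
GIB = 2**30  # bytes per GiB (assumption: the table uses binary gigabytes)

# (name, file size in GiB, bpw), copied from the table above.
QUANTS = [
    ("DeepSeek-V3-0324-IQ4_KT", 322.355, 4.127),
    ("DeepSeek-V3-0324-IQ4_XS_R8", 340.764, 4.362),
    ("DeepSeek-V3-0324-D-IQ4_KS_R4", 366.762, 4.695),
    ("DeepSeek-V3-0324-D-Q4_K_R4", 412.131, 5.276),
    ("DeepSeek-V3-0324-Q8_0_R8", 664.295, 8.504),
]


def implied_param_count(size_gib: float, bpw: float) -> float:
    """Return the weight count implied by a file size and a bits-per-weight figure.

    Assumes bpw = (file size in bits) / (number of weights).
    """
    total_bits = size_gib * GIB * 8
    return total_bits / bpw


def implied_counts() -> dict[str, float]:
    """Map each quant name to the weight count implied by its table row."""
    return {name: implied_param_count(size, bpw) for name, size, bpw in QUANTS}


if __name__ == "__main__":
    for name, count in implied_counts().items():
        print(f"{name}: ~{count / 1e9:.1f}B weights")
```

The same relation can be rearranged to estimate the file size of a quant mix before downloading it: multiply the weight count by the bpw and divide by 8 to get bytes.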