Kebob/DeepSeek-V3-0324-IK_GGUF


Kebob committed on Jul 6

Commit 4e03ee2 · verified · 1 Parent(s): 9c7bccb

Update README.md

Files changed (1)

README.md +19 -0


@@ -6,3 +6,22 @@ tags:
 - ik_llama.cpp
 ---

+## `ik_llama.cpp` quantizations of DeepSeek-V3-0324
+Quantized using `ik_llama.cpp` build = 3788 (4622fadc)
+NOTE: These quants **MUST** be run using the `llama.cpp` fork, [ik_llama.cpp](https://github.com/ikawrakow/ik_llama.cpp)
+Credits to @ubergarm for his DeepSeek quant recipes, on which these quants are based.
+| name | file size | quant type | bpw |
+| --- | --- | --- | --- |
+| DeepSeek-V3-0324-IQ4_KT | 322.355 GiB | `IQ4_KT` (97.5%) / `Q8_0` (2.5%) | 4.127 |
+| DeepSeek-V3-0324-IQ4_XS_R8 | 340.764 GiB | `IQ4_XS_R8` (97.5%) / `Q8_0` (2.5%) | 4.362 |
+| DeepSeek-V3-0324-D-IQ4_KS_R4 | 366.762 GiB | `IQ4_KS_R4` (65%) / `IQ5_KS_R4` (32.5%) / `Q8_0` (2.5%) | 4.695 |
+| DeepSeek-V3-0324-D-Q4_K_R4 | 412.131 GiB | `Q4_K_R4` (65%) / `Q6_K_R4` (32.5%) / `Q8_0` (2.5%) | 5.276 |
+| DeepSeek-V3-0324-Q8_0_R8 | 664.295 GiB | `Q8_0_R8` (100%) | 8.504 |
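
The `file size` and `bpw` columns in the table above are linked by simple arithmetic: bits per weight is the file size in bits divided by the number of weights. As a rough sanity check, the sketch below inverts that relation to recover the parameter count each row implies. It assumes that GiB means 2^30 bytes and that `bpw` is computed over the whole file; the helper name `implied_params` is hypothetical and not part of `ik_llama.cpp`.

```python
# Sanity-check sketch: recover the parameter count implied by each table row.
# Assumptions (not stated in the README): 1 GiB = 2**30 bytes, and bpw is
# total file bits divided by total weights.

GIB = 2**30  # bytes per GiB

# (name, file size in GiB, bits per weight), copied from the table above
ROWS = [
    ("DeepSeek-V3-0324-IQ4_KT", 322.355, 4.127),
    ("DeepSeek-V3-0324-IQ4_XS_R8", 340.764, 4.362),
    ("DeepSeek-V3-0324-D-IQ4_KS_R4", 366.762, 4.695),
    ("DeepSeek-V3-0324-D-Q4_K_R4", 412.131, 5.276),
    ("DeepSeek-V3-0324-Q8_0_R8", 664.295, 8.504),
]


def implied_params(size_gib: float, bpw: float) -> float:
    """Return the number of weights implied by a file size and a bpw value."""
    total_bits = size_gib * GIB * 8
    return total_bits / bpw


if __name__ == "__main__":
    for name, size_gib, bpw in ROWS:
        billions = implied_params(size_gib, bpw) / 1e9
        print(f"{name}: ~{billions:.2f} B parameters")
```

Under these assumptions every row should land near the same value, roughly 671 billion weights, which suggests the listed sizes and `bpw` figures are mutually consistent. Small differences between rows come from `bpw` being rounded to three decimals.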