Update README.md
README.md
CHANGED
@@ -57,7 +57,7 @@ snapshot_download(
 allow_patterns = ["*UD-IQ1_S*"], # Select quant type UD-IQ1_S for 1.58bit
 )
 ```
-
+5. Example with Q4_0 K quantized cache **Notice -no-cnv disables auto conversation mode**
 ```bash
 ./llama.cpp/llama-cli \
 --model DeepSeek-R1-GGUF/DeepSeek-R1-UD-IQ1_S/DeepSeek-R1-UD-IQ1_S-00001-of-00003.gguf \
@@ -79,7 +79,7 @@ snapshot_download(
 Is there a scenario where 1 plus 1 wouldn't be 2? I can't think of any...
 ```

-
+6. If you have a GPU (RTX 4090 for example) with 24GB, you can offload multiple layers to the GPU for faster processing. If you have multiple GPUs, you can probably offload more layers.
 ```bash
 ./llama.cpp/llama-cli \
 --model DeepSeek-R1-GGUF/DeepSeek-R1-UD-IQ1_S/DeepSeek-R1-UD-IQ1_S-00001-of-00003.gguf \
@@ -91,7 +91,7 @@ snapshot_download(
 --seed 3407 \
 --prompt "<|User|>Create a Flappy Bird game in Python.<|Assistant|>"
 ```
-
+7. If you want to merge the weights together, use this script:
 ```
 ./llama.cpp/llama-gguf-split --merge \
 DeepSeek-R1-GGUF/DeepSeek-R1-UD-IQ1_S-00001-of-00003.gguf \