nexaml committed · verified
Commit c54b0b7 · 1 Parent(s): 2b800e4

Update README.md

Files changed (1): README.md (+4 −1)
README.md CHANGED
@@ -24,7 +24,10 @@ nexaml/Qwen2.5-Omni-3B-GGUF
  #### Available Quantizations
  | Filename | Quant type | File Size | Split | Description |
  | -------- | ---------- | --------- | ----- | ----------- |
- | | | | | |
+ | [Qwen2.5-Omni-3B-4bit.gguf](https://huggingface.co/nexaml/Qwen2.5-Omni-3B-GGUF/blob/main/Qwen2.5-Omni-3B-4bit.gguf) | 4bit | 2.1 GB | false | Lightweight 4-bit quant for fast inference. |
+ | [Qwen2.5-Omni-3B-Q8_0.gguf](https://huggingface.co/nexaml/Qwen2.5-Omni-3B-GGUF/blob/main/Qwen2.5-Omni-3B-Q8_0.gguf) | Q8_0 | 3.62 GB | false | High-quality 8-bit quantization. |
+ | [Qwen2.5-Omni-3Bq2_k.gguf](https://huggingface.co/nexaml/Qwen2.5-Omni-3B-GGUF/blob/main/Qwen2.5-Omni-3Bq2_k.gguf) | Q2_K | 4 Bytes | false | 2-bit quant. Best for extreme low-resource use. |
+ | [mmproj-Qwen2.5-Omni-3B-Q8_0.gguf](https://huggingface.co/nexaml/Qwen2.5-Omni-3B-GGUF/blob/main/mmproj-Qwen2.5-Omni-3B-Q8_0.gguf) | Q8_0 | 1.54 GB | false | Required vision adapter for Q8_0 model. |

  ## Overview
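The table links above point at the Hub's `blob/` pages, which are HTML views; raw file downloads go through the Hub's standard `resolve/` endpoint instead. A minimal sketch that builds direct-download URLs for the files listed in the table (repo id and filenames taken from the table; the `resolve/main/` convention is standard Hub behavior, not something this repo defines):

```python
# Build direct-download URLs for the quant files listed in the table.
# blob/main/... serves an HTML page; resolve/main/... serves the raw file.
REPO = "nexaml/Qwen2.5-Omni-3B-GGUF"
FILES = [
    "Qwen2.5-Omni-3B-4bit.gguf",
    "Qwen2.5-Omni-3B-Q8_0.gguf",
    "Qwen2.5-Omni-3Bq2_k.gguf",
    "mmproj-Qwen2.5-Omni-3B-Q8_0.gguf",  # vision adapter, needed alongside the Q8_0 model
]
urls = [f"https://huggingface.co/{REPO}/resolve/main/{name}" for name in FILES]
for u in urls:
    print(u)
```

Note that the Q8_0 model and its `mmproj-*` vision adapter are separate files, so both URLs must be fetched for multimodal use.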