Update README.md
README.md
CHANGED
@@ -38,6 +38,7 @@ pipeline_tag: text2text-generation
 # Quantization NVFP4A16
 Quantified from https://huggingface.co/unsloth/Devstral-Small-2507 (due to in-folder tokenizer).
 Compressed with [llm-compressor](https://github.com/vllm-project/llm-compressor).
+
 We recommend cuda capabilities 12.0 hardware (NVIDIA Blackwell: RTX 5000 series GPU, DGX Spark, B200, ...) due to FP4 native acceleration.
 
 # Devstral Small 1.1
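The README text in this diff recommends Blackwell-class hardware for native FP4 acceleration. A minimal sketch of how one might check the local GPU before loading the model is shown here. The helper names `has_native_fp4` and `detect_capability` and the threshold `MIN_FP4_MAJOR` are hypothetical and not part of the README; the threshold assumes that Blackwell GPUs report a compute capability major version of 10 or higher, which should be verified against NVIDIA's documentation for your device.

```python
from typing import Optional, Tuple

# Assumption (not from the README): Blackwell-generation GPUs report a
# compute capability major version of 10 or higher, e.g. 12.0 for the
# RTX 50 series. Adjust this threshold if your hardware differs.
MIN_FP4_MAJOR = 10


def has_native_fp4(capability: Tuple[int, int]) -> bool:
    """Return True if the (major, minor) capability meets the assumed FP4 threshold."""
    major, _minor = capability
    return major >= MIN_FP4_MAJOR


def detect_capability() -> Optional[Tuple[int, int]]:
    """Return the current GPU's compute capability, or None if unavailable."""
    try:
        import torch  # optional dependency; only needed for detection
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    return torch.cuda.get_device_capability()


if __name__ == "__main__":
    cap = detect_capability()
    if cap is None:
        print("No CUDA device detected (or PyTorch is not installed).")
    elif has_native_fp4(cap):
        print(f"Compute capability {cap[0]}.{cap[1]}: native FP4 is expected.")
    else:
        print(f"Compute capability {cap[0]}.{cap[1]}: FP4 weights may run without native acceleration.")
```

The check is split into a pure function and a detection function so that the threshold logic can be exercised on machines without a GPU.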