Add quantized inference model repos. (#5)
Commit: 19d43badbe09c3afb0d61cee36d2b932cae7e677
README.md CHANGED

@@ -104,6 +104,10 @@ The model can be used with the following frameworks;
 - [`vllm (recommended)`](https://github.com/vllm-project/vllm): See [below](#vllm-recommended)
 - [`transformers`](https://github.com/huggingface/transformers): See [below](#transformers)
 
+In addition, the community has prepared quantized versions of the model that can be used with the following frameworks (*alphabetically sorted*):
+- [`llama.cpp`](https://github.com/ggml-org/llama.cpp): https://huggingface.co/mistralai/Magistral-Small-2507-GGUF
+- [`lmstudio` (llama.cpp, MLX)](https://lmstudio.ai/): [GGUF](https://huggingface.co/lmstudio-community/Magistral-Small-2507-GGUF), [MLX-bf16](https://huggingface.co/lmstudio-community/Magistral-Small-2507-MLX-bf16), [MLX-8bit](https://huggingface.co/lmstudio-community/Magistral-Small-2507-MLX-8bit), [MLX-6bit](https://huggingface.co/lmstudio-community/Magistral-Small-2507-MLX-6bit), [MLX-4bit](https://huggingface.co/lmstudio-community/Magistral-Small-2507-MLX-4bit)
+
 ### Training
 
 Fine-tuning is possible with (*alphabetically sorted*):
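For readers who want to try one of the GGUF repos referenced in this commit, the sketch below loads one through `llama-cpp-python` (a Python binding for `llama.cpp`). This is a minimal illustration, not part of the diff: the package choice, the `Q4_K_M` filename glob, the context size, and the prompt are all assumptions; check the repo's file list for the quantization levels actually published.

```python
# Minimal sketch, assuming the `llama-cpp-python` package is installed
# (pip install llama-cpp-python huggingface-hub). The filename glob is an
# assumption; inspect the GGUF repo for the quantizations it actually ships.
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="mistralai/Magistral-Small-2507-GGUF",  # repo added in this commit
    filename="*Q4_K_M.gguf",  # hypothetical quant level; pick one the repo provides
    n_ctx=8192,               # context window; adjust to your available memory
)

# Standard chat-completion call against the locally loaded quantized model.
out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Explain quantization in one sentence."}]
)
print(out["choices"][0]["message"]["content"])
```

The `lmstudio` repos listed above need no code: LM Studio loads the same GGUF files (or, on Apple silicon, the MLX variants) through its GUI.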