Add quantized inference model repos. (#5)
- Add quantized inference model repos. (19d43badbe09c3afb0d61cee36d2b932cae7e677)
README.md
CHANGED
@@ -104,6 +104,10 @@ The model can be used with the following frameworks;
 - [`vllm (recommended)`](https://github.com/vllm-project/vllm): See [below](#vllm-recommended)
 - [`transformers`](https://github.com/huggingface/transformers): See [below](#transformers)
 
+In addition the community has prepared quantized versions of the model that can be used with the following frameworks (*alphabetically sorted*):
+- [`llama.cpp`](https://github.com/ggml-org/llama.cpp): https://huggingface.co/mistralai/Magistral-Small-2507-GGUF
+- [`lmstudio` (llama.cpp, MLX)](https://lmstudio.ai/): [GGUF](https://huggingface.co/lmstudio-community/Magistral-Small-2507-GGUF), [MLX-bf16](https://huggingface.co/lmstudio-community/Magistral-Small-2507-MLX-bf16), [MLX-8bit](https://huggingface.co/lmstudio-community/Magistral-Small-2507-MLX-8bit), [MLX-6bit](https://huggingface.co/lmstudio-community/Magistral-Small-2507-MLX-6bit), [MLX-4bit](https://huggingface.co/lmstudio-community/Magistral-Small-2507-MLX-4bit)
+
 ### Training
 
 Fine-tuning is possible with (*alphabetically sorted*):
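For readers landing on this commit, here is a minimal sketch of loading one of the community GGUF quants linked above. It assumes the third-party `llama-cpp-python` bindings (not part of this change), and the Q4_K_M filename pattern is a hypothetical quant choice, not confirmed by the commit:

```python
# Minimal sketch: load a community GGUF quant of Magistral-Small-2507.
# Assumes `llama-cpp-python` (pip install llama-cpp-python); the Q4_K_M
# pattern is a hypothetical quant choice, not confirmed by this commit.
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="mistralai/Magistral-Small-2507-GGUF",
    filename="*Q4_K_M.gguf",  # glob-matched against the repo's GGUF files
    n_ctx=4096,               # context window for the session
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Summarize what quantization trades off."}],
    max_tokens=128,
)
print(out["choices"][0]["message"]["content"])
```

The lmstudio-community MLX repos serve the same purpose on Apple silicon, whether loaded through LM Studio itself or through MLX tooling.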