danielhanchen committed
Commit 9309703 · verified · 1 Parent(s): 261e91d

Update README.md

Files changed (1)
  1. README.md +572 -3
README.md CHANGED
@@ -1,3 +1,572 @@
- ---
- license: apache-2.0
- ---
---
base_model:
- mistralai/Magistral-Small-2506
- mistralai/Mistral-Small-3.1-24B-Instruct-2503
license: apache-2.0
pipeline_tag: text2text-generation
tags:
- mistral
- unsloth
language:
- en
- fr
- de
- es
- pt
- it
- ja
- ko
- ru
- zh
- ar
- fa
- id
- ms
- ne
- pl
- ro
- sr
- sv
- tr
- uk
- vi
- hi
- bn
---
> [!NOTE]
> Magistral, enhanced with optional Vision support. <br> You should use `--jinja` to enable the system prompt in `llama.cpp`.
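For example, a minimal `llama.cpp` invocation with the chat template enabled might look like the sketch below. The GGUF filename and sampling settings are illustrative assumptions, not official values; substitute whichever quant you actually downloaded.

```bash
# Illustrative sketch only: replace the GGUF path with the file you downloaded.
# --jinja applies the chat template (and system prompt) embedded in the GGUF.
# -ngl 99 offloads all layers to the GPU if one is available.
llama-cli -m Magistral-Small-2506-Q4_K_M.gguf --jinja --temp 0.7 --top-p 0.95 -ngl 99
```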
<div>
  <p style="margin-bottom: 0; margin-top: 0;">
    <strong>Learn to run Magistral correctly - <a href="https://docs.unsloth.ai/basics/magistral">Read our Guide</a>.</strong>
  </p>
  <p style="margin-top: 0;margin-bottom: 0;">
    <em><a href="https://docs.unsloth.ai/basics/unsloth-dynamic-v2.0-gguf">Unsloth Dynamic 2.0</a> achieves SOTA performance in model quantization.</em>
  </p>
  <div style="display: flex; gap: 5px; align-items: center; ">
    <a href="https://github.com/unslothai/unsloth/">
      <img src="https://github.com/unslothai/unsloth/raw/main/images/unsloth%20new%20logo.png" width="133">
    </a>
    <a href="https://discord.gg/unsloth">
      <img src="https://github.com/unslothai/unsloth/raw/main/images/Discord%20button.png" width="173">
    </a>
    <a href="https://docs.unsloth.ai/basics/magistral">
      <img src="https://raw.githubusercontent.com/unslothai/unsloth/refs/heads/main/images/documentation%20green%20button.png" width="143">
    </a>
  </div>
  <h1 style="margin-top: 0rem;">✨ Run & Fine-tune Magistral with Unsloth!</h1>
</div>

- Fine-tune Mistral v0.3 (7B) for free using our Google [Colab notebook here](https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/Mistral_v0.3_(7B)-Conversational.ipynb)!
- Read our blog about Magistral support: [docs.unsloth.ai/basics/devstral](https://docs.unsloth.ai/basics/devstral)
- View the rest of our notebooks in our [docs here](https://docs.unsloth.ai/get-started/unsloth-notebooks).

# Model Card for mistralai/Magistral-Small-2506

Devstral is an agentic LLM for software engineering tasks built under a collaboration between [Mistral AI](https://mistral.ai/) and [All Hands AI](https://www.all-hands.dev/) 🙌. Devstral excels at using tools to explore codebases, editing multiple files, and powering software engineering agents. The model achieves remarkable performance on SWE-bench, which positions it as the #1 open-source model on this [benchmark](#benchmark-results).

It is fine-tuned from [Mistral-Small-3.1](https://huggingface.co/mistralai/Mistral-Small-3.1-24B-Base-2503), so it has a long context window of up to 128k tokens. As a coding agent, Devstral is text-only: the vision encoder was removed before fine-tuning from `Mistral-Small-3.1`.

For enterprises requiring specialized capabilities (increased context, domain-specific knowledge, etc.), we will release commercial models beyond what Mistral AI contributes to the community.

Learn more about Devstral in our [blog post](https://mistral.ai/news/devstral).

## Key Features:
- **Agentic coding**: Devstral is designed to excel at agentic coding tasks, making it a great choice for software engineering agents.
- **Lightweight**: With its compact size of just 24 billion parameters, Devstral is light enough to run on a single RTX 4090 or a Mac with 32 GB of RAM, making it an appropriate model for local deployment and on-device use (see the rough memory estimate after this list).
- **Apache 2.0 License**: Open license allowing usage and modification for both commercial and non-commercial purposes.
- **Context Window**: A 128k context window.
- **Tokenizer**: Utilizes a Tekken tokenizer with a 131k vocabulary size.
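As a rough, back-of-the-envelope check on the memory claim (an illustrative estimate, not an official figure), the weight-only footprint scales with parameter count times bytes per parameter:

```python
# Rough weight-only memory estimate for a 24B-parameter model.
# Ignores KV cache and activation overhead, so real usage is somewhat higher.
params = 24e9
for name, bytes_per_param in [("bf16", 2.0), ("8-bit", 1.0), ("4-bit", 0.5)]:
    gib = params * bytes_per_param / 1024**3
    print(f"{name}: ~{gib:.0f} GiB of weights")
# bf16: ~45 GiB, 8-bit: ~22 GiB, 4-bit: ~11 GiB,
# which is why a 4-bit quant fits on a 24 GB RTX 4090 or a 32 GB Mac.
```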

## Benchmark Results

### SWE-Bench

Devstral achieves a score of 46.8% on SWE-Bench Verified, outperforming prior open-source SoTA by 6%.

| Model            | Scaffold           | SWE-Bench Verified (%) |
|------------------|--------------------|------------------------|
| Devstral         | OpenHands Scaffold | **46.8**               |
| GPT-4.1-mini     | OpenAI Scaffold    | 23.6                   |
| Claude 3.5 Haiku | Anthropic Scaffold | 40.6                   |
| SWE-smith-LM 32B | SWE-agent Scaffold | 40.2                   |

When evaluated under the same test scaffold (OpenHands, provided by All Hands AI 🙌), Devstral exceeds far larger models such as Deepseek-V3-0324 and Qwen3 235B-A22B.

![SWE Benchmark](assets/swe_bench.png)

## Usage

We recommend using Devstral with the [OpenHands](https://github.com/All-Hands-AI/OpenHands/tree/main) scaffold.
You can use it either through our API or by running it locally.

### API
Follow these [instructions](https://docs.mistral.ai/getting-started/quickstart/#account-setup) to create a Mistral account and get an API key.

Then run these commands to start the OpenHands Docker container.
```bash
export MISTRAL_API_KEY=<MY_KEY>

docker pull docker.all-hands.dev/all-hands-ai/runtime:0.39-nikolaik

mkdir -p ~/.openhands-state && echo '{"language":"en","agent":"CodeActAgent","max_iterations":null,"security_analyzer":null,"confirmation_mode":false,"llm_model":"mistral/devstral-small-2505","llm_api_key":"'$MISTRAL_API_KEY'","remote_runtime_resource_factor":null,"github_token":null,"enable_default_condenser":true}' > ~/.openhands-state/settings.json

docker run -it --rm --pull=always \
    -e SANDBOX_RUNTIME_CONTAINER_IMAGE=docker.all-hands.dev/all-hands-ai/runtime:0.39-nikolaik \
    -e LOG_ALL_EVENTS=true \
    -v /var/run/docker.sock:/var/run/docker.sock \
    -v ~/.openhands-state:/.openhands-state \
    -p 3000:3000 \
    --add-host host.docker.internal:host-gateway \
    --name openhands-app \
    docker.all-hands.dev/all-hands-ai/openhands:0.39
```
### Local inference

You can also run the model locally. This can be done with LM Studio or the other providers listed below.

Launch OpenHands: you can now interact with the model served from LM Studio through OpenHands. Start the OpenHands server with Docker:

```bash
docker pull docker.all-hands.dev/all-hands-ai/runtime:0.38-nikolaik
docker run -it --rm --pull=always \
    -e SANDBOX_RUNTIME_CONTAINER_IMAGE=docker.all-hands.dev/all-hands-ai/runtime:0.38-nikolaik \
    -e LOG_ALL_EVENTS=true \
    -v /var/run/docker.sock:/var/run/docker.sock \
    -v ~/.openhands-state:/.openhands-state \
    -p 3000:3000 \
    --add-host host.docker.internal:host-gateway \
    --name openhands-app \
    docker.all-hands.dev/all-hands-ai/openhands:0.38
```

The server will start at http://0.0.0.0:3000. Open it in your browser and you will see the AI Provider Configuration tab.
Now you can start a new conversation with the agent by clicking the plus sign in the left bar.

The model can also be deployed with the following libraries:
- [`LMStudio (recommended for quantized model)`](https://lmstudio.ai/): See [here](#lmstudio)
- [`vllm (recommended)`](https://github.com/vllm-project/vllm): See [here](#vllm)
- [`ollama`](https://github.com/ollama/ollama): See [here](#ollama)
- [`mistral-inference`](https://github.com/mistralai/mistral-inference): See [here](#mistral-inference)
- [`transformers`](https://github.com/huggingface/transformers): See [here](#transformers)

### OpenHands (recommended)

#### Launch a server to deploy Devstral-Small-2505

Make sure you have launched an OpenAI-compatible server such as vLLM or Ollama as described above. Then you can use OpenHands to interact with `Devstral-Small-2505`.

For this tutorial we spun up a vLLM server with the command:
```bash
vllm serve mistralai/Devstral-Small-2505 --tokenizer_mode mistral --config_format mistral --load_format mistral --tool-call-parser mistral --enable-auto-tool-choice --tensor-parallel-size 2
```

The server address should be in the following format: `http://<your-server-url>:8000/v1`
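Before wiring this endpoint into OpenHands, you can sanity-check it with a plain OpenAI-style request. This is a minimal sketch; adjust the host, model name, and bearer token to whatever you used when launching the server.

```bash
# Quick smoke test of the OpenAI-compatible endpoint exposed by vLLM.
curl http://<your-server-url>:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer token" \
  -d '{
        "model": "mistralai/Devstral-Small-2505",
        "messages": [{"role": "user", "content": "Reply with a single word: ready."}]
      }'
```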

#### Launch OpenHands

You can follow the OpenHands installation instructions [here](https://docs.all-hands.dev/modules/usage/installation).

The easiest way to launch OpenHands is to use the Docker image:
```bash
docker pull docker.all-hands.dev/all-hands-ai/runtime:0.38-nikolaik

docker run -it --rm --pull=always \
    -e SANDBOX_RUNTIME_CONTAINER_IMAGE=docker.all-hands.dev/all-hands-ai/runtime:0.38-nikolaik \
    -e LOG_ALL_EVENTS=true \
    -v /var/run/docker.sock:/var/run/docker.sock \
    -v ~/.openhands-state:/.openhands-state \
    -p 3000:3000 \
    --add-host host.docker.internal:host-gateway \
    --name openhands-app \
    docker.all-hands.dev/all-hands-ai/openhands:0.38
```

Then, you can access the OpenHands UI at `http://localhost:3000`.

#### Connect to the server

When accessing the OpenHands UI, you will be prompted to connect to a server. You can use the advanced mode to connect to the server you launched earlier.

Fill in the following fields:
- **Custom Model**: `openai/mistralai/Devstral-Small-2505`
- **Base URL**: `http://<your-server-url>:8000/v1`
- **API Key**: `token` (or any other token you used to launch the server, if any)

#### Use OpenHands powered by Devstral

Now you're ready to use Devstral Small inside OpenHands by **starting a new conversation**. Let's build a To-Do list app.

<details>
<summary>To-Do list app</summary>

1. Let's ask Devstral to generate the app with the following prompt:

```txt
Build a To-Do list app with the following requirements:
- Built using FastAPI and React.
- Make it a one page app that:
   - Allows to add a task.
   - Allows to delete a task.
   - Allows to mark a task as done.
   - Displays the list of tasks.
- Store the tasks in a SQLite database.
```

![Agent prompting](assets/tuto_open_hands/agent_prompting.png)

2. Let's see the result

You should see the agent construct the app and be able to explore the code it generated.

If it doesn't do so automatically, ask Devstral to deploy the app or deploy it manually, and then go to the frontend deployment URL to see the app.

![Agent working](assets/tuto_open_hands/agent_working.png)
![App UI](assets/tuto_open_hands/app_ui.png)

3. Iterate

Now that you have a first result, you can iterate on it by asking your agent to improve it. For example, in the generated app we could click on a task to mark it checked, but a checkbox would improve the UX. You could also ask it to add a feature to edit a task, or to filter tasks by status.

Enjoy building with Devstral Small and OpenHands!

</details>

### LMStudio (recommended for quantized model)
Download the weights from Hugging Face:

```bash
pip install -U "huggingface_hub[cli]"
huggingface-cli download \
    "mistralai/Devstral-Small-2505_gguf" \
    --include "devstralQ4_K_M.gguf" \
    --local-dir "mistralai/Devstral-Small-2505_gguf/"
```

You can serve the model locally with [LMStudio](https://lmstudio.ai/).
* Download [LM Studio](https://lmstudio.ai/) and install it
* Install the `lms` CLI: `~/.lmstudio/bin/lms bootstrap`
* In a bash terminal, run `lms import devstralQ4_K_M.gguf` in the directory where you've downloaded the model checkpoint (e.g. `mistralai/Devstral-Small-2505_gguf`)
* Open the LMStudio application and click the terminal icon to get into the developer tab. Click "Select a model to load" and select "Devstral Q4 K M". Toggle the status button to start the model, and in settings toggle "Serve on Local Network" on.
* On the right tab, you will see an API identifier, which should be `devstralq4_k_m`, and an API address under API Usage. Note this address; we will use it in the next step.
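As a quick sanity check, you can query LM Studio's OpenAI-compatible endpoint directly. This is a hedged sketch: it assumes LM Studio's default local server address `http://localhost:1234/v1`; use the API address shown under API Usage if yours differs.

```bash
# Replace the address with the one noted in the previous step if it differs.
curl http://localhost:1234/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "devstralq4_k_m",
        "messages": [{"role": "user", "content": "Hello!"}]
      }'
```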

Launch OpenHands: you can now interact with the model served from LM Studio through OpenHands. Start the OpenHands server with Docker:

```bash
docker pull docker.all-hands.dev/all-hands-ai/runtime:0.38-nikolaik
docker run -it --rm --pull=always \
    -e SANDBOX_RUNTIME_CONTAINER_IMAGE=docker.all-hands.dev/all-hands-ai/runtime:0.38-nikolaik \
    -e LOG_ALL_EVENTS=true \
    -v /var/run/docker.sock:/var/run/docker.sock \
    -v ~/.openhands-state:/.openhands-state \
    -p 3000:3000 \
    --add-host host.docker.internal:host-gateway \
    --name openhands-app \
    docker.all-hands.dev/all-hands-ai/openhands:0.38
```

Click "see advanced settings" on the second line.
In the new tab, toggle Advanced on. Set the custom model to `mistral/devstralq4_k_m` and the Base URL to the API address noted in the previous step from LM Studio. Set the API Key to `dummy`. Click "Save Changes".

### vLLM (recommended)

We recommend using this model with the [vLLM library](https://github.com/vllm-project/vllm)
to implement production-ready inference pipelines.

**_Installation_**

Make sure you install [`vLLM >= 0.8.5`](https://github.com/vllm-project/vllm/releases/tag/v0.8.5):

```bash
pip install vllm --upgrade
```

Doing so should automatically install [`mistral_common >= 1.5.4`](https://github.com/mistralai/mistral-common/releases/tag/v1.5.4).

To check:
```bash
python -c "import mistral_common; print(mistral_common.__version__)"
```

You can also make use of a ready-to-go [Docker image](https://github.com/vllm-project/vllm/blob/main/Dockerfile) or one from [Docker Hub](https://hub.docker.com/layers/vllm/vllm-openai/latest/images/sha256-de9032a92ffea7b5c007dad80b38fd44aac11eddc31c435f8e52f3b7404bbf39).

#### Server

We recommend that you use Devstral in a server/client setting.

1. Spin up a server:

```bash
vllm serve mistralai/Devstral-Small-2505 --tokenizer_mode mistral --config_format mistral --load_format mistral --tool-call-parser mistral --enable-auto-tool-choice --tensor-parallel-size 2
```

2. To query the server you can use a simple Python snippet.

```py
import requests
import json
from huggingface_hub import hf_hub_download


url = "http://<your-server-url>:8000/v1/chat/completions"
headers = {"Content-Type": "application/json", "Authorization": "Bearer token"}

model = "mistralai/Devstral-Small-2505"


def load_system_prompt(repo_id: str, filename: str) -> str:
    file_path = hf_hub_download(repo_id=repo_id, filename=filename)
    with open(file_path, "r") as file:
        system_prompt = file.read()
    return system_prompt


SYSTEM_PROMPT = load_system_prompt(model, "SYSTEM_PROMPT.txt")

messages = [
    {"role": "system", "content": SYSTEM_PROMPT},
    {
        "role": "user",
        "content": [
            {
                "type": "text",
                "text": "Write a function that computes fibonacci in Python.",
            },
        ],
    },
]

data = {"model": model, "messages": messages, "temperature": 0.15}

response = requests.post(url, headers=headers, data=json.dumps(data))
print(response.json()["choices"][0]["message"]["content"])
```

<details>
<summary>Output</summary>

Certainly! The Fibonacci sequence is a series of numbers where each number is the sum of the two preceding ones, usually starting with 0 and 1. Here's a simple Python function to compute the Fibonacci sequence:

### Iterative Approach
This approach uses a loop to compute the Fibonacci number iteratively.

```python
def fibonacci(n):
    if n <= 0:
        return "Input should be a positive integer."
    elif n == 1:
        return 0
    elif n == 2:
        return 1

    a, b = 0, 1
    for _ in range(2, n):
        a, b = b, a + b
    return b

# Example usage:
print(fibonacci(10))  # Output: 34
```

### Recursive Approach
This approach uses recursion to compute the Fibonacci number. Note that this is less efficient for large `n` due to repeated calculations.

```python
def fibonacci_recursive(n):
    if n <= 0:
        return "Input should be a positive integer."
    elif n == 1:
        return 0
    elif n == 2:
        return 1
    else:
        return fibonacci_recursive(n - 1) + fibonacci_recursive(n - 2)

# Example usage:
print(fibonacci_recursive(10))  # Output: 34
```

### Memoization Approach
This approach uses memoization to store previously computed Fibonacci numbers, making it more efficient than the simple recursive approach.

```python
def fibonacci_memo(n, memo={}):
    if n <= 0:
        return "Input should be a positive integer."
    elif n == 1:
        return 0
    elif n == 2:
        return 1
    elif n in memo:
        return memo[n]

    memo[n] = fibonacci_memo(n - 1, memo) + fibonacci_memo(n - 2, memo)
    return memo[n]

# Example usage:
print(fibonacci_memo(10))  # Output: 34
```

### Dynamic Programming Approach
This approach uses an array to store the Fibonacci numbers up to `n`.

```python
def fibonacci_dp(n):
    if n <= 0:
        return "Input should be a positive integer."
    elif n == 1:
        return 0
    elif n == 2:
        return 1

    fib = [0, 1] + [0] * (n - 2)
    for i in range(2, n):
        fib[i] = fib[i - 1] + fib[i - 2]
    return fib[n - 1]

# Example usage:
print(fibonacci_dp(10))  # Output: 34
```

You can choose any of these approaches based on your needs. The iterative and dynamic programming approaches are generally more efficient for larger values of `n`.

</details>


### Mistral-inference

We recommend using mistral-inference to quickly try out / "vibe-check" Devstral.

#### Install

Make sure to have mistral_inference >= 1.6.0 installed.

```bash
pip install mistral_inference --upgrade
```

#### Download

```python
from huggingface_hub import snapshot_download
from pathlib import Path

mistral_models_path = Path.home().joinpath('mistral_models', 'Devstral')
mistral_models_path.mkdir(parents=True, exist_ok=True)

snapshot_download(repo_id="mistralai/Devstral-Small-2505", allow_patterns=["params.json", "consolidated.safetensors", "tekken.json"], local_dir=mistral_models_path)
```

#### Python

You can run the model using the following command:

```bash
mistral-chat $HOME/mistral_models/Devstral --instruct --max_tokens 300
```

If you prompt it with "Write me a unique and efficient function that computes fibonacci in Python", the model should generate something along the following lines:

<details>
<summary>Output</summary>

Certainly! A common and efficient way to compute Fibonacci numbers is by using memoization to store previously computed values. This avoids redundant calculations and significantly improves performance. Below is a Python function that uses memoization to compute Fibonacci numbers efficiently:

```python
def fibonacci(n, memo=None):
    if memo is None:
        memo = {}

    if n in memo:
        return memo[n]

    if n <= 1:
        return n

    memo[n] = fibonacci(n - 1, memo) + fibonacci(n - 2, memo)
    return memo[n]

# Example usage:
n = 10
print(f"Fibonacci number at position {n} is {fibonacci(n)}")
```

### Explanation:

1. **Base Case**: If `n` is 0 or 1, the function returns `n` because the Fibonacci sequence starts with 0 and 1.
2. **Memoization**: The function uses a dictionary `memo` to store the results of previously computed Fibonacci numbers.
3. **Recursive Case**: For other values of `n`, the function recursively computes the Fibonacci number by summing the results of `fibonacci(n - 1)` and `fibonacci(n - 2)`.

</details>

### Ollama

You can run Devstral using the [Ollama](https://ollama.ai/) CLI.

```bash
ollama run devstral
```
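Once the model is running, you can also hit Ollama's local HTTP API instead of the interactive CLI. This is a minimal sketch, assuming Ollama's default port 11434 and that the `devstral` tag has already been pulled:

```bash
# Non-streaming chat request against the local Ollama server.
curl http://localhost:11434/api/chat -d '{
  "model": "devstral",
  "messages": [{"role": "user", "content": "Write a one-line hello world in Python."}],
  "stream": false
}'
```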

### Transformers

To make the best use of our model with transformers, make sure to have [installed](https://github.com/mistralai/mistral-common) `mistral-common >= 1.5.5` to use our tokenizer.

```bash
pip install mistral-common --upgrade
```

Then load our tokenizer along with the model and generate:

```python
import torch

from mistral_common.protocol.instruct.messages import (
    SystemMessage, UserMessage
)
from mistral_common.protocol.instruct.request import ChatCompletionRequest
from mistral_common.tokens.tokenizers.mistral import MistralTokenizer
from mistral_common.tokens.tokenizers.tekken import SpecialTokenPolicy
from huggingface_hub import hf_hub_download
from transformers import AutoModelForCausalLM


def load_system_prompt(repo_id: str, filename: str) -> str:
    file_path = hf_hub_download(repo_id=repo_id, filename=filename)
    with open(file_path, "r") as file:
        system_prompt = file.read()
    return system_prompt


model_id = "mistralai/Devstral-Small-2505"
tekken_file = hf_hub_download(repo_id=model_id, filename="tekken.json")
SYSTEM_PROMPT = load_system_prompt(model_id, "SYSTEM_PROMPT.txt")

tokenizer = MistralTokenizer.from_file(tekken_file)

model = AutoModelForCausalLM.from_pretrained(model_id)

tokenized = tokenizer.encode_chat_completion(
    ChatCompletionRequest(
        messages=[
            SystemMessage(content=SYSTEM_PROMPT),
            UserMessage(content="Write me a function that computes fibonacci in Python."),
        ],
    )
)

output = model.generate(
    input_ids=torch.tensor([tokenized.tokens]),
    max_new_tokens=1000,
)[0]

decoded_output = tokenizer.decode(output[len(tokenized.tokens):])
print(decoded_output)
```