Update README.md
README.md
CHANGED
@@ -70,7 +70,7 @@ It is finetuned from [Mistral-Small-3.1](https://huggingface.co/mistralai/Mistra
 
 For enterprises requiring specialized capabilities (increased context, domain-specific knowledge, etc.), we will release commercial models beyond what Mistral AI contributes to the community.
 
-Learn more about Devstral in our [blog post](https://mistral.ai/news/devstral).
+Learn more about Devstral in our [blog post](https://mistral.ai/news/devstral-2507).
 
 **Updates compared to [`Devstral Small 1.0`](https://huggingface.co/mistralai/Devstral-Small-2505):**
 - Improved performance, please refer to the [benchmark results](#benchmark-results).
@@ -90,11 +90,11 @@ Learn more about Devstral in our [blog post](https://mistral.ai/news/devstral).
 
 ### SWE-Bench
 
-Devstral Small 1.1 achieves a score of **
+Devstral Small 1.1 achieves a score of **53.6%** on SWE-Bench Verified, outperforming Devstral Small 1.0 by +6.8% and the second-best state-of-the-art model by +11.4%.
 
 | Model | Agentic Scaffold | SWE-Bench Verified (%) |
 |--------------------|--------------------|------------------------|
-| Devstral Small 1.1 | OpenHands Scaffold | **
+| Devstral Small 1.1 | OpenHands Scaffold | **53.6** |
 | Devstral Small 1.0 | OpenHands Scaffold | *46.8* |
 | GPT-4.1-mini | OpenAI Scaffold | 23.6 |
 | Claude 3.5 Haiku | Anthropic Scaffold | 40.6 |
@@ -539,4 +539,4 @@ Finally, the game is ready to be played:
 
 
 
-Don't hesitate to iterate or give more information to Devstral to improve the game!
+Don't hesitate to iterate or give more information to Devstral to improve the game!