danielhanchen committed · Commit 0578b9b · verified · 1 Parent(s): 35de3d9

Update README.md

Files changed (1)
  1. README.md +4 -4
README.md CHANGED
@@ -70,7 +70,7 @@ It is finetuned from [Mistral-Small-3.1](https://huggingface.co/mistralai/Mistra
 
  For enterprises requiring specialized capabilities (increased context, domain-specific knowledge, etc.), we will release commercial models beyond what Mistral AI contributes to the community.
 
- Learn more about Devstral in our [blog post](https://mistral.ai/news/devstral).
+ Learn more about Devstral in our [blog post](https://mistral.ai/news/devstral-2507).
 
  **Updates compared to [`Devstral Small 1.0`](https://huggingface.co/mistralai/Devstral-Small-2505):**
  - Improved performance, please refer to the [benchmark results](#benchmark-results).
@@ -90,11 +90,11 @@ Learn more about Devstral in our [blog post](https://mistral.ai/news/devstral).
 
  ### SWE-Bench
 
- Devstral Small 1.1 achieves a score of **52.4%** on SWE-Bench Verified, outperforming Devstral Small 1.0 by +5,6% and the second best state of the art model by +10.2%.
+ Devstral Small 1.1 achieves a score of **53.6%** on SWE-Bench Verified, outperforming Devstral Small 1.0 by +6,8% and the second best state of the art model by +11.4%.
 
  | Model | Agentic Scaffold | SWE-Bench Verified (%) |
  |--------------------|--------------------|------------------------|
- | Devstral Small 1.1 | OpenHands Scaffold | **52.4** |
+ | Devstral Small 1.1 | OpenHands Scaffold | **53.6** |
  | Devstral Small 1.0 | OpenHands Scaffold | *46.8* |
  | GPT-4.1-mini | OpenAI Scaffold | 23.6 |
  | Claude 3.5 Haiku | Anthropic Scaffold | 40.6 |
@@ -539,4 +539,4 @@ Finally, the game is ready to be played:
 
  ![space invaders pong - game](assets/space_invaders_pong/game.png)
 
- Don't hesitate to iterate or give more information to Devstral to improve the game!
+ Don't hesitate to iterate or give more information to Devstral to improve the game!