Devstral-Small-2507-Rebased-Vision
This model was created by taking Mistral-Small-3.2-24B-Instruct-2506 and replacing the weights under language_model with the weights from Devstral-Small-2507. The result is Devstral with vision capabilities, but you should expect a small quality degradation.
Notes: I used unsloth's uploads of these models for convenience, since they also include some extra files and configs. I didn't name this "-Vision" because it was not trained or finetuned after the weight rebase, and in case a future version by mistralai has vision.
The code will be released soon.
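Until then, here is a minimal sketch of what the weight rebase described above could look like. It is not the actual script: it assumes both checkpoints are already loaded as flat name-to-tensor mappings, that the multimodal checkpoint stores its text weights under a `language_model.` key prefix, and that the text-only checkpoint uses the same names without that prefix. The function name `rebase_language_model` is hypothetical, and real checkpoints may use a different key layout.

```python
from typing import Any, Dict


def rebase_language_model(
    multimodal_state: Dict[str, Any],
    text_state: Dict[str, Any],
    prefix: str = "language_model.",  # assumed key prefix, check your checkpoint
) -> Dict[str, Any]:
    """Return a copy of multimodal_state with its text weights swapped out.

    Everything outside `prefix` (vision tower, projector, ...) is kept as-is.
    """
    donor_keys = {prefix + name for name in text_state}
    target_keys = {k for k in multimodal_state if k.startswith(prefix)}

    # Refuse to merge if the two models do not expose the same text parameters.
    if donor_keys != target_keys:
        mismatched = sorted(donor_keys ^ target_keys)
        raise KeyError(f"parameter names do not line up: {mismatched[:5]}")

    merged = dict(multimodal_state)  # shallow copy, inputs stay untouched
    for name, tensor in text_state.items():
        merged[prefix + name] = tensor
    return merged


if __name__ == "__main__":
    # Toy stand-ins for real tensors.
    base = {"vision_tower.w": "V", "language_model.layer.w": "old"}
    donor = {"layer.w": "new"}
    print(rebase_language_model(base, donor))
```

With real checkpoints it would also be worth checking that each replaced tensor has the same shape and dtype as the one it replaces before saving.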
Evaluation
Evaluation was performed on 7 benchmarks using lm_eval and sglang. Scripts and other details will also be released with the code. This is not a comprehensive evaluation, and it is not directly comparable to the official benchmark numbers from Mistral; the goal was to approximate the quality degradation. Make sure to test on your own downstream tasks!
Model Evaluation Comparison
Here's a comparison of the evaluation results for Devstral-Small-2507 and Devstral-Small-2507-rebased, including the relative loss and the relative standard error for each task:
| Tasks | Metric | Devstral-Small-2507 | Devstral-Small-2507-rebased | Relative Loss (%) | Relative Stderr (%) |
|---|---|---|---|---|---|
| arc_challenge_chat | exact_match | 0.9292 | 0.9283 | 0.10% | ±0.81% |
| eq_bench | eqbench | 72.3376 | 73.7481 | -1.95% | ±3.52% |
| gsm8k | exact_match | 0.8643 | 0.862 | 0.27% | ±1.09% |
| gsm8k | exact_match | 0.8605 | 0.8567 | 0.44% | ±1.10% |
| ifeval | inst_level_loose_acc | 0.6631 | 0.6595 | 0.54% | N/A |
| ifeval | inst_level_strict_acc | 0.6067 | 0.6019 | 0.79% | N/A |
| ifeval | prompt_level_loose_acc | 0.5619 | 0.5545 | 1.32% | ±3.81% |
| ifeval | prompt_level_strict_acc | 0.4917 | 0.4861 | 1.14% | ±4.37% |
| mbpp | pass_at_1 | 0.118 | 0.112 | 5.08% | ±12.20% |
| mmlu_pro | exact_match | 0.5786 | 0.579 | -0.07% | ±0.76% |
| triviaqa | exact_match | 0.7075 | 0.7068 | 0.10% | ±0.48% |
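
The Relative Loss column appears to be the drop from the original score expressed as a percentage of the original score, so a negative value means the rebased model scored higher. This formula is inferred from the table values rather than taken from the evaluation scripts, and the helper name `relative_loss_pct` is hypothetical. A small sketch for reproducing the column or applying it to your own runs:

```python
def relative_loss_pct(baseline: float, rebased: float) -> float:
    """Percentage of the baseline score lost after the rebase.

    Positive: the rebased model scored lower. Negative: it scored higher.
    """
    return (baseline - rebased) / baseline * 100.0


if __name__ == "__main__":
    # Scores copied from the table above.
    print(round(relative_loss_pct(0.118, 0.112), 2))      # mbpp, 5.08
    print(round(relative_loss_pct(72.3376, 73.7481), 2))  # eq_bench, -1.95
```

Most of the losses in the table are smaller than the reported relative standard error for the same task, so individual rows should be read as rough indicators rather than precise measurements.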