MiroMind-M1

🧾 Overview

Figure: Training performance of MiroMind-M1-RL-7B on AIME24 and AIME25.

MiroMind-M1 is a fully open-source series of reasoning language models built on Qwen-2.5, focused on advancing mathematical reasoning. It is trained through supervised fine-tuning (SFT) on 719K curated problems and reinforcement learning with verifiable rewards (RLVR) on 62K challenging examples, using a context-aware multi-stage policy optimization method (CAMPO). MiroMind-M1 achieves state-of-the-art performance among open-source 7B Qwen-2.5-based models on AIME24, AIME25, and MATH500, with all models (MiroMind-M1-SFT-7B, MiroMind-M1-RL-7B, MiroMind-M1-RL-32B), data (MiroMind-M1-SFT-719K, MiroMind-M1-RL-62K), and training setups openly released.
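The released checkpoints load like any Hugging Face causal LM. Below is a minimal inference sketch; the 7B repository ID and the generation settings are illustrative assumptions (the 32B ID `miromind-ai/MiroMind-M1-RL-32B` appears in the Resources section), not the authors' recommended configuration.

```python
# Minimal inference sketch (assumptions: standard HF chat-model loading applies;
# sampling settings below are illustrative, not an official recommendation).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "miromind-ai/MiroMind-M1-RL-7B"  # or "miromind-ai/MiroMind-M1-RL-32B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

messages = [
    {"role": "user", "content": "Find all real x such that x^2 - 5x + 6 = 0."}
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Long reasoning traces need a generous token budget.
outputs = model.generate(
    inputs, max_new_tokens=4096, do_sample=True, temperature=0.6, top_p=0.95
)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```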

📊 Evaluation

MiroMind-M1-SFT

| Model | Initial Checkpoint | AIME24 (avg@64) | AIME25 (avg@64) | MATH500 (avg@5) |
|---|---|---|---|---|
| DeepSeek-R1-Distill | Qwen2.5-Math-7B | 55.5 | 40.4† | 92.8 |
| OpenThoughts | Qwen2.5-7B-Instruct | 31.3 | 23.3 | 83.2 |
| Open-R1 | Qwen2.5-Math-7B-Instruct | 36.7 | 40.0 | 90.6 |
| Synthetic-1 | Qwen2.5-7B-Instruct | 30.0 | 26.6 | 85.6 |
| MiroMind-M1-SFT-7B | Qwen2.5-Math-7B | 60.4 | 45.0 | 94.6 |

† The AIME25 score of DeepSeek-R1-Distill is from our own evaluation.

MiroMind-M1-RL

| Model | AIME24 (avg@64) | AIME25 (avg@64) | MATH500 (avg@5) |
|---|---|---|---|
| DeepSeek-R1 | 79.8 | 70.0 | – |
| DeepSeek-R1-0528 | 91.4 | 87.5 | – |
| Qwen3-8B | 76.0 | 67.3 | – |
| DeepSeek-R1-0528-Qwen3-8B | 86.0 | 76.3 | – |
| **32B models trained from the Qwen2.5 series** | | | |
| DeepSeek-R1-Distill-Qwen-32B | 70.8 | 52.1 | 95.8 |
| Skywork-OR1-32B-Preview | 77.1 | 68.2 | 97.5 |
| MiroMind-M1-RL-32B | 77.5 | 65.6 | 96.4 |
| **7B models trained from the Qwen2.5 series** | | | |
| DeepSeek-R1-Distill-Qwen-7B | 55.5 | 39.2 | – |
| MiroMind-M1-SFT-7B | 60.4 | 45.0 | 94.6 |
| Light-R1-7B-DS | 59.1 | 44.3 | – |
| Skywork-OR1-7B | 72.2 | 54.6 | – |
| MiroMind-M1-RL-7B | 73.4 | 57.8 | 96.7 |
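The avg@k columns above report accuracy averaged over k sampled generations per problem (64 for AIME, 5 for MATH500). A small sketch of that aggregation, under the assumption that avg@k is the plain mean of per-sample correctness:

```python
# Sketch of the assumed avg@k aggregation: average correctness over k sampled
# generations per problem, then average across problems.
from statistics import mean

def avg_at_k(per_problem_correct: list[list[bool]]) -> float:
    """per_problem_correct[i][j] is True if sample j for problem i is correct."""
    return mean(mean(samples) for samples in per_problem_correct)

# Example: 2 problems, k = 4 samples each.
results = [
    [True, True, False, True],    # problem 1: 3/4 correct
    [False, False, True, False],  # problem 2: 1/4 correct
]
print(f"avg@4 = {avg_at_k(results):.3f}")  # 0.500
```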

🔗 Resources

Models

- MiroMind-M1-SFT-7B
- MiroMind-M1-RL-7B
- MiroMind-M1-RL-32B

Data

- MiroMind-M1-SFT-719K
- MiroMind-M1-RL-62K
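Both corpora should be loadable with the 🤗 `datasets` library. The sketch below assumes the repository IDs mirror the dataset names under the `miromind-ai` organization and that a `train` split exists; inspect the schema before wiring up a training loop.

```python
# Sketch: loading the released training corpora.
# Assumptions: repo IDs follow the dataset names under the miromind-ai org,
# and each dataset exposes a "train" split.
from datasets import load_dataset

sft_data = load_dataset("miromind-ai/MiroMind-M1-SFT-719K", split="train")
rl_data = load_dataset("miromind-ai/MiroMind-M1-RL-62K", split="train")

print(sft_data)              # dataset size and column names
print(rl_data.column_names)  # inspect the schema first; column names may differ
```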
