Update README.md
README.md
CHANGED
@@ -15,14 +15,16 @@ For more details, please refer to our blog
 - [English Version](https://krafton-ai.github.io/blog/llm_post_training_en/)
 - [Korean Version](https://krafton-ai.github.io/blog/llm_post_training_kr/)
 
+Additionally, we have released the code used for training
+- [Github](https://github.com/krafton-ai/Offline-GRPO)
 
 ## Results
 
 | Model | Method | AIME25 | AMC23 | LiveCodeBench | GPQA-Diamond | IFEval |
 |--------------------------------|--------------------------------|--------|-------|---------------|--------------|--------|
-| Openthinker3-7B | Base | 57.
-| | Offline GRPO (+bias)
-| Openthinker2-7B | Base | 39.792 | 88.633 | 56.115 | 45.833 | 53.
-| | Offline GRPO (+bias)
-| AceReason-Nemetron-1.1-7B | Base | 64.635 | 92.
-| | Offline GRPO (+bias)
+| Openthinker3-7B | Base | 57.292 | 92.617 | 63.968 | 50.947 | 50.09 |
+| | Offline GRPO (+bias) | 59.532 | 93.516 | 65.435 | 50.947 | 51.14 |
+| Openthinker2-7B | Base | 39.792 | 88.633 | 56.115 | 45.833 | 53.30 |
+| | Offline GRPO (+bias) | 40.573 | 88.359 | 56.115 | 46.717 | 52.62 |
+| AceReason-Nemetron-1.1-7B | Base | 64.635 | 92.930 | 72.383 | 52.462 | 36.02 |
+| | Offline GRPO (+bias) | 65.573 | 93.203 | 71.673 | 52.146 | 37.38 |
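
The Results table pairs each model's Base row with an Offline GRPO (+bias) row, so the per-benchmark change is a simple subtraction. The sketch below is not part of the README or the released training code. It copies the scores from the added rows and computes the differences. The names `BENCHMARKS`, `RESULTS`, and `deltas` are hypothetical, chosen only for this illustration, and model names are spelled exactly as they appear in the table.

```python
# Scores copied from the added rows of the Results table.
# BENCHMARKS, RESULTS and deltas are illustrative names, not from the repository.
BENCHMARKS = ["AIME25", "AMC23", "LiveCodeBench", "GPQA-Diamond", "IFEval"]

RESULTS = {
    "Openthinker3-7B": {
        "Base": [57.292, 92.617, 63.968, 50.947, 50.09],
        "Offline GRPO (+bias)": [59.532, 93.516, 65.435, 50.947, 51.14],
    },
    "Openthinker2-7B": {
        "Base": [39.792, 88.633, 56.115, 45.833, 53.30],
        "Offline GRPO (+bias)": [40.573, 88.359, 56.115, 46.717, 52.62],
    },
    "AceReason-Nemetron-1.1-7B": {
        "Base": [64.635, 92.930, 72.383, 52.462, 36.02],
        "Offline GRPO (+bias)": [65.573, 93.203, 71.673, 52.146, 37.38],
    },
}


def deltas(model):
    """Return {benchmark: tuned score minus base score} for one model."""
    base = RESULTS[model]["Base"]
    tuned = RESULTS[model]["Offline GRPO (+bias)"]
    # Round to three decimals to hide floating-point noise in the subtraction.
    return {name: round(t - b, 3) for name, b, t in zip(BENCHMARKS, base, tuned)}


if __name__ == "__main__":
    for model in RESULTS:
        print(model, deltas(model))
```

A positive value means the Offline GRPO (+bias) row scores higher than Base on that benchmark, a negative value means it scores lower, and zero means the two rows report the same number.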