Elliott committed on
Commit 3e1c7ed · verified · 1 Parent(s): 92255c9

Update README.md

Files changed (1)
  1. README.md +0 -3
README.md CHANGED
@@ -49,9 +49,6 @@ print(outputs[0].outputs[0].text)
 
 # 📃Evaluation
 
-LUFFY is evaluated on six competition-level benchmarks, achieving state-of-the-art results among all zero-RL methods. It surpasses both on-policy RL and imitation learning (SFT), especially in generalization:
-
-## LUFFY on Qwen2.5-Instruct-7B
 | **Model** | **AIME 2024** | **AIME 2025** | **AMC** | **MATH-500** | **Minerva** | **Olympiad** | **Avg.** |
 |-----------------------------------|-------------|-------------|---------|---------------|-------------|---------------|----------|
 | Qwen2.5-7B-Instruct | 11.9 | 7.6 | 44.1 | 74.6 | 30.5 | 39.7 | 34.7 |