Evaluation metrics
#20 by BounharAbdelaziz - opened
Hey,
Thank you once again for sharing the results of your efforts.
A small question regarding the evaluation metrics: for AIME25 and lcbv4 in particular, are they pass@1 and pass@1:maj@16, respectively?
I couldn’t find this info. Many thanks in advance!!
Perfecto!
I am using Lighteval too. Thanks for such a fast and easy framework!
Another small question: Why didn't you include AIME24 in your evals? Was it part of the training data?
Many thanks again =)
The main reason was that AIME25 is more recent than AIME24, so we prioritized evaluating on the most recent version.
Great info, thanks!
lewtun changed discussion status to closed