OpenEvals

community

AI & ML interests

LLM evaluation

Recent Activity

SaylorTwift updated a Space 31 minutes ago
OpenEvals/open_benchmark_index
clefourrier updated a Space 14 days ago
OpenEvals/InferenceProviderTesting
SaylorTwift updated a Space 15 days ago
OpenEvals/evals

OpenEvals's collections (5)

Research collaborations
A small overview of our research collabs through the years
Archived Open LLM Leaderboard (2024-2025)
This leaderboard has been evaluating LLMs since June 2024 on IFEval, MuSR, GPQA, MATH, BBH, and MMLU-Pro.