InfLLM-V2: Dense-Sparse Switchable Attention for Seamless Short-to-Long Adaptation Paper • 2509.24663 • Published Sep 29, 2025 • 14
BigCodeBench Evaluator 🥇 • 18 • Evaluate code samples using specified parameters
RAVine: Reality-Aligned Evaluation for Agentic Search Paper • 2507.16725 • Published Jul 22, 2025 • 29
Open LLM Leaderboard 🏆 • 13.8k • Track, rank and evaluate open LLMs and chatbots