DR Tulu: Reinforcement Learning with Evolving Rubrics for Deep Research Paper • 2511.19399 • Published 9 days ago • 53
Spatial-SSRL: Enhancing Spatial Understanding via Self-Supervised Reinforcement Learning Paper • 2510.27606 • Published Oct 31 • 27
Video-Thinker: Sparking "Thinking with Videos" via Reinforcement Learning Paper • 2510.23473 • Published Oct 27 • 83
JanusCoder: Towards a Foundational Visual-Programmatic Interface for Code Intelligence Paper • 2510.23538 • Published Oct 27 • 95
VideoCanvas: Unified Video Completion from Arbitrary Spatiotemporal Patches via In-Context Conditioning Paper • 2510.08555 • Published Oct 9 • 63
MM-HELIX: Boosting Multimodal Long-Chain Reflective Reasoning with Holistic Platform and Adaptive Hybrid Policy Optimization Paper • 2510.08540 • Published Oct 9 • 108
MCPMark: A Benchmark for Stress-Testing Realistic and Comprehensive MCP Use Paper • 2509.24002 • Published Sep 28 • 172
CMPhysBench: A Benchmark for Evaluating Large Language Models in Condensed Matter Physics Paper • 2508.18124 • Published Aug 25 • 48
InternVL3.5: Advancing Open-Source Multimodal Models in Versatility, Reasoning, and Efficiency Paper • 2508.18265 • Published Aug 25 • 207
CompassVerifier: A Unified and Robust Verifier for LLMs Evaluation and Outcome Reward Paper • 2508.03686 • Published Aug 5 • 37
The Imitation Game: Turing Machine Imitator is Length Generalizable Reasoner Paper • 2507.13332 • Published Jul 17 • 48
LayerCake: Token-Aware Contrastive Decoding within Large Language Model Layers Paper • 2507.04404 • Published Jul 6 • 21
Open LMM Spatial Leaderboard: A Leaderboard for LMM Spatial Understanding Capabilities Space • Running • 6
CompassJudger-2: Towards Generalist Judge Model via Verifiable Rewards Paper • 2507.09104 • Published Jul 12 • 17