MiniCPM4 • Collection • MiniCPM4: Ultra-Efficient LLMs on End Devices • 29 items • Updated Sep 8, 2025 • 82
The Entropy Mechanism of Reinforcement Learning for Reasoning Language Models • Paper • 2505.22617 • Published May 28, 2025 • 131
Challenging the Boundaries of Reasoning: An Olympiad-Level Math Benchmark for Large Language Models • Paper • 2503.21380 • Published Mar 27, 2025 • 38
ETVA: Evaluation of Text-to-Video Alignment via Fine-grained Question Generation and Answering • Paper • 2503.16867 • Published Mar 21, 2025 • 11
An Empirical Study on Eliciting and Improving R1-like Reasoning Models • Paper • 2503.04548 • Published Mar 6, 2025 • 8
REINFORCE++: A Simple and Efficient Approach for Aligning Large Language Models • Paper • 2501.03262 • Published Jan 4, 2025 • 103