LLaDA2.1: Speeding Up Text Diffusion via Token Editing Paper • 2602.08676 • Published 8 days ago • 65
OPE: Overcoming Information Saturation in Parallel Thinking via Outline-Guided Path Exploration Paper • 2602.08344 • Published 8 days ago • 5
ToolSafe: Enhancing Tool Invocation Safety of LLM-based agents via Proactive Step-level Guardrail and Feedback Paper • 2601.10156 • Published Jan 15 • 26
Bottom-up Policy Optimization: Your Language Model Policy Secretly Contains Internal Policies Paper • 2512.19673 • Published Dec 22, 2025 • 64
DEER: Draft with Diffusion, Verify with Autoregressive Models Paper • 2512.15176 • Published Dec 17, 2025 • 44
Rethinking the Sampling Criteria in Reinforcement Learning for LLM Reasoning: A Competence-Difficulty Alignment Perspective Paper • 2505.17652 • Published May 23, 2025 • 6
Efficient Reinforcement Finetuning via Adaptive Curriculum Learning Paper • 2504.05520 • Published Apr 7, 2025 • 11
SimpleRL-Zoo: Investigating and Taming Zero Reinforcement Learning for Open Base Models in the Wild Paper • 2503.18892 • Published Mar 24, 2025 • 31
Stop Overthinking: A Survey on Efficient Reasoning for Large Language Models Paper • 2503.16419 • Published Mar 20, 2025 • 77
DAPO: An Open-Source LLM Reinforcement Learning System at Scale Paper • 2503.14476 • Published Mar 18, 2025 • 144
SampleMix: A Sample-wise Pre-training Data Mixing Strategey by Coordinating Data Quality and Diversity Paper • 2503.01506 • Published Mar 3, 2025 • 10
LLMs Can Easily Learn to Reason from Demonstrations Structure, not content, is what matters! Paper • 2502.07374 • Published Feb 11, 2025 • 40
Dolphin: Long Context as a New Modality for Energy-Efficient On-Device Language Models Paper • 2408.15518 • Published Aug 28, 2024 • 42
Code Evaluation Collection Collection of Papers on Code Evaluation (from code generation language models) • 45 items • Updated Oct 29, 2024 • 16
Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling Paper • 2406.07522 • Published Jun 11, 2024 • 40