Reasoning with Sampling: Your Base Model is Smarter Than You Think Paper • 2510.14901 • Published Oct 16, 2025 • 47
Specialization after Generalization: Towards Understanding Test-Time Training in Foundation Models Paper • 2509.24510 • Published Sep 29, 2025 • 3
Learning to Align, Aligning to Learn: A Unified Approach for Self-Optimized Alignment Paper • 2508.07750 • Published Aug 11, 2025 • 20
MaPPO: Maximum a Posteriori Preference Optimization with Prior Knowledge Paper • 2507.21183 • Published Jul 27, 2025 • 14
Exploring the Latent Capacity of LLMs for One-Step Text Generation Paper • 2505.21189 • Published May 27, 2025 • 61
Guided by Gut: Efficient Test-Time Scaling with Reinforced Intrinsic Confidence Paper • 2505.20325 • Published May 23, 2025 • 46
GraLoRA: Granular Low-Rank Adaptation for Parameter-Efficient Fine-Tuning Paper • 2505.20355 • Published May 26, 2025 • 36
Reinforcement Learning Finetunes Small Subnetworks in Large Language Models Paper • 2505.11711 • Published May 16, 2025 • 11