CRaSh: Clustering, Removing, and Sharing Enhance Fine-tuning without Full Large Language Model • Paper • 2310.15477 • Published Oct 24, 2023
Critical Data Size of Language Models from a Grokking Perspective • Paper • 2401.10463 • Published Jan 19, 2024 • 1
UltraMedical: Building Specialized Generalists in Biomedicine • Paper • 2406.03949 • Published Jun 6, 2024
MedXpertQA: Benchmarking Expert-Level Medical Reasoning and Understanding • Paper • 2501.18362 • Published Jan 30, 2025 • 23
DriveMoE: Mixture-of-Experts for Vision-Language-Action Model in End-to-End Autonomous Driving • Paper • 2505.16278 • Published May 22, 2025 • 5
Towards a Unified View of Large Language Model Post-Training • Paper • 2509.04419 • Published Sep 4, 2025 • 75
A Survey of Reinforcement Learning for Large Reasoning Models • Paper • 2509.08827 • Published Sep 10, 2025 • 190
SimpleVLA-RL: Scaling VLA Training via Reinforcement Learning • Paper • 2509.09674 • Published Sep 11, 2025 • 80
FlowRL: Matching Reward Distributions for LLM Reasoning • Paper • 2509.15207 • Published Sep 18, 2025 • 114
Seek in the Dark: Reasoning via Test-Time Instance-Level Policy Gradient in Latent Space • Paper • 2505.13308 • Published May 19, 2025 • 27
Technologies on Effectiveness and Efficiency: A Survey of State Spaces Models • Paper • 2503.11224 • Published Mar 14, 2025 • 28
PaD: Program-aided Distillation Specializes Large Models in Reasoning • Paper • 2305.13888 • Published May 23, 2023