NeuralOS: Towards Simulating Operating Systems via Neural Generative Models Paper • 2507.08800 • Published 19 days ago • 75
Reasoning or Memorization? Unreliable Results of Reinforcement Learning Due to Data Contamination Paper • 2507.10532 • Published 16 days ago • 80
Skip a Layer or Loop it? Test-Time Depth Adaptation of Pretrained LLMs Paper • 2507.07996 • Published 20 days ago • 31
GUI-Reflection: Empowering Multimodal GUI Models with Self-Reflection Behavior Paper • 2506.08012 • Published Jun 9 • 7
BitVLA: 1-bit Vision-Language-Action Models for Robotics Manipulation Paper • 2506.07530 • Published Jun 9 • 20
LLaDA-V: Large Language Diffusion Models with Visual Instruction Tuning Paper • 2505.16933 • Published May 22 • 33
LaViDa: A Large Diffusion Language Model for Multimodal Understanding Paper • 2505.16839 • Published May 22 • 12
Reinforcement Learning for Reasoning in Large Language Models with One Training Example Paper • 2504.20571 • Published Apr 29 • 97
Does Reinforcement Learning Really Incentivize Reasoning Capacity in LLMs Beyond the Base Model? Paper • 2504.13837 • Published Apr 18 • 132
InfiGUI-R1: Advancing Multimodal GUI Agents from Reactive Actors to Deliberative Reasoners Paper • 2504.14239 • Published Apr 19 • 13
Understanding R1-Zero-Like Training: A Critical Perspective Paper • 2503.20783 • Published Mar 26 • 55
Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models Paper • 2503.09573 • Published Mar 12 • 72
Frac-Connections: Fractional Extension of Hyper-Connections Paper • 2503.14125 • Published Mar 18 • 21
HybridNorm: Towards Stable and Efficient Transformer Training via Hybrid Normalization Paper • 2503.04598 • Published Mar 6 • 20
Cognitive Behaviors that Enable Self-Improving Reasoners, or, Four Habits of Highly Effective STaRs Paper • 2503.01307 • Published Mar 3 • 39