F1: A Vision-Language-Action Model Bridging Understanding and Generation to Actions • Paper • 2509.06951 • Published 3 days ago • 22
From reactive to cognitive: brain-inspired spatial intelligence for embodied agents • Paper • 2508.17198 • Published 18 days ago • 8
Matrix-Game 2.0: An Open-Source, Real-Time, and Streaming Interactive World Model • Paper • 2508.13009 • Published 24 days ago • 24
VertexRegen: Mesh Generation with Continuous Level of Detail • Paper • 2508.09062 • Published 30 days ago • 35
Matrix-3D: Omnidirectional Explorable 3D World Generation • Paper • 2508.08086 • Published about 1 month ago • 70
Part I: Tricks or Traps? A Deep Dive into RL for LLM Reasoning • Paper • 2508.08221 • Published about 1 month ago • 45
A Comprehensive Survey of Self-Evolving AI Agents: A New Paradigm Bridging Foundation Models and Lifelong Agentic Systems • Paper • 2508.07407 • Published Aug 10 • 92
On the Generalization of SFT: A Reinforcement Learning Perspective with Reward Rectification • Paper • 2508.05629 • Published Aug 7 • 175
TokensGen: Harnessing Condensed Tokens for Long Video Generation • Paper • 2507.15728 • Published Jul 21 • 7
Diffuman4D: 4D Consistent Human View Synthesis from Sparse-View Videos with Spatio-Temporal Diffusion Models • Paper • 2507.13344 • Published Jul 17 • 56
A Survey on Vision-Language-Action Models for Autonomous Driving • Paper • 2506.24044 • Published Jun 30 • 14
DreamVLA: A Vision-Language-Action Model Dreamed with Comprehensive World Knowledge • Paper • 2507.04447 • Published Jul 6 • 43
Thinking Beyond Tokens: From Brain-Inspired Intelligence to Cognitive Foundations for Artificial General Intelligence and its Societal Impact • Paper • 2507.00951 • Published Jul 1 • 23
V-JEPA 2: Self-Supervised Video Models Enable Understanding, Prediction and Planning • Paper • 2506.09985 • Published Jun 11 • 30