Decoupled DMD: CFG Augmentation as the Spear, Distribution Matching as the Shield • Paper • 2511.22677 • Published 8 days ago • 18
Nemotron-Flash: Towards Latency-Optimal Hybrid Small Language Models • Paper • 2511.18890 • Published 11 days ago • 28
GR-RL: Going Dexterous and Precise for Long-Horizon Robotic Manipulation • Paper • 2512.01801 • Published 4 days ago • 22
Flash-DMD: Towards High-Fidelity Few-Step Image Generation with Efficient Distillation and Joint Reinforcement Learning • Paper • 2511.20549 • Published 10 days ago • 23
Every Token Counts: Generalizing 16M Ultra-Long Context in Large Language Models • Paper • 2511.23319 • Published 7 days ago • 21
CaptionQA: Is Your Caption as Useful as the Image Itself? • Paper • 2511.21025 • Published 9 days ago • 24
Architecture Decoupling Is Not All You Need For Unified Multimodal Model • Paper • 2511.22663 • Published 8 days ago • 28
TUNA: Taming Unified Visual Representations for Native Unified Multimodal Models • Paper • 2512.02014 • Published 4 days ago • 47
Infinity-RoPE: Action-Controllable Infinite Video Generation Emerges From Autoregressive Self-Rollout • Paper • 2511.20649 • Published 10 days ago • 43
The Consistency Critic: Correcting Inconsistencies in Generated Images via Reference-Guided Attentive Alignment • Paper • 2511.20614 • Published 10 days ago • 37
AnyTalker: Scaling Multi-Person Talking Video Generation with Interactivity Refinement • Paper • 2511.23475 • Published 7 days ago • 40
What about gravity in video generation? Post-Training Newton's Laws with Verifiable Rewards • Paper • 2512.00425 • Published 6 days ago • 44
REASONEDIT: Towards Reasoning-Enhanced Image Editing Models • Paper • 2511.22625 • Published 8 days ago • 45
DeepSeekMath-V2: Towards Self-Verifiable Mathematical Reasoning • Paper • 2511.22570 • Published 8 days ago • 62