MinT: Managed Infrastructure for Training and Serving Millions of LLMs Paper • 2605.13779 • Published 2 days ago • 144
UniVidX: A Unified Multimodal Framework for Versatile Video Generation via Diffusion Priors Paper • 2605.00658 • Published 14 days ago • 81
Uni-ViGU: Towards Unified Video Generation and Understanding via A Diffusion-Based Video Generator Paper • 2604.08121 • Published Apr 9 • 43 • 4
Uni-ViGU: Towards Unified Video Generation and Understanding via A Diffusion-Based Video Generator Paper • 2604.08121 • Published Apr 9 • 43
Uni-ViGU: Towards Unified Video Generation and Understanding via A Diffusion-Based Video Generator Paper • 2604.08121 • Published Apr 9 • 43
Kiwi-Edit: Versatile Video Editing via Instruction and Reference Guidance Paper • 2603.02175 • Published Mar 2 • 24
Beyond Language Modeling: An Exploration of Multimodal Pretraining Paper • 2603.03276 • Published Mar 3 • 104
PaperBanana: Automating Academic Illustration for AI Scientists Paper • 2601.23265 • Published Jan 30 • 227
LongVie 2: Multimodal Controllable Ultra-Long Video World Model Paper • 2512.13604 • Published Dec 15, 2025 • 76