Rethinking Prompt Design for Inference-time Scaling in Text-to-Visual Generation Paper • 2512.03534 • Published 2 days ago • 16
PixelDiT: Pixel Diffusion Transformers for Image Generation Paper • 2511.20645 • Published 9 days ago • 23
MultiShotMaster: A Controllable Multi-Shot Video Generation Framework Paper • 2512.03041 • Published 2 days ago • 56
ToolOrchestra: Elevating Intelligence via Efficient Model and Tool Orchestration Paper • 2511.21689 • Published 8 days ago • 88
DeepSeek-V3.2: Pushing the Frontier of Open Large Language Models Paper • 2512.02556 • Published 3 days ago • 155
Infinity-RoPE: Action-Controllable Infinite Video Generation Emerges From Autoregressive Self-Rollout Paper • 2511.20649 • Published 9 days ago • 43
First Frame Is the Place to Go for Video Content Customization Paper • 2511.15700 • Published 15 days ago • 52
A Style is Worth One Code: Unlocking Code-to-Style Image Generation with Discrete Style Space Paper • 2511.10555 • Published 21 days ago • 60
MMaDA-Parallel: Multimodal Large Diffusion Language Models for Thinking-Aware Editing and Generation Paper • 2511.09611 • Published 22 days ago • 68
One Small Step in Latent, One Giant Leap for Pixels: Fast Latent Upscale Adapter for Your Diffusion Models Paper • 2511.10629 • Published 21 days ago • 120
Depth Anything 3: Recovering the Visual Space from Any Views Paper • 2511.10647 • Published 21 days ago • 92
Time-to-Move: Training-Free Motion Controlled Video Generation via Dual-Clock Denoising Paper • 2511.08633 • Published 25 days ago • 53
Tiny Model, Big Logic: Diversity-Driven Optimization Elicits Large-Model Reasoning Ability in VibeThinker-1.5B Paper • 2511.06221 • Published 26 days ago • 128