SoulX-Singer: Towards High-Quality Zero-Shot Singing Voice Synthesis Paper • 2602.07803 • Published 4 days ago • 3
LLaDA2.1: Speeding Up Text Diffusion via Token Editing Paper • 2602.08676 • Published 2 days ago • 54
MOVA: Towards Scalable and Synchronized Video-Audio Generation Paper • 2602.08794 • Published 2 days ago • 142
Flow Matching Meets PDEs: A Unified Framework for Physics-Constrained Generation Paper • 2506.08604 • Published Jun 10, 2025 • 1
Quant VideoGen: Auto-Regressive Long Video Generation via 2-Bit KV-Cache Quantization Paper • 2602.02958 • Published 9 days ago • 32
Context Forcing: Consistent Autoregressive Video Generation with Long Context Paper • 2602.06028 • Published 6 days ago • 34
Training Design for Text-to-Image Models: Lessons from Ablations Article • Published 8 days ago • 55
FasterCache: Training-Free Video Diffusion Model Acceleration with High Quality Paper • 2410.19355 • Published Oct 25, 2024 • 24
Diversity-Preserved Distribution Matching Distillation for Fast Visual Synthesis Paper • 2602.03139 • Published 9 days ago • 41
3D-Aware Implicit Motion Control for View-Adaptive Human Video Generation Paper • 2602.03796 • Published 8 days ago • 55
Green-VLA: Staged Vision-Language-Action Model for Generalist Robots Paper • 2602.00919 • Published 11 days ago • 268
PISCES: Annotation-free Text-to-Video Post-Training via Optimal Transport-Aligned Rewards Paper • 2602.01624 • Published 10 days ago • 23
M-ErasureBench: A Comprehensive Multimodal Evaluation Benchmark for Concept Erasure in Diffusion Models Paper • 2512.22877 • Published Dec 28, 2025 • 2
JUST-DUB-IT: Video Dubbing via Joint Audio-Visual Diffusion Paper • 2601.22143 • Published 13 days ago • 6
UPLiFT: Efficient Pixel-Dense Feature Upsampling with Local Attenders Paper • 2601.17950 • Published 17 days ago • 4
SALAD: Achieve High-Sparsity Attention via Efficient Linear Attention Tuning for Video Diffusion Transformer Paper • 2601.16515 • Published 20 days ago • 15
VideoMaMa: Mask-Guided Video Matting via Generative Prior Paper • 2601.14255 • Published 22 days ago • 15
HERMES: KV Cache as Hierarchical Memory for Efficient Streaming Video Understanding Paper • 2601.14724 • Published 22 days ago • 74