DesignLab: Designing Slides Through Iterative Detection and Correction Paper • 2507.17202 • Published 8 days ago • 46
SphereDiff: Tuning-free Omnidirectional Panoramic Image and Video Generation via Spherical Latent Representation Paper • 2504.14396 • Published Apr 19 • 28
VideoAnydoor: High-fidelity Video Object Insertion with Precise Motion Control Paper • 2501.01427 • Published Jan 2 • 55
Reconstruction vs. Generation: Taming Optimization Dilemma in Latent Diffusion Models Paper • 2501.01423 • Published Jan 2 • 44
GS-DiT: Advancing Video Generation with Pseudo 4D Gaussian Fields through Efficient Dense 3D Point Tracking Paper • 2501.02690 • Published Jan 5 • 17
STAR: Spatial-Temporal Augmentation with Text-to-Video Models for Real-World Video Super-Resolution Paper • 2501.02976 • Published Jan 6 • 56
Spatiotemporal Skip Guidance for Enhanced Video Diffusion Sampling Paper • 2411.18664 • Published Nov 27, 2024 • 24
Transfusion: Predict the Next Token and Diffuse Images with One Multi-Modal Model Paper • 2408.11039 • Published Aug 20, 2024 • 62