AnyCap Project: A Unified Framework, Dataset, and Benchmark for Controllable Omni-modal Captioning Paper • 2507.12841 • Published 14 days ago • 39
SimpleGVR: A Simple Baseline for Latent-Cascaded Video Super-Resolution Paper • 2506.19838 • Published Jun 24 • 12
Scaling Image and Video Generation via Test-Time Evolutionary Search Paper • 2505.17618 • Published May 23 • 42
Any2Caption: Interpreting Any Condition to Caption for Controllable Video Generation Paper • 2503.24379 • Published Mar 31 • 77
ReCamMaster: Camera-Controlled Generative Rendering from A Single Video Paper • 2503.11647 • Published Mar 14 • 144
Large Language Models Can Self-Improve in Long-context Reasoning Paper • 2411.08147 • Published Nov 12, 2024 • 67
CGB-DM: Content and Graphic Balance Layout Generation with Transformer-based Diffusion Model Paper • 2407.15233 • Published Jul 21, 2024 • 6
ChartMimic: Evaluating LMM's Cross-Modal Reasoning Capability via Chart-to-Code Generation Paper • 2406.09961 • Published Jun 14, 2024 • 56