UniRL: Self-Improving Unified Multimodal Models via Supervised and Reinforcement Learning Paper • 2505.23380 • Published May 29 • 23
Paper2Poster: Towards Multimodal Poster Automation from Scientific Papers Paper • 2505.21497 • Published May 27 • 106
Long-Context Autoregressive Video Modeling with Next-Frame Prediction Paper • 2503.19325 • Published Mar 25 • 73
ROICtrl: Boosting Instance Control for Visual Generation Paper • 2411.17949 • Published Nov 27, 2024 • 88
ShowUI: One Vision-Language-Action Model for GUI Visual Agent Paper • 2411.17465 • Published Nov 26, 2024 • 89