VFXMaster: Unlocking Dynamic Visual Effect Generation via In-Context Learning Paper • 2510.25772 • Published 6 days ago • 32
Thinking with Camera: A Unified Multimodal Model for Camera-Centric Understanding and Generation Paper • 2510.08673 • Published 26 days ago • 121
EditVerse: Unifying Image and Video Editing and Generation with In-Context Learning Paper • 2509.20360 • Published Sep 24 • 17
Running • 12 • INR-Harmon - Harmonize Any Image You Want! • Harmonize images using masks and pretrained models
Ultra3D: Efficient and High-Fidelity 3D Generation with Part Attention Paper • 2507.17745 • Published Jul 23 • 34
Pixels, Patterns, but No Poetry: To See The World like Humans Paper • 2507.16863 • Published Jul 21 • 68
Build error • 91 • Financial Analyst AI • Analyze financial text and audio for tone, sentiment, and entities
EasyText: Controllable Diffusion Transformer for Multilingual Text Rendering Paper • 2505.24417 • Published May 30 • 13
Alchemist: Turning Public Text-to-Image Data into Generative Gold Paper • 2505.19297 • Published May 25 • 84
TEMPURA: Temporal Event Masked Prediction and Understanding for Reasoning in Action Paper • 2505.01583 • Published May 2 • 8
YoChameleon: Personalized Vision and Language Generation Paper • 2504.20998 • Published Apr 29 • 12
No application file • Yolo Logo Detection • Logo detection using YOLOv7 with LogoDet-3K and Flickr Logos