MIRA: Multimodal Iterative Reasoning Agent for Image Editing • Paper • 2511.21087 • Published 3 days ago • 8
Canvas-to-Image: Compositional Image Generation with Multimodal Controls • Paper • 2511.21691 • Published 2 days ago • 12
Think Visually, Reason Textually: Vision-Language Synergy in ARC • Paper • 2511.15703 • Published 9 days ago • 8
Cognitive Foundations for Reasoning and Their Manifestation in LLMs • Paper • 2511.16660 • Published 8 days ago • 8
GigaEvo: An Open Source Optimization Framework Powered By LLMs And Evolution Algorithms • Paper • 2511.17592 • Published 12 days ago • 110
ROOT: Robust Orthogonalized Optimizer for Neural Network Training • Paper • 2511.20626 • Published 3 days ago • 161
Inferix: A Block-Diffusion based Next-Generation Inference Engine for World Simulation • Paper • 2511.20714 • Published 4 days ago • 38
Diverse Video Generation with Determinantal Point Process-Guided Policy Optimization • Paper • 2511.20647 • Published 3 days ago • 2
MagicWorld: Interactive Geometry-driven Video World Exploration • Paper • 2511.18886 • Published 5 days ago • 16
STARFlow-V: End-to-End Video Generative Modeling with Normalizing Flow • Paper • 2511.20462 • Published 4 days ago • 16
UltraViCo: Breaking Extrapolation Limits in Video Diffusion Transformers • Paper • 2511.20123 • Published 4 days ago • 15
Scaling Agentic Reinforcement Learning for Tool-Integrated Reasoning in VLMs • Paper • 2511.19773 • Published 4 days ago • 8
iMontage: Unified, Versatile, Highly Dynamic Many-to-many Image Generation • Paper • 2511.20635 • Published 3 days ago • 29
Does Understanding Inform Generation in Unified Multimodal Models? From Analysis to Path Forward • Paper • 2511.20561 • Published 4 days ago • 28
Agent0-VL: Exploring Self-Evolving Agent for Tool-Integrated Vision-Language Reasoning • Paper • 2511.19900 • Published 4 days ago • 45
SteadyDancer: Harmonized and Coherent Human Image Animation with First-Frame Preservation • Paper • 2511.19320 • Published 5 days ago • 37