OpenSearch-VL: An Open Recipe for Frontier Multimodal Search Agents Paper • 2605.05185 • Published 1 day ago • 72
Unify-Agent: A Unified Multimodal Agent for World-Grounded Image Synthesis Paper • 2603.29620 • Published Mar 31 • 46
GEMS: Agent-Native Multimodal Generation with Memory and Skills Paper • 2603.28088 • Published Mar 30 • 85
AgentVista: Evaluating Multimodal Agents in Ultra-Challenging Realistic Visual Scenarios Paper • 2602.23166 • Published Feb 26 • 45
Condition Errors Refinement in Autoregressive Image Generation with Diffusion Loss Paper • 2602.07022 • Published Feb 2 • 19
AdaReasoner: Dynamic Tool Orchestration for Iterative Visual Reasoning Paper • 2601.18631 • Published Jan 26 • 48
Advances and Frontiers of LLM-based Issue Resolution in Software Engineering: A Comprehensive Survey Paper • 2601.11655 • Published Jan 15 • 63
Long-horizon Reasoning Agent for Olympiad-Level Mathematical Problem Solving Paper • 2512.10739 • Published Dec 11, 2025 • 47
InternVideo-Next: Towards General Video Foundation Models without Video-Text Supervision Paper • 2512.01342 • Published Dec 1, 2025 • 19
Flash-DMD: Towards High-Fidelity Few-Step Image Generation with Efficient Distillation and Joint Reinforcement Learning Paper • 2511.20549 • Published Nov 25, 2025 • 27
Interleaving Reasoning for Better Text-to-Image Generation Paper • 2509.06945 • Published Sep 8, 2025 • 16
WebExplorer: Explore and Evolve for Training Long-Horizon Web Agents Paper • 2509.06501 • Published Sep 8, 2025 • 82
Speed Always Wins: A Survey on Efficient Architectures for Large Language Models Paper • 2508.09834 • Published Aug 13, 2025 • 53
Thinking with Images for Multimodal Reasoning: Foundations, Methods, and Future Frontiers Paper • 2506.23918 • Published Jun 30, 2025 • 90