Perception-Aware Policy Optimization for Multimodal Reasoning Paper • 2507.06448 • Published 22 days ago • 44
Look Before You Leap: A GUI-Critic-R1 Model for Pre-Operative Error Diagnosis in GUI Automation Paper • 2506.04614 • Published Jun 5 • 16
QwenLong-L1: Towards Long-Context Large Reasoning Models with Reinforcement Learning Paper • 2505.17667 • Published May 23 • 89
QwenLong-CPRS: Towards ∞-LLMs with Dynamic Context Optimization Paper • 2505.18092 • Published May 23 • 44
VLM-R^3: Region Recognition, Reasoning, and Refinement for Enhanced Multimodal Chain-of-Thought Paper • 2505.16192 • Published May 22 • 12
WritingBench: A Comprehensive Benchmark for Generative Writing Paper • 2503.05244 • Published Mar 7 • 19
Agent models: Internalizing Chain-of-Action Generation into Reasoning models Paper • 2503.06580 • Published Mar 9 • 19
Mobile-Agent-V: Learning Mobile Device Operation Through Video-Guided Multi-Agent Collaboration Paper • 2502.17110 • Published Feb 24 • 13
PC-Agent: A Hierarchical Multi-Agent Collaboration Framework for Complex Task Automation on PC Paper • 2502.14282 • Published Feb 20 • 20
Mobile-Agent-E: Self-Evolving Mobile Assistant for Complex Tasks Paper • 2501.11733 • Published Jan 20 • 29
SymDPO: Boosting In-Context Learning of Large Multimodal Models with Symbol Demonstration Direct Preference Optimization Paper • 2411.11909 • Published Nov 17, 2024 • 23
mPLUG-DocOwl2: High-resolution Compressing for OCR-free Multi-page Document Understanding Paper • 2409.03420 • Published Sep 5, 2024 • 27