ClawGUI: A Unified Framework for Training, Evaluating, and Deploying GUI Agents Paper • 2604.11784 • Published 9 days ago • 141
GameWorld: Towards Standardized and Verifiable Evaluation of Multimodal Game Agents Paper • 2604.07429 • Published 14 days ago • 113
How to Fine-Tune a Reasoning Model? A Teacher-Student Cooperation Framework to Synthesize Student-Consistent SFT Data Paper • 2604.14164 • Published about 1 month ago • 34
Boosting Visual Instruction Tuning with Self-Supervised Guidance Paper • 2604.12966 • Published 8 days ago • 11
HY-World 2.0: A Multi-Modal World Model for Reconstructing, Generating, and Simulating 3D Worlds Paper • 2604.14268 • Published 7 days ago • 104
Embarrassingly Simple Self-Distillation Improves Code Generation Paper • 2604.01193 • Published 20 days ago • 44
HippoCamp: Benchmarking Contextual Agents on Personal Computers Paper • 2604.01221 • Published 20 days ago • 29
Vision2Web: A Hierarchical Benchmark for Visual Website Development with Agent Verification Paper • 2603.26648 • Published 25 days ago • 42
AIBench: Evaluating Visual-Logical Consistency in Academic Illustration Generation Paper • 2603.28068 • Published 22 days ago • 13