UniREditBench: A Unified Reasoning-based Image Editing Benchmark Paper • 2511.01295 • Published 1 day ago • 14
OS-Sentinel: Towards Safety-Enhanced Mobile GUI Agents via Hybrid Validation in Realistic Workflows Paper • 2510.24411 • Published 7 days ago • 63
JanusCoder: Towards a Foundational Visual-Programmatic Interface for Code Intelligence Paper • 2510.23538 • Published 8 days ago • 93
Thinking with Camera: A Unified Multimodal Model for Camera-Centric Understanding and Generation Paper • 2510.08673 • Published 26 days ago • 121
TaTToo: Tool-Grounded Thinking PRM for Test-Time Scaling in Tabular Reasoning Paper • 2510.06217 • Published 28 days ago • 62
VitaBench: Benchmarking LLM Agents with Versatile Interactive Tasks in Real-world Applications Paper • 2509.26490 • Published Sep 30 • 19
The Era of Real-World Human Interaction: RL from User Conversations Paper • 2509.25137 • Published Sep 29 • 18
Seedream 4.0: Toward Next-generation Multimodal Image Generation Paper • 2509.20427 • Published Sep 24 • 76 • 9
From Ideal to Real: Unified and Data-Efficient Dense Prediction for Real-World Scenarios Paper • 2506.20279 • Published Jun 25 • 19
From Ideal to Real: Unified and Data-Efficient Dense Prediction for Real-World Scenarios Paper • 2506.20279 • Published Jun 25 • 19
From Ideal to Real: Unified and Data-Efficient Dense Prediction for Real-World Scenarios Paper • 2506.20279 • Published Jun 25 • 19 • 1
ScienceBoard: Evaluating Multimodal Autonomous Agents in Realistic Scientific Workflows Paper • 2505.19897 • Published May 26 • 104
Scaling Computer-Use Grounding via User Interface Decomposition and Synthesis Paper • 2505.13227 • Published May 19 • 45