Submitted by zhangshaolei 56 DeepAnalyze: Agentic Large Language Models for Autonomous Data Science RUC-DataLab 154 1
Submitted by Andrew613 55 PICABench: How Far Are We from Physically Realistic Image Editing? · 13 authors 13 2
Submitted by VLyb 32 TrajSelector: Harnessing Latent Representations for Efficient and Effective Best-of-N in Large Reasoning Model Zhongguancun Academy 3
Submitted by SnowNation 28 Towards Mixed-Modal Retrieval for Universal Retrieval-Augmented Generation Renmin University of China 7 2
Submitted by yoon6503 27 When to Ensemble: Identifying Token-Level Points for Stable and Fast LLM Ensembling KAIST AI 3
Submitted by monurcan 19 Visual Autoregressive Models Beat Diffusion Models on Inference Time Scaling · 3 authors 2
Submitted by EiffL 15 AION-1: Omnimodal Foundation Model for Astronomical Sciences Polymathic AI 11 1
Submitted by chestnutlzj 15 Uniworld-V2: Reinforce Image Editing with Diffusion Negative-aware Finetuning and MLLM Implicit Feedback Peking University 46 1
Submitted by zachary-yin 11 ConsistEdit: Highly Consistent and Precise Training-free Visual Editing · 4 authors 25 2
Submitted by sdzy 6 Beyond Pipelines: A Survey of the Paradigm Shift toward Model-Native Agentic AI Beijing JiaoTong University 25 2
Submitted by para-lost 4 Constantly Improving Image Models Need Constantly Improving Benchmarks University of California, Berkeley 9 1
Submitted by taesiri 3 Enterprise Deep Research: Steerable Multi-Agent Deep Research for Enterprise Analytics Salesforce 66 1
Submitted by taesiri 3 UltraCUA: A Foundation Model for Computer Use Agents with Hybrid Action Apple 2
Submitted by passing2961 3 MultiVerse: A Multi-Turn Conversation Benchmark for Evaluating Large Vision and Language Models KAIST 3 1
Submitted by xwjzds 3 Distractor Injection Attacks on Large Reasoning Models: Characterization and Defense Amazon Science 1
Submitted by taesiri 3 Embody 3D: A Large-scale Multimodal Motion and Behavior Dataset Meta Research 2
Submitted by hongyuyang23casia 3 Knowledge-based Visual Question Answer with Multimodal Processing, Retrieval and Filtering Chinese Academic of Science Institute of Automation 1
Submitted by austinxu87 2 Foundational Automatic Evaluators: Scaling Multi-Task Generative Evaluator Training for Reasoning-Centric Domains Salesforce 1
Submitted by shuaichenchang 2 Automated Composition of Agents: A Knapsack Approach for Agentic Component Selection · 8 authors 1
Submitted by monurcan 2 Balanced Multi-Task Attention for Satellite Image Classification: A Systematic Approach to Achieving 97.23% Accuracy on EuroSAT Without Pre-Training · 1 author 2
Submitted by sanskxr02 1 Beacon: Single-Turn Diagnosis and Mitigation of Latent Sycophancy in Large Language Models · 4 authors 1
Submitted by linyueqian 1 AsyncVoice Agent: Real-Time Explanation for LLM Planning and Reasoning · 7 authors 1
Submitted by jacksukk 1 On Non-interactive Evaluation of Animal Communication Translators · 3 authors 2
Submitted by kellycyy - MoReBench: Evaluating Procedural and Pluralistic Moral Reasoning in Language Models, More than Outcomes · 18 authors 0 1
Submitted by sayandsarkar - GuideFlow3D: Optimization-Guided Rectified Flow For Appearance Transfer Gradient Spaces Research Group 1
Submitted by Zihao-Li - Test-Time Scaling of Reasoning Models for Machine Translation Language Technology Research Group at the University of Helsinki 1