Submitted by YerbaPage 88 LongCodeZip: Compress Long Context for Code Language Models Stanford University 63 5
Submitted by cuijiaxing 69 Self-Forcing++: Towards Minute-Scale High-Quality Video Generation ByteDance Seed 48 3
Submitted by yulunliu 53 StealthAttack: Robust 3D Gaussian Splatting Poisoning via Density-Guided Illusions National Yang Ming Chiao Tung University 48 2
Submitted by yuntian-deng 36 Interactive Training: Feedback-Driven Neural Network Optimization Yuntian Group 10 3
Submitted by taesiri 34 StockBench: Can LLM Agents Trade Stocks Profitably In Real-world Markets? · 7 authors 3
Submitted by ruohao 21 Tree-based Dialogue Reinforced Policy Optimization for Red-Teaming Attacks · 6 authors 2
Submitted by weiminwang 21 Ovi: Twin Backbone Cross-Modal Fusion for Audio-Video Generation Character.AI 220 2
Submitted by invokerliang 20 CLUE: Non-parametric Verification from Experience via Hidden-State Clustering Tencent 1
Submitted by lr10260 19 VOGUE: Guiding Exploration with Visual Uncertainty Improves Multimodal Reasoning Tencent 2
Submitted by xw-eric 18 The Unreasonable Effectiveness of Scaling Agents for Computer Use Simular 6.62k 2
Submitted by songw-zju 15 RewardMap: Tackling Sparse Rewards in Fine-grained Visual Reasoning via Multi-Stage Reinforcement Learning Westlake University 24 2
Submitted by Geralt-Targaryen 13 F2LLM Technical Report: Matching SOTA Embedding Performance with 6 Million Open-Source Data CodeFuse AI 27 2
Submitted by taesiri 13 A Rigorous Benchmark with Multidimensional Evaluation for Deep Research Agents: From Answers to Reports · 12 authors 2
Submitted by zhangchenxu 13 TOUCAN: Synthesizing 1.5M Tool-Agentic Data from Real-World MCP Environments IBM 52 3
Submitted by Shilin-LU 10 DragFlow: Unleashing DiT Priors with Region Based Supervision for Drag Editing · 7 authors 2
Submitted by AdamF92 10 Sparse Query Attention (SQA): A Computationally Efficient Attention Mechanism with Query Heads Reduction Reactive AI 0 2
Submitted by Harold328 8 Go with Your Gut: Scaling Confidence for Autoregressive Image Generation · 7 authors 13 2
Submitted by zorik 8 Fine-Grained Detection of Context-Grounded Hallucinations Using LLMs Technion Israel Institute of Technology 2
Submitted by YuZeng260 7 Agentic Jigsaw Interaction Learning for Enhancing Visual Perception and Reasoning in Vision-Language Models · 12 authors 12 3
Submitted by yxl66666 7 Visual Multi-Agent System: Mitigating Hallucination Snowballing via Visual Flow · 11 authors 5 1
Submitted by erjui 6 Automated Structured Radiology Report Generation with Rich Clinical Context · 6 authors 3 3
Submitted by enisimsar 5 Optimal Control Meets Flow Matching: A Principled Route to Multi-Subject Fidelity · 3 authors 7 2
Submitted by SteveZeyuZhang 5 VLA-R1: Enhancing Reasoning in Vision-Language-Action Models · 6 authors 11 2
Submitted by Ksgk-fy 4 RLAD: Training LLMs to Discover Abstractions for Solving Reasoning Problems · 7 authors 2
Submitted by Wyattz23 4 TimeSeriesScientist: A General-Purpose AI Agent for Time Series Analysis · 7 authors 2
Submitted by yanxi-chen 4 Group-Relative REINFORCE Is Secretly an Off-Policy Algorithm: Demystifying Some Myths About GRPO and Its Friends · 8 authors 357 2
Submitted by tetrisd 3 Drawing Conclusions from Draws: Rethinking Preference Semantics in Arena-Style LLM Evaluation University College London 0 2
Submitted by James-WYang 3 Parallel Scaling Law: Unveiling Reasoning Generalization through A Cross-Linguistic Perspective Chinese Academy of Sciences Institute of Automation 2
Submitted by Yalimu 3 One-Token Rollout: Guiding Supervised Fine-Tuning of LLMs with Policy Gradient · 5 authors 2
Submitted by Xiaoye08 3 FrameThinker: Learning to Think with Long Videos via Multi-Turn Frame Spotlighting · 6 authors 9 3
Submitted by taesiri 3 SKYLENAGE Technical Report: Mathematical Reasoning and Contest-Innovation Benchmarks for Multi-Level Math Evaluation · 18 authors 2
Submitted by patricebechard 2 Optimizing What Matters: AUC-Driven Learning for Robust Neural Retrieval ServiceNow-AI 2
Submitted by zzhao0104 2 Controlled Generation for Private Synthetic Text Center for Language and Speech Processing @ JHU 2
Submitted by taesiri 1 MedQ-Bench: Evaluating and Exploring Medical Image Quality Assessment Abilities in MLLMs · 20 authors 2
Submitted by nandan523 1 Spectral Scaling Laws in Language Models: How Effectively Do Feed-Forward Networks Use Their Latent Space? New York University 2
Submitted by whats2000 1 SQUARE: Semantic Query-Augmented Fusion and Efficient Batch Reranking for Training-free Zero-Shot Composed Image Retrieval · 3 authors 2
Submitted by dinobby - Think Right: Learning to Mitigate Under-Over Thinking via Adaptive, Attentive Compression · 6 authors 2
Submitted by pranamanam - AReUReDi: Annealed Rectified Updates for Refining Discrete Flows with Multi-Objective Guidance · 3 authors 2
Submitted by therem - IoT-MCP: Bridging LLMs and IoT Systems Through Model Context Protocol · 10 authors 2