Submitted by Ningyu 84 LightMem: Lightweight and Efficient Memory-Augmented Generation Zhejiang University 110 2
Submitted by GindaChen 66 Efficient Long-context Language Model Training by Core Attention Disaggregation · 9 authors 1
Submitted by taesiri 59 UniGenBench++: A Unified Semantic Evaluation Benchmark for Text-to-Image Generation · 11 authors 92 1
Submitted by taesiri 31 MoGA: Mixture-of-Groups Attention for End-to-End Long Video Generation ByteDance 18 4
Submitted by taesiri 29 Grasp Any Region: Towards Precise, Contextual Pixel Understanding for Multimodal LLMs ByteDance 41 1
Submitted by taesiri 20 Every Step Evolves: Scaling Reinforcement Learning for Trillion-Scale Thinking Model inclusionAI 49 1
Submitted by 3145tttt 20 GAS: Improving Discretization of Diffusion ODEs via Generalized Adversarial Solver Bayesian Methods Research Group 36 2
Submitted by WTNswaggy 19 Towards Faithful and Controllable Personalization via Critique-Post-Edit Reinforcement Learning · 6 authors 1
Submitted by mgubri 15 Is Multilingual LLM Watermarking Truly Multilingual? A Simple Back-Translation Solution Parameter Lab 1 1
Submitted by CheeryLJH 14 MT-Video-Bench: A Holistic Video Understanding Benchmark for Evaluating Multimodal LLMs in Multi-Turn Dialogues NJU-LINK Lab 5 1
Submitted by taesiri 11 UltraGen: High-Resolution Video Generation with Hierarchical Attention · 4 authors 2
Submitted by aHapBean 11 ssToken: Self-modulated and Semantic-aware Token Selection for LLM Fine-tuning · 8 authors 1
Submitted by Non-no 9 MUG-V 10B: High-efficiency Training Pipeline for Large Video Generation Models shopee-llm-mug team 59 2
Submitted by Kaichengalex 8 ProCLIP: Progressive Vision-Language Alignment via LLM-based Embedder · 9 authors 11 2
Submitted by Apostle723 7 Think with 3D: Geometric Imagination Grounded Spatial Reasoning from Limited Views Tsinghua University 8 1
Submitted by clyu 6 AlphaQuanter: An End-to-End Tool-Orchestrated Agentic Reinforcement Learning Framework for Stock Trading · 2 authors 13 1
Submitted by jinfengliu26 4 Mono4DGS-HDR: High Dynamic Range 4D Gaussian Splatting from Alternating-exposure Monocular Videos · 5 authors 13 2
Submitted by wlin21at 3 PRISMM-Bench: A Benchmark of Peer-Review Grounded Multimodal Inconsistencies · 7 authors 1 2
Submitted by yundaichuanzhan 3 Expanding the Action Space of LLMs to Reason Beyond Language · 6 authors 2 1
Submitted by javyduck 2 Any-Depth Alignment: Unlocking Innate Safety Alignment of LLMs to Any-Depth ByteDance Seed 1
Submitted by manglu3935 2 Unleashing Scientific Reasoning for Bio-experimental Protocol Generation via Structured Component-based Reward Mechanism · 11 authors 1
Submitted by Tomk187 2 Pruning Overparameterized Multi-Task Networks for Degraded Web Image Restoration · 2 authors 0 2
Submitted by haizhongzheng 2 When "Correct" Is Not Safe: Can We Trust Functionally Correct Patches Generated by Code Agents? · 9 authors 1
Submitted by Henrddy211 1 The Atomic Instruction Gap: Instruction-Tuned LLMs Struggle with Simple, Self-Contained Directives · 2 authors 1
Submitted by junzhin 1 UniMedVL: Unifying Medical Multimodal Understanding And Generation Through Observation-Knowledge-Analysis General Medical AI 13 1
Submitted by Davidavid4 1 Predicting the Unpredictable: Reproducible BiLSTM Forecasting of Incident Counts in the Global Terrorism Database (GTD) · 1 author 0 1
Submitted by Jinnkunn 1 Static Sandboxes Are Inadequate: Modeling Societal Complexity Requires Open-Ended Co-Evolution in LLM-Based Multi-Agent Simulations · 4 authors 1
Submitted by Elynden - EvoSyn: Generalizable Evolutionary Data Synthesis for Verifiable Learning · 6 authors 1
Submitted by billmatrix - PokeeResearch: Effective Deep Research via Reinforcement Learning from AI Feedback and Robust Reasoning Scaffold Pokee AI 39 1