Daily Papers

by AK and the research community

Oct 1

Submitted by

Jakumetsu

MCPMark: A Benchmark for Stress-Testing Realistic and Comprehensive MCP Use

National University of Singapore

Submitted by

janchorowski

The Dragon Hatchling: The Missing Link between the Transformer and Models of the Brain

pathwaycom

Submitted by

taesiri

Vision-Zero: Scalable VLM Self-Improvement via Strategic Gamified Self-Play

·
9 authors

Submitted by

Jessamine

Winning the Pruning Gamble: A Unified Approach to Joint Sample and Token Pruning for Efficient Supervised Fine-Tuning

alibabagroup

Submitted by

weizhepei

TruthRL: Incentivizing Truthful LLMs via Reinforcement Learning

Submitted by

Junlinh

Learning to See Before Seeing: Demystifying LLM Visual Priors from Language Pre-training

·
7 authors

Submitted by

Ningyu

OceanGym: A Benchmark Environment for Underwater Embodied Agents

Zhejiang University

Submitted by

xytian1008

More Thought, Less Accuracy? On the Dual Nature of Reasoning in Vision-Language Models

·
8 authors

Submitted by

xx18

Thinking-Free Policy Initialization Makes Distilled Reasoning Models More Effective and Efficient Reasoners

tencent

Submitted by

han-cai

DC-VideoGen: Efficient Video Generation with Deep Compression Video Autoencoder

nvidia

Submitted by

wjldw

Who's Your Judge? On the Detectability of LLM-Generated Judgments

DMML

Data Mining and Machine Learning lab

Submitted by

Minbyul

Thinking Sparks!: Emergent Attention Heads in Reasoning Models During Post Training

Korea University

Submitted by

Zigeng

dParallel: Learnable Parallel Decoding for dLLMs

National University of Singapore

Submitted by

hewei2001

VitaBench: Benchmarking LLM Agents with Versatile Interactive Tasks in Real-world Applications

meituan-longcat

Submitted by

Fiaa

Learning Human-Perceived Fakeness in AI-Generated Videos via Multimodal LLMs

Princeton University

Submitted by

JiayiGuo821

IMG: Calibrating Diffusion Models via Implicit Multimodal Guidance

shi-labs

Submitted by

WENGSYX

DeepScientist: Advancing Frontier-Pushing Scientific Findings Progressively

WestlakeNLP

Text Intelligence Lab of Westlake University

Submitted by

flateon

MotionRAG: Motion Retrieval-Augmented Image-to-Video Generation

·
5 authors

Submitted by

JusperLee

Efficient Audio-Visual Speech Separation with Discrete Lip Semantics and Multi-Scale Global-Local Attention

Tsinghua University

Submitted by

haodongli

DA^2: Depth Anything in Any Direction

Tencent Hunyuan

Submitted by

ai-hyz

Mem-α: Learning Memory Construction via Reinforcement Learning

·
7 authors

Submitted by

soujanyaporia

OffTopicEval: When Large Language Models Enter the Wrong Chat, Almost Always!

declare-lab

Deep Cognition and Language Research (DeCLaRe) Lab

Submitted by

Fengzhuo

Muon Outperforms Adam in Tail-End Associative Memory Learning

·
9 authors

Submitted by

linyueqian

Voice Evaluation of Reasoning Ability: Diagnosing the Modality-Induced Performance Gap

·
9 authors

Submitted by

Miaosen

InfoAgent: Advancing Autonomous Information-Seeking Agents

microsoft

Submitted by

sijial430

Humanline: Online Alignment as Perceptual Loss

Princeton University

Submitted by

RyanLiu112

Attention as a Compass: Efficient Exploration for Process-Supervised RL in Reasoning Models

Tsinghua University

Submitted by

burtenshaw

A Cartography of Open Collaboration in Open Source AI: Mapping Practices, Motivations, and Governance in 14 Open Large Language Model Projects

·
4 authors

Submitted by

taesiri

VisualOverload: Probing Visual Understanding of VLMs in Really Dense Scenes

·
9 authors

Submitted by

yshenaw

Benefits and Pitfalls of Reinforcement Learning for Language Model Planning: A Theoretical Perspective

Microsoft Research

Submitted by

Seanie-lee

Rethinking Reward Models for Multi-Domain Test-Time Scaling

·
15 authors

Submitted by

taesiri

Regression Language Models for Code

google

Submitted by

rover-xingyu

TTT3R: 3D Reconstruction as Test-Time Training

·
5 authors

Submitted by

taesiri

Ferret-UI Lite: Lessons from Building Small On-Device GUI Agents

apple

Submitted by

kittttttt

Test-Time Policy Adaptation for Enhanced Multi-Turn Interactions with LLMs

·
5 authors

Submitted by

danielmisrael

The Pitfalls of KV Cache Compression

·
5 authors

Submitted by

dtanow

DeepCodeSeek: Real-Time API Retrieval for Context-Aware Code Generation

ServiceNow-AI

Submitted by

dlion168

TAU: A Benchmark for Cultural Sound Understanding Beyond Semantics

·
15 authors

Submitted by

sachithabey

EntroPE: Entropy-Guided Dynamic Patch Encoder for Time Series Forecasting

Nanyang Technological University Singapore

Submitted by

hanxiao

jina-reranker-v3: Last but Not Late Interaction for Document Reranking

jinaai

Submitted by

Franck-Dernoncourt

Knowledge Homophily in Large Language Models

·
9 authors

Submitted by

Kamichanw

d^2Cache: Accelerating Diffusion-Based LLMs via Dual Adaptive Caching

·
7 authors

Submitted by

Divij

BuildBench: Benchmarking LLM Agents on Compiling Real-World Open-Source Software

cogint

Submitted by

normanpaulsen

Context Is What You Need: The Maximum Effective Context Window for Real World Limits of LLMs

·
1 author

Submitted by

bilpo

Video Object Segmentation-Aware Audio Generation

·
3 authors

Submitted by

taesiri

Probing the Critical Point (CritPt) of AI Reasoning: a Frontier Physics Research Benchmark

·
64 authors

Submitted by

dinobby

Nudging the Boundaries of LLM Reasoning

·
7 authors

Submitted by

stockeh

Swift: An Autoregressive Consistency Model for Efficient Weather Forecasting

·
3 authors

Submitted by

tomoyukun

LayerD: Decomposing Raster Graphic Designs into Layers

·
4 authors

Submitted by

KejiaRobust

MANI-Pure: Magnitude-Adaptive Noise Injection for Adversarial Purification

·
5 authors

Submitted by

Benjamin-eecs

Who invented deep residual learning?

·
1 author

Submitted by

YYF42

CORRECT: COndensed eRror RECognition via knowledge Transfer in multi-agent systems

·
7 authors

Submitted by

Qingren

Estimating Time Series Foundation Model Transferability via In-Context Learning

·
6 authors

Submitted by

chinefed

Convolutional Set Transformer

·
2 authors

Submitted by

agneet

Stable Cinemetrics: Structured Taxonomy and Evaluation for Professional Video Generation

stabilityai

Submitted by

EdBianchi

ProfVLM: A Lightweight Video-Language Model for Multi-View Proficiency Estimation

·
3 authors

Submitted by

ZhangShenao

Learning to Reason as Action Abstractions with Scalable Mid-Training RL

·
7 authors

Submitted by

jonhue

Specialization after Generalization: Towards Understanding Test-Time Training in Foundation Models

lasgroup

LAS @ ETH Zurich

Submitted by

oppurity

LLM Watermark Evasion via Bias Inversion

·
3 authors

Submitted by

buxiangzhiren

GeoRemover: Removing Objects and Their Causal Visual Artifacts

·
6 authors

Submitted by

YuhengSSS

Catching the Details: Self-Distilled RoI Predictors for Fine-Grained MLLM Perception

Sydney-Uni

The University of Sydney