Daily Papers

by AK and the research community

Oct 22

Submitted by

Ningyu

LightMem: Lightweight and Efficient Memory-Augmented Generation

Zhejiang University

Submitted by

jienengchen

World-in-World: World Models in a Closed-Loop World

·
17 authors

Submitted by

GindaChen

Efficient Long-context Language Model Training by Core Attention Disaggregation

·
9 authors

Submitted by

taesiri

UniGenBench++: A Unified Semantic Evaluation Benchmark for Text-to-Image Generation

·
11 authors

Submitted by

weidawang

Chem-R: Learning to Reason as a Chemist

Submitted by

taesiri

MoGA: Mixture-of-Groups Attention for End-to-End Long Video Generation

ByteDance

Submitted by

taesiri

Grasp Any Region: Towards Precise, Contextual Pixel Understanding for Multimodal LLMs

ByteDance

Submitted by

CheeryLJH

IF-VidCap: Can Video Caption Models Follow Instructions?

NJU-LINK

Submitted by

taesiri

Every Step Evolves: Scaling Reinforcement Learning for Trillion-Scale Thinking Model

inclusionAI

Submitted by

3145tttt

GAS: Improving Discretization of Diffusion ODEs via Generalized Adversarial Solver

Bayes-Group

Bayesian Methods Research Group

Submitted by

WTNswaggy

Towards Faithful and Controllable Personalization via Critique-Post-Edit Reinforcement Learning

·
6 authors

Submitted by

mgubri

Is Multilingual LLM Watermarking Truly Multilingual? A Simple Back-Translation Solution

parameterlab

Submitted by

CheeryLJH

MT-Video-Bench: A Holistic Video Understanding Benchmark for Evaluating Multimodal LLMs in Multi-Turn Dialogues

NJU-LINK

Submitted by

eliebak

DeepSeek-OCR: Contexts Optical Compression

deepseek-ai

Submitted by

taesiri

UltraGen: High-Resolution Video Generation with Hierarchical Attention

·
4 authors

Submitted by

aHapBean

ssToken: Self-modulated and Semantic-aware Token Selection for LLM Fine-tuning

·
8 authors

Submitted by

Non-no

MUG-V 10B: High-efficiency Training Pipeline for Large Video Generation Models

MUG-V

shopee-llm-mug team

Submitted by

Kaichengalex

ProCLIP: Progressive Vision-Language Alignment via LLM-based Embedder

·
9 authors

Submitted by

Apostle723

Think with 3D: Geometric Imagination Grounded Spatial Reasoning from Limited Views

Tsinghua University

Submitted by

sleetwang6

DSI-Bench: A Benchmark for Dynamic Spatial Intelligence

alibaba-inc

Submitted by

clyu

AlphaQuanter: An End-to-End Tool-Orchestrated Agentic Reinforcement Learning Framework for Stock Trading

·
2 authors

Submitted by

iliashum

Extracting alignment data in open models

google

Submitted by

jinfengliu26

Mono4DGS-HDR: High Dynamic Range 4D Gaussian Splatting from Alternating-exposure Monocular Videos

·
5 authors

Submitted by

DeepakSridhar

Video Reasoning without Training

qualcomm

Submitted by

wlin21at

PRISMM-Bench: A Benchmark of Peer-Review Grounded Multimodal Inconsistencies

·
7 authors

Submitted by

yundaichuanzhan

Expanding the Action Space of LLMs to Reason Beyond Language

·
6 authors

Submitted by

javyduck

Any-Depth Alignment: Unlocking Innate Safety Alignment of LLMs to Any-Depth

ByteDance-Seed

Submitted by

manglu3935

Unleashing Scientific Reasoning for Bio-experimental Protocol Generation via Structured Component-based Reward Mechanism

·
11 authors

Submitted by

Tomk187

Pruning Overparameterized Multi-Task Networks for Degraded Web Image Restoration

·
2 authors

Submitted by

haizhongzheng

When "Correct" Is Not Safe: Can We Trust Functionally Correct Patches Generated by Code Agents?

·
9 authors

Submitted by

danielmisrael

Planned Diffusion

·
7 authors

Submitted by

Henrddy211

The Atomic Instruction Gap: Instruction-Tuned LLMs Struggle with Simple, Self-Contained Directives

·
2 authors

Submitted by

junzhin

UniMedVL: Unifying Medical Multimodal Understanding and Generation Through Observation-Knowledge-Analysis

General-Medical-AI

General Medical AI

Submitted by

Davidavid4

Predicting the Unpredictable: Reproducible BiLSTM Forecasting of Incident Counts in the Global Terrorism Database (GTD)

·
1 author

Submitted by

Jinnkunn

Static Sandboxes Are Inadequate: Modeling Societal Complexity Requires Open-Ended Co-Evolution in LLM-Based Multi-Agent Simulations

·
4 authors

Submitted by

Elynden

EvoSyn: Generalizable Evolutionary Data Synthesis for Verifiable Learning

·
6 authors

Submitted by

billmatrix

PokeeResearch: Effective Deep Research via Reinforcement Learning from AI Feedback and Robust Reasoning Scaffold

PokeeAI