AI-Salesman: Towards Reliable Large Language Model Driven Telemarketing Paper • 2511.12133 • Published 5 days ago • 1 • 2
SafeGRPO: Self-Rewarded Multimodal Safety Alignment via Rule-Governed Policy Optimization Paper • 2511.12982 • Published 3 days ago • 1 • 2
NORA-1.5: A Vision-Language-Action Model Trained using World Model- and Action-based Preference Rewards Paper • 2511.14659 • Published 1 day ago • 7 • 2
LoCoBench-Agent: An Interactive Benchmark for LLM Agents in Long-Context Software Engineering Paper • 2511.13998 • Published 2 days ago • 2 • 2
Error-Driven Scene Editing for 3D Grounding in Large Language Models Paper • 2511.14086 • Published 2 days ago • 2 • 2
ATLAS: A High-Difficulty, Multidisciplinary Benchmark for Frontier Scientific Reasoning Paper • 2511.14366 • Published 1 day ago • 13 • 2
Large Language Models Meet Extreme Multi-label Classification: Scaling and Multi-modal Framework Paper • 2511.13189 • Published 3 days ago • 11 • 2
Can World Simulators Reason? Gen-ViRe: A Generative Visual Reasoning Benchmark Paper • 2511.13853 • Published 2 days ago • 32 • 3
Agent-R1: Training Powerful LLM Agents with End-to-End Reinforcement Learning Paper • 2511.14460 • Published 1 day ago • 11 • 2
REVISOR: Beyond Textual Reflection, Towards Multimodal Introspective Reasoning in Long-Form Video Understanding Paper • 2511.13026 • Published 3 days ago • 22 • 2
AraLingBench: A Human-Annotated Benchmark for Evaluating Arabic Linguistic Capabilities of Large Language Models Paper • 2511.14295 • Published 1 day ago • 59 • 3
VIDEOP2R: Video Understanding from Perception to Reasoning Paper • 2511.11113 • Published 6 days ago • 73 • 4
OmniZip: Audio-Guided Dynamic Token Compression for Fast Omnimodal Large Language Models Paper • 2511.14582 • Published 1 day ago • 13 • 2
MVI-Bench: A Comprehensive Benchmark for Evaluating Robustness to Misleading Visual Inputs in LVLMs Paper • 2511.14159 • Published 2 days ago • 24 • 3
Agent READMEs: An Empirical Study of Context Files for Agentic Coding Paper • 2511.12884 • Published 3 days ago • 4 • 2
Orion: A Unified Visual Agent for Multimodal Perception, Advanced Visual Reasoning and Execution Paper • 2511.14210 • Published 2 days ago • 8 • 2
Think-at-Hard: Selective Latent Iterations to Improve Reasoning Language Models Paper • 2511.08577 • Published 8 days ago • 72 • 3
A Brain Wave Encodes a Thousand Tokens: Modeling Inter-Cortical Neural Interactions for Effective EEG-based Emotion Recognition Paper • 2511.13954 • Published 2 days ago • 3 • 2
Mitigating Label Length Bias in Large Language Models Paper • 2511.14385 • Published 1 day ago • 4 • 2