LLaVA-OneVision-1.5: Fully Open Framework for Democratized Multimodal Training Paper • 2509.23661 • Published Sep 28, 2025 • 44
What makes Reasoning Models Different? Follow the Reasoning Leader for Efficient Decoding Paper • 2506.06998 • Published Jun 8, 2025
CaughtCheating: Is Your MLLM a Good Cheating Detective? Exploring the Boundary of Visual Perception and Reasoning Paper • 2507.00045 • Published Jun 23, 2025 • 1
LLaVA-Critic-R1: Your Critic Model is Secretly a Strong Policy Model Paper • 2509.00676 • Published Aug 31, 2025 • 83
ViCrit: A Verifiable Reinforcement Learning Proxy Task for Visual Perception in VLMs Paper • 2506.10128 • Published Jun 11, 2025 • 22
MORSE-500: A Programmatically Controllable Video Benchmark to Stress-Test Multimodal Reasoning Paper • 2506.05523 • Published Jun 5, 2025 • 34
SoTA with Less: MCTS-Guided Sample Selection for Data-Efficient Visual Reasoning Self-Improvement Paper • 2504.07934 • Published Apr 10, 2025 • 20
Scaling Inference-Time Search with Vision Value Model for Improved Visual Comprehension Paper • 2412.03704 • Published Dec 4, 2024 • 7
Towards Self-Improvement of LLMs via MCTS: Leveraging Stepwise Knowledge with Curriculum Preference Learning Paper • 2410.06508 • Published Oct 9, 2024 • 11
LLaVA-Critic: Learning to Evaluate Multimodal Models Paper • 2410.02712 • Published Oct 3, 2024 • 37
Premier-TACO: Pretraining Multitask Representation via Temporal Action-Driven Contrastive Loss Paper • 2402.06187 • Published Feb 9, 2024 • 11
Live in the Moment: Learning Dynamics Model Adapted to Evolving Policy Paper • 2207.12141 • Published Jul 25, 2022
TACO: Temporal Latent Action-Driven Contrastive Loss for Visual Reinforcement Learning Paper • 2306.13229 • Published Jun 22, 2023 • 3
DrM: Mastering Visual Reinforcement Learning through Dormant Ratio Minimization Paper • 2310.19668 • Published Oct 30, 2023 • 3
Mementos: A Comprehensive Benchmark for Multimodal Large Language Model Reasoning over Image Sequences Paper • 2401.10529 • Published Jan 19, 2024 • 1
COPlanner: Plan to Roll Out Conservatively but to Explore Optimistically for Model-Based RL Paper • 2310.07220 • Published Oct 11, 2023 • 1
Is Model Ensemble Necessary? Model-based RL via a Single Model with Lipschitz Regularized Value Function Paper • 2302.01244 • Published Feb 2, 2023