MMRefine: Unveiling the Obstacles to Robust Refinement in Multimodal Large Language Models
Abstract
MMRefine is a benchmark that evaluates the error refinement capabilities of Multimodal Large Language Models by categorizing errors and identifying performance bottlenecks.
This paper introduces MMRefine, a MultiModal Refinement benchmark designed to evaluate the error refinement capabilities of Multimodal Large Language Models (MLLMs). As the emphasis shifts toward enhancing reasoning during inference, MMRefine provides a framework that evaluates MLLMs' ability to detect and correct errors across six distinct scenarios, going beyond a simple comparison of final accuracy before and after refinement. The benchmark further analyzes refinement performance by categorizing errors into six error types. Experiments with various open and closed MLLMs reveal bottlenecks and factors that impede refinement performance, highlighting directions for more effective reasoning enhancement. Our code and dataset are publicly available at https://github.com/naver-ai/MMRefine.
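For orientation, below is a minimal sketch of how a refinement-style evaluation loop of this kind could be driven programmatically. It assumes a Hugging Face `datasets` release and a generic model wrapper; the dataset ID, split name, and field names ("question", "initial_solution", "answer") are hypothetical placeholders rather than the repository's actual schema, so consult the GitHub link above for the official code.

```python
"""Sketch of a refinement evaluation loop (assumptions noted inline)."""
from typing import Protocol

from datasets import load_dataset  # pip install datasets


class MLLM(Protocol):
    """Stand-in for any multimodal model wrapper (hypothetical interface)."""
    def generate(self, prompt: str) -> str: ...


def refinement_prompt(question: str, initial_solution: str) -> str:
    # Ask the model to verify a candidate solution and fix it if wrong,
    # rather than answering the question from scratch.
    return (
        f"Question: {question}\n"
        f"Candidate solution: {initial_solution}\n"
        "Check the solution step by step. If you find an error, explain it "
        "and give a corrected final answer."
    )


def evaluate_refinement(model: MLLM, dataset_id: str = "naver-ai/MMRefine") -> float:
    # Hypothetical dataset ID and field names; adjust to the real release.
    data = load_dataset(dataset_id, split="test")
    solved = 0
    for ex in data:
        prompt = refinement_prompt(ex["question"], ex["initial_solution"])
        refined = model.generate(prompt)
        solved += int(str(ex["answer"]) in refined)  # crude answer matching
    return solved / len(data)
```

A real harness would also score whether the model correctly detected the error, not only whether the final answer matches; distinguishing detection from correction is precisely what the benchmark's six-scenario evaluation is designed to capture.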
Community
This is an automated message from the Librarian Bot. The following papers, recommended by the Semantic Scholar API, are similar to this paper:
- LENS: Multi-level Evaluation of Multimodal Reasoning with Large Language Models (2025)
- GraphOmni: A Comprehensive and Extendable Benchmark Framework for Large Language Models on Graph-theoretic Tasks (2025)
- MME-Reasoning: A Comprehensive Benchmark for Logical Reasoning in MLLMs (2025)
- VisualPuzzles: Decoupling Multimodal Reasoning Evaluation from Domain Knowledge (2025)
- HSSBench: Benchmarking Humanities and Social Sciences Ability for Multimodal Large Language Models (2025)
- ChartMuseum: Testing Visual Reasoning Capabilities of Large Vision-Language Models (2025)
- Seeing Beyond Words: MatVQA for Challenging Visual-Scientific Reasoning in Materials Science (2025)