VisPlay: Self-Evolving Vision-Language Models from Images Paper • 2511.15661 • Published 9 days ago • 41
Self-Rewarding Vision-Language Model via Reasoning Decomposition Paper • 2508.19652 • Published Aug 27 • 84
POSS: Position Specialist Generates Better Draft for Speculative Decoding Paper • 2506.03566 • Published Jun 4 • 6
CrossWordBench: Evaluating the Reasoning Capabilities of LLMs and LVLMs with Controllable Puzzle Generation Paper • 2504.00043 • Published Mar 30 • 9
Towards Large Reasoning Models: A Survey of Reinforced Reasoning with Large Language Models Paper • 2501.09686 • Published Jan 16 • 41
Taming Overconfidence in LLMs: Reward Calibration in RLHF Paper • 2410.09724 • Published Oct 13, 2024 • 3