Self-Evaluation Improves Selective Generation in Large Language Models Paper • 2312.09300 • Published Dec 14, 2023
Prefix Grouper: Efficient GRPO Training through Shared-Prefix Forward Paper • 2506.05433 • Published Jun 5, 2025