Spectral Policy Optimization: Coloring your Incorrect Reasoning in GRPO • Paper • 2505.11595 • Published May 16