Collection of models and the paper for the work "Reinforcement Learning for Reasoning in Large Language Models with One Training Example"
- Reinforcement Learning for Reasoning in Large Language Models with One Training Example
  Paper • 2504.20571 • Published • 97
- ypwang61/One-Shot-RLVR-Qwen2.5-Math-1.5B-pi1
  Text Generation • 2B • Updated • 661
- ypwang61/One-Shot-RLVR-Qwen2.5-Math-1.5B-pi13
  Text Generation • 2B • Updated • 820
- ypwang61/One-Shot-RLVR-Qwen2.5-Math-1.5B-pi1_pi13
  Text Generation • 2B • Updated • 33