a-F1/math-qwen2.5-3b-reinforce-moa-3x1-unshared-actor_lr7.5e-7-epoch2-modenull-cftrue-decayepsfalse 3B • Updated 19 days ago • 17
a-F1/math-qwen2.5-3b-reinforce-moa-3x1-unshared-actor_lr1e-6-epoch2-modenull-cftrue-decayepsfalse 3B • Updated 25 days ago • 4
a-F1/math-qwen3-0.6b-reinforce-moa-3x1-unshared-actor_lr7.5e-7-epoch2-modenull-cftrue 0.6B • Updated 26 days ago • 6
a-F1/math-qwen3-0.6b-reinforce-moa-3x1-unshared-actor_lr6.5e-7-epoch2-modemargin-cftrue 0.6B • Updated 27 days ago • 3
a-F1/math-qwen3-0.6b-reinforce-moa-3x1-unshared-actor_lr1e-6-epoch2-modenull-cftrue 0.6B • Updated 29 days ago • 7
a-F1/math-qwen3-0.6b-reinforce-moa-3x1-unshared-actor_lr7.5e-7-epoch2-modenull 0.6B • Updated Jun 26 • 2
a-F1/aime_2024-DeepSeek-R1-Distill-Qwen-1.5B-beam_search-prm-completions Viewer • Updated May 10 • 4 • 4
a-F1/aime_2024-DeepSeek-R1-Distill-Qwen-1.5B-best_of_n-prm-completions Viewer • Updated May 10 • 4 • 4
a-F1/DeepSeek-R1-Distill-Qwen-1.5B-Llama3.1-8B-PRM-Deepseek-Data-best_of_n-prm-completions Updated May 9 • 3
a-F1/DeepSeek-R1-Distill-Qwen-1.5B-Llama3.1-8B-PRM-Deepseek-Data-beam_search-prm-completions Viewer • Updated May 8 • 8 • 2
a-F1/DeepSeek-R1-Distill-Qwen-7B-Llama3.1-8B-PRM-Deepseek-Data-best_of_n-prm-completions Viewer • Updated May 7 • 7 • 4