anirudhb11/critic_200_ppo-run-math-training-prompt-len-800-response-len-4096-seed-43-subset-5000-91a081ef96 Text Classification • 2B • Updated 6 days ago • 18
anirudhb11/actor_200_ppo-run-math-training-prompt-len-800-response-len-4096-seed-43-subset-5000-3dac955361 Text Generation • 2B • Updated 6 days ago • 14
anirudhb11/deepscaler-math-1000-subset-seed-43-500-subset-seed-43 Viewer • Updated 10 days ago • 500 • 17