The ToolRL model trained for tool use through GRPO
Cheng Qian
chengq9
AI & ML interests
Agent, Tool Learning
Recent Activity
upvoted a paper about 11 hours ago: Code as Agent Harness
upvoted a paper about 11 hours ago: Rethinking On-Policy Distillation of Large Language Models: Phenomenology, Mechanism, and Recipe
updated a dataset 12 days ago: chengq9/CreativityBench