Haoyuan WU
hywu
AI & ML interests
None yet
Recent Activity
authored a paper 18 days ago
One-Token Rollout: Guiding Supervised Fine-Tuning of LLMs with Policy Gradient
upvoted a paper 20 days ago
One-Token Rollout: Guiding Supervised Fine-Tuning of LLMs with Policy Gradient
authored a paper 29 days ago
Reinforcement Learning on Pre-Training Data
Organizations
None yet