Yanxi Chen

yanxi-chen

AI & ML interests

None yet

Recent Activity

authored a paper about 2 months ago

On-Policy RL Meets Off-Policy Experts: Harmonizing Supervised Fine-Tuning and Reinforcement Learning via Dynamic Weighting

authored a paper about 2 months ago

Group-Relative REINFORCE Is Secretly an Off-Policy Algorithm: Demystifying Some Myths About GRPO and Its Friends

upvoted a paper about 2 months ago

Group-Relative REINFORCE Is Secretly an Off-Policy Algorithm: Demystifying Some Myths About GRPO and Its Friends

Organizations

None yet

Papers 8

arxiv:2509.24203

arxiv:2508.11408

arxiv:2505.17826

arxiv:2505.12629

Models 0

None public yet

Datasets 0

None public yet