Xiao Hu

huxiao09

AI & ML interests

Reinforcement Learning, LLM Reasoning

Recent Activity

authored a paper 17 days ago

Query-Policy Misalignment in Preference-Based Reinforcement Learning

authored a paper 17 days ago

Open RL Benchmark: Comprehensive Tracked Experiments for Reinforcement Learning

authored a paper 17 days ago

R1-Reward: Training Multimodal Reward Model Through Stable Reinforcement Learning

Organizations

None yet

Papers 5

arxiv:2507.01949

arxiv:2505.21067

arxiv:2505.02835

arxiv:2402.03046

Models 0

None public yet

Datasets 0

None public yet