Xiao Hu

huxiao09

AI & ML interests

Reinforcement Learning, LLM Reasoning

Recent Activity

authored a paper 20 days ago

Query-Policy Misalignment in Preference-Based Reinforcement Learning

authored a paper 20 days ago

Open RL Benchmark: Comprehensive Tracked Experiments for Reinforcement Learning

authored a paper 20 days ago

R1-Reward: Training Multimodal Reward Model Through Stable Reinforcement Learning

Organizations

None yet

huxiao09's datasets

None public yet