Hugging Face
Xiao Hu
huxiao09
0 followers · 1 following
AI & ML interests
Reinforcement Learning, LLM Reasoning
Recent Activity
Authored a paper 17 days ago: Query-Policy Misalignment in Preference-Based Reinforcement Learning
Authored a paper 17 days ago: Open RL Benchmark: Comprehensive Tracked Experiments for Reinforcement Learning
Authored a paper 17 days ago: R1-Reward: Training Multimodal Reward Model Through Stable Reinforcement Learning
Organizations
None yet
Papers
5
arxiv:2507.01949
arxiv:2505.21067
arxiv:2505.02835
arxiv:2402.03046
models
0
None public yet
datasets
0
None public yet