Shihan Dou
Ablustrund
2 followers · 3 following
AI & ML interests
Natural Language Processing, Large Language Models
Recent Activity
Upvoted a paper 14 days ago: Reasoning or Memorization? Unreliable Results of Reinforcement Learning Due to Data Contamination
Authored a paper 20 days ago: Loose lips sink ships: Mitigating Length Bias in Reinforcement Learning from Human Feedback
Authored a paper 20 days ago: Improving Generalization of Alignment with Human Preferences through Group Invariant Learning
Organizations
Papers: 22
arxiv:2507.05197
arxiv:2504.13914
arxiv:2502.17184
arxiv:2412.12505
Models: 1
Ablustrund/moss-rlhf-reward-model-7B-zh • Updated Jul 13, 2023 • 23
Datasets: 0 (none public yet)