Tongyi-ConvAI's Collections
RM-NLHF
Updated Feb 25
Official collection for paper "Reward Modeling from Natural Language Human Feedback".
Tongyi-ConvAI/Baseline-Outcome-Reward-Qwen-7B • 8B • Updated Feb 25 • 2
Tongyi-ConvAI/RM-NLHF-Qwen-32B • 33B • Updated Feb 25 • 7
Tongyi-ConvAI/Final-MetaRM-RM-NLHF-Qwen-32B • 32B • Updated Feb 25 • 2
Tongyi-ConvAI/Final-MetaRM-RM-NLHF-Qwen-7B • 7B • Updated Feb 25 • 4
Tongyi-ConvAI/Cold-Start-MetaRM-RM-NLHF-Qwen-7B • 7B • Updated Feb 25 • 5
Tongyi-ConvAI/Cold-Start-MetaRM-RM-NLHF-Qwen-32B • 32B • Updated Feb 26 • 4 • 1
Tongyi-ConvAI/RM-NLHF-Qwen-7B • 8B • Updated Feb 25 • 4 • 2
Tongyi-ConvAI/RM-NLHF • Viewer • Updated Feb 25 • 49.5k • 11 • 1