RLHFlow
university
Followers: 139
AI & ML interests
Workflow of Reinforcement Learning from Human Feedback (RLHF). Blog: https://rlhflow.github.io/
Recent Activity
baohao updated a collection 1 day ago: Reinforce-Ada
baohao updated a dataset 1 day ago: RLHFlow/reinforce_ada_hard_prompt
baohao published a dataset 1 day ago: RLHFlow/reinforce_ada_hard_prompt
Team members: 9
RLHFlow's datasets (84), sorted by most recently updated
RLHFlow/DS-MATH500-Test-Result-of-Mistral-ORM • Updated Nov 8, 2024 • 500 • 11
RLHFlow/DS-GSM8K-Test-Result-of-Mistral-ORM • Updated Nov 8, 2024 • 1.32k • 12
RLHFlow/DS-GSM8K-Test-Result-of-DS-ORM • Updated Nov 8, 2024 • 1.32k • 10
RLHFlow/DS-MATH500-Test-Result-of-DS-ORM • Updated Nov 8, 2024 • 500 • 16
RLHFlow/Mistral-PRM-Data-No-Final-Ans • Updated Nov 6, 2024 • 273k • 9
RLHFlow/Deepseek-ORM-Data-Pairwise • Updated Nov 4, 2024 • 36k • 22 • 1
RLHFlow/Mistral-ORM-Data-Pairwise • Updated Nov 3, 2024 • 37.9k • 11
RLHFlow/Deepseek-GSM8K-Test • Updated Nov 3, 2024 • 1.32k • 74
RLHFlow/RLHFlow-SFT-Dataset-ver2 • Updated Nov 2, 2024 • 2.32M • 98 • 5
RLHFlow/Mistral-GSM8K-Test • Updated Nov 2, 2024 • 1.32k • 15
RLHFlow/ultrafeedback_all • Updated Oct 29, 2024 • 59.6k • 9
RLHFlow/Llama3-SFT-RAFT-Ultrafeedback-iter1 • Updated Sep 21, 2024 • 20k • 8
RLHFlow/ultrafeedback_iter3 • Updated Sep 19, 2024 • 19.6k • 33
RLHFlow/ultrafeedback_iter2 • Updated Sep 19, 2024 • 20k • 46
RLHFlow/ultrafeedback_iter1 • Updated Sep 19, 2024 • 20k • 47
RLHFlow/pair-preference-Skywork-80K-v0.1 • Updated Sep 9, 2024 • 82k • 8
RLHFlow/ArmoRM-Multi-Objective-Data-v0.2 • Updated Sep 7, 2024 • 555k • 6
RLHFlow/ArmoRM-Multi-Objective-Data-v0.1 • Updated Sep 7, 2024 • 569k • 63 • 3
RLHFlow/pair_data_v2_80K_wsafety_short • Updated Aug 24, 2024 • 790k • 34
RLHFlow/pair_data_v2_78_wo_safety • Updated Jul 26, 2024 • 777k • 21
RLHFlow/pair_data_v2_80K_wsafety • Updated Jul 26, 2024 • 803k • 33 • 3
RLHFlow/preference_data_v2_80K_wsafety • Updated Jul 26, 2024 • 803k • 21
RLHFlow/preference_data_v2_78K • Updated Jul 26, 2024 • 777k • 17
RLHFlow/lmsys-delete-tie-standard • Updated Jul 22, 2024 • 39.7k • 5
RLHFlow/Helpsteer2-standard • Updated Jul 22, 2024 • 8.05k • 4 • 1
RLHFlow/iterative-prompt-v1-iter9-20K • Updated Jun 12, 2024 • 19.9k • 18 • 1
RLHFlow/iterative-prompt-v1-iter8-20K • Updated Jun 12, 2024 • 20k • 13
RLHFlow/iterative-prompt-v1-iter7-20K • Updated Jun 12, 2024 • 20k • 10
RLHFlow/iterative-prompt-v1-iter6-20K • Updated Jun 12, 2024 • 20k • 13
RLHFlow/iterative-prompt-v1-iter5-20K • Updated Jun 12, 2024 • 20k • 12