[ICLR'24 Spotlight] Tool-Augmented Reward Modeling
ernie-research
community
AI & ML interests: Large Language Models
Models (12)
ernie-research/Themis-7b
ernie-research/APPS-Gemma-7B-MA-PPO-Fixed10 (9B params)
ernie-research/APPS-Gemma-2B-MA-PPO-Fixed10 (3B params)
ernie-research/HH-RLHF-Gemma-2B-MA-PPO-Fixed5 (3B params)
ernie-research/HH-RLHF-Gemma-7B-MA-PPO-Fixed5 (9B params)
ernie-research/TLDR-Gemma-7B-MA-PPO-Fixed5 (9B params)
ernie-research/TLDR-Gemma-2B-MA-PPO-Fixed5 (3B params)
ernie-research/TLDR-Gemma-2-27B-MA-PPO-Fixed5 (27B params)
ernie-research/ernie-code-560m
ernie-research/MonoGPT (Text Generation, 0.4B params)