rlhf-gpt2-pipeline / reward_model_final
3.25 MB
Nabeel Shan
Change RM Adapter extension
f35cb12