rlhf-qa-ppo / latest
kastan
Step 3 of 3; First attempt at a PPO fine-tuned model.
959dbed
13 Bytes
pytorch_model