AdversarialRLHF/rloo_pythia410m_tldr6.9b_rm410mdata_mergedsft_prefix_nokl
Safetensors
gpt_neox
a73f86d
rloo_pythia410m_tldr6.9b_rm410mdata_mergedsft_prefix_nokl/checkpoint-26/generation_config.json
Muqeeth: Training in progress, step 26, checkpoint (commit 6173354, verified, 7 months ago)
90 Bytes
{
  "_from_model_config": true,
  "bos_token_id": 0,
  "transformers_version": "4.50.3"
}
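
The file above sets only `bos_token_id` and carries two bookkeeping fields, so it may help to see how it parses. The following is a minimal sketch using only the Python standard library, with the JSON inlined so it runs offline. The helper name `summarize_generation_config` and the rule for what counts as metadata (keys starting with `_`, plus `transformers_version`) are assumptions made for illustration, not part of the repository or of the `transformers` API.

```python
import json

# Contents of generation_config.json, copied from the file shown above.
RAW_CONFIG = """
{
  "_from_model_config": true,
  "bos_token_id": 0,
  "transformers_version": "4.50.3"
}
"""


def summarize_generation_config(raw: str) -> dict:
    """Parse the JSON and separate generation settings from bookkeeping fields.

    Hypothetical helper: the split below is an illustrative convention only.
    """
    config = json.loads(raw)
    # Assumption: keys starting with "_" and the version stamp are metadata.
    metadata_keys = {
        key for key in config
        if key.startswith("_") or key == "transformers_version"
    }
    return {
        "generation": {k: v for k, v in config.items() if k not in metadata_keys},
        "metadata": {k: config[k] for k in sorted(metadata_keys)},
    }


summary = summarize_generation_config(RAW_CONFIG)
print("generation settings:", summary["generation"])
print("metadata:", summary["metadata"])
```

Under this split, the only generation setting is `bos_token_id`. The file specifies no sampling or length parameters, so whatever consumes it would presumably fall back to its own defaults for those.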