# RealGuardrails Models
This model was trained on the RealGuardrails dataset, an instruction-tuning dataset focused on improving system prompt adherence and precedence. Specifically, it was trained via SFT on the `simplemix` split (~150K examples) using our custom training library, torchllms, and then converted back to a transformers-compatible checkpoint.
## Training Hyperparameters
| Name | Value |
|------|-------|
| optimizer | AdamW |
| batch size | 128 |
| learning rate | 2e-5 |
| lr scheduler | cosine with 200 warmup steps |
| betas | (0.9, 0.999) |
| eps | 1e-8 |
| weight decay | 0 |
| epochs | 1 |
| max grad norm | 1.0 |
| precision | bf16 |
| max length | 4096 |
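The learning-rate schedule above can be sketched as follows. This is a minimal illustration, not the torchllms implementation: it assumes linear warmup over the 200 warmup steps followed by cosine decay to zero, with the peak learning rate of 2e-5 from the table. `total_steps` is an assumption here; with ~150K examples, a batch size of 128, and 1 epoch, it would be roughly 150000 / 128 ≈ 1172 optimizer steps.

```python
import math

# Values from the hyperparameter table above.
PEAK_LR = 2e-5
WARMUP_STEPS = 200

def lr_at(step: int, total_steps: int) -> float:
    """Learning rate at a given optimizer step: linear warmup, then cosine decay.

    This mirrors a common "cosine with warmup" schedule; the exact shape used
    by torchllms may differ (e.g., a nonzero floor or a different warmup curve).
    """
    if step < WARMUP_STEPS:
        # Linear ramp from 0 to the peak learning rate.
        return PEAK_LR * step / WARMUP_STEPS
    # Cosine decay from the peak down to 0 over the remaining steps.
    progress = (step - WARMUP_STEPS) / max(1, total_steps - WARMUP_STEPS)
    return PEAK_LR * 0.5 * (1.0 + math.cos(math.pi * progress))
```

At step 0 the rate is 0, at step 200 it reaches the 2e-5 peak, and it decays to 0 by the final step.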