Hyperparameters:
- learning_rate: 2e-5
- weight_decay: 0.01
- per_device_train_batch_size: 8
- per_device_eval_batch_size: 8
- gradient_accumulation_steps: 1
- eval_steps: 50000
- max_length: 512
- num_epochs: 1
- hidden_dropout_prob: 0.3
- attention_probs_dropout_prob: 0.25
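
As a rough illustration, the settings above can be collected into a plain Python dictionary and used to derive two quantities the list implies: the number of examples consumed per optimizer update and the steps at which evaluation runs. This is only a sketch. The dictionary layout, the helper names `examples_per_optimizer_step` and `evaluation_steps`, and the `num_devices` parameter are hypothetical; the card does not state how many devices were used or which training script read these values.

```python
# Values copied from the hyperparameter list; the dict layout is illustrative only.
hyperparameters = {
    "learning_rate": 2e-5,
    "weight_decay": 0.01,
    "per_device_train_batch_size": 8,
    "per_device_eval_batch_size": 8,
    "gradient_accumulation_steps": 1,
    "eval_steps": 50_000,
    "max_length": 512,
    "num_epochs": 1,
    "hidden_dropout_prob": 0.3,
    "attention_probs_dropout_prob": 0.25,
}


def examples_per_optimizer_step(hp, num_devices=1):
    """Training examples consumed per optimizer update.

    num_devices is an assumption: the card does not state the device count.
    """
    return (
        hp["per_device_train_batch_size"]
        * hp["gradient_accumulation_steps"]
        * num_devices
    )


def evaluation_steps(hp, last_step):
    """Steps at which evaluation runs, up to and including last_step."""
    return list(range(hp["eval_steps"], last_step + 1, hp["eval_steps"]))


if __name__ == "__main__":
    print(examples_per_optimizer_step(hyperparameters))
    print(evaluation_steps(hyperparameters, 350_000))
```

With a single device, each optimizer update sees 8 examples, and evaluating every 50000 steps up to step 350000 yields seven evaluation points.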
 
Dataset version:
- tasky_or_not/10xp3nirstbbflanseuni_10xc4
 
Checkpoint:
- 300000 steps.
 
Results on the validation set:
| Step | Training Loss | Validation Loss | Accuracy | Precision | Recall | F1 | 
|---|---|---|---|---|---|---|
| 50000 | 0.020800 | 0.192550 | 0.970363 | 0.990686 | 0.949654 | 0.969736 | 
| 100000 | 0.015200 | 0.264168 | 0.969427 | 0.994374 | 0.944196 | 0.968636 | 
| 150000 | 0.012900 | 0.146541 | 0.981440 | 0.994599 | 0.968138 | 0.981190 | 
| 200000 | 0.011100 | 0.319310 | 0.970516 | 0.998871 | 0.942097 | 0.969654 | 
| 250000 | 0.008000 | 0.204103 | 0.976309 | 0.996226 | 0.956241 | 0.975824 | 
| 300000 | 0.006100 | 0.096262 | 0.988053 | 0.994676 | 0.981358 | 0.987972 | 
| 350000 | 0.005800 | 0.162989 | 0.983663 | 0.994730 | 0.972478 | 0.983478 | 
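
The table can be sanity-checked with a short script. The sketch below recomputes F1 as the harmonic mean of precision and recall, F1 = 2PR / (P + R), for each row and finds the step with the lowest validation loss. The numbers are copied from the table; the names `ROWS`, `f1_from`, `f1_matches_table`, and `best_step_by_validation_loss`, and the tolerance of 1e-5, are hypothetical choices for illustration and not part of the original training code.

```python
# (step, training loss, validation loss, accuracy, precision, recall, F1)
ROWS = [
    (50000, 0.020800, 0.192550, 0.970363, 0.990686, 0.949654, 0.969736),
    (100000, 0.015200, 0.264168, 0.969427, 0.994374, 0.944196, 0.968636),
    (150000, 0.012900, 0.146541, 0.981440, 0.994599, 0.968138, 0.981190),
    (200000, 0.011100, 0.319310, 0.970516, 0.998871, 0.942097, 0.969654),
    (250000, 0.008000, 0.204103, 0.976309, 0.996226, 0.956241, 0.975824),
    (300000, 0.006100, 0.096262, 0.988053, 0.994676, 0.981358, 0.987972),
    (350000, 0.005800, 0.162989, 0.983663, 0.994730, 0.972478, 0.983478),
]


def f1_from(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def f1_matches_table(rows, tolerance=1e-5):
    """True if every reported F1 agrees with the value recomputed from P and R."""
    return all(abs(f1_from(p, r) - f1) < tolerance for *_, p, r, f1 in rows)


def best_step_by_validation_loss(rows):
    """Step whose row has the lowest validation loss."""
    return min(rows, key=lambda row: row[2])[0]


if __name__ == "__main__":
    print(f1_matches_table(ROWS))
    print(best_step_by_validation_loss(ROWS))
```

The lowest validation loss falls at step 300000, which is also the row with the highest accuracy and F1 and matches the step listed under Checkpoint.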

Wandb logs: