eternis_router_encoder_sft_9Sep

This model is a fine-tuned version of FacebookAI/roberta-base on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 0.7641
  • MSE: 0.4879
  • MAE: 0.4952
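
The reported MSE and MAE suggest a single-output regression head on top of the encoder. A minimal inference sketch under that assumption (the actual task head is not documented in this card, and the input string is a placeholder):

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Assumption: a single-output regression head (num_labels=1), inferred from
# the MSE/MAE metrics above; the actual task head is not documented.
model_id = "eternis/eternis_router_encoder_sft_9Sep"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

# Placeholder input; replace with the text the router is meant to score.
inputs = tokenizer("Example query to score", return_tensors="pt", truncation=True)
with torch.no_grad():
    score = model(**inputs).logits.squeeze(-1).item()
print(score)
```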

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 16
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 32
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 20
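
The same settings expressed as transformers TrainingArguments, as a reproduction sketch (output_dir is a placeholder, and the model/dataset wiring is omitted):

```python
from transformers import TrainingArguments

# Sketch of the hyperparameters listed above; output_dir is a placeholder.
training_args = TrainingArguments(
    output_dir="eternis_router_encoder_sft_9Sep",
    learning_rate=5e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=16,
    seed=42,
    gradient_accumulation_steps=4,  # total train batch size: 8 * 4 = 32
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_ratio=0.1,
    num_train_epochs=20,
)
```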

Training results

| Training Loss | Epoch   | Step  | Validation Loss | MSE    | MAE    |
|:-------------:|:-------:|:-----:|:---------------:|:------:|:------:|
| 8.5061        | 0.3429  | 300   | 1.7282          | 1.0945 | 0.8522 |
| 8.1139        | 0.6857  | 600   | 1.6352          | 1.0425 | 0.8249 |
| 7.7043        | 1.0286  | 900   | 1.5126          | 0.9853 | 0.8051 |
| 6.9404        | 1.3714  | 1200  | 1.3621          | 0.8932 | 0.7625 |
| 6.6003        | 1.7143  | 1500  | 1.2822          | 0.8509 | 0.7413 |
| 5.8303        | 2.0571  | 1800  | 1.1538          | 0.7647 | 0.7005 |
| 5.4533        | 2.4     | 2100  | 1.1136          | 0.7337 | 0.6812 |
| 5.0087        | 2.7429  | 2400  | 1.0895          | 0.7149 | 0.6684 |
| 4.8301        | 3.0857  | 2700  | 1.0786          | 0.7060 | 0.6610 |
| 4.676         | 3.4286  | 3000  | 1.0654          | 0.6971 | 0.6550 |
| 4.6187        | 3.7714  | 3300  | 1.0473          | 0.6842 | 0.6447 |
| 4.3955        | 4.1143  | 3600  | 1.0129          | 0.6608 | 0.6308 |
| 4.4388        | 4.4571  | 3900  | 1.0255          | 0.6688 | 0.6355 |
| 4.3574        | 4.8     | 4200  | 0.9843          | 0.6405 | 0.6154 |
| 4.384         | 5.1429  | 4500  | 0.9835          | 0.6403 | 0.6153 |
| 4.4066        | 5.4857  | 4800  | 1.0027          | 0.6535 | 0.6256 |
| 4.4052        | 5.8286  | 5100  | 0.9643          | 0.6275 | 0.6050 |
| 4.3846        | 6.1714  | 5400  | 0.9507          | 0.6175 | 0.5982 |
| 4.31          | 6.5143  | 5700  | 0.9536          | 0.6193 | 0.5989 |
| 4.2701        | 6.8571  | 6000  | 0.9481          | 0.6159 | 0.5943 |
| 4.2284        | 7.2     | 6300  | 0.9181          | 0.5951 | 0.5815 |
| 4.1039        | 7.5429  | 6600  | 0.9036          | 0.5857 | 0.5731 |
| 4.2027        | 7.8857  | 6900  | 0.8906          | 0.5758 | 0.5655 |
| 4.1092        | 8.2286  | 7200  | 0.8899          | 0.5761 | 0.5653 |
| 4.213         | 8.5714  | 7500  | 0.8903          | 0.5778 | 0.5658 |
| 4.1775        | 8.9143  | 7800  | 0.8766          | 0.5663 | 0.5582 |
| 4.1293        | 9.2571  | 8100  | 0.8891          | 0.5739 | 0.5637 |
| 4.135         | 9.6     | 8400  | 0.8823          | 0.5701 | 0.5599 |
| 4.1324        | 9.9429  | 8700  | 0.8633          | 0.5590 | 0.5546 |
| 4.0383        | 10.2857 | 9000  | 0.8562          | 0.5524 | 0.5462 |
| 4.0424        | 10.6286 | 9300  | 0.8390          | 0.5403 | 0.5376 |
| 4.0851        | 10.9714 | 9600  | 0.8455          | 0.5446 | 0.5421 |
| 3.9225        | 11.3143 | 9900  | 0.8190          | 0.5283 | 0.5305 |
| 4.0771        | 11.6571 | 10200 | 0.8195          | 0.5261 | 0.5268 |
| 3.9188        | 12.0    | 10500 | 0.8112          | 0.5218 | 0.5233 |
| 3.9961        | 12.3429 | 10800 | 0.8157          | 0.5247 | 0.5241 |
| 3.9793        | 12.6857 | 11100 | 0.7877          | 0.5059 | 0.5097 |
| 4.0327        | 13.0286 | 11400 | 0.8180          | 0.5256 | 0.5242 |
| 3.9755        | 13.3714 | 11700 | 0.8043          | 0.5173 | 0.5183 |
| 3.9531        | 13.7143 | 12000 | 0.7860          | 0.5035 | 0.5060 |
| 3.9182        | 14.0571 | 12300 | 0.7914          | 0.5070 | 0.5123 |
| 3.8201        | 14.4    | 12600 | 0.7873          | 0.5033 | 0.5095 |
| 3.8887        | 14.7429 | 12900 | 0.7868          | 0.5039 | 0.5077 |
| 3.9021        | 15.0857 | 13200 | 0.7736          | 0.4964 | 0.5016 |
| 3.8241        | 15.4286 | 13500 | 0.7677          | 0.4903 | 0.5002 |
| 3.8387        | 15.7714 | 13800 | 0.7751          | 0.4953 | 0.5025 |
| 3.8344        | 16.1143 | 14100 | 0.7802          | 0.5004 | 0.5023 |
| 3.8173        | 16.4571 | 14400 | 0.7588          | 0.4853 | 0.4958 |
| 3.9998        | 16.8    | 14700 | 0.7556          | 0.4828 | 0.4915 |
| 3.838         | 17.1429 | 15000 | 0.7690          | 0.4916 | 0.4988 |
| 3.7549        | 17.4857 | 15300 | 0.7545          | 0.4819 | 0.4920 |
| 3.8553        | 17.8286 | 15600 | 0.7901          | 0.5058 | 0.5077 |
| 3.7462        | 18.1714 | 15900 | 0.7558          | 0.4829 | 0.4947 |
| 3.7608        | 18.5143 | 16200 | 0.7518          | 0.4796 | 0.4890 |
| 3.9487        | 18.8571 | 16500 | 0.7590          | 0.4836 | 0.4915 |
| 3.9489        | 19.2    | 16800 | 0.7278          | 0.4639 | 0.4758 |
| 3.8653        | 19.5429 | 17100 | 0.7498          | 0.4786 | 0.4914 |
| 3.7533        | 19.8857 | 17400 | 0.7641          | 0.4879 | 0.4952 |
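
A minimal compute_metrics sketch that would produce the MSE and MAE columns reported above (the exact function used during training is not documented):

```python
import numpy as np

def compute_metrics(eval_pred):
    """MSE and MAE for a single-output regression head (assumed task setup)."""
    predictions, labels = eval_pred
    predictions = np.squeeze(predictions)
    mse = float(np.mean((predictions - labels) ** 2))
    mae = float(np.mean(np.abs(predictions - labels)))
    return {"mse": mse, "mae": mae}
```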

Framework versions

  • Transformers 4.56.1
  • PyTorch 2.7.0
  • Datasets 4.0.0
  • Tokenizers 0.22.0

Safetensors

  • Model size: 0.1B params
  • Tensor type: F32