3ea24f2ed1b858a23541f31d599e5b34

This model is a fine-tuned version of google-t5/t5-base on the Helsinki-NLP/opus_books [it-sv] dataset. It achieves the following results on the evaluation set:

  • Loss: 2.2718
  • Data Size: 1.0
  • Epoch Runtime: 21.1311
  • Bleu: 2.0758
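
For reference, a minimal inference sketch follows (the model id is the repository name; whether the fine-tuning script prepended a T5-style task prefix such as "translate Italian to Swedish: " is an assumption, so adjust the input format to match how the model was actually trained):

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "contemmcm/3ea24f2ed1b858a23541f31d599e5b34"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# Whether fine-tuning used a T5-style task prefix is an assumption;
# drop the prefix if the model was trained on raw source sentences.
text = "translate Italian to Swedish: Il gatto dorme sul divano."
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64, num_beams=4)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```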

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed
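
The summary above names the Helsinki-NLP/opus_books [it-sv] pair; a minimal loading sketch with the datasets library is below (the "it-sv" config name follows opus_books' language-pair convention, and the train/validation split used for this run is not documented here):

```python
from datasets import load_dataset

# opus_books ships a single "train" split per language pair; how it was
# divided into train/validation for this run is unknown.
dataset = load_dataset("Helsinki-NLP/opus_books", "it-sv")
print(dataset["train"][0]["translation"])  # {"it": "...", "sv": "..."}
```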

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • total_train_batch_size: 32
  • total_eval_batch_size: 32
  • optimizer: AdamW (adamw_torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: constant
  • num_epochs: 50
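
A minimal sketch of these settings expressed as Seq2SeqTrainingArguments (the output directory is hypothetical, and the 4-GPU launch via torchrun or accelerate is assumed rather than shown):

```python
from transformers import Seq2SeqTrainingArguments

# Mirrors the hyperparameters listed above. A per-device batch size of 8
# across 4 GPUs gives the reported effective train/eval batch size of 32.
training_args = Seq2SeqTrainingArguments(
    output_dir="t5-base-opus-books-it-sv",  # hypothetical path
    learning_rate=5e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="constant",
    num_train_epochs=50,
    predict_with_generate=True,
)
```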

Training results

| Training Loss | Epoch | Step | Validation Loss | Data Size | Epoch Runtime | Bleu   |
|:-------------:|:-----:|:----:|:---------------:|:---------:|:-------------:|:------:|
| No log        | 0     | 0    | 4.6309          | 0         | 2.1849        | 0.1822 |
| No log        | 1     | 74   | 4.5341          | 0.0078    | 2.2990        | 0.2002 |
| No log        | 2     | 148  | 4.1256          | 0.0156    | 2.9168        | 0.2166 |
| 0.1573        | 3     | 222  | 3.7375          | 0.0312    | 3.9709        | 0.2079 |
| 0.1573        | 4     | 296  | 3.6161          | 0.0625    | 4.9454        | 0.1545 |
| 0.2639        | 5     | 370  | 3.4622          | 0.125     | 6.3778        | 0.2279 |
| 0.2639        | 6     | 444  | 3.3369          | 0.25      | 10.1854       | 0.3524 |
| 0.7481        | 7     | 518  | 3.1923          | 0.5       | 11.8463       | 0.3223 |
| 2.1397        | 8.0   | 592  | 3.0346          | 1.0       | 20.9981       | 0.5087 |
| 3.146         | 9.0   | 666  | 2.9258          | 1.0       | 19.1073       | 0.5852 |
| 3.0648        | 10.0  | 740  | 2.8472          | 1.0       | 20.3700       | 0.7541 |
| 2.9566        | 11.0  | 814  | 2.7775          | 1.0       | 21.0247       | 0.8917 |
| 2.894         | 12.0  | 888  | 2.7275          | 1.0       | 20.0655       | 0.9357 |
| 2.802         | 13.0  | 962  | 2.6781          | 1.0       | 20.3078       | 1.0024 |
| 2.7706        | 14.0  | 1036 | 2.6386          | 1.0       | 19.4287       | 1.0168 |
| 2.6821        | 15.0  | 1110 | 2.6058          | 1.0       | 21.8017       | 0.9984 |
| 2.6472        | 16.0  | 1184 | 2.5715          | 1.0       | 20.1687       | 1.0827 |
| 2.5909        | 17.0  | 1258 | 2.5406          | 1.0       | 20.1049       | 1.1494 |
| 2.5565        | 18.0  | 1332 | 2.5214          | 1.0       | 19.9202       | 1.2234 |
| 2.4964        | 19.0  | 1406 | 2.4951          | 1.0       | 20.0591       | 1.2734 |
| 2.4667        | 20.0  | 1480 | 2.4675          | 1.0       | 21.0128       | 1.2103 |
| 2.4218        | 21.0  | 1554 | 2.4467          | 1.0       | 21.0528       | 1.2792 |
| 2.3851        | 22.0  | 1628 | 2.4419          | 1.0       | 21.1871       | 1.3279 |
| 2.334         | 23.0  | 1702 | 2.4227          | 1.0       | 20.1782       | 1.3110 |
| 2.3197        | 24.0  | 1776 | 2.3996          | 1.0       | 20.7571       | 1.4217 |
| 2.2742        | 25.0  | 1850 | 2.3887          | 1.0       | 20.8407       | 1.4773 |
| 2.2534        | 26.0  | 1924 | 2.3755          | 1.0       | 19.8197       | 1.4696 |
| 2.2366        | 27.0  | 1998 | 2.3680          | 1.0       | 20.9092       | 1.5464 |
| 2.2003        | 28.0  | 2072 | 2.3532          | 1.0       | 19.9523       | 1.5650 |
| 2.1657        | 29.0  | 2146 | 2.3426          | 1.0       | 21.2934       | 1.6725 |
| 2.1374        | 30.0  | 2220 | 2.3273          | 1.0       | 20.8864       | 1.6748 |
| 2.112         | 31.0  | 2294 | 2.3211          | 1.0       | 21.6561       | 1.7425 |
| 2.0838        | 32.0  | 2368 | 2.3296          | 1.0       | 21.1446       | 1.7935 |
| 2.0718        | 33.0  | 2442 | 2.3094          | 1.0       | 20.0078       | 1.8223 |
| 2.041         | 34.0  | 2516 | 2.2992          | 1.0       | 19.7819       | 1.8350 |
| 2.0212        | 35.0  | 2590 | 2.3038          | 1.0       | 21.2442       | 1.8237 |
| 1.9988        | 36.0  | 2664 | 2.2849          | 1.0       | 21.4291       | 1.9246 |
| 1.9767        | 37.0  | 2738 | 2.2840          | 1.0       | 21.1425       | 1.9711 |
| 1.9357        | 38.0  | 2812 | 2.2883          | 1.0       | 20.4910       | 1.9415 |
| 1.9175        | 39.0  | 2886 | 2.2718          | 1.0       | 22.1560       | 1.9449 |
| 1.8986        | 40.0  | 2960 | 2.2752          | 1.0       | 20.7935       | 1.9755 |
| 1.8808        | 41.0  | 3034 | 2.2679          | 1.0       | 20.4600       | 2.0046 |
| 1.8552        | 42.0  | 3108 | 2.2829          | 1.0       | 20.6810       | 1.9760 |
| 1.8316        | 43.0  | 3182 | 2.2647          | 1.0       | 20.8495       | 2.0886 |
| 1.8148        | 44.0  | 3256 | 2.2536          | 1.0       | 20.6102       | 2.0035 |
| 1.8051        | 45.0  | 3330 | 2.2611          | 1.0       | 22.5918       | 2.0329 |
| 1.7788        | 46.0  | 3404 | 2.2717          | 1.0       | 22.0629       | 2.0431 |
| 1.7648        | 47.0  | 3478 | 2.2586          | 1.0       | 21.0546       | 2.0633 |
| 1.7331        | 48.0  | 3552 | 2.2718          | 1.0       | 21.1311       | 2.0758 |
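
The Bleu column tracks the evaluation metric at each epoch; a sketch of computing a comparable score with the evaluate library is below (whether this run used sacreBLEU or another BLEU implementation is an assumption):

```python
import evaluate

# sacreBLEU via the `evaluate` library; the exact metric implementation
# used during training is an assumption.
bleu = evaluate.load("sacrebleu")

predictions = ["Katten sover på soffan."]   # model outputs
references = [["Katten sover i soffan."]]   # gold Swedish translations
result = bleu.compute(predictions=predictions, references=references)
print(round(result["score"], 4))
```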

Framework versions

  • Transformers 4.57.0
  • Pytorch 2.8.0+cu128
  • Datasets 4.2.0
  • Tokenizers 0.22.1
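
A trivial check that a local environment matches these pins, assuming the packages are importable:

```python
import datasets
import tokenizers
import torch
import transformers

# Print installed versions to compare against the pins listed above.
for mod in (transformers, torch, datasets, tokenizers):
    print(mod.__name__, mod.__version__)
```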