# 3ea24f2ed1b858a23541f31d599e5b34
This model is a fine-tuned version of google-t5/t5-base on the Helsinki-NLP/opus_books [it-sv] dataset. It achieves the following results on the evaluation set:
- Loss: 2.2718
- Data Size: 1.0
- Epoch Runtime: 21.1311
- Bleu: 2.0758
## Model description
More information needed
## Intended uses & limitations
More information needed
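The card does not yet document usage, but since this is a standard T5 sequence-to-sequence checkpoint, a minimal inference sketch with `transformers` might look like the following. The repo id and the T5-style task prefix are assumptions for illustration and are not confirmed by the card.

```python
# Minimal inference sketch (assumptions: the checkpoint lives at the Hub repo
# "contemmcm/3ea24f2ed1b858a23541f31d599e5b34" and the model was trained with a
# T5-style task prefix such as "translate Italian to Swedish: ").
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "contemmcm/3ea24f2ed1b858a23541f31d599e5b34"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

text = "translate Italian to Swedish: Il gatto dorme sul divano."
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64, num_beams=4)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```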
## Training and evaluation data
More information needed
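The corpus named above can be loaded directly from the Hub; the split and preprocessing actually used for this run are not documented, so the sketch below only shows the raw dataset.

```python
# Sketch of loading the Helsinki-NLP/opus_books it-sv pair named on this card.
# How it was split and preprocessed for this training run is not documented.
from datasets import load_dataset

dataset = load_dataset("Helsinki-NLP/opus_books", "it-sv")
print(dataset)              # the dataset ships a single "train" split of sentence pairs
print(dataset["train"][0])  # {"id": ..., "translation": {"it": ..., "sv": ...}}
```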
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- total_train_batch_size: 32
- total_eval_batch_size: 32
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: constant
- num_epochs: 50
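For reference, the hyperparameters above map roughly onto `Seq2SeqTrainingArguments` as sketched below. The output directory, evaluation cadence, and `predict_with_generate` setting are assumptions filled in for illustration; the card does not state them.

```python
# Rough mapping of the listed hyperparameters onto Seq2SeqTrainingArguments.
# Values not stated on the card (output_dir, eval_strategy, predict_with_generate)
# are assumptions for illustration only.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="t5-base-opus-books-it-sv",  # assumed name
    learning_rate=5e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    num_train_epochs=50,
    lr_scheduler_type="constant",
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    eval_strategy="epoch",        # assumed; the table reports per-epoch metrics
    predict_with_generate=True,   # needed to compute BLEU during evaluation
)
```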
### Training results
| Training Loss | Epoch | Step | Validation Loss | Data Size | Epoch Runtime | Bleu |
|---|---|---|---|---|---|---|
| No log | 0.0 | 0 | 4.6309 | 0 | 2.1849 | 0.1822 |
| No log | 1.0 | 74 | 4.5341 | 0.0078 | 2.2990 | 0.2002 |
| No log | 2.0 | 148 | 4.1256 | 0.0156 | 2.9168 | 0.2166 |
| 0.1573 | 3.0 | 222 | 3.7375 | 0.0312 | 3.9709 | 0.2079 |
| 0.1573 | 4.0 | 296 | 3.6161 | 0.0625 | 4.9454 | 0.1545 |
| 0.2639 | 5.0 | 370 | 3.4622 | 0.125 | 6.3778 | 0.2279 |
| 0.2639 | 6.0 | 444 | 3.3369 | 0.25 | 10.1854 | 0.3524 |
| 0.7481 | 7.0 | 518 | 3.1923 | 0.5 | 11.8463 | 0.3223 |
| 2.1397 | 8.0 | 592 | 3.0346 | 1.0 | 20.9981 | 0.5087 |
| 3.146 | 9.0 | 666 | 2.9258 | 1.0 | 19.1073 | 0.5852 |
| 3.0648 | 10.0 | 740 | 2.8472 | 1.0 | 20.3700 | 0.7541 |
| 2.9566 | 11.0 | 814 | 2.7775 | 1.0 | 21.0247 | 0.8917 |
| 2.894 | 12.0 | 888 | 2.7275 | 1.0 | 20.0655 | 0.9357 |
| 2.802 | 13.0 | 962 | 2.6781 | 1.0 | 20.3078 | 1.0024 |
| 2.7706 | 14.0 | 1036 | 2.6386 | 1.0 | 19.4287 | 1.0168 |
| 2.6821 | 15.0 | 1110 | 2.6058 | 1.0 | 21.8017 | 0.9984 |
| 2.6472 | 16.0 | 1184 | 2.5715 | 1.0 | 20.1687 | 1.0827 |
| 2.5909 | 17.0 | 1258 | 2.5406 | 1.0 | 20.1049 | 1.1494 |
| 2.5565 | 18.0 | 1332 | 2.5214 | 1.0 | 19.9202 | 1.2234 |
| 2.4964 | 19.0 | 1406 | 2.4951 | 1.0 | 20.0591 | 1.2734 |
| 2.4667 | 20.0 | 1480 | 2.4675 | 1.0 | 21.0128 | 1.2103 |
| 2.4218 | 21.0 | 1554 | 2.4467 | 1.0 | 21.0528 | 1.2792 |
| 2.3851 | 22.0 | 1628 | 2.4419 | 1.0 | 21.1871 | 1.3279 |
| 2.334 | 23.0 | 1702 | 2.4227 | 1.0 | 20.1782 | 1.3110 |
| 2.3197 | 24.0 | 1776 | 2.3996 | 1.0 | 20.7571 | 1.4217 |
| 2.2742 | 25.0 | 1850 | 2.3887 | 1.0 | 20.8407 | 1.4773 |
| 2.2534 | 26.0 | 1924 | 2.3755 | 1.0 | 19.8197 | 1.4696 |
| 2.2366 | 27.0 | 1998 | 2.3680 | 1.0 | 20.9092 | 1.5464 |
| 2.2003 | 28.0 | 2072 | 2.3532 | 1.0 | 19.9523 | 1.5650 |
| 2.1657 | 29.0 | 2146 | 2.3426 | 1.0 | 21.2934 | 1.6725 |
| 2.1374 | 30.0 | 2220 | 2.3273 | 1.0 | 20.8864 | 1.6748 |
| 2.112 | 31.0 | 2294 | 2.3211 | 1.0 | 21.6561 | 1.7425 |
| 2.0838 | 32.0 | 2368 | 2.3296 | 1.0 | 21.1446 | 1.7935 |
| 2.0718 | 33.0 | 2442 | 2.3094 | 1.0 | 20.0078 | 1.8223 |
| 2.041 | 34.0 | 2516 | 2.2992 | 1.0 | 19.7819 | 1.8350 |
| 2.0212 | 35.0 | 2590 | 2.3038 | 1.0 | 21.2442 | 1.8237 |
| 1.9988 | 36.0 | 2664 | 2.2849 | 1.0 | 21.4291 | 1.9246 |
| 1.9767 | 37.0 | 2738 | 2.2840 | 1.0 | 21.1425 | 1.9711 |
| 1.9357 | 38.0 | 2812 | 2.2883 | 1.0 | 20.4910 | 1.9415 |
| 1.9175 | 39.0 | 2886 | 2.2718 | 1.0 | 22.1560 | 1.9449 |
| 1.8986 | 40.0 | 2960 | 2.2752 | 1.0 | 20.7935 | 1.9755 |
| 1.8808 | 41.0 | 3034 | 2.2679 | 1.0 | 20.4600 | 2.0046 |
| 1.8552 | 42.0 | 3108 | 2.2829 | 1.0 | 20.6810 | 1.9760 |
| 1.8316 | 43.0 | 3182 | 2.2647 | 1.0 | 20.8495 | 2.0886 |
| 1.8148 | 44.0 | 3256 | 2.2536 | 1.0 | 20.6102 | 2.0035 |
| 1.8051 | 45.0 | 3330 | 2.2611 | 1.0 | 22.5918 | 2.0329 |
| 1.7788 | 46.0 | 3404 | 2.2717 | 1.0 | 22.0629 | 2.0431 |
| 1.7648 | 47.0 | 3478 | 2.2586 | 1.0 | 21.0546 | 2.0633 |
| 1.7331 | 48.0 | 3552 | 2.2718 | 1.0 | 21.1311 | 2.0758 |
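The BLEU scores above were computed during training-time evaluation; the card does not state which BLEU implementation was used. As an illustration only, a sketch of scoring model outputs against references with the `evaluate` library's sacrebleu metric (a common choice for Trainer-based translation fine-tuning) is shown below; the sentences are hypothetical.

```python
# Scoring sketch: compares candidate translations against references with sacrebleu.
# The exact metric implementation behind the numbers above is not documented on
# the card, so treat this only as an illustration of how such scores are computed.
import evaluate

bleu = evaluate.load("sacrebleu")
predictions = ["Katten sover på soffan."]   # hypothetical model output
references = [["Katten sover i soffan."]]   # hypothetical reference translation
print(bleu.compute(predictions=predictions, references=references)["score"])
```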
### Framework versions
- Transformers 4.57.0
- PyTorch 2.8.0+cu128
- Datasets 4.2.0
- Tokenizers 0.22.1