7372f17ce60385262d23dba2b7623114

This model is a fine-tuned version of google/umt5-base on the Helsinki-NLP/opus_books [en-it] dataset. It achieves the following results on the evaluation set:

  • Loss: 2.1349
  • Data Size: 1.0
  • Epoch Runtime: 187.4076
  • Bleu: 7.3674

Model description

More information needed

Intended uses & limitations

More information needed
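
Pending fuller documentation, the sketch below shows one way to run en→it inference with this checkpoint. The absence of a task prefix is an assumption, since the preprocessing used during fine-tuning is not documented.

```python
# Minimal en->it inference sketch; the lack of a task prefix is an assumption,
# as the fine-tuning preprocessing is not documented on this card.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "contemmcm/7372f17ce60385262d23dba2b7623114"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

inputs = tokenizer("All happy families are alike.", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64, num_beams=4)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```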

Training and evaluation data

More information needed
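
For reference, the en-it configuration of the dataset named above can be loaded as follows. opus_books ships only a train split, so the train/evaluation partition used for this model is unknown; the 90/10 split below is purely illustrative.

```python
# Sketch of loading Helsinki-NLP/opus_books (en-it); the actual train/eval
# split used for this model is undocumented, so the 90/10 split is an example.
from datasets import load_dataset

dataset = load_dataset("Helsinki-NLP/opus_books", "en-it")
splits = dataset["train"].train_test_split(test_size=0.1, seed=42)
print(splits["train"][0]["translation"])  # {'en': '...', 'it': '...'}
```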

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • total_train_batch_size: 32
  • total_eval_batch_size: 32
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: constant
  • num_epochs: 50
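
These settings map onto Seq2SeqTrainingArguments roughly as sketched below; the output directory is hypothetical, and the 4-GPU setup is assumed to come from the launcher (e.g. torchrun) rather than from the arguments themselves.

```python
# Rough Seq2SeqTrainingArguments equivalent of the hyperparameters above.
from transformers import Seq2SeqTrainingArguments

args = Seq2SeqTrainingArguments(
    output_dir="umt5-base-opus-books-en-it",  # hypothetical path
    learning_rate=5e-5,
    per_device_train_batch_size=8,   # x 4 GPUs = total train batch size 32
    per_device_eval_batch_size=8,    # x 4 GPUs = total eval batch size 32
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="constant",
    num_train_epochs=50,
    predict_with_generate=True,      # needed so evaluation can compute BLEU
)
```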

Training results

The Data Size column gives the fraction of the training set used in that epoch: the schedule ramps up, doubling each epoch until the full dataset is reached at epoch 8. Training ended after epoch 35 of the configured 50 (early stopping is the likely cause, though not documented).

| Training Loss | Epoch | Step  | Validation Loss | Data Size | Epoch Runtime | Bleu   |
|:-------------:|:-----:|:-----:|:---------------:|:---------:|:-------------:|:------:|
| No log        | 0     | 0     | 11.4554         | 0         | 15.3875       | 0.2515 |
| No log        | 1     | 808   | 10.8399         | 0.0078    | 16.9857       | 0.2769 |
| No log        | 2     | 1616  | 11.1653         | 0.0156    | 18.6915       | 0.2704 |
| No log        | 3     | 2424  | 10.8503         | 0.0312    | 22.5773       | 0.2913 |
| 0.5264        | 4     | 3232  | 9.5802          | 0.0625    | 27.5701       | 0.3043 |
| 11.1193       | 5     | 4040  | 7.0025          | 0.125     | 38.5328       | 0.3351 |
| 4.8783        | 6     | 4848  | 3.2130          | 0.25      | 60.1048       | 2.3122 |
| 3.5996        | 7     | 5656  | 2.7725          | 0.5       | 102.1661      | 3.7294 |
| 3.2702        | 8.0   | 6464  | 2.5566          | 1.0       | 186.4477      | 4.5388 |
| 3.028         | 9.0   | 7272  | 2.4598          | 1.0       | 186.6356      | 4.9488 |
| 2.8677        | 10.0  | 8080  | 2.3925          | 1.0       | 187.9979      | 5.3090 |
| 2.7841        | 11.0  | 8888  | 2.3431          | 1.0       | 188.1784      | 5.6945 |
| 2.6599        | 12.0  | 9696  | 2.3045          | 1.0       | 186.9817      | 5.9210 |
| 2.6231        | 13.0  | 10504 | 2.2772          | 1.0       | 185.4800      | 6.0661 |
| 2.5456        | 14.0  | 11312 | 2.2522          | 1.0       | 186.3448      | 6.1830 |
| 2.4849        | 15.0  | 12120 | 2.2261          | 1.0       | 187.5464      | 6.3408 |
| 2.4557        | 16.0  | 12928 | 2.2111          | 1.0       | 187.1685      | 6.4413 |
| 2.3773        | 17.0  | 13736 | 2.1877          | 1.0       | 187.9075      | 6.5337 |
| 2.3059        | 18.0  | 14544 | 2.1803          | 1.0       | 187.9377      | 6.6183 |
| 2.2838        | 19.0  | 15352 | 2.1739          | 1.0       | 187.2289      | 6.6588 |
| 2.237         | 20.0  | 16160 | 2.1608          | 1.0       | 189.7539      | 6.8167 |
| 2.1972        | 21.0  | 16968 | 2.1526          | 1.0       | 188.6121      | 6.8071 |
| 2.1411        | 22.0  | 17776 | 2.1517          | 1.0       | 187.3689      | 6.8581 |
| 2.1112        | 23.0  | 18584 | 2.1353          | 1.0       | 185.9960      | 6.9965 |
| 2.0672        | 24.0  | 19392 | 2.1301          | 1.0       | 188.7333      | 7.0392 |
| 2.0241        | 25.0  | 20200 | 2.1263          | 1.0       | 188.2022      | 7.0634 |
| 2.0117        | 26.0  | 21008 | 2.1245          | 1.0       | 188.2026      | 7.1456 |
| 1.9704        | 27.0  | 21816 | 2.1271          | 1.0       | 186.2390      | 7.1761 |
| 1.9364        | 28.0  | 22624 | 2.1231          | 1.0       | 186.3455      | 7.2058 |
| 1.9214        | 29.0  | 23432 | 2.1294          | 1.0       | 186.7046      | 7.2392 |
| 1.8309        | 30.0  | 24240 | 2.1273          | 1.0       | 189.2313      | 7.2532 |
| 1.8582        | 31.0  | 25048 | 2.1222          | 1.0       | 187.5225      | 7.2641 |
| 1.831         | 32.0  | 25856 | 2.1250          | 1.0       | 187.5108      | 7.3159 |
| 1.8037        | 33.0  | 26664 | 2.1233          | 1.0       | 186.4405      | 7.3336 |
| 1.754         | 34.0  | 27472 | 2.1257          | 1.0       | 186.3864      | 7.3487 |
| 1.7069        | 35.0  | 28280 | 2.1349          | 1.0       | 187.4076      | 7.3674 |
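
The Bleu column is consistent with sacrebleu-style corpus BLEU on the evaluation set; a minimal sketch with the evaluate library follows, though the exact metric configuration used for this card is not documented.

```python
# Minimal BLEU sketch via the evaluate library; the metric setup actually
# used for this model card is an assumption.
import evaluate

bleu = evaluate.load("sacrebleu")
predictions = ["Il gatto dormiva sul divano."]          # model outputs
references = [["Il gatto stava dormendo sul divano."]]  # reference translations
print(bleu.compute(predictions=predictions, references=references)["score"])
```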

Framework versions

  • Transformers 4.57.0
  • Pytorch 2.8.0+cu128
  • Datasets 4.2.0
  • Tokenizers 0.22.1