a4d3a0b9f9652792eaa12a64fc7a0a0f

This model is a fine-tuned version of google/umt5-base on the Helsinki-NLP/opus_books [es-fr] dataset. It achieves the following results on the evaluation set:

  • Loss: 1.4688
  • Data Size: 1.0
  • Epoch Runtime: 316.8055
  • Bleu: 14.5135
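The card does not document usage, so here is a minimal inference sketch. It assumes the standard transformers seq2seq API and that plain Spanish source text (no task prefix) was used during fine-tuning; neither is confirmed by this card.

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Repository id taken from this model card.
model_id = "contemmcm/a4d3a0b9f9652792eaa12a64fc7a0a0f"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# Spanish -> French; no task prefix is assumed (unverified).
inputs = tokenizer("La casa estaba en silencio.", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64, num_beams=4)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```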

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a hypothetical Seq2SeqTrainingArguments reconstruction follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • total_train_batch_size: 32
  • total_eval_batch_size: 32
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: constant
  • num_epochs: 50
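
The sketch below reconstructs this configuration as Seq2SeqTrainingArguments. The actual training script is not published, so treat it as illustrative; names such as output_dir are hypothetical.

```python
from transformers import Seq2SeqTrainingArguments

# Sketch only: reconstructs the hyperparameters listed above.
training_args = Seq2SeqTrainingArguments(
    output_dir="umt5-base-opus-books-es-fr",  # hypothetical name
    learning_rate=5e-05,
    per_device_train_batch_size=8,   # x4 GPUs -> total train batch size 32
    per_device_eval_batch_size=8,    # x4 GPUs -> total eval batch size 32
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="constant",
    num_train_epochs=50,
    predict_with_generate=True,      # needed to compute BLEU at eval time
)
```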

Training results

| Training Loss | Epoch | Step  | Validation Loss | Data Size | Epoch Runtime | Bleu    |
|:-------------:|:-----:|:-----:|:---------------:|:---------:|:-------------:|:-------:|
| No log        | 0     | 0     | 11.0722         | 0         | 26.0473       | 0.1039  |
| No log        | 1     | 1407  | 11.3248         | 0.0078    | 28.6270       | 0.1691  |
| No log        | 2     | 2814  | 10.6613         | 0.0156    | 32.0288       | 0.0854  |
| 0.4011        | 3     | 4221  | 8.3129          | 0.0312    | 37.6507       | 0.2103  |
| 8.6224        | 4     | 5628  | 4.8219          | 0.0625    | 45.5095       | 1.9646  |
| 4.0351        | 5     | 7035  | 2.4925          | 0.125     | 64.6807       | 9.7578  |
| 2.9018        | 6     | 8442  | 2.0996          | 0.25      | 102.0038      | 8.0975  |
| 2.5594        | 7     | 9849  | 1.9141          | 0.5       | 174.3677      | 9.6353  |
| 2.2289        | 8     | 11256 | 1.7767          | 1.0       | 318.3380      | 10.8255 |
| 2.0555        | 9     | 12663 | 1.6844          | 1.0       | 317.6098      | 11.5320 |
| 2.0165        | 10    | 14070 | 1.6339          | 1.0       | 318.5579      | 12.0837 |
| 1.8789        | 11    | 15477 | 1.6035          | 1.0       | 315.8370      | 12.4816 |
| 1.7774        | 12    | 16884 | 1.5668          | 1.0       | 317.9719      | 12.7907 |
| 1.7569        | 13    | 18291 | 1.5487          | 1.0       | 315.5376      | 13.0673 |
| 1.7189        | 14    | 19698 | 1.5414          | 1.0       | 319.9117      | 13.2733 |
| 1.6574        | 15    | 21105 | 1.5134          | 1.0       | 317.2714      | 13.4508 |
| 1.6254        | 16    | 22512 | 1.5017          | 1.0       | 319.6040      | 13.6132 |
| 1.5744        | 17    | 23919 | 1.4962          | 1.0       | 319.9870      | 13.7815 |
| 1.5375        | 18    | 25326 | 1.4928          | 1.0       | 322.8918      | 13.8688 |
| 1.5039        | 19    | 26733 | 1.4923          | 1.0       | 320.2338      | 13.9764 |
| 1.4757        | 20    | 28140 | 1.4775          | 1.0       | 316.7046      | 14.1037 |
| 1.4347        | 21    | 29547 | 1.4763          | 1.0       | 318.7257      | 14.1721 |
| 1.4164        | 22    | 30954 | 1.4582          | 1.0       | 317.5881      | 14.2710 |
| 1.3940        | 23    | 32361 | 1.4609          | 1.0       | 317.6584      | 14.3750 |
| 1.3490        | 24    | 33768 | 1.4667          | 1.0       | 318.4692      | 14.3713 |
| 1.3484        | 25    | 35175 | 1.4703          | 1.0       | 317.2499      | 14.5182 |
| 1.3162        | 26    | 36582 | 1.4688          | 1.0       | 316.8055      | 14.5135 |

Framework versions

  • Transformers 4.57.0
  • Pytorch 2.8.0+cu128
  • Datasets 4.2.0
  • Tokenizers 0.22.1