3d60c57c3f3c5f84a08a67aca192dceb

This model is a fine-tuned version of google/umt5-base on the Helsinki-NLP/opus_books [fi-fr] dataset. It achieves the following results on the evaluation set:

  • Loss: 2.4253
  • Data Size: 1.0
  • Epoch Runtime: 23.7459
  • Bleu: 4.5379
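
For reference, inference with this checkpoint should work through the standard transformers seq2seq API. The sketch below is a minimal example, not the documented usage: the repository id is taken from this page, and it assumes no task prefix is required (the card does not document the preprocessing, so whether a prefix such as "translate Finnish to French: " was used during fine-tuning is an open question).

```python
# Minimal usage sketch. Assumes the standard transformers seq2seq API and
# that no task prefix was used during fine-tuning (not documented here).
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "contemmcm/3d60c57c3f3c5f84a08a67aca192dceb"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

text = "Hyvää huomenta!"  # Finnish: "Good morning!"
inputs = tokenizer(text, return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```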

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed
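
The dataset itself is named in the summary above. A minimal sketch of loading it with the datasets library follows; the exact split and preprocessing used for this model are not documented, so this shows only the raw data.

```python
# Sketch of loading the dataset named in the summary; the split and
# preprocessing actually used for training are not documented in the card.
from datasets import load_dataset

books = load_dataset("Helsinki-NLP/opus_books", "fi-fr")
print(books["train"][0]["translation"])  # e.g. {'fi': '...', 'fr': '...'}
```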

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • total_train_batch_size: 32
  • total_eval_batch_size: 32
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: constant
  • num_epochs: 50
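
For orientation, these settings map onto transformers' Seq2SeqTrainingArguments roughly as sketched below. The output_dir is a placeholder, and the total batch sizes of 32 follow from 8 per device across 4 GPUs; this is a reconstruction, not the training script actually used.

```python
# Hedged sketch mapping the listed hyperparameters onto
# transformers.Seq2SeqTrainingArguments; output_dir is a placeholder.
from transformers import Seq2SeqTrainingArguments

args = Seq2SeqTrainingArguments(
    output_dir="umt5-base-opus-books-fi-fr",  # hypothetical name
    learning_rate=5e-05,
    per_device_train_batch_size=8,  # 4 GPUs x 8 = total train batch size 32
    per_device_eval_batch_size=8,   # 4 GPUs x 8 = total eval batch size 32
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="constant",
    num_train_epochs=50,
)
```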

Training results

| Training Loss | Epoch | Step | Validation Loss | Data Size | Epoch Runtime | Bleu |
|:-------------:|:-----:|:----:|:---------------:|:---------:|:-------------:|:----:|
| No log | 0 | 0 | 12.4930 | 0 | 2.4647 | 0.0252 |
| No log | 1 | 88 | 12.3720 | 0.0078 | 2.7830 | 0.0249 |
| No log | 2 | 176 | 12.3626 | 0.0156 | 4.1122 | 0.0317 |
| No log | 3 | 264 | 12.1881 | 0.0312 | 5.8347 | 0.0239 |
| No log | 4 | 352 | 11.3909 | 0.0625 | 7.4028 | 0.0239 |
| No log | 5 | 440 | 9.8327 | 0.125 | 9.6713 | 0.0376 |
| 1.1949 | 6 | 528 | 8.3388 | 0.25 | 12.5681 | 0.0799 |
| 4.2766 | 7 | 616 | 7.0456 | 0.5 | 15.9200 | 0.1479 |
| 5.9558 | 8.0 | 704 | 3.9423 | 1.0 | 24.7735 | 1.9463 |
| 4.9277 | 9.0 | 792 | 3.0999 | 1.0 | 22.7905 | 4.5165 |
| 4.1288 | 10.0 | 880 | 2.7956 | 1.0 | 22.5833 | 2.5227 |
| 3.7293 | 11.0 | 968 | 2.7002 | 1.0 | 22.8304 | 2.8830 |
| 3.4868 | 12.0 | 1056 | 2.6330 | 1.0 | 22.6622 | 3.2056 |
| 3.3872 | 13.0 | 1144 | 2.5942 | 1.0 | 24.2142 | 3.3695 |
| 3.2691 | 14.0 | 1232 | 2.5500 | 1.0 | 24.7831 | 3.4857 |
| 3.1419 | 15.0 | 1320 | 2.5322 | 1.0 | 22.9104 | 3.4446 |
| 3.0114 | 16.0 | 1408 | 2.5091 | 1.0 | 23.4552 | 3.8765 |
| 2.9133 | 17.0 | 1496 | 2.4738 | 1.0 | 25.0048 | 3.9387 |
| 2.8438 | 18.0 | 1584 | 2.4587 | 1.0 | 24.7820 | 3.9834 |
| 2.7913 | 19.0 | 1672 | 2.4573 | 1.0 | 22.6510 | 4.0498 |
| 2.697 | 20.0 | 1760 | 2.4375 | 1.0 | 22.6254 | 4.1260 |
| 2.6554 | 21.0 | 1848 | 2.4304 | 1.0 | 22.7013 | 4.1289 |
| 2.6171 | 22.0 | 1936 | 2.4352 | 1.0 | 24.4042 | 4.2317 |
| 2.5397 | 23.0 | 2024 | 2.4222 | 1.0 | 23.3454 | 4.2221 |
| 2.4962 | 24.0 | 2112 | 2.4267 | 1.0 | 23.9381 | 4.2985 |
| 2.4463 | 25.0 | 2200 | 2.4151 | 1.0 | 23.3682 | 4.4046 |
| 2.4017 | 26.0 | 2288 | 2.4168 | 1.0 | 23.7511 | 4.4193 |
| 2.3341 | 27.0 | 2376 | 2.4161 | 1.0 | 26.1075 | 4.4194 |
| 2.3059 | 28.0 | 2464 | 2.4077 | 1.0 | 22.7957 | 4.4997 |
| 2.2576 | 29.0 | 2552 | 2.4237 | 1.0 | 22.7708 | 4.5188 |
| 2.2294 | 30.0 | 2640 | 2.4263 | 1.0 | 22.9204 | 4.4313 |
| 2.1692 | 31.0 | 2728 | 2.4236 | 1.0 | 24.2041 | 4.4810 |
| 2.1514 | 32.0 | 2816 | 2.4253 | 1.0 | 23.7459 | 4.5379 |
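
The Bleu column is presumably a corpus-level BLEU score. A minimal sketch of computing a comparable score with the evaluate library follows; the exact metric configuration used for this card is not documented, and the sentences below are made up for illustration.

```python
# Illustrative BLEU computation with the `evaluate` library; the exact
# metric configuration used for this card is not documented.
import evaluate

metric = evaluate.load("sacrebleu")
predictions = ["Bonjour tout le monde."]     # model outputs (illustrative)
references = [["Bonjour à tout le monde."]]  # gold translations (illustrative)
print(metric.compute(predictions=predictions, references=references)["score"])
```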

Framework versions

  • Transformers 4.57.0
  • Pytorch 2.8.0+cu128
  • Datasets 4.2.0
  • Tokenizers 0.22.1