94b30d9cc2315e0cd9ff484731b38f52

This model is a fine-tuned version of google/mt5-xl on the Helsinki-NLP/opus_books [en-sv] dataset. It achieves the following results on the evaluation set:

  • Loss: 1.4633
  • Data Size: 1.0 (fraction of the training data used)
  • Epoch Runtime: 50.3594 seconds
  • Bleu: 12.4260
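The Bleu value above is on the usual 0–100 scale. The actual metric was presumably computed by an evaluation library (e.g. sacrebleu, whose tokenization rules differ), but as a reference point, an unsmoothed corpus BLEU can be sketched in pure Python with whitespace tokenization:

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Multiset of n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(hypotheses, references, max_n=4):
    """Unsmoothed corpus BLEU (0-100), one reference per hypothesis."""
    matches = [0] * max_n   # clipped n-gram matches, per order
    totals = [0] * max_n    # hypothesis n-gram counts, per order
    hyp_len = ref_len = 0
    for hyp, ref in zip(hypotheses, references):
        h, r = hyp.split(), ref.split()
        hyp_len += len(h)
        ref_len += len(r)
        for n in range(1, max_n + 1):
            r_ngrams = ngrams(r, n)
            # Clip each hypothesis n-gram count by its count in the reference.
            matches[n - 1] += sum(min(c, r_ngrams[g])
                                  for g, c in ngrams(h, n).items())
            totals[n - 1] += max(len(h) - n + 1, 0)
    if min(matches) == 0:
        return 0.0  # without smoothing, any zero precision gives BLEU = 0
    log_prec = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_n
    # Brevity penalty: punish hypotheses shorter than the references.
    bp = math.exp(1 - ref_len / hyp_len) if hyp_len < ref_len else 1.0
    return 100.0 * bp * math.exp(log_prec)
```

An identical hypothesis and reference scores 100; scores in the 10–15 range, as in this card, indicate partial n-gram overlap, which is typical for literary translation data such as opus_books.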

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • total_train_batch_size: 32
  • total_eval_batch_size: 32
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: constant
  • num_epochs: 50
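The aggregate batch sizes above follow directly from the per-device settings under 4-GPU data parallelism; since no gradient accumulation is listed, the multiplier is the device count alone. A minimal sketch of that arithmetic:

```python
# Per-device settings from the hyperparameter list above.
train_batch_size = 8   # per device
eval_batch_size = 8    # per device
num_devices = 4        # multi-GPU data parallelism

# Effective (aggregate) batch sizes across all devices.
total_train_batch_size = train_batch_size * num_devices  # 32
total_eval_batch_size = eval_batch_size * num_devices    # 32
```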

Training results

| Training Loss | Epoch | Step | Validation Loss | Data Size | Epoch Runtime | Bleu    |
|:-------------:|:-----:|:----:|:---------------:|:---------:|:-------------:|:-------:|
| No log        | 0     | 0    | 7.8812          | 0         | 3.1567        | 0.0112  |
| No log        | 1     | 77   | 5.3106          | 0.0078    | 3.5268        | 0.0472  |
| No log        | 2     | 154  | 3.6828          | 0.0156    | 11.8525       | 0.3424  |
| No log        | 3     | 231  | 3.0095          | 0.0312    | 17.9511       | 0.7707  |
| No log        | 4     | 308  | 2.7073          | 0.0625    | 26.0137       | 1.0505  |
| No log        | 5     | 385  | 2.2687          | 0.125     | 25.5012       | 1.3031  |
| 0.336         | 6     | 462  | 1.7849          | 0.25      | 29.1275       | 13.0199 |
| 0.9198        | 7     | 539  | 1.5161          | 0.5       | 43.3328       | 9.2252  |
| 1.7349        | 8     | 616  | 1.3972          | 1.0       | 65.3745       | 11.4792 |
| 1.4925        | 9     | 693  | 1.3574          | 1.0       | 51.1325       | 11.9868 |
| 1.2101        | 10    | 770  | 1.3493          | 1.0       | 53.2914       | 12.1847 |
| 1.1172        | 11    | 847  | 1.3935          | 1.0       | 48.7534       | 12.2443 |
| 0.9201        | 12    | 924  | 1.3828          | 1.0       | 48.4278       | 12.5264 |
| 0.8138        | 13    | 1001 | 1.4273          | 1.0       | 51.6848       | 12.3103 |
| 0.7079        | 14    | 1078 | 1.4633          | 1.0       | 50.3594       | 12.4260 |
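Note that the headline numbers (loss 1.4633, Bleu 12.4260) come from the final epoch, not the best one: validation loss bottoms out at epoch 10 and Bleu peaks at epoch 12, after which validation loss creeps back up. A small sketch of selecting the best checkpoint from the full-data rows of the table above:

```python
# (epoch, validation_loss, bleu) for epochs 8-14, where Data Size = 1.0,
# copied from the training-results table.
history = [
    (8, 1.3972, 11.4792),
    (9, 1.3574, 11.9868),
    (10, 1.3493, 12.1847),
    (11, 1.3935, 12.2443),
    (12, 1.3828, 12.5264),
    (13, 1.4273, 12.3103),
    (14, 1.4633, 12.4260),
]

best_by_loss = min(history, key=lambda row: row[1])  # lowest validation loss
best_by_bleu = max(history, key=lambda row: row[2])  # highest BLEU
```

Which criterion to prefer depends on the use case; for a translation model, selecting on BLEU (epoch 12 here) is the more common choice.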

Framework versions

  • Transformers 4.57.0
  • Pytorch 2.8.0+cu128
  • Datasets 4.2.0
  • Tokenizers 0.22.1
