# 3d60c57c3f3c5f84a08a67aca192dceb
This model is a fine-tuned version of google/umt5-base on the Helsinki-NLP/opus_books [fi-fr] dataset. It achieves the following results on the evaluation set:
- Loss: 2.4253
- Data Size: 1.0 (fraction of the training set used)
- Epoch Runtime: 23.7459 s
- BLEU: 4.5379
## Model description
More information needed
## Intended uses & limitations
More information needed
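While the card leaves intended uses unspecified, the dataset pair implies Finnish-to-French translation. Below is a minimal inference sketch, assuming the checkpoint is hosted under the repo id shown on the model page (contemmcm/3d60c57c3f3c5f84a08a67aca192dceb) and that no task prefix is required; both points should be verified against the actual training preprocessing.

```python
# Minimal inference sketch (not from the card). Assumes the repo id below
# and a Finnish -> French direction, per the opus_books [fi-fr] pair.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

repo_id = "contemmcm/3d60c57c3f3c5f84a08a67aca192dceb"
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForSeq2SeqLM.from_pretrained(repo_id)

inputs = tokenizer("Hyvää huomenta!", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64, num_beams=4)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Given the final BLEU of roughly 4.5, translation quality will be limited; treat this checkpoint as a training experiment rather than a production translator.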
## Training and evaluation data
More information needed
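For reference, the dataset named above can be loaded as follows. The card does not state how the evaluation split was carved out, so the 90/10 split below is an assumption.

```python
# Loading the dataset named in this card. opus_books ships only a "train"
# split, so the eval split here is an assumed 90/10 carve-out, not the
# card's actual setup.
from datasets import load_dataset

raw = load_dataset("Helsinki-NLP/opus_books", "fi-fr")
splits = raw["train"].train_test_split(test_size=0.1, seed=42)
print(splits["train"][0]["translation"])  # {'fi': '...', 'fr': '...'}
```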
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training (a configuration sketch follows the list):
- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- total_train_batch_size: 32
- total_eval_batch_size: 32
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: constant
- num_epochs: 50
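As referenced above, the list maps onto a `Seq2SeqTrainingArguments` configuration roughly as sketched below. `output_dir` and `predict_with_generate` are assumptions, and the 4-GPU setup would come from the launcher (e.g. `torchrun --nproc_per_node=4`), which multiplies the per-device batch size of 8 to the listed total of 32.

```python
# Hedged reconstruction of the hyperparameters above; output_dir and
# predict_with_generate are assumptions not stated in the card.
from transformers import Seq2SeqTrainingArguments

args = Seq2SeqTrainingArguments(
    output_dir="umt5-base-opus-books-fi-fr",  # hypothetical name
    learning_rate=5e-5,
    per_device_train_batch_size=8,  # 4 GPUs -> total train batch size 32
    per_device_eval_batch_size=8,   # 4 GPUs -> total eval batch size 32
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="constant",
    num_train_epochs=50,
    predict_with_generate=True,  # assumed, for BLEU during evaluation
)
```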
### Training results
| Training Loss | Epoch | Step | Validation Loss | Data Size | Epoch Runtime (s) | BLEU |
|---|---|---|---|---|---|---|
| No log | 0 | 0 | 12.4930 | 0 | 2.4647 | 0.0252 |
| No log | 1 | 88 | 12.3720 | 0.0078 | 2.7830 | 0.0249 |
| No log | 2 | 176 | 12.3626 | 0.0156 | 4.1122 | 0.0317 |
| No log | 3 | 264 | 12.1881 | 0.0312 | 5.8347 | 0.0239 |
| No log | 4 | 352 | 11.3909 | 0.0625 | 7.4028 | 0.0239 |
| No log | 5 | 440 | 9.8327 | 0.125 | 9.6713 | 0.0376 |
| 1.1949 | 6 | 528 | 8.3388 | 0.25 | 12.5681 | 0.0799 |
| 4.2766 | 7 | 616 | 7.0456 | 0.5 | 15.9200 | 0.1479 |
| 5.9558 | 8 | 704 | 3.9423 | 1.0 | 24.7735 | 1.9463 |
| 4.9277 | 9 | 792 | 3.0999 | 1.0 | 22.7905 | 4.5165 |
| 4.1288 | 10 | 880 | 2.7956 | 1.0 | 22.5833 | 2.5227 |
| 3.7293 | 11 | 968 | 2.7002 | 1.0 | 22.8304 | 2.8830 |
| 3.4868 | 12 | 1056 | 2.6330 | 1.0 | 22.6622 | 3.2056 |
| 3.3872 | 13 | 1144 | 2.5942 | 1.0 | 24.2142 | 3.3695 |
| 3.2691 | 14 | 1232 | 2.5500 | 1.0 | 24.7831 | 3.4857 |
| 3.1419 | 15 | 1320 | 2.5322 | 1.0 | 22.9104 | 3.4446 |
| 3.0114 | 16 | 1408 | 2.5091 | 1.0 | 23.4552 | 3.8765 |
| 2.9133 | 17 | 1496 | 2.4738 | 1.0 | 25.0048 | 3.9387 |
| 2.8438 | 18 | 1584 | 2.4587 | 1.0 | 24.7820 | 3.9834 |
| 2.7913 | 19 | 1672 | 2.4573 | 1.0 | 22.6510 | 4.0498 |
| 2.6970 | 20 | 1760 | 2.4375 | 1.0 | 22.6254 | 4.1260 |
| 2.6554 | 21 | 1848 | 2.4304 | 1.0 | 22.7013 | 4.1289 |
| 2.6171 | 22 | 1936 | 2.4352 | 1.0 | 24.4042 | 4.2317 |
| 2.5397 | 23 | 2024 | 2.4222 | 1.0 | 23.3454 | 4.2221 |
| 2.4962 | 24 | 2112 | 2.4267 | 1.0 | 23.9381 | 4.2985 |
| 2.4463 | 25 | 2200 | 2.4151 | 1.0 | 23.3682 | 4.4046 |
| 2.4017 | 26 | 2288 | 2.4168 | 1.0 | 23.7511 | 4.4193 |
| 2.3341 | 27 | 2376 | 2.4161 | 1.0 | 26.1075 | 4.4194 |
| 2.3059 | 28 | 2464 | 2.4077 | 1.0 | 22.7957 | 4.4997 |
| 2.2576 | 29 | 2552 | 2.4237 | 1.0 | 22.7708 | 4.5188 |
| 2.2294 | 30 | 2640 | 2.4263 | 1.0 | 22.9204 | 4.4313 |
| 2.1692 | 31 | 2728 | 2.4236 | 1.0 | 24.2041 | 4.4810 |
| 2.1514 | 32 | 2816 | 2.4253 | 1.0 | 23.7459 | 4.5379 |
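The Data Size column doubles each epoch through epoch 8, which suggests a progressive data-scaling schedule (from 1/128 of the training set up to the full set). The card does not state how BLEU was computed; a common setup (an assumption here) is sacrebleu via the evaluate library:

```python
# BLEU computation sketch consistent with the scores above; the use of
# evaluate's sacrebleu wrapper is an assumption, not confirmed by the card.
import evaluate

bleu = evaluate.load("sacrebleu")
predictions = ["Bonjour le monde"]        # decoded model outputs
references = [["Bonjour tout le monde"]]  # one list of references per input
print(bleu.compute(predictions=predictions, references=references)["score"])
```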
### Framework versions
- Transformers 4.57.0
- PyTorch 2.8.0+cu128
- Datasets 4.2.0
- Tokenizers 0.22.1