7372f17ce60385262d23dba2b7623114

This model is a fine-tuned version of google/umt5-base on the Helsinki-NLP/opus_books [en-it] dataset. It achieves the following results on the evaluation set:

  • Loss: 2.1349
  • Data Size: 1.0
  • Epoch Runtime: 187.4076
  • Bleu: 7.3674

Model description

More information needed

Intended uses & limitations

More information needed
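
Pending fuller documentation, the sketch below shows one way to run en→it inference with this checkpoint. The absence of a task prefix is an assumption, since the preprocessing used during fine-tuning is not documented.

```python
# Minimal en->it inference sketch; the lack of a task prefix is an assumption,
# as the fine-tuning preprocessing is not documented on this card.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "contemmcm/7372f17ce60385262d23dba2b7623114"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

inputs = tokenizer("All happy families are alike.", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64, num_beams=4)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```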

Training and evaluation data

More information needed
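
For reference, the en-it configuration of the dataset named above can be loaded as follows. opus_books ships only a train split, so the train/evaluation partition used for this model is unknown; the 90/10 split below is purely illustrative.

```python
# Sketch of loading Helsinki-NLP/opus_books (en-it); the actual train/eval
# split used for this model is undocumented, so the 90/10 split is an example.
from datasets import load_dataset

dataset = load_dataset("Helsinki-NLP/opus_books", "en-it")
splits = dataset["train"].train_test_split(test_size=0.1, seed=42)
print(splits["train"][0]["translation"])  # {'en': '...', 'it': '...'}
```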

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • total_train_batch_size: 32
  • total_eval_batch_size: 32
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: constant
  • num_epochs: 50
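
These settings map onto Seq2SeqTrainingArguments roughly as sketched below; the output directory is hypothetical, and the 4-GPU setup is assumed to come from the launcher (e.g. torchrun) rather than from the arguments themselves.

```python
# Rough Seq2SeqTrainingArguments equivalent of the hyperparameters above.
from transformers import Seq2SeqTrainingArguments

args = Seq2SeqTrainingArguments(
    output_dir="umt5-base-opus-books-en-it",  # hypothetical path
    learning_rate=5e-5,
    per_device_train_batch_size=8,   # x 4 GPUs = total train batch size 32
    per_device_eval_batch_size=8,    # x 4 GPUs = total eval batch size 32
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="constant",
    num_train_epochs=50,
    predict_with_generate=True,      # needed so evaluation can compute BLEU
)
```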

Training results

The Data Size column gives the fraction of the training set used in that epoch: the schedule ramps up, doubling each epoch until the full dataset is reached at epoch 8. Training ended after epoch 35 of the configured 50 (early stopping is the likely cause, though not documented).

| Training Loss | Epoch | Step  | Validation Loss | Data Size | Epoch Runtime | Bleu   |
|:-------------:|:-----:|:-----:|:---------------:|:---------:|:-------------:|:------:|
| No log        | 0     | 0     | 11.4554         | 0         | 15.3875       | 0.2515 |
| No log        | 1     | 808   | 10.8399         | 0.0078    | 16.9857       | 0.2769 |
| No log        | 2     | 1616  | 11.1653         | 0.0156    | 18.6915       | 0.2704 |
| No log        | 3     | 2424  | 10.8503         | 0.0312    | 22.5773       | 0.2913 |
| 0.5264        | 4     | 3232  | 9.5802          | 0.0625    | 27.5701       | 0.3043 |
| 11.1193       | 5     | 4040  | 7.0025          | 0.125     | 38.5328       | 0.3351 |
| 4.8783        | 6     | 4848  | 3.2130          | 0.25      | 60.1048       | 2.3122 |
| 3.5996        | 7     | 5656  | 2.7725          | 0.5       | 102.1661      | 3.7294 |
| 3.2702        | 8.0   | 6464  | 2.5566          | 1.0       | 186.4477      | 4.5388 |
| 3.028         | 9.0   | 7272  | 2.4598          | 1.0       | 186.6356      | 4.9488 |
| 2.8677        | 10.0  | 8080  | 2.3925          | 1.0       | 187.9979      | 5.3090 |
| 2.7841        | 11.0  | 8888  | 2.3431          | 1.0       | 188.1784      | 5.6945 |
| 2.6599        | 12.0  | 9696  | 2.3045          | 1.0       | 186.9817      | 5.9210 |
| 2.6231        | 13.0  | 10504 | 2.2772          | 1.0       | 185.4800      | 6.0661 |
| 2.5456        | 14.0  | 11312 | 2.2522          | 1.0       | 186.3448      | 6.1830 |
| 2.4849        | 15.0  | 12120 | 2.2261          | 1.0       | 187.5464      | 6.3408 |
| 2.4557        | 16.0  | 12928 | 2.2111          | 1.0       | 187.1685      | 6.4413 |
| 2.3773        | 17.0  | 13736 | 2.1877          | 1.0       | 187.9075      | 6.5337 |
| 2.3059        | 18.0  | 14544 | 2.1803          | 1.0       | 187.9377      | 6.6183 |
| 2.2838        | 19.0  | 15352 | 2.1739          | 1.0       | 187.2289      | 6.6588 |
| 2.237         | 20.0  | 16160 | 2.1608          | 1.0       | 189.7539      | 6.8167 |
| 2.1972        | 21.0  | 16968 | 2.1526          | 1.0       | 188.6121      | 6.8071 |
| 2.1411        | 22.0  | 17776 | 2.1517          | 1.0       | 187.3689      | 6.8581 |
| 2.1112        | 23.0  | 18584 | 2.1353          | 1.0       | 185.9960      | 6.9965 |
| 2.0672        | 24.0  | 19392 | 2.1301          | 1.0       | 188.7333      | 7.0392 |
| 2.0241        | 25.0  | 20200 | 2.1263          | 1.0       | 188.2022      | 7.0634 |
| 2.0117        | 26.0  | 21008 | 2.1245          | 1.0       | 188.2026      | 7.1456 |
| 1.9704        | 27.0  | 21816 | 2.1271          | 1.0       | 186.2390      | 7.1761 |
| 1.9364        | 28.0  | 22624 | 2.1231          | 1.0       | 186.3455      | 7.2058 |
| 1.9214        | 29.0  | 23432 | 2.1294          | 1.0       | 186.7046      | 7.2392 |
| 1.8309        | 30.0  | 24240 | 2.1273          | 1.0       | 189.2313      | 7.2532 |
| 1.8582        | 31.0  | 25048 | 2.1222          | 1.0       | 187.5225      | 7.2641 |
| 1.831         | 32.0  | 25856 | 2.1250          | 1.0       | 187.5108      | 7.3159 |
| 1.8037        | 33.0  | 26664 | 2.1233          | 1.0       | 186.4405      | 7.3336 |
| 1.754         | 34.0  | 27472 | 2.1257          | 1.0       | 186.3864      | 7.3487 |
| 1.7069        | 35.0  | 28280 | 2.1349          | 1.0       | 187.4076      | 7.3674 |
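
The Bleu column is consistent with sacrebleu-style corpus BLEU on the evaluation set; a minimal sketch with the evaluate library follows, though the exact metric configuration used for this card is not documented.

```python
# Minimal BLEU sketch via the evaluate library; the metric setup actually
# used for this model card is an assumption.
import evaluate

bleu = evaluate.load("sacrebleu")
predictions = ["Il gatto dormiva sul divano."]          # model outputs
references = [["Il gatto stava dormendo sul divano."]]  # reference translations
print(bleu.compute(predictions=predictions, references=references)["score"])
```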

Framework versions

  • Transformers 4.57.0
  • Pytorch 2.8.0+cu128
  • Datasets 4.2.0
  • Tokenizers 0.22.1