f5fcb0deefda9b42c5f01b73d4074f5a

This model is a fine-tuned version of google/umt5-base on the Helsinki-NLP/opus_books [en-ru] dataset. It achieves the following results on the evaluation set:

  • Loss: 1.7601
  • Data Size: 1.0
  • Epoch Runtime: 103.0441
  • Bleu: 9.7353

Model description

More information needed

Intended uses & limitations

More information needed
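
Since the card leaves usage undocumented, here is a minimal inference sketch. It assumes the checkpoint is loadable through the standard Transformers seq2seq API under the repo id contemmcm/f5fcb0deefda9b42c5f01b73d4074f5a; whether fine-tuning used a task prefix on the source text is not documented.

```python
# Minimal inference sketch (assumptions: standard Transformers seq2seq API,
# no task prefix on the source text; the card documents neither).
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "contemmcm/f5fcb0deefda9b42c5f01b73d4074f5a"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

text = "The book was lying on the table."  # English source sentence
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64, num_beams=4)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))  # Russian output
```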

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a hedged mapping onto Transformers training arguments follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • total_train_batch_size: 32
  • total_eval_batch_size: 32
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: constant
  • num_epochs: 50
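
As a rough reconstruction (not the author's actual training script), the values above map onto `Seq2SeqTrainingArguments` as sketched below; the output directory name and the `predict_with_generate` flag are assumptions.

```python
from transformers import Seq2SeqTrainingArguments

# Hedged mapping of the listed hyperparameters onto Seq2SeqTrainingArguments.
# output_dir is hypothetical; predict_with_generate is assumed because BLEU
# is reported on the evaluation set.
args = Seq2SeqTrainingArguments(
    output_dir="umt5-base-opus-books-en-ru",  # hypothetical name
    learning_rate=5e-5,
    per_device_train_batch_size=8,   # 4 GPUs -> total train batch size 32
    per_device_eval_batch_size=8,    # 4 GPUs -> total eval batch size 32
    seed=42,
    optim="adamw_torch",             # betas=(0.9, 0.999), eps=1e-08 are the defaults
    lr_scheduler_type="constant",
    num_train_epochs=50,
    predict_with_generate=True,      # assumption, not stated in the card
)
```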

Training results

| Training Loss | Epoch | Step  | Validation Loss | Data Size | Epoch Runtime | Bleu   |
|:-------------:|:-----:|:-----:|:---------------:|:---------:|:-------------:|:------:|
| No log        | 0     | 0     | 10.8029         | 0         | 8.6388        | 0.1891 |
| No log        | 1     | 437   | 10.3068         | 0.0078    | 9.9822        | 0.2941 |
| No log        | 2     | 874   | 9.5257          | 0.0156    | 10.9625       | 0.3143 |
| No log        | 3     | 1311  | 9.0151          | 0.0312    | 13.1092       | 0.3782 |
| No log        | 4     | 1748  | 7.5096          | 0.0625    | 15.9280       | 0.6061 |
| 10.1556       | 5     | 2185  | 5.9652          | 0.125     | 21.9748       | 0.9684 |
| 6.2716        | 6     | 2622  | 3.3615          | 0.25      | 33.4550       | 7.2606 |
| 3.7253        | 7     | 3059  | 2.4801          | 0.5       | 56.2168       | 5.6955 |
| 2.9367        | 8     | 3496  | 2.1896          | 1.0       | 102.8297      | 6.4054 |
| 2.7182        | 9     | 3933  | 2.0701          | 1.0       | 99.5541       | 7.0651 |
| 2.531         | 10    | 4370  | 2.0068          | 1.0       | 100.6438      | 7.4090 |
| 2.4395        | 11    | 4807  | 1.9543          | 1.0       | 102.5407      | 7.7493 |
| 2.3342        | 12    | 5244  | 1.9145          | 1.0       | 101.2465      | 8.0284 |
| 2.2166        | 13    | 5681  | 1.8871          | 1.0       | 101.3928      | 8.2270 |
| 2.1458        | 14    | 6118  | 1.8661          | 1.0       | 101.8264      | 8.4796 |
| 2.0813        | 15    | 6555  | 1.8434          | 1.0       | 101.7612      | 8.5710 |
| 1.9936        | 16    | 6992  | 1.8180          | 1.0       | 102.8554      | 8.7805 |
| 1.9716        | 17    | 7429  | 1.8095          | 1.0       | 102.6467      | 8.8730 |
| 1.908         | 18    | 7866  | 1.8025          | 1.0       | 103.7183      | 8.9425 |
| 1.829         | 19    | 8303  | 1.7905          | 1.0       | 103.8481      | 9.1296 |
| 1.8205        | 20    | 8740  | 1.7831          | 1.0       | 102.4268      | 9.1897 |
| 1.7783        | 21    | 9177  | 1.7715          | 1.0       | 102.0740      | 9.3412 |
| 1.7252        | 22    | 9614  | 1.7758          | 1.0       | 102.5830      | 9.3150 |
| 1.6862        | 23    | 10051 | 1.7665          | 1.0       | 102.5648      | 9.4577 |
| 1.6352        | 24    | 10488 | 1.7677          | 1.0       | 102.6332      | 9.5920 |
| 1.64          | 25    | 10925 | 1.7584          | 1.0       | 102.5432      | 9.5846 |
| 1.5801        | 26    | 11362 | 1.7625          | 1.0       | 102.4186      | 9.6556 |
| 1.5826        | 27    | 11799 | 1.7628          | 1.0       | 101.8117      | 9.7640 |
| 1.5081        | 28    | 12236 | 1.7625          | 1.0       | 103.3678      | 9.7842 |
| 1.4737        | 29    | 12673 | 1.7601          | 1.0       | 103.0441      | 9.7353 |
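
The Bleu column appears to be a corpus-level score on the 0–100 scale, but the card does not state which implementation produced it. A hedged sketch using the `evaluate` library's sacrebleu metric (an assumption, not confirmed by the card):

```python
# Sketch of how the reported BLEU could be computed; the actual metric
# implementation used during training is not documented.
import evaluate

bleu = evaluate.load("sacrebleu")
predictions = ["Книга лежала на столе."]          # model outputs
references = [["Книга лежала на столе."]]         # one reference list per prediction
print(bleu.compute(predictions=predictions, references=references)["score"])
```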

Framework versions

  • Transformers 4.57.0
  • Pytorch 2.8.0+cu128
  • Datasets 4.2.0
  • Tokenizers 0.22.1