# a4d3a0b9f9652792eaa12a64fc7a0a0f
This model is a fine-tuned version of google/umt5-base on the Helsinki-NLP/opus_books [es-fr] dataset. It achieves the following results on the evaluation set:
- Loss: 1.4688
- Data Size (fraction of train set): 1.0
- Epoch Runtime: 316.8055 s
- BLEU: 14.5135
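
The card does not include usage instructions; below is a minimal inference sketch, assuming the repository id shown on this page and that the model takes raw Spanish input without a T5-style task prefix (the fine-tuning input format is not stated on the card).

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "contemmcm/a4d3a0b9f9652792eaa12a64fc7a0a0f"  # repo id from this page
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# Spanish source sentence; whether a task prefix is expected is an
# assumption not confirmed by the card.
text = "La vida es sueño."
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64, num_beams=4)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```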
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
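
A minimal sketch of loading the dataset named above with the `datasets` library. The `es-fr` configuration name follows the card; carving an evaluation set out of the single `train` split is an assumption, since opus_books ships only a `train` split.

```python
from datasets import load_dataset

# "es-fr" config per the card; opus_books has no built-in validation split,
# so a held-out evaluation set is split off manually (assumed procedure).
ds = load_dataset("Helsinki-NLP/opus_books", "es-fr")
ds = ds["train"].train_test_split(test_size=0.1, seed=42)
print(ds["train"][0]["translation"])  # {'es': '...', 'fr': '...'}
```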
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- total_train_batch_size: 32
- total_eval_batch_size: 32
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: constant
- num_epochs: 50
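
For reproducibility, here is a sketch of how these settings map onto `Seq2SeqTrainingArguments` from transformers; the `output_dir` and `predict_with_generate` flag are assumptions, everything else mirrors the list above.

```python
from transformers import Seq2SeqTrainingArguments

args = Seq2SeqTrainingArguments(
    output_dir="umt5-base-opus-books-es-fr",  # hypothetical path
    learning_rate=5e-5,
    per_device_train_batch_size=8,  # x4 GPUs = total train batch size 32
    per_device_eval_batch_size=8,   # x4 GPUs = total eval batch size 32
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="constant",
    num_train_epochs=50,
    predict_with_generate=True,  # assumed, so evaluation can report BLEU
)
# A 4-GPU run would be launched with `torchrun` or `accelerate launch`.
```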
### Training results
| Training Loss | Epoch | Step | Validation Loss | Data Size (fraction) | Epoch Runtime (s) | BLEU |
|---|---|---|---|---|---|---|
| No log | 0 | 0 | 11.0722 | 0 | 26.0473 | 0.1039 |
| No log | 1 | 1407 | 11.3248 | 0.0078 | 28.6270 | 0.1691 |
| No log | 2 | 2814 | 10.6613 | 0.0156 | 32.0288 | 0.0854 |
| 0.4011 | 3 | 4221 | 8.3129 | 0.0312 | 37.6507 | 0.2103 |
| 8.6224 | 4 | 5628 | 4.8219 | 0.0625 | 45.5095 | 1.9646 |
| 4.0351 | 5 | 7035 | 2.4925 | 0.125 | 64.6807 | 9.7578 |
| 2.9018 | 6 | 8442 | 2.0996 | 0.25 | 102.0038 | 8.0975 |
| 2.5594 | 7 | 9849 | 1.9141 | 0.5 | 174.3677 | 9.6353 |
| 2.2289 | 8 | 11256 | 1.7767 | 1.0 | 318.3380 | 10.8255 |
| 2.0555 | 9 | 12663 | 1.6844 | 1.0 | 317.6098 | 11.5320 |
| 2.0165 | 10 | 14070 | 1.6339 | 1.0 | 318.5579 | 12.0837 |
| 1.8789 | 11 | 15477 | 1.6035 | 1.0 | 315.8370 | 12.4816 |
| 1.7774 | 12 | 16884 | 1.5668 | 1.0 | 317.9719 | 12.7907 |
| 1.7569 | 13 | 18291 | 1.5487 | 1.0 | 315.5376 | 13.0673 |
| 1.7189 | 14 | 19698 | 1.5414 | 1.0 | 319.9117 | 13.2733 |
| 1.6574 | 15 | 21105 | 1.5134 | 1.0 | 317.2714 | 13.4508 |
| 1.6254 | 16 | 22512 | 1.5017 | 1.0 | 319.6040 | 13.6132 |
| 1.5744 | 17 | 23919 | 1.4962 | 1.0 | 319.9870 | 13.7815 |
| 1.5375 | 18 | 25326 | 1.4928 | 1.0 | 322.8918 | 13.8688 |
| 1.5039 | 19 | 26733 | 1.4923 | 1.0 | 320.2338 | 13.9764 |
| 1.4757 | 20 | 28140 | 1.4775 | 1.0 | 316.7046 | 14.1037 |
| 1.4347 | 21 | 29547 | 1.4763 | 1.0 | 318.7257 | 14.1721 |
| 1.4164 | 22 | 30954 | 1.4582 | 1.0 | 317.5881 | 14.2710 |
| 1.394 | 23 | 32361 | 1.4609 | 1.0 | 317.6584 | 14.3750 |
| 1.349 | 24 | 33768 | 1.4667 | 1.0 | 318.4692 | 14.3713 |
| 1.3484 | 25 | 35175 | 1.4703 | 1.0 | 317.2499 | 14.5182 |
| 1.3162 | 26 | 36582 | 1.4688 | 1.0 | 316.8055 | 14.5135 |
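
The Data Size column appears to follow a data warm-up schedule: the fraction of the training set doubles each epoch (from roughly 1/128 at epoch 1) until the full set is used from epoch 8 onward. The card does not state how BLEU is computed; a minimal sketch using the `evaluate` library's sacrebleu metric (an assumption, not confirmed by the card):

```python
import evaluate  # assumes the `evaluate` library with sacrebleu installed

bleu = evaluate.load("sacrebleu")
# Hypothetical prediction/reference pair, for illustration only.
predictions = ["Le chat dort sur le tapis."]
references = [["Le chat dort sur le tapis."]]
result = bleu.compute(predictions=predictions, references=references)
print(result["score"])  # corpus-level BLEU, same scale as the table above
```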
### Framework versions
- Transformers 4.57.0
- PyTorch 2.8.0+cu128
- Datasets 4.2.0
- Tokenizers 0.22.1