---
language:
- ko
- ja
base_model: facebook/mbart-large-50-many-to-many-mmt
tags:
- generated_from_trainer
metrics:
- bleu
model-index:
- name: mbartLarge_koja_37p_exp2
  results: []
---

# mbartLarge_koja_37p_exp2

This model is a fine-tuned version of [facebook/mbart-large-50-many-to-many-mmt](https://huggingface.co/facebook/mbart-large-50-many-to-many-mmt) on an unknown dataset.
It achieves the following results on the evaluation set:
- Loss: 0.8988
- Bleu: 6.7577
- Gen Len: 17.8104

## Model description

More information needed. The base checkpoint is mBART-50 many-to-many, and the language tags (`ko`, `ja`) together with the model name suggest a Korean-Japanese translation model; the training corpus itself is not documented.

## Intended uses & limitations

More information needed. An illustrative usage sketch is provided under "How to use" at the end of this card.

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 4
- eval_batch_size: 4
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- gradient_accumulation_steps: 2
- total_train_batch_size: 32 (4 per device × 4 GPUs × 2 accumulation steps)
- total_eval_batch_size: 16 (4 per device × 4 GPUs)
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 350
- num_epochs: 15

A code sketch of these arguments appears under "Training arguments sketch" below.

### Training results

Logged checkpoints run through epoch 3.19 (step 36250), short of the configured 15 epochs; the card does not record why training stopped there. The headline evaluation loss (0.8988) matches the step-31250 checkpoint, which suggests the best checkpoint by validation loss was retained.

| Training Loss | Epoch | Step  | Validation Loss | Bleu   | Gen Len |
|:-------------:|:-----:|:-----:|:---------------:|:------:|:-------:|
| 2.0622 | 0.11 | 1250 | 1.6679 | 1.2834 | 17.8009 |
| 1.5139 | 0.22 | 2500 | 1.4378 | 2.0427 | 17.8496 |
| 1.4121 | 0.33 | 3750 | 1.3116 | 2.7599 | 17.7667 |
| 1.2879 | 0.44 | 5000 | 1.2381 | 3.1444 | 17.8887 |
| 1.2344 | 0.55 | 6250 | 1.1769 | 3.3835 | 17.8323 |
| 1.1778 | 0.66 | 7500 | 1.1382 | 3.9511 | 17.4892 |
| 1.1461 | 0.77 | 8750 | 1.0938 | 3.9402 | 18.0136 |
| 1.1151 | 0.88 | 10000 | 1.0749 | 4.2134 | 18.0537 |
| 1.093 | 0.99 | 11250 | 1.0418 | 3.9587 | 17.8715 |
| 1.0626 | 1.1 | 12500 | 1.0315 | 4.6251 | 17.9406 |
| 1.0192 | 1.21 | 13750 | 1.0132 | 4.9573 | 18.1266 |
| 0.9957 | 1.32 | 15000 | 0.9989 | 4.3068 | 18.0925 |
| 0.9778 | 1.43 | 16250 | 0.9850 | 5.0517 | 17.8783 |
| 0.9446 | 1.54 | 17500 | 0.9748 | 5.0194 | 17.9348 |
| 0.9236 | 1.65 | 18750 | 0.9619 | 4.6011 | 17.7926 |
| 0.9091 | 1.76 | 20000 | 0.9564 | 4.6035 | 17.9399 |
| 0.9072 | 1.87 | 21250 | 0.9533 | 4.8313 | 17.6221 |
| 0.8758 | 1.98 | 22500 | 0.9421 | 5.2707 | 17.5851 |
| 0.8539 | 2.09 | 23750 | 0.9304 | 5.2661 | 17.821 |
| 0.8575 | 2.2 | 25000 | 0.9329 | 4.9143 | 17.8879 |
| 0.8314 | 2.31 | 26250 | 0.9262 | 5.106 | 18.0037 |
| 0.8248 | 2.42 | 27500 | 0.9241 | 5.3073 | 17.6632 |
| 0.8151 | 2.53 | 28750 | 0.9302 | 5.5675 | 17.7676 |
| 0.8093 | 2.64 | 30000 | 0.9149 | 6.2644 | 17.8475 |
| 0.7691 | 2.75 | 31250 | 0.8988 | 6.6682 | 17.7685 |
| 0.771 | 2.86 | 32500 | 0.9189 | 5.7856 | 17.8678 |
| 0.7658 | 2.97 | 33750 | 0.9175 | 6.2468 | 17.7313 |
| 0.7914 | 3.08 | 35000 | 0.9020 | 5.5525 | 17.7627 |
| 0.7264 | 3.19 | 36250 | 0.9046 | 6.2055 | 17.7662 |

### Framework versions

- Transformers 4.34.1
- PyTorch 2.1.0+cu121
- Datasets 2.14.6
- Tokenizers 0.14.1
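
### Training arguments sketch

The training script for this run was not published. As a point of reference only, here is a minimal sketch of how the hyperparameters above could map onto `Seq2SeqTrainingArguments` in Transformers 4.34; the `output_dir` value and the use of `Seq2SeqTrainingArguments` (rather than a custom loop) are assumptions, not facts from this card.

```python
from transformers import Seq2SeqTrainingArguments

# Sketch only: mirrors the hyperparameters listed above.
# output_dir is a placeholder; per-device sizes assume the reported 4-GPU run
# (4 per device x 4 GPUs x 2 accumulation steps = effective train batch size 32).
training_args = Seq2SeqTrainingArguments(
    output_dir="mbartLarge_koja_37p_exp2",
    learning_rate=5e-05,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    gradient_accumulation_steps=2,
    seed=42,
    lr_scheduler_type="linear",
    warmup_steps=350,
    num_train_epochs=15,
    predict_with_generate=True,  # required so BLEU and Gen Len can be computed at eval time
    # Adam betas (0.9, 0.999) and epsilon 1e-08 are the library defaults,
    # so they are not set explicitly here.
)
```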
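
## How to use

The card does not include an inference example, so the snippet below is a minimal sketch. It assumes the checkpoint is loaded from a local path or hub repository named `mbartLarge_koja_37p_exp2` (a placeholder; substitute the real identifier) and that, as the model name and language tags suggest, the intended direction is Korean → Japanese. The language-code handling is the standard mBART-50 API.

```python
from transformers import MBart50TokenizerFast, MBartForConditionalGeneration

model_id = "mbartLarge_koja_37p_exp2"  # placeholder: use the actual repo path

tokenizer = MBart50TokenizerFast.from_pretrained(model_id)
model = MBartForConditionalGeneration.from_pretrained(model_id)

# mBART-50 steers translation direction via language codes.
tokenizer.src_lang = "ko_KR"  # source: Korean

text = "안녕하세요. 오늘 날씨가 좋네요."  # "Hello. The weather is nice today."
inputs = tokenizer(text, return_tensors="pt")

generated = model.generate(
    **inputs,
    forced_bos_token_id=tokenizer.lang_code_to_id["ja_XX"],  # target: Japanese
    max_length=48,
)
print(tokenizer.batch_decode(generated, skip_special_tokens=True)[0])
```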