NLLB for Low-Resource Middle Eastern Languages

This model is a fine-tuned version of nllb-200-distilled-600M. It achieves the following results on the evaluation set:

  • Loss: 5.2654
  • Bleu: 14.7793
  • Gen Len: 14.8572
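
Bleu and Gen Len are generation-based metrics. As a point of reference, the sketch below shows how such values are typically computed in a Seq2SeqTrainer `compute_metrics` callback; it is illustrative only and is not the actual evaluation script of this model.

```python
import numpy as np
import evaluate

sacrebleu = evaluate.load("sacrebleu")

def compute_metrics(eval_preds, tokenizer):
    """Illustrative only: decode predictions/labels and score them with sacreBLEU."""
    preds, labels = eval_preds
    # -100 marks ignored label positions; restore pad tokens before decoding.
    labels = np.where(labels != -100, labels, tokenizer.pad_token_id)
    decoded_preds = tokenizer.batch_decode(preds, skip_special_tokens=True)
    decoded_labels = tokenizer.batch_decode(labels, skip_special_tokens=True)

    score = sacrebleu.compute(predictions=decoded_preds,
                              references=[[ref] for ref in decoded_labels])
    gen_len = np.mean([np.count_nonzero(p != tokenizer.pad_token_id) for p in preds])
    return {"bleu": score["score"], "gen_len": gen_len}
```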

This model is fine-tuned to translate from the following languages into English (eng_Latn):

  • Luri Bakhtiari (bqi_Arab)
  • Gilaki (glk_Arab)
  • Hawrami (hac_Arab)
  • Laki (lki_Arab)
  • Mazanderani (mzn_Arab)
  • Southern Kurdish (sdh_Arab)
  • Talysh (tly_Arab)
  • Zazaki (zza_Latn)
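
A minimal usage sketch for translating into English with this checkpoint, assuming the released tokenizer exposes the language codes listed above (swap in the code of your source language):

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "SinaAhmadi/NLLB-DOLMA"

# src_lang must be one of the codes listed above (here: Zazaki).
tokenizer = AutoTokenizer.from_pretrained(model_id, src_lang="zza_Latn")
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

text = "..."  # a sentence in the source language
inputs = tokenizer(text, return_tensors="pt")

# Force decoding to start with the English target-language token.
eng_id = tokenizer.convert_tokens_to_ids("eng_Latn")
outputs = model.generate(**inputs, forced_bos_token_id=eng_id, max_new_tokens=64)
print(tokenizer.batch_decode(outputs, skip_special_tokens=True)[0])
```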

Intended uses & limitations

This model is trained to translate into English; it is not trained to translate from English.

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0002
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 64
  • optimizer: adamw_torch (AdamW) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: polynomial
  • lr_scheduler_warmup_ratio: 0.2
  • num_epochs: 50.0
  • mixed_precision_training: Native AMP
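
For reference, here is a hedged sketch of how the hyperparameters above could be expressed as Seq2SeqTrainingArguments; the actual training script is not included in this card and may differ.

```python
from transformers import Seq2SeqTrainingArguments

# Hypothetical mapping of the listed hyperparameters onto standard
# Seq2SeqTrainingArguments fields; values are taken from the list above.
training_args = Seq2SeqTrainingArguments(
    output_dir="nllb-dolma-finetune",  # placeholder
    learning_rate=2e-4,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=42,
    gradient_accumulation_steps=4,     # effective train batch size: 16 * 4 = 64
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="polynomial",
    warmup_ratio=0.2,
    num_train_epochs=50,
    fp16=True,                         # "Native AMP" mixed precision
    eval_strategy="epoch",             # evaluation was run once per epoch
    predict_with_generate=True,        # needed for BLEU / Gen Len
)
```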

Training results

| Training Loss | Epoch   | Step  | Validation Loss | Bleu    | Gen Len |
|:-------------:|:-------:|:-----:|:---------------:|:-------:|:-------:|
| 0.0023        | 1.0     | 396   | 5.0074          | 14.2158 | 14.4698 |
| 0.0024        | 2.0     | 792   | 5.0344          | 14.2734 | 14.3467 |
| 0.0027        | 3.0     | 1188  | 5.0521          | 14.2059 | 14.3137 |
| 0.0039        | 4.0     | 1584  | 5.0034          | 13.7306 | 14.7393 |
| 0.0069        | 5.0     | 1980  | 5.0078          | 13.8802 | 14.4926 |
| 0.013         | 6.0     | 2376  | 4.9957          | 13.5899 | 14.494  |
| 0.0165        | 7.0     | 2772  | 4.9971          | 13.324  | 14.9148 |
| 0.0195        | 8.0     | 3168  | 4.9949          | 13.5516 | 14.4363 |
| 0.0218        | 9.0     | 3564  | 4.9608          | 13.6364 | 14.1306 |
| 0.0249        | 10.0    | 3960  | 4.9907          | 13.1309 | 14.3164 |
| 0.0237        | 11.0    | 4356  | 4.9949          | 13.389  | 14.4307 |
| 0.0183        | 12.0    | 4752  | 5.0267          | 13.4564 | 14.6526 |
| 0.0212        | 13.0    | 5148  | 5.0724          | 13.59   | 14.2952 |
| 0.0158        | 14.0    | 5544  | 5.0832          | 13.3564 | 14.5018 |
| 0.0149        | 15.0    | 5940  | 5.0480          | 13.71   | 14.4261 |
| 0.0152        | 16.0    | 6336  | 5.0454          | 13.3368 | 14.4033 |
| 0.0179        | 17.0    | 6732  | 5.0282          | 13.2518 | 14.4889 |
| 0.0139        | 18.0    | 7128  | 5.0397          | 13.4478 | 14.5729 |
| 0.0124        | 19.0    | 7524  | 5.1244          | 13.418  | 14.4207 |
| 0.0107        | 20.0    | 7920  | 5.1304          | 13.4141 | 14.5943 |
| 0.0104        | 21.0    | 8316  | 5.0841          | 13.6054 | 14.0954 |
| 0.0121        | 22.0    | 8712  | 5.0961          | 13.4688 | 14.6354 |
| 0.0086        | 23.0    | 9108  | 5.1330          | 13.5374 | 14.4979 |
| 0.0097        | 24.0    | 9504  | 5.1155          | 13.4956 | 14.4816 |
| 0.0074        | 25.0    | 9900  | 5.1742          | 13.8177 | 14.3275 |
| 0.0058        | 26.0    | 10296 | 5.1479          | 13.6641 | 14.219  |
| 0.0058        | 27.0    | 10692 | 5.1932          | 13.7447 | 14.1751 |
| 0.0044        | 28.0    | 11088 | 5.1611          | 13.488  | 14.7169 |
| 0.0083        | 29.0    | 11484 | 5.1577          | 13.8153 | 14.3556 |
| 0.0053        | 30.0    | 11880 | 5.2061          | 14.1224 | 14.1012 |
| 0.0046        | 31.0    | 12276 | 5.2480          | 13.9126 | 14.5045 |
| 0.0054        | 32.0    | 12672 | 5.1965          | 14.019  | 14.16   |
| 0.0035        | 33.0    | 13068 | 5.1847          | 14.004  | 14.4037 |
| 0.0032        | 34.0    | 13464 | 5.2124          | 14.228  | 14.2273 |
| 0.0024        | 35.0    | 13860 | 5.2090          | 14.2703 | 14.0995 |
| 0.0029        | 36.0    | 14256 | 5.2327          | 13.7593 | 14.604  |
| 0.0043        | 37.0    | 14652 | 5.2005          | 14.3019 | 14.0886 |
| 0.0022        | 38.0    | 15048 | 5.2218          | 14.2565 | 14.1928 |
| 0.0031        | 39.0    | 15444 | 5.2403          | 14.1208 | 14.438  |
| 0.0022        | 40.0    | 15840 | 5.2507          | 14.2927 | 14.3079 |
| 0.0014        | 41.0    | 16236 | 5.2558          | 14.2727 | 14.2874 |
| 0.0021        | 42.0    | 16632 | 5.2735          | 14.1117 | 14.1115 |
| 0.0013        | 43.0    | 17028 | 5.2707          | 14.4166 | 14.1923 |
| 0.0021        | 44.0    | 17424 | 5.2790          | 14.4223 | 14.2129 |
| 0.0016        | 45.0    | 17820 | 5.2758          | 14.486  | 14.2625 |
| 0.0019        | 46.0    | 18216 | 5.2546          | 14.5501 | 14.2695 |
| 0.0011        | 47.0    | 18612 | 5.2654          | 14.6166 | 14.1882 |
| 0.0016        | 48.0    | 19008 | 5.2610          | 14.5838 | 14.2617 |
| 0.0011        | 49.0    | 19404 | 5.2642          | 14.5987 | 14.2119 |
| 0.001         | 49.8743 | 19750 | 5.2645          | 14.576  | 14.2289 |

Framework versions

  • Transformers 4.48.0.dev0
  • Pytorch 2.4.1+cu121
  • Datasets 3.2.0
  • Tokenizers 0.21.0