arxiv:2507.12672

The first open machine translation system for the Chechen language

Published on Jul 16

Abstract

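An open-source Chechen-Russian translation model is introduced, along with its training and evaluation dataset and a multilingual sentence encoder adapted to Chechen. It achieves BLEU / ChrF++ scores of 8.34 / 34.69 for Russian to Chechen and 20.89 / 44.55 for Chechen to Russian.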

AI-generated summary

We introduce the first open-source model for translation between the vulnerable Chechen language and Russian, together with the dataset collected to train and evaluate it. We explore fine-tuning as a way to add a new language to NLLB-200, a large language model system for multilingual translation. Our model achieves BLEU / ChrF++ scores of 8.34 / 34.69 for Russian-to-Chechen translation and 20.89 / 44.55 for the reverse direction. Alongside the translation models, we release parallel corpora of words, phrases and sentences and a multilingual sentence encoder adapted to the Chechen language.
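For readers unfamiliar with the character-level metric reported above, the following is a minimal sketch of a simplified chrF-style score: an F-score (recall weighted by beta = 2) over character n-grams of orders 1 to 6. It is an illustration only, not the paper's evaluation code. The function names `char_ngrams` and `simple_chrf` are hypothetical, and the full ChrF++ metric additionally includes word n-grams, so this sketch will not reproduce the scores quoted in the abstract.

```python
from collections import Counter


def char_ngrams(text, n):
    """Count the character n-grams of order n in text."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def simple_chrf(hypothesis, reference, max_n=6, beta=2.0):
    """Simplified chrF-style score on a 0-100 scale (assumption: no word n-grams)."""
    # Whitespace is dropped so that only character content is compared.
    hyp = hypothesis.replace(" ", "")
    ref = reference.replace(" ", "")

    precisions, recalls = [], []
    for n in range(1, max_n + 1):
        hyp_counts = char_ngrams(hyp, n)
        ref_counts = char_ngrams(ref, n)
        if not hyp_counts or not ref_counts:
            continue  # one of the strings is shorter than n characters
        # Clipped overlap: each n-gram counts at most as often as in the reference.
        overlap = sum((hyp_counts & ref_counts).values())
        precisions.append(overlap / sum(hyp_counts.values()))
        recalls.append(overlap / sum(ref_counts.values()))

    if not precisions:
        return 0.0
    precision = sum(precisions) / len(precisions)
    recall = sum(recalls) / len(recalls)
    if precision + recall == 0:
        return 0.0
    b2 = beta ** 2
    return 100 * (1 + b2) * precision * recall / (b2 * precision + recall)
```

Because the score works on characters rather than whole words, a translation that gets a word stem right but an inflection wrong still earns partial credit, which is one reason character-level metrics are often reported next to BLEU for morphologically rich languages.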
