metadata
language: mn
license: mit
tags:
- mongolian
- tokenizer
- sentencepiece
SentencePiece Tokenizer
This repository contains a SentencePiece tokenizer fine-tuned on Mongolian text.
Files
- tokenizer_config.json: The tokenizer configuration file
- mn_tokenizer.model: The SentencePiece model file
- mn_tokenizer.vocab: The SentencePiece vocabulary file
Usage
from transformers import AutoTokenizer

# Download the tokenizer files from the Hugging Face Hub and load them
tokenizer = AutoTokenizer.from_pretrained("Namuun123/mn_sentencepiece_tokenizer")
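Once the tokenizer is loaded, a quick round trip is a convenient sanity check. The sketch below is illustrative only: the helper name `inspect_tokenization` is hypothetical and not part of this repository. It assumes the loaded object exposes the standard `transformers` tokenizer methods `tokenize`, `encode`, and `decode`.

```python
def inspect_tokenization(tokenizer, text):
    """Return the subword pieces, token ids, and decoded text for `text`.

    Hypothetical helper (not shipped with this repository). `tokenizer` is
    assumed to be any object with the usual transformers tokenizer methods.
    """
    tokens = tokenizer.tokenize(text)  # subword pieces as strings
    ids = tokenizer.encode(text)       # integer ids, possibly with special tokens
    # Drop special tokens so the decoded text can be compared with the input
    decoded = tokenizer.decode(ids, skip_special_tokens=True)
    return {"tokens": tokens, "ids": ids, "decoded": decoded}
```

For example, `inspect_tokenization(tokenizer, "Сайн байна уу")` returns a dictionary that can be printed to see how a Mongolian greeting is split. The exact pieces and ids depend on the trained vocabulary, so no expected output is shown here.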