ru_to_ossbert_e

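This model is a fine-tuned version of ai-forever/ruBert-base. The dataset name was not recorded in the training metadata (the auto-generated card lists it as None); the data actually used is described under "Training and evaluation data". It achieves the following results on the evaluation set: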

  • Loss: 5.8898
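
For a masked-language-modelling objective, the loss can be read as a perplexity over the masked tokens by exponentiating it. The sketch below assumes the reported value is a mean cross-entropy in nats, which is the usual convention for the Transformers Trainer but is not stated in this card; the function name is illustrative.

```python
import math

EVAL_LOSS = 5.8898  # final evaluation loss reported above


def perplexity(loss_nats: float) -> float:
    """Convert a mean cross-entropy loss (in nats) to a perplexity."""
    return math.exp(loss_nats)


print(f"perplexity ~ {perplexity(EVAL_LOSS):.0f}")
```

Under that assumption the perplexity is roughly 360, meaning the model is on average about as uncertain as a uniform choice among some 360 entries of its 25K vocabulary.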

Model description

ruBert-base adapted to Ossetic, with custom embeddings from a tokenizer trained on Ossetic (vocabulary size: 25K). The model was trained for experimental purposes.
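
A minimal usage sketch for masked-token prediction follows. It assumes the repository id is ania3000/ru_to_ossbert_e, that the repository ships the Ossetic tokenizer alongside the weights, and that the checkpoint keeps a masked-language-modelling head; none of this is confirmed by the description above. The helper top_predictions and its "{mask}" placeholder are hypothetical conveniences, not part of any library.

```python
MODEL_ID = "ania3000/ru_to_ossbert_e"  # assumed Hub repo id; assumed to include the tokenizer


def top_predictions(text_with_slot, k=5, fill_mask=None):
    """Return the top-k (token, score) pairs for a single masked slot.

    `text_with_slot` must contain the literal placeholder "{mask}" exactly once.
    `fill_mask` may be an already-built fill-mask pipeline; if omitted, one is
    created from MODEL_ID (this downloads the model on first use).
    """
    if fill_mask is None:
        from transformers import pipeline  # lazy import: only needed when loading the model
        fill_mask = pipeline("fill-mask", model=MODEL_ID)
    # Substitute the tokenizer's own mask token rather than hard-coding "[MASK]"
    text = text_with_slot.replace("{mask}", fill_mask.tokenizer.mask_token)
    return [(r["token_str"], r["score"]) for r in fill_mask(text, top_k=k)]
```

Calling top_predictions with an Ossetic sentence containing "{mask}" returns candidate tokens with their scores. Given the evaluation loss reported above, the predictions should be treated as experimental output rather than reliable completions.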

Training and evaluation data

Ossetic National Corpus (approx. 200K tokens)

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0002
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: ADAMW_TORCH_FUSED (fused PyTorch AdamW) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 5
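
To make the linear scheduler concrete, the sketch below computes the learning rate at a given optimizer step. It assumes no warmup (none is listed above) and a total of 1085 steps, which is inferred from the epoch and step columns of the results table (about 217 steps per epoch over 5 epochs) rather than stated in the card. The function linear_lr is written for illustration and is not the library's own scheduler.

```python
BASE_LR = 2e-4       # learning_rate from the list above
TOTAL_STEPS = 1085   # assumption: ~217 steps/epoch (inferred from the results table) x 5 epochs
WARMUP_STEPS = 0     # assumption: no warmup is listed in the card


def linear_lr(step, base_lr=BASE_LR, total_steps=TOTAL_STEPS, warmup_steps=WARMUP_STEPS):
    """Learning rate at `step`: linear warmup, then linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Learning rate at the start and at a few of the evaluation steps
for step in (0, 200, 600, 1000):
    print(step, f"{linear_lr(step):.2e}")
```

Under these assumptions the rate starts at 0.0002 and has decayed to well under a tenth of that by the last evaluation at step 1000.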

Training results

Training Loss   Epoch    Step   Validation Loss
No log          0.9217    200   6.4320
No log          1.8433    400   6.2294
No log          2.7650    600   6.1100
No log          3.6866    800   5.9897
No log          4.6083   1000   5.8898
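
The epoch and step columns also allow a rough estimate of the run's size. The sketch below derives the number of optimizer steps per epoch and an upper bound on the number of training sequences, assuming a single device and no gradient accumulation (neither is stated in the card).

```python
# (epoch, step) pairs copied from the results table above
CHECKPOINTS = [(0.9217, 200), (1.8433, 400), (2.7650, 600), (3.6866, 800), (4.6083, 1000)]
TRAIN_BATCH_SIZE = 8  # train_batch_size from the hyperparameter list
NUM_EPOCHS = 5        # num_epochs from the hyperparameter list


def steps_per_epoch(checkpoints):
    """Estimate optimizer steps per epoch as the rounded mean of step / epoch."""
    ratios = [step / epoch for epoch, step in checkpoints]
    return round(sum(ratios) / len(ratios))


spe = steps_per_epoch(CHECKPOINTS)
total_steps = spe * NUM_EPOCHS
# Upper bound on training sequences: assumes one device and no gradient accumulation
max_sequences = spe * TRAIN_BATCH_SIZE
print(spe, total_steps, max_sequences)
```

This gives about 217 steps per epoch, 1085 steps over the full run, and at most 1736 training sequences per epoch.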

Framework versions

  • Transformers 4.57.1
  • PyTorch 2.8.0+cu126
  • Datasets 4.0.0
  • Tokenizers 0.22.1