ru_to_ossbert

This model is a fine-tuned version of ai-forever/ruBert-base on the Ossetic National Corpus. It achieves the following results on the evaluation set:

  • Loss: 2.2399

Model description

An Ossetic masked language model based on ruBert-base, trained for experimental purposes.
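The card ships no usage snippet; a minimal sketch for querying the model with the transformers fill-mask pipeline might look like the following (the helper name and the idea of deferring the import are my own, not from the card):

```python
# Hedged sketch: masked-token prediction with this model via the
# transformers fill-mask pipeline. Requires `pip install transformers torch`.

MODEL_ID = "ania3000/ru_to_ossbert"  # repo id from the card

def top_predictions(text: str, k: int = 5):
    """Return the k most likely fillers for the [MASK] token in text."""
    # Deferred import so the module loads even without transformers installed.
    from transformers import pipeline
    fill = pipeline("fill-mask", model=MODEL_ID)
    return [p["token_str"] for p in fill(text, top_k=k)]
```

Since the tokenizer comes from ruBert-base, the mask placeholder is the BERT-style `[MASK]` token.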

Training and evaluation data

Ossetic National Corpus (approx. 250K tokens)

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: AdamW (torch, fused) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 5
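The hyperparameters above map onto a transformers.TrainingArguments configuration roughly as sketched below (the output directory name is an illustrative assumption; everything else mirrors the list):

```python
# Hedged sketch: the listed hyperparameters as TrainingArguments kwargs.
HPARAMS = {
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 8,
    "per_device_eval_batch_size": 8,
    "seed": 42,
    "optim": "adamw_torch_fused",   # fused torch AdamW
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 5,
}

def training_args():
    # Deferred import so the sketch loads without transformers installed.
    from transformers import TrainingArguments
    return TrainingArguments(output_dir="ru_to_ossbert", **HPARAMS)
```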

Training results

Training Loss | Epoch  | Step | Validation Loss
No log        | 0.7273 |  200 | 2.8286
No log        | 1.4545 |  400 | 2.5696
No log        | 2.1818 |  600 | 2.4407
No log        | 2.9091 |  800 | 2.3352
No log        | 3.6364 | 1000 | 2.3069
No log        | 4.3636 | 1200 | 2.2399
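The cross-entropy losses above can be read as masked-LM perplexity via the standard identity ppl = exp(loss); this conversion is not part of the card itself:

```python
import math

def perplexity(loss: float) -> float:
    """Convert a mean cross-entropy loss (nats) to perplexity."""
    return math.exp(loss)

final_ppl = perplexity(2.2399)  # final validation loss -> ppl ~ 9.39
```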

Framework versions

  • Transformers 4.57.1
  • Pytorch 2.8.0+cu126
  • Datasets 4.0.0
  • Tokenizers 0.22.1