gpt2-mydataset

This model is a fine-tuned version of gpt2 on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 4.9846
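Since the evaluation loss for a causal LM is the mean per-token cross-entropy (in nats, as reported by the Transformers Trainer), the corresponding perplexity is its exponential. A minimal sketch:

```python
import math

# Perplexity of a causal LM is exp(mean cross-entropy loss),
# assuming the reported eval loss is the per-token NLL in nats.
eval_loss = 4.9846
perplexity = math.exp(eval_loss)
print(f"{perplexity:.1f}")  # roughly 146
```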

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0003
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: AdamW (torch fused) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 3
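The hyperparameters above map onto a Transformers `TrainingArguments` configuration roughly as follows. This is a sketch, not the original training script; `output_dir` is a placeholder, and any dataset-specific settings are omitted:

```python
from transformers import TrainingArguments

# Configuration implied by the hyperparameters listed above.
# output_dir is an assumption, not taken from the original run.
training_args = TrainingArguments(
    output_dir="gpt2-mydataset",
    learning_rate=3e-4,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    optim="adamw_torch_fused",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=3,
)
```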

Training results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 8.0182        | 0.0617 | 50   | 7.4035          |
| 7.2474        | 0.1233 | 100  | 7.0215          |
| 6.8954        | 0.1850 | 150  | 6.7160          |
| 6.6165        | 0.2466 | 200  | 6.5297          |
| 6.4592        | 0.3083 | 250  | 6.3550          |
| 6.2465        | 0.3699 | 300  | 6.2113          |
| 6.1436        | 0.4316 | 350  | 6.0884          |
| 6.0491        | 0.4932 | 400  | 5.9975          |
| 5.975         | 0.5549 | 450  | 5.9086          |
| 5.8884        | 0.6165 | 500  | 5.8378          |
| 5.8221        | 0.6782 | 550  | 5.7791          |
| 5.8462        | 0.7398 | 600  | 5.7297          |
| 5.7512        | 0.8015 | 650  | 5.6896          |
| 5.6931        | 0.8631 | 700  | 5.6288          |
| 5.6135        | 0.9248 | 750  | 5.5984          |
| 5.5411        | 0.9864 | 800  | 5.5516          |
| 5.4014        | 1.0481 | 850  | 5.5243          |
| 5.3431        | 1.1097 | 900  | 5.4868          |
| 5.3665        | 1.1714 | 950  | 5.4619          |
| 5.3427        | 1.2330 | 1000 | 5.4313          |
| 5.2786        | 1.2947 | 1050 | 5.4047          |
| 5.3004        | 1.3564 | 1100 | 5.3722          |
| 5.279         | 1.4180 | 1150 | 5.3468          |
| 5.2892        | 1.4797 | 1200 | 5.3211          |
| 5.225         | 1.5413 | 1250 | 5.2964          |
| 5.243         | 1.6030 | 1300 | 5.2768          |
| 5.1481        | 1.6646 | 1350 | 5.2502          |
| 5.1373        | 1.7263 | 1400 | 5.2257          |
| 5.1689        | 1.7879 | 1450 | 5.2159          |
| 5.1515        | 1.8496 | 1500 | 5.1912          |
| 5.115         | 1.9112 | 1550 | 5.1717          |
| 5.1288        | 1.9729 | 1600 | 5.1469          |
| 4.911         | 2.0345 | 1650 | 5.1360          |
| 4.881         | 2.0962 | 1700 | 5.1215          |
| 4.8682        | 2.1578 | 1750 | 5.1092          |
| 4.9181        | 2.2195 | 1800 | 5.0962          |
| 4.904         | 2.2811 | 1850 | 5.0810          |
| 4.9309        | 2.3428 | 1900 | 5.0686          |
| 4.8559        | 2.4044 | 1950 | 5.0563          |
| 4.8654        | 2.4661 | 2000 | 5.0444          |
| 4.8656        | 2.5277 | 2050 | 5.0383          |
| 4.8428        | 2.5894 | 2100 | 5.0228          |
| 4.8463        | 2.6510 | 2150 | 5.0125          |
| 4.7709        | 2.7127 | 2200 | 5.0048          |
| 4.8147        | 2.7744 | 2250 | 4.9981          |
| 4.7904        | 2.8360 | 2300 | 4.9923          |
| 4.7581        | 2.8977 | 2350 | 4.9869          |
| 4.8169        | 2.9593 | 2400 | 4.9846          |
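The (epoch, step) pairs in the log pin down the epoch length. A quick back-of-the-envelope check (my arithmetic, not stated in the card):

```python
# At step 2400 the log reports epoch 2.9593, so one epoch is about
# 2400 / 2.9593 ≈ 811 optimizer steps. With train_batch_size=8 this
# implies roughly 811 * 8 ≈ 6488 training examples per epoch.
steps_per_epoch = round(2400 / 2.9593)
approx_examples = steps_per_epoch * 8
print(steps_per_epoch, approx_examples)  # 811 6488
```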

Framework versions

  • Transformers 4.57.1
  • PyTorch 2.8.0+cu126
  • Datasets 4.0.0
  • Tokenizers 0.22.1

Model details

  • Model ID: zehao888/gpt2-mydataset (fine-tuned from gpt2)
  • Model size: 0.1B params
  • Tensor type: F32 (Safetensors)