chessllm_FPT

This model is a fine-tuned version of EleutherAI/pythia-70m-deduped on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.7339
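
The checkpoint can be loaded as a standard causal language model, since the base model (EleutherAI/pythia-70m-deduped) is a GPT-NeoX causal LM. Below is a minimal loading sketch; because the training data and prompt format are undocumented, the PGN-style prompt is purely a hypothetical illustration.

```python
# Minimal loading/inference sketch. The repo id comes from this card;
# the prompt format is a guess (the training data is undocumented).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "huyvux3005/chessllm_FPT"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

prompt = "1. e4 e5 2. Nf3"  # hypothetical PGN-style prompt
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_new_tokens=16,
    do_sample=False,
    pad_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```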

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (see the sketch after this list):

  • learning_rate: 0.0001
  • train_batch_size: 128
  • eval_batch_size: 128
  • seed: 42
  • optimizer: AdamW (adamw_torch_fused) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • num_epochs: 1
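
The snippet below is a sketch of a Trainer setup matching the hyperparameters above. Only the argument values come from this card; the dataset, tokenization, and the eval/logging cadence (inferred from the 169-step spacing in the results table) are assumptions.

```python
# Hypothetical reconstruction of the training configuration; argument
# values are taken from the hyperparameter list above.
from transformers import AutoModelForCausalLM, Trainer, TrainingArguments

model = AutoModelForCausalLM.from_pretrained("EleutherAI/pythia-70m-deduped")

args = TrainingArguments(
    output_dir="chessllm_FPT",
    learning_rate=1e-4,
    per_device_train_batch_size=128,
    per_device_eval_batch_size=128,
    seed=42,
    optim="adamw_torch_fused",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    num_train_epochs=1,
    eval_strategy="steps",
    eval_steps=169,   # matches the evaluation cadence in the results table
    logging_steps=169,
)

train_dataset = None  # placeholder: the training data is not documented
eval_dataset = None   # placeholder: the evaluation data is not documented

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
)
# trainer.train()  # would require real datasets
```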

Training results

| Training Loss | Epoch | Step | Validation Loss |
|:---:|:---:|:---:|:---:|
| 1.0862 | 0.0100 | 169 | 1.0676 |
| 0.9835 | 0.0200 | 338 | 1.0131 |
| 0.9741 | 0.0300 | 507 | 0.9742 |
| 0.9371 | 0.0400 | 676 | 0.9535 |
| 0.9055 | 0.0500 | 845 | 0.9388 |
| 0.8902 | 0.0600 | 1014 | 0.9213 |
| 0.9321 | 0.0700 | 1183 | 0.9080 |
| 0.9259 | 0.0800 | 1352 | 0.8983 |
| 0.8817 | 0.0900 | 1521 | 0.8895 |
| 0.8814 | 0.1000 | 1690 | 0.8822 |
| 0.8637 | 0.1100 | 1859 | 0.8758 |
| 0.8912 | 0.1200 | 2028 | 0.8714 |
| 0.8818 | 0.1300 | 2197 | 0.8644 |
| 0.8598 | 0.1400 | 2366 | 0.8553 |
| 0.8799 | 0.1500 | 2535 | 0.8529 |
| 0.8408 | 0.1600 | 2704 | 0.8490 |
| 0.8558 | 0.1700 | 2873 | 0.8473 |
| 0.86 | 0.1800 | 3042 | 0.8415 |
| 0.8346 | 0.1900 | 3211 | 0.8407 |
| 0.8063 | 0.2000 | 3380 | 0.8324 |
| 0.8222 | 0.2100 | 3549 | 0.8292 |
| 0.8259 | 0.2200 | 3718 | 0.8272 |
| 0.7882 | 0.2300 | 3887 | 0.8249 |
| 0.8147 | 0.2400 | 4056 | 0.8219 |
| 0.7923 | 0.2500 | 4225 | 0.8197 |
| 0.8191 | 0.2600 | 4394 | 0.8156 |
| 0.7878 | 0.2700 | 4563 | 0.8162 |
| 0.8186 | 0.2800 | 4732 | 0.8102 |
| 0.8267 | 0.2900 | 4901 | 0.8101 |
| 0.7905 | 0.3000 | 5070 | 0.8060 |
| 0.8469 | 0.3100 | 5239 | 0.8046 |
| 0.8046 | 0.3200 | 5408 | 0.8000 |
| 0.8264 | 0.3300 | 5577 | 0.7998 |
| 0.7819 | 0.3400 | 5746 | 0.7966 |
| 0.7875 | 0.3500 | 5915 | 0.7961 |
| 0.8016 | 0.3600 | 6084 | 0.7923 |
| 0.7858 | 0.3700 | 6253 | 0.7892 |
| 0.7763 | 0.3800 | 6422 | 0.7902 |
| 0.7857 | 0.3900 | 6591 | 0.7866 |
| 0.7835 | 0.4000 | 6760 | 0.7866 |
| 0.774 | 0.4100 | 6929 | 0.7825 |
| 0.7452 | 0.4200 | 7098 | 0.7845 |
| 0.7758 | 0.4301 | 7267 | 0.7812 |
| 0.799 | 0.4401 | 7436 | 0.7797 |
| 0.789 | 0.4501 | 7605 | 0.7780 |
| 0.7706 | 0.4601 | 7774 | 0.7768 |
| 0.7434 | 0.4701 | 7943 | 0.7759 |
| 0.7912 | 0.4801 | 8112 | 0.7737 |
| 0.8031 | 0.4901 | 8281 | 0.7715 |
| 0.8049 | 0.5001 | 8450 | 0.7693 |
| 0.7418 | 0.5101 | 8619 | 0.7681 |
| 0.766 | 0.5201 | 8788 | 0.7695 |
| 0.7152 | 0.5301 | 8957 | 0.7665 |
| 0.7679 | 0.5401 | 9126 | 0.7641 |
| 0.7493 | 0.5501 | 9295 | 0.7629 |
| 0.7385 | 0.5601 | 9464 | 0.7621 |
| 0.7494 | 0.5701 | 9633 | 0.7603 |
| 0.7725 | 0.5801 | 9802 | 0.7589 |
| 0.7643 | 0.5901 | 9971 | 0.7587 |
| 0.769 | 0.6001 | 10140 | 0.7574 |
| 0.7869 | 0.6101 | 10309 | 0.7569 |
| 0.7574 | 0.6201 | 10478 | 0.7545 |
| 0.735 | 0.6301 | 10647 | 0.7532 |
| 0.7243 | 0.6401 | 10816 | 0.7522 |
| 0.7368 | 0.6501 | 10985 | 0.7525 |
| 0.7638 | 0.6601 | 11154 | 0.7509 |
| 0.7481 | 0.6701 | 11323 | 0.7494 |
| 0.7223 | 0.6801 | 11492 | 0.7490 |
| 0.7219 | 0.6901 | 11661 | 0.7480 |
| 0.7361 | 0.7001 | 11830 | 0.7475 |
| 0.768 | 0.7101 | 11999 | 0.7461 |
| 0.7219 | 0.7201 | 12168 | 0.7455 |
| 0.7563 | 0.7301 | 12337 | 0.7446 |
| 0.7458 | 0.7401 | 12506 | 0.7438 |
| 0.7455 | 0.7501 | 12675 | 0.7425 |
| 0.7453 | 0.7601 | 12844 | 0.7423 |
| 0.7596 | 0.7701 | 13013 | 0.7407 |
| 0.7102 | 0.7801 | 13182 | 0.7405 |
| 0.7102 | 0.7901 | 13351 | 0.7402 |
| 0.7293 | 0.8001 | 13520 | 0.7394 |
| 0.7044 | 0.8101 | 13689 | 0.7390 |
| 0.7897 | 0.8201 | 13858 | 0.7381 |
| 0.7463 | 0.8301 | 14027 | 0.7377 |
| 0.7306 | 0.8401 | 14196 | 0.7371 |
| 0.765 | 0.8501 | 14365 | 0.7369 |
| 0.755 | 0.8601 | 14534 | 0.7366 |
| 0.7377 | 0.8701 | 14703 | 0.7360 |
| 0.7587 | 0.8801 | 14872 | 0.7358 |
| 0.7111 | 0.8901 | 15041 | 0.7352 |
| 0.7151 | 0.9001 | 15210 | 0.7349 |
| 0.7218 | 0.9101 | 15379 | 0.7348 |
| 0.7137 | 0.9201 | 15548 | 0.7346 |
| 0.7555 | 0.9301 | 15717 | 0.7344 |
| 0.7076 | 0.9401 | 15886 | 0.7342 |
| 0.732 | 0.9501 | 16055 | 0.7341 |
| 0.7208 | 0.9601 | 16224 | 0.7340 |
| 0.7199 | 0.9701 | 16393 | 0.7340 |
| 0.7498 | 0.9801 | 16562 | 0.7339 |
| 0.7434 | 0.9901 | 16731 | 0.7339 |
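
For intuition, if the reported loss is the usual mean per-token cross-entropy in nats, the final validation loss of 0.7339 corresponds to a perplexity of about 2.08:

```python
import math

# Perplexity implied by the final validation loss, assuming mean
# per-token cross-entropy in nats.
print(math.exp(0.7339))  # ≈ 2.083
```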

Framework versions

  • Transformers 4.57.2
  • PyTorch 2.9.0+cu126
  • Datasets 4.0.0
  • Tokenizers 0.22.1

Safetensors

  • Model size: 70.4M params
  • Tensor type: F32