# chessllm_FPT
This model is a fine-tuned version of EleutherAI/pythia-70m-deduped on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 0.7339
## Model description
More information needed
## Intended uses & limitations
More information needed
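While the card does not document intended use, the model can be loaded like any causal language model on the Hub. The sketch below is an assumption: the move notation the model was trained on (PGN vs. UCI) is not stated in this card, so the example prompt uses standard PGN and may not match the training format.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical usage sketch; the prompt format (PGN here) is an assumption,
# since the training data and its notation are not documented in this card.
model_id = "huyvux3005/chessllm_FPT"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

prompt = "1. e4 e5 2. Nf3"
inputs = tokenizer(prompt, return_tensors="pt")
# Greedy decoding; sampling settings were not documented, so none are assumed.
outputs = model.generate(**inputs, max_new_tokens=16, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```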
## Training and evaluation data
More information needed
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0001
- train_batch_size: 128
- eval_batch_size: 128
- seed: 42
- optimizer: AdamW (torch fused) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: cosine
- num_epochs: 1
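The hyperparameters above map directly onto a `transformers.TrainingArguments` configuration. This is a sketch for reproducibility only: the dataset, data collator, and `Trainer` call are omitted because the training data is not documented in this card, and the `output_dir` name is an assumption.

```python
from transformers import TrainingArguments

# Sketch of the reported hyperparameters; output_dir is a placeholder,
# and the dataset/Trainer wiring is omitted (training data undocumented).
args = TrainingArguments(
    output_dir="chessllm_FPT",
    learning_rate=1e-4,
    per_device_train_batch_size=128,
    per_device_eval_batch_size=128,
    seed=42,
    optim="adamw_torch_fused",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    num_train_epochs=1,
)
```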
### Training results
| Training Loss | Epoch | Step | Validation Loss |
|---|---|---|---|
| 1.0862 | 0.0100 | 169 | 1.0676 |
| 0.9835 | 0.0200 | 338 | 1.0131 |
| 0.9741 | 0.0300 | 507 | 0.9742 |
| 0.9371 | 0.0400 | 676 | 0.9535 |
| 0.9055 | 0.0500 | 845 | 0.9388 |
| 0.8902 | 0.0600 | 1014 | 0.9213 |
| 0.9321 | 0.0700 | 1183 | 0.9080 |
| 0.9259 | 0.0800 | 1352 | 0.8983 |
| 0.8817 | 0.0900 | 1521 | 0.8895 |
| 0.8814 | 0.1000 | 1690 | 0.8822 |
| 0.8637 | 0.1100 | 1859 | 0.8758 |
| 0.8912 | 0.1200 | 2028 | 0.8714 |
| 0.8818 | 0.1300 | 2197 | 0.8644 |
| 0.8598 | 0.1400 | 2366 | 0.8553 |
| 0.8799 | 0.1500 | 2535 | 0.8529 |
| 0.8408 | 0.1600 | 2704 | 0.8490 |
| 0.8558 | 0.1700 | 2873 | 0.8473 |
| 0.86 | 0.1800 | 3042 | 0.8415 |
| 0.8346 | 0.1900 | 3211 | 0.8407 |
| 0.8063 | 0.2000 | 3380 | 0.8324 |
| 0.8222 | 0.2100 | 3549 | 0.8292 |
| 0.8259 | 0.2200 | 3718 | 0.8272 |
| 0.7882 | 0.2300 | 3887 | 0.8249 |
| 0.8147 | 0.2400 | 4056 | 0.8219 |
| 0.7923 | 0.2500 | 4225 | 0.8197 |
| 0.8191 | 0.2600 | 4394 | 0.8156 |
| 0.7878 | 0.2700 | 4563 | 0.8162 |
| 0.8186 | 0.2800 | 4732 | 0.8102 |
| 0.8267 | 0.2900 | 4901 | 0.8101 |
| 0.7905 | 0.3000 | 5070 | 0.8060 |
| 0.8469 | 0.3100 | 5239 | 0.8046 |
| 0.8046 | 0.3200 | 5408 | 0.8000 |
| 0.8264 | 0.3300 | 5577 | 0.7998 |
| 0.7819 | 0.3400 | 5746 | 0.7966 |
| 0.7875 | 0.3500 | 5915 | 0.7961 |
| 0.8016 | 0.3600 | 6084 | 0.7923 |
| 0.7858 | 0.3700 | 6253 | 0.7892 |
| 0.7763 | 0.3800 | 6422 | 0.7902 |
| 0.7857 | 0.3900 | 6591 | 0.7866 |
| 0.7835 | 0.4000 | 6760 | 0.7866 |
| 0.774 | 0.4100 | 6929 | 0.7825 |
| 0.7452 | 0.4200 | 7098 | 0.7845 |
| 0.7758 | 0.4301 | 7267 | 0.7812 |
| 0.799 | 0.4401 | 7436 | 0.7797 |
| 0.789 | 0.4501 | 7605 | 0.7780 |
| 0.7706 | 0.4601 | 7774 | 0.7768 |
| 0.7434 | 0.4701 | 7943 | 0.7759 |
| 0.7912 | 0.4801 | 8112 | 0.7737 |
| 0.8031 | 0.4901 | 8281 | 0.7715 |
| 0.8049 | 0.5001 | 8450 | 0.7693 |
| 0.7418 | 0.5101 | 8619 | 0.7681 |
| 0.766 | 0.5201 | 8788 | 0.7695 |
| 0.7152 | 0.5301 | 8957 | 0.7665 |
| 0.7679 | 0.5401 | 9126 | 0.7641 |
| 0.7493 | 0.5501 | 9295 | 0.7629 |
| 0.7385 | 0.5601 | 9464 | 0.7621 |
| 0.7494 | 0.5701 | 9633 | 0.7603 |
| 0.7725 | 0.5801 | 9802 | 0.7589 |
| 0.7643 | 0.5901 | 9971 | 0.7587 |
| 0.769 | 0.6001 | 10140 | 0.7574 |
| 0.7869 | 0.6101 | 10309 | 0.7569 |
| 0.7574 | 0.6201 | 10478 | 0.7545 |
| 0.735 | 0.6301 | 10647 | 0.7532 |
| 0.7243 | 0.6401 | 10816 | 0.7522 |
| 0.7368 | 0.6501 | 10985 | 0.7525 |
| 0.7638 | 0.6601 | 11154 | 0.7509 |
| 0.7481 | 0.6701 | 11323 | 0.7494 |
| 0.7223 | 0.6801 | 11492 | 0.7490 |
| 0.7219 | 0.6901 | 11661 | 0.7480 |
| 0.7361 | 0.7001 | 11830 | 0.7475 |
| 0.768 | 0.7101 | 11999 | 0.7461 |
| 0.7219 | 0.7201 | 12168 | 0.7455 |
| 0.7563 | 0.7301 | 12337 | 0.7446 |
| 0.7458 | 0.7401 | 12506 | 0.7438 |
| 0.7455 | 0.7501 | 12675 | 0.7425 |
| 0.7453 | 0.7601 | 12844 | 0.7423 |
| 0.7596 | 0.7701 | 13013 | 0.7407 |
| 0.7102 | 0.7801 | 13182 | 0.7405 |
| 0.7102 | 0.7901 | 13351 | 0.7402 |
| 0.7293 | 0.8001 | 13520 | 0.7394 |
| 0.7044 | 0.8101 | 13689 | 0.7390 |
| 0.7897 | 0.8201 | 13858 | 0.7381 |
| 0.7463 | 0.8301 | 14027 | 0.7377 |
| 0.7306 | 0.8401 | 14196 | 0.7371 |
| 0.765 | 0.8501 | 14365 | 0.7369 |
| 0.755 | 0.8601 | 14534 | 0.7366 |
| 0.7377 | 0.8701 | 14703 | 0.7360 |
| 0.7587 | 0.8801 | 14872 | 0.7358 |
| 0.7111 | 0.8901 | 15041 | 0.7352 |
| 0.7151 | 0.9001 | 15210 | 0.7349 |
| 0.7218 | 0.9101 | 15379 | 0.7348 |
| 0.7137 | 0.9201 | 15548 | 0.7346 |
| 0.7555 | 0.9301 | 15717 | 0.7344 |
| 0.7076 | 0.9401 | 15886 | 0.7342 |
| 0.732 | 0.9501 | 16055 | 0.7341 |
| 0.7208 | 0.9601 | 16224 | 0.7340 |
| 0.7199 | 0.9701 | 16393 | 0.7340 |
| 0.7498 | 0.9801 | 16562 | 0.7339 |
| 0.7434 | 0.9901 | 16731 | 0.7339 |
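The table logs validation every 169 optimizer steps, with rows spaced 0.01 epoch apart, implying roughly 16,900 steps per epoch; at a batch size of 128 that corresponds to about 2.16M training examples per epoch. This back-of-the-envelope check assumes no gradient accumulation, which the card does not mention either way:

```python
# Consistency check on the logged training schedule.
eval_interval = 169        # steps between validation rows in the table
evals_per_epoch = 100      # rows are spaced 0.01 epoch apart
steps_per_epoch = eval_interval * evals_per_epoch  # 16,900

# The last row (step 16731) should land at ~0.99 of an epoch, matching the table.
assert round(16731 / steps_per_epoch, 2) == 0.99

# Implied dataset size, assuming batch size 128 and no gradient accumulation.
train_batch_size = 128
examples_per_epoch = steps_per_epoch * train_batch_size
print(examples_per_epoch)  # 2163200
```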
### Framework versions
- Transformers 4.57.2
- PyTorch 2.9.0+cu126
- Datasets 4.0.0
- Tokenizers 0.22.1
## Model tree for huyvux3005/chessllm_FPT
- Base model: EleutherAI/pythia-70m-deduped