regex-GPT2pretrain

This model is a fine-tuned version of gpt2 on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 5.0016
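
Since this loss is a token-level cross-entropy in nats, it corresponds to a perplexity of exp(5.0016) ≈ 148.7, which can be checked directly:

```python
import math

# Perplexity implied by the reported evaluation loss (cross-entropy in nats).
eval_loss = 5.0016
perplexity = math.exp(eval_loss)
print(f"perplexity ≈ {perplexity:.1f}")  # ≈ 148.7
```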

Model description

More information needed

Intended uses & limitations

More information needed
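
No intended uses are documented yet. As a generic sketch, the checkpoint loads like any GPT-2 causal language model; the model id is taken from this card's repository name, and the prompt below is a placeholder:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "zehao888/regex-GPT2pretrain"  # repo name from this card
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Placeholder prompt; the card does not describe the training domain.
inputs = tokenizer("example input", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```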

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (see the configuration sketch after this list):

  • learning_rate: 0.0003
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: AdamW (torch fused) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 50
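
As a minimal sketch, these values map onto the Transformers `TrainingArguments` API roughly as follows; `output_dir` is an assumption, and the model, dataset, and `Trainer` wiring are not documented in this card:

```python
from transformers import TrainingArguments

# Hyperparameters as reported above; everything else is left at defaults.
training_args = TrainingArguments(
    output_dir="regex-GPT2pretrain",  # assumed; not stated in the card
    learning_rate=3e-4,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    optim="adamw_torch_fused",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=50,
)
```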

Training results

| Training Loss | Epoch   | Step | Validation Loss |
|:-------------:|:-------:|:----:|:---------------:|
| 6.6944        | 0.9119  | 300  | 6.2799          |
| 5.8024        | 1.8237  | 600  | 5.7527          |
| 5.3033        | 2.7356  | 900  | 5.5474          |
| 5.0624        | 3.6474  | 1200 | 5.4211          |
| 4.7656        | 4.5593  | 1500 | 5.3018          |
| 4.5517        | 5.4711  | 1800 | 5.2248          |
| 4.3684        | 6.3830  | 2100 | 5.1723          |
| 4.2025        | 7.2948  | 2400 | 5.1078          |
| 4.0043        | 8.2067  | 2700 | 5.0954          |
| 3.8377        | 9.1185  | 3000 | 5.0395          |
| 3.6478        | 10.0304 | 3300 | 5.0016          |
| 3.478         | 10.9422 | 3600 | 5.0053          |
| 3.2909        | 11.8541 | 3900 | 5.0227          |
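
Validation loss reaches its minimum (5.0016) around epoch 10 and then begins to rise while training loss keeps falling, a typical overfitting signature; although num_epochs was set to 50, the log ends near epoch 12, consistent with training being stopped early. One hypothetical way to capture the best checkpoint with the Trainer API (the values below are assumptions, not from the card):

```python
from transformers import EarlyStoppingCallback, TrainingArguments

# Evaluate and save every 300 steps (matching the table's step interval),
# and restore the checkpoint with the lowest validation loss at the end.
training_args = TrainingArguments(
    output_dir="regex-GPT2pretrain",  # assumed
    eval_strategy="steps",
    eval_steps=300,
    save_steps=300,
    load_best_model_at_end=True,
    metric_for_best_model="eval_loss",
    greater_is_better=False,
)

# Pass to Trainer(..., callbacks=[early_stop]) to halt after three
# consecutive evaluations without improvement.
early_stop = EarlyStoppingCallback(early_stopping_patience=3)
```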

Framework versions

  • Transformers 4.57.1
  • PyTorch 2.8.0+cu126
  • Datasets 4.0.0
  • Tokenizers 0.22.1
