AdoCleanCode committed on
Commit 9df8779 · verified · 1 Parent(s): e7d7d87

Model save

README.md CHANGED
@@ -16,7 +16,7 @@ should probably proofread and complete it, then remove this comment. -->
 
  This model is a fine-tuned version of [gpt2](https://huggingface.co/gpt2) on an unknown dataset.
  It achieves the following results on the evaluation set:
- - Loss: 4.8017
 
  ## Model description
 
@@ -35,70 +35,25 @@ More information needed
  ### Training hyperparameters
 
  The following hyperparameters were used during training:
- - learning_rate: 1e-05
  - train_batch_size: 8
  - eval_batch_size: 16
  - seed: 42
  - optimizer: Use adamw_torch with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
  - lr_scheduler_type: linear
- - lr_scheduler_warmup_steps: 100
- - num_epochs: 50
  - mixed_precision_training: Native AMP
 
  ### Training results
 
  | Training Loss | Epoch | Step | Validation Loss |
  |:-------------:|:-----:|:-----:|:---------------:|
- | 4.1358 | 1.0 | 285 | 4.0575 |
- | 3.8365 | 2.0 | 570 | 3.9368 |
- | 3.7408 | 3.0 | 855 | 3.9005 |
- | 3.8123 | 4.0 | 1140 | 3.8798 |
- | 3.637 | 5.0 | 1425 | 3.8924 |
- | 3.2996 | 6.0 | 1710 | 3.8942 |
- | 3.5063 | 7.0 | 1995 | 3.8985 |
- | 3.4073 | 8.0 | 2280 | 3.9164 |
- | 3.2897 | 9.0 | 2565 | 3.9335 |
- | 3.2355 | 10.0 | 2850 | 3.9501 |
- | 3.2153 | 11.0 | 3135 | 3.9670 |
- | 3.0633 | 12.0 | 3420 | 3.9940 |
- | 3.0258 | 13.0 | 3705 | 4.0125 |
- | 2.8951 | 14.0 | 3990 | 4.0485 |
- | 2.9628 | 15.0 | 4275 | 4.0615 |
- | 2.6961 | 16.0 | 4560 | 4.0907 |
- | 2.8086 | 17.0 | 4845 | 4.1207 |
- | 2.7014 | 18.0 | 5130 | 4.1463 |
- | 2.6813 | 19.0 | 5415 | 4.1685 |
- | 2.5686 | 20.0 | 5700 | 4.2127 |
- | 2.4509 | 21.0 | 5985 | 4.2431 |
- | 2.5327 | 22.0 | 6270 | 4.2569 |
- | 2.4029 | 23.0 | 6555 | 4.3080 |
- | 2.5409 | 24.0 | 6840 | 4.3201 |
- | 2.4863 | 25.0 | 7125 | 4.3456 |
- | 2.2923 | 26.0 | 7410 | 4.4077 |
- | 2.3704 | 27.0 | 7695 | 4.4213 |
- | 2.239 | 28.0 | 7980 | 4.4589 |
- | 2.4065 | 29.0 | 8265 | 4.4888 |
- | 2.1606 | 30.0 | 8550 | 4.5011 |
- | 2.3792 | 31.0 | 8835 | 4.5244 |
- | 2.0402 | 32.0 | 9120 | 4.5647 |
- | 2.2368 | 33.0 | 9405 | 4.5788 |
- | 2.1341 | 34.0 | 9690 | 4.6060 |
- | 2.0746 | 35.0 | 9975 | 4.6244 |
- | 2.1967 | 36.0 | 10260 | 4.6548 |
- | 2.0212 | 37.0 | 10545 | 4.6723 |
- | 2.0272 | 38.0 | 10830 | 4.6886 |
- | 2.0901 | 39.0 | 11115 | 4.7127 |
- | 2.1051 | 40.0 | 11400 | 4.7235 |
- | 2.0967 | 41.0 | 11685 | 4.7322 |
- | 1.9759 | 42.0 | 11970 | 4.7475 |
- | 1.9597 | 43.0 | 12255 | 4.7659 |
- | 1.9472 | 44.0 | 12540 | 4.7717 |
- | 1.9566 | 45.0 | 12825 | 4.7852 |
- | 2.1209 | 46.0 | 13110 | 4.7891 |
- | 1.9769 | 47.0 | 13395 | 4.7927 |
- | 1.8431 | 48.0 | 13680 | 4.7993 |
- | 1.8459 | 49.0 | 13965 | 4.8010 |
- | 2.0649 | 50.0 | 14250 | 4.8017 |
 
 
  ### Framework versions
 
 
  This model is a fine-tuned version of [gpt2](https://huggingface.co/gpt2) on an unknown dataset.
  It achieves the following results on the evaluation set:
+ - Loss: 3.3080
 
  ## Model description
 
 
  ### Training hyperparameters
 
  The following hyperparameters were used during training:
+ - learning_rate: 2e-05
  - train_batch_size: 8
  - eval_batch_size: 16
  - seed: 42
  - optimizer: Use adamw_torch with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
  - lr_scheduler_type: linear
+ - lr_scheduler_warmup_steps: 400
+ - num_epochs: 5
  - mixed_precision_training: Native AMP
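
To make the new schedule concrete, here is a minimal sketch of the learning rate implied by `lr_scheduler_type: linear` with 400 warmup steps and a peak of 2e-05. It assumes the usual shape for this scheduler (linear ramp from 0 to the peak, then linear decay to 0) and takes the total of 37950 optimizer steps from the last row of the new results table (5 epochs of 7590 steps). The function name `linear_lr` is hypothetical and is not part of the commit.

```python
# Values taken from the updated model card; the schedule shape is an assumption.
PEAK_LR = 2e-5          # learning_rate
WARMUP_STEPS = 400      # lr_scheduler_warmup_steps
STEPS_PER_EPOCH = 7590  # step count reported at epoch 1.0
NUM_EPOCHS = 5          # num_epochs
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS  # 37950, matches the final table row


def linear_lr(step: int,
              peak_lr: float = PEAK_LR,
              warmup_steps: int = WARMUP_STEPS,
              total_steps: int = TOTAL_STEPS) -> float:
    """Learning rate at a given optimizer step: linear warmup, then linear decay."""
    if step < warmup_steps:
        # Ramp up from 0 to peak_lr over the warmup steps.
        return peak_lr * step / max(1, warmup_steps)
    # Decay from peak_lr at the end of warmup down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    for s in (0, 200, 400, STEPS_PER_EPOCH, TOTAL_STEPS):
        print(f"step {s:>6}: lr = {linear_lr(s):.3e}")
```

With `train_batch_size: 8`, 7590 steps per epoch would correspond to roughly 60,720 training examples per epoch, but only if training ran on a single device without gradient accumulation, which the card does not state.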
 
  ### Training results
 
  | Training Loss | Epoch | Step | Validation Loss |
  |:-------------:|:-----:|:-----:|:---------------:|
+ | 3.3201 | 1.0 | 7590 | 3.4258 |
+ | 3.2002 | 2.0 | 15180 | 3.3526 |
+ | 3.1497 | 3.0 | 22770 | 3.3187 |
+ | 3.0062 | 4.0 | 30360 | 3.3028 |
+ | 3.0219 | 5.0 | 37950 | 3.3080 |
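
The validation losses in the new table can be read more easily as perplexities. The sketch below assumes the reported loss is the mean per-token cross-entropy in nats, so that perplexity is `exp(loss)`; the model card does not say this explicitly. The helper names `perplexity` and `best_epoch` are hypothetical.

```python
import math

# Validation loss per epoch, copied from the new results table.
NEW_RUN = {1: 3.4258, 2: 3.3526, 3: 3.3187, 4: 3.3028, 5: 3.3080}

# Final evaluation losses reported before and after this commit.
OLD_FINAL_LOSS = 4.8017
NEW_FINAL_LOSS = 3.3080


def perplexity(loss: float) -> float:
    """Perplexity for a mean cross-entropy loss measured in nats."""
    return math.exp(loss)


def best_epoch(val_losses: dict[int, float]) -> int:
    """Epoch with the lowest validation loss."""
    return min(val_losses, key=val_losses.get)


if __name__ == "__main__":
    print(f"old final perplexity: {perplexity(OLD_FINAL_LOSS):.1f}")
    print(f"new final perplexity: {perplexity(NEW_FINAL_LOSS):.1f}")
    print(f"best epoch in new run: {best_epoch(NEW_RUN)}")
```

In the new run the lowest validation loss is at epoch 4 (3.3028), slightly below the final value of 3.3080 that the card reports. In the previous 50-epoch run, validation loss bottomed out at 3.8798 at epoch 4 and rose in most later epochs while training loss kept falling, which is the usual signature of overfitting.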
 
 
  ### Framework versions
logs/events.out.tfevents.1763566184.tikgpu10.939660.4 CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:18c5c28c7631483fed0a36697c53dcb0b87ffa030cf11e8a6407384088ddbc97
- size 815769
 
  version https://git-lfs.github.com/spec/v1
+ oid sha256:c0e5cfdeb7b4fa4856ffd88ea080b6fdadb5412c31231f10c7a8addc5510b4a4
+ size 816405
model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:442d22f97b943bc6085262e3c199f4ad782bdee4a2f7c1b31afa4239bbcc3039
  size 497774208
 
  version https://git-lfs.github.com/spec/v1
+ oid sha256:19b98e4b0580bf736acb417bd509373be5a8b2d0bf2f26af29476ccf4d346055
  size 497774208
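
The two file diffs above change only Git LFS pointer files: small text stubs that record the SHA-256 digest and byte size of the real object. As a rough sketch, a downloaded `model.safetensors` could be checked against the new pointer as follows. The helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical, and the local file path is whatever the reader has downloaded.

```python
import hashlib
from pathlib import Path

# New pointer for model.safetensors, copied from the diff above.
NEW_MODEL_POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:19b98e4b0580bf736acb417bd509373be5a8b2d0bf2f26af29476ccf4d346055
size 497774208
"""


def parse_lfs_pointer(text: str) -> dict:
    """Parse the `key value` lines of a Git LFS pointer file."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,            # e.g. "sha256"
        "digest": digest,        # hex digest of the real file
        "size": int(fields["size"]),
    }


def matches_pointer(path: str, pointer: dict) -> bool:
    """Return True if the file at `path` has the size and digest in `pointer`."""
    file_path = Path(path)
    if file_path.stat().st_size != pointer["size"]:
        return False
    hasher = hashlib.new(pointer["algo"])
    with file_path.open("rb") as handle:
        # Read in 1 MiB chunks so large checkpoints are not loaded into memory.
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


if __name__ == "__main__":
    pointer = parse_lfs_pointer(NEW_MODEL_POINTER)
    print(pointer["algo"], pointer["size"])
```

The size is identical before and after the commit (497774208 bytes) because the architecture is unchanged; only the digest differs, reflecting the retrained weights.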