my_awesome_power_model_llmv2
This model is a fine-tuned version of gpt2 on an unknown dataset. It reports the following results after the final training epoch (only training loss is reported; no evaluation-set metrics are listed):
- Train Loss: 0.0347
- Epoch: 599
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- optimizer: {'name': 'AdamWeightDecay', 'learning_rate': 5e-05, 'decay': 0.0, 'beta_1': 0.9, 'beta_2': 0.999, 'epsilon': 1e-07, 'amsgrad': False, 'weight_decay_rate': 0.01}
- training_precision: float32
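As a rough sketch of how the optimizer settings listed above could be reproduced, the snippet below copies the dictionary into Python and passes it to the `AdamWeightDecay` class from Transformers. The helper names `optimizer_kwargs` and `build_optimizer` are hypothetical and not part of the original training script. Dropping `name` and `decay` is an assumption: `decay` is 0.0, so omitting it should not change the updates.

```python
# Optimizer settings copied from the hyperparameter list above.
OPTIMIZER_CONFIG = {
    "name": "AdamWeightDecay",
    "learning_rate": 5e-05,
    "decay": 0.0,
    "beta_1": 0.9,
    "beta_2": 0.999,
    "epsilon": 1e-07,
    "amsgrad": False,
    "weight_decay_rate": 0.01,
}


def optimizer_kwargs(config):
    """Return constructor arguments for the optimizer (hypothetical helper).

    Assumption: "name" is left at the class default and "decay" (0.0,
    meaning no learning-rate decay) is dropped.
    """
    skipped = {"name", "decay"}
    return {key: value for key, value in config.items() if key not in skipped}


def build_optimizer(config=OPTIMIZER_CONFIG):
    """Build the optimizer; requires TensorFlow and Transformers."""
    # Imported lazily so the rest of this sketch runs without TensorFlow.
    from transformers import AdamWeightDecay

    return AdamWeightDecay(**optimizer_kwargs(config))


print(optimizer_kwargs(OPTIMIZER_CONFIG))
```

Calling `build_optimizer()` returns an optimizer object that can be passed to `model.compile(optimizer=...)` on a TensorFlow model. Whether the original run used exactly this construction is not stated in the card.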
Training results
| Train Loss | Epoch | 
|---|---|
| 14.1299 | 0 | 
| 3.0898 | 1 | 
| 2.8086 | 2 | 
| 2.6899 | 3 | 
| 2.5834 | 4 | 
| 2.5116 | 5 | 
| 2.4435 | 6 | 
| 2.3961 | 7 | 
| 2.3446 | 8 | 
| 2.3011 | 9 | 
| 2.2651 | 10 | 
| 2.2280 | 11 | 
| 2.2007 | 12 | 
| 2.1640 | 13 | 
| 2.1350 | 14 | 
| 2.1105 | 15 | 
| 2.0776 | 16 | 
| 2.0486 | 17 | 
| 2.0297 | 18 | 
| 2.0114 | 19 | 
| 1.9887 | 20 | 
| 1.9679 | 21 | 
| 1.9495 | 22 | 
| 1.9376 | 23 | 
| 1.9145 | 24 | 
| 1.9036 | 25 | 
| 1.8915 | 26 | 
| 1.8738 | 27 | 
| 1.8624 | 28 | 
| 1.8496 | 29 | 
| 1.8310 | 30 | 
| 1.8196 | 31 | 
| 1.8074 | 32 | 
| 1.8021 | 33 | 
| 1.7813 | 34 | 
| 1.7681 | 35 | 
| 1.7548 | 36 | 
| 1.7386 | 37 | 
| 1.7325 | 38 | 
| 1.7149 | 39 | 
| 1.7051 | 40 | 
| 1.7001 | 41 | 
| 1.6815 | 42 | 
| 1.6765 | 43 | 
| 1.6667 | 44 | 
| 1.6528 | 45 | 
| 1.6373 | 46 | 
| 1.6269 | 47 | 
| 1.6237 | 48 | 
| 1.6046 | 49 | 
| 1.6005 | 50 | 
| 1.5919 | 51 | 
| 1.5767 | 52 | 
| 1.5617 | 53 | 
| 1.5556 | 54 | 
| 1.5461 | 55 | 
| 1.5311 | 56 | 
| 1.5313 | 57 | 
| 1.5116 | 58 | 
| 1.5020 | 59 | 
| 1.4975 | 60 | 
| 1.4897 | 61 | 
| 1.4834 | 62 | 
| 1.4677 | 63 | 
| 1.4672 | 64 | 
| 1.4470 | 65 | 
| 1.4409 | 66 | 
| 1.4284 | 67 | 
| 1.4202 | 68 | 
| 1.4174 | 69 | 
| 1.4007 | 70 | 
| 1.3930 | 71 | 
| 1.3868 | 72 | 
| 1.3702 | 73 | 
| 1.3636 | 74 | 
| 1.3557 | 75 | 
| 1.3417 | 76 | 
| 1.3321 | 77 | 
| 1.3206 | 78 | 
| 1.3135 | 79 | 
| 1.3087 | 80 | 
| 1.2974 | 81 | 
| 1.2856 | 82 | 
| 1.2734 | 83 | 
| 1.2660 | 84 | 
| 1.2571 | 85 | 
| 1.2528 | 86 | 
| 1.2330 | 87 | 
| 1.2214 | 88 | 
| 1.2126 | 89 | 
| 1.2075 | 90 | 
| 1.1932 | 91 | 
| 1.1928 | 92 | 
| 1.1717 | 93 | 
| 1.1691 | 94 | 
| 1.1618 | 95 | 
| 1.1453 | 96 | 
| 1.1308 | 97 | 
| 1.1287 | 98 | 
| 1.1187 | 99 | 
| 1.1003 | 100 | 
| 1.0947 | 101 | 
| 1.0822 | 102 | 
| 1.0749 | 103 | 
| 1.0659 | 104 | 
| 1.0546 | 105 | 
| 1.0412 | 106 | 
| 1.0274 | 107 | 
| 1.0248 | 108 | 
| 1.0100 | 109 | 
| 1.0050 | 110 | 
| 0.9935 | 111 | 
| 0.9798 | 112 | 
| 0.9733 | 113 | 
| 0.9604 | 114 | 
| 0.9530 | 115 | 
| 0.9407 | 116 | 
| 0.9290 | 117 | 
| 0.9217 | 118 | 
| 0.9095 | 119 | 
| 0.8929 | 120 | 
| 0.8860 | 121 | 
| 0.8786 | 122 | 
| 0.8684 | 123 | 
| 0.8585 | 124 | 
| 0.8445 | 125 | 
| 0.8398 | 126 | 
| 0.8181 | 127 | 
| 0.8183 | 128 | 
| 0.8030 | 129 | 
| 0.7919 | 130 | 
| 0.7851 | 131 | 
| 0.7743 | 132 | 
| 0.7578 | 133 | 
| 0.7449 | 134 | 
| 0.7329 | 135 | 
| 0.7267 | 136 | 
| 0.7178 | 137 | 
| 0.7089 | 138 | 
| 0.7000 | 139 | 
| 0.6948 | 140 | 
| 0.6842 | 141 | 
| 0.6637 | 142 | 
| 0.6546 | 143 | 
| 0.6454 | 144 | 
| 0.6348 | 145 | 
| 0.6270 | 146 | 
| 0.6150 | 147 | 
| 0.6002 | 148 | 
| 0.5899 | 149 | 
| 0.5803 | 150 | 
| 0.5709 | 151 | 
| 0.5600 | 152 | 
| 0.5534 | 153 | 
| 0.5429 | 154 | 
| 0.5266 | 155 | 
| 0.5207 | 156 | 
| 0.5096 | 157 | 
| 0.4978 | 158 | 
| 0.4878 | 159 | 
| 0.4752 | 160 | 
| 0.4752 | 161 | 
| 0.4633 | 162 | 
| 0.4580 | 163 | 
| 0.4411 | 164 | 
| 0.4268 | 165 | 
| 0.4262 | 166 | 
| 0.4107 | 167 | 
| 0.4053 | 168 | 
| 0.3935 | 169 | 
| 0.4129 | 170 | 
| 0.3874 | 171 | 
| 0.3766 | 172 | 
| 0.3688 | 173 | 
| 0.3505 | 174 | 
| 0.3534 | 175 | 
| 0.3403 | 176 | 
| 0.3310 | 177 | 
| 0.3242 | 178 | 
| 0.3188 | 179 | 
| 0.3130 | 180 | 
| 0.3023 | 181 | 
| 0.2953 | 182 | 
| 0.2907 | 183 | 
| 0.2819 | 184 | 
| 0.2731 | 185 | 
| 0.2706 | 186 | 
| 0.2671 | 187 | 
| 0.2567 | 188 | 
| 0.2512 | 189 | 
| 0.2441 | 190 | 
| 0.2428 | 191 | 
| 0.2378 | 192 | 
| 0.2322 | 193 | 
| 0.2246 | 194 | 
| 0.2223 | 195 | 
| 0.2196 | 196 | 
| 0.2091 | 197 | 
| 0.2052 | 198 | 
| 0.2019 | 199 | 
| 0.2011 | 200 | 
| 0.1975 | 201 | 
| 0.1963 | 202 | 
| 0.1917 | 203 | 
| 0.1898 | 204 | 
| 0.1829 | 205 | 
| 0.1791 | 206 | 
| 0.1733 | 207 | 
| 0.1706 | 208 | 
| 0.1683 | 209 | 
| 0.1646 | 210 | 
| 0.1645 | 211 | 
| 0.1581 | 212 | 
| 0.1533 | 213 | 
| 0.1568 | 214 | 
| 0.1499 | 215 | 
| 0.1490 | 216 | 
| 0.1460 | 217 | 
| 0.1426 | 218 | 
| 0.1444 | 219 | 
| 0.1391 | 220 | 
| 0.1390 | 221 | 
| 0.1380 | 222 | 
| 0.1336 | 223 | 
| 0.1322 | 224 | 
| 0.1316 | 225 | 
| 0.1262 | 226 | 
| 0.1231 | 227 | 
| 0.1235 | 228 | 
| 0.1260 | 229 | 
| 0.1242 | 230 | 
| 0.1218 | 231 | 
| 0.1167 | 232 | 
| 0.1174 | 233 | 
| 0.1169 | 234 | 
| 0.1164 | 235 | 
| 0.1133 | 236 | 
| 0.1138 | 237 | 
| 0.1100 | 238 | 
| 0.1107 | 239 | 
| 0.1079 | 240 | 
| 0.1059 | 241 | 
| 0.1068 | 242 | 
| 0.1023 | 243 | 
| 0.1063 | 244 | 
| 0.1005 | 245 | 
| 0.1014 | 246 | 
| 0.1004 | 247 | 
| 0.0994 | 248 | 
| 0.1061 | 249 | 
| 0.1004 | 250 | 
| 0.0942 | 251 | 
| 0.0975 | 252 | 
| 0.0957 | 253 | 
| 0.0933 | 254 | 
| 0.0924 | 255 | 
| 0.0921 | 256 | 
| 0.0912 | 257 | 
| 0.0897 | 258 | 
| 0.0893 | 259 | 
| 0.0835 | 260 | 
| 0.0861 | 261 | 
| 0.0860 | 262 | 
| 0.0819 | 263 | 
| 0.0830 | 264 | 
| 0.0823 | 265 | 
| 0.0836 | 266 | 
| 0.0800 | 267 | 
| 0.0797 | 268 | 
| 0.0808 | 269 | 
| 0.0785 | 270 | 
| 0.0770 | 271 | 
| 0.0776 | 272 | 
| 0.0780 | 273 | 
| 0.0744 | 274 | 
| 0.0790 | 275 | 
| 0.0765 | 276 | 
| 0.0769 | 277 | 
| 0.0725 | 278 | 
| 0.0740 | 279 | 
| 0.0718 | 280 | 
| 0.0760 | 281 | 
| 0.0741 | 282 | 
| 0.0728 | 283 | 
| 0.0721 | 284 | 
| 0.0726 | 285 | 
| 0.0691 | 286 | 
| 0.0709 | 287 | 
| 0.0710 | 288 | 
| 0.0666 | 289 | 
| 0.0675 | 290 | 
| 0.0690 | 291 | 
| 0.0720 | 292 | 
| 0.0693 | 293 | 
| 0.0685 | 294 | 
| 0.0649 | 295 | 
| 0.0666 | 296 | 
| 0.0669 | 297 | 
| 0.0662 | 298 | 
| 0.0648 | 299 | 
| 0.0663 | 300 | 
| 0.0660 | 301 | 
| 0.0638 | 302 | 
| 0.0628 | 303 | 
| 0.0621 | 304 | 
| 0.0631 | 305 | 
| 0.0611 | 306 | 
| 0.0640 | 307 | 
| 0.0622 | 308 | 
| 0.0643 | 309 | 
| 0.0622 | 310 | 
| 0.0623 | 311 | 
| 0.0607 | 312 | 
| 0.0603 | 313 | 
| 0.0591 | 314 | 
| 0.0620 | 315 | 
| 0.0609 | 316 | 
| 0.0596 | 317 | 
| 0.0594 | 318 | 
| 0.0608 | 319 | 
| 0.0606 | 320 | 
| 0.0587 | 321 | 
| 0.0620 | 322 | 
| 0.0601 | 323 | 
| 0.0590 | 324 | 
| 0.0600 | 325 | 
| 0.0576 | 326 | 
| 0.0581 | 327 | 
| 0.0556 | 328 | 
| 0.0588 | 329 | 
| 0.0561 | 330 | 
| 0.0563 | 331 | 
| 0.0554 | 332 | 
| 0.0596 | 333 | 
| 0.0570 | 334 | 
| 0.0570 | 335 | 
| 0.0552 | 336 | 
| 0.0566 | 337 | 
| 0.0526 | 338 | 
| 0.0528 | 339 | 
| 0.0527 | 340 | 
| 0.0554 | 341 | 
| 0.0574 | 342 | 
| 0.0543 | 343 | 
| 0.0553 | 344 | 
| 0.0530 | 345 | 
| 0.0537 | 346 | 
| 0.0537 | 347 | 
| 0.0536 | 348 | 
| 0.0526 | 349 | 
| 0.0512 | 350 | 
| 0.0506 | 351 | 
| 0.0510 | 352 | 
| 0.0514 | 353 | 
| 0.0496 | 354 | 
| 0.0500 | 355 | 
| 0.0525 | 356 | 
| 0.0533 | 357 | 
| 0.0509 | 358 | 
| 0.0520 | 359 | 
| 0.0523 | 360 | 
| 0.0508 | 361 | 
| 0.0517 | 362 | 
| 0.0513 | 363 | 
| 0.0519 | 364 | 
| 0.0505 | 365 | 
| 0.0490 | 366 | 
| 0.0496 | 367 | 
| 0.0504 | 368 | 
| 0.0467 | 369 | 
| 0.0481 | 370 | 
| 0.0465 | 371 | 
| 0.0480 | 372 | 
| 0.0450 | 373 | 
| 0.0481 | 374 | 
| 0.0515 | 375 | 
| 0.0489 | 376 | 
| 0.0488 | 377 | 
| 0.0481 | 378 | 
| 0.0483 | 379 | 
| 0.0480 | 380 | 
| 0.0490 | 381 | 
| 0.0476 | 382 | 
| 0.0469 | 383 | 
| 0.0489 | 384 | 
| 0.0478 | 385 | 
| 0.0456 | 386 | 
| 0.0465 | 387 | 
| 0.0467 | 388 | 
| 0.0494 | 389 | 
| 0.0506 | 390 | 
| 0.0477 | 391 | 
| 0.0483 | 392 | 
| 0.0449 | 393 | 
| 0.0471 | 394 | 
| 0.0444 | 395 | 
| 0.0469 | 396 | 
| 0.0481 | 397 | 
| 0.0456 | 398 | 
| 0.0448 | 399 | 
| 0.0435 | 400 | 
| 0.0430 | 401 | 
| 0.0441 | 402 | 
| 0.0445 | 403 | 
| 0.0464 | 404 | 
| 0.0469 | 405 | 
| 0.0443 | 406 | 
| 0.0472 | 407 | 
| 0.0458 | 408 | 
| 0.0445 | 409 | 
| 0.0438 | 410 | 
| 0.0443 | 411 | 
| 0.0447 | 412 | 
| 0.0445 | 413 | 
| 0.0436 | 414 | 
| 0.0435 | 415 | 
| 0.0427 | 416 | 
| 0.0429 | 417 | 
| 0.0430 | 418 | 
| 0.0437 | 419 | 
| 0.0445 | 420 | 
| 0.0427 | 421 | 
| 0.0447 | 422 | 
| 0.0447 | 423 | 
| 0.0436 | 424 | 
| 0.0449 | 425 | 
| 0.0445 | 426 | 
| 0.0444 | 427 | 
| 0.0439 | 428 | 
| 0.0426 | 429 | 
| 0.0440 | 430 | 
| 0.0425 | 431 | 
| 0.0418 | 432 | 
| 0.0423 | 433 | 
| 0.0437 | 434 | 
| 0.0431 | 435 | 
| 0.0430 | 436 | 
| 0.0398 | 437 | 
| 0.0405 | 438 | 
| 0.0398 | 439 | 
| 0.0416 | 440 | 
| 0.0407 | 441 | 
| 0.0413 | 442 | 
| 0.0428 | 443 | 
| 0.0414 | 444 | 
| 0.0435 | 445 | 
| 0.0425 | 446 | 
| 0.0411 | 447 | 
| 0.0414 | 448 | 
| 0.0415 | 449 | 
| 0.0436 | 450 | 
| 0.0424 | 451 | 
| 0.0429 | 452 | 
| 0.0400 | 453 | 
| 0.0414 | 454 | 
| 0.0393 | 455 | 
| 0.0389 | 456 | 
| 0.0395 | 457 | 
| 0.0403 | 458 | 
| 0.0386 | 459 | 
| 0.0399 | 460 | 
| 0.0390 | 461 | 
| 0.0379 | 462 | 
| 0.0403 | 463 | 
| 0.0400 | 464 | 
| 0.0396 | 465 | 
| 0.0394 | 466 | 
| 0.0387 | 467 | 
| 0.0401 | 468 | 
| 0.0394 | 469 | 
| 0.0392 | 470 | 
| 0.0418 | 471 | 
| 0.0407 | 472 | 
| 0.0392 | 473 | 
| 0.0414 | 474 | 
| 0.0406 | 475 | 
| 0.0407 | 476 | 
| 0.0409 | 477 | 
| 0.0393 | 478 | 
| 0.0411 | 479 | 
| 0.0399 | 480 | 
| 0.0398 | 481 | 
| 0.0403 | 482 | 
| 0.0382 | 483 | 
| 0.0381 | 484 | 
| 0.0373 | 485 | 
| 0.0390 | 486 | 
| 0.0375 | 487 | 
| 0.0371 | 488 | 
| 0.0393 | 489 | 
| 0.0382 | 490 | 
| 0.0397 | 491 | 
| 0.0389 | 492 | 
| 0.0400 | 493 | 
| 0.0387 | 494 | 
| 0.0388 | 495 | 
| 0.0383 | 496 | 
| 0.0366 | 497 | 
| 0.0380 | 498 | 
| 0.0379 | 499 | 
| 0.0390 | 500 | 
| 0.0401 | 501 | 
| 0.0392 | 502 | 
| 0.0368 | 503 | 
| 0.0386 | 504 | 
| 0.0369 | 505 | 
| 0.0373 | 506 | 
| 0.0376 | 507 | 
| 0.0380 | 508 | 
| 0.0374 | 509 | 
| 0.0401 | 510 | 
| 0.0391 | 511 | 
| 0.0373 | 512 | 
| 0.0383 | 513 | 
| 0.0372 | 514 | 
| 0.0378 | 515 | 
| 0.0384 | 516 | 
| 0.0371 | 517 | 
| 0.0359 | 518 | 
| 0.0354 | 519 | 
| 0.0366 | 520 | 
| 0.0442 | 521 | 
| 0.0393 | 522 | 
| 0.0378 | 523 | 
| 0.0370 | 524 | 
| 0.0382 | 525 | 
| 0.0366 | 526 | 
| 0.0380 | 527 | 
| 0.0370 | 528 | 
| 0.0393 | 529 | 
| 0.0361 | 530 | 
| 0.0364 | 531 | 
| 0.0390 | 532 | 
| 0.0371 | 533 | 
| 0.0367 | 534 | 
| 0.0376 | 535 | 
| 0.0365 | 536 | 
| 0.0371 | 537 | 
| 0.0374 | 538 | 
| 0.0378 | 539 | 
| 0.0355 | 540 | 
| 0.0352 | 541 | 
| 0.0342 | 542 | 
| 0.0348 | 543 | 
| 0.0361 | 544 | 
| 0.0380 | 545 | 
| 0.0367 | 546 | 
| 0.0354 | 547 | 
| 0.0341 | 548 | 
| 0.0352 | 549 | 
| 0.0344 | 550 | 
| 0.0348 | 551 | 
| 0.0354 | 552 | 
| 0.0370 | 553 | 
| 0.0379 | 554 | 
| 0.0362 | 555 | 
| 0.0366 | 556 | 
| 0.0369 | 557 | 
| 0.0355 | 558 | 
| 0.0359 | 559 | 
| 0.0371 | 560 | 
| 0.0359 | 561 | 
| 0.0344 | 562 | 
| 0.0355 | 563 | 
| 0.0361 | 564 | 
| 0.0345 | 565 | 
| 0.0345 | 566 | 
| 0.0348 | 567 | 
| 0.0343 | 568 | 
| 0.0340 | 569 | 
| 0.0351 | 570 | 
| 0.0344 | 571 | 
| 0.0341 | 572 | 
| 0.0350 | 573 | 
| 0.0341 | 574 | 
| 0.0347 | 575 | 
| 0.0336 | 576 | 
| 0.0339 | 577 | 
| 0.0334 | 578 | 
| 0.0340 | 579 | 
| 0.0349 | 580 | 
| 0.0356 | 581 | 
| 0.0353 | 582 | 
| 0.0356 | 583 | 
| 0.0369 | 584 | 
| 0.0360 | 585 | 
| 0.0358 | 586 | 
| 0.0354 | 587 | 
| 0.0350 | 588 | 
| 0.0359 | 589 | 
| 0.0363 | 590 | 
| 0.0342 | 591 | 
| 0.0355 | 592 | 
| 0.0352 | 593 | 
| 0.0337 | 594 | 
| 0.0333 | 595 | 
| 0.0343 | 596 | 
| 0.0352 | 597 | 
| 0.0333 | 598 | 
| 0.0347 | 599 | 
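To put the loss values in the table on a more interpretable scale, the sketch below converts a loss to perplexity. It assumes the reported train loss is the mean per-token cross-entropy in natural-log units, which is the usual convention for causal language-model training but is not confirmed by this card. The function name `perplexity` is illustrative.

```python
import math


def perplexity(cross_entropy_loss):
    """Convert a mean per-token cross-entropy (in nats) to perplexity."""
    return math.exp(cross_entropy_loss)


# Train loss values taken from the table above.
loss_epoch_1 = 3.0898
loss_epoch_599 = 0.0347

print(f"epoch 1 train perplexity:   {perplexity(loss_epoch_1):.2f}")
print(f"epoch 599 train perplexity: {perplexity(loss_epoch_599):.4f}")
```

Under that assumption, the final loss corresponds to a training perplexity of about 1.035. This is measured on the training data only, so it says nothing about performance on held-out text.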
Framework versions
- Transformers 4.35.2
- TensorFlow 2.15.0
- Datasets 2.16.1
- Tokenizers 0.15.1
Model tree for MohamedAAK/my_awesome_power_model_llmv2
Base model
openai-community/gpt2