yalhessi committed
Commit 622f0ec · verified · 1 Parent(s): 47cd533

End of training

Files changed (1):
1. README.md (+65 −65)
README.md CHANGED
@@ -16,7 +16,7 @@ should probably proofread and complete it, then remove this comment. -->
 
  This model is a fine-tuned version of [deepseek-ai/deepseek-coder-1.3b-base](https://huggingface.co/deepseek-ai/deepseek-coder-1.3b-base) on an unknown dataset.
  It achieves the following results on the evaluation set:
- - Loss: 0.1511
+ - Loss: 0.1539
 
  ## Model description
 
@@ -52,71 +52,71 @@ The following hyperparameters were used during training:
 
  | Training Loss | Epoch | Step | Validation Loss |
  |:-------------:|:-------:|:-----:|:---------------:|
- | 0.3742 | 0.2001 | 720 | 0.2844 |
- | 0.2859 | 0.4001 | 1440 | 0.2708 |
- | 0.2527 | 0.6002 | 2160 | 0.2488 |
- | 0.2441 | 0.8002 | 2880 | 0.2391 |
- | 0.2331 | 1.0003 | 3600 | 0.2389 |
- | 0.2201 | 1.2003 | 4320 | 0.2332 |
- | 0.2157 | 1.4004 | 5040 | 0.2221 |
- | 0.2122 | 1.6004 | 5760 | 0.2217 |
- | 0.2158 | 1.8005 | 6480 | 0.2127 |
- | 0.2063 | 2.0006 | 7200 | 0.2054 |
- | 0.1961 | 2.2006 | 7920 | 0.2097 |
- | 0.1942 | 2.4007 | 8640 | 0.2018 |
- | 0.1904 | 2.6007 | 9360 | 0.1997 |
- | 0.1914 | 2.8008 | 10080 | 0.2001 |
- | 0.193 | 3.0008 | 10800 | 0.1980 |
- | 0.1803 | 3.2009 | 11520 | 0.1980 |
- | 0.1785 | 3.4009 | 12240 | 0.1982 |
- | 0.1758 | 3.6010 | 12960 | 0.1906 |
- | 0.1756 | 3.8011 | 13680 | 0.1871 |
- | 0.1773 | 4.0011 | 14400 | 0.1877 |
- | 0.1631 | 4.2012 | 15120 | 0.1840 |
- | 0.1665 | 4.4012 | 15840 | 0.1805 |
- | 0.1625 | 4.6013 | 16560 | 0.1867 |
- | 0.164 | 4.8013 | 17280 | 0.1768 |
- | 0.1593 | 5.0014 | 18000 | 0.1796 |
- | 0.1508 | 5.2014 | 18720 | 0.1730 |
- | 0.151 | 5.4015 | 19440 | 0.1723 |
- | 0.1519 | 5.6016 | 20160 | 0.1722 |
- | 0.1522 | 5.8016 | 20880 | 0.1705 |
- | 0.1465 | 6.0017 | 21600 | 0.1705 |
- | 0.1391 | 6.2017 | 22320 | 0.1641 |
- | 0.1392 | 6.4018 | 23040 | 0.1660 |
- | 0.1374 | 6.6018 | 23760 | 0.1610 |
- | 0.1369 | 6.8019 | 24480 | 0.1630 |
- | 0.1358 | 7.0019 | 25200 | 0.1565 |
- | 0.1258 | 7.2020 | 25920 | 0.1624 |
- | 0.1259 | 7.4021 | 26640 | 0.1605 |
- | 0.1268 | 7.6021 | 27360 | 0.1555 |
- | 0.1231 | 7.8022 | 28080 | 0.1508 |
- | 0.1244 | 8.0022 | 28800 | 0.1523 |
- | 0.1125 | 8.2023 | 29520 | 0.1530 |
- | 0.1107 | 8.4023 | 30240 | 0.1507 |
- | 0.1125 | 8.6024 | 30960 | 0.1531 |
- | 0.1109 | 8.8024 | 31680 | 0.1498 |
- | 0.1115 | 9.0025 | 32400 | 0.1488 |
- | 0.0983 | 9.2026 | 33120 | 0.1506 |
- | 0.0992 | 9.4026 | 33840 | 0.1500 |
- | 0.0999 | 9.6027 | 34560 | 0.1479 |
- | 0.0994 | 9.8027 | 35280 | 0.1443 |
- | 0.0951 | 10.0028 | 36000 | 0.1479 |
- | 0.0856 | 10.2028 | 36720 | 0.1501 |
- | 0.0847 | 10.4029 | 37440 | 0.1473 |
- | 0.0856 | 10.6029 | 38160 | 0.1471 |
- | 0.0847 | 10.8030 | 38880 | 0.1452 |
- | 0.0859 | 11.0031 | 39600 | 0.1483 |
- | 0.0759 | 11.2031 | 40320 | 0.1518 |
- | 0.0749 | 11.4032 | 41040 | 0.1502 |
- | 0.0743 | 11.6032 | 41760 | 0.1519 |
- | 0.0741 | 11.8033 | 42480 | 0.1511 |
+ | 0.377 | 0.2001 | 720 | 0.2894 |
+ | 0.2824 | 0.4001 | 1440 | 0.2605 |
+ | 0.2488 | 0.6002 | 2160 | 0.2486 |
+ | 0.2424 | 0.8002 | 2880 | 0.2434 |
+ | 0.2307 | 1.0003 | 3600 | 0.2332 |
+ | 0.2171 | 1.2003 | 4320 | 0.2317 |
+ | 0.215 | 1.4004 | 5040 | 0.2200 |
+ | 0.2101 | 1.6004 | 5760 | 0.2177 |
+ | 0.209 | 1.8005 | 6480 | 0.2163 |
+ | 0.2047 | 2.0006 | 7200 | 0.2175 |
+ | 0.1972 | 2.2006 | 7920 | 0.2108 |
+ | 0.1934 | 2.4007 | 8640 | 0.2045 |
+ | 0.1906 | 2.6007 | 9360 | 0.1991 |
+ | 0.1889 | 2.8008 | 10080 | 0.2008 |
+ | 0.1911 | 3.0008 | 10800 | 0.2007 |
+ | 0.1789 | 3.2009 | 11520 | 0.1973 |
+ | 0.1799 | 3.4009 | 12240 | 0.1947 |
+ | 0.1769 | 3.6010 | 12960 | 0.1933 |
+ | 0.176 | 3.8011 | 13680 | 0.1897 |
+ | 0.1753 | 4.0011 | 14400 | 0.1833 |
+ | 0.1645 | 4.2012 | 15120 | 0.1833 |
+ | 0.1684 | 4.4012 | 15840 | 0.1864 |
+ | 0.1626 | 4.6013 | 16560 | 0.1839 |
+ | 0.1649 | 4.8013 | 17280 | 0.1761 |
+ | 0.1613 | 5.0014 | 18000 | 0.1803 |
+ | 0.1523 | 5.2014 | 18720 | 0.1755 |
+ | 0.1531 | 5.4015 | 19440 | 0.1766 |
+ | 0.1539 | 5.6016 | 20160 | 0.1726 |
+ | 0.1527 | 5.8016 | 20880 | 0.1778 |
+ | 0.1483 | 6.0017 | 21600 | 0.1640 |
+ | 0.1404 | 6.2017 | 22320 | 0.1684 |
+ | 0.1426 | 6.4018 | 23040 | 0.1666 |
+ | 0.1407 | 6.6018 | 23760 | 0.1650 |
+ | 0.1413 | 6.8019 | 24480 | 0.1635 |
+ | 0.1386 | 7.0019 | 25200 | 0.1640 |
+ | 0.1307 | 7.2020 | 25920 | 0.1627 |
+ | 0.1291 | 7.4021 | 26640 | 0.1620 |
+ | 0.1293 | 7.6021 | 27360 | 0.1626 |
+ | 0.128 | 7.8022 | 28080 | 0.1584 |
+ | 0.1288 | 8.0022 | 28800 | 0.1580 |
+ | 0.1182 | 8.2023 | 29520 | 0.1571 |
+ | 0.1166 | 8.4023 | 30240 | 0.1539 |
+ | 0.1178 | 8.6024 | 30960 | 0.1571 |
+ | 0.1154 | 8.8024 | 31680 | 0.1547 |
+ | 0.1163 | 9.0025 | 32400 | 0.1549 |
+ | 0.1056 | 9.2026 | 33120 | 0.1567 |
+ | 0.1046 | 9.4026 | 33840 | 0.1526 |
+ | 0.1063 | 9.6027 | 34560 | 0.1542 |
+ | 0.1063 | 9.8027 | 35280 | 0.1481 |
+ | 0.1022 | 10.0028 | 36000 | 0.1518 |
+ | 0.0939 | 10.2028 | 36720 | 0.1546 |
+ | 0.0933 | 10.4029 | 37440 | 0.1507 |
+ | 0.0933 | 10.6029 | 38160 | 0.1510 |
+ | 0.0927 | 10.8030 | 38880 | 0.1494 |
+ | 0.0938 | 11.0031 | 39600 | 0.1498 |
+ | 0.085 | 11.2031 | 40320 | 0.1557 |
+ | 0.0839 | 11.4032 | 41040 | 0.1534 |
+ | 0.0828 | 11.6032 | 41760 | 0.1534 |
+ | 0.0829 | 11.8033 | 42480 | 0.1539 |
 
 
  ### Framework versions
 
- - PEFT 0.15.2
- - Transformers 4.57.1
- - Pytorch 2.7.0+cu126
- - Datasets 4.3.0
- - Tokenizers 0.22.1
+ - PEFT 0.14.0
+ - Transformers 4.47.0
+ - Pytorch 2.5.1+cu124
+ - Datasets 4.2.0
+ - Tokenizers 0.21.0
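One thing worth noting when reading the updated table: the reported final evaluation loss (0.1539) is the last row's value, not the lowest validation loss reached during training — the minimum in the table is 0.1481 at step 35280 (epoch 9.8027). A minimal sketch of selecting the best checkpoint from a few of the table's rows (the specific rows below are copied from the updated table; the selection logic itself is a generic illustration, not part of this repository):

```python
# A few (step, validation_loss) pairs taken from the updated results table.
rows = [
    (720, 0.2894),
    (17280, 0.1761),
    (28800, 0.1580),
    (35280, 0.1481),  # lowest validation loss in the full table
    (42480, 0.1539),  # final row; matches the reported eval loss
]

# Pick the checkpoint with the lowest validation loss rather than
# simply keeping the last one.
best_step, best_loss = min(rows, key=lambda r: r[1])
print(best_step, best_loss)  # 35280 0.1481
```

This is why, with `load_best_model_at_end`-style checkpointing, the adapter saved at step 35280 would be preferred over the final one.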