yalhessi committed
Commit 622f0ec · verified · 1 Parent(s): 47cd533

End of training

Files changed (1):
1. README.md (+65 −65)
README.md CHANGED
@@ -16,7 +16,7 @@ should probably proofread and complete it, then remove this comment. -->
 
  This model is a fine-tuned version of [deepseek-ai/deepseek-coder-1.3b-base](https://huggingface.co/deepseek-ai/deepseek-coder-1.3b-base) on an unknown dataset.
  It achieves the following results on the evaluation set:
- - Loss: 0.1511
+ - Loss: 0.1539
 
  ## Model description
 
@@ -52,71 +52,71 @@ The following hyperparameters were used during training:
 
  | Training Loss | Epoch | Step | Validation Loss |
  |:-------------:|:-------:|:-----:|:---------------:|
- | 0.3742 | 0.2001 | 720 | 0.2844 |
- | 0.2859 | 0.4001 | 1440 | 0.2708 |
- | 0.2527 | 0.6002 | 2160 | 0.2488 |
- | 0.2441 | 0.8002 | 2880 | 0.2391 |
- | 0.2331 | 1.0003 | 3600 | 0.2389 |
- | 0.2201 | 1.2003 | 4320 | 0.2332 |
- | 0.2157 | 1.4004 | 5040 | 0.2221 |
- | 0.2122 | 1.6004 | 5760 | 0.2217 |
- | 0.2158 | 1.8005 | 6480 | 0.2127 |
- | 0.2063 | 2.0006 | 7200 | 0.2054 |
- | 0.1961 | 2.2006 | 7920 | 0.2097 |
- | 0.1942 | 2.4007 | 8640 | 0.2018 |
- | 0.1904 | 2.6007 | 9360 | 0.1997 |
- | 0.1914 | 2.8008 | 10080 | 0.2001 |
- | 0.193 | 3.0008 | 10800 | 0.1980 |
- | 0.1803 | 3.2009 | 11520 | 0.1980 |
- | 0.1785 | 3.4009 | 12240 | 0.1982 |
- | 0.1758 | 3.6010 | 12960 | 0.1906 |
- | 0.1756 | 3.8011 | 13680 | 0.1871 |
- | 0.1773 | 4.0011 | 14400 | 0.1877 |
- | 0.1631 | 4.2012 | 15120 | 0.1840 |
- | 0.1665 | 4.4012 | 15840 | 0.1805 |
- | 0.1625 | 4.6013 | 16560 | 0.1867 |
- | 0.164 | 4.8013 | 17280 | 0.1768 |
- | 0.1593 | 5.0014 | 18000 | 0.1796 |
- | 0.1508 | 5.2014 | 18720 | 0.1730 |
- | 0.151 | 5.4015 | 19440 | 0.1723 |
- | 0.1519 | 5.6016 | 20160 | 0.1722 |
- | 0.1522 | 5.8016 | 20880 | 0.1705 |
- | 0.1465 | 6.0017 | 21600 | 0.1705 |
- | 0.1391 | 6.2017 | 22320 | 0.1641 |
- | 0.1392 | 6.4018 | 23040 | 0.1660 |
- | 0.1374 | 6.6018 | 23760 | 0.1610 |
- | 0.1369 | 6.8019 | 24480 | 0.1630 |
- | 0.1358 | 7.0019 | 25200 | 0.1565 |
- | 0.1258 | 7.2020 | 25920 | 0.1624 |
- | 0.1259 | 7.4021 | 26640 | 0.1605 |
- | 0.1268 | 7.6021 | 27360 | 0.1555 |
- | 0.1231 | 7.8022 | 28080 | 0.1508 |
- | 0.1244 | 8.0022 | 28800 | 0.1523 |
- | 0.1125 | 8.2023 | 29520 | 0.1530 |
- | 0.1107 | 8.4023 | 30240 | 0.1507 |
- | 0.1125 | 8.6024 | 30960 | 0.1531 |
- | 0.1109 | 8.8024 | 31680 | 0.1498 |
- | 0.1115 | 9.0025 | 32400 | 0.1488 |
- | 0.0983 | 9.2026 | 33120 | 0.1506 |
- | 0.0992 | 9.4026 | 33840 | 0.1500 |
- | 0.0999 | 9.6027 | 34560 | 0.1479 |
- | 0.0994 | 9.8027 | 35280 | 0.1443 |
- | 0.0951 | 10.0028 | 36000 | 0.1479 |
- | 0.0856 | 10.2028 | 36720 | 0.1501 |
- | 0.0847 | 10.4029 | 37440 | 0.1473 |
- | 0.0856 | 10.6029 | 38160 | 0.1471 |
- | 0.0847 | 10.8030 | 38880 | 0.1452 |
- | 0.0859 | 11.0031 | 39600 | 0.1483 |
- | 0.0759 | 11.2031 | 40320 | 0.1518 |
- | 0.0749 | 11.4032 | 41040 | 0.1502 |
- | 0.0743 | 11.6032 | 41760 | 0.1519 |
- | 0.0741 | 11.8033 | 42480 | 0.1511 |
+ | 0.377 | 0.2001 | 720 | 0.2894 |
+ | 0.2824 | 0.4001 | 1440 | 0.2605 |
+ | 0.2488 | 0.6002 | 2160 | 0.2486 |
+ | 0.2424 | 0.8002 | 2880 | 0.2434 |
+ | 0.2307 | 1.0003 | 3600 | 0.2332 |
+ | 0.2171 | 1.2003 | 4320 | 0.2317 |
+ | 0.215 | 1.4004 | 5040 | 0.2200 |
+ | 0.2101 | 1.6004 | 5760 | 0.2177 |
+ | 0.209 | 1.8005 | 6480 | 0.2163 |
+ | 0.2047 | 2.0006 | 7200 | 0.2175 |
+ | 0.1972 | 2.2006 | 7920 | 0.2108 |
+ | 0.1934 | 2.4007 | 8640 | 0.2045 |
+ | 0.1906 | 2.6007 | 9360 | 0.1991 |
+ | 0.1889 | 2.8008 | 10080 | 0.2008 |
+ | 0.1911 | 3.0008 | 10800 | 0.2007 |
+ | 0.1789 | 3.2009 | 11520 | 0.1973 |
+ | 0.1799 | 3.4009 | 12240 | 0.1947 |
+ | 0.1769 | 3.6010 | 12960 | 0.1933 |
+ | 0.176 | 3.8011 | 13680 | 0.1897 |
+ | 0.1753 | 4.0011 | 14400 | 0.1833 |
+ | 0.1645 | 4.2012 | 15120 | 0.1833 |
+ | 0.1684 | 4.4012 | 15840 | 0.1864 |
+ | 0.1626 | 4.6013 | 16560 | 0.1839 |
+ | 0.1649 | 4.8013 | 17280 | 0.1761 |
+ | 0.1613 | 5.0014 | 18000 | 0.1803 |
+ | 0.1523 | 5.2014 | 18720 | 0.1755 |
+ | 0.1531 | 5.4015 | 19440 | 0.1766 |
+ | 0.1539 | 5.6016 | 20160 | 0.1726 |
+ | 0.1527 | 5.8016 | 20880 | 0.1778 |
+ | 0.1483 | 6.0017 | 21600 | 0.1640 |
+ | 0.1404 | 6.2017 | 22320 | 0.1684 |
+ | 0.1426 | 6.4018 | 23040 | 0.1666 |
+ | 0.1407 | 6.6018 | 23760 | 0.1650 |
+ | 0.1413 | 6.8019 | 24480 | 0.1635 |
+ | 0.1386 | 7.0019 | 25200 | 0.1640 |
+ | 0.1307 | 7.2020 | 25920 | 0.1627 |
+ | 0.1291 | 7.4021 | 26640 | 0.1620 |
+ | 0.1293 | 7.6021 | 27360 | 0.1626 |
+ | 0.128 | 7.8022 | 28080 | 0.1584 |
+ | 0.1288 | 8.0022 | 28800 | 0.1580 |
+ | 0.1182 | 8.2023 | 29520 | 0.1571 |
+ | 0.1166 | 8.4023 | 30240 | 0.1539 |
+ | 0.1178 | 8.6024 | 30960 | 0.1571 |
+ | 0.1154 | 8.8024 | 31680 | 0.1547 |
+ | 0.1163 | 9.0025 | 32400 | 0.1549 |
+ | 0.1056 | 9.2026 | 33120 | 0.1567 |
+ | 0.1046 | 9.4026 | 33840 | 0.1526 |
+ | 0.1063 | 9.6027 | 34560 | 0.1542 |
+ | 0.1063 | 9.8027 | 35280 | 0.1481 |
+ | 0.1022 | 10.0028 | 36000 | 0.1518 |
+ | 0.0939 | 10.2028 | 36720 | 0.1546 |
+ | 0.0933 | 10.4029 | 37440 | 0.1507 |
+ | 0.0933 | 10.6029 | 38160 | 0.1510 |
+ | 0.0927 | 10.8030 | 38880 | 0.1494 |
+ | 0.0938 | 11.0031 | 39600 | 0.1498 |
+ | 0.085 | 11.2031 | 40320 | 0.1557 |
+ | 0.0839 | 11.4032 | 41040 | 0.1534 |
+ | 0.0828 | 11.6032 | 41760 | 0.1534 |
+ | 0.0829 | 11.8033 | 42480 | 0.1539 |
 
 
  ### Framework versions
 
- - PEFT 0.15.2
- - Transformers 4.57.1
- - Pytorch 2.7.0+cu126
- - Datasets 4.3.0
- - Tokenizers 0.22.1
+ - PEFT 0.14.0
+ - Transformers 4.47.0
+ - Pytorch 2.5.1+cu124
+ - Datasets 4.2.0
+ - Tokenizers 0.21.0
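One thing worth noting when reading the updated table: the reported final evaluation loss (0.1539) is the last row's value, not the lowest validation loss reached during training — the minimum in the table is 0.1481 at step 35280 (epoch 9.8027). A minimal sketch of selecting the best checkpoint from a few of the table's rows (the specific rows below are copied from the updated table; the selection logic itself is a generic illustration, not part of this repository):

```python
# A few (step, validation_loss) pairs taken from the updated results table.
rows = [
    (720, 0.2894),
    (17280, 0.1761),
    (28800, 0.1580),
    (35280, 0.1481),  # lowest validation loss in the full table
    (42480, 0.1539),  # final row; matches the reported eval loss
]

# Pick the checkpoint with the lowest validation loss rather than
# simply keeping the last one.
best_step, best_loss = min(rows, key=lambda r: r[1])
print(best_step, best_loss)  # 35280 0.1481
```

This is why, with `load_best_model_at_end`-style checkpointing, the adapter saved at step 35280 would be preferred over the final one.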