# smolvlm-mmocr-sft-round-2
This model is a fine-tuned version of [HuggingFaceTB/SmolVLM-Instruct](https://huggingface.co/HuggingFaceTB/SmolVLM-Instruct) on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 0.0196
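The framework versions below list PEFT, so this checkpoint is presumably a PEFT (LoRA-style) adapter on top of the base model rather than a full set of weights. A minimal inference sketch under that assumption, following the base model's documented chat-template usage; the image path and prompt are illustrative:

```python
import torch
from PIL import Image
from transformers import AutoProcessor, AutoModelForVision2Seq
from peft import PeftModel

base_id = "HuggingFaceTB/SmolVLM-Instruct"
adapter_id = "chuuhtetnaing/smolvlm-mmocr-sft-round-2"

device = "cuda" if torch.cuda.is_available() else "cpu"

# Load the base model, then attach the fine-tuned adapter on top of it.
processor = AutoProcessor.from_pretrained(base_id)
model = AutoModelForVision2Seq.from_pretrained(base_id, torch_dtype=torch.bfloat16)
model = PeftModel.from_pretrained(model, adapter_id).to(device).eval()

# SmolVLM expects a chat template with interleaved image/text content.
image = Image.open("example.png")  # illustrative: an image containing text
messages = [
    {"role": "user",
     "content": [{"type": "image"},
                 {"type": "text", "text": "Read the text in this image."}]},
]
prompt = processor.apply_chat_template(messages, add_generation_prompt=True)
inputs = processor(text=prompt, images=[image], return_tensors="pt").to(device)

with torch.no_grad():
    generated = model.generate(**inputs, max_new_tokens=256)
print(processor.batch_decode(generated, skip_special_tokens=True)[0])
```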
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training (a `TrainingArguments` sketch follows the list):
- learning_rate: 1e-05
- train_batch_size: 96
- eval_batch_size: 96
- seed: 42
- gradient_accumulation_steps: 8
- total_train_batch_size: 768
- optimizer: AdamW (`adamw_torch`) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 1000
- num_epochs: 1
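For reference, a sketch of how the settings above might map onto `transformers.TrainingArguments`; the `output_dir` is a placeholder, and the total train batch size of 768 follows from 96 examples per device times 8 accumulation steps (assuming a single device):

```python
from transformers import TrainingArguments

# Hypothetical reconstruction of the run configuration; output_dir is a placeholder.
training_args = TrainingArguments(
    output_dir="smolvlm-mmocr-sft-round-2",
    learning_rate=1e-5,
    per_device_train_batch_size=96,
    per_device_eval_batch_size=96,
    gradient_accumulation_steps=8,  # 96 * 8 = 768 total train batch size
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=1000,
    num_train_epochs=1,
)
```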
### Training results
Training Loss | Epoch | Step | Validation Loss |
---|---|---|---|
0.5099 | 0.0054 | 20 | 0.4944 |
0.4426 | 0.0109 | 40 | 0.4381 |
0.4117 | 0.0163 | 60 | 0.3822 |
0.3378 | 0.0217 | 80 | 0.3352 |
0.3107 | 0.0271 | 100 | 0.2984 |
0.2839 | 0.0326 | 120 | 0.2696 |
0.2629 | 0.0380 | 140 | 0.2478 |
0.2467 | 0.0434 | 160 | 0.2305 |
0.2306 | 0.0489 | 180 | 0.2160 |
0.2283 | 0.0543 | 200 | 0.2037 |
0.1958 | 0.0597 | 220 | 0.1927 |
0.1887 | 0.0651 | 240 | 0.1830 |
0.1699 | 0.0706 | 260 | 0.1738 |
0.155 | 0.0760 | 280 | 0.1655 |
0.1579 | 0.0814 | 300 | 0.1575 |
0.1437 | 0.0869 | 320 | 0.1506 |
0.1549 | 0.0923 | 340 | 0.1437 |
0.1336 | 0.0977 | 360 | 0.1371 |
0.1268 | 0.1032 | 380 | 0.1314 |
0.139 | 0.1086 | 400 | 0.1258 |
0.1215 | 0.1140 | 420 | 0.1205 |
0.1103 | 0.1194 | 440 | 0.1156 |
0.1141 | 0.1249 | 460 | 0.1111 |
0.1175 | 0.1303 | 480 | 0.1066 |
0.0978 | 0.1357 | 500 | 0.1023 |
0.1016 | 0.1412 | 520 | 0.0986 |
0.0899 | 0.1466 | 540 | 0.0947 |
0.0952 | 0.1520 | 560 | 0.0909 |
0.0775 | 0.1574 | 580 | 0.0876 |
0.0862 | 0.1629 | 600 | 0.0845 |
0.0777 | 0.1683 | 620 | 0.0809 |
0.077 | 0.1737 | 640 | 0.0779 |
0.0757 | 0.1792 | 660 | 0.0752 |
0.0627 | 0.1846 | 680 | 0.0724 |
0.0635 | 0.1900 | 700 | 0.0701 |
0.0619 | 0.1954 | 720 | 0.0674 |
0.0593 | 0.2009 | 740 | 0.0651 |
0.0686 | 0.2063 | 760 | 0.0633 |
0.0626 | 0.2117 | 780 | 0.0613 |
0.0596 | 0.2172 | 800 | 0.0591 |
0.0447 | 0.2226 | 820 | 0.0575 |
0.0548 | 0.2280 | 840 | 0.0556 |
0.0555 | 0.2334 | 860 | 0.0546 |
0.0547 | 0.2389 | 880 | 0.0536 |
0.0461 | 0.2443 | 900 | 0.0512 |
0.0445 | 0.2497 | 920 | 0.0501 |
0.0386 | 0.2552 | 940 | 0.0491 |
0.0334 | 0.2606 | 960 | 0.0475 |
0.0498 | 0.2660 | 980 | 0.0458 |
0.0365 | 0.2715 | 1000 | 0.0452 |
0.0364 | 0.2769 | 1020 | 0.0442 |
0.0351 | 0.2823 | 1040 | 0.0433 |
0.0393 | 0.2878 | 1060 | 0.0418 |
0.0359 | 0.2932 | 1080 | 0.0413 |
0.0347 | 0.2986 | 1100 | 0.0405 |
0.0351 | 0.3040 | 1120 | 0.0397 |
0.0386 | 0.3095 | 1140 | 0.0385 |
0.04 | 0.3149 | 1160 | 0.0380 |
0.0288 | 0.3203 | 1180 | 0.0375 |
0.0279 | 0.3258 | 1200 | 0.0370 |
0.0285 | 0.3312 | 1220 | 0.0362 |
0.0308 | 0.3366 | 1240 | 0.0357 |
0.0302 | 0.3420 | 1260 | 0.0351 |
0.026 | 0.3475 | 1280 | 0.0349 |
0.0262 | 0.3529 | 1300 | 0.0345 |
0.0295 | 0.3583 | 1320 | 0.0341 |
0.0315 | 0.3638 | 1340 | 0.0335 |
0.0325 | 0.3692 | 1360 | 0.0327 |
0.0264 | 0.3746 | 1380 | 0.0324 |
0.0321 | 0.3800 | 1400 | 0.0319 |
0.0268 | 0.3855 | 1420 | 0.0312 |
0.0248 | 0.3909 | 1440 | 0.0313 |
0.0361 | 0.3963 | 1460 | 0.0312 |
0.0277 | 0.4018 | 1480 | 0.0303 |
0.0274 | 0.4072 | 1500 | 0.0300 |
0.028 | 0.4126 | 1520 | 0.0294 |
0.0223 | 0.4181 | 1540 | 0.0290 |
0.0247 | 0.4235 | 1560 | 0.0289 |
0.0265 | 0.4289 | 1580 | 0.0289 |
0.0285 | 0.4343 | 1600 | 0.0285 |
0.0284 | 0.4398 | 1620 | 0.0284 |
0.0255 | 0.4452 | 1640 | 0.0279 |
0.0246 | 0.4506 | 1660 | 0.0273 |
0.0196 | 0.4561 | 1680 | 0.0274 |
0.0272 | 0.4615 | 1700 | 0.0270 |
0.0206 | 0.4669 | 1720 | 0.0268 |
0.02 | 0.4723 | 1740 | 0.0268 |
0.0229 | 0.4778 | 1760 | 0.0268 |
0.0214 | 0.4832 | 1780 | 0.0260 |
0.0246 | 0.4886 | 1800 | 0.0261 |
0.0189 | 0.4941 | 1820 | 0.0254 |
0.0199 | 0.4995 | 1840 | 0.0253 |
0.0168 | 0.5049 | 1860 | 0.0251 |
0.0297 | 0.5103 | 1880 | 0.0250 |
0.0237 | 0.5158 | 1900 | 0.0245 |
0.0189 | 0.5212 | 1920 | 0.0250 |
0.0274 | 0.5266 | 1940 | 0.0243 |
0.0143 | 0.5321 | 1960 | 0.0243 |
0.0201 | 0.5375 | 1980 | 0.0241 |
0.0258 | 0.5429 | 2000 | 0.0241 |
0.0256 | 0.5484 | 2020 | 0.0239 |
0.018 | 0.5538 | 2040 | 0.0235 |
0.019 | 0.5592 | 2060 | 0.0236 |
0.0252 | 0.5646 | 2080 | 0.0233 |
0.0215 | 0.5701 | 2100 | 0.0234 |
0.0204 | 0.5755 | 2120 | 0.0235 |
0.0165 | 0.5809 | 2140 | 0.0230 |
0.018 | 0.5864 | 2160 | 0.0230 |
0.0236 | 0.5918 | 2180 | 0.0228 |
0.0199 | 0.5972 | 2200 | 0.0228 |
0.0153 | 0.6026 | 2220 | 0.0226 |
0.0166 | 0.6081 | 2240 | 0.0227 |
0.0212 | 0.6135 | 2260 | 0.0225 |
0.0213 | 0.6189 | 2280 | 0.0224 |
0.0165 | 0.6244 | 2300 | 0.0223 |
0.0197 | 0.6298 | 2320 | 0.0224 |
0.0219 | 0.6352 | 2340 | 0.0223 |
0.0277 | 0.6407 | 2360 | 0.0222 |
0.015 | 0.6461 | 2380 | 0.0220 |
0.0191 | 0.6515 | 2400 | 0.0218 |
0.0204 | 0.6569 | 2420 | 0.0220 |
0.0162 | 0.6624 | 2440 | 0.0219 |
0.0174 | 0.6678 | 2460 | 0.0219 |
0.0144 | 0.6732 | 2480 | 0.0217 |
0.0219 | 0.6787 | 2500 | 0.0215 |
0.0152 | 0.6841 | 2520 | 0.0217 |
0.0107 | 0.6895 | 2540 | 0.0215 |
0.0173 | 0.6949 | 2560 | 0.0213 |
0.0173 | 0.7004 | 2580 | 0.0214 |
0.0147 | 0.7058 | 2600 | 0.0214 |
0.015 | 0.7112 | 2620 | 0.0211 |
0.0158 | 0.7167 | 2640 | 0.0211 |
0.0128 | 0.7221 | 2660 | 0.0211 |
0.0185 | 0.7275 | 2680 | 0.0210 |
0.0214 | 0.7329 | 2700 | 0.0210 |
0.0162 | 0.7384 | 2720 | 0.0209 |
0.017 | 0.7438 | 2740 | 0.0210 |
0.0124 | 0.7492 | 2760 | 0.0211 |
0.0187 | 0.7547 | 2780 | 0.0207 |
0.0166 | 0.7601 | 2800 | 0.0209 |
0.0204 | 0.7655 | 2820 | 0.0207 |
0.0168 | 0.7710 | 2840 | 0.0207 |
0.0209 | 0.7764 | 2860 | 0.0206 |
0.0166 | 0.7818 | 2880 | 0.0206 |
0.0174 | 0.7872 | 2900 | 0.0207 |
0.0208 | 0.7927 | 2920 | 0.0205 |
0.013 | 0.7981 | 2940 | 0.0204 |
0.0112 | 0.8035 | 2960 | 0.0205 |
0.0174 | 0.8090 | 2980 | 0.0203 |
0.0169 | 0.8144 | 3000 | 0.0204 |
0.0129 | 0.8198 | 3020 | 0.0202 |
0.0191 | 0.8252 | 3040 | 0.0201 |
0.0166 | 0.8307 | 3060 | 0.0202 |
0.0182 | 0.8361 | 3080 | 0.0201 |
0.0208 | 0.8415 | 3100 | 0.0201 |
0.015 | 0.8470 | 3120 | 0.0200 |
0.0246 | 0.8524 | 3140 | 0.0200 |
0.0158 | 0.8578 | 3160 | 0.0201 |
0.0183 | 0.8633 | 3180 | 0.0201 |
0.0171 | 0.8687 | 3200 | 0.0201 |
0.0173 | 0.8741 | 3220 | 0.0199 |
0.017 | 0.8795 | 3240 | 0.0199 |
0.0123 | 0.8850 | 3260 | 0.0199 |
0.0144 | 0.8904 | 3280 | 0.0200 |
0.0149 | 0.8958 | 3300 | 0.0199 |
0.0139 | 0.9013 | 3320 | 0.0199 |
0.0199 | 0.9067 | 3340 | 0.0198 |
0.0164 | 0.9121 | 3360 | 0.0197 |
0.0154 | 0.9175 | 3380 | 0.0198 |
0.0225 | 0.9230 | 3400 | 0.0198 |
0.0232 | 0.9284 | 3420 | 0.0198 |
0.0135 | 0.9338 | 3440 | 0.0197 |
0.016 | 0.9393 | 3460 | 0.0197 |
0.0164 | 0.9447 | 3480 | 0.0197 |
0.0147 | 0.9501 | 3500 | 0.0197 |
0.015 | 0.9555 | 3520 | 0.0197 |
0.0215 | 0.9610 | 3540 | 0.0196 |
0.0174 | 0.9664 | 3560 | 0.0197 |
0.0203 | 0.9718 | 3580 | 0.0197 |
0.0191 | 0.9773 | 3600 | 0.0197 |
0.02 | 0.9827 | 3620 | 0.0196 |
0.0128 | 0.9881 | 3640 | 0.0196 |
0.0145 | 0.9936 | 3660 | 0.0197 |
0.0179 | 0.9990 | 3680 | 0.0196 |
### Framework versions
- PEFT 0.14.0
- Transformers 4.49.0
- PyTorch 2.5.1+cu124
- Datasets 3.3.2
- Tokenizers 0.21.1