GRPO-Gsmath-Llama-1B-IT
Model Description:
This model is a GRPO fine-tuned version of meta-llama/Llama-3.2-1B.
- Recommended inference settings: `min_p = 0.1` and `temperature = 1.5`. Read this Tweet to understand why.
- License: apache-2.0
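To give an intuition for the recommended settings, here is a minimal, self-contained sketch of min-p sampling (not the model's actual inference code): after temperature scaling, only tokens whose probability is at least `min_p` times the top token's probability are kept, so a high temperature can add diversity without admitting very unlikely tokens.

```python
import numpy as np

def min_p_filter(logits, min_p=0.1, temperature=1.5):
    """Temperature-scale logits, then apply min-p truncation:
    keep only tokens with prob >= min_p * (max prob), renormalize."""
    scaled = np.asarray(logits, dtype=float) / temperature
    # softmax (shifted for numerical stability)
    probs = np.exp(scaled - scaled.max())
    probs /= probs.sum()
    # zero out tokens below the min-p threshold
    keep = probs >= min_p * probs.max()
    filtered = np.where(keep, probs, 0.0)
    return filtered / filtered.sum()

# toy 4-token vocabulary: the two low-probability tokens are pruned
probs = min_p_filter([5.0, 4.0, 1.0, -2.0], min_p=0.1, temperature=1.5)
```

Sampling then proceeds from the renormalized `probs` instead of the full distribution.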
Benchmarks:
We evaluate all models on GSM8K using the standard lm-eval 5-shot exact-match protocol. Under identical decoding and answer-extraction settings, GRPO-Gsmath-Llama-1B-IT and GsMath-Llama-1B outperform Meta's Llama-3.2-1B by roughly 4.75x and 2x respectively, demonstrating a clear improvement in small-model mathematical capability.
| Model | Params | GSM8K (5-shot, EM) |
|---|---|---|
| GRPO-Gsmath-Llama-1B-IT | 1B | 0.323 |
| GsMath-Llama-1B | 1B | 0.137 |
| Llama-3.2-1B | 1B | 0.068 |
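The scores above should be reproducible with the lm-evaluation-harness CLI; a sketch of such a run (exact batch size and hardware are assumptions, not taken from the original evaluation):

```shell
pip install lm-eval

# GSM8K, 5-shot, exact-match -- repeat with each model name from the table
lm_eval --model hf \
  --model_args pretrained=Cannae-AI/GRPO-Gsmath-Llama-1B-IT \
  --tasks gsm8k \
  --num_fewshot 5 \
  --batch_size 8
```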
Model tree for Cannae-AI/GRPO-Gsmath-Llama-1B-IT
- Base model: meta-llama/Llama-3.2-1B