ThomasBaruzier
Recent Activity
Liked a model 1 day ago: LLM360/K2-Think
Liked a model 3 days ago: openbmb/MiniCPM4.1-8B
Liked a model 6 days ago: moonshotai/Kimi-K2-Instruct-0905
EXAONE-3.5
EXAONE 3.5 language model series including instruction-tuned models of 2.4B, 7.8B, and 32B.
Qwen QwQ
Qwen with Questions
Llama 3.2 Instruct
Llama 3.2 language models, featuring instruction-tuned models in two sizes: 1B and 3B.
Llama 3 Instruct
Llama 3 language models, featuring instruction-tuned models in two sizes: 8B and 70B.
DeepSeek-R1-ReDistill
Re-distilled DeepSeek R1 models
Qwen 2.5 Coder Instruct
Code-specific model series based on Qwen2.5
- ThomasBaruzier/Qwen2.5-Coder-0.5B-Instruct-GGUF
  Text Generation • 0.5B • Updated • 186 • 1
- ThomasBaruzier/Qwen2.5-Coder-1.5B-Instruct-GGUF
  Text Generation • 2B • Updated • 183
- ThomasBaruzier/Qwen2.5-Coder-3B-Instruct-GGUF
  Text Generation • 3B • Updated • 352
- ThomasBaruzier/Qwen2.5-Coder-7B-Instruct-GGUF
  Text Generation • 8B • Updated • 182
Qwen 2.5 Instruct
Qwen 2.5 language models, featuring instruction-tuned models in seven sizes: 0.5B, 1.5B, 3B, 7B, 14B, 32B, and 72B.
- ThomasBaruzier/Qwen2.5-0.5B-Instruct-GGUF
  Text Generation • 0.5B • Updated • 278
- ThomasBaruzier/Qwen2.5-1.5B-Instruct-GGUF
  Text Generation • 2B • Updated • 210
- ThomasBaruzier/Qwen2.5-3B-Instruct-GGUF
  Text Generation • 3B • Updated • 189
- ThomasBaruzier/Qwen2.5-7B-Instruct-GGUF
  Text Generation • 8B • Updated • 155
Llama 3.1 Instruct
Llama 3.1 language models, featuring instruction-tuned models in three sizes: 8B, 70B, and 405B.
Gemma 2
Gemma 2 language models, featuring instruction-tuned models in three sizes: 2B, 9B, and 27B.
DeepScaleR-1.5B-Preview
DeepScaleR-1.5B-Preview is a language model fine-tuned from DeepSeek-R1-Distill-Qwen-1.5B. Its authors report that it outperforms OpenAI's o1-preview on math benchmarks.