"Not all quantized models perform well." Serving frameworks: Ollama uses an NVIDIA GPU; llama.cpp uses the CPU with AVX & AMX.
v1k
xbruce22
AI & ML interests
None yet
Recent Activity
liked a model about 23 hours ago: cerebras/MiniMax-M2-REAP-172B-A10B
liked a model about 24 hours ago: cerebras/MiniMax-M2-REAP-162B-A10B
liked a model 4 days ago: WeiboAI/VibeThinker-1.5B