By default, most llama-server setups keep the reasoning content in a `reasoning_content` field on the response message, so you can read it from there. Otherwise, pass the `--reasoning-format` flag with the `deepseek` value to get the raw reasoning tokens.
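A minimal sketch of reading that field from the OpenAI-compatible `/v1/chat/completions` response, assuming the server has parsed the reasoning out into `reasoning_content` as described above (the sample JSON here is an illustrative shape, not captured server output):

```python
import json

# Illustrative response shape from llama-server's OpenAI-compatible
# chat completions endpoint, assuming reasoning parsing is enabled.
sample_response = json.loads("""
{
  "choices": [
    {
      "message": {
        "role": "assistant",
        "content": "The answer is 4.",
        "reasoning_content": "2 + 2 equals 4."
      }
    }
  ]
}
""")

def extract_reasoning(response: dict):
    """Return the separated reasoning text, or None if the server
    did not split it out of the main content."""
    message = response["choices"][0]["message"]
    return message.get("reasoning_content")

print(extract_reasoning(sample_response))
```

If `reasoning_content` comes back `None`, the server is likely emitting the raw `<think>` tokens inline in `content` instead, which is where the `--reasoning-format` flag comes in.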