How to turn off thinking mode
I know there are three thinking modes, but in some scenarios even the Low mode is too slow to produce output for me.
Is there a way to get a no_think mode, like Qwen has?
How to turn on the thinking mode?
There is no way to turn off reasoning. However, you can control the amount of effort by specifying the reasoning effort, which can be low, medium, or high.
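For reference, here is a minimal sketch of how the effort level could be requested. It assumes the convention of putting a line such as "Reasoning: low" in the system prompt; build_messages is a hypothetical helper name, not part of any library, and your serving stack may expose its own parameter for this instead.

```python
from typing import Dict, List

# The three effort levels mentioned in this thread.
VALID_EFFORTS = ("low", "medium", "high")


def build_messages(user_prompt: str, effort: str = "low") -> List[Dict[str, str]]:
    """Build a chat message list whose system prompt requests a reasoning effort.

    Assumption: the model reads a "Reasoning: <level>" line from the system prompt.
    """
    if effort not in VALID_EFFORTS:
        raise ValueError(f"effort must be one of {VALID_EFFORTS}, got {effort!r}")
    return [
        {"role": "system", "content": f"Reasoning: {effort}"},
        {"role": "user", "content": user_prompt},
    ]


messages = build_messages("What is 2 + 2?", effort="low")
print(messages[0]["content"])
```

The resulting list can then be passed to whatever chat API or chat template you already use.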
A tricky workaround to disable reasoning mode is to edit the chat_template.jinja file, changing the add_generation_prompt from:
"<|start|>assistant"
to:
"<|start|>assistant<|channel|>analysis<|message|><|end|><|start|>assistant"
That's a hit!
How does it impact the model performance?
Yeah, that’s the hacky way to fully kill reasoning mode in GPT-OSS. But just keep in mind, in their paper they mention the model was post-trained with CoT-RL, which means it always uses reasoning (with variable effort: low/med/high). So turning it off like this isn’t really how the model was designed to work.
I haven’t benchmarked it myself, but I expect it to decrease performance a lot, not just on reasoning-heavy tasks (math, coding, logic) but even on simpler ones, since the model was never trained to operate without reasoning.