Enhanced Chat Template for Qwen3.6
Based on froggeric/Qwen-Fixed-Chat-Templates with additional hardening.
Drop-in Jinja2 template that fixes 9 bugs in the official Qwen3.6 template and adds a dynamic thinking toggle and defensive type checking.
What's Fixed
| # | Bug | Impact |
|---|---|---|
| 1 | No `item is not mapping` guard | C++ engines (llama.cpp/LM Studio) crash |
| 2 | `add_vision_id` undefined access | Strict engines crash |
| 3 | `developer` role rejected | OpenAI-compatible API fails |
| 4 | Missing `video_url` key | Video input ignored |
| 5 | `raise_exception` on tool-only chains | Agent frameworks crash (LangChain/AutoGen/Claude Code) |
| 6 | Solo <think tag not handled | Parsing errors |
| 7 | Empty thinking blocks rendered | Wastes context tokens |
| 8 | Tool call missing name check | Malformed output |
| 9 | Tool call arguments not validated | Silent corruption |
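Rows 1 and 4 both concern how the entries of a multimodal `content` list are inspected. The idea behind such a guard can be pictured in plain Python as below. This is a minimal sketch, not the template's Jinja2 code: the helper name `classify_content_item` and the exact set of accepted keys are assumptions made for illustration.

```python
from collections.abc import Mapping


def classify_content_item(item):
    """Return 'text', 'image', or 'video' for one entry of a content list.

    Hypothetical helper: it mirrors the kind of guard described in the
    table, not the template's actual implementation.
    """
    # Row 1: a bare string (or any non-mapping item) is treated as text
    # instead of being probed for keys.
    if not isinstance(item, Mapping):
        return "text"
    item_type = item.get("type")
    # Row 4: accept video_url alongside video (the key set is an assumption).
    if item_type in ("video", "video_url") or "video" in item or "video_url" in item:
        return "video"
    if item_type in ("image", "image_url") or "image" in item or "image_url" in item:
        return "image"
    return "text"
```

Checking the item's type first means a plain string never reaches the key lookups, which is the step the table says strict C++ engines cannot tolerate.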
New Features
- Dynamic thinking toggle: Insert `<|think_on|>`/`<|think_off|>` in any message to switch per-turn
- Thinking state machine: `ns_flags.enable_thinking` persists across all messages
- `developer` role support: Fully compatible with OpenAI API
- Enhanced error messages: All exceptions include diagnostic context
- Tool call validation: Name presence + arguments type + message ordering
- `video_url` support: Video inputs via `video_url` key detected
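The tool call validation listed above (name presence, arguments type, message ordering) can be illustrated with a Python sketch. It is not the template's code: the function name `validate_tool_messages`, the accepted argument types, and the ordering rule (a `tool` message must follow an assistant message that carried tool calls) are all assumptions chosen to make the three checks concrete.

```python
from collections.abc import Mapping


def validate_tool_messages(messages):
    """Raise ValueError with diagnostic context if tool messages look malformed.

    Hypothetical helper illustrating the three checks; the real template
    implements its own rules in Jinja2.
    """
    pending_tool_calls = False  # True after an assistant message with tool calls
    for index, message in enumerate(messages):
        role = message.get("role")
        if role == "assistant":
            tool_calls = message.get("tool_calls") or []
            for call in tool_calls:
                # Accept both {"function": {...}} and a flat {"name": ...} shape.
                function = call.get("function", call)
                name = function.get("name")
                if not name:
                    raise ValueError(f"message {index}: tool call has no name")
                arguments = function.get("arguments")
                if not isinstance(arguments, (Mapping, str)):
                    raise ValueError(
                        f"message {index}: arguments for {name!r} must be a "
                        f"mapping or JSON string, got {type(arguments).__name__}"
                    )
            pending_tool_calls = bool(tool_calls)
        elif role == "tool":
            if not pending_tool_calls:
                raise ValueError(
                    f"message {index}: tool result without a preceding tool call"
                )
        elif role == "user":
            pending_tool_calls = False
```

Each error message carries the message index and the offending value's type, in the spirit of the "diagnostic context" item in the list.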
Usage
vLLM
chat_template: /path/to/qwen3.6/chat_template.jinja
llama.cpp / LM Studio
Place chat_template.jinja alongside your model files.
Dynamic Thinking Toggle
messages = [
    {"role": "system", "content": "You are helpful. <|think_off|>"},
    {"role": "user", "content": "What is 2+2?"},  # no thinking
    {"role": "assistant", "content": "4"},
    {"role": "user", "content": "<|think_on|> Prove Fermat's theorem"},  # thinking ON
]
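How the toggle state carries from one message to the next can be sketched in Python. This is an illustration only, not the template's Jinja2 logic: the function name `thinking_state_per_message`, the default of thinking enabled, and the rule that the later marker wins when both appear in one message are assumptions.

```python
THINK_ON = "<|think_on|>"
THINK_OFF = "<|think_off|>"


def thinking_state_per_message(messages, default=True):
    """Return the thinking flag in effect after each message.

    Hypothetical helper: the flag persists until a later message
    contains a marker that changes it.
    """
    enabled = default
    states = []
    for message in messages:
        content = message.get("content")
        if isinstance(content, str):
            last_on = content.rfind(THINK_ON)
            last_off = content.rfind(THINK_OFF)
            # Assumption: when both markers appear, the later one wins.
            # When neither appears, both are -1 and the state is unchanged.
            if last_on > last_off:
                enabled = True
            elif last_off > last_on:
                enabled = False
        states.append(enabled)
    return states


messages = [
    {"role": "system", "content": "You are helpful. <|think_off|>"},
    {"role": "user", "content": "What is 2+2?"},
    {"role": "assistant", "content": "4"},
    {"role": "user", "content": "<|think_on|> Prove Fermat's theorem"},
]
print(thinking_state_per_message(messages))  # [False, False, False, True]
```

The system message switches thinking off, the state persists through the first exchange, and the final user message switches it back on.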
vLLM Template Parameters
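from openai import OpenAI

# Assumes a vLLM OpenAI-compatible server on its default port; adjust base_url as needed.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="Qwen3.6-27B-FP8",
    messages=messages,
    extra_body={
        "chat_template_kwargs": {
            "enable_thinking": False,   # disable thinking
            "preserve_thinking": True,  # keep thinking blocks in history
        }
    },
)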
Compatibility
| Engine | Status |
|---|---|
| vLLM >= 0.6 | Tested (primary) |
| llama.cpp | Compatible |
| LM Studio | Compatible |
| Ollama | Compatible |
| SGLang | Compatible |
Related
- GitHub: GeerMrc/qwen3-enhanced-chat-template — full documentation (EN/CN), optimization details, client guide
- ModelScope: MaricZg/qwen3-enhanced-chat-template
- Upstream: froggeric/Qwen-Fixed-Chat-Templates