Enhanced Chat Template for Qwen3.6

Based on froggeric/Qwen-Fixed-Chat-Templates with additional hardening.

Drop-in Jinja2 template that fixes 9 bugs in the official Qwen3.6 template and adds a dynamic thinking toggle and defensive type checking.

What's Fixed

#  | Bug                                 | Impact
1  | No "item is not mapping" guard      | C++ engines (llama.cpp/LM Studio) crash
2  | add_vision_id undefined access      | Strict engines crash
3  | developer role rejected             | OpenAI-compatible API fails
4  | Missing video_url key               | Video input ignored
5  | raise_exception on tool-only chains | Agent frameworks crash (LangChain/AutoGen/Claude Code)
6  | Solo <think> tag not handled        | Parsing errors
7  | Empty thinking blocks rendered      | Wastes context tokens
8  | Tool call missing name check        | Malformed output
9  | Tool call arguments not validated   | Silent corruption
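
To make bugs 1, 3 and 4 concrete, the sketch below mirrors the kind of guard involved in plain Python. It is an illustration only, not the template's actual Jinja code: the function names classify_content_item and normalize_role are hypothetical, and treating developer like system is an assumption about how the role is handled.

```python
from collections.abc import Mapping


def classify_content_item(item):
    """Return 'image', 'video', 'text', or None for one content-list entry."""
    # Guard first: a bare string would be substring-matched by `in`, and a
    # number would raise, so anything that is not a mapping is skipped.
    if not isinstance(item, Mapping):
        return None
    item_type = item.get("type")
    if item_type == "image" or "image" in item or "image_url" in item:
        return "image"
    # Checking the video_url key as well is what keeps such inputs from
    # being silently ignored.
    if item_type == "video" or "video" in item or "video_url" in item:
        return "video"
    if "text" in item:
        return "text"
    return None


def normalize_role(role):
    # Assumption: a 'developer' message is rendered like a 'system' message.
    return "system" if role == "developer" else role
```

An entry such as {"type": "video_url", "video_url": {...}} is classified as video, while a stray string in the content list is skipped instead of crashing the renderer.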

New Features

  • Dynamic thinking toggle: Insert <|think_on|> / <|think_off|> in any message to switch per-turn
  • Thinking state machine: ns_flags.enable_thinking persists across all messages
  • developer role support: Fully compatible with OpenAI API
  • Enhanced error messages: All exceptions include diagnostic context
  • Tool call validation: Name presence + arguments type + message ordering
  • video_url support: Video inputs supplied under the video_url key are detected
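
The tool call checks listed above (name presence, arguments type, message ordering) can be approximated in Python as follows. This is a minimal sketch under stated assumptions: validate_tool_call and check_tool_ordering are hypothetical helpers, the error wording is invented for illustration, and the template's real checks live in Jinja and may differ in detail.

```python
import json


def validate_tool_call(call, index):
    """Return (name, arguments) or raise ValueError with diagnostic context."""
    if not isinstance(call, dict):
        raise ValueError(f"tool_calls[{index}]: expected a mapping, got {type(call).__name__}")
    # Accept both the nested {"function": {...}} form and a flat form.
    fn = call.get("function", call)
    if not isinstance(fn, dict):
        raise ValueError(f"tool_calls[{index}]: 'function' must be a mapping")
    name = fn.get("name")
    if not isinstance(name, str) or not name:
        raise ValueError(f"tool_calls[{index}]: missing or empty 'name'")
    args = fn.get("arguments", {})
    if isinstance(args, str):
        # OpenAI-style clients send arguments as a JSON-encoded string.
        try:
            args = json.loads(args)
        except json.JSONDecodeError as exc:
            raise ValueError(f"tool_calls[{index}] ({name}): arguments are not valid JSON: {exc}") from exc
    if not isinstance(args, dict):
        raise ValueError(f"tool_calls[{index}] ({name}): arguments must be a mapping, got {type(args).__name__}")
    return name, args


def check_tool_ordering(messages):
    """Each 'tool' message must follow an assistant tool call or another tool message."""
    for i, msg in enumerate(messages):
        if msg.get("role") != "tool":
            continue
        prev = messages[i - 1] if i > 0 else {}
        follows_call = prev.get("role") == "assistant" and bool(prev.get("tool_calls"))
        if not (follows_call or prev.get("role") == "tool"):
            raise ValueError(f"messages[{i}]: tool result without a preceding assistant tool call")
```

Running both checks on the client side before sending a request surfaces malformed agent traffic early, with the index of the offending entry in the message.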

Usage

vLLM

chat_template: /path/to/qwen3.6/chat_template.jinja

llama.cpp / LM Studio

Place chat_template.jinja alongside your model files.

Dynamic Thinking Toggle

messages = [
    {"role": "system", "content": "You are helpful. <|think_off|>"},
    {"role": "user", "content": "What is 2+2?"},           # no thinking
    {"role": "assistant", "content": "4"},
    {"role": "user", "content": "<|think_on|> Prove Fermat's theorem"}  # thinking ON
]
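
How the state carries from one message to the next can be sketched in Python. This is an illustrative model of the toggle, not the template itself: resolve_thinking is a hypothetical helper, and two behaviors are assumptions, namely that the markers are stripped from the rendered content and that the last marker in a message wins.

```python
THINK_ON, THINK_OFF = "<|think_on|>", "<|think_off|>"


def resolve_thinking(messages, default=True):
    """Return (cleaned_messages, final_state) after scanning for toggle markers."""
    enabled = default  # plays the role of the enable_thinking starting value
    cleaned = []
    for msg in messages:
        content = msg.get("content")
        if isinstance(content, str):
            on_at = content.rfind(THINK_ON)
            off_at = content.rfind(THINK_OFF)
            if on_at != -1 or off_at != -1:
                # Whichever marker appears later in the message decides.
                enabled = on_at > off_at
            # Assumption: markers are removed before the content is rendered.
            content = content.replace(THINK_ON, "").replace(THINK_OFF, "").strip()
        cleaned.append({**msg, "content": content})
    return cleaned, enabled
```

Applied to the messages list above, the state is off after the system message, stays off through the first exchange, and is on for the final user turn.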

vLLM Template Parameters

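from openai import OpenAI

# Assumption: a local vLLM OpenAI-compatible server; adjust base_url and api_key.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="Qwen3.6-27B-FP8",
    messages=messages,  # e.g. the list from the toggle example above
    extra_body={
        "chat_template_kwargs": {
            "enable_thinking": False,    # disable thinking
            "preserve_thinking": True,   # keep thinking blocks in history
        }
    },
)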

Compatibility

Engine      | Status
vLLM >= 0.6 | Tested (primary)
llama.cpp   | Compatible
LM Studio   | Compatible
Ollama      | Compatible
SGLang      | Compatible
