Why is a 'raw prompt' performing better than a 'chat template prompt'?
#250 by MLooten · opened
Hi
I currently have a prompt that contains instructions (let's call that the system part) and the sentence I want to use as the 'user' input (the user part).
On one side, I use a single prompt composed of system + user as a plain string, and I simply call the tokenizer followed by model.generate:
prompt = system_prompt + user_prompt
inputs = tokenizer(prompt, return_tensors="pt", padding=True, truncation=True)
outputs = model.generate(**inputs, max_new_tokens=500, temperature=temperature, do_sample=False)
On the other side, I build a messages list, format it with the chat template, tokenize it, and call model.generate:
messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": user_prompt},
]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(prompt, return_tensors="pt", padding=True, truncation=True)
outputs = model.generate(**inputs, max_new_tokens=500, temperature=temperature, do_sample=False)
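For completeness, here is a small debugging sketch (assuming the same tokenizer, system_prompt, user_prompt, and messages as above) that prints the exact text each approach feeds to the model, so the two inputs can be compared directly:

# Rebuild both inputs and decode them back to text to see exactly what the model receives.
raw_inputs = tokenizer(system_prompt + user_prompt, return_tensors="pt")
chat_prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
chat_inputs = tokenizer(chat_prompt, return_tensors="pt")

print("raw prompt as seen by the model:")
print(tokenizer.decode(raw_inputs["input_ids"][0]))
print("chat-template prompt as seen by the model:")
print(tokenizer.decode(chat_inputs["input_ids"][0]))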
The issue I am currently facing, and do not understand, is that the 'raw prompt' performs better than the chat-formatted one.
Can somebody explain to me why?
If I look at the Hugging Face page of the model, it seems that the recommendation is to use system + user messages and format them correctly with the tokenizer's chat template.
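For reference, this is roughly the usage I understood from the model page; a minimal sketch, assuming a recent transformers version where apply_chat_template can tokenize and return tensors directly:

# Let apply_chat_template handle both formatting and tokenization (assumes a recent transformers release).
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
)
outputs = model.generate(input_ids, max_new_tokens=500, do_sample=False)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))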
Thanks in advance.