Evaluate Gemma with Chat Template #2069

Open · pyf98 opened this issue Jul 5, 2024 · 3 comments


pyf98 commented Jul 5, 2024

Hi, I'm trying to evaluate the gemma-it models from Hugging Face on MMLU. When I set --apply_chat_template --fewshot_as_multiturn, the tokenizer raises the error below, because Gemma's chat template does not support system messages: https://huggingface.co/google/gemma-2b-it/blob/main/tokenizer_config.json#L1507

jinja2.exceptions.TemplateError: System role not supported
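
For reference, a minimal reproduction sketch (assuming transformers is installed and you have access to google/gemma-2b-it; the prompt text is just an illustration):

from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("google/gemma-2b-it")
messages = [
    {"role": "system", "content": "The following are multiple choice questions about world history."},
    {"role": "user", "content": "Question: ..."},
]
# Raises jinja2.exceptions.TemplateError: System role not supported
tok.apply_chat_template(messages, tokenize=False)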

What is the best way to evaluate Gemma chat models? Should I use chat templating or not? Should I drop the description for each document, or move it into the first user turn instead of the system prompt? Thank you for any help!

@haileyschoelkopf (Collaborator)

Hi, this is blocked on #2058 , I'll get that merged ASAP!

@haileyschoelkopf linked a pull request on Jul 8, 2024 that will close this issue
pyf98 (Author) commented Jul 9, 2024

Thanks @haileyschoelkopf for your reply! It seems the system instruction will simply be dropped for Gemma-style models. I'm wondering whether this hurts the performance of Gemma chat models: many benchmarks provide a brief task description at the beginning, and that would be removed.

KT313 commented Sep 5, 2024

How about adding an option for these kinds of models that puts the system instruction at the beginning of the first user message instead, something like --system-as-user?

So originally, something like:

<bos><role>system
This is a system message.
<role>user
This is a user message.

would become:

<bos><role>user
This is a system message.
This is a user message.

Or, with tags so that the LLM can tell it's a system instruction:

<bos><role>user
<system>This is a system message.</system>
This is a user message.

For now, I modified the chat template in the tokenizer config to this:

"chat_template": "{{ bos_token }}{% if messages[0]['role'] == 'system' %}{% for message in messages[1:] %}{% if (message['role'] == 'user') != (loop.index0 % 2 == 0) %}{{ raise_exception('Conversation roles must alternate user/assistant/user/assistant/...') }}{% endif %}{% if (message['role'] == 'assistant') %}{% set role = 'model' %}{% else %}{% set role = message['role'] %}{% endif %}{% if (loop.index0  == 0)  %}{{ '<start_of_turn>' + role + '\n' + '<system>' + messages[0]['content'] + '</system>' + '\n'+ message['content'] | trim + '<end_of_turn>\n' }}{% endif %}{% if (loop.index0  != 0)  %}{{ '<start_of_turn>' + role + '\n' + message['content'] | trim + '<end_of_turn>\n' }}{% endif %}{% endfor %}{% endif %}{% if messages[0]['role'] != 'system' %}{% for message in messages %}{% if (message['role'] == 'user') != (loop.index0 % 2 == 0) %}{{ raise_exception('Conversation roles must alternate user/assistant/user/assistant/...') }}{% endif %}{% if (message['role'] == 'assistant') %}{% set role = 'model' %}{% else %}{% set role = message['role'] %}{% endif %}{{ '<start_of_turn>' + role + '\n' + message['content'] | trim + '<end_of_turn>\n' }}{% endfor %}{% endif %}{% if add_generation_prompt %}{{'<start_of_turn>model\n'}}{% endif %}"

It would be easier if you could add that --system-as-user option to lm_eval, doing something like:

# If the conversation starts with a system message, fold it into the first user turn
if messages[0]['role'] == "system":
    messages[1]['content'] = "<system>" + messages[0]['content'] + "</system>\n" + messages[1]['content']
    messages = messages[1:]
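
Or wrapped up as a standalone helper (the function name is hypothetical; --system-as-user is not an existing lm_eval flag):

def fold_system_into_first_user(messages):
    """Merge a leading system message into the first user turn,
    for chat templates that reject the system role (e.g. Gemma)."""
    if messages and messages[0]["role"] == "system":
        merged = dict(messages[1])  # copy so the caller's list isn't mutated
        merged["content"] = "<system>" + messages[0]["content"] + "</system>\n" + merged["content"]
        return [merged] + messages[2:]
    return messages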
