
Issue with bigbench_gender_inclusive_sentences_german_multiple_choice #1473

Open
ayulockin opened this issue Feb 26, 2024 · 0 comments

While running this task, I get the following stack trace:

Traceback (most recent call last):
  File "/opt/conda/envs/fr-de-lb/bin/lm_eval", line 8, in <module>
    sys.exit(cli_evaluate())
  File "/home/ayushthakur/lm-eval/llm-leaderboard-fr-de/lm-evaluation-harness/lm_eval/__main__.py", line 288, in cli_evaluate
    results = evaluator.simple_evaluate(
  File "/home/ayushthakur/lm-eval/llm-leaderboard-fr-de/lm-evaluation-harness/lm_eval/utils.py", line 288, in _wrapper
    return fn(*args, **kwargs)
  File "/home/ayushthakur/lm-eval/llm-leaderboard-fr-de/lm-evaluation-harness/lm_eval/evaluator.py", line 154, in simple_evaluate
    task_dict = get_task_dict(tasks, task_manager)
  File "/home/ayushthakur/lm-eval/llm-leaderboard-fr-de/lm-evaluation-harness/lm_eval/tasks/__init__.py", line 412, in get_task_dict
    task_name_from_string_dict = task_manager.load_task_or_group(string_task_name_list)
  File "/home/ayushthakur/lm-eval/llm-leaderboard-fr-de/lm-evaluation-harness/lm_eval/tasks/__init__.py", line 253, in load_task_or_group
    collections.ChainMap(
  File "/home/ayushthakur/lm-eval/llm-leaderboard-fr-de/lm-evaluation-harness/lm_eval/tasks/__init__.py", line 164, in _load_individual_task_or_group
    return load_task(task_config, task=name_or_config, group=parent_name)
  File "/home/ayushthakur/lm-eval/llm-leaderboard-fr-de/lm-evaluation-harness/lm_eval/tasks/__init__.py", line 153, in load_task
    task_object = ConfigurableTask(config=config)
  File "/home/ayushthakur/lm-eval/llm-leaderboard-fr-de/lm-evaluation-harness/lm_eval/api/task.py", line 731, in __init__
    test_target = self.doc_to_target(test_doc)
  File "/home/ayushthakur/lm-eval/llm-leaderboard-fr-de/lm-evaluation-harness/lm_eval/api/task.py", line 962, in doc_to_target
    target_string = utils.apply_template(doc_to_target, doc)
  File "/home/ayushthakur/lm-eval/llm-leaderboard-fr-de/lm-evaluation-harness/lm_eval/utils.py", line 426, in apply_template
    return rtemplate.render(**doc)
  File "/opt/conda/envs/fr-de-lb/lib/python3.10/site-packages/jinja2/environment.py", line 1301, in render
    self.environment.handle_exception()
  File "/opt/conda/envs/fr-de-lb/lib/python3.10/site-packages/jinja2/environment.py", line 936, in handle_exception
    raise rewrite_traceback_stack(source=source)
  File "<template>", line 1, in top-level template code
ValueError: 'Potsdam ist eine kreisfreie Stadt und mit gut 180.000 Einwohner*innen die bevölkerungsreichste Stadt und Hauptstadt des Landes Brandenburg.' is not in list
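For context, a guess at the mechanism (not a confirmed diagnosis): "X is not in list" is the standard list.index() ValueError, and Jinja2's handle_exception re-raises the underlying exception from the template. So the task's doc_to_target template presumably looks up the gold answer's index in the choice list, and this gold string is missing from (or formatted differently than) the choices for that document. A pure-Python sketch of that failure mode, with illustrative field values rather than the task's actual schema:

```python
# A Jinja expression such as "{{ choices.index(target) }}" executes this
# same list.index() call under the hood; if the target string is not an
# exact element of the choice list, ValueError propagates out of render().
choices = ["Eine andere Option A", "Eine andere Option B"]  # gold string absent
target = "Potsdam ist eine kreisfreie Stadt ..."

try:
    choices.index(target)
    message = None
except ValueError as err:
    message = str(err)

print(message)
# 'Potsdam ist eine kreisfreie Stadt ...' is not in list
```

If that is what is happening here, the fix would be in the task's dataset/config (making the target match a choice exactly) rather than in the harness itself.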

I am using the following command:

lm_eval --model hf --model_args pretrained=microsoft/phi-2,trust_remote_code=True --tasks bigbench_gender_inclusive_sentences_german_multiple_choice --device cuda:0 --batch_size 1 --output_path output/phi-2-mmlu-arc --limit 2 --wandb_args project=lm-eval-harness-integration --log_samples

cc: @haileyschoelkopf
