Question about evaluating MMLU #1347

HypherX · 2024-01-24T09:38:11Z

Hi, thanks for the great work!
I want to evaluate on the MMLU benchmark. Because I cannot access the Hugging Face Hub, I downloaded the 'hails/mmlu_no_train' dataset from Hugging Face manually. But when I run the command 'lm_eval --model hf --model_args pretrained=/root/paddlejob/workspace/env_run/huitingfeng/models/llama-2-7b-chat-hf --tasks mmlu --device cuda:4 --batch_size 8', I get the following traceback:

ValueError: BuilderConfig 'logical_fallacies' not found. Available: ['default']

My datasets version is 2.16.1. When I downgrade datasets to 2.15.0, I get a different error:

column names don't match

How should I organize the downloaded MMLU benchmark and its corresponding YAML file?


haileyschoelkopf · 2024-01-24T15:02:21Z

Hi!

Could you share more about how you're downloading the dataset manually beforehand?

We describe a few ways to load from a local dataset here: https://github.com/EleutherAI/lm-evaluation-harness/blob/main/docs/new_task_guide.md#using-local-datasets which may be helpful. For instance, the latter example should work with dataset.save_to_disk()
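For reference, here is a minimal sketch of the config shape being discussed, written as a small Python helper so the nesting is explicit. The keys `dataset_path`, `dataset_kwargs`, `data_dir` and `dataset_name` are the ones that appear in this thread; the helper name `render_local_task_yaml` and the example paths are hypothetical. It only formats text, and whether the resulting config actually loads depends on how the files were saved.

```python
from typing import Optional


def render_local_task_yaml(task: str, dataset_path: str,
                           data_dir: Optional[str] = None,
                           dataset_name: Optional[str] = None) -> str:
    """Render a task YAML fragment that points at a local dataset.

    Hypothetical helper: it only formats text and does not check that
    the harness or `datasets` can load the result.
    """
    lines = [f"task: {task}", f"dataset_path: {dataset_path}"]
    if dataset_name is not None:
        lines.append(f"dataset_name: {dataset_name}")
    if data_dir is not None:
        # data_dir is nested under dataset_kwargs, so it is indented.
        lines.append("dataset_kwargs:")
        lines.append(f"  data_dir: {data_dir}")
    return "\n".join(lines) + "\n"


# Placeholder names and paths, for illustration only.
print(render_local_task_yaml("my_local_task", "my_dataset",
                             data_dir="/path/to/my_dataset/"))
```

Leaving `dataset_name` unset omits that key entirely, which may be worth trying when the local copy only exposes a single default config.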

haileyschoelkopf · 2024-02-07T13:56:24Z

Hi, please let us know if you're continuing to experience issues not solved by following the instructions in the linked docs page.

Jp-17 · 2024-05-12T13:12:10Z

I have the same problem. I used "dataset.save_to_disk()" to save the gsm8k dataset into "llm/dataset/gsm8k". However, when I set gsm8k.yaml to
"
task: try_gsm8k
dataset_path: /mnt/nfs/vault/jiangp/llm/dataset/gsm8k
dataset_name: main
"
or
"
task: try_gsm8k
dataset_path: gsm8k
dataset_kwargs:
  data_dir: /mnt/nfs/vault/jiangp/llm/dataset/gsm8k/
dataset_name: main
"
neither works, and both show the same error:
"
  File "/home/jiangp/.conda/envs/llm2/lib/python3.8/site-packages/datasets/builder.py", line 371, in __init__
    self.config, self.config_id = self._create_builder_config(
  File "/home/jiangp/.conda/envs/llm2/lib/python3.8/site-packages/datasets/builder.py", line 592, in _create_builder_config
    raise ValueError(
ValueError: BuilderConfig 'main' not found. Available: ['default']
"

Any help would be appreciated.

haileyschoelkopf · 2024-05-12T16:36:34Z

Hi @Jp-17, could you open a new issue for this, documenting what you're running into + steps to replicate? It sounds like the issue may be related to the use of save_to_disk() in datasets.
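One way to check whether a local folder was produced by save_to_disk() is to look at what is on disk. The sketch below assumes the layout that save_to_disk() writes: a `dataset_dict.json` at the top level for a DatasetDict, or `dataset_info.json` plus `state.json` for a single Dataset. The function name `describe_local_dataset` is hypothetical. Folders in that layout are normally read back with `datasets.load_from_disk()`; requesting a named config such as `main` against them is a plausible cause of the "Available: ['default']" error, though that is not confirmed in this thread.

```python
import json
from pathlib import Path


def describe_local_dataset(path: str) -> str:
    """Guess how a local dataset directory was produced.

    Assumes the on-disk layout written by `save_to_disk()`:
    a DatasetDict has `dataset_dict.json` at the top level, and a
    single Dataset has `dataset_info.json` and `state.json`.
    """
    root = Path(path)
    if not root.is_dir():
        return "missing"
    dict_file = root / "dataset_dict.json"
    if dict_file.is_file():
        # dataset_dict.json lists the split names saved alongside it.
        meta = json.loads(dict_file.read_text())
        splits = ", ".join(meta.get("splits", []))
        return f"save_to_disk DatasetDict (splits: {splits})"
    if (root / "state.json").is_file() and (root / "dataset_info.json").is_file():
        return "save_to_disk Dataset"
    # Anything else: raw data files, a hub snapshot, or an empty folder.
    return "other (raw data files or a hub snapshot)"
```

Calling `describe_local_dataset("/mnt/nfs/vault/jiangp/llm/dataset/gsm8k")` on the folder from the comment above would show which case applies before editing the YAML.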

Jp-17 · 2024-05-12T22:25:20Z

Thanks for your reply. I have just opened a new issue with more detailed info: #1829

haileyschoelkopf added the "asking questions" label (For asking for clarification / support on library usage.) Feb 7, 2024

haileyschoelkopf closed this as completed Feb 7, 2024

Jp-17 mentioned this issue May 12, 2024

eval gsm8k from local dataset folder with the bug info "ValueError: BuilderConfig 'main' not found." #1829

Open
