
Hello, I would like to know whether there is a way to use "generate_until" to evaluate on the CEval or CMMLU datasets. I'm using a chat model, which adds a prompt template so that the model answers in free form; as a result, its answer choice (e.g. A, B, C, D) is not necessarily the first generated token. #1362

Open
noforit opened this issue Jan 27, 2024 · 1 comment


@noforit

noforit commented Jan 27, 2024

No description provided.

@haileyschoelkopf
Collaborator

Hi!

We don't currently have a one-size-fits-all generative counterpart for the loglikelihood / multiple-choice tasks.

However, for CEval or CMMLU you could create a zero-shot CoT variant of the task, following _mmlu_flan_cot_zeroshot_template_yaml in this PR: https://github.com/EleutherAI/lm-evaluation-harness/pull/1356/files#diff-77e4a56ffcde165e312239a2abcef4943035c61df2eb09249d8872ce91f29658
