
Hello, I would like to know whether there is a way to use "generate_until" to evaluate on the CEval or CMMLU datasets. I'm using a chat model, which adds a prompt template so that the model answers in free form; as a result, its answer choice (e.g. A, B, C, D) is not necessarily the first generated token. #1362

Open
noforit opened this issue Jan 27, 2024 · 1 comment


@noforit

noforit commented Jan 27, 2024

No description provided.

@haileyschoelkopf
Collaborator

Hi!

We don't currently have a one-size-fits-all generative counterpart for the loglikelihood / multiple-choice tasks.

However, for CEval or CMMLU you could create a zero-shot CoT variant of the task, following _mmlu_flan_cot_zeroshot_template_yaml in this PR: https://github.com/EleutherAI/lm-evaluation-harness/pull/1356/files#diff-77e4a56ffcde165e312239a2abcef4943035c61df2eb09249d8872ce91f29658
