Truncation #1426
To add more context: with @mdocekal we found that the harness truncates the task description when using an n-shot prompt, at least for GPT-2-XL with the accelerate model. We would like the truncation to apply to the content of the "shots" instead: if a 10-shot prompt won't fit, fall back to e.g. 9-shot, or truncate from the last example, rather than truncating the preceding task description.
Might be a bit tricky. The task description is prepended to the fully constructed fewshot string here: `lm_eval/api/task.py` line 853 (commit 620d6a1).
If you only care about a specific model, one way could be to use a custom sampler and override the …
We found our "hacky" way to do what we wanted here. The question remains whether we should implement such a thing and open a pull request against lm-harness. We are developing a benchmark and were hoping people could use the harness for its evaluation. Do you think the truncation strategy could be specified with a user function in the YAML?
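One way such a user-specified strategy could look is a registry of named truncation callables that a YAML config references by name. This is only a sketch of the idea; the registry, the strategy names, and the `over_budget` hook are assumptions for illustration, not existing lm-harness API.

```python
# Hypothetical registry of truncation strategies. A YAML task config
# could name one of these, and the harness would look it up here.
TRUNCATION_STRATEGIES = {}

def register_truncation(name):
    """Decorator that registers a truncation strategy under a name."""
    def wrap(fn):
        TRUNCATION_STRATEGIES[name] = fn
        return fn
    return wrap

@register_truncation("drop_last_shot")
def drop_last_shot(examples, over_budget):
    """Drop trailing few-shot examples while the prompt is over budget.

    `examples` is a list of fully rendered few-shot strings;
    `over_budget` is a callable deciding whether the prompt built from
    the given examples still exceeds the model's context length.
    """
    kept = list(examples)
    while kept and over_budget(kept):
        kept.pop()  # remove the last shot first, never the description
    return kept
```

A task YAML could then carry something like `truncation_strategy: drop_last_shot`, keeping the task description untouched while shots are sacrificed.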
Is it possible to change the truncation strategy?
For example, say I want to remove a whole few-shot sample, or truncate each few-shot sample from the left/right by a fair number of tokens.
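The "remove a whole few-shot sample" variant can be sketched as fitting the shots to a token budget while leaving the description and query intact. The function name, the `encode` callable (e.g. a HuggingFace tokenizer's encode), and the budget parameter are all assumptions for illustration, not part of the harness.

```python
# Hypothetical sketch: drop trailing few-shot examples until the prompt
# fits within a token budget. The task description and the final query
# are never truncated here; only whole shots are removed.
def fit_fewshot(description, examples, query, encode, max_tokens):
    """encode: callable str -> list of token ids (e.g. tokenizer.encode)."""
    kept = list(examples)
    while kept:
        prompt = description + "".join(kept) + query
        if len(encode(prompt)) <= max_tokens:
            return prompt, len(kept)
        kept.pop()  # drop the last example first
    # Fall back to zero-shot if no example fits.
    return description + query, 0
```

Per-sample left/right truncation would instead slice each example's token ids before detokenizing, but the whole-sample version above is the simpler and less surprising default.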