Batch size auto is wrong? #1323
Thanks for reporting this, will check it out! I suspect that this is because Mistral has a max length of 32768 in its config, and out of caution we calculate our max batch size based on the model's reported max length, to ensure no OOMs will occur at that batch size. You should be able to get around this with …
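To illustrate why sizing the batch for the model's configured max length can collapse to batch size 1, here is a back-of-the-envelope sketch. The per-token memory cost and the helper are purely hypothetical numbers for illustration, not lm-eval's actual estimation logic:

```python
# Hedged sketch with assumed numbers: a conservative batch-size estimate based
# on the model's configured max length vs. one based on a realistic prompt length.
def max_batch(vram_bytes, seq_len, bytes_per_token):
    """Largest batch that fits if every sequence used `seq_len` tokens."""
    return max(1, vram_bytes // (seq_len * bytes_per_token))

VRAM = 40 * 1024**3       # the 40GB card mentioned in the report
PER_TOKEN = 1_300_000     # hypothetical activation cost per token, in bytes

conservative = max_batch(VRAM, 32768, PER_TOKEN)  # Mistral's reported max length
realistic = max_batch(VRAM, 2048, PER_TOKEN)      # a typical prompt length
```

With these assumed numbers the conservative estimate bottoms out at 1 while the realistic one allows a much larger batch, which matches the behavior described in the issue.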
Thank you! I thought the batch size calculation was based on the generated data (especially since it can be recalculated during evaluation).
I believe the first instance of batch size calculation is anomalous in this respect (which is perhaps worth changing).
@haileyschoelkopf could it make sense to just set the auto batch size to one rather than yielding …
Another option is to just run with a defined batch size and, if an OOM occurs, decrease it and repeat.
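The decrease-and-retry idea could be sketched as follows. `run_once` is a hypothetical callable standing in for a single evaluation run, and the `"out of memory"` substring check is an assumption about how the OOM error surfaces; this is not lm-eval's actual API:

```python
def run_with_fallback(run_once, start_batch_size=64):
    """Retry `run_once` with a halved batch size on each out-of-memory error.

    `run_once` is a hypothetical callable taking a batch size and returning
    results; names and the error check are illustrative assumptions.
    """
    batch_size = start_batch_size
    while batch_size >= 1:
        try:
            return run_once(batch_size)
        except RuntimeError as e:
            # Re-raise anything that is not an OOM, or an OOM at batch size 1
            # (there is nothing smaller left to try).
            if "out of memory" not in str(e).lower() or batch_size == 1:
                raise
            batch_size //= 2
    raise RuntimeError("No executable batch size found")
```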
A potential hacky workaround:

```python
batch_size = "auto"
try:
    results = run_evaluation(eval_request=eval_request, task_names=[task.benchmark], num_fewshot=task.num_fewshot,
                             batch_size=batch_size, device=DEVICE, use_cache=None, limit=LIMIT)
except RuntimeError as e:
    if "No executable batch size found" in str(e):
        batch_size = 1
        results = run_evaluation(eval_request=eval_request, task_names=[task.benchmark], num_fewshot=task.num_fewshot,
                                 batch_size=batch_size, device=DEVICE, use_cache=None, limit=LIMIT)
    else:
        raise
```
This seems like an improvement over the current code to me. Whether or not we write a more robust fix later, opening a PR with this would be an improvement.
@StellaAthena I think we can do that here:
Let me make a pull request real quick -- @StellaAthena, done (#1405); feel free to double-check that! |
Batch size `auto` is not working correctly (with `generate_until` tasks?).

```
lm_eval --model hf --model_args pretrained=HuggingFaceH4/zephyr-7b-alpha,dtype=bfloat16 --tasks polemo2_in --device cuda:0 --batch_size auto
```

The benchmark works with at least `batch_size 2` (on a 40GB VRAM card).