Accuracy gap between single GPU and multiple GPUs #1751
I'm using lm-eval v0.4.2 to evaluate Llama 7B on the Open LLM Leaderboard benchmarks, and I found accuracy gaps between a single GPU and multiple GPUs, as shown in the table below (I used data parallelism). The single-GPU run got overall lower accuracies; ARC-c, HellaSwag, and GSM8K drop by 0.3 to 0.5 points.

I thought data parallelism only speeds up the evaluation. Where does the difference come from?

Below is the command line I used for ARC-c. I use CUDA_VISIBLE_DEVICES to control the number of GPUs.
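The exact command was not preserved in this thread; a representative pair of invocations for the standard lm-eval v0.4.x CLI would look like the following. The checkpoint name, few-shot count, and batch size are illustrative placeholders, not necessarily the ones from the original runs.

```bash
# Single GPU (placeholder checkpoint; the leaderboard setting for ARC-c is 25-shot)
CUDA_VISIBLE_DEVICES=0 lm_eval --model hf \
  --model_args pretrained=huggyllama/llama-7b \
  --tasks arc_challenge --num_fewshot 25 --batch_size 8

# 4 GPUs, data parallel: one full model replica per GPU, launched via accelerate
CUDA_VISIBLE_DEVICES=0,1,2,3 accelerate launch -m lm_eval --model hf \
  --model_args pretrained=huggyllama/llama-7b \
  --tasks arc_challenge --num_fewshot 25 --batch_size 8
```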
Comments

Thank you for your efforts! Great table with results to compare! Please check other issues/discussions about speed, batches, and multiple-GPU usage for ideas.

Thanks @LSinev for the quick reply. However, I didn't find an issue about differences between different numbers of GPUs.

No idea. According to the results in your table, it is also a task-dependent issue. If even just the batch size makes a difference, I suppose multiple GPUs may introduce even more difference.
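For intuition on that last point, here is a minimal, assumed sketch (not code from lm-eval) of why batching alone can shift floating-point results: GEMM kernels tile and reduce in a shape-dependent order, so the same row can come out slightly different depending on what it is batched with. Since multiple-choice tasks like ARC-c are scored by comparing answer log-likelihoods, a near-tied example can flip.

```python
import torch

torch.manual_seed(0)
device = "cuda" if torch.cuda.is_available() else "cpu"
dtype = torch.float16 if device == "cuda" else torch.float32

# One linear layer stands in for a transformer matmul.
layer = torch.nn.Linear(4096, 4096, device=device, dtype=dtype)
x = torch.randn(8, 4096, device=device, dtype=dtype)

with torch.no_grad():
    full = layer(x)                                  # scored as one batch of 8
    split = torch.cat([layer(x[:4]), layer(x[4:])])  # scored as two batches of 4

# Often not bitwise equal; the gap is tiny but can flip near-tied comparisons.
print("bitwise equal:", torch.equal(full, split))
print("max abs diff: ", (full - split).abs().max().item())
```

Data parallelism tends to amplify this, because each rank sees a different shard of the dataset, so examples are grouped (and padded) differently than in a single-process run. Neither set of results is wrong; the gap sits within normal run-to-run numeric noise.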