QA inferencer very slow because of bad default multiprocessing settings #3272
Comments
Confirming performance speedup when using FARMReader.
Considering the findings of issue #3289, my guess is that the multiprocessing present in the …
Updated the notebook to include TransformersReader performance. TransformersReader (no multiprocessing) is slightly faster than FARMReader. The performance difference in the Colab looks larger than it actually is: I ran these simple performance tests several times in a more controlled environment, and TransformersReader is about 15% faster than FARMReader.
Describe the bug
When running pipeline.eval I noticed that it is very slow and also prints far too many lines of tqdm output.
So I benchmarked it with and without multiprocessing. The results are striking:
MP on
Elapsed: 22.35
MP off
Elapsed: 6.079
To Reproduce
I benchmarked the pipeline.eval function on a V100 with only 10 labels and retriever top_k=10,
with FARMReader's num_processes parameter set to 0 (disable multiprocessing) or None (use all available processes).
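A minimal sketch of how the "Elapsed" numbers above could be measured. The `timed` helper is hypothetical (not part of Haystack); the `sum` call is a stand-in workload where the real benchmark would call `pipeline.eval(...)` with the reader constructed as `FARMReader(..., num_processes=0)` versus `num_processes=None`.

```python
import time

def timed(fn, *args, **kwargs):
    """Run fn and return (result, elapsed_seconds) -- the pattern behind
    the 'Elapsed: ...' numbers reported above."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start

# Stand-in workload; in the real benchmark this would be pipeline.eval(...)
_, elapsed = timed(sum, range(1_000_000))
print(f"Elapsed: {elapsed:.3f}")
```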
Additional insights
I think the problematic part is in the FARMReader.predict method, which calls:
Since multiprocessing_chunksize is hard-coded to 1, it splits up all the data, even when there is very little of it, and feeds it to the GPU one item at a time. This results in tiny batches and therefore poor GPU utilization for basically any retriever + reader usage...
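To illustrate the chunking behaviour described above: with a chunk size hard-coded to 1, N documents become N single-item chunks, so the GPU effectively sees batch size 1 no matter what batch settings are configured. The `split_into_chunks` helper below is a hypothetical stand-in for the chunking inside the inferencer, not Haystack's actual code.

```python
def split_into_chunks(items, chunksize):
    """Split items into lists of at most `chunksize` elements."""
    return [items[i:i + chunksize] for i in range(0, len(items), chunksize)]

docs = list(range(10))                  # pretend these are 10 retrieved documents
tiny = split_into_chunks(docs, 1)       # chunksize=1 -> 10 chunks of 1 item each
batched = split_into_chunks(docs, 10)   # larger chunksize -> 1 chunk of 10 items

print(len(tiny), len(batched))  # -> 10 1
```

With chunksize 1, every document incurs the full inter-process and GPU-dispatch overhead on its own, which is consistent with the ~4x slowdown measured above.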