Continuously increasing RAM with Pre-training #77
Comments
Do you have custom code which could have a memory leak?
We have a custom dataloader that loads images and text from a parquet file.
We have 1 million images stored on disk and we have prepared the JSON file as described in the GitHub README. The dataloader we have loads the JSON file into memory in the … Not sure why the RAM utilization is so high. Any idea, please? Thanks
Hi, it could be related to the dataloader.
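One frequent culprit with PyTorch DataLoaders (a hypothesis here, not confirmed for this repo) is a Dataset that holds a large Python list of annotation dicts: each forked worker touches the objects' refcounts, copy-on-write duplicates the pages, and RSS grows with the number of workers and the length of training. A minimal sketch of the usual mitigation, serializing the records into numpy byte buffers so workers never dirty their pages; the class name and record keys are illustrative:

```python
import json
import numpy as np
from torch.utils.data import Dataset


class AnnotationDataset(Dataset):
    """Illustrative dataset that avoids per-worker copy-on-write growth.

    A Python list of dicts gets its refcounts touched in every forked
    worker, so the pages holding the objects are duplicated per worker.
    Packing each record into one numpy byte blob sidesteps Python
    refcounting entirely.
    """

    def __init__(self, ann_path):
        with open(ann_path) as f:
            records = json.load(f)  # assumed: list of {"image": ..., "caption": ...}
        blobs = [json.dumps(r).encode("utf-8") for r in records]
        # Cumulative end offsets let us slice individual records back out.
        self.addr = np.cumsum([len(b) for b in blobs])
        self.data = np.frombuffer(b"".join(blobs), dtype=np.uint8)

    def __len__(self):
        return len(self.addr)

    def __getitem__(self, idx):
        start = 0 if idx == 0 else self.addr[idx - 1]
        record = json.loads(self.data[start:self.addr[idx]].tobytes())
        # ... load the image from record["image"] and return tensors ...
        return record
```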
We ended up using the …
We are happy to follow any other debugging steps to make this work. Thanks!
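One concrete first step is to log the process RSS while iterating the loader, to see whether growth tracks the dataloader or the training step. A sketch, assuming `psutil` is installed and the loader is whatever pretrain.py builds:

```python
import os
import psutil  # pip install psutil


def log_rss_every(loader, every=200):
    """Iterate a DataLoader and print the process RSS periodically.

    If RSS keeps climbing even with num_workers=0, the leak is in the
    training loop itself rather than in the worker processes.
    """
    proc = psutil.Process(os.getpid())
    for step, batch in enumerate(loader):
        if step % every == 0:
            print(f"step {step}: RSS {proc.memory_info().rss / 1e9:.2f} GB")
        yield batch


# usage inside the training loop (train_loader is your DataLoader):
# for batch in log_rss_every(train_loader):
#     ...forward/backward...
```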
Was wondering if there has been any update on this. We ran pretrain.py and saw the same issue: RAM usage increases while the JSON files are being read and at some point RAM explodes. For pre-training, what Python version did you use, and how much RAM did the machine have?
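If the blow-up happens while the annotation JSON is being parsed (before training even starts), one option, assuming the files are top-level JSON arrays as in the repo's README format, is to stream them with `ijson` instead of materializing everything via `json.load`; the file name below is illustrative:

```python
import ijson  # pip install ijson


def iter_annotations(path):
    """Yield annotation records one at a time instead of loading the whole file.

    Assumes a top-level JSON array: [{"image": ..., "caption": ...}, ...]
    """
    with open(path, "rb") as f:
        for record in ijson.items(f, "item"):
            yield record


# Example: count pairs without holding 1M dicts in memory at once.
n = sum(1 for _ in iter_annotations("pretrain_annotations.json"))
print(n)
```

Note that streaming only helps if the initial parse is what exhausts RAM; a Dataset that needs random access still has to build some index over the records.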
@abhisheksgumadi @asgsaeid |
Thanks, will take a look |
Have you solved this problem? Could you kindly provide some suggestions?
Dear Team,
I am using the pre-training script to pre-train BLIP on a custom dataset (around 1M image/text pairs).
I see that the machine's RAM utilization continuously increases until it reaches 100%. The machine has 120 GB of RAM!
Any idea where the problem could be?