
RuntimeError: CUDA out of memory #11

Open
happygirlzt opened this issue Aug 1, 2022 · 9 comments

Comments

@happygirlzt

Hi there, thank you very much for open-sourcing the work!
I wonder what devices you used for this work. I tried to run training on a machine with 8 Tesla V100-SXM2-16GB GPUs, but could not get it to fit in memory. I also found that the code only utilizes 2 GPUs, even though I did not restrict it; I modified the device setting inside run.py, but still only 2 GPUs are used.
Please kindly advise. Thank you in advance!
[Screenshot attached, 2022-08-01]

@happygirlzt (Author)

Hi @pkuzqh, I've hit another issue when running the code.

[Screenshot attached, 2022-08-02]

@pkuzqh (Owner) commented Aug 3, 2022

If you want to change the batch size, change the number in the "args" dict. If you want to use multiple GPUs, modify `model = nn.DataParallel(model, device_ids=[0, 1])`.
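A minimal sketch of both changes, assuming a stand-in model and an illustrative "batch_size" key (the repository's actual `args` dict and model come from run.py):

```python
import torch
import torch.nn as nn

# Stand-in for the repository's model; the real one is built in run.py.
model = nn.Linear(8, 2)

# 1. The batch size lives in the "args" dict (key name assumed here).
args = {"batch_size": 16}

# 2. List every GPU DataParallel should use; [0, 1] is the repo's default.
if torch.cuda.is_available():
    model = model.cuda()
device_ids = [0, 1, 2] if torch.cuda.device_count() >= 3 else None
model = nn.DataParallel(model, device_ids=device_ids)

# DataParallel splits each batch along dim 0 across the listed GPUs
# (on a CPU-only machine it simply runs the wrapped module unchanged).
out = model(torch.randn(args["batch_size"], 8))
print(tuple(out.shape))
```

Lowering `args["batch_size"]` is the usual first fix for the out-of-memory error; adding GPUs to `device_ids` shrinks the per-GPU slice of each batch instead.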

@happygirlzt (Author)

Hi @pkuzqh, thank you for the reply. The CUDA out-of-memory issue is resolved, but I ran into the new error shown above. Please kindly advise, thanks.

@pkuzqh (Owner) commented Aug 3, 2022

How many GPUs are you using, and what is the batch size?

@happygirlzt (Author)

Three. I set device_ids=[1, 2, 3] in train(), and the batch size is 16.

@pkuzqh (Owner) commented Aug 5, 2022

You need to change the number "4" in lines 103-106 to a multiple of 3, and the batch size also needs to be a multiple of 3.
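The divisibility requirement can be illustrated with plain arithmetic; the `shards()` helper below is hypothetical and only mirrors what an even per-GPU split demands, not the repository's code:

```python
def shards(batch_size, n_gpus):
    """Split a batch evenly across n_gpus; fail if it does not divide."""
    if batch_size % n_gpus != 0:
        raise ValueError(f"batch size {batch_size} is not a multiple of {n_gpus}")
    return [batch_size // n_gpus] * n_gpus

print(shards(18, 3))   # [6, 6, 6]: a multiple of 3 splits evenly over 3 GPUs
try:
    shards(16, 3)      # the batch size of 16 mentioned above does not
except ValueError as err:
    print(err)
```

So with device_ids=[1, 2, 3], a batch size such as 15 or 18 divides cleanly, while 16 leaves a remainder.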

@happygirlzt (Author)

OK, thank you very much @pkuzqh! It runs now. However, I saw that in train() the number of epochs is 100000 (for epoch in range(100000):). Is that intended?

@happygirlzt (Author)

BTW, for inference it looks like testDefect4j.py can only use 1 GPU? I have 4 GPUs, but only one was used, and it caused an OOM issue.
[Screenshot attached, 2022-08-08]

@pkuzqh (Owner) commented Aug 12, 2022

You can use `nn.DataParallel` to run on multiple GPUs in testDefect4J.py.
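A minimal inference sketch, assuming a stand-in model in place of the one testDefect4J.py actually loads (with no device_ids argument, DataParallel uses every visible GPU):

```python
import torch
import torch.nn as nn

model = nn.Linear(8, 2)            # stand-in for the model testDefect4J.py loads
if torch.cuda.is_available():
    model = model.cuda()
model = nn.DataParallel(model)     # no device_ids: use all visible GPUs
model.eval()

with torch.no_grad():              # inference only: no gradients, less GPU memory
    preds = model(torch.randn(4, 8))
print(tuple(preds.shape))
```

Wrapping with `torch.no_grad()` also helps with the OOM here, since inference does not need the activation buffers that training keeps for backpropagation.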
