Learn.py takes all GPU memory #612

r-lipton · 2018-04-11T07:39:08Z

Hello,

When I run training with learn.py, the process allocates all of the GPU's memory.
Is there a way to avoid this and make it take only what it needs?

Thanks

Hengoo · 2018-04-11T09:00:26Z

I don't think the problem is with TensorFlow or ml-agents (when I start training, it uses about 20 MB of VRAM).

You should check whether your game is doing what you think it does. If you have some kind of memory leak, remember that the game is played 100 times as fast as normal, so that might amplify the problem.

mmattar · 2018-04-11T17:14:17Z

Hi @r-lipton, is this using one of our sample environments or your own? Generally, as @Hengoo and @MarcoMeter pointed out, we haven't noticed this on our environments.

Sohojoe · 2018-04-11T22:21:58Z

I have seen this problem with OpenAI.Baselines when invoking a 2nd training run. Setting gpu_options.allow_growth = True fixed it for me.

In trainer_controller.py, replace the `with tf.Session` statement on line 212 with:

    # Allocate GPU memory on demand instead of reserving all of it up front.
    config = tf.ConfigProto()
    config.gpu_options.allow_growth = True
    with tf.Session(config=config) as sess:

Update: I tested this today and was able to run multiple training runs concurrently on a single GPU
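
For anyone who wants to try this outside of trainer_controller.py, here is a minimal standalone sketch of the same idea. The helper names `gpu_option_kwargs` and `make_session` are made up for illustration and are not part of ml-agents. The optional `memory_fraction` cap (TensorFlow's `per_process_gpu_memory_fraction`) is an extra I added that was not part of my fix above, and the sketch assumes a TensorFlow build that exposes the 1.x session API, either directly or under `tf.compat.v1`.

```python
def gpu_option_kwargs(allow_growth=True, memory_fraction=None):
    """Build keyword arguments for TensorFlow's GPUOptions message.

    Hypothetical helper for illustration; not part of ml-agents.
    """
    kwargs = {"allow_growth": allow_growth}
    if memory_fraction is not None:
        if not 0.0 < memory_fraction <= 1.0:
            raise ValueError("memory_fraction must be in (0, 1]")
        # Optional cap on the share of GPU memory this process may claim.
        kwargs["per_process_gpu_memory_fraction"] = memory_fraction
    return kwargs


def make_session(allow_growth=True, memory_fraction=None):
    """Create a session that grows GPU memory on demand (hypothetical helper)."""
    # Imported lazily so the option-building logic can be used without TensorFlow.
    import tensorflow as tf

    # Use the 1.x-style API: tf.compat.v1 where it exists, plain tf otherwise.
    tf1 = getattr(tf.compat, "v1", tf)
    gpu_options = tf1.GPUOptions(**gpu_option_kwargs(allow_growth, memory_fraction))
    config = tf1.ConfigProto(gpu_options=gpu_options)
    return tf1.Session(config=config)
```

With this, `with make_session() as sess:` should behave like the three-line replacement above, and something like `make_session(memory_fraction=0.4)` would additionally limit each process to roughly 40% of the card, which may help when several runs share one GPU.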

r-lipton · 2018-04-13T12:54:17Z

It's using an environment I created myself.
The solution of @Sohojoe worked for me, thanks!

awjuliani · 2018-09-06T20:12:39Z

Hi all. I've made a PR for this, and it will be added to the v0.5 release. #1192

lock · 2020-01-03T02:58:15Z

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

mmattar added the help-wanted and needs-info labels Apr 11, 2018

r-lipton closed this as completed Apr 13, 2018

Sohojoe mentioned this issue Jun 30, 2018

How and where to reduce GPU memory usage for Tensorflow-GPU #809

Closed

lock bot locked as resolved and limited conversation to collaborators Jan 3, 2020
