Graph is always saved, resulting in large log files #300

Open
dniku opened this issue May 1, 2019 · 5 comments
Labels
enhancement New feature or request

Comments

@dniku

dniku commented May 1, 2019

Apparently, there is no way to prevent the computational graph from being saved. This results in large Tensorboard log files for each training run (~20 MB for an Atari/PPO2/NatureCNN run that is terminated almost immediately after launching).
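For context, a minimal sketch of such a setup, roughly following the stable-baselines Atari PPO2 example (the environment id, log directory, and timestep count are placeholders):

from stable_baselines.common.cmd_util import make_atari_env
from stable_baselines.common.vec_env import VecFrameStack
from stable_baselines import PPO2

# 'CnnPolicy' uses the NatureCNN feature extractor by default
env = VecFrameStack(make_atari_env('PongNoFrameskip-v4', num_env=4, seed=0), n_stack=4)
model = PPO2('CnnPolicy', env, verbose=1, tensorboard_log='./tb_logs')
model.learn(total_timesteps=1000)  # even a short run like this reportedly leaves a ~20 MB event file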

@araffin
Collaborator

araffin commented May 2, 2019

Hello,

This results in large Tensorboard log files for each training run

I would say that is an expected feature when using tensorboard logging. Also, 20 MB is quite small compared to the capacity of current hard drives.

is terminated almost immediately after launching

Why would you terminate one run immediately after launching?

@dniku
Author

dniku commented May 2, 2019

My use case is currently as follows. I am debugging some code in a Colab GPU session. In doing so, I restart runs very often (to see some debugging info or to check whether my changes fix the crash I am trying to debug). Each restart is logged automatically to Google Drive. When I am satisfied with the state of my code, I start full-scale training. When training is complete, I download all logging output to my machine to work with it locally, as analyzing output from within a Colab session is inconvenient. However, with my current setup, debugging output is mixed with output from normal training runs, so when I download the logs, a large fraction of them is useless.

I could of course tweak my current setup to add an option for disabling logging while I am debugging code, but that seems error-prone, as I may easily forget to re-enable it before starting a training run. In any case, an option like save_graph seems useful, since you probably don't need to log the graph that often.
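For illustration, such an option might look like this from the user side (save_graph is purely hypothetical here and is not an existing stable-baselines argument; env is assumed to be defined as in the snippet above):

# hypothetical API sketch -- `save_graph` does not exist in stable-baselines
model = PPO2('CnnPolicy', env, verbose=1,
             tensorboard_log='./tb_logs',
             save_graph=False)  # keep scalar summaries, skip writing the graph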

@araffin
Collaborator

araffin commented May 2, 2019

I could of course tweak my current setup to add an option for disabling logging while I am debugging code, but that seems error-prone, as I may easily forget to re-enable it before starting a training run

Defining:

tensorboard_log = None if DEBUG_MODE else 'path/to/log_folder'

with a big warning (printing the current state of your training: DEBUG_MODE or not) before starting training does not seem so error-prone to me.

Something like:

if DEBUG_MODE:
    print("=" * 20)
    print("WARNING: DEBUG_MODE enabled, no tensorboard log will be saved")
    print("=" * 20)
    # with maybe a time.sleep()

model.learn()

Also, you should keep verbose=1 in both cases.
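Putting that suggestion together, a runnable sketch might look like this (the environment, log path, and timestep count are arbitrary placeholders):

import time
import gym
from stable_baselines import PPO2

DEBUG_MODE = True  # flip to False for a real training run

# no tensorboard directory -> no event file (and hence no graph) gets written
tensorboard_log = None if DEBUG_MODE else 'path/to/log_folder'

if DEBUG_MODE:
    print("=" * 20)
    print("WARNING: DEBUG_MODE enabled, no tensorboard log will be saved")
    print("=" * 20)
    time.sleep(2)  # pause so the warning is hard to miss

env = gym.make('CartPole-v1')
model = PPO2('MlpPolicy', env, verbose=1, tensorboard_log=tensorboard_log)
model.learn(total_timesteps=10000)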

@araffin araffin added the enhancement (New feature or request) label on May 2, 2019
@araffin
Collaborator

araffin commented May 2, 2019

Thinking again about the issue, I'm wondering what that self.graph contains that needs to be saved. @hill-a?

EDIT: at first, I thought it was the computation graph, but from what I remember, that one is created at the beginning of the run, not at the end.

@hill-a
Owner

hill-a commented May 2, 2019

I believe the self.writer.add_graph(self.graph) call is used to save the structure of the training graph, often for debugging purposes.
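If an opt-out were added, it would presumably amount to guarding that call, along these lines (a sketch only, not the actual stable-baselines code; save_graph is a hypothetical attribute):

# hypothetical guard around the existing call (`save_graph` is not a real attribute)
if getattr(self, 'save_graph', True):
    self.writer.add_graph(self.graph)  # the write this issue suggests is inflating the event files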
