Graph is always saved, resulting in large log files #300
Comments
Hello,
I would say that is an expected feature when using tensorboard logging. And 20 MB is quite small compared to the capacity of current hard drives.
Why would you terminate a run immediately after launching it?
My use case is currently as follows. I am debugging some code in a Colab GPU session. In doing so, I restart runs very often (to see some debugging info or to check whether my changes fix the crash I am trying to debug). Each restart is logged automatically to Google Drive. When I am satisfied with the state of my code, I start full-scale training. When training is complete, I download all logging output to my machine to work with it locally, as analyzing output from a Colab session is inconvenient. However, with my current setup, debugging output is mixed with output from normal training runs, so when I download the logs, a large fraction of them is useless. I could of course tweak my setup to add an option for disabling logging while I am debugging, but that seems error-prone, as I may easily forget to re-enable it before starting a training run. In any case, an option like
defining:
    tensorboard_log = None if DEBUG_MODE else 'path/to/log_folder'
with a big warning (printing the current state of your training). Something like:
    if DEBUG_MODE:
        print("=" * 20)
        print("WARNING: DEBUG_MODE enabled, no tensorboard log will be saved")
        print("=" * 20)
        # with maybe a time.sleep()
    model.learn()
Also, you should keep
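The suggestion above can be wrapped in a small helper so that the warning and the choice of log path always happen together. This is only a sketch: the function name `resolve_tensorboard_log`, the `pause_seconds` parameter, and reading the flag from a `DEBUG_MODE` environment variable are assumptions made for illustration, not part of the library. The returned value is meant to be passed as the `tensorboard_log` argument mentioned above.

```python
import os
import time


def resolve_tensorboard_log(debug_mode, log_dir, pause_seconds=0.0):
    """Return the value to use for `tensorboard_log`.

    Hypothetical helper: returns None (no logging) when debugging,
    otherwise returns `log_dir` unchanged.
    """
    if debug_mode:
        print("=" * 20)
        print("WARNING: DEBUG_MODE enabled, no tensorboard log will be saved")
        print("=" * 20)
        # Optional pause so the warning is not scrolled away immediately.
        time.sleep(pause_seconds)
        return None
    return log_dir


# Assumed convention: debugging is opt-in through an environment variable,
# so logging stays enabled unless DEBUG_MODE=1 is set explicitly.
DEBUG_MODE = os.environ.get("DEBUG_MODE", "0") == "1"
tensorboard_log = resolve_tensorboard_log(DEBUG_MODE, "path/to/log_folder")
```

Making debug mode opt-in rather than opt-out may reduce the risk raised earlier in the thread of forgetting to re-enable logging before a real training run, since the default path keeps logging on.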
Thinking again about the issue, I'm wondering what does that EDIT: at first, I thought it was the computation graph, but from what I remember, that one is created at the beginning of the run, not the end.
I believe the
Apparently, there is no way to prevent the computational graph from being saved. This results in large TensorBoard log files for each training run (~20 MB for an Atari/PPO2/NatureCNN run that is terminated almost immediately after launching).