colab-helper

Utility files to help set up colab for experimentation + development

The idea is to let you put :

! git clone https://github.com/mdda/colab_helper
from colab_helper import utils as chu

at the top of a notebook, and have a bunch of useful stuff ready-to-go (you can choose the name under which to import it, so as to avoid collisions with your existing code).

Google Drive helper

This mounts your Google Drive at (per convention) ~/gdrive, and optionally adds a link so that you can use a path that avoids the awkward space character introduced by 'My Drive' :

chu.gdrive_mount(point='gdrive', link='my_drive')
! ls -l my_drive/*
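
The linking idea can be sketched in plain Python. This is only an illustration, not the code inside chu.gdrive_mount: the helper name link_without_space is hypothetical, and a scratch directory stands in for the mounted drive (in Colab the mount itself is normally done with google.colab.drive.mount).

```python
import os
import tempfile

def link_without_space(target_dir, link_path):
    """Create (or reuse) a symlink so that `link_path` points at `target_dir`.

    Hypothetical helper for illustration; not part of colab_helper.
    """
    if os.path.islink(link_path):
        # Already linked: leave it alone so re-running the cell is harmless.
        return link_path
    os.symlink(target_dir, link_path, target_is_directory=True)
    return link_path

# A scratch directory stands in for the Colab filesystem.
scratch = tempfile.mkdtemp()
drive_dir = os.path.join(scratch, "gdrive", "My Drive")  # note the space
os.makedirs(drive_dir)
open(os.path.join(drive_dir, "notes.txt"), "w").close()

link = link_without_space(drive_dir, os.path.join(scratch, "my_drive"))
listing = os.listdir(link)  # same contents, reached via a space-free name
print(listing)
```

Paths under the link can then be passed to shell commands without quoting.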

Downloader/Unwrapper

This downloads data and unwraps archives by default, skipping both steps when the required files are already present.

Single file (no unwrap required):

chu.download('https://redcatlabs.com/'
             +'downloads/deep-learning-workshop/notebooks/data/RNN/'
             +'glove.first-100k.6B.50d.txt')

More complex .tar.gz example (the dest_path parameter allows it to check on whether the unwrapped files have appeared in a particular directory) :

chu.download('https://www.openslr.org/'
             +'resources/1/waves_yesno.tar.gz', 
             dest_path='waves_yesno')
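
The skip-if-present behaviour can be sketched as below. This is a minimal illustration under stated assumptions, not the implementation of chu.download: the function download_if_missing and its fetch parameter are hypothetical, unwrapping is left out, and the demonstration uses a stand-in fetcher so that no network access is needed.

```python
import os
import tempfile
import urllib.request

def download_if_missing(url, target, fetch=urllib.request.urlretrieve):
    """Fetch `url` into `target` unless `target` already exists.

    Returns True if a download happened, False if it was skipped.
    Hypothetical helper for illustration; not part of colab_helper.
    """
    if os.path.exists(target):
        return False   # required file (or unwrapped directory) is already there
    fetch(url, target)  # an archive would additionally be unpacked at this point
    return True

# Stand-in fetcher: records the request and writes a placeholder file.
fetched = []
def fake_fetch(url, target):
    fetched.append(url)
    with open(target, "w") as f:
        f.write("placeholder")

target = os.path.join(tempfile.mkdtemp(), "glove.txt")
first = download_if_missing("https://example.invalid/glove.txt", target, fetch=fake_fetch)
second = download_if_missing("https://example.invalid/glove.txt", target, fetch=fake_fetch)
print(first, second, len(fetched))
```

Checking for the end result rather than for the downloaded archive is what makes re-running a notebook cell cheap.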

Kaggle Credentials helper

Generate the kaggle.json file and upload it to Colab, or just use your username and key in-line :

! pip install kaggle
chu.kaggle_credentials(file='./kaggle.json')
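
For orientation, the Kaggle CLI reads a JSON file with 'username' and 'key' fields, normally from ~/.kaggle/kaggle.json, and expects it to be readable only by its owner. The sketch below writes such a file by hand. The function write_kaggle_credentials and the values passed to it are hypothetical placeholders, it is not how chu.kaggle_credentials is implemented, and it writes into a scratch directory rather than your real home directory.

```python
import json
import os
import stat
import tempfile

def write_kaggle_credentials(username, key, config_dir):
    """Write a kaggle.json with owner-only permissions into `config_dir`.

    Hypothetical helper for illustration; not part of colab_helper.
    """
    os.makedirs(config_dir, exist_ok=True)
    path = os.path.join(config_dir, "kaggle.json")
    with open(path, "w") as f:
        json.dump({"username": username, "key": key}, f)
    os.chmod(path, 0o600)  # restrict the file to its owner
    return path

# Placeholder credentials, written to a scratch directory for the demonstration.
demo_dir = os.path.join(tempfile.mkdtemp(), ".kaggle")
path = write_kaggle_credentials("example_user", "0123456789abcdef", demo_dir)

with open(path) as f:
    creds = json.load(f)
mode = stat.S_IMODE(os.stat(path).st_mode)
print(sorted(creds), oct(mode))
```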

Then you can access the Kaggle CLI (see also the Kaggle API docs):

# Description page : https://www.kaggle.com/ronitf/heart-disease-uci
! kaggle datasets download ronitf/heart-disease-uci

SSH Reverse Proxy

This is for expert use only. If you don't know what this is doing, or how to get it to run, then this isn't something you should be messing with.

Note also that this is far more security-conscious than other scripts you might find on the web : It doesn't allow logins via passwords, for instance, nor does it execute arbitrary scripts downloaded from a url.

Example use (it will print out the required local ssh command) :

chu.ssh_reverse_proxy("""
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDEQbFcc8U/XMIUoATs+jGFIPMREgMlsLAnatzcc
OTHERSTUFFOTHERSTUFFOTHERSTUFFOTHERSTUFFOTHERSTUFFOTHERSTUFFOTHERSTUFFOTHERSTUFFOTHERSTUFF
ihku00gbBwSOu2M38GMdGV9qU9XrEkLSjD/1WtzYJZL7buzpitlGlTvhnqQT+t [email protected]
""")

The pub_key field cleans out any line-breaks pasted in from ~/.ssh/id_rsa.pub for your convenience.
And, as an aside, there's no problem leaving your public key(s) in the colab file itself, since that's not the private key bit (obviously).
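
The line-break cleaning can be pictured roughly as follows. This is a sketch under the assumption that each pasted line is stripped and the pieces are joined directly; the helper join_pasted_key is hypothetical, the actual clean-up in ssh_reverse_proxy may differ, and the key material is a shortened placeholder.

```python
def join_pasted_key(pasted):
    """Strip every line and join the pieces so the key is a single line again.

    Hypothetical helper for illustration; not part of colab_helper.
    """
    return "".join(line.strip() for line in pasted.splitlines())

# A (shortened, fake) public key that picked up line breaks when pasted.
pasted = """
ssh-rsa AAAAB3Nza
C1yc2EAAAADAQAB
AAABAQ user@host
"""

key = join_pasted_key(pasted)
print(key)
```

The spaces that separate the key type, the key body and the trailing comment survive because they sit inside a line rather than at a line end.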

Using the rsync command given in the output, one can then set up an auto-sync-to-colab for locally edited files (use the %autoreload 2 magic to have the updated code reloaded transparently as you run the notebook cells) :

while rsync-command-from-colab_helper; do inotifywait -qqre close_write,move,create,delete code/; done

File thinning

To reduce the number of saved checkpoints to 3 recent ones, plus 7 others with 'round numbers', simply :

! git clone https://github.com/mdda/colab_helper
from colab_helper import files as chf

chf.thin_numbered_files('./checkpoints/2019-07-26_01-clipnorm', delete=True)
# This will return a dict(keep=[], delete=[], comment=''),
# and if 'delete=True' is passed in, the files in the 'delete' list will have been removed (careful!)

There are obviously more options, but the simple library is intended to 'do the right thing' (and returns its suggestions if delete=False, which is the default value). For instance, it should be able to figure out which filenames form a series (with the longest string prefix), and then extract the epoch/step numbers. Of course, it's assumed that the files are saved using some kind of convention, with the epoch/step as the first group of digits, like :

model_0040000_58.5306.pth.tar  
# ...
model_0085000_49.8920.pth.tar
# ...
model_0115000_46.9202.pth.tar
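
To make the convention concrete, here is a small sketch of how such a selection could work. It is not the algorithm in files.py: the functions step_of, trailing_zeros and thin are hypothetical, 'round' is assumed to mean 'has many trailing zeros', series detection by common prefix is left out, and nothing is deleted from disk.

```python
import re

def step_of(filename):
    """Return the first group of digits in `filename` as an int (None if absent)."""
    match = re.search(r"\d+", filename)
    return int(match.group()) if match else None

def trailing_zeros(n):
    """Count trailing decimal zeros, used here as a crude 'roundness' score."""
    count = 0
    while n and n % 10 == 0:
        n //= 10
        count += 1
    return count

def thin(filenames, recent=3, rounded=7):
    """Suggest which files to keep: the `recent` latest plus `rounded` round ones."""
    by_step = sorted((f for f in filenames if step_of(f) is not None), key=step_of)
    keep = set(by_step[-recent:]) if recent else set()
    older = [f for f in by_step if f not in keep]
    # Assumption: rounder steps win, and ties go to the later step.
    older.sort(key=lambda f: (trailing_zeros(step_of(f)), step_of(f)), reverse=True)
    keep.update(older[:rounded])
    return dict(keep=[f for f in by_step if f in keep],
                delete=[f for f in by_step if f not in keep])

# Twelve made-up checkpoint names, with steps 5000, 10000, ..., 60000.
names = ["model_{:07d}.pth.tar".format(s) for s in range(5000, 65000, 5000)]
result = thin(names, recent=2, rounded=2)
kept_steps = [step_of(f) for f in result["keep"]]
print(kept_steps, len(result["delete"]))
```

Returning the suggestion as data, and only deleting on request, mirrors the delete=False default described above.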
