HowToBasic for working efficiently and collaborating constructively on machines

willpower057/medg-toolbox

 
 

How to Work with Computers

Overview

Terminal

  • Grab a good terminal application! (I personally use iTerm2 for Mac)
  • Grab a good font! (I personally use Meslo 13pt)
  • Grab a good screen!

Mobile Shell (mosh)

Ever feel frustrated when

  • you have to reconnect to the server every time your computer wakes up?
  • it lags while you are on tethered internet/airplane WiFi?

No sweat!

local~$ mosh $MY_ID@nightingale.csail.mit.edu

Jupyter Notebook

Ever feel frustrated when

  • you have to remember every argument to start the notebook properly?
  • you have to copy the port over to the browser?
  • you get
    /usr/bin/xdg-open: 778: /usr/bin/xdg-open: iceweasel: not found
    /usr/bin/xdg-open: 778: /usr/bin/xdg-open: seamonkey: not found
    ...

No sweat!

# remote:~/.bashrc
alias notebook="jupyter notebook --ip 0.0.0.0 --port $MY_FAV_PORT --no-browser"

and always go to http://nightingale.csail.mit.edu:$MY_FAV_PORT.
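
One caveat with the alias above: because it is double-quoted, $MY_FAV_PORT is expanded when ~/.bashrc is read, so it has to be set before that line. Below is a minimal sketch of a function-based alternative that looks the port up at call time and prints the URL to open. The names nb and nb_url are hypothetical, and the sketch assumes the notebook is served over plain http, which is what Jupyter does when no certificate is configured.

```shell
# remote:~/.bashrc
# Hypothetical helper: build the URL to open in the local browser.
nb_url() {
    # $1: port (falls back to $MY_FAV_PORT), $2: host (falls back to this machine's FQDN)
    local port="${1:-$MY_FAV_PORT}"
    local host="${2:-$(hostname -f)}"
    printf 'http://%s:%s\n' "$host" "$port"
}

# Hypothetical replacement for the alias: print the URL, then start the server.
nb() {
    local port="${1:-$MY_FAV_PORT}"
    echo "Open $(nb_url "$port") in your local browser"
    jupyter notebook --ip 0.0.0.0 --port "$port" --no-browser
}
```

Running nb with no argument uses $MY_FAV_PORT; nb 8899 overrides the port for one session.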

Did you know there's a terminal interface?

Bourne-Again Shell (bash)

Ever feel frustrated when

  • there is no default virtual environment when you log in?
  • you always have to type out nvidia-smi in full?

No sweat!

Just add aliases and startup scripts to ~/.bashrc!

# remote:~/.bashrc
# Activate the default environment only when attached to a terminal
if tty -s; then
    source activate "$MY_CONDA_PATH"
fi

alias smi="nvidia-smi"

Notes

Some very useful commands

  • htop: monitors basically everything, from CPU load and memory to process IDs

Terminal Multiplexer (tmux) and longtmux

Ever feel frustrated when

  • you want to run very long tasks and need to keep ssh open?
  • you see this when your session runs for over 8 days on the server?
    Could not find platform independent libraries <prefix>
    Consider setting $PYTHONHOME to <prefix>[:<exec_prefix>]
    Fatal Python error: Py_Initialize: Unable to get the locale encoding
    LookupError: no codec search functions registered: can't find encoding
    
    Current thread 0x00007f2e99afc700 (most recent call first):
    Aborted

No sweat!

# Get your Kerberos ticket; "--keychain" lets you use a plain "kinit" from now on
local~$ kinit --keychain $MY_ID@CSAIL.MIT.EDU
# Fire up a longtmux remotely and quit,
local~$ ssh $MY_ID@nightingale.csail.mit.edu longtmux
# because we want to MOSH IN and tmux [a]ttach to that session
local~$ mosh $MY_ID@nightingale.csail.mit.edu tmux a

When your ticket expires on the server

remote~$ kinit && aklog
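
If you would rather not wait for a job to crash before noticing, the check can be wrapped in a small function. This is only a sketch: the name renew_ticket_if_needed is hypothetical, and it assumes MIT Kerberos, where klist -s prints nothing and exits non-zero when the ticket cache is missing or expired.

```shell
# remote:~/.bashrc
# Hypothetical helper: renew the Kerberos ticket and AFS token only when needed.
renew_ticket_if_needed() {
    # Assumption: MIT Kerberos klist, where -s is silent and sets the exit status
    if klist -s; then
        echo "Kerberos ticket still valid"
    else
        kinit && aklog
    fi
}
```

Call renew_ticket_if_needed inside the tmux session before starting a long run.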

There is more to tmux!

  • Custom status bar
  • Split panes
  • Split tabs
  • Mouse support (Yes, you can click in text editors with the mouse! Even drag those separators!)

# remote:~/.tmux.conf
set-option -g base-index 1
set-option -g default-terminal "screen-256color"
# ... (see .tmux.conf for more)

Anaconda (conda)

Ever feel frustrated when

  • you find messy package dependencies that pip does not quite manage well?
  • you want to share a virtual environment between collaborators?

No sweat!

To create an environment from an exported yml file

# The exported yml should contain a minimal list of dependencies to avoid clutter
remote~$ conda env create -f environment.yml
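
For reference, here is a sketch of what such a minimal file could look like, written out from the shell. The function name write_minimal_env and the environment name my-project are hypothetical; the Python version is the one used elsewhere in this guide, and the rest of the dependency list is a placeholder for your own.

```shell
# Hypothetical helper: write a minimal environment.yml ($1 overrides the output path).
write_minimal_env() {
    cat > "${1:-environment.yml}" <<'EOF'
name: my-project
channels:
  - defaults
dependencies:
  - python=3.6.5
  - pip
EOF
}
```

List only the packages you import directly and let conda resolve the rest; that keeps the file short and portable across machines.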

To create a shared environment

# Make sure your current Python interpreter is accessible by all users!
remote~$ source deactivate
# Please refer to the Hard Disk section to check which shared folder to use
remote~$ conda create --prefix $SHARED_FOLDER --copy python=3.6.5

Notes

  • conda can be slow to install packages because it checks more than Python package dependencies: it also checks library dependencies
  • When you encounter OSError: [Errno 28] No space left on device, it is because conda caches packages in your ~/.conda. Simply run conda clean -a

pip

Ever feel frustrated when

  • conda takes forever to install simple packages?

No sweat!

remote~$ pip install -r requirements.txt

Notes

Useful packages for various purposes

  • gpustat: GPU stats. Best used as watch --color gpustat -ucp --color

  • tqdm: nice progress bar to monitor training progress
  • htmltag + json2html: pretty demo for your project

remote~$ pip install powerline-shell
# remote:~/.bashrc
function _update_ps1() {
    PS1=$(powerline-shell $?)
}

if [[ $TERM != linux && ! $PROMPT_COMMAND =~ _update_ps1 ]]; then
    PROMPT_COMMAND="_update_ps1; $PROMPT_COMMAND"
fi

Machines

Name        | Memory  | GPU                     | Best used for
nightingale | 1008 GB | 4 x GeForce GTX TITAN X | CPU memory intensive tasks
harrison    | 126 GB  | 4 x GeForce GTX TITAN X | GPU intensive tasks
gray        | 126 GB  | 4 x GeForce GTX TITAN X | GPU intensive tasks
safar       | 193 GB  |                         | CPU intensive tasks

  • Transfer files between machines with rsync rather than using shared disk
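
A minimal sketch of such a transfer, assuming the directory should land at the same path on the other machine. The function name sync_to is hypothetical; the rsync flags are standard.

```shell
# Hypothetical helper: copy a local directory to the same path on another machine.
# Usage: sync_to harrison /scratch/my-project
sync_to() {
    local host="$1"
    local dir="${2%/}"   # drop a trailing slash so it is not doubled below
    # -a keeps permissions and timestamps, -z compresses in transit,
    # --partial keeps half-transferred files so an interrupted copy can resume
    rsync -az --partial --progress "$dir/" "$host:$dir/"
}
```

The trailing slashes make rsync copy the contents of the directory rather than nesting it one level deeper on the destination.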

GPU (Very important!)

Ever feel frustrated when

  • someone takes all available GPU memory without actually performing any computation?
  • you have to manage multiple experiments running on multiple machines?

No sweat!

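  • Let TensorFlow allocate GPU memory on demand instead of claiming it all up front! (TensorFlow 1.x API)
    import tensorflow as tf
    config = tf.ConfigProto()
    config.gpu_options.allow_growth = True  # start small and grow as needed
    session = tf.Session(config=config)  # plus any other Session arguments you need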
  • Set CUDA_VISIBLE_DEVICES environment variable before running the program
    remote~$ CUDA_VISIBLE_DEVICES=0 python mine-ethereum-hehe.py 
  • Fill in GPU allocation sheets!
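
To avoid picking the device by eye, the choice of CUDA_VISIBLE_DEVICES can be scripted. The sketch below selects the GPU with the least memory in use; the function names are hypothetical, and it relies on the CSV query output of nvidia-smi (lines such as "0, 1200"). It is a convenience heuristic only and does not replace the allocation sheet.

```shell
# Hypothetical helper: read "index, memory.used" CSV lines on stdin and
# print the index of the GPU with the least memory in use.
least_used_gpu() {
    awk -F',' 'NR == 1 || $2 + 0 < min { min = $2 + 0; idx = $1 } END { print idx }'
}

# Hypothetical wrapper: run a command on the least-used GPU.
# Usage: run_on_least_used_gpu python train.py
run_on_least_used_gpu() {
    local gpu
    gpu=$(nvidia-smi --query-gpu=index,memory.used --format=csv,noheader,nounits | least_used_gpu)
    CUDA_VISIBLE_DEVICES="$gpu" "$@"
}
```

Low memory use does not guarantee an idle GPU, so a quick look at nvidia-smi or gpustat is still worthwhile before a long run.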

Hard Disk

Ever feel frustrated when

  • you run out of disk space right before a deadline?
  • you have to migrate data across devices to work across machines?
  • oh snap, I deleted my code!

No sweat!

  • Information in the table may not be accurate; please correct me
  • Need discussion here: periodic housekeeping?

Type                  | Mount path                          | Purpose                         | Backup
AFS                   | ~ or /afs/csail.mit.edu/u/m/$MY_ID  | lightweight files (code, cache) | ~/.snapshot
NFS (production tier) | /data/medg                          | (raw data)?                     | /data/medg/misc/.zfs/snapshot
Local (each machine)  | /crimea                             | datasets                        |
Local (each machine)  | /scratch                            | (cached data, models)?          |

Notes

  • How much storage is the current directory using? du -sh
  • What about the disks? How much space is left? df -h
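
When du -sh reports a surprisingly large number, the next question is usually which subdirectory is responsible. A small sketch follows; the function name biggest is hypothetical.

```shell
# Hypothetical helper: list the largest entries directly under a directory.
# Usage: biggest [DIR] [COUNT]   (defaults: current directory, 10 entries)
biggest() {
    # du -sk prints one size in KiB per entry; sort -rn puts the largest first
    du -sk "${1:-.}"/* 2>/dev/null | sort -rn | head -n "${2:-10}"
}
```

For example, biggest /scratch/my-project 5 shows the five largest entries in that folder. Hidden files are not matched by the * glob, so they are not listed.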

Case study

Starting a project

  1. Pick a group name for the project, and ask the system admin to create a user group on all machines (or on the specific machine you are working on) with all collaborators added to the group.

  2. Identify a dataset root (e.g., /data/medg/misc/definitely-not-cryptomining) with the correct group access (chgrp -R ...).

  3. Locate a folder for code (e.g., ~/definitely-not-cryptomining). Note that the code should be accessible only by you; any code sharing should happen over version control software.

  4. Find a local working root directory (e.g. /scratch/definitely-not-cryptomining), and sync data over with rsync.

  5. (Optional, but highly recommended) Instantiate a shared virtual environment in the local directory.

  6. Happy coding.

  7. When you are running tasks, try htop and nvidia-smi (or watch --color gpustat -ucp --color) to determine the best machines/GPUs. Try not to overuse hardware resources.

  8. Save intermediate results/models to the local project directory.

  9. Periodically push your local git commits to GitHub.
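
Step 2 above can be sketched as follows. The function name share_with_group is hypothetical, and the chmod and setgid steps are an assumption about what "correct group access" should mean here: group members can read and write, and files created later inherit the group.

```shell
# Hypothetical helper: hand a dataset root over to a project group.
# Usage: share_with_group GROUP /data/medg/misc/definitely-not-cryptomining
share_with_group() {
    local group="$1"
    local root="$2"
    chgrp -R "$group" "$root"
    # g+rwX: group read/write; execute only on directories and already-executable files
    chmod -R g+rwX "$root"
    # setgid on directories so that new files and folders inherit the group
    find "$root" -type d -exec chmod g+s {} +
}
```

You need to own the files and be a member of the group for chgrp to succeed; otherwise ask the system admin to run it.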

Harry's setup

  1. I have mosh set up, so the sessions created by the following login steps persist for a few days before I restart them.

  2. On my local machine I have the following bash function that gives blazing fast access; I only need to run kinit && tmux-csail nightingale

    function tmux-csail() {
        # $1 is the machine name, e.g. nightingale
        ssh "stmharry@$1.csail.mit.edu" longtmux
        mosh "stmharry@$1.csail.mit.edu" tmux a
    }
  3. I access different machines in different iTerm tabs, and inside each tmux session I also have tabs for file editing, program running, and resource monitoring (i.e., tmux tabs under iTerm tabs).

  4. I use vim for most code editing because of its many handy Python plugins. Its support for tabs lets me edit a handful of files at the same time (i.e., vim tabs under tmux tabs under iTerm tabs).

  5. In my ~/.bashrc there are a few aliases that help me with commands.
