Add instruction to use specified GPU id #73

Merged: 1 commit merged into pytorch:master from wkentaro:mnist_specified_gpu on Feb 21, 2017

Conversation

wkentaro (Contributor)

Related to #72
Is this information available in the docs?

soumith merged commit 2ff9485 into pytorch:master on Feb 21, 2017
soumith (Member) commented on Feb 21, 2017

This information is available in the CUDA documentation.
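For reference, the PR's topic is selecting a specific GPU id via CUDA device masking. Below is a minimal sketch of one common way to do it; the device id and the snippet itself are illustrative, not the PR's actual text.

```python
# A minimal sketch of pinning a run to one GPU (illustrative, not the PR's wording).
# CUDA_VISIBLE_DEVICES must be set before CUDA is initialized, so set it
# before importing torch, or export it in the shell instead.
import os
os.environ["CUDA_VISIBLE_DEVICES"] = "1"   # expose only physical GPU 1

import torch

if torch.cuda.is_available():
    torch.cuda.set_device(0)               # inside the process, the visible GPU is device 0
    x = torch.randn(8, 8).cuda()
    print(torch.cuda.current_device())
```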

wkentaro deleted the mnist_specified_gpu branch on February 21, 2017, 01:58
wkentaro (Contributor, Author)

Thanks!

cedias pushed a commit to cedias/examples that referenced this pull request Feb 27, 2018
* updates for torchtext and loading from snapshot

* removing default snapshot and setting cuda to avoid pytorch issue 689

* Update main.py (pytorch#68)

* Prevent 2 forward passes with detach

* Add instruction to use specified GPU id (pytorch#73)

Related to pytorch#72
Is this information available in docs?

* OpenNMT example now updated and training

* Initial checkin

* add data text files

* cleanup train.py

* Remove language features leftovers

* train from checkpoint

* final cleanup

* params should not be reinitialized for loaded models

* moving param init

* start_epoch should increment from saved epoch

* fixed divide by zero error

* [onmt] Update README with models; move data to AWS

* multi-gpu via DataParallel

* allowing the option of single device

* altering translate to be compatible with nn.DataParallel

* should not re-init params on load from chkpt

* friendlier gpu options in translate

* update translate.py gpu option

* remove unused lines (pytorch#84)

* remove unused import statements

* remove unused variable and arguments

* add flush to print (pytorch#81)

With flush, log info can appear immediately when it is directed to a pipe or file.
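A short illustration of the behavior described above; the loop and loss values are placeholders.

```python
# Flushing makes each log line appear immediately, even when stdout is
# redirected to a pipe or file; otherwise output may sit in the buffer.
for epoch in range(3):
    loss = 1.0 / (epoch + 1)          # placeholder value for illustration
    print("epoch {} | loss {:.4f}".format(epoch, loss), flush=True)
```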

* Command Line Interface backwards compatible fix for models.py (pytorch#85)

* commandline backwards compatible fix for models.py

* changes formatting to accommodate a 120 char width

* add lmdb to requirements of dcgan. Add instructions to download LSUN to dcgan/README

* Revert "add flush to print" (pytorch#92)

* Fix typo in logging of ImageNet model loading

* Replace clip_gradient with torch's newly included clip_grad_norm (fixes pytorch#95)

* fix for master

* fixes for master

* Change the calculation method of the match count, since it will be 0 if the dataset size is smaller than 64.

* PTB LM example now has far better perplexity for both large (72.30) and small (87.17) models

* update attribution of weight tying in word_language_model (pytorch#109)

* Update model.py

updated attribution of weight tying

* Update README.md

updated attribution of weight tying

* translate bug fix

* README changes for multi-gpu

* removing reinit of checkpoint params again

* using split instead of chunk

* replacing opt.cuda with opt.gpus as needed

* using ModuleList

* default type for start_decay_at

* decoder hidden state fix

* nn.clip_grad_norm

* adding src/tgt tokens/s

* index in verbose translate was fixed

* bug in total num predicted words

* Variables in Translator can be volatile

* removing unnecessary def

* allowing lowercase option

* pointing out one way to do bleu scores in README

* adding files to ignore

* preprocess needs to use lower option

* tips for non-demo mt via flickr30k example

* cleaning up readme

* clean up the readme

* spacing in readme

* cudnn decoder

* reverting cudnn decoder to lstmcell

* new DataParallel allows dim 1; remove unnecessary transposes; add train_ppl to chkpt

* mend

* allows models trained on one dataset to be trained on another; doesn't augment vocab

* manual unrolling was broken for brnn; patch until varlen rnn replacement

* allowing learning rate update for non-sgd optimizers

* adding option to shuffle mini-batches

* adding word level accuracy as a metric

* touch ups and README updates

* allowing validation data to be volatile

* num_batches was off by one

* batch printing was off

* curriculum off by one

* accuracy now an average over log_interval batches

* off by one in printing batch number

* removing unused variables

* saving with state_dict

* state_dicts for translation and optimizer

* Grouping bash commands together

* backwards compatibility for checkpoints

* one more lowercase in dict

* Simple typo fix to the download script

* Switch the model to evaluation mode before generation

* Fix random seeding in DCGAN (pytorch#108)

* fix seeding

* fix typo

* fixing some codestyle issues found by flake8

* move OpenNMT

* remove OpenNMT and link to elsewhere

* Remove rectifier before softmax (pytorch#117)

* some wrong typing (pytorch#125)

humber->number
--batch-size->--batch_size

* update formatting in README

sub-headers were not displaying properly

* apply normalization for output image of dcgan generator (pytorch#127)

* lr floating division

* fix direct access to subsections

* Handle tied + dimensions mismatch (pytorch#124)

* Fix highlighted headers in readme (pytorch#122)

* fix a bug in DCGAN (pytorch#121)

Using a single GPU in DCGAN caused an error because 'gpu_ids' in the forward function will be None.

* fix dcgan

* Add a time sequence prediction example (pytorch#118)

* Update README.md

* open without 'rb' caused Python 3 to open this in text mode and fail (pytorch#133)

* Fix typo in imagenet/main.py (pytorch#135)

* Changes in `reinforce.py` (pytorch#140)

* save/load optimizer state (pytorch#141)

* snli/model.py: can run using Python 2.* (pytorch#145)

Python 2.* does not support `out.view(*size, -1)`. Expand it to `out.view(size[0], size[1], -1)`.

Traceback (most recent call last):
  File "train.py", line 9, in <module>
    from model import SNLIClassifier
  File "snli/model.py", line 13
    return out.view(*size, -1)
SyntaxError: only named arguments may follow *expression
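A small sketch of the workaround described above; the tensor shape is illustrative.

```python
import torch

out = torch.randn(4, 5, 6)
size = out.size()[:2]                  # (4, 5)

# Python 2 rejects `out.view(*size, -1)` with the SyntaxError quoted above,
# so spell the leading dimensions out explicitly; this runs on both 2 and 3.
flat = out.view(size[0], size[1], -1)
print(flat.size())                     # torch.Size([4, 5, 6])
```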

* Fix help message for no-cuda (pytorch#153)

* replace model by policy (pytorch#154)

* mnist_hogwild manual breaking of gradient sharing removed (pytorch#138)

* fast-neural-style example (pytorch#129)

* Add fast-neural-style implementation

* Rename directory

* Add option for stylizing using GPU

* Use state_dict for saving and loading model

* Update vgg-model download link

* Add script for downloading models, remove saved-models folder

* Use pytorch's pretrained vgg

* Remove cloning of intermediate outputs

* Add pytorch vgg results, update README.md

* Update README.md

* Change default learning rate

* Update README.md

* Add content scaling in stylize, edit docstring

* Refactor code

* Use inbuilt Instance-Normalization, refactor code

* Fix typo in README.md

* Update README.md

* Update models, photos, README.md

* Refactor

* Change affine and momentum parameters for InstanceNorm

* Change mode back to training, refactor transformer_net

After checkpointing, the model remained in evaluation mode and hence no updates were made; add code to put the model back in training mode after checkpointing. Also use a volatile Variable when stylizing images during testing.

* Refactor

* Update stylized images

* Update candy style image

* Change reusing of Variables (pytorch#150)

* parameter in test() function is useless

* fix test() param, and fix bugs in nll_loss

* add comments

* remove useless line

* unuse average over batch

* added comments in snli/train.py, no code changes (pytorch#177)

* fix bugs in generalization error calculation (pytorch#179)

* added a function makedirs() which works both for python 2 and 3 (pytorch#176)

* This PR fixes error raised during element-wise variable division. (pytorch#182)

- Fixes pytorch#181

* Fix test_epoch typo (pytorch#183)

test_loss /= len(test_loader.dataset) should be test_loss /= len(data_loader.dataset)

* Fix test data in time_sequence_prediction (pytorch#186)

Previously the training data input[:3] was incorrectly taken as the test data. Change the input for prediction to data[:3].

* README: Correct case and add link to PyTorch (pytorch#188)

* Use nn.init from core to init SR example (pytorch#189)

Makes it clearer what is going on in the init as well.
Closes pytorch#161

* Remove unused imports in SR example (pytorch#190)

* Add an option to perform distributed ImageNet training (pytorch#185)

* Change "gain" -> "calculate_gain" (pytorch#192)

Fixes pytorch#189 (comment)

* fix for 0.2

* fix for 0.2

* mnist 0.2 fixes

* Add model_names back. Fixes pytorch#195

* reinforcement_learning fix reward threshold

* Change test DataLoader to use the test batch size

The batch size for the training set was erroneously used for instantiating the DataLoader class for the test set.

* minor spelling, evaluted->evaluated

* change lr to 0.8 to fix the issue for 0.2

* Update README.md (pytorch#219)

* vae: Fix `UserWarning` (pytorch#220)

* h rather than c should be fed into the next layer of LSTM (pytorch#222)

* Balance VAE losses, add reconstruction + sampling

* Add support for CUDA

* Fix VAE loss + improve reconstruction viz

Closes pytorch#225

* Remove unused math import

Closes pytorch#204

* Add link to script for preparing imagenet val data

* Fix an argument instruction in mnist_hogwild

* bug fix: vocab.load_vectors signature update

* Document the data arrangement in word_language_model

* Consistent newlines

* Add linear layer to time series prediction

As is, the final network output is modulated with a tanh nonlinearity, which is undesirable. As a simple, realistic fix we add a final linear layer.
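A hypothetical sketch of the idea (not the example's actual model), assuming a one-feature time series:

```python
import torch
import torch.nn as nn

# The LSTM's hidden state passes through tanh, so reading predictions
# straight off it bounds them to (-1, 1); a final Linear layer lets the
# output take any real value.
class Sequence(nn.Module):
    def __init__(self, hidden=51):
        super(Sequence, self).__init__()
        self.rnn = nn.LSTM(1, hidden, batch_first=True)
        self.linear = nn.Linear(hidden, 1)   # the added output layer

    def forward(self, x):                    # x: (batch, time, 1)
        h, _ = self.rnn(x)
        return self.linear(h)                # unbounded predictions

model = Sequence()
y = model(torch.randn(2, 10, 1))
print(y.shape)                               # torch.Size([2, 10, 1])
```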

* Swap PTB for Wikitext-2 (which is open access)

* Scale -> Resize + RandomSizedCrop -> RandomResizedCrop

* Update RL examples to use torch.distributions instead of reinforce

* Fix action indexing

* Fix bugs and improve performance

* Replace WikiText-2 files with correct dataset (pytorch#264)

* fix: `Multinomial` is now `Categorical`

* Fix indentation to be self-consistent (pytorch#279)

* Fix indentation to be self-consistent

Replace 2-space with 4-space indentation

* Fix indentation to be self-consistent

Replace 2-space with 4-space indentation

* Fix indentation

Replace 3-space indentation with 4-space indentation

* Fix UserWarning in two examples (pytorch#293)

* Fix UserWarning

This fixes the following warning in mnist/main.py

src/torch_mnist.py:68: UserWarning: Implicit dimension choice for log_softmax has been deprecated. Change the call to include dim=X as an argument.

Performance is unaffected.

* Fix UserWarning in mnist_hogwild

In this case, dim=1 because the input tensor x has ndim=2.

See _get_softmax_dim in https://github.com/pytorch/pytorch/blob/master/torch/nn/functional.py
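A minimal illustration of the explicit-`dim` call that silences the warning; the shapes are illustrative.

```python
import torch
import torch.nn.functional as F

logits = torch.randn(64, 10)               # (batch, classes), so ndim=2
log_probs = F.log_softmax(logits, dim=1)    # explicit dim avoids the UserWarning
print(log_probs.shape)                      # torch.Size([64, 10])
```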

* Fix VAE losses (sum over everything) (pytorch#296)

Closes pytorch#234 and pytorch#290

* Fix actor_critic example (pytorch#301)

* smooth_l1_loss now requires shapes to match
 * once scalars are enabled we must torch.stack() instead of torch.cat() a list of scalars
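An illustrative sketch of the `torch.stack()` point; the loss values are made up.

```python
import torch

# With 0-dimensional (scalar) tensors, torch.cat fails because there is no
# dimension to concatenate along; torch.stack creates a new dimension and
# produces a 1-D tensor of the per-step losses.
losses = [torch.tensor(0.5), torch.tensor(1.25), torch.tensor(0.1)]
total = torch.stack(losses).sum()
print(total)   # tensor(1.8500)
```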