Merge pull request SeanNaren#222 from ryanleary/posix-args
Switch cmd-line args to POSIX-style
Sean Naren committed Jan 18, 2018
2 parents 29b1cc8 + bec7121 commit 7c79fbf
Showing 14 changed files with 118 additions and 118 deletions.
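The change is mechanical: every underscore-style long option (for example `--batch_size`) becomes its POSIX/GNU dash-separated equivalent (`--batch-size`) across the data-preparation, benchmarking, and utility scripts, with the README updated to match.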
36 changes: 18 additions & 18 deletions README.md
@@ -12,7 +12,7 @@ Creates a network based on the [DeepSpeech2](https://arxiv.org/pdf/1512.02595v1.p
* Noise injection for online training to improve noise robustness.
* Audio augmentation to improve noise robustness.
* Easy start/stop capabilities in the event of crash or hard stop during training.
-* Visdom/Tensorboard support for visualising training graphs.
+* Visdom/Tensorboard support for visualizing training graphs.

# Installation

@@ -102,7 +102,7 @@ In order to do this you must create the following folder structure and put the c

```
cd data/
-mkdir LibriSpeech/ # This can be anything as long as you specify the directory path as --target_dir when running the librispeech.py script
+mkdir LibriSpeech/ # This can be anything as long as you specify the directory path as --target-dir when running the librispeech.py script
mkdir LibriSpeech/val/
mkdir LibriSpeech/test/
mkdir LibriSpeech/train/
@@ -115,7 +115,7 @@ Optionally you can specify the exact librispeech files you want if you don't wan

```
cd data/
-python librispeech.py --files_to_use "train-clean-100.tar.gz, train-clean-360.tar.gz,train-other-500.tar.gz, dev-clean.tar.gz,dev-other.tar.gz, test-clean.tar.gz,test-other.tar.gz"
+python librispeech.py --files-to-use "train-clean-100.tar.gz,train-clean-360.tar.gz,train-other-500.tar.gz,dev-clean.tar.gz,dev-other.tar.gz,test-clean.tar.gz,test-other.tar.gz"
```

### Custom Dataset
@@ -138,27 +138,27 @@ containing all the manifests you want to merge. You can also prune short and lon

```
cd data/
-python merge_manifests.py --output_path merged_manifest.csv --merge_dir all_manifests/ --min_duration 1 --max_duration 15 # durations in seconds
+python merge_manifests.py --output-path merged_manifest.csv --merge-dir all-manifests/ --min-duration 1 --max-duration 15 # durations in seconds
```
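For reference, each manifest is a plain CSV file in which every row pairs an audio file with its transcript file; the paths below are illustrative:

```
/path/to/audio1.wav,/path/to/transcript1.txt
/path/to/audio2.wav,/path/to/transcript2.txt
```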

## Training

```
-python train.py --train_manifest data/train_manifest.csv --val_manifest data/val_manifest.csv
+python train.py --train-manifest data/train_manifest.csv --val-manifest data/val_manifest.csv
```

Use `python train.py --help` for more parameters and options.

-There is also [Visdom](https://github.com/facebookresearch/visdom) support to visualise training. Once a server has been started, to use:
+There is also [Visdom](https://github.com/facebookresearch/visdom) support to visualize training. Once a server has been started, to use:

```
python train.py --visdom
```

-There is also [Tensorboard](https://github.com/lanpa/tensorboard-pytorch) support to visualise training. Follow the instructions to set up. To use:
+There is also [Tensorboard](https://github.com/lanpa/tensorboard-pytorch) support to visualize training. Follow the instructions to set up. To use:

```
-python train.py --tensorboard --logdir log_dir/ # Make sure the tensorboard instance is made pointing to this log directory
+python train.py --tensorboard --logdir log_dir/ # Make sure the Tensorboard instance points to this log directory
```
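For reference, the Tensorboard server can then be started against the same directory (assuming a standard Tensorboard installation, with the directory name matching the `--logdir` value above):

```
tensorboard --logdir log_dir/
```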

For both visualization tools, you can add your own name to the run by changing the `--id` parameter when training, as in the example below.
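For example (the id string here is illustrative):

```
python train.py --visdom --id "librispeech baseline"
```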
@@ -176,13 +176,13 @@ Applies small changes to the tempo and gain when loading audio to increase robus
Dynamically adds noise into the training data to increase robustness. To use, first fill a directory with all the noise files you want to sample from.
The dataloader will randomly pick samples from this directory.

-To enable noise injection, use the `--noise_dir /path/to/noise/dir/` to specify where your noise files are. There are a few noise parameters to tweak, such as
-`--noise_prob` to determine the probability that noise is added, and the `--noise_min`, `--noise_max` parameters to determine the minimum and maximum noise to add in training.
+To enable noise injection, use the `--noise-dir /path/to/noise/dir/` to specify where your noise files are. There are a few noise parameters to tweak, such as
+`--noise-prob` to determine the probability that noise is added, and the `--noise-min`, `--noise-max` parameters to determine the minimum and maximum noise to add in training.

Included is a script to inject noise into an audio file to hear what different noise levels/files would sound like. Useful for curating the noise dataset.

```
-python noise_inject.py --input_path /path/to/input.wav --noise_path /path/to/noise.wav --output_path /path/to/input_injected.wav --noise_level 0.5 # higher levels means more noise
+python noise_inject.py --input-path /path/to/input.wav --noise-path /path/to/noise.wav --output-path /path/to/input_injected.wav --noise-level 0.5 # a higher level means more noise
```

### Checkpoints
@@ -197,7 +197,7 @@ python train.py --checkpoint
To enable checkpoints every N batches through the epoch as well as epoch saving:

```
-python train.py --checkpoint --checkpoint_per_batch N # N is the number of batches to wait till saving a checkpoint at this batch.
+python train.py --checkpoint --checkpoint-per-batch N # N is the number of batches to wait before saving a checkpoint.
```

Note that for the batch checkpointing system to work, you cannot change the batch size when loading a checkpointed model from its original training
@@ -206,21 +206,21 @@ run.
To continue from a checkpointed model that has been saved:

```
-python train.py --continue_from models/deepspeech_checkpoint_epoch_N_iter_N.pth.tar
+python train.py --continue-from models/deepspeech_checkpoint_epoch_N_iter_N.pth.tar
```

This continues from the same training state and, if enabled, recreates the Visdom graph to continue from.

If you would like to start from a previous checkpoint model but not continue training, add the `--finetune` flag to restart training
-from the `--continue_from` weights.
+from the `--continue-from` weights.

### Choosing batch sizes

Included is a script that benchmarks whether training can run on your hardware, and the limits on the model and
batch sizes you can use. To use:

```
-python benchmark.py --batch_size 32
+python benchmark.py --batch-size 32
```

Use the flag `--help` to see other parameters that can be used with the script.
@@ -230,7 +230,7 @@ Use the flag `--help` to see other parameters that can be used with the script.
Saved models contain the metadata of their training process. To see the metadata, run the command below:

```
-python model.py --model_path models/deepspeech.pth.tar
+python model.py --model-path models/deepspeech.pth.tar
```

Note that the model has no final softmax layer: warp-ctc applies the softmax internally during training. Anything built on top of the model, such as complex decoders, will have to apply the softmax itself, so take this into consideration!
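As a minimal sketch of that caveat (the tensor shape and label count here are illustrative, not the model's actual contract), the missing softmax can be applied to the raw model scores before decoding:

```
import torch
import torch.nn.functional as F

# stand-in for raw acoustic model output: (batch, seq_len, num_classes) scores
logits = torch.randn(1, 50, 29)
# apply the softmax that warp-ctc otherwise handles internally during training
probs = F.softmax(logits, dim=-1)  # feed these probabilities to the decoder
```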
@@ -240,13 +240,13 @@
To evaluate a trained model on a test set (which must be in the same format as the training set):

```
-python test.py --model_path models/deepspeech.pth.tar --test_manifest /path/to/test_manifest.csv --cuda
+python test.py --model-path models/deepspeech.pth.tar --test-manifest /path/to/test_manifest.csv --cuda
```

An example script to output a transcription has been provided:

```
-python transcribe.py --model_path models/deepspeech.pth.tar --audio_path /path/to/audio.wav
+python transcribe.py --model-path models/deepspeech.pth.tar --audio-path /path/to/audio.wav
```

### Alternate Decoders
16 changes: 8 additions & 8 deletions benchmark.py
@@ -8,18 +8,18 @@
from model import DeepSpeech, supported_rnns

parser = argparse.ArgumentParser()
-parser.add_argument('--batch_size', type=int, default=32, help='Size of input')
+parser.add_argument('--batch-size', type=int, default=32, help='Size of input')
parser.add_argument('--seconds', type=int, default=15,
                    help='The size of the fake input in seconds using default stride of 0.01, '
                         '15s is usually the maximum duration')
-parser.add_argument('--dry_runs', type=int, default=20, help='Dry runs before measuring performance')
+parser.add_argument('--dry-runs', type=int, default=20, help='Dry runs before measuring performance')
parser.add_argument('--runs', type=int, default=20, help='How many benchmark runs to measure performance')
-parser.add_argument('--labels_path', default='labels.json', help='Path to the labels to infer over in the model')
-parser.add_argument('--hidden_size', default=400, type=int, help='Hidden size of RNNs')
-parser.add_argument('--hidden_layers', default=4, type=int, help='Number of RNN layers')
-parser.add_argument('--rnn_type', default='lstm', help='Type of the RNN. rnn|gru|lstm are supported')
-parser.add_argument('--sample_rate', default=16000, type=int, help='Sample rate')
-parser.add_argument('--window_size', default=.02, type=float, help='Window size for spectrogram in seconds')
+parser.add_argument('--labels-path', default='labels.json', help='Path to the labels to infer over in the model')
+parser.add_argument('--hidden-size', default=400, type=int, help='Hidden size of RNNs')
+parser.add_argument('--hidden-layers', default=4, type=int, help='Number of RNN layers')
+parser.add_argument('--rnn-type', default='lstm', help='Type of the RNN. rnn|gru|lstm are supported')
+parser.add_argument('--sample-rate', default=16000, type=int, help='Sample rate')
+parser.add_argument('--window-size', default=.02, type=float, help='Window size for spectrogram in seconds')
args = parser.parse_args()

input = torch.randn(args.batch_size, 1, 161, args.seconds * 100).cuda()
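One point worth calling out with this rename: argparse converts dashes in long option names to underscores when building attribute names, so `args.batch_size` in the snippet above still resolves even though the flag is now `--batch-size`. A minimal standalone illustration:

```
import argparse

parser = argparse.ArgumentParser()
parser.add_argument('--batch-size', type=int, default=32)
args = parser.parse_args(['--batch-size', '64'])
# argparse maps '--batch-size' to the attribute name 'batch_size'
print(args.batch_size)  # prints 64
```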
6 changes: 3 additions & 3 deletions data/an4.py
@@ -8,10 +8,10 @@
from utils import create_manifest

parser = argparse.ArgumentParser(description='Processes and downloads an4.')
-parser.add_argument('--target_dir', default='an4_dataset/', help='Path to save dataset')
-parser.add_argument('--min_duration', default=1, type=int,
+parser.add_argument('--target-dir', default='an4_dataset/', help='Path to save dataset')
+parser.add_argument('--min-duration', default=1, type=int,
                    help='Prunes training samples shorter than the min duration (given in seconds, default 1)')
-parser.add_argument('--max_duration', default=15, type=int,
+parser.add_argument('--max-duration', default=15, type=int,
                    help='Prunes training samples longer than the max duration (given in seconds, default 15)')
args = parser.parse_args()

14 changes: 7 additions & 7 deletions data/common_voice.py
@@ -8,14 +8,14 @@
from utils import create_manifest

parser = argparse.ArgumentParser(description='Downloads and processes Mozilla Common Voice dataset.')
parser.add_argument("--target_dir", default='CommonVoice_dataset/', type=str, help="Directory to store the dataset.")
parser.add_argument("--tar_path", type=str, help="Path to the Common Voice *.tar file if downloaded (Optional).")
parser.add_argument('--sample_rate', default=16000, type=int, help='Sample rate')
parser.add_argument('--min_duration', default=1, type=int,
parser.add_argument("--target-dir", default='CommonVoice_dataset/', type=str, help="Directory to store the dataset.")
parser.add_argument("--tar-path", type=str, help="Path to the Common Voice *.tar file if downloaded (Optional).")
parser.add_argument('--sample-rate', default=16000, type=int, help='Sample rate')
parser.add_argument('--min-duration', default=1, type=int,
help='Prunes training samples shorter than the min duration (given in seconds, default 1)')
parser.add_argument('--max_duration', default=15, type=int,
parser.add_argument('--max-duration', default=15, type=int,
help='Prunes training samples longer than the max duration (given in seconds, default 15)')
parser.add_argument('--files_to_process', default="cv-valid-dev.csv,cv-valid-test.csv,cv-valid-train.csv",
parser.add_argument('--files-to-process', default="cv-valid-dev.csv,cv-valid-test.csv,cv-valid-train.csv",
type=str, help='list of *.csv file names to process')
args = parser.parse_args()
COMMON_VOICE_URL = "https://common-voice-data-download.s3.amazonaws.com/cv_corpus_v1.tar.gz"
@@ -85,4 +85,4 @@ def main():
args.max_duration)

if __name__ == "__main__":
-    main()
\ No newline at end of file
+    main()
10 changes: 5 additions & 5 deletions data/librispeech.py
@@ -7,16 +7,16 @@
import shutil

parser = argparse.ArgumentParser(description='Processes and downloads LibriSpeech dataset.')
parser.add_argument("--target_dir", default='LibriSpeech_dataset/', type=str, help="Directory to store the dataset.")
parser.add_argument('--sample_rate', default=16000, type=int, help='Sample rate')
parser.add_argument('--files_to_use', default="train-clean-100.tar.gz,"
parser.add_argument("--target-dir", default='LibriSpeech_dataset/', type=str, help="Directory to store the dataset.")
parser.add_argument('--sample-rate', default=16000, type=int, help='Sample rate')
parser.add_argument('--files-to-use', default="train-clean-100.tar.gz,"
"train-clean-360.tar.gz,train-other-500.tar.gz,"
"dev-clean.tar.gz,dev-other.tar.gz,"
"test-clean.tar.gz,test-other.tar.gz", type=str,
help='list of file names to download')
parser.add_argument('--min_duration', default=1, type=int,
parser.add_argument('--min-duration', default=1, type=int,
help='Prunes training samples shorter than the min duration (given in seconds, default 1)')
parser.add_argument('--max_duration', default=15, type=int,
parser.add_argument('--max-duration', default=15, type=int,
help='Prunes training samples longer than the max duration (given in seconds, default 15)')
args = parser.parse_args()

8 changes: 4 additions & 4 deletions data/merge_manifests.py
@@ -8,12 +8,12 @@
from utils import order_and_prune_files

parser = argparse.ArgumentParser(description='Merges all manifest CSV files in specified folder.')
-parser.add_argument('--merge_dir', default='manifests/', help='Path to all manifest files you want to merge')
-parser.add_argument('--min_duration', default=1, type=int,
+parser.add_argument('--merge-dir', default='manifests/', help='Path to all manifest files you want to merge')
+parser.add_argument('--min-duration', default=1, type=int,
                    help='Prunes any samples shorter than the min duration (given in seconds, default 1)')
-parser.add_argument('--max_duration', default=15, type=int,
+parser.add_argument('--max-duration', default=15, type=int,
                    help='Prunes any samples longer than the max duration (given in seconds, default 15)')
-parser.add_argument('--output_path', default='merged_manifest.csv', help='Output path to merged manifest')
+parser.add_argument('--output-path', default='merged_manifest.csv', help='Output path to merged manifest')

args = parser.parse_args()

10 changes: 5 additions & 5 deletions data/ted.py
@@ -9,12 +9,12 @@
from tqdm import tqdm

parser = argparse.ArgumentParser(description='Processes and downloads TED-LIUMv2 dataset.')
parser.add_argument("--target_dir", default='TEDLIUM_dataset/', type=str, help="Directory to store the dataset.")
parser.add_argument("--tar_path", type=str, help="Path to the TEDLIUM_release tar if downloaded (Optional).")
parser.add_argument('--sample_rate', default=16000, type=int, help='Sample rate')
parser.add_argument('--min_duration', default=1, type=int,
parser.add_argument("--target-dir", default='TEDLIUM_dataset/', type=str, help="Directory to store the dataset.")
parser.add_argument("--tar-path", type=str, help="Path to the TEDLIUM_release tar if downloaded (Optional).")
parser.add_argument('--sample-rate', default=16000, type=int, help='Sample rate')
parser.add_argument('--min-duration', default=1, type=int,
help='Prunes training samples shorter than the min duration (given in seconds, default 1)')
parser.add_argument('--max_duration', default=15, type=int,
parser.add_argument('--max-duration', default=15, type=int,
help='Prunes training samples longer than the max duration (given in seconds, default 15)')
args = parser.parse_args()

8 changes: 4 additions & 4 deletions data/voxforge.py
@@ -14,12 +14,12 @@
VOXFORGE_URL_16kHz = 'http:https://www.repository.voxforge1.org/downloads/SpeechCorpus/Trunk/Audio/Main/16kHz_16bit/'

parser = argparse.ArgumentParser(description='Processes and downloads VoxForge dataset.')
parser.add_argument("--target_dir", default='voxforge_dataset/', type=str, help="Directory to store the dataset.")
parser.add_argument('--sample_rate', default=16000,
parser.add_argument("--target-dir", default='voxforge_dataset/', type=str, help="Directory to store the dataset.")
parser.add_argument('--sample-rate', default=16000,
type=int, help='Sample rate')
parser.add_argument('--min_duration', default=1, type=int,
parser.add_argument('--min-duration', default=1, type=int,
help='Prunes training samples shorter than the min duration (given in seconds, default 1)')
parser.add_argument('--max_duration', default=15, type=int,
parser.add_argument('--max-duration', default=15, type=int,
help='Prunes training samples longer than the max duration (given in seconds, default 15)')
args = parser.parse_args()

2 changes: 1 addition & 1 deletion model.py
@@ -286,7 +286,7 @@ def get_meta(model):
import argparse

parser = argparse.ArgumentParser(description='DeepSpeech model information')
-parser.add_argument('--model_path', default='models/deepspeech_final.pth.tar',
+parser.add_argument('--model-path', default='models/deepspeech_final.pth.tar',
                    help='Path to model file created by training')
args = parser.parse_args()
package = torch.load(args.model_path, map_location=lambda storage, loc: storage)
10 changes: 5 additions & 5 deletions noise_inject.py
@@ -6,11 +6,11 @@
from data.data_loader import load_audio, NoiseInjection

parser = argparse.ArgumentParser()
-parser.add_argument('--input_path', default='input.wav', help='The input audio to inject noise into')
-parser.add_argument('--noise_path', default='noise.wav', help='The noise file to mix in')
-parser.add_argument('--output_path', default='output.wav', help='The noise file to mix in')
-parser.add_argument('--sample_rate', default=16000, help='Sample rate to save output as')
-parser.add_argument('--noise_level', type=float, default=1.0,
+parser.add_argument('--input-path', default='input.wav', help='The input audio to inject noise into')
+parser.add_argument('--noise-path', default='noise.wav', help='The noise file to mix in')
+parser.add_argument('--output-path', default='output.wav', help='Path to save the noise-injected output audio')
+parser.add_argument('--sample-rate', default=16000, help='Sample rate to save output as')
+parser.add_argument('--noise-level', type=float, default=1.0,
                    help='The Signal to Noise ratio (higher means more noise)')
args = parser.parse_args()

