complete instructions for tts/asr repos
donate
content
- tts
- melspec generators
- tacotron2 (nvidia)
- vocoders
- end to end
- seq to seq
- melspec generators
- asr
first time set-up (ubuntu 18.04.5 lts)
sudo apt update;sudo apt upgrade -y
sudo apt install wget git python3 python3-pip python3-venv nvidia-driver-460 -y
# (cuda+cudnn)
how to make v. env
# (replace name with your project name)
mkdir name
python3.6 -m venv name/env
source name/env/bin/activate
pip install -U pip setuptools wheel
cd name
tacotron2 (nvidia)
source
# (make v. env)
pip install torch==1.4.0 torchvision==0.5.0 #cuda 10.1 (update 2)
export CUDA_HOME=/usr/local/cuda #if apex setup gives error
git clone https://github.com/NVIDIA/apex
pip install --global-option="--cpp_ext" --global-option="--cuda_ext" ./apex
git clone https://www.github.com/nvidia/tacotron2
cd tacotron2
git submodule init; git submodule update
# (in requirements.txt delete numpy==1.13.3)
pip install numpy==1.16 numba==0.48 tensorboard==2.5.0 jupyter
pip install -r req*
inference
# (download waveglow and tacotron checkpoints)
# (write in terminal: "jupyter notebook" and navigate to "inference.ipynb")
# (adjust checkpoint paths and run)
train
prepare dataset and filelists (default test dataset)
wget https://data.keithito.com/data/speech/LJSpeech-1.1.tar.bz2 (inside tacotron2 git cloned folder)
tar -xvf *.tar.bz2
sed -i -- 's,DUMMY,LJSpeech-1.1/wavs,g' filelists/*.txt
edit hparams.py (batch size, filelists location, amp, text cleaners)
edit text/symbols.py (add symbols for trained language)
single gpu
python train.py --output_directory=outdir --log_directory=logdir
multiple gpu
python multiproc train.py --output_directory=outdir --log_directory=logdir --n_gpus <number of gpus>
continue training. non-english language can be trained faster by continuing to train on english checkpoint. non-english continued checkpoint can be inferenced with english waveglow checkpoint. after 100 epochs good speech quality on a dataset with 3700wav files
python multiproc train.py --output_directory=outdir --log_directory=logdir -c tacotron2_statedict.pt --warm_start