nachotron-voice

A project completed in December 2022 that trains a GAN-based Text-to-Speech (TTS) model to clone the voice of Nach. Examples:

For the project, we built a dataset of Nach's voice by using Demucs to separate the vocals from the music across the discography and Whisper to transcribe them. We then trained a TTS model using a CoquiTTS model as the base, and had fun generating new songs with it. A presentation describing the project, prepared for the course 'Deep Learning for Audio Signal Processing', is available in PDF or PowerPoint format.

Code

All the project work was done using Colab, with the following notebooks:

Data Collection. Unpack the original discography from a zip archive and store each song separately.
Music Source Separation. Separate voice from music using Demucs for all songs.
Speech Transcription. Transcribe voice from all songs using Whisper.
Noise Reduction. Reduce noise from all voice tracks.
Dataset Preparation. Cut voice into segments detected as separate sentences by Whisper.
Speaker Identification. Perform speaker identification to detect Nach's voice, and filter out other voices and cuts without voice.
Train GAN. Train a TTS model using CoquiTTS as the base.
Align Voice and Music. Experiment to automatically align voice generated with our model and a music track.
Other Training. Retrain the model with a different dataset, and with different parameters.
Demo. Demo of the model: loads a model checkpoint and generates a voice track from a given text.
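
The Dataset Preparation and Speaker Identification steps above can be sketched roughly as follows. This is a minimal illustration, not the code from the notebooks. It assumes segments shaped like the dictionaries Whisper returns under the `"segments"` key of a transcription result (`start`, `end`, `text`), a mono waveform held as a plain list of samples, and one embedding vector per clip produced by some speaker-embedding model. The function names, the duration limits, and the `0.75` similarity threshold are hypothetical choices made for this example.

```python
import math


def cut_segments(samples, sample_rate, segments, min_seconds=1.0, max_seconds=15.0):
    """Slice a mono waveform into (clip, text) pairs using segment timestamps.

    `segments` is a list of dicts with "start", "end" (seconds) and "text",
    the shape assumed here for Whisper's per-sentence output.
    """
    clips = []
    for seg in segments:
        duration = seg["end"] - seg["start"]
        text = seg["text"].strip()
        # Skip empty transcriptions and clips outside the (hypothetical) limits.
        if not text or not (min_seconds <= duration <= max_seconds):
            continue
        start = int(seg["start"] * sample_rate)
        end = int(seg["end"] * sample_rate)
        clips.append((samples[start:end], text))
    return clips


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def keep_target_speaker(embeddings, reference, threshold=0.75):
    """Return indices of clips whose embedding is close to the reference voice.

    The threshold is an illustrative value, not one taken from the project.
    """
    return [
        i for i, emb in enumerate(embeddings)
        if cosine_similarity(emb, reference) >= threshold
    ]
```

In practice the reference embedding would be computed from a few clips known to contain only the target voice, and the threshold tuned by listening to what gets kept and discarded.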

The source data and the resulting dataset are not released due to potential copyright issues. However, the notebooks can be used to generate a similar dataset from any discography and replicate the voice cloning (possibly with a more recent and better model).
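
When rebuilding a similar dataset, the clips and their transcriptions need to be listed in some index file for training. The sketch below writes the pipe-delimited `id|text|normalized text` layout popularized by the LJSpeech dataset, which Coqui TTS can read with its `ljspeech` formatter. Whether the original notebooks used this layout is an assumption, and `format_metadata` and the clip ids are hypothetical names for this example.

```python
def format_metadata(clips):
    """Build LJSpeech-style metadata text from (clip_id, text) pairs.

    Each output line is `clip_id|text|text`. The pipe character is removed
    from the transcription because it is the field delimiter, and runs of
    whitespace are collapsed to single spaces.
    """
    lines = []
    for clip_id, text in clips:
        clean = " ".join(text.replace("|", " ").split())
        lines.append(f"{clip_id}|{clean}|{clean}\n")
    return "".join(lines)


if __name__ == "__main__":
    # Hypothetical clip ids; the matching audio would live at wavs/<clip_id>.wav.
    print(format_metadata([("song01_0001", " primera frase"), ("song01_0002", "segunda frase")]), end="")
```

The same text is written to both transcription columns here; a real pipeline might expand numbers and abbreviations in the normalized column.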

There are many aspects of this project that could be improved, but it was created in just a weekend purely for fun. We hope you enjoy the results!

Acknowledgments

We thank Nach for many years of great music, and the resources and libraries used for this project: Colab, CoquiTTS, Demucs, deep-speaker and Whisper.
