Project to recognize user's voice through Deep Learning with Python with the LibriSpeech data.

Martins6/speaker_recognition

Voice Recognition

Project to recognize voices with neural networks in Python. The main purpose of this project is to explore the capabilities of neural networks and of signal feature extraction for telling different voices apart. The project grew out of the idea of giving a virtual assistant the ability to recognize only your own voice. The idea came to me while contributing to the Jarvis open-source project, a virtual assistant for desktop.

Data preparation

The dataset used in this project comes solely from LibriSpeech. The files are very well organized. I've built two Jupyter Notebooks: the first processes the dataset into a set of files for each speaker id (LibriSpeech_Files_Pre_Processing.ipynb), and the second extracts features from their voices (Signal_Feature_Extraction.ipynb). You can set hyperparameters to prepare more data automatically. In the most recent run, I used 30 different speakers with approximately 30 seconds of audio recordings from each.
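The per-speaker grouping step can be sketched as follows. This is a minimal illustration, not the code from the notebooks: it assumes the standard LibriSpeech layout (`<subset>/<speaker_id>/<chapter_id>/<speaker_id>-<chapter_id>-<utterance>.flac`), and the function name `group_files_by_speaker` and its `max_speakers` parameter are hypothetical.

```python
from collections import defaultdict
from pathlib import Path


def group_files_by_speaker(subset_dir, max_speakers=30):
    """Map speaker id -> sorted list of .flac paths under a LibriSpeech subset.

    Assumes the standard layout: <subset_dir>/<speaker_id>/<chapter_id>/*.flac.
    `max_speakers` is a hypothetical hyperparameter limiting how many
    speakers are kept.
    """
    files_by_speaker = defaultdict(list)
    for flac_path in sorted(Path(subset_dir).glob("*/*/*.flac")):
        # The speaker id is the name of the directory two levels above the file.
        speaker_id = flac_path.parent.parent.name
        files_by_speaker[speaker_id].append(flac_path)

    # Keep the first `max_speakers` ids in sorted (lexicographic) order
    # so that repeated runs select the same speakers.
    kept = sorted(files_by_speaker)[:max_speakers]
    return {speaker_id: files_by_speaker[speaker_id] for speaker_id in kept}
```

Calling `group_files_by_speaker("LibriSpeech/train-clean-100")` would then return a dictionary whose keys are speaker ids and whose values are the audio files to pass on to feature extraction. The path here is only an example of where a LibriSpeech subset might be unpacked.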

Neural Network Model

I've chosen to model the features extracted from the signal with dense-layer neural networks (a.k.a. Deep Learning). So far I have been able to achieve 84% accuracy. I'm trying to avoid feeding in a whole lot of data and am exploring how to achieve more with less.
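To make the idea of a dense-layer classifier concrete, here is a rough NumPy sketch of the forward pass of such a network. It is not the model from this repository: the framework, the layer sizes, and the 40-dimensional feature vector are all assumptions chosen for illustration, and only the 30 output classes mirror the 30 speakers mentioned above. The weights are random, so the predictions are meaningless until the network is trained.

```python
import numpy as np


def init_dense_network(layer_sizes, seed=0):
    """Create (weights, biases) for each dense layer with He-style initialisation."""
    rng = np.random.default_rng(seed)
    params = []
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        weights = rng.normal(0.0, np.sqrt(2.0 / n_in), size=(n_in, n_out))
        biases = np.zeros(n_out)
        params.append((weights, biases))
    return params


def predict_proba(params, features):
    """Forward pass: ReLU on hidden layers, softmax on the output layer."""
    activations = features
    for weights, biases in params[:-1]:
        activations = np.maximum(0.0, activations @ weights + biases)
    weights, biases = params[-1]
    logits = activations @ weights + biases
    # Subtract the row-wise maximum before exponentiating for numerical stability.
    logits = logits - logits.max(axis=1, keepdims=True)
    exp_logits = np.exp(logits)
    return exp_logits / exp_logits.sum(axis=1, keepdims=True)


# Hypothetical sizes: 40 input features, two hidden layers, 30 speakers.
params = init_dense_network([40, 128, 64, 30])
batch = np.random.default_rng(1).normal(size=(5, 40))
probabilities = predict_proba(params, batch)
predicted_speakers = probabilities.argmax(axis=1)
```

Each row of `probabilities` is a distribution over the 30 speakers, and `predicted_speakers` holds the index of the most likely speaker for each input row.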

Future Work and Contributions

I hope to explore the Fourier transform and wavelets further for extracting features from signals. I also want to explore different types of neural networks to achieve better results.
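As one possible starting point for Fourier-based features, the sketch below turns a raw signal into a fixed-size vector by averaging the log-magnitude short-time spectrum and pooling it into bands. It is an illustration under stated assumptions, not the method used in Signal_Feature_Extraction.ipynb: the function name `spectral_features` and the frame, hop, and band settings are hypothetical.

```python
import numpy as np


def spectral_features(signal, frame_length=512, hop_length=256, n_bands=20):
    """Average log-magnitude spectrum over time, pooled into `n_bands` bands."""
    signal = np.asarray(signal, dtype=float)
    if signal.size < frame_length:
        raise ValueError("signal is shorter than one frame")
    window = np.hanning(frame_length)
    n_frames = 1 + (signal.size - frame_length) // hop_length
    # Slice the signal into overlapping frames and apply a Hann window to each.
    frames = np.stack([
        signal[i * hop_length : i * hop_length + frame_length] * window
        for i in range(n_frames)
    ])
    # rfft returns frame_length // 2 + 1 frequency bins for real-valued input.
    magnitudes = np.abs(np.fft.rfft(frames, axis=1))
    log_spectrum = np.log1p(magnitudes).mean(axis=0)
    # Pool neighbouring bins into near-equal-width bands for a fixed-size vector.
    bands = np.array_split(log_spectrum, n_bands)
    return np.array([band.mean() for band in bands])


# Synthetic one-second 440 Hz tone at 16 kHz (the LibriSpeech sampling rate).
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
features = spectral_features(np.sin(2 * np.pi * 440.0 * t))
```

Real recordings would replace the synthetic tone, and the resulting vectors could be fed to a classifier. Equal-width bands are the simplest choice here; a perceptually motivated spacing such as the mel scale is a common alternative.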

Acknowledgment

Jurgen Arias has documented a similar project very well. It really helped me kickstart this project.
