Skip to content

red4711/SpeechRecognition

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

2 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

#Program overview This github contains two different python script: one to train the neural network model and one to test it against a given data sets. The neural network is a recurrent neural network (LTSM) of 3 layer each with 256 nodes. The network takes in a 256^2 sized array which represents an equivalent spectrogram of the spoken word audio file. The data is already preprocessed. The neural network outputs the first letter of the letter. The resulted accuracy of the best neural network configuration is 40%.

Program dependencies: Tensorflow and scikit-image

To run the predictions on a trained model: run test-deep-net.py Outputs will be averages for N number of batches

To run training algorithm: You will need to acquire the data set at https://www.dropbox.com/s/2ff0x8z60bjjz7b/spoken_words.tar?dl=0 Extract all the data into a folder call spoken_words relative to your current directory of deep-net.py run deep-net.py

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages