GitHub

#Program overview This github contains two different python script: one to train the neural network model and one to test it against a given data sets. The neural network is a recurrent neural network (LTSM) of 3 layer each with 256 nodes. The network takes in a 256^2 sized array which represents an equivalent spectrogram of the spoken word audio file. The data is already preprocessed. The neural network outputs the first letter of the letter. The resulted accuracy of the best neural network configuration is 40%.

Program dependencies: Tensorflow and scikit-image

To run the predictions on a trained model: run test-deep-net.py Outputs will be averages for N number of batches

To run training algorithm: You will need to acquire the data set at https://www.dropbox.com/s/2ff0x8z60bjjz7b/spoken_words.tar?dl=0 Extract all the data into a folder call spoken_words relative to your current directory of deep-net.py run deep-net.py

Name		Name	Last commit message	Last commit date
Latest commit History 2 Commits
.gitattributes		.gitattributes
.gitignore		.gitignore
README.md		README.md
deep-net.py		deep-net.py
model.ckpt		model.ckpt
speech_data.py		speech_data.py
test-deep-net.py		test-deep-net.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

About

Releases

Packages

Languages

red4711/SpeechRecognition

Folders and files

Latest commit

History

Repository files navigation

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages