
Wav2Letter Speech Recognition with OneFlow

An implementation of Wav2Letter, a speech recognition model from Facebook AI Research (FAIR), in OneFlow.

Requirements

pip install -r requirements.txt

Data

We train and evaluate our models on the Google Speech Commands dataset, a lightweight, easy-to-use dataset for testing model performance.

Data Preprocess

data.py contains scripts that process the Google Speech Commands audio data into features compatible with Wav2Letter.

Each clip is converted into 13 MFCC features per frame, with a maximum length of 250 frames (these are short audio clips). Clips with fewer frames are padded with zeros. Target data is integer-encoded and padded to a common length. The final outputs are NumPy arrays saved as x.npy and y.npy in the ./speech_data directory.
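The padding and encoding steps described above can be sketched roughly as follows. This is a minimal illustration, not the code in data.py: the helper names `pad_features` and `encode_targets`, the (frames, coefficients) array layout, the truncation of clips longer than 250 frames, the character-level vocabulary, and the use of 0 as the target padding value are all assumptions made for the example.

```python
import numpy as np

MAX_FRAMES = 250  # maximum frame length stated in the README
N_MFCC = 13       # number of MFCC features stated in the README


def pad_features(mfcc, max_frames=MAX_FRAMES):
    """Zero-pad a (frames, n_mfcc) array to max_frames rows.

    Clips longer than max_frames are truncated (an assumption; the README
    only describes padding).
    """
    mfcc = np.asarray(mfcc, dtype=np.float32)[:max_frames]
    padded = np.zeros((max_frames, mfcc.shape[1]), dtype=np.float32)
    padded[: mfcc.shape[0]] = mfcc
    return padded


def encode_targets(words, pad_value=0):
    """Integer-encode words character by character and pad to equal length.

    Hypothetical vocabulary: sorted characters, with index 0 reserved
    for padding.
    """
    chars = sorted(set("".join(words)))
    char_to_index = {c: i + 1 for i, c in enumerate(chars)}
    max_len = max(len(w) for w in words)
    encoded = np.full((len(words), max_len), pad_value, dtype=np.int64)
    for row, word in enumerate(words):
        encoded[row, : len(word)] = [char_to_index[c] for c in word]
    return encoded, char_to_index


# Stand-in MFCC matrices; real features would come from the audio files.
clips = [np.random.rand(80, N_MFCC), np.random.rand(120, N_MFCC)]
x = np.stack([pad_features(c) for c in clips])
y, vocab = encode_targets(["yes", "no"])
print(x.shape, y.shape)  # (2, 250, 13) (2, 3)
```

Arrays shaped like `x` and `y` could then be written with `np.save` to produce files such as x.npy and y.npy.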

Train

bash train.sh

Infer

bash infer.sh

WER

oneflow: 0.3031
pytorch: 0.3099
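For readers unfamiliar with the metric, word error rate is the word-level edit distance (substitutions, deletions, insertions) between a reference transcript and a hypothesis, divided by the number of reference words. The sketch below shows one common way to compute it for a single utterance. The function name `word_error_rate` is hypothetical, and the README does not state how the figures above were computed or aggregated, so this should be read as an illustration of the metric only.

```python
def word_error_rate(reference, hypothesis):
    """Return word-level edit distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words matches an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(
                min(
                    prev[j] + 1,         # deletion
                    curr[j - 1] + 1,     # insertion
                    prev[j - 1] + cost,  # match or substitution
                )
            )
        prev = curr
    return prev[-1] / len(ref)


# One deletion ("the") and one substitution ("on" -> "off") over 4 words.
print(word_error_rate("turn the light on", "turn light off"))  # 0.5
```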
