An opensource speech-to-text software written in tensorflow.
Python3, portaudio19-dev and ffmpeg are required.
On Ubuntu install via
sudo apt install python3-pip portaudio19-dev ffmpeg
pip3 install git+https://github.com/timediv/speechT
Currently speechT is based on the Wav2Letter paper and the CTC loss function.
The speech corpus from http:https://www.openslr.org/12/ is automatically downloaded.
Note: The corpus is about 30GB!
The data must be preprocessed before training
speecht-cli preprocess
Then, to run the training, execute
speecht-cli train
Use --help
for more details.
To evaluate on the test set run
speecht-cli evaluate
Use --help
for more details.
To record using your microphone and then print the transcription run
speecht-cli record
Use --help
for more details.