Starred repositories
An implementation of WaveNet with fast generation
A Flow-based Generative Network for Speech Synthesis
Tacotron 2 - PyTorch implementation with faster-than-realtime inference
State-of-the-Art Deep Learning scripts organized by models - easy to train and deploy with reproducible accuracy and performance on enterprise-grade infrastructure.
Implements the unsupervised pre-training of convolutional neural networks
Python Audio Analysis Library: Feature Extraction, Classification, Segmentation and Applications
The Places365-CNNs for Scene Classification
🎙 Speech recognition using the TensorFlow deep learning framework and sequence-to-sequence neural networks
DeepSpeech is an open source embedded (offline, on-device) speech-to-text engine which can run in real time on devices ranging from a Raspberry Pi 4 to high power GPU servers.
Speaker diarization by UIS-RNN and speaker embedding by vgg-speaker-recognition
Speaker diarization scripts, based on AaltoASR
PyTorch implementation of "Generalized End-to-End Loss for Speaker Verification" by Wan, Li et al.
TensorFlow implementation of "Generalized End-to-End Loss for Speaker Verification"
A curated list of awesome Speaker Diarization papers, libraries, datasets, and other resources.
This is the library for the Unbounded Interleaved-State Recurrent Neural Network (UIS-RNN) algorithm, corresponding to the paper Fully Supervised Speaker Diarization.
Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding
End-to-end Automatic Speech Recognition for Mandarin and English in TensorFlow
Python port of SymSpell: 1 million times faster spelling correction & fuzzy search through Symmetric Delete spelling correction algorithm
Easy/Updated TensorFlow Image Classification
An opinionated list of awesome Python frameworks, libraries, software and resources.
A TensorFlow Implementation of Tacotron: A Fully End-to-End Text-To-Speech Synthesis Model