Stars
Explicit Estimation of Magnitude and Phase Spectra in Parallel for High-Quality Speech Enhancement
Efficient Python library for Extended LSTM with exponential gating, memory mixing, and matrix memory for superior sequence modeling.
[ICCV 2023] DETRs with Collaborative Hybrid Assignments Training
Starter code for working with the YouTube-8M dataset.
The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights -- ResNet, ResNeXT, EfficientNet, NFNet, Vision Transformer (V…
PESQ (Perceptual Evaluation of Speech Quality) Wrapper for Python Users (narrow band and wide band)
Implementation of the paper "Multi-Metric Optimization using Generative Adversarial Networks for Near-End Speech Intelligibility Enhancement"
Support for Clarity Enhancement and Prediction Challenges (obsolete - see README)
Clarity Challenge toolkit - software for building Clarity Challenge systems
NISQA - Non-Intrusive Speech Quality and TTS Naturalness Assessment
[Unofficial] PyTorch implementation of "Conformer: Convolution-augmented Transformer for Speech Recognition" (INTERSPEECH 2020)
In defence of metric learning for speaker recognition
[CVPR 2022, Oral] Restormer: Efficient Transformer for High-Resolution Image Restoration. SOTA for motion deblurring, image deraining, denoising (Gaussian/real data), and defocus deblurring.
A time-domain extension to "Perceptual Contrast Stretching on Target Feature for Speech Enhancement"
Spectral Normalization for Keras Dense and Convolution Layers
MetricGAN: Generative Adversarial Networks based Black-box Metric Scores Optimization for Speech Enhancement (ICML 2019, with Travel awards)
End-to-end waveform utterance enhancement for direct evaluation metrics optimization by fully convolutional neural networks (TASLP 2018)