Stars
A curated collection of SQA research
A toolkit dedicated to speech evaluation.
Speech Security and Privacy Compendium - Mini
The official code repo of "HTS-AT: A Hierarchical Token-Semantic Audio Transformer for Sound Classification and Detection"
Official repository for "AM-RADIO: Reduce All Domains Into One"
The official repository of Dynamic-SUPERB.
PyTorch implementation of the ICASSP-24 paper: "Improving Audio Captioning Models with Fine-grained Audio Features, Text Embedding Supervision, and LLM Mix-up Augmentation"
DinoSR: Self-Distillation and Online Clustering for Self-supervised Speech Representation Learning
TMT: Tri-Modal Translation between Speech, Image, and Text by Processing Different Modalities as Different Languages
Spoofing-robust speaker verification evaluation toolkit
VITS2: Improving Quality and Efficiency of Single-Stage Text-to-Speech with Adversarial Learning and Architecture Design
A toolkit for the Spoken Language Understanding Evaluation (SLUE) benchmark. Refer to the paper https://arxiv.org/abs/2111.10367 for more details. Official website: https://asappresearch.github.io/slue-toolkit/
Script to download corpora from the Linguistic Data Consortium (LDC)
Confidence interval computation for evaluation in machine learning using the bootstrapping approach
Scripts for data generation, scoring, and data manifest preparation for the CHiME-8 DASR task.
A repo containing download guidance and the corresponding scripts for the VoxBlink dataset.
An open source implementation of CLIP.
Research and Production Oriented Speaker Verification, Recognition and Diarization Toolkit
State-of-the-art 2D and 3D Face Analysis Project
Repository for EMNLP 2022 Paper: Towards a Unified Multi-Dimensional Evaluator for Text Generation
A collection of papers related to speech model compression