- Metz, France
- https://cassiotbatista.github.io
Lists (14)
ASR: speech recognition
career
DataPipe
FA: forced alignment
FE: front end stuff in ASR pipeline
Linux: bunch of plugins and dotfiles 😋
LLM
OnDev: on device processing tools
SD: speaker diarization (potentially online-based)
SER: speech emotion recognition
SPE: speaker profiling estimation
SSL: self supervised learning in speechproc
TTS: speech synthesis
VAD: voice activity detection
Stars
🌱 a curated list of tools to help you with your research/life; I built a front end around this repo, please use the link below [This repo is Not Maintained Anymore]
The official repository of Dynamic-SUPERB.
Efficient implementations of Needleman-Wunsch and other sequence alignment algorithms written in Rust with Python bindings via PyO3.
Artie Bias Corpus: an audio corpus + code for detecting demographic bias
MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases. In ICML 2024.
Machine Learning Engineering Open Book
Collection of audio-focused loss functions in PyTorch
Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability.
Multilingual Voice Understanding Model
[NeurIPS'23 Spotlight] "Mind2Web: Towards a Generalist Agent for the Web"
A one stop repository for generative AI research updates, interview resources, notebooks and much more!
Olive is an easy-to-use hardware-aware model optimization tool that composes industry-leading techniques across model compression, optimization, and compilation.
PyTorch implementation of Audio Flamingo: A Novel Audio Language Model with Few-Shot Learning and Dialogue Abilities.
💯 Curated coding interview preparation materials for busy software engineers
Portable package manager for Neovim that runs everywhere Neovim runs. Easily install and manage LSP servers, DAP servers, linters, and formatters.
Whisper fine-tuning event script to use multiple hf datasets
Whisper finetuned on VinBigdata-VLSP2020-100h + KenLM
Fine-tune and evaluate Whisper models for Automatic Speech Recognition (ASR) on custom datasets or datasets from huggingface.
Fine-tune the Whisper speech recognition model to support training without timestamp data, training with timestamp data, and training without speech data. Accelerate inference and support Web deplo…
Whisper realtime streaming for long speech-to-text transcription and translation
An evolving, large-scale and multi-domain ASR corpus for low-resource languages with automated crawling, transcription and refinement
Collection of scripts from mHuBERT-147.