loretoparisi
🐍 NightShift

Organizations

@Musixmatchdev @musixmatchresearch
Stars

Audio

43 repositories

Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding

Jupyter Notebook 5,509 726 Updated Jul 12, 2024

🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production

Python 31,958 3,837 Updated Jul 8, 2024

🐸STT - The deep learning toolkit for Speech-to-Text. Training and deploying STT models has never been so easy.

C++ 2,199 265 Updated Mar 11, 2024

Speech recognition tool to convert audio to text transcripts, for Linux and Raspberry Pi.

C 422 31 Updated Jul 1, 2022

🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.

Python 129,183 25,612 Updated Jul 12, 2024

A repository for demos illustrating features of the Web Speech API. See https://developer.mozilla.org/en-US/docs/Web/API/Web_Speech_API for more details.

JavaScript 1,415 731 Updated Sep 10, 2022

Experiments with Hugging Face 🔬 🤗

Python 44 6 Updated Jun 17, 2024
Python 49 14 Updated May 31, 2023

Official implementation of the RAVE model: a Realtime Audio Variational autoEncoder

Python 1,254 176 Updated Apr 22, 2024

Control adaptive filters with neural networks.

Python 214 39 Updated Oct 4, 2023

Who calls the shots? Rethinking Few-Shot Learning for Audio (WASPAA 2021)

Python 40 6 Updated May 24, 2022

A lightweight yet powerful audio-to-MIDI converter with pitch bend detection

Python 3,138 242 Updated Jul 11, 2024

Robust Speech Recognition via Large-Scale Weak Supervision

Python 64,470 7,511 Updated Jul 2, 2024

PyTorch implementation of DiffRoll, a diffusion-based generative automatic music transcription (AMT) model

Jupyter Notebook 64 11 Updated Dec 6, 2023

State-of-the-art deep learning based audio codec supporting both mono 24 kHz audio and stereo 48 kHz audio.

Python 3,305 298 Updated Jan 4, 2024

Port of OpenAI's Whisper model in C/C++

C++ 33,081 3,311 Updated Jul 12, 2024

SDX23 startkit for the Demucs baselines.

Python 21 1 Updated Mar 3, 2023

Waveform generation from audio file

JavaScript 4 Updated Jan 24, 2020

Tiny data-over-sound library

C++ 1,895 141 Updated Feb 3, 2024

JAX implementation of OpenAI's Whisper model for up to 70x speed-up on TPU.

Jupyter Notebook 4,252 358 Updated Apr 3, 2024

A fast, local neural text to speech system

C++ 5,095 357 Updated Jul 11, 2024

🔊 Text-Prompted Generative Audio Model

Jupyter Notebook 33,799 4,011 Updated Jul 10, 2024

Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable…

Python 20,207 2,025 Updated Jun 19, 2024

FFmpeg for browser, powered by WebAssembly

C 13,504 779 Updated Jul 11, 2024

A Web-based FFProbe. Powered by FFmpeg, Vue and Web Assembly!

Vue 139 33 Updated Dec 13, 2023

The Open Source Code of UniAudio

Python 479 31 Updated May 3, 2024

Code for the paper Hybrid Spectrogram and Waveform Source Separation

Python 7,931 989 Updated Apr 24, 2024

Distilled variant of Whisper for speech recognition. 6x faster, 50% smaller, within 1% word error rate.

Python 3,348 243 Updated Jul 12, 2024
Python 356 30 Updated Nov 6, 2023
Python 146 4 Updated Feb 14, 2024