Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding

Jupyter Notebook 5,902 754 Updated Sep 11, 2024

State-of-the-art audio codec with 90x compression factor. Supports 44.1kHz, 24kHz, and 16kHz mono/stereo audio.

Python 1,122 101 Updated Jul 11, 2024

Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable…

Python 20,600 2,092 Updated Jul 18, 2024

A feature-rich command-line audio/video downloader

Python 82,435 6,426 Updated Sep 14, 2024

Inference and training library for high-quality TTS models.

Python 4,184 411 Updated Aug 19, 2024

HellaSwag: Can a Machine _Really_ Finish Your Sentence?

Python 177 22 Updated May 28, 2020

Implementation of AudioLM, a SOTA Language Modeling Approach to Audio Generation out of Google Research, in Pytorch

Python 2,374 255 Updated Jan 27, 2024

Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper

Jupyter Notebook 3,313 274 Updated Sep 5, 2024

Simple and efficient Python implementation of a series of adaptive filters, including time-domain adaptive filters (LMS, NLMS, RLS, AP, Kalman), nonlinear adaptive filters (Volterra filter, functional link a…

Python 316 97 Updated Nov 29, 2021

Adaptive filtering module for Python

Python 101 35 Updated Jul 14, 2024

Python high-level interface and ctypes-based bindings for PulseAudio (libpulse)

Python 170 36 Updated Aug 27, 2024

LLM101n: Let's build a Storyteller

28,259 1,540 Updated Aug 1, 2024

macOS system extension that allows applications to pass audio to other applications. Soundflower works on macOS Catalina.

Objective-C 8,858 610 Updated Feb 1, 2021

Whisper realtime streaming for long speech-to-text transcription and translation

Python 1,766 217 Updated Sep 1, 2024

Faster Whisper transcription with CTranslate2

Python 11,370 950 Updated Aug 21, 2024

MARS5 speech model (TTS) from CAMB.AI

Jupyter Notebook 2,440 196 Updated Aug 1, 2024

1 min voice data can also be used to train a good TTS model! (few shot voice cloning)

Python 32,512 3,746 Updated Sep 13, 2024

Port of OpenAI's Whisper model in C/C++

C 34,406 3,498 Updated Sep 15, 2024

Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translatio…

Python 10,906 1,827 Updated Sep 5, 2024

A fast multimodal LLM for real-time voice

Python 845 45 Updated Sep 13, 2024

SpeechGPT Series: Speech Large Language Models

Python 1,218 81 Updated Jul 22, 2024

[MLSys 2024 Best Paper Award] AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration

Python 2,330 177 Updated Jul 16, 2024

MiniCPM-V 2.6: A GPT-4V Level MLLM for Single Image, Multi Image and Video on Your Phone

Python 11,871 836 Updated Sep 13, 2024

Graphic notes on Gilbert Strang's "Linear Algebra for Everyone"

PostScript 17,630 2,149 Updated Feb 4, 2024
JavaScript 116 24 Updated Apr 27, 2023

Examples in the MLX framework

Python 5,833 828 Updated Sep 14, 2024

🌟 The Multi-Agent Framework: First AI Software Company, Towards Natural Language Programming

Python 43,597 5,194 Updated Aug 21, 2024

JARVIS, a system to connect LLMs with ML community. Paper: https://arxiv.org/pdf/2303.17580.pdf

Python 23,547 1,959 Updated Apr 24, 2024

🧑‍🏫 60+ Implementations/tutorials of deep learning papers with side-by-side notes 📝; including transformers (original, xl, switch, feedback, vit, ...), optimizers (adam, adabelief, sophia, ...), ga…

Python 53,858 5,564 Updated Aug 24, 2024