open-source multimodal large language model that can hear and talk while thinking, featuring real-time end-to-end speech input and streaming audio output for conversation.

Python 2,854 259 Updated Sep 25, 2024

hirofumi0810 / asr_preprocessing

Python implementation of pre-processing for End-to-End speech recognition

Python 69 23 Updated Feb 19, 2018

fschmid56 / EfficientAT

This repository aims at providing efficient CNNs for Audio Tagging. We provide AudioSet pre-trained models ready for downstream training and extraction of audio embeddings.

Python 223 43 Updated Apr 29, 2024

gmendes9 / multilingual_va_prediction

Repository for Quantifying Valence and Arousal in Text with Multilingual Pre-trained Transformers

Python 25 Updated Feb 26, 2023

AI-S2-Lab / GPT-Talker

[ACMMM'2024] Generative Expressive Conversational Speech Synthesis

18 1 Updated Aug 20, 2024

youngyangyang04 / leetcode-master

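《代码随想录》 LeetCode practice guide: a recommended solving order for 200 classic problems, 600,000 characters of detailed illustrated explanations, video walkthroughs of the difficult points, and more than 50 mind maps, with versions in C++, Java, Python, Go, JavaScript, and other languages, so that learning algorithms is no longer bewildering! 🔥🔥 Take a look and you will wish you had found it sooner! 🚀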

Shell 51,197 11,408 Updated Oct 14, 2024

mct10 / RepCodec

Models and code for RepCodec: A Speech Representation Codec for Speech Tokenization

Python 151 10 Updated Jul 12, 2024

jrgillick / laughter-detection

Python 216 47 Updated Jul 25, 2024

Linear95 / CLUB

Code for ICML2020 paper - CLUB: A Contrastive Log-ratio Upper Bound of Mutual Information

Jupyter Notebook 305 39 Updated May 10, 2024

winddori2002 / DEX-TTS

DEX-TTS: Diffusion-based EXpressive TTS with Style Modeling on Time Variability

Python 86 6 Updated Jul 10, 2024

liutaocode / TTS-arxiv-daily

Automatically update text-to-speech (TTS) papers daily using GitHub Actions (updates every 12 hours)

Python 247 19 Updated Oct 15, 2024

、、 hopingZ

Lists (1)

LightningCLI Examples

Stars

OpenBMB / MiniCPM

lochenchou / MOSNet

lucidrains / minGRU-pytorch

xingchensong / S3Tokenizer

yangdongchao / RSTnet

MahmoudAshraf97 / whisper-diarization

haoheliu / SemantiCodec-inference

kyutai-labs / moshi

YuanGongND / vocalsound

numediart / EmoV-DB

FireRedTeam / FireRedTTS

RWKV / rwkv.cpp

Plachtaa / seed-vc

vectominist / spin

jishengpeng / WavTokenizer

jingzhunxue / flow_mirror

XiaoMi / dasheng

gwh22 / LAFMA

HindujaB / AudioPreprocessing_amicorpus

gpt-omni / mini-omni