NWPU, China
Stars
Ultra-low-bitrate neural audio codec (0.31~1.40 kbps) with better semantics in the latent space.
Expressive Anechoic Recordings of Speech (EARS)
Predicts the level of noise and reverberation in your audio files
Some comprehensive papers about speaker diarization
Daily tracking of awesome audio papers, including music generation, zero-shot TTS, ASR, and audio generation
Official implementation of Efficient Speech Separation Framework Based on Neural State-Space Models
This project aims to reproduce Sora (OpenAI's T2V model); we hope the open-source community will contribute to this project.
Open-Sora: Democratizing Efficient Video Production for All
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
Python package for combining diarization system outputs.
Variational Bayes HMM over x-vectors diarization
Structured state space sequence models
This repo contains the official PyTorch implementation of "Audio Super Resolution in the Spectral Domain" (ICASSP 2023)
A curated list of awesome Speaker Diarization papers, libraries, datasets, and other resources.
VB Diarization with Eigenvoice and HMM Priors, refactored
State-of-the-art audio codec with 90x compression factor. Supports 44.1kHz, 24kHz, and 16kHz mono/stereo audio.
CHiME-7 diarization champion system: neural speaker diarization using memory-aware multi-speaker embedding with sequence-to-sequence architecture
HiFi-GAN: High Fidelity Denoising and Dereverberation Based on Speech Deep Features in Adversarial Networks
The Microsoft Scalable Noisy Speech Dataset (MS-SNSD) is a noisy speech dataset that can scale to arbitrary sizes depending on the number of speakers, noise types, and Speech to Noise Ratio (SNR) l…
BAE-NET: A LOW COMPLEXITY AND HIGH FIDELITY BANDWIDTH-ADAPTIVE NEURAL NETWORK FOR SPEECH SUPER-RESOLUTION
Official repository of NeXt-TDNN for speaker verification
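Extract WeChat chat history and export it to HTML, Word, and Excel documents for permanent storage; analyze the chat history to generate an annual chat report; and use the chat data to train a personal AI chat assistant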
Python package to add text to images, textures and different backgrounds
speech-enhancement
(N=1,2,3)-dimensional unfold (im2col) and fold (col2im) in PyTorch
A series of large language models trained from scratch by developers @01-ai
BlueLM (蓝心大模型): Open large language models developed by vivo AI Lab