DiLiangWU
  • Westlake University
  • Hangzhou

Organizations

@Audio-WestlakeU

A Python implementation of COP-KMEANS algorithm

Python 159 45 Updated Mar 11, 2019

A description of "RealMAN: A Real-Recorded and Annotated Microphone Array Dataset for Dynamic Speech Enhancement and Localization" [NIPS 2024]

Python 71 7 Updated Sep 29, 2024

Some comprehensive papers about speaker diarization

199 3 Updated Aug 13, 2024

Rotary Transformer

Python 789 48 Updated Mar 21, 2022

Tools for handling speech data in machine learning projects.

Python 935 214 Updated Oct 1, 2024

Python interface to the WebRTC Voice Activity Detector

C 2,024 406 Updated Jul 4, 2024

Mamba SSM architecture

Python 12,720 1,070 Updated Sep 26, 2024

The official PyTorch implementation of "Frame-wise streaming end-to-end speaker diarization with non-autoregressive self-attention-based attractors". [ICASSP 2024]

Python 75 4 Updated Jan 24, 2024
Shell 47 7 Updated May 11, 2024

[Unofficial] PyTorch implementation of "Conformer: Convolution-augmented Transformer for Speech Recognition" (INTERSPEECH 2020)

Python 944 174 Updated Dec 22, 2023

The dataset of Speech Recognition

383 72 Updated Jul 2, 2024

A PyTorch implementation of the paper "ANSD-MA-MSE: Adaptive Neural Speaker Diarization Using Memory-Aware Multi-Speaker Embedding"

Shell 43 2 Updated Sep 19, 2024

Multimodal speaker diarization using pre-trained audio-visual synchronization model

Python 9 6 Updated May 12, 2020

Both audio-only and audio-visual speaker diarization datasets are listed here.

10 Updated Feb 22, 2023

Out of time: automated lip sync in the wild

Python 651 145 Updated Jan 23, 2024

The project is associated with the recently-launched ICASSP 2022 Multi-channel Multi-party Meeting Transcription Challenge (M2MeT) to provide participants with baseline systems for speech recogniti…

Python 111 18 Updated Jun 10, 2022

kaldi-asr/kaldi is the official location of the Kaldi project.

Shell 14,170 5,318 Updated Sep 16, 2024

End-to-End Neural Diarization

Python 368 57 Updated Aug 30, 2021
Python 71 9 Updated Aug 21, 2024
Python 42 8 Updated Jan 15, 2021

This repository contains a set of codes to run (i.e., train, perform inference with, evaluate) a diarization method called EEND-vector-clustering.

Python 70 17 Updated Oct 18, 2022

A curated list of awesome Speaker Diarization papers, libraries, datasets, and other resources.

1,580 225 Updated Sep 20, 2024

A PyTorch implementation of End-to-End Neural Diarization

Python 98 15 Updated Jun 19, 2023

Exploring Unsupervised Cell Recognition with Prior Self-activation Maps (MICCAI 2023)

Python 8 1 Updated Oct 27, 2023

A library built for easier audio self-supervised training, downstream tasks evaluation

Python 98 10 Updated Aug 27, 2024

PyTorch implementation of "FullSubNet: A Full-Band and Sub-Band Fusion Model for Real-Time Single-Channel Speech Enhancement."

Python 538 153 Updated Aug 19, 2023

The official repo: "McNet: Fuse Multiple Cues for Multichannel Speech Enhancement", ICASSP 2023

Python 102 13 Updated Mar 24, 2023

This repo includes the official implementations of "Fine-tune the pretrained ATST model for sound event detection".

Jupyter Notebook 84 12 Updated Aug 17, 2024

Official PyTorch implementation of "RVAE-EM: Generative speech dereverberation based on recurrent variational auto-encoder and convolutive transfer function" [ICASSP2024]

Python 40 4 Updated Mar 20, 2024

This repository is the official implementation of "Unimodal Aggregation for CTC-based Speech Recognition".

Shell 13 3 Updated Sep 23, 2024