DiLiangWU

liangdi DiLiangWU

Joint PhD student at Zhejiang University & Westlake University. Research interests: speaker diarization, speaker recognition, speech and language processing.

6 followers · 3 following

Westlake University
Hangzhou

Stars

Behrouz-Babaki / COP-Kmeans

A Python implementation of COP-KMEANS algorithm

Python 159 45 Updated Mar 11, 2019

Audio-WestlakeU / RealMAN

A description of "RealMAN: A Real-Recorded and Annotated Microphone Array Dataset for Dynamic Speech Enhancement and Localization" [NeurIPS 2024]

Python 71 7 Updated Sep 29, 2024

DongKeon / Awesome-Speaker-Diarization

Some comprehensive papers about speaker diarization

199 3 Updated Aug 13, 2024

ZhuiyiTechnology / roformer

Rotary Transformer

Python 789 48 Updated Mar 21, 2022

lhotse-speech / lhotse

Tools for handling speech data in machine learning projects.

Python 935 214 Updated Oct 1, 2024

wiseman / py-webrtcvad

Python interface to the WebRTC Voice Activity Detector

C 2,024 406 Updated Jul 4, 2024

state-spaces / mamba

Mamba SSM architecture

Python 12,720 1,070 Updated Sep 26, 2024

Audio-WestlakeU / FS-EEND

The official PyTorch implementation of "Frame-wise streaming end-to-end speaker diarization with non-autoregressive self-attention-based attractors". [ICASSP 2024]

Python 75 4 Updated Jan 24, 2024

BUTSpeechFIT / EEND_dataprep

Shell 47 7 Updated May 11, 2024

sooftware / conformer

[Unofficial] PyTorch implementation of "Conformer: Convolution-augmented Transformer for Speech Recognition" (INTERSPEECH 2020)

Python 944 174 Updated Dec 22, 2023

double22a / speech_dataset

The dataset of Speech Recognition

383 72 Updated Jul 2, 2024

Maokui-He / NSD-MA-MSE

A PyTorch implementation of the paper "ANSD-MA-MSE: Adaptive Neural Speaker Diarization Using Memory-Aware Multi-Speaker Embedding"

Shell 43 2 Updated Sep 19, 2024

Rehan-Ahmad / MultimodalDiarization

Multimodal speaker diarization using pre-trained audio-visual synchronization model

Python 9 6 Updated May 12, 2020

liutaocode / AwesomeDiarizationDataset

Both audio-only and audio-visual speaker diarization datasets are listed here.

10 Updated Feb 22, 2023

joonson / syncnet_python

Out of time: automated lip sync in the wild

Python 651 145 Updated Jan 23, 2024

yufan-aslp / AliMeeting

The project is associated with the recently-launched ICASSP 2022 Multi-channel Multi-party Meeting Transcription Challenge (M2MeT) to provide participants with baseline systems for speech recogniti…

Python 111 18 Updated Jun 10, 2022

kaldi-asr / kaldi

kaldi-asr/kaldi is the official location of the Kaldi project.

Shell 14,170 5,318 Updated Sep 16, 2024

hitachi-speech / EEND

End-to-End Neural Diarization

Python 368 57 Updated Aug 30, 2021

BUTSpeechFIT / EEND

Python 71 9 Updated Aug 21, 2024

dodohow1011 / TS-VAD

Python 42 8 Updated Jan 15, 2021

nttcslab-sp / EEND-vector-clustering

This repository contains a set of codes to run (i.e., train, perform inference with, evaluate) a diarization method called EEND-vector-clustering.

Python 70 17 Updated Oct 18, 2022

wq2012 / awesome-diarization

A curated list of awesome Speaker Diarization papers, libraries, datasets, and other resources.

1,580 225 Updated Sep 20, 2024

Xflick / EEND_PyTorch

A PyTorch implementation of End-to-End Neural Diarization

Python 98 15 Updated Jun 19, 2023

cpystan / PSM

Exploring Unsupervised Cell Recognition with Prior Self-activation Maps (MICCAI 2023)

Python 8 1 Updated Oct 27, 2023

Audio-WestlakeU / audiossl

A library built for easier audio self-supervised training and downstream task evaluation

Python 98 10 Updated Aug 27, 2024

Audio-WestlakeU / FullSubNet

PyTorch implementation of "FullSubNet: A Full-Band and Sub-Band Fusion Model for Real-Time Single-Channel Speech Enhancement."

Python 538 153 Updated Aug 19, 2023

Audio-WestlakeU / McNet

The official repo: "McNet: Fuse Multiple Cues for Multichannel Speech Enhancement", ICASSP 2023

Python 102 13 Updated Mar 24, 2023

Audio-WestlakeU / ATST-SED

This repo includes the official implementations of "Fine-tune the pretrained ATST model for sound event detection".

Jupyter Notebook 84 12 Updated Aug 17, 2024

Audio-WestlakeU / RVAE-EM

Official PyTorch implementation of "RVAE-EM: Generative speech dereverberation based on recurrent variational auto-encoder and convolutive transfer function" [ICASSP2024]

Python 40 4 Updated Mar 20, 2024

Audio-WestlakeU / UMA-ASR

This repository is the official implementation of "Unimodal Aggregation for CTC-based Speech Recognition".

Shell 13 3 Updated Sep 23, 2024
