XiaoYuanJun-zy

carlito XiaoYuanJun-zy

4 followers · 37 following

Stars

Labmem-Zhouyx / CDFSE_FastSpeech2

The Official Implementation of “Content-Dependent Fine-Grained Speaker Embedding for Zero-Shot Speaker Adaptation in Text-to-Speech Synthesis”

Python 79 12 Updated Dec 20, 2022

PeiranLi0930 / L-SVD

Large-Scale Selfie Video Dataset (L-SVD): A Benchmark for Emotion Recognition

407 42 Updated Jun 10, 2024

BladeDancer957 / DualGATs

Code for the ACL 2023 paper "DualGATs: Dual Graph Attention Networks for Emotion Recognition in Conversations"

Python 58 12 Updated Aug 5, 2023

PetarV- / GAT

Graph Attention Networks (https://arxiv.org/abs/1710.10903)

Python 3,161 642 Updated Apr 9, 2022

yl4579 / StyleTTS2

StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models

Python 4,577 363 Updated Aug 10, 2024

Chris10M / Lip2Speech

A pipeline to read lips and generate speech for the read content, i.e., lip-to-speech synthesis.

Python 73 19 Updated Nov 25, 2021

FunAudioLLM / CosyVoice

Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability.

Python 3,912 372 Updated Aug 16, 2024

dingchaoyue / AcFormer

Python 18 1 Updated Aug 2, 2023

983632847 / All-in-One

All in One: Exploring Unified Vision-Language Tracking with Multi-Modal Alignment

Python 11 Updated Dec 27, 2023

pliang279 / awesome-multimodal-ml

Reading list for research topics in multimodal machine learning

5,778 836 Updated Jun 19, 2024

wyang-vis / EIFNet

Event-based Motion Deblurring with Modality-Aware Decomposition and Recomposition

Python 4 Updated Jul 8, 2024

Jay1Zhang / AVFAS

Python 2 Updated Jan 5, 2024

DreamMr / EST

Expression Snippet Transformer for Robust Video-based Facial Expression Recognition

Python 13 2 Updated Jan 27, 2024

nku-zhichengzhang / CTEN

[CVPR 2023] This is the official implementation of "Weakly Supervised Video Emotion Detection and Prediction via Cross-Modal Temporal Erasing Network"

Python 28 Updated Jul 10, 2024

sunlicai / MAE-DFER

MAE-DFER: Efficient Masked Autoencoder for Self-supervised Dynamic Facial Expression Recognition (ACM MM 2023)

Python 89 12 Updated Jan 16, 2024

liutaocode / TTS-arxiv-daily

Automatically update text-to-speech (TTS) papers daily using GitHub Actions (updates every 12 hours)

Python 177 17 Updated Aug 16, 2024

fengdu78 / deeplearning_ai_books

deeplearning.ai (notes and resources for Andrew Ng's deep learning courses)

HTML 17,795 5,860 Updated Apr 29, 2022

GalaxyCong / StyleDubber

[ACL 2024] This is the PyTorch code for our paper "StyleDubber: Towards Multi-Scale Style Learning for Movie Dubbing"

Python 30 2 Updated Aug 13, 2024

VikParuchuri / texify

Math OCR model that outputs LaTeX and markdown

Python 681 55 Updated Jun 30, 2024

kornia / kornia

Geometric Computer Vision Library for Spatial AI

Python 9,713 949 Updated Aug 15, 2024

JeongHun0716 / vsr-low

Visual Speech Recognition For Low-Resource Languages with Automatic Labels

Python 5 Updated May 21, 2024

YasserdahouML / VSR_test_set

WildVSR

Python 11 Updated Dec 13, 2023

facebookresearch / av_hubert

A self-supervised learning framework for audio-visual speech

Python 822 130 Updated Dec 7, 2023

Linzaer / Ultra-Light-Fast-Generic-Face-Detector-1MB

💎 1MB lightweight face detection model

Python 7,118 1,538 Updated Dec 29, 2023

modelscope / 3D-Speaker

A Repository for Single- and Multi-modal Speaker Verification, Speaker Recognition and Speaker Diarization

Python 1,015 89 Updated Aug 12, 2024

modelscope / FunASR

A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.

Python 5,509 597 Updated Aug 16, 2024

prajwalkr / vtp

Official Implementation of Visual Transformer Pooling for Lip reading

Roff 35 5 Updated Aug 8, 2022

ahaliassos / raven

Official implementation of RAVEn (ICLR 2023) and BRAVEn (ICASSP 2024)

Python 50 3 Updated Jul 18, 2024

mpc001 / auto_avsr

Auto-AVSR: Lip-Reading Sentences Project

Python 154 40 Updated Apr 16, 2024

Sally-SH / VSP-LLM

Python 286 25 Updated May 19, 2024
