- Tsinghua University
- Beijing, China
- https://longcw.github.io
Stars
Build real-time multimodal AI applications 🤖🎙️📹
Access large archives as a filesystem efficiently, e.g., TAR, RAR, ZIP, GZ, BZ2, XZ, ZSTD archives
[ICCV 2023] ProPainter: Improving Propagation and Transformer for Video Inpainting
An optimized pipeline for DINet that reduces inference latency by up to 60% 🚀. Kudos to the authors of the original repo for this amazing work.
The source code of "DINet: deformation inpainting network for realistic face visually dubbing on high resolution video."
[ECCV'24] TalkingGaussian: Structure-Persistent 3D Talking Head Synthesis via Gaussian Splatting
Official repo for "GaussianAvatars: Photorealistic Head Avatars with Rigged 3D Gaussians"
🤢 LipSick: Fast, High Quality, Low Resource Lipsync Tool 🤮
Inference and training library for high-quality TTS models.
Speech To Speech: an effort toward an open-source, modular GPT-4o
💬 An extensive collection of exceptional resources dedicated to the captivating world of talking face synthesis! ⭐ If you find this repo useful, please give it a star! 🤩
The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…
[CVPR 2024] Sparse Global Matching for Video Frame Interpolation with Large Motion
End-to-end stack for WebRTC. SFU media server and SDKs.
Multilingual large voice generation model, providing full-stack inference, training, and deployment capabilities.
Lifelike Audio-Driven Portrait Animations through Editable Landmark Conditioning
Industry-leading face manipulation platform
[ICLR 2024] Generalizable and Precise Head Avatar from Image(s)
High-Quality Human Motion Video Generation with Confidence-aware Pose Guidance
MuseV: Infinite-length and High Fidelity Virtual Human Video Generation with Visual Conditioned Parallel Denoising
Unified Efficient Fine-Tuning of 100+ LLMs (ACL 2024)
PyTorch implementation for NED (CVPR 2022). It can be used to manipulate the facial emotions of actors in videos based on emotion labels or reference styles.
ICPR 2020: Facial Expression Recognition using Residual Masking Network
Real3D-Portrait: One-shot Realistic 3D Talking Portrait Synthesis; ICLR 2024 Spotlight; Official code
A generative speech model for daily dialogue.