- The Chinese University of Hong Kong
- Hong Kong
- www.xtao.website
Stars
MiniCPM-V 2.6: A GPT-4V Level MLLM for Single Image, Multi Image and Video on Your Phone
CoTracker is a model for tracking any point (pixel) on a video.
DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model
Lumina-T2X is a unified framework for Text to Any Modality Generation
[SIGGRAPH Asia 2023 (Technical Communications)] EasyVolcap: Accelerating Neural Volumetric Video Research
MiniSora: A community that aims to explore the implementation path and future development direction of Sora.
PixArt-α: Fast Training of Diffusion Transformer for Photorealistic Text-to-Image Synthesis
Open-Sora: Democratizing Efficient Video Production for All
TaiSu (太素): a large-scale Chinese multimodal dataset (a hundred-million-scale Chinese vision-language pre-training dataset)
[CVPR 2024] The official repo for "GaussianAvatar: Towards Realistic Human Avatar Modeling from a Single Video via Animatable 3D Gaussians"
[CSUR] A Survey on Video Diffusion Models
Universal LLM Deployment Engine with ML Compilation
"Effective Whole-body Pose Estimation with Two-stages Distillation" (ICCV 2023, CV4Metaverse Workshop)
Mixture-of-Experts for Large Vision-Language Models
A high-throughput and memory-efficient inference and serving engine for LLMs
LMDeploy is a toolkit for compressing, deploying, and serving LLMs.
Character Animation (AnimateAnyone, Face Reenactment)
Code and dataset for photorealistic Codec Avatars driven from audio
[EMNLP 2023 Demo] Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding
Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable…
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficie…
State-of-the-art 2D and 3D Face Analysis Project
A unified framework for 3D content generation.
A curated list of awesome projects and resources related to autonomous AI agents.
A lightweight framework for building LLM-based agents