jiangsutx

MiniCPM-V 2.6: A GPT-4V Level MLLM for Single Image, Multi Image and Video on Your Phone

Python 12,536 882 Updated Oct 22, 2024

CoTracker is a model for tracking any point (pixel) on a video.

Jupyter Notebook 3,790 249 Updated Oct 23, 2024

DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model

3,578 150 Updated Sep 25, 2024

Lumina-T2X is a unified framework for Text to Any Modality Generation

Python 2,069 87 Updated Aug 6, 2024

[SIGGRAPH Asia 2023 (Technical Communications)] EasyVolcap: Accelerating Neural Volumetric Video Research

Python 628 44 Updated Sep 27, 2024

Unified Multi-modal IAA Baseline and Benchmark

Python 70 5 Updated Sep 27, 2024

MiniSora: A community aims to explore the implementation path and future development direction of Sora.

Python 1,215 151 Updated Oct 8, 2024

PixArt-α: Fast Training of Diffusion Transformer for Photorealistic Text-to-Image Synthesis

Python 2,791 179 Updated Oct 31, 2024

Open-Sora: Democratizing Efficient Video Production for All

Python 22,190 2,169 Updated Aug 9, 2024

TaiSu (太素): a large-scale Chinese multimodal dataset (a hundred-million-scale Chinese vision-language pre-training dataset)

Python 175 12 Updated Nov 17, 2023

[CVPR 2024] The official repo for "GaussianAvatar: Towards Realistic Human Avatar Modeling from a Single Video via Animatable 3D Gaussians"

Python 433 32 Updated Mar 26, 2024

[CSUR] A Survey on Video Diffusion Models

1,797 90 Updated Nov 8, 2024

Universal LLM Deployment Engine with ML Compilation

Python 19,155 1,573 Updated Nov 11, 2024

"Effective Whole-body Pose Estimation with Two-stages Distillation" (ICCV 2023, CV4Metaverse Workshop)

Python 2,238 143 Updated Dec 12, 2023

Mixture-of-Experts for Large Vision-Language Models

Python 1,977 125 Updated May 15, 2024

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 29,925 4,519 Updated Nov 11, 2024

LMDeploy is a toolkit for compressing, deploying, and serving LLMs.

Python 4,606 421 Updated Nov 11, 2024

Character Animation (AnimateAnyone, Face Reenactment)

Python 3,168 248 Updated May 31, 2024

Code and dataset for photorealistic Codec Avatars driven from audio

Python 2,707 254 Updated Sep 15, 2024

[EMNLP 2023 Demo] Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding

Python 2,794 257 Updated Jun 4, 2024

LLM Frontend for Power Users.

JavaScript 8,217 2,417 Updated Nov 8, 2024

Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable…

Python 20,935 2,139 Updated Jul 18, 2024

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficie…

C++ 8,617 978 Updated Nov 6, 2024

State-of-the-art 2D and 3D Face Analysis Project

Python 23,385 5,412 Updated Nov 10, 2024

Consistency Distilled Diff VAE

Python 2,135 75 Updated Nov 7, 2023

A unified framework for 3D content generation.

Python 6,305 478 Updated Oct 21, 2024

A curated list of awesome projects and resources related to autonomous AI agents.

273 18 Updated Dec 29, 2023

A lightweight framework for building LLM-based agents

Python 1,851 195 Updated Nov 6, 2024