Xiaowei Chi litwellchi

PhD student at HKUST

8 followers · 3 following

Lists (1)

🚀 My stack

Stars

UMass-Foundation-Model / 3D-VLA

[ICML 2024] 3D-VLA: A 3D Vision-Language-Action Generative World Model

Python 332 13 Updated Oct 7, 2024

lucidrains / phenaki-pytorch

Implementation of Phenaki Video, which uses MaskGIT to produce text-guided videos of up to 2 minutes in length, in PyTorch

Python 748 78 Updated Jul 29, 2024

H-Freax / Awesome-Video-Robotic-Papers

This repository compiles a list of papers related to the application of video technology in the field of robotics! Star⭐ the repo and follow me if you like what you see🤩.

117 6 Updated Aug 12, 2024

litwellchi / videocrafter-training-pytorch

Training code for VideoCrafter.

Python 4 Updated May 27, 2024

RoyZry98 / MMTrail-Pytorch

Forked from litwellchi/MMTrail

[Arxiv 2024] Official code for MMTrail: A Multimodal Trailer Video Dataset with Language and Music Descriptions

4 Updated Aug 7, 2024

opendilab / awesome-diffusion-model-in-rl

A curated list of Diffusion Model in RL resources (continually updated)

786 42 Updated Oct 10, 2024

1x-technologies / 1xgpt

world modeling challenge for humanoid robots

Python 329 21 Updated Aug 23, 2024

Pointcept / Pointcept

Pointcept: a codebase for point cloud perception research. Latest works: PTv3 (CVPR'24 Oral), PPT (CVPR'24), OA-CNNs (CVPR'24), MSC (CVPR'23)

Python 1,565 169 Updated Sep 7, 2024

hpcaitech / Open-Sora

Open-Sora: Democratizing Efficient Video Production for All

Python 21,913 2,134 Updated Aug 9, 2024

haoranD / Awesome-Embodied-AI

A curated list of awesome papers on Embodied AI and related research/industry-driven resources.

272 9 Updated Jul 26, 2024

mihirp1998 / AlignProp

AlignProp uses direct reward backpropagation for the alignment of large-scale text-to-image diffusion models. Our method is 25x more sample and compute efficient than reinforcement learning methods…

Python 234 8 Updated Oct 7, 2024

litwellchi / MMTrail

[Arxiv 2024] Official code for MMTrail: A Multimodal Trailer Video Dataset with Language and Music Descriptions

22 1 Updated Sep 2, 2024

bytedance / 1d-tokenizer

This repo contains the code for our paper An Image is Worth 32 Tokens for Reconstruction and Generation

Jupyter Notebook 429 16 Updated Oct 16, 2024

eric-ai-lab / MMWorld

Official repo of the paper "MMWorld: Towards Multi-discipline Multi-faceted World Model Evaluation in Videos"

Python 20 2 Updated Sep 21, 2024

mees / calvin

CALVIN - A benchmark for Language-Conditioned Policy Learning for Long-Horizon Robot Manipulation Tasks

Python 380 56 Updated Sep 1, 2024

mlfoundations / open_flamingo

An open-source framework for training large multimodal models.

Python 3,703 281 Updated Aug 31, 2024

maitrix-org / Pandora

Pandora: Towards General World Model with Natural Language Actions and Video States

Python 471 33 Updated Sep 23, 2024

pengHTYX / Era3D

Python 522 24 Updated Oct 16, 2024

xiaobai1217 / Awesome-Video-Datasets

Video datasets

1,165 92 Updated Mar 8, 2023

litwellchi / lvm_datapipe

Data pipeline code for a large video generation model

Python 7 Updated Sep 2, 2024

yzxing87 / Seeing-and-Hearing

[CVPR 2024] Seeing and Hearing: Open-domain Visual-Audio Generation with Diffusion Latent Aligners

Python 123 6 Updated Jul 6, 2024

pixeli99 / SVD_Xtend

Stable Video Diffusion Training Code and Extensions.

Python 583 57 Updated Jul 25, 2024

Vchitect / Latte

Latte: Latent Diffusion Transformer for Video Generation.

Python 1,669 177 Updated Sep 28, 2024

jy0205 / LaVIT

LaVIT: Empower the Large Language Model to Understand and Generate Visual Content

Jupyter Notebook 513 28 Updated Oct 6, 2024

victorsungo / MMDialog

The official site of paper MMDialog: A Large-scale Multi-turn Dialogue Dataset Towards Multi-modal Open-domain Conversation

Python 190 7 Updated Sep 3, 2023

google-research / magvit

Official JAX implementation of MAGVIT: Masked Generative Video Transformer

Python 947 42 Updated Jan 17, 2024

lucidrains / magvit2-pytorch

Implementation of MagViT2 Tokenizer in Pytorch

Python 554 34 Updated Oct 14, 2024

guoyww / AnimateDiff

Official implementation of AnimateDiff.

Python 10,432 859 Updated Jul 31, 2024

litwellchi / M2Chat

Python 32 1 Updated Jul 5, 2024

YingqingHe / Awesome-LLMs-meet-Multimodal-Generation

🔥🔥🔥 A curated list of papers on LLMs-based multimodal generation (image, video, 3D and audio).

HTML 326 17 Updated Oct 17, 2024