Video Diffusion Alignment via Reward Gradients. We improve a variety of video diffusion models such as VideoCrafter, OpenSora, ModelScope and StableVideoDiffusion by finetuning them using various r…

Python 154 10 Updated Jul 22, 2024

thuanz123 / enhancing-transformers

An unofficial implementation of both ViT-VQGAN and RQ-VAE in PyTorch

Python 275 35 Updated May 23, 2023

facebookresearch / ego4d-goalstep

Ego4D Goal-Step: Toward Hierarchical Understanding of Procedural Activities (NeurIPS 2023)

Python 31 Updated Apr 15, 2024

open-mmlab / FoleyCrafter

FoleyCrafter: Bring Silent Videos to Life with Lifelike and Synchronized Sounds. AI Foley master: add vivid, synchronized sound effects to your silent videos 😝

Python 323 20 Updated Jul 26, 2024

snap-research / VIMI

HTML 13 Updated Jul 10, 2024

KwaiVGI / LivePortrait

Bring portraits to life!

Python 8,897 845 Updated Jul 30, 2024

ehristoforu / FluentlyDiffusion

Modern Stable Diffusion models family - Fluently

Python 24 2 Updated Jun 6, 2024

EvolvingLMMs-Lab / LongVA

Long Context Transfer from Language to Vision

Python 256 13 Updated Jul 28, 2024

bytedance / 1d-tokenizer

This repo contains the code for our paper An Image is Worth 32 Tokens for Reconstruction and Generation

Jupyter Notebook 300 8 Updated Jul 3, 2024

PhoenixZ810 / MG-LLaVA

Official repository for the paper MG-LLaVA: Towards Multi-Granularity Visual Instruction Tuning (https://arxiv.org/abs/2406.17770).

Python 124 2 Updated Jul 19, 2024

modelscope / DiffSynth-Studio

Enjoy the magic of Diffusion models!

Python 6,030 538 Updated Jul 30, 2024

apple / ml-4m

4M: Massively Multimodal Masked Modeling

Python 1,454 84 Updated Jul 17, 2024

cambrian-mllm / cambrian

Cambrian-1 is a family of multimodal LLMs with a vision-centric design.

Python 1,612 99 Updated Jul 26, 2024

zh460045050 / VQGAN-LC

Python 75 6 Updated Jun 28, 2024

eric-ai-lab / via-video

22 Updated Jun 20, 2024

DepthAnything / Depth-Anything-V2

Depth Anything V2. A More Capable Foundation Model for Monocular Depth Estimation

Python 2,742 203 Updated Jul 29, 2024

teacherpeterpan / Logic-LLM

The project page for "LOGIC-LM: Empowering Large Language Models with Symbolic Solvers for Faithful Logical Reasoning"

C 215 34 Updated Jun 13, 2024

ali-vilab / MimicBrush

Official implementations for paper: Zero-shot Image Editing with Reference Imitation

Python 966 70 Updated Jun 15, 2024

Tsu-Jui Fu tsujuifu

Stars

Xiaojiu-z / Stable-Hair

EmergenceAI / Agent-E

eric-ai-lab / Screen-Point-and-Read

weixi-feng / TC-Bench

NUS-HPC-AI-Lab / OpenDiT

facebookresearch / vggsfm

stanford-oval / storm

Leezekun / MMSci

ChenWu98 / cycle-diffusion

lsl001006 / ZONE

muzishen / IMAGDressing

chuanyangjin / fast-DiT

mihirp1998 / VADER