
EchoReel: Enhancing Action Generation of Existing Video Diffusion Models


University of Electronic Science and Technology of China

An innovative method that augments existing video diffusion models so that they can:
1️⃣ utilize multiple reference videos to achieve a broader spectrum of action imitation and generate novel actions without fine-tuning;
2️⃣ distill effective and related visual motion features instead of replicating the referenced content.

"Imitation is the sincerest form of flattery that mediocrity can pay to greatness." — Oscar Wilde

✌️ Results

Side-by-side results (original VideoCrafter2 vs. VideoCrafter2 + EchoReel; result videos omitted here) for the following input texts:

- "A man is studying in the library"
- "A man is skiing"
- "A man is running"
- "Couple walking on the beach"
- "A man is carving a stone statue"

📝 Changelog

  • [2024.4.21] Released pretrained weights
  • [2024.3.18] Released training and inference code

⏳ TODO

  • Release LVDM text-to-video code with EchoReel
  • Release training code
  • Release pretrained weights
  • Release image-to-video VideoCrafter code with EchoReel

⚙️ Setup

Please prepare .json data in the following format:

[
    {
        "input_text": ...,
        "gt_video_path": ...,
        "reference_text": ...,
        "reference_video_path": ...
    },
    ...
]
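As an illustration, here is a minimal script (with hypothetical texts and paths; substitute your own data) that writes a dataset file in this schema:

```python
import json
import os

# Hypothetical example entry -- replace the texts and paths with your own data.
dataset = [
    {
        "input_text": "A man is skiing",
        "gt_video_path": "videos/skiing_gt.mp4",
        "reference_text": "A woman is skiing down a slope",
        "reference_video_path": "videos/skiing_ref.mp4",
    },
]

os.makedirs("dataset", exist_ok=True)
with open("dataset/train.json", "w") as f:
    json.dump(dataset, f, indent=4)
```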

Install Environment via Anaconda

conda create -n EchoReel python=3.10.13
conda activate EchoReel
pip install -r requirements.txt

💫 Try It Out

Download the pretrained weights from our Hugging Face repository and place them in the 'checkpoint' folder. We also strongly recommend downloading the WebVid .csv file into the 'dataset' directory, which enables automatic reference video selection.

mkdir checkpoint
cd checkpoint
wget https://huggingface.co/cscrisp/EchoReel/resolve/main/checkpoint/checkpoint.pt
cd ..
mkdir dataset
cd dataset
wget https://www.robots.ox.ac.uk/~maxbain/webvid/results_10M_train.csv
cd ..
python gr.py
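With the WebVid .csv in place, gr.py can pick reference videos for your prompt automatically. The actual selection logic in gr.py is not reproduced here; as a rough, hypothetical sketch of caption-based selection over WebVid-style (caption, video path) rows, one could rank candidates by text similarity to the prompt:

```python
import difflib

def pick_reference(prompt, candidates, k=1):
    """Rank candidate (caption, video_path) pairs by string similarity
    of the caption to the prompt and return the top-k pairs."""
    scored = sorted(
        candidates,
        key=lambda c: difflib.SequenceMatcher(None, prompt.lower(), c[0].lower()).ratio(),
        reverse=True,
    )
    return scored[:k]

# Toy stand-in for rows parsed from the WebVid csv (caption, video path).
webvid_rows = [
    ("A man skiing down a snowy mountain", "videos/001.mp4"),
    ("A cat sleeping on a sofa", "videos/002.mp4"),
    ("Couple walking along the beach at sunset", "videos/003.mp4"),
]

best = pick_reference("A man is skiing", webvid_rows)
```

In practice one would parse results_10M_train.csv and likely use a learned text encoder (e.g., CLIP embeddings) rather than character-level similarity; this sketch only illustrates the retrieval idea.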

💫 Training

# use the original LVDM pretrained weight to initialize the model
wget -O models/t2v/model.ckpt https://huggingface.co/Yingqing/LVDM/resolve/main/lvdm_short/t2v.ckpt
bash train_EchoReel.sh

💫 Sampling

bash sample_EchoReel.sh

🔮 Pipeline

😉 Citation

@article{Liu2024EchoReel,
      title={EchoReel: Enhancing Action Generation of Existing Video Diffusion Models},
      author={Jianzhi Liu and Junchen Zhu and Lianli Gao and Jingkuan Song},
      year={2024},
      eprint={2403.11535},
      archivePrefix={arXiv},
}

🤗 Acknowledgements

Our code is partially built on Latent Video Diffusion Models (LVDM). Thanks for their wonderful work!
