| Single-View | Ignatius | Family | Palace | Church | Barn |
|---|---|---|---|---|---|
| Text2NeRF | | | | | |
| MotionCtrl | | | | | |
| NVS-Solver (DGS) | | | | | |
| NVS-Solver (Post) | | | | | |

| Multi-View | Truck | Playground | Caterpillar | Scan55 | Scan3 |
|---|---|---|---|---|---|
| 3D-Aware | | | | | |
| MotionCtrl | | | | | |
| NVS-Solver (DGS) | | | | | |
| NVS-Solver (Post) | | | | | |

| Mono-Vid | Train | Bus | Kangaroo | Train | Deer | Street |
|---|---|---|---|---|---|---|
| 4DGS | | | | | | |
| MotionCtrl | | | | | | |
| NVS-Solver (DGS) | | | | | | |
| NVS-Solver (Post) | | | | | | |
- Release NVS-Solver for SVD;
- Release NVS-Solver for arbitrary trajectories;
- Support T2V diffusion models; we are testing some effective video diffusion models, e.g., CogVideo, LaVie, and t2v-turbo;
- Acceleration with ODE solvers, e.g., DPM-Solver;
- ....
- Linux
- Anaconda 3
- Python 3.9
- CUDA 12.0
- RTX A6000
To get started, please create a virtual environment:
```bash
python -m venv .env
source .env/bin/activate
```
Please install PyTorch:
```bash
pip install torch==2.2.1 torchvision==0.17.1 torchaudio==2.2.1 --index-url https://download.pytorch.org/whl/cu121
```
We use PyTorch 2.2.1 with CUDA 12.0; please install the version corresponding to your CUDA installation.
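As a quick sanity check, you can verify that PyTorch sees your GPU:
```python
import torch

# Expect something like "2.2.1+cu121" and True on a correctly configured machine.
print(torch.__version__)
print(torch.cuda.is_available())
```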
Please install diffusers:
```bash
pip install diffusers["torch"] transformers
pip install accelerate
pip install -e ".[torch]"
```
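To confirm that diffusers can load Stable Video Diffusion, which NVS-Solver builds on, here is a minimal check; the checkpoint name is the public SVD-XT release and may differ from the one the scripts actually load:
```python
import torch
from diffusers import StableVideoDiffusionPipeline

# Downloads several GB of weights on first run; assumes the public SVD-XT checkpoint.
pipe = StableVideoDiffusionPipeline.from_pretrained(
    "stabilityai/stable-video-diffusion-img2vid-xt",
    torch_dtype=torch.float16,
    variant="fp16",
).to("cuda")
print("Loaded:", type(pipe).__name__)
```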
Please install the required packages:
```bash
pip install -r requirements.txt
```
Run:
```bash
bash demo.sh
```
We provide the prompt images used in our experiments for both single images and dynamic videos.
First, please prepare Depth Anything V2. Our pipeline reads depth as `.npy` files, so please edit its `run.py` to save the predicted depth maps as `.npy`:
- add this after line 57:
```python
depth_np = depth
```
- add this after line 73:
```python
np.save(os.path.join(args.outdir, os.path.splitext(os.path.basename(filename))[0] + '.npy'), depth_np)
```
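Alternatively, if you prefer not to patch `run.py`, the same `.npy` can be produced with a short standalone script. This is a sketch assuming the `DepthAnythingV2` class and `infer_image` API from the Depth Anything V2 repository, with the ViT-L checkpoint downloaded to `checkpoints/`:
```python
import cv2
import numpy as np
import torch
from depth_anything_v2.dpt import DepthAnythingV2

# ViT-L configuration as given in the Depth Anything V2 repository.
model = DepthAnythingV2(encoder='vitl', features=256,
                        out_channels=[256, 512, 1024, 1024])
model.load_state_dict(torch.load('checkpoints/depth_anything_v2_vitl.pth',
                                 map_location='cpu'))
model = model.to('cuda').eval()

raw_img = cv2.imread('/path_to_img/001.jpg')
depth = model.infer_image(raw_img)  # HxW numpy array of relative depth
np.save('/path_to_img/depth/001.npy', depth)
```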
Put your image in a directory, e.g., `/path_to_img`:
```bash
DIR_PATH=/path_to_img
IMG_PATH=/path_to_img/001.jpg
DEPTH_PATH="${DIR_PATH}/depth"
mkdir -p "$DEPTH_PATH"
```
Predict depth for your image:
```bash
cd /your_path_to_Depth-Anything-V2
python run.py --encoder vitl --img-path "$IMG_PATH" --outdir "$DEPTH_PATH"
cd /your_path_to_NVS_Solver
```
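Before running NVS-Solver, you can sanity-check the saved depth map; the path follows the example above:
```python
import numpy as np

# The file name mirrors the input image name, per the run.py edit above.
depth = np.load('/path_to_img/depth/001.npy')
print(depth.shape, depth.dtype, depth.min(), depth.max())
```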
Then run NVS-Solver:
```bash
python svd_interpolate_single_img_traj.py --image_path "$IMG_PATH" --folder_path "$DIR_PATH" --iteration any_trajectory --radius 40 --end_position 30 2 -10 --lr 0.02 --weight_clamp 0.2
```
- `--radius` is the distance from the original camera to the center of the image, i.e., the depth of the center pixel; it may need to be changed to accommodate different images. The original camera position is set to `[radius, 0, 0]`.
- `--end_position` is where the camera trajectory ends, as you like. It takes three values: the camera position in X, Y, and Z. The trajectory is generated between the original camera position and the end position, and the camera always faces the center object of the given image.
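For intuition, the look-at trajectory described above can be sketched as follows. This is a simplified illustration under the stated convention (start at `[radius, 0, 0]`, linear motion to `--end_position`, camera always aimed at the scene center), not the script's actual implementation:
```python
import numpy as np

def lookat_trajectory(radius, end_position, num_frames=25):
    """Linearly interpolate camera centers and aim each camera at the origin."""
    start = np.array([radius, 0.0, 0.0])
    end = np.asarray(end_position, dtype=np.float64)
    poses = []
    for t in np.linspace(0.0, 1.0, num_frames):
        center = (1 - t) * start + t * end
        forward = -center / np.linalg.norm(center)       # look toward the origin
        right = np.cross(forward, np.array([0.0, 1.0, 0.0]))
        right /= np.linalg.norm(right)
        up = np.cross(right, forward)
        # World-to-camera rotation stacked with translation: a 3x4 pose matrix.
        R = np.stack([right, up, forward])
        poses.append(np.hstack([R, (-R @ center)[:, None]]))
    return poses

poses = lookat_trajectory(radius=40, end_position=[30, 2, -10])
print(len(poses), poses[0].shape)  # 25 (3, 4)
```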
Thanks to the following wonderful works: Diffusers, Depth Anything, and Depth Anything V2.
If you find this project interesting, please cite:
```bibtex
@article{you2024nvs,
  title={NVS-Solver: Video Diffusion Model as Zero-Shot Novel View Synthesizer},
  author={You, Meng and Zhu, Zhiyu and Liu, Hui and Hou, Junhui},
  journal={arXiv preprint arXiv:2405.15364},
  year={2024}
}
```