GenPercept: Diffusion Models Trained with Large Data Are Transferable Visual Models

Guangkai Xu, Yongtao Ge, Mingyu Liu, Chengxiang Fan, Kangyang Xie, Zhiyue Zhao, Hao Chen, Chunhua Shen

Zhejiang University

HuggingFace | arXiv

🔥 Fine-tune diffusion models for perception tasks, and run inference in only one step! ✈️

Dependencies

conda create -n genpercept python=3.10
conda activate genpercept
pip install -r requirements.txt
pip install -e .

Inference

Download the pre-trained depth model depth_v1.zip from BaiduNetDisk (extraction code: z938) or Rec Cloud Disk. Put the archive under ./weights/ and unzip it; the checkpoint will then be stored under ./weights/depth_v1/.
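The unpacking step can be sketched as the shell snippet below. It assumes depth_v1.zip was downloaded into the repository root and that the archive extracts into a top-level depth_v1/ folder, as described above. The variable names are illustrative only and are not part of the repository.

```shell
# Assumption: depth_v1.zip sits in the repository root after download.
ARCHIVE="depth_v1.zip"
WEIGHTS_DIR="./weights"

# Create ./weights/ if it does not exist yet.
mkdir -p "$WEIGHTS_DIR"

if [ -f "$ARCHIVE" ]; then
    # Move the archive under ./weights/ and extract it there.
    mv "$ARCHIVE" "$WEIGHTS_DIR/"
    unzip -o "$WEIGHTS_DIR/$ARCHIVE" -d "$WEIGHTS_DIR"
fi

# The checkpoint is expected under ./weights/depth_v1/ after extraction.
if [ -d "$WEIGHTS_DIR/depth_v1" ]; then
    echo "checkpoint directory found: $WEIGHTS_DIR/depth_v1"
else
    echo "checkpoint directory missing: download depth_v1.zip first" >&2
fi
```

If the second message is printed, the archive was either not downloaded or does not unpack into depth_v1/, and the inference script will likely not find the weights.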

Then, place images in the ./input/ directory and run the following script. The output depth maps will be saved in ./output/.

source scripts/inference_depth.sh

Thanks to our one-step perception paradigm, inference is fast: around 0.4 s per image on an A800 GPU.
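Before launching inference, it can help to confirm that ./input/ actually contains images. The snippet below is a minimal pre-flight sketch, not part of the repository: the list of image extensions is an assumption, and the inference script is only sourced when both images and the script are present.

```shell
# No-op if ./input/ already exists (it ships with the repository).
mkdir -p ./input

# Count candidate images; the extension list here is an assumption.
N_IMAGES=$(find ./input -maxdepth 1 -type f \
    \( -iname '*.jpg' -o -iname '*.jpeg' -o -iname '*.png' \) | wc -l | tr -d ' ')
echo "found $N_IMAGES image(s) in ./input/"

# Run inference only when there is something to process.
if [ "$N_IMAGES" -gt 0 ] && [ -f scripts/inference_depth.sh ]; then
    source scripts/inference_depth.sh
    # Depth predictions are expected under ./output/.
    ls ./output/
fi
```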

Recommended Works

Marigold: Repurposing Diffusion-Based Image Generators for Monocular Depth Estimation. arXiv, GitHub.
GeoWizard: Unleashing the Diffusion Priors for 3D Geometry Estimation from a Single Image. arXiv, GitHub.
FrozenRecon: Pose-free 3D Scene Reconstruction with Frozen Depth Models. arXiv, GitHub.

Results in Paper

Depth and Surface Normal

Dichotomous Image Segmentation

Image Matting

Human Pose Estimation

🎫 License

For non-commercial use, this code is released under the LICENSE. For commercial use, please contact Chunhua Shen.

🖊️ Citation

@article{xu2024diffusion,
  title={Diffusion Models Trained with Large Data Are Transferable Visual Models},
  author={Xu, Guangkai and Ge, Yongtao and Liu, Mingyu and Fan, Chengxiang and Xie, Kangyang and Zhao, Zhiyue and Chen, Hao and Shen, Chunhua},
  journal={arXiv preprint arXiv:2403.06090},
  year={2024}
}
