FreeDoM

The official implementation of the paper:

"FreeDoM: Training-Free Energy-Guided Conditional Diffusion Model"

By Jiwen Yu, Yinhuai Wang, Chen Zhao, Bernard Ghanem, Jian Zhang

FreeDoM is a simple yet effective training-free method that uses unconditional diffusion models to generate results controlled by a variety of conditions. Specifically, we use off-the-shelf pre-trained networks to construct a time-independent energy function that measures the distance between the given condition and the intermediate generated image. We then compute the gradient of this energy and use it to guide the generation process. FreeDoM supports various conditions, including text, segmentation maps, sketches, landmarks, face IDs, and style images, and it applies to different data domains, including human faces, ImageNet images, and latent codes.
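The guidance loop described above can be illustrated with a toy sketch. This is not the released implementation: it assumes 2-D standard-normal "data" (so the noise predictor has a closed form), a squared-distance energy in place of a pre-trained network, finite differences in place of automatic differentiation, and a constant step size `rho`. All function names, the beta schedule, and the value of `rho` are hypothetical choices made for illustration.

```python
import math
import random


def make_schedule(num_steps=50, beta_min=1e-4, beta_max=0.2):
    """Linear beta schedule and its cumulative products abar_t (toy values)."""
    betas = [beta_min + (beta_max - beta_min) * i / (num_steps - 1)
             for i in range(num_steps)]
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return betas, alpha_bars


def toy_eps_model(x_t, alpha_bar):
    """Stand-in for a pre-trained noise predictor; exact when data ~ N(0, I)."""
    return [math.sqrt(1.0 - alpha_bar) * v for v in x_t]


def predict_x0(x_t, alpha_bar):
    """Clean estimate x_{0|t} = (x_t - sqrt(1 - abar_t) * eps) / sqrt(abar_t)."""
    eps = toy_eps_model(x_t, alpha_bar)
    scale = math.sqrt(1.0 - alpha_bar)
    return [(v - scale * e) / math.sqrt(alpha_bar) for v, e in zip(x_t, eps)]


def energy(condition, x0):
    """Toy time-independent energy: squared distance to the condition."""
    return sum((a - c) ** 2 for a, c in zip(x0, condition))


def energy_grad(condition, x_t, alpha_bar, h=1e-4):
    """Gradient of E(c, x_{0|t}) w.r.t. x_t by central finite differences."""
    grad = []
    for i in range(len(x_t)):
        up, down = list(x_t), list(x_t)
        up[i] += h
        down[i] -= h
        e_up = energy(condition, predict_x0(up, alpha_bar))
        e_down = energy(condition, predict_x0(down, alpha_bar))
        grad.append((e_up - e_down) / (2.0 * h))
    return grad


def sample(condition, rho, seed, num_steps=50):
    """DDPM-style ancestral sampling with an energy-gradient correction.

    rho = 0.0 gives plain unconditional sampling.
    """
    rng = random.Random(seed)
    betas, alpha_bars = make_schedule(num_steps)
    x = [rng.gauss(0.0, 1.0) for _ in condition]
    for t in reversed(range(num_steps)):
        beta, abar = betas[t], alpha_bars[t]
        eps = toy_eps_model(x, abar)
        grad = energy_grad(condition, x, abar)
        # Unconditional posterior mean of x_{t-1} given x_t.
        mean = [(v - beta / math.sqrt(1.0 - abar) * e) / math.sqrt(1.0 - beta)
                for v, e in zip(x, eps)]
        # No noise is added at the final step (t == 0).
        sigma = math.sqrt(beta) if t > 0 else 0.0
        x = [m + sigma * rng.gauss(0.0, 1.0) - rho * g
             for m, g in zip(mean, grad)]
    return x
```

Calling `sample([2.0, -1.0], rho=0.1, seed=0)` should return a point pulled toward the condition, while `rho=0.0` with the same seed returns an unguided draw. In a real setting the energy would come from a pre-trained network (for example, a feature distance), and the gradient would be obtained by backpropagating through both that network and the diffusion model.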

This paper is under review. We will release the code and supplementary materials, including implementation details, once the review process is complete. In the meantime, you can check the results generated by FreeDoM below.

Overall Experimental Configurations

Model Source	Data Domain	Resolution	Original Conditions	Additional Training-free Conditions	Sampling Time* (s/image)
SDEdit	aligned human faces	$256\times256$	None	parsing maps, sketches, landmarks, face IDs, texts	≈20
guided-diffusion	ImageNet	$256\times256$	None	texts, style images	≈140
guided-diffusion	ImageNet	$256\times256$	class label	style images	≈50
Stable Diffusion	general images	$512\times512$ (standard)	texts	style images	≈84
ControlNet	general images	$512\times512$ (standard)	human poses, scribbles, texts	face IDs, style images	≈120

*Sampling time was measured on a GeForce RTX 3090 GPU.

Results

Training-free style guidance + Stable Diffusion

Training-free style guidance + Scribble ControlNet

Training-free face ID guidance + Human-pose ControlNet

Training-free text guidance on human faces

Training-free segmentation guidance on human faces

Training-free sketch guidance on human faces

Training-free landmark guidance on human faces

Training-free face ID guidance on human faces

Training-free face ID guidance + landmark guidance on human faces

Training-free text guidance + segmentation guidance on human faces

Training-free style transfer guidance + Stable Diffusion

Training-free text-guided face editing

Acknowledgments

Our work stands on the shoulders of giants. We want to thank the following projects, which our code builds on:

open-source pre-trained diffusion models:
- (human face models) https://github.com/ermongroup/SDEdit
- (ImageNet models) https://github.com/openai/guided-diffusion
- (Stable Diffusion) https://github.com/CompVis/stable-diffusion
- (ControlNet) https://github.com/lllyasviel/ControlNet
pre-trained networks for constructing the training-free energy functions:
- (texts, style images) https://github.com/openai/CLIP
- (face parsing maps) https://github.com/zllrunning/face-parsing.PyTorch
- (sketches) https://github.com/Mukosame/Anime2Sketch
- (face landmarks) https://github.com/cunjian/pytorch_face_landmark
- (face IDs) ArcFace (https://arxiv.org/abs/1801.07698)
time-travel strategy for better sampling:
- (DDNM) https://github.com/wyhuai/DDNM
- (RePaint) https://github.com/andreas128/RePaint

We also want to highlight some recent works that share a similar idea of updating the clean intermediate results $\mathbf{x}_{0|t}$:

concurrent conditional image generation methods:
- https://github.com/arpitbansal297/Universal-Guided-Diffusion
- https://github.com/pix2pixzero/pix2pix-zero
zero-shot image restoration methods:
- (DDNM) https://github.com/wyhuai/DDNM
- (DDRM) https://github.com/bahjat-kawar/ddrm
- (RePaint) https://github.com/andreas128/RePaint
- (DPS) https://github.com/DPS2022/diffusion-posterior-sampling

Citation

If this work is helpful for your research, please consider citing it with the following BibTeX entry.

@article{yu2023freedom,
title={FreeDoM: Training-Free Energy-Guided Conditional Diffusion Model},
author={Yu, Jiwen and Wang, Yinhuai and Zhao, Chen and Ghanem, Bernard and Zhang, Jian},
journal={arXiv:2303.09833},
year={2023}
}
