FreeControl: Training-Free Spatial Control of Any Text-to-Image Diffusion Model with Any Condition

Mo, Sicheng; Mu, Fangzhou; Lin, Kuan Heng; Liu, Yanli; Guan, Bochen; Li, Yin; Zhou, Bolei

arXiv:2312.07536 (cs)

[Submitted on 12 Dec 2023]

Abstract: Recent approaches such as ControlNet offer users fine-grained spatial control over text-to-image (T2I) diffusion models. However, auxiliary modules have to be trained for each type of spatial condition, model architecture, and checkpoint, putting them at odds with the diverse intents and preferences a human designer would like to convey to the AI models during the content creation process. In this work, we present FreeControl, a training-free approach for controllable T2I generation that supports multiple conditions, architectures, and checkpoints simultaneously. FreeControl uses structure guidance to align the structure of the generated image with a guidance image, and appearance guidance to share appearance between images generated using the same seed. Extensive qualitative and quantitative experiments demonstrate the superior performance of FreeControl across a variety of pre-trained T2I models. In particular, FreeControl provides convenient training-free control over many different architectures and checkpoints, handles challenging input conditions on which most existing training-free methods fail, and achieves synthesis quality competitive with training-based approaches.

Comments:	Project Page: this https URL
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2312.07536 [cs.CV]
	(or arXiv:2312.07536v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2312.07536

Submission history

From: Sicheng Mo
[v1] Tue, 12 Dec 2023 18:59:14 UTC (29,503 KB)
