Skip to content

Official code for AnyMaker: Zero-shot General Object Customization via Decoupled Dual-Level ID Injection

Notifications You must be signed in to change notification settings

LingjieKong-fdu/AnyMaker

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

40 Commits
 
 
 
 

Repository files navigation

AnyMaker: Zero-shot General Object Customization via Decoupled Dual-Level ID Injection

[Paper]   [Project Page]  

Core Properties:

  1. General field: Applicable to general objects, including but not limited to: human face, clothes, tools, animal, ...
  2. Fast generation: Achieve object customization within seconds, without any further fine-tuning.
  3. User friendly: Just one single reference image and a text prompt can meet the customization requirement.
  4. Outstanding results: Ensure high ID fidelity, flexible text editability and high quality.
  5. Various applications: Diverse applications such as general object customization, virtual try-on, ID-mixing, ...

Method Framework:

AnyMaker consists of three crucial ID processing modules: General ID Extraction Module, Dual-Level ID Injection Module, ID-Aware Decoupling Module.


Comparisons with Previous Works:

The AnyMaker exhibits outstanding capabilities of high-quality customization for general objects, and even beat task-specialized methods in the specific domains, such as human customization and virtual try-on, in terms of ID fidelity and text editability.


General ID Dataset:

To promot the research of the general object customization, we construct the first large-scale general ID dataset, named as Multi-Category ID-Consistent (MC-IDC) dataset. Our dataset consists of approximately 315,000 samples in total with more than 10,000 categories, covering various types such as human faces, animals, clothes, human-made tools, etc. Each sample consists of a reference image, a segmentation mask of the object of interest in the reference image, a target image, and a text caption of the target image. The reference image with its segmentation mask provides ID information, the text caption of the target image offers semantic-level guidance for generation, and the target image serves as the ground truth.

Updates

Codes and dataset will be released soon!

BibTeX

If you find AnyMaker useful for your research and applications, please cite using this BibTeX:

@article{kong2024anymaker,
  title={AnyMaker: Zero-shot General Object Customization via Decoupled Dual-Level ID Injection},
  author={Kong, Lingjie and Wu, Kai and Hu, Xiaobin and Han, Wenhui and Peng, Jinlong and Xu, Chengming and Luo, Donghao and Zhang, Jiangning and Wang, Chengjie and Fu, Yanwei},
  journal={arXiv preprint arXiv:2406.11643},
  year={2024}
}

About

Official code for AnyMaker: Zero-shot General Object Customization via Decoupled Dual-Level ID Injection

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published