This repository has been archived by the owner on Jan 23, 2024. It is now read-only.

google-research/maskgit

MaskGIT: Masked Generative Image Transformer

Official JAX Implementation of the CVPR 2022 Paper

[Paper] [Project Page] [Demo Colab]

Summary

MaskGIT is a novel image synthesis paradigm built on a bidirectional transformer decoder. During training, MaskGIT learns to predict randomly masked tokens by attending to tokens in all directions. At inference time, the model first generates all tokens of an image simultaneously, then refines the image iteratively, conditioning each step on the previous generation.
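
The inference loop described above can be sketched roughly as follows. This is a framework-free Python illustration, not the repository's JAX code: `toy_predict` is a hypothetical stand-in for the bidirectional transformer (it returns random tokens and confidences), `MASK`, `cosine_schedule`, and `iterative_decode` are names invented for this example, and the cosine mask schedule is assumed from the paper rather than stated in this README.

```python
import math
import random

MASK = -1  # hypothetical sentinel marking a masked token position


def cosine_schedule(step, total_steps, num_tokens):
    """Number of tokens left masked after `step` (1-based) of `total_steps`."""
    ratio = step / total_steps
    return max(0, math.floor(num_tokens * math.cos(math.pi / 2 * ratio)))


def toy_predict(tokens, vocab_size, rng):
    """Stand-in for the transformer: one (token, confidence) pair per position.

    A real model would condition on the unmasked entries of `tokens`;
    this toy version ignores them and samples at random.
    """
    return [(rng.randrange(vocab_size), rng.random()) for _ in tokens]


def iterative_decode(num_tokens=16, vocab_size=8, total_steps=4, seed=0):
    rng = random.Random(seed)
    tokens = [MASK] * num_tokens  # start from a fully masked token grid
    for step in range(1, total_steps + 1):
        # Predict every position at once, conditioned on the current tokens.
        preds = toy_predict(tokens, vocab_size, rng)
        masked = [i for i, t in enumerate(tokens) if t == MASK]
        # Keep the least confident predictions masked for the next round.
        num_to_remask = cosine_schedule(step, total_steps, num_tokens)
        masked.sort(key=lambda i: preds[i][1])
        remask = set(masked[:num_to_remask])
        for i in masked:
            if i not in remask:
                tokens[i] = preds[i][0]  # commit the confident predictions
    return tokens


if __name__ == "__main__":
    print(iterative_decode())
```

With 16 tokens and 4 steps, this schedule leaves 14, 11, 6, and then 0 positions masked, so only a few tokens are committed early and the rest are filled in as more context becomes available.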

Running pretrained models

Class-conditional image generation models:

Dataset   Resolution  Model                Link        FID
ImageNet  256 x 256   Tokenizer            checkpoint  2.28 (reconstruction)
ImageNet  512 x 512   Tokenizer            checkpoint  1.97 (reconstruction)
ImageNet  256 x 256   MaskGIT Transformer  checkpoint  6.06 (generation)
ImageNet  512 x 512   MaskGIT Transformer  checkpoint  7.32 (generation)

You can run these models for class-conditional image generation and editing in the demo Colab.

Training

[Coming Soon]

BibTeX

@InProceedings{chang2022maskgit,
  title = {MaskGIT: Masked Generative Image Transformer},
  author = {Huiwen Chang and Han Zhang and Lu Jiang and Ce Liu and William T. Freeman},
  booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  month = {June},
  year = {2022}
}

Disclaimer

This is not an officially supported Google product.