An implementation of Denoising Diffusion Probabilistic Models for image generation, written in PyTorch. It roughly follows the original code by Ho et al. Unlike their implementation, however, my model supports class conditioning through a bias term in the residual blocks.
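To illustrate the idea of conditioning through a bias, here is a minimal sketch of a residual block that adds a learned per-class vector to the feature maps alongside the timestep embedding. It is not the exact code in this repository: the class name `ConditionalResidualBlock`, its arguments, and the choice of GroupNorm and SiLU are assumptions made for the example.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ConditionalResidualBlock(nn.Module):
    """Hypothetical residual block: time and class embeddings act as a per-channel bias."""

    def __init__(self, in_channels, out_channels, time_emb_dim, num_classes=None, num_groups=8):
        super().__init__()
        self.norm1 = nn.GroupNorm(num_groups, in_channels)
        self.conv1 = nn.Conv2d(in_channels, out_channels, kernel_size=3, padding=1)
        # Projects the timestep embedding to one bias value per output channel.
        self.time_bias = nn.Linear(time_emb_dim, out_channels)
        # One learned bias vector per class; omitted for an unconditional model.
        self.class_bias = nn.Embedding(num_classes, out_channels) if num_classes is not None else None
        self.norm2 = nn.GroupNorm(num_groups, out_channels)
        self.conv2 = nn.Conv2d(out_channels, out_channels, kernel_size=3, padding=1)
        # 1x1 convolution on the skip path when the channel count changes.
        self.skip = nn.Conv2d(in_channels, out_channels, kernel_size=1) if in_channels != out_channels else nn.Identity()

    def forward(self, x, time_emb, y=None):
        h = self.conv1(F.silu(self.norm1(x)))
        bias = self.time_bias(time_emb)
        if self.class_bias is not None and y is not None:
            bias = bias + self.class_bias(y)
        # Broadcast the (batch, channels) bias over the spatial dimensions.
        h = h + bias[:, :, None, None]
        h = self.conv2(F.silu(self.norm2(h)))
        return h + self.skip(x)


# Usage with random inputs: 4 feature maps, 10 classes (as in MNIST or CIFAR-10).
block = ConditionalResidualBlock(16, 32, time_emb_dim=64, num_classes=10)
x = torch.randn(4, 16, 8, 8)
t_emb = torch.randn(4, 64)
labels = torch.tensor([0, 1, 2, 3])
out = block(x, t_emb, labels)
```

Passing `y=None` skips the class bias, so the same block can also run unconditionally.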
I have trained the model on the MNIST and CIFAR-10 datasets. The model seemed to converge well on MNIST, producing realistic samples. However, I have yet to reproduce the CIFAR-10 quality that Ho et al. report in their paper. Here are the samples generated with a linear schedule after 2000 epochs:
Here is a sample of a diffusion sequence on MNIST:
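For reference, a diffusion sequence like this comes from the forward process, which mixes the image with Gaussian noise according to the schedule. The sketch below builds a linear schedule and noises a stand-in image at a few timesteps. The values `beta_start=1e-4`, `beta_end=0.02`, and 1000 steps are the ones reported by Ho et al.; whether the scripts in this repository use the same defaults is an assumption, and the helper names are hypothetical.

```python
import torch


def linear_beta_schedule(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances beta_1..beta_T (values from Ho et al.)."""
    return torch.linspace(beta_start, beta_end, num_steps)


def q_sample(x0, t, alphas_cumprod, noise=None):
    """Sample x_t ~ q(x_t | x_0) = N(sqrt(a_bar_t) * x_0, (1 - a_bar_t) * I)."""
    if noise is None:
        noise = torch.randn_like(x0)
    a_bar = alphas_cumprod[t].view(-1, 1, 1, 1)
    return a_bar.sqrt() * x0 + (1.0 - a_bar).sqrt() * noise


betas = linear_beta_schedule()
# a_bar_t is the cumulative product of (1 - beta_s) for s <= t.
alphas_cumprod = torch.cumprod(1.0 - betas, dim=0)

# Random stand-in for one MNIST image scaled to [-1, 1].
x0 = torch.rand(1, 1, 28, 28) * 2.0 - 1.0

# Progressively noisier versions of x0, from almost clean to almost pure noise.
timesteps = [0, 249, 499, 749, 999]
sequence = [q_sample(x0, torch.tensor([t]), alphas_cumprod) for t in timesteps]
```

Because `alphas_cumprod` decreases monotonically towards zero, the last element of `sequence` is close to pure Gaussian noise.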
- Training: `python scripts/train_cifar.py`
- Testing: `python scripts/sample_images.py`
- Environment:
  - pytorch 1.11.0, cudatoolkit 10.2
  - torchvision 0.12.0
  - wandb
- Preparation:
  - register an account on wandb
  - log in with that account
  - record your personal private API key, e.g., `44c0a5e96a4bae0dde739559a4e0e7035890ebb3` for my account
  - open a terminal on the local machine and run `pip install wandb`, then `wandb login` (paste the API key when prompted)
  - create a new project and record the project name, e.g., `DDPM`
- Training:
  - insert the project name at line 150, e.g., `project_name="DDPM"`
  - comment out the `entity` setting if the wandb account is private
  - change the log destination at line 142, e.g., `log_dir="./ddpm_logs"`, and create the folder: `mkdir ddpm_logs`
  - run the example file: `python train_cifar.py`
  - training uses approximately 9000 MB of GPU memory, and the trained models are saved automatically in the `ddpm_logs` folder
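The settings edited in the steps above map onto the arguments of `wandb.init` roughly as sketched below. This assumes the training script calls `wandb.init` directly, which may not be how this repository wires up logging. The helpers `build_wandb_kwargs` and `start_run` are hypothetical names used only for illustration.

```python
import os


def build_wandb_kwargs(project_name, log_dir, entity=None):
    """Collect logging settings; creates log_dir so a separate mkdir is not needed."""
    os.makedirs(log_dir, exist_ok=True)
    kwargs = {"project": project_name, "dir": log_dir}
    # Leave `entity` unset for a personal (private) account.
    if entity is not None:
        kwargs["entity"] = entity
    return kwargs


def start_run(project_name="DDPM", log_dir="./ddpm_logs", entity=None):
    """Start a wandb run; requires `pip install wandb` and a prior `wandb login`."""
    import wandb  # imported lazily so the helper above works without wandb installed

    return wandb.init(**build_wandb_kwargs(project_name, log_dir, entity))
```

Calling `start_run()` with the defaults corresponds to `project_name="DDPM"` and `log_dir="./ddpm_logs"` from the steps above.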
I gave a talk about diffusion models, NCSNs, and their applications in audio generation. The slides are available here.
I also compiled a report with what are, in my opinion, the most crucial findings on the topic of denoising diffusion models. It is also available in this repository.
I used Phil Wang's implementation and the official TensorFlow repo as references for my work.
@misc{ho2020denoising,
title = {Denoising Diffusion Probabilistic Models},
author = {Jonathan Ho and Ajay Jain and Pieter Abbeel},
year = {2020},
eprint = {2006.11239},
archivePrefix = {arXiv},
primaryClass = {cs.LG}
}
@inproceedings{anonymous2021improved,
title = {Improved Denoising Diffusion Probabilistic Models},
author = {Anonymous},
booktitle = {Submitted to International Conference on Learning Representations},
year = {2021},
url = {https://openreview.net/forum?id=-NEXDKk8gZ},
note = {under review}
}