Denoising Diffusion Probabilistic Models
In this post, we review DDPM.
Generative deep learning models such as VAEs and GANs have shown brilliant performance. "Denoising Diffusion Probabilistic Models" (DDPM) [1] is a novel generative model published in 2020. It builds on "Deep Unsupervised Learning using Nonequilibrium Thermodynamics" (2015).
1. What are diffusion probabilistic models?
Figure 1: The directed graphical model considered in this work.
Diffusion probabilistic models (briefly, diffusion models) are latent variable models that handle latents of the same dimension as the original data. They are parameterized Markov chains trained with variational inference to produce samples matching the given data after finite time. Transitions of this chain (the probabilities of moving from one state to the next) are learned to reverse a diffusion process, which is a Markov chain that gradually adds noise to the data, in the direction opposite to sampling, until the signal is destroyed. When the diffusion consists of small amounts of Gaussian noise, it is sufficient to set the sampling chain transitions to conditional Gaussians as well.
1.1. Diffusion and its reverse process
During the diffusion (forward) process, Gaussian noise is gradually added to the data according to a fixed variance schedule $\beta_1, \dots, \beta_T$:

$$q(\mathbf{x}_{1:T} \mid \mathbf{x}_0) := \prod_{t=1}^{T} q(\mathbf{x}_t \mid \mathbf{x}_{t-1}), \qquad q(\mathbf{x}_t \mid \mathbf{x}_{t-1}) := \mathcal{N}\!\left(\mathbf{x}_t;\, \sqrt{1-\beta_t}\,\mathbf{x}_{t-1},\, \beta_t \mathbf{I}\right).$$

An impressive property of the diffusion process is that we can sample $\mathbf{x}_t$ at an arbitrary timestep $t$ in closed form, i.e., with $\alpha_t := 1-\beta_t$ and $\bar{\alpha}_t := \prod_{s=1}^{t} \alpha_s$,

$$q(\mathbf{x}_t \mid \mathbf{x}_0) = \mathcal{N}\!\left(\mathbf{x}_t;\, \sqrt{\bar{\alpha}_t}\,\mathbf{x}_0,\, (1-\bar{\alpha}_t)\mathbf{I}\right).$$
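To make the closed-form property concrete, here is a minimal NumPy sketch (my own illustration, not the authors' code; the schedule values and the function name `q_sample` are assumptions) that draws $\mathbf{x}_t$ directly from $\mathbf{x}_0$:

```python
import numpy as np

# Illustrative linear variance schedule (beta_1, ..., beta_T).
T = 1000
betas = np.linspace(1e-4, 0.02, T)
alphas_cumprod = np.cumprod(1.0 - betas)     # \bar{alpha}_t for t = 1..T

def q_sample(x0, t, rng=np.random.default_rng()):
    """Sample x_t ~ q(x_t | x_0) in closed form (t is 1-indexed)."""
    noise = rng.standard_normal(x0.shape)
    sqrt_ab = np.sqrt(alphas_cumprod[t - 1])
    sqrt_one_minus_ab = np.sqrt(1.0 - alphas_cumprod[t - 1])
    return sqrt_ab * x0 + sqrt_one_minus_ab * noise

# Example: noise a toy "image" to timestep 500 in a single step.
x0 = np.zeros((8, 8))
x500 = q_sample(x0, 500)
```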
Meanwhile, the reverse process generates samples with a Markov chain of learned Gaussian transitions, starting from $p(\mathbf{x}_T) = \mathcal{N}(\mathbf{x}_T; \mathbf{0}, \mathbf{I})$:

$$p_\theta(\mathbf{x}_{0:T}) := p(\mathbf{x}_T) \prod_{t=1}^{T} p_\theta(\mathbf{x}_{t-1} \mid \mathbf{x}_t), \qquad p_\theta(\mathbf{x}_{t-1} \mid \mathbf{x}_t) := \mathcal{N}\!\left(\mathbf{x}_{t-1};\, \boldsymbol{\mu}_\theta(\mathbf{x}_t, t),\, \boldsymbol{\Sigma}_\theta(\mathbf{x}_t, t)\right).$$

Consequently, it models the data distribution as $p_\theta(\mathbf{x}_0) := \int p_\theta(\mathbf{x}_{0:T})\, d\mathbf{x}_{1:T}$.
To maximize the likelihood, we train the model by optimizing the variational bound on the negative log likelihood:

$$\mathbb{E}\!\left[-\log p_\theta(\mathbf{x}_0)\right] \le \mathbb{E}_q\!\left[-\log \frac{p_\theta(\mathbf{x}_{0:T})}{q(\mathbf{x}_{1:T} \mid \mathbf{x}_0)}\right] = \mathbb{E}_q\!\left[-\log p(\mathbf{x}_T) - \sum_{t \ge 1} \log \frac{p_\theta(\mathbf{x}_{t-1} \mid \mathbf{x}_t)}{q(\mathbf{x}_t \mid \mathbf{x}_{t-1})}\right] =: L.$$

We can rewrite the loss so that each term compares distributions over a single timestep:

$$L = \mathbb{E}_q\!\left[ \underbrace{D_{\mathrm{KL}}\!\left(q(\mathbf{x}_T \mid \mathbf{x}_0)\,\|\,p(\mathbf{x}_T)\right)}_{L_T} + \sum_{t>1} \underbrace{D_{\mathrm{KL}}\!\left(q(\mathbf{x}_{t-1} \mid \mathbf{x}_t, \mathbf{x}_0)\,\|\,p_\theta(\mathbf{x}_{t-1} \mid \mathbf{x}_t)\right)}_{L_{t-1}} \; \underbrace{-\log p_\theta(\mathbf{x}_0 \mid \mathbf{x}_1)}_{L_0} \right].$$

(For details, see Appendix A of the paper.) It is noteworthy that the loss compares $p_\theta(\mathbf{x}_{t-1} \mid \mathbf{x}_t)$ against the forward process posterior, which is tractable when conditioned on $\mathbf{x}_0$:

$$q(\mathbf{x}_{t-1} \mid \mathbf{x}_t, \mathbf{x}_0) = \mathcal{N}\!\left(\mathbf{x}_{t-1};\, \tilde{\boldsymbol{\mu}}_t(\mathbf{x}_t, \mathbf{x}_0),\, \tilde{\beta}_t \mathbf{I}\right),$$

where

$$\tilde{\boldsymbol{\mu}}_t(\mathbf{x}_t, \mathbf{x}_0) := \frac{\sqrt{\bar{\alpha}_{t-1}}\,\beta_t}{1-\bar{\alpha}_t}\mathbf{x}_0 + \frac{\sqrt{\alpha_t}\,(1-\bar{\alpha}_{t-1})}{1-\bar{\alpha}_t}\mathbf{x}_t, \qquad \tilde{\beta}_t := \frac{1-\bar{\alpha}_{t-1}}{1-\bar{\alpha}_t}\beta_t.$$

The underbraces label the components of the loss: $L_T$, $L_{t-1}$ (for $t = 2, \dots, T$), and $L_0$.
1.2. Parameterization of $L_0$, $L_{t-1}$, and the simplified objective
To obtain a discrete log likelihood, the authors set the last term of the reverse process to an independent discrete decoder derived from the Gaussian $\mathcal{N}(\mathbf{x}_0; \boldsymbol{\mu}_\theta(\mathbf{x}_1, 1), \sigma_1^2 \mathbf{I})$:

$$p_\theta(\mathbf{x}_0 \mid \mathbf{x}_1) = \prod_{i=1}^{D} \int_{\delta_-(x_0^i)}^{\delta_+(x_0^i)} \mathcal{N}\!\left(x;\, \mu_\theta^i(\mathbf{x}_1, 1),\, \sigma_1^2\right) dx,$$

$$\delta_+(x) = \begin{cases} \infty & \text{if } x = 1 \\ x + \frac{1}{255} & \text{if } x < 1 \end{cases}, \qquad \delta_-(x) = \begin{cases} -\infty & \text{if } x = -1 \\ x - \frac{1}{255} & \text{if } x > -1 \end{cases},$$

where $D$ is the data dimensionality, the superscript $i$ indicates one coordinate, and the image data are assumed to be scaled linearly to $[-1, 1]$.
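As a concrete illustration of this decoder (a minimal sketch of the formula above, not the repository's implementation; the function name is my own), the per-pixel probability is the Gaussian mass falling into the bin of width $2/255$ around each discrete value:

```python
import numpy as np
from scipy.stats import norm

def discretized_gaussian_loglik(x0, mean, sigma):
    """log p(x0 | x1) for one image, with x0 in [-1, 1] on the 256-level grid.

    Each pixel's probability is the Gaussian CDF mass between the bin edges
    x0 - 1/255 and x0 + 1/255 (open-ended at -1 and 1).
    """
    upper = np.where(x0 >= 1.0 - 1e-6, np.inf, x0 + 1.0 / 255.0)
    lower = np.where(x0 <= -1.0 + 1e-6, -np.inf, x0 - 1.0 / 255.0)
    cdf_upper = norm.cdf(upper, loc=mean, scale=sigma)
    cdf_lower = norm.cdf(lower, loc=mean, scale=sigma)
    prob = np.clip(cdf_upper - cdf_lower, 1e-12, 1.0)
    return np.sum(np.log(prob))   # sum over pixels = log of the product

# Example: a flat gray image under a zero-mean decoder.
x0 = np.zeros((8, 8))
print(discretized_gaussian_loglik(x0, mean=np.zeros((8, 8)), sigma=0.1))
```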
For $L_{t-1}$ ($1 < t \le T$), the authors first fix the reverse process variances to untrained time-dependent constants, $\boldsymbol{\Sigma}_\theta(\mathbf{x}_t, t) = \sigma_t^2 \mathbf{I}$, where either $\sigma_t^2 = \beta_t$ or $\sigma_t^2 = \tilde{\beta}_t$ worked similarly well in their experiments. With this choice,

$$L_{t-1} = \mathbb{E}_q\!\left[\frac{1}{2\sigma_t^2}\left\|\tilde{\boldsymbol{\mu}}_t(\mathbf{x}_t, \mathbf{x}_0) - \boldsymbol{\mu}_\theta(\mathbf{x}_t, t)\right\|^2\right] + C,$$

where $C$ is a constant that does not depend on $\theta$.

Therefore we need to make $\boldsymbol{\mu}_\theta$ predict the forward process posterior mean $\tilde{\boldsymbol{\mu}}_t$. Reparameterizing $\mathbf{x}_t(\mathbf{x}_0, \boldsymbol{\epsilon}) = \sqrt{\bar{\alpha}_t}\,\mathbf{x}_0 + \sqrt{1-\bar{\alpha}_t}\,\boldsymbol{\epsilon}$ with $\boldsymbol{\epsilon} \sim \mathcal{N}(\mathbf{0}, \mathbf{I})$, the posterior mean becomes

$$\tilde{\boldsymbol{\mu}}_t\!\left(\mathbf{x}_t, \mathbf{x}_0\right) = \frac{1}{\sqrt{\alpha_t}}\left(\mathbf{x}_t - \frac{\beta_t}{\sqrt{1-\bar{\alpha}_t}}\boldsymbol{\epsilon}\right),$$

where only the noise $\boldsymbol{\epsilon}$ is unknown given $\mathbf{x}_t$. This suggests the parameterization

$$\boldsymbol{\mu}_\theta(\mathbf{x}_t, t) = \frac{1}{\sqrt{\alpha_t}}\left(\mathbf{x}_t - \frac{\beta_t}{\sqrt{1-\bar{\alpha}_t}}\boldsymbol{\epsilon}_\theta(\mathbf{x}_t, t)\right),$$

in which $\boldsymbol{\epsilon}_\theta$ is a function approximator (a U-Net [2] in the paper) trained to predict $\boldsymbol{\epsilon}$ from $\mathbf{x}_t$. By this parameterization of $\boldsymbol{\mu}_\theta$, the loss becomes

$$L_{t-1} - C = \mathbb{E}_{\mathbf{x}_0, \boldsymbol{\epsilon}}\!\left[\frac{\beta_t^2}{2\sigma_t^2 \alpha_t (1-\bar{\alpha}_t)}\left\|\boldsymbol{\epsilon} - \boldsymbol{\epsilon}_\theta\!\left(\sqrt{\bar{\alpha}_t}\,\mathbf{x}_0 + \sqrt{1-\bar{\alpha}_t}\,\boldsymbol{\epsilon},\, t\right)\right\|^2\right].$$
The authors claim that empirically better results were obtained with the following unweighted loss function:

$$L_{\text{simple}}(\theta) := \mathbb{E}_{t, \mathbf{x}_0, \boldsymbol{\epsilon}}\!\left[\left\|\boldsymbol{\epsilon} - \boldsymbol{\epsilon}_\theta\!\left(\sqrt{\bar{\alpha}_t}\,\mathbf{x}_0 + \sqrt{1-\bar{\alpha}_t}\,\boldsymbol{\epsilon},\, t\right)\right\|^2\right], \qquad t \sim \mathrm{Uniform}\{1, \dots, T\}.$$

This corresponds to a reweighted variational bound: dropping the coefficient $\frac{\beta_t^2}{2\sigma_t^2 \alpha_t (1-\bar{\alpha}_t)}$ down-weights the loss terms at small $t$, so training focuses on the harder denoising tasks at larger $t$.
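To see the effect of dropping the coefficient, here is a small NumPy sketch (my own illustration, assuming the linear schedule used later in the code and $\sigma_t^2 = \tilde{\beta}_t$) that evaluates the true-bound weight $\beta_t^2 / \bigl(2\sigma_t^2 \alpha_t (1-\bar{\alpha}_t)\bigr)$ across timesteps:

```python
import numpy as np

T = 1000
betas = np.linspace(1e-4, 0.02, T)            # linear schedule beta_1..beta_T
alphas = 1.0 - betas
alphas_cumprod = np.cumprod(alphas)           # \bar{alpha}_t
alphas_cumprod_prev = np.append(1.0, alphas_cumprod[:-1])

# Forward process posterior variance \tilde{beta}_t, used here as sigma_t^2.
posterior_var = betas * (1.0 - alphas_cumprod_prev) / (1.0 - alphas_cumprod)

# True-bound weight for t = 2..T (t = 1 is handled by the discrete decoder).
idx = np.arange(1, T)
weight = betas[idx] ** 2 / (2.0 * posterior_var[idx] * alphas[idx] * (1.0 - alphas_cumprod[idx]))

print(weight[0], weight[len(weight) // 2], weight[-1])
# The weight is much larger at small t (roughly 0.6) than at large t (around 0.01
# or less), so replacing it by a constant 1 (L_simple) shifts relative emphasis
# toward the noisier, more difficult timesteps.
```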
2. Python code
Let us talk about the DDPM code, written in TensorFlow (Python). The authors provide their code here. We will only look into the core file "diffusion-master/diffusion_tf/diffusion_utils.py", which contains the theoretical content. "diffusion_utils.py" is used in "diffusion-master/scripts/run_celebahq.py".
Default setting:
Markov chain time steps $T = 1000$, with the forward process variances increasing linearly from $\beta_1 = 10^{-4}$ to $\beta_T = 0.02$ (the paper's default).
2.1. Training
Note that the step numbers below (T.* for training, S.* for sampling) do not correspond line-by-line to the paper's pseudocode.
T,S.0
Initialize the Model and construct a GaussianDiffusion instance from diffusion_utils.py. GaussianDiffusion contains almost all of the mathematical functions for the diffusion model.
Let’s follow the training code flow.
T.1
To train the Model, run Model.train_fn. For each input example in the minibatch, sample a timestep $t$ uniformly at random and pass it, together with the data, to p_losses.
T.2
p_losses corresponds to the simplified objective $L_{\text{simple}}$: it samples $\boldsymbol{\epsilon} \sim \mathcal{N}(\mathbf{0}, \mathbf{I})$, forms $\mathbf{x}_t = \sqrt{\bar{\alpha}_t}\,\mathbf{x}_0 + \sqrt{1-\bar{\alpha}_t}\,\boldsymbol{\epsilon}$ via q_sample, and returns the squared error between $\boldsymbol{\epsilon}$ and the model prediction $\boldsymbol{\epsilon}_\theta(\mathbf{x}_t, t)$.
- The result: the per-example loss $L_{\text{simple}}$, returned from step T.2 (a sketch of this computation follows).
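The following is a minimal NumPy sketch of what one training-loss evaluation amounts to. It is my own illustration of $L_{\text{simple}}$, not the repository's TensorFlow code; `simple_loss` and `eps_model` (a stand-in for the U-Net noise predictor) are hypothetical names.

```python
import numpy as np

T = 1000
betas = np.linspace(1e-4, 0.02, T)
alphas_cumprod = np.cumprod(1.0 - betas)

def simple_loss(eps_model, x0, rng=np.random.default_rng()):
    """One evaluation of L_simple for a batch x0 of shape (B, ...)."""
    B = x0.shape[0]
    t = rng.integers(0, T, size=B)                       # uniform timestep per example
    eps = rng.standard_normal(x0.shape)                  # target noise
    shape = (B,) + (1,) * (x0.ndim - 1)
    x_t = (np.sqrt(alphas_cumprod[t]).reshape(shape) * x0
           + np.sqrt(1.0 - alphas_cumprod[t]).reshape(shape) * eps)
    eps_pred = eps_model(x_t, t)                         # hypothetical noise predictor
    return np.mean((eps - eps_pred) ** 2)                # MSE over batch and pixels

# Example with a dummy "model" that always predicts zero noise.
x0 = np.zeros((4, 8, 8))
print(simple_loss(lambda x, t: np.zeros_like(x), x0))
```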
2.2. Sampling
It’s time for the sampling code.
S.1
We need to run Model.samples_fn to generate samples. self.diffusion.p_sample_loop is a function of GaussianDiffusion. It iterates GaussianDiffusion.p_sample from $t = T$ down to $t = 1$, drawing $\mathbf{x}_{t-1} \sim p_\theta(\mathbf{x}_{t-1} \mid \mathbf{x}_t)$ starting from $\mathbf{x}_T \sim \mathcal{N}(\mathbf{0}, \mathbf{I})$.
S.2
p_sample corresponds to one reverse step:

$$\mathbf{x}_{t-1} = \frac{1}{\sqrt{\alpha_t}}\left(\mathbf{x}_t - \frac{\beta_t}{\sqrt{1-\bar{\alpha}_t}}\boldsymbol{\epsilon}_\theta(\mathbf{x}_t, t)\right) + \sigma_t \mathbf{z}, \qquad \mathbf{z} \sim \mathcal{N}(\mathbf{0}, \mathbf{I}) \text{ for } t > 1, \quad \mathbf{z} = \mathbf{0} \text{ for } t = 1.$$
S.3
Inside p_sample there are two helpers, predict_start_from_noise and q_posterior. Let's take a look at each of them.
S.3.1
predict_start_from_noise inverts the closed-form forward sampling: given $\mathbf{x}_t$ and the predicted noise $\boldsymbol{\epsilon}_\theta(\mathbf{x}_t, t)$, it recovers an estimate of the clean data,

$$\hat{\mathbf{x}}_0 = \frac{1}{\sqrt{\bar{\alpha}_t}}\left(\mathbf{x}_t - \sqrt{1-\bar{\alpha}_t}\,\boldsymbol{\epsilon}_\theta(\mathbf{x}_t, t)\right).$$
S.3.2
q_posterior is equivalent to the forward process posterior $q(\mathbf{x}_{t-1} \mid \mathbf{x}_t, \mathbf{x}_0)$ above: it returns the mean $\tilde{\boldsymbol{\mu}}_t(\mathbf{x}_t, \hat{\mathbf{x}}_0)$ and variance $\tilde{\beta}_t$, with the estimate $\hat{\mathbf{x}}_0$ plugged in for $\mathbf{x}_0$.
- The result: a synthetic sample $\mathbf{x}_0$, returned from step S.1 as the final state of the iteration. (A compact sketch of the whole loop follows.)
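Putting S.1 through S.3 together, here is a minimal NumPy sketch of the sampling loop. Again, this is my own illustration rather than the repository's TensorFlow code; `p_sample_loop` here only mirrors the math, and `eps_model` is a hypothetical trained noise predictor.

```python
import numpy as np

T = 1000
betas = np.linspace(1e-4, 0.02, T)
alphas = 1.0 - betas
alphas_cumprod = np.cumprod(alphas)
alphas_cumprod_prev = np.append(1.0, alphas_cumprod[:-1])
posterior_var = betas * (1.0 - alphas_cumprod_prev) / (1.0 - alphas_cumprod)

def p_sample_loop(eps_model, shape, rng=np.random.default_rng()):
    """Generate samples by running the reverse chain from t = T down to t = 1."""
    x = rng.standard_normal(shape)                       # x_T ~ N(0, I)
    for i in reversed(range(T)):                         # i = t - 1
        eps_pred = eps_model(x, i)
        # S.3.1: estimate x_0 from x_t and the predicted noise, then clip to [-1, 1].
        x0_hat = (x - np.sqrt(1.0 - alphas_cumprod[i]) * eps_pred) / np.sqrt(alphas_cumprod[i])
        x0_hat = np.clip(x0_hat, -1.0, 1.0)
        # S.3.2: forward process posterior mean \tilde{mu}_t(x_t, x0_hat).
        mean = (np.sqrt(alphas_cumprod_prev[i]) * betas[i] / (1.0 - alphas_cumprod[i]) * x0_hat
                + np.sqrt(alphas[i]) * (1.0 - alphas_cumprod_prev[i]) / (1.0 - alphas_cumprod[i]) * x)
        # S.2: add noise, except at the last step (t = 1).
        noise = rng.standard_normal(shape) if i > 0 else np.zeros(shape)
        x = mean + np.sqrt(posterior_var[i]) * noise
    return x

# Example with a dummy noise predictor.
samples = p_sample_loop(lambda x, t: np.zeros_like(x), (2, 8, 8))
```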
Note
For many mathematical concepts and proofs, I recommend that you read lilianweng’s comprehensive posts 1 & 2.
It would also be helpful to refer to the following review.
References
[1] Ho, Jonathan, Ajay Jain, and Pieter Abbeel. “Denoising diffusion probabilistic models.” Advances in neural information processing systems 33 (2020): 6840-6851.
[2] Ronneberger, Olaf, Philipp Fischer, and Thomas Brox. “U-net: Convolutional networks for biomedical image segmentation.” Medical Image Computing and Computer-Assisted Intervention–MICCAI 2015: 18th International Conference, Munich, Germany, October 5-9, 2015, Proceedings, Part III 18. Springer International Publishing, 2015.