Diffusion Models for Imaging and Vision

VAE (Variational Auto-Encoder)

Setting

Evidence Lower Bound (Similar to Maximum Likelihood Estimation)

Training VAE

Loss Function

Inference with VAE

Generate the decoded image $\hat{x}$ by sampling a latent $z$ from $p(z) = \mathcal{N}(0, I)$ and passing it through the decoder with its trained parameters.
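A minimal sketch of this sampling step, assuming a toy linear decoder. The names `W`, `b`, `decode`, and the dimensions are hypothetical stand-ins for illustration; a trained neural decoder would take their place.

```python
import numpy as np

rng = np.random.default_rng(0)

latent_dim, image_dim = 4, 16  # illustrative sizes (assumption)

# Hypothetical decoder parameters; in practice these come from training.
W = rng.standard_normal((image_dim, latent_dim))
b = np.zeros(image_dim)

def decode(z):
    """Toy stand-in for the decoder: maps a latent z to an image x_hat."""
    return W @ z + b

# Inference: draw z ~ N(0, I), then decode it.
z = rng.standard_normal(latent_dim)
x_hat = decode(z)
```

Only the decoder is used at inference time; the encoder is needed for training but not for generating new samples.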

Denoising Diffusion Probabilistic Model (DDPM)

Building Blocks

The Magical Scalars $\sqrt{\alpha_{t}}$ and $1-\alpha_{t}$

Distribution $q_{\phi}(x_{t}|x_{0})$
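As a sketch of how the two scalars combine: assuming the standard DDPM one-step transition $q_{\phi}(x_{t}|x_{t-1}) = \mathcal{N}(\sqrt{\alpha_{t}}\,x_{t-1}, (1-\alpha_{t})I)$, composing the steps gives $q_{\phi}(x_{t}|x_{0}) = \mathcal{N}(\sqrt{\bar{\alpha}_{t}}\,x_{0}, (1-\bar{\alpha}_{t})I)$ with $\bar{\alpha}_{t} = \prod_{i=1}^{t}\alpha_{i}$. The linear schedule and the helper `sample_xt` below are assumptions made for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

T = 1000
# Assumed linear schedule for 1 - alpha_t (a common choice, not taken from the text).
betas = np.linspace(1e-4, 0.02, T)
alphas = 1.0 - betas
alpha_bar = np.cumprod(alphas)  # alpha_bar[t-1] = alpha_1 * ... * alpha_t

def sample_xt(x0, t):
    """Draw x_t ~ N(sqrt(alpha_bar_t) x0, (1 - alpha_bar_t) I) for t in 1..T."""
    a = alpha_bar[t - 1]
    eps = rng.standard_normal(x0.shape)
    return np.sqrt(a) * x0 + np.sqrt(1.0 - a) * eps

x0 = np.ones(8)
x_early = sample_xt(x0, 1)  # mostly signal
x_late = sample_xt(x0, T)   # mostly noise
```

Because $x_{t}$ can be drawn directly from $x_{0}$, training does not need to simulate the chain step by step.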

Evidence Lower Bound

Rewrite the Consistency Term

Derivation of $q_{\phi}(x_{t-1}|x_{t}, x_{0})$, which is a Gaussian
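For reference, the result of this derivation in the standard DDPM setting can be written as follows. The notation $\bar{\alpha}_{t} = \prod_{i=1}^{t}\alpha_{i}$, $\mu_{q}$, and $\Sigma_{q}$ is assumed here, since the outline does not fix it.

```latex
q_{\phi}(x_{t-1} \mid x_{t}, x_{0})
  = \mathcal{N}\bigl(x_{t-1} \,\big|\, \mu_{q}(x_{t}, x_{0}),\ \Sigma_{q}(t)\bigr),
\\[4pt]
\mu_{q}(x_{t}, x_{0})
  = \frac{(1-\bar{\alpha}_{t-1})\sqrt{\alpha_{t}}}{1-\bar{\alpha}_{t}}\, x_{t}
  + \frac{(1-\alpha_{t})\sqrt{\bar{\alpha}_{t-1}}}{1-\bar{\alpha}_{t}}\, x_{0},
\\[4pt]
\Sigma_{q}(t)
  = \frac{(1-\alpha_{t})(1-\bar{\alpha}_{t-1})}{1-\bar{\alpha}_{t}}\, I .
```

The mean is a weighted combination of $x_{t}$ and $x_{0}$, and the covariance depends only on $t$, not on the data.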

Training and Inference

Derivation Based on Noise Vector

Inversion by Direct Denoising (InDI): linear combinations $x_{t-1} = (\text{something})\cdot x_{t}+(\text{something else})\cdot \text{denoise}(x_{t}) + \text{noise}$
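The linear-combination form can be sketched for the DDPM case, taking the coefficients from the Gaussian posterior $q_{\phi}(x_{t-1}|x_{t}, x_{0})$ with $x_{0}$ replaced by a denoiser output. This is an illustration under stated assumptions, not the InDI update itself: the schedule, the helper names, and the toy `denoise` function (the exact posterior mean when $x_{0} \sim \mathcal{N}(0, I)$) are all hypothetical choices.

```python
import numpy as np

rng = np.random.default_rng(0)

T = 50
alphas = 1.0 - np.linspace(1e-4, 0.2, T)  # assumed schedule
# alpha_bar[t] for t = 0..T, with the convention alpha_bar[0] = 1.
alpha_bar = np.concatenate(([1.0], np.cumprod(alphas)))

def step_coeffs(t):
    """Return (coeff on x_t, coeff on denoise(x_t), noise std) for step t -> t-1."""
    a_t = alphas[t - 1]
    ab_t, ab_prev = alpha_bar[t], alpha_bar[t - 1]
    c_xt = (1.0 - ab_prev) * np.sqrt(a_t) / (1.0 - ab_t)
    c_x0 = (1.0 - a_t) * np.sqrt(ab_prev) / (1.0 - ab_t)
    sigma = np.sqrt((1.0 - a_t) * (1.0 - ab_prev) / (1.0 - ab_t))
    return c_xt, c_x0, sigma

def denoise(x_t, t):
    """Toy stand-in for a trained denoiser: E[x_0 | x_t] when x_0 ~ N(0, I)."""
    return np.sqrt(alpha_bar[t]) * x_t

def reverse_step(x_t, t):
    """x_{t-1} = c_xt * x_t + c_x0 * denoise(x_t) + sigma * noise."""
    c_xt, c_x0, sigma = step_coeffs(t)
    noise = rng.standard_normal(x_t.shape) if t > 1 else 0.0
    return c_xt * x_t + c_x0 * denoise(x_t, t) + sigma * noise

x = rng.standard_normal(8)  # start from x_T ~ N(0, I)
for t in range(T, 0, -1):
    x = reverse_step(x, t)
```

At the last step ($t = 1$) the weight on $x_{t}$ and the noise level both vanish, so the output is simply the denoiser's estimate.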