Latent Diffusion / DiT

Latent Diffusion Models (High-Resolution Image Synthesis with Latent Diffusion Models)

https://arxiv.org/pdf/2112.10752

The goal is to generate high-resolution images at a lower computational cost than pixel-space diffusion.

The noise predictor is trained in the latent space of a pretrained autoencoder rather than in pixel space.
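A minimal NumPy sketch of what training in latent space means: the image is first encoded to a smaller latent, the standard DDPM forward process adds noise to that latent, and the noise predictor is scored with a mean-squared error against the true noise. Everything named here is an assumption for illustration: `toy_encode` is a hypothetical average-pooling stand-in for the real learned encoder, the linear beta schedule is one common choice, and the all-zeros `eps_pred` is a placeholder where a network's output would go.

```python
import numpy as np

def toy_encode(image, factor=8):
    """Hypothetical stand-in for the autoencoder's encoder: average-pool by `factor`."""
    c, h, w = image.shape
    return image.reshape(c, h // factor, factor, w // factor, factor).mean(axis=(2, 4))

def add_noise(z0, t, alpha_bar, rng):
    """DDPM forward process on a latent: z_t = sqrt(a_bar_t) * z_0 + sqrt(1 - a_bar_t) * eps."""
    eps = rng.standard_normal(z0.shape)
    zt = np.sqrt(alpha_bar[t]) * z0 + np.sqrt(1.0 - alpha_bar[t]) * eps
    return zt, eps

def noise_prediction_loss(eps_pred, eps):
    """Mean-squared error between predicted and true noise."""
    return float(np.mean((eps_pred - eps) ** 2))

rng = np.random.default_rng(0)
betas = np.linspace(1e-4, 0.02, 1000)      # assumed linear noise schedule
alpha_bar = np.cumprod(1.0 - betas)        # cumulative product of (1 - beta_t)

image = rng.uniform(-1.0, 1.0, size=(3, 256, 256))
z0 = toy_encode(image)                     # latent is 8x smaller per side: (3, 32, 32)
zt, eps = add_noise(z0, t=500, alpha_bar=alpha_bar, rng=rng)

eps_pred = np.zeros_like(zt)               # placeholder for the noise predictor's output
loss = noise_prediction_loss(eps_pred, eps)
```

The point of the sketch is the shapes: the noise predictor only ever sees the 32x32 latent, never the 256x256 image, which is where the compute saving comes from.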

Earlier diffusion models, including LDM itself, use a U-Net backbone augmented with attention modules as the noise predictor.

DiT (Scalable Diffusion Models with Transformers)

https://arxiv.org/pdf/2212.09748

DDPM-style diffusion does not have to use a U-Net: DiT replaces the U-Net with a ViT.

DiT is based on the Vision Transformer (ViT) architecture, which operates on sequences of patches.
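To make "sequences of patches" concrete, here is a small NumPy sketch of patchifying a latent into tokens and reversing the operation. The function names `patchify` and `unpatchify` and the channel-first layout are assumptions for illustration, not the reference implementation. The example shape follows the DiT paper's setting of a 32x32x4 latent with patch size 2, which yields 256 tokens.

```python
import numpy as np

def patchify(latent, patch_size):
    """Split a (C, H, W) latent into a (num_patches, patch_size * patch_size * C) sequence."""
    c, h, w = latent.shape
    assert h % patch_size == 0 and w % patch_size == 0
    gh, gw = h // patch_size, w // patch_size
    x = latent.reshape(c, gh, patch_size, gw, patch_size)
    x = x.transpose(1, 3, 2, 4, 0)             # (gh, gw, p, p, c)
    return x.reshape(gh * gw, patch_size * patch_size * c)

def unpatchify(tokens, channels, height, width, patch_size):
    """Inverse of patchify: rebuild the (C, H, W) latent from the token sequence."""
    gh, gw = height // patch_size, width // patch_size
    x = tokens.reshape(gh, gw, patch_size, patch_size, channels)
    x = x.transpose(4, 0, 2, 1, 3)             # (c, gh, p, gw, p)
    return x.reshape(channels, height, width)

latent = np.arange(4 * 32 * 32, dtype=np.float32).reshape(4, 32, 32)
tokens = patchify(latent, patch_size=2)        # (32/2)^2 = 256 tokens of size 2*2*4 = 16
restored = unpatchify(tokens, channels=4, height=32, width=32, patch_size=2)
```

In a full model each token would then be linearly projected to the transformer's hidden size and given a positional embedding; the sketch stops before that step. Halving the patch size quadruples the number of tokens, so patch size directly trades sequence length against compute.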



