From the course: Programming Generative AI: From Variational Autoencoders to Stable Diffusion with PyTorch and Hugging Face
Decoding images from the stable diffusion latent space
- [Instructor] A new bit, since this is a latent diffusion model, is that we can visualize the latents themselves. In this case, again just forcing them into an image, we can see that they are low resolution, since this is the latent space, and that the color mapping looks weird. Again, this is visualizing the variational autoencoder latent space, which doesn't have a one-to-one mapping with RGB channels. When we try to decode these naively, we get a weird, distorted image. To properly convert these latents back into pixel space, we first need to scale the latents. This magic 0.18215 comes from the original latent diffusion paper. The authors found that this simple rescaling of the latents happened to work well, so they put it in the paper and their initial code. I wouldn't think too much about it. But for us it's important since, again, we need to match what the…
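The scaling step described in the transcript can be sketched as a small helper. This is a minimal illustration, not the course's own code: the function name `decode_latents` is hypothetical, and it assumes a Hugging Face diffusers-style VAE whose `decode()` returns an object with a `.sample` tensor. It also assumes the usual Stable Diffusion conventions that the latents are divided by 0.18215 before decoding and that the decoder output lies roughly in [-1, 1], so it is shifted to [0, 1] for display.

```python
# Constant quoted in the lesson; used in the original latent diffusion code.
SCALING_FACTOR = 0.18215


def decode_latents(vae, latents, scaling_factor=SCALING_FACTOR):
    """Convert diffusion latents back to pixel space (hypothetical helper).

    Assumes `vae.decode(x)` returns an object with a `.sample` tensor,
    as diffusers' AutoencoderKL does, and that `latents` is a tensor
    supporting division, addition, and `.clamp`.
    """
    # Undo the scaling that was applied to the latents for diffusion.
    unscaled = latents / scaling_factor

    # Run the VAE decoder to go from latent space to pixel space.
    decoded = vae.decode(unscaled).sample

    # Assumption: decoder output is roughly in [-1, 1]; map it to [0, 1].
    return (decoded / 2 + 0.5).clamp(0, 1)
```

With a loaded `AutoencoderKL` and a latent tensor from the denoising loop, the call would typically be wrapped in `torch.no_grad()`, and the result permuted to height × width × channels before being shown as an image.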