From the course: Programming Generative AI: From Variational Autoencoders to Stable Diffusion with PyTorch and Hugging Face


Decoding images from the stable diffusion latent space


- [Instructor] A new bit, since this is a latent diffusion model, is that we can visualize the latents themselves by, again, forcing them into an image. We can see that they're low resolution, since this is the latent space, and that the colors look strange. Again, this is a visualization of the variational autoencoder's latent space, which doesn't have a one-to-one mapping with RGB channels. If we try to decode these latents naively, we get a weird, distorted image. To properly convert the latents back into pixel space, we first need to scale them, and this magic number, 0.18215, comes from the original latent diffusion paper. The authors found that this simple rescaling normalization of the latents just happened to work well, so they arrived at it and put it in the paper and their initial code. So I wouldn't think too much about it, but for us it's important since, again, we need to match what the…
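As a rough sketch of the two steps described here, the snippet below undoes the 0.18215 scaling before decoding and then maps a decoded tensor from the VAE's [-1, 1] range to uint8 pixels. The real VAE call (something like `vae.decode(latents / 0.18215).sample` with Hugging Face's `diffusers`) is only shown as a comment; here the decoded output is faked with random data so the postprocessing runs standalone, and the shapes (4×64×64 latents, 3×512×512 image) are just the typical Stable Diffusion sizes, not something this transcript specifies.

```python
import numpy as np

# Magic scaling constant from the latent diffusion paper/code:
# latents are multiplied by this during training, so we divide
# by it before handing them to the VAE decoder.
SCALE = 0.18215

def unscale_latents(latents):
    """Undo the 0.18215 scaling before feeding latents to the VAE decoder."""
    return latents / SCALE

def to_pixel_image(decoded):
    """Map a decoded tensor in [-1, 1] (C, H, W) to uint8 RGB pixels (H, W, C)."""
    img = (decoded / 2 + 0.5).clip(0, 1)        # [-1, 1] -> [0, 1]
    img = (img * 255).round().astype(np.uint8)  # [0, 1]  -> [0, 255]
    return np.transpose(img, (1, 2, 0))         # CHW -> HWC

# Hypothetical latents with the shape Stable Diffusion typically uses:
# 4 channels at 1/8 the pixel resolution (64x64 latents -> 512x512 image).
latents = np.random.randn(4, 64, 64).astype(np.float32) * SCALE
vae_input = unscale_latents(latents)

# In the real pipeline the scaled latents go through the VAE decoder, e.g.:
#   image = vae.decode(latents / 0.18215).sample
# Here we fake a decoded output in [-1, 1] just to exercise the postprocessing.
fake_decoded = np.tanh(np.random.randn(3, 512, 512).astype(np.float32))
pixels = to_pixel_image(fake_decoded)
print(pixels.shape, pixels.dtype)
```

Visualizing `latents` directly (as the instructor does) would use the same `to_pixel_image` idea on three of the four latent channels, which is exactly why the colors look wrong: those channels were never meant to be RGB.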
