From the course: Programming Generative AI: From Variational Autoencoders to Stable Diffusion with PyTorch and Hugging Face
Adding conditional control to text-to-image diffusion models
- So, when I presented conditional generative models and the initial latent diffusion model paper, I mentioned that, in theory, you can use any conditioning you want. It doesn't have to be text; it could be an image, or it could be a class of something you want to generate. As long as you've trained the latent diffusion model with that conditioning, you can use it to guide the generation process. But the downside to this approach shows up when you want to do a new task. Let's say you have a latent diffusion model and you want to do something like inpainting, or a pose-generation model where you pass in a skeleton pose and it generates a person in the same pose. If the model wasn't trained for that conditioning, you'll have to retrain everything from scratch. Now, one approach that researchers have developed is ControlNet, which is a really clever way, similar to how LoRA works, where you…
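The transcript cuts off before the mechanism is described, so what follows is only a rough sketch of the general idea as it is commonly described for ControlNet: the pretrained weights stay frozen, and a trainable side branch that sees the new conditioning feeds back through a zero-initialized layer. Every name here (`FrozenBlock`, `ControlBranch`, `controlled_forward`) is hypothetical, plain Python lists stand in for tensors, and a single linear map stands in for a whole network block. It is not the course's code or any library's API.

```python
import random


def matvec(weights, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * v for w, v in zip(row, vector)) for row in weights]


class FrozenBlock:
    """Stand-in for one pretrained layer of the base model (never updated)."""

    def __init__(self, dim, seed=0):
        rng = random.Random(seed)
        self.weights = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(dim)]

    def __call__(self, x):
        return matvec(self.weights, x)


class ControlBranch:
    """Hypothetical trainable side branch that also sees the conditioning signal."""

    def __init__(self, base, dim):
        # Trainable copy of the base weights; the base itself is left untouched.
        self.copy_weights = [row[:] for row in base.weights]
        # Zero-initialized output projection: the branch contributes nothing
        # until training moves these weights away from zero.
        self.zero_proj = [[0.0] * dim for _ in range(dim)]

    def __call__(self, x, cond):
        hidden = matvec(self.copy_weights, [xi + ci for xi, ci in zip(x, cond)])
        return matvec(self.zero_proj, hidden)


def controlled_forward(base, branch, x, cond):
    """Base output plus the side branch's residual."""
    return [b + r for b, r in zip(base(x), branch(x, cond))]


dim = 4
base = FrozenBlock(dim)
branch = ControlBranch(base, dim)
x = [0.5, -1.0, 0.25, 2.0]
cond = [1.0, 0.0, -1.0, 0.5]  # e.g. a flattened pose or edge feature (illustrative)

# Before any training, the conditioned model reproduces the base model exactly.
print(controlled_forward(base, branch, x, cond) == base(x))
```

The zero initialization is the part that resembles LoRA, where one of the two low-rank matrices also starts at zero: in both cases training starts from the unmodified pretrained behavior, and only the small added branch has to learn the new conditioning.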
Contents
- Topics (46s)
- Methods and metrics for evaluating generative AI (7m 5s)
- Manual evaluation of stable diffusion with DrawBench (13m 56s)
- Quantitative evaluation of diffusion models with human preference predictors (20m 1s)
- Overview of methods for fine-tuning diffusion models (9m 34s)
- Sourcing and preparing image datasets for fine-tuning (7m 41s)
- Generating automatic captions with BLIP-2 (8m 28s)
- Parameter efficient fine-tuning with LoRa (11m 50s)
- Inspecting the results of fine-tuning (5m 2s)
- Inference with LoRas for style-specific generation (12m 22s)
- Conceptual overview of textual inversion (8m 14s)
- Subject-specific personalization with DreamBooth (7m 43s)
- DreamBooth versus LoRa fine-tuning (6m 28s)
- DreamBooth fine-tuning with Hugging Face (14m 11s)
- Inference with DreamBooth to create personalized AI avatars (14m 21s)
- Adding conditional control to text-to-image diffusion models (4m 7s)
- Creating edge and depth maps for conditioning (15m 35s)
- Depth and edge-guided stable diffusion with ControlNet (17m 10s)
- Understanding and experimenting with ControlNet parameters (8m 32s)
- Generative text effects with font depth maps (2m 49s)
- Few step generation with adversarial diffusion distillation (ADD) (7m 2s)
- Reasons to distill (6m 9s)
- Comparing SDXL and SDXL Turbo (11m 49s)
- Text-guided image-to-image translation (16m 52s)
- Video-driven frame-by-frame generation with SDXL Turbo (13m)
- Near real-time inference with PyTorch performance optimizations (11m 18s)