Commit b9e1c30

[Docs] more elaborate example for peft torch.compile (#7161)

1 parent 03cd625 commit b9e1c30

File tree

1 file changed: +6 −2 lines changed


‎docs/source/en/tutorials/using_peft_for_inference.md‎

Lines changed: 6 additions & 2 deletions
@@ -169,7 +169,7 @@ list_adapters_component_wise
 
 If you want to compile your model with `torch.compile` make sure to first fuse the LoRA weights into the base model and unload them.
 
-```py
+```diff
 pipe.load_lora_weights("nerijs/pixel-art-xl", weight_name="pixel-art-xl.safetensors", adapter_name="pixel")
 pipe.load_lora_weights("CiroN2022/toy-face", weight_name="toy_face_sdxl.safetensors", adapter_name="toy")
 
@@ -178,12 +178,16 @@ pipe.set_adapters(["pixel", "toy"], adapter_weights=[0.5, 1.0])
 pipe.fuse_lora()
 pipe.unload_lora_weights()
 
-pipe = torch.compile(pipe)
++ pipe.unet.to(memory_format=torch.channels_last)
++ pipe.unet = torch.compile(pipe.unet, mode="reduce-overhead", fullgraph=True)
 
 prompt = "toy_face of a hacker with a hoodie, pixel art"
 image = pipe(prompt, num_inference_steps=30, generator=torch.manual_seed(0)).images[0]
 ```
 
+> [!TIP]
+> You can refer to the `torch.compile()` section [here](https://huggingface.co/docs/diffusers/main/en/optimization/torch2.0#torchcompile) and [here](https://huggingface.co/docs/diffusers/main/en/tutorials/fast_diffusion#torchcompile) for more elaborate examples.
+
 ## Fusing adapters into the model
 
 You can use PEFT to easily fuse/unfuse multiple adapters directly into the model weights (both UNet and text encoder) using the [`~diffusers.loaders.LoraLoaderMixin.fuse_lora`] method, which can lead to a speed-up in inference and lower VRAM usage.
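Why does the doc insist on fusing before compiling? Fusing folds the low-rank LoRA update into the base weight, so the graph `torch.compile` captures contains a single matmul instead of extra adapter branches. A minimal sketch of the merge arithmetic in plain PyTorch (toy shapes and names chosen for illustration, not the diffusers API):

```python
import torch

torch.manual_seed(0)

# Base weight and a rank-4 LoRA update (toy shapes, illustration only)
W = torch.randn(64, 64)
A = torch.randn(4, 64)   # LoRA down-projection
B = torch.randn(64, 4)   # LoRA up-projection
scale = 0.5              # adapter strength, like adapter_weights=[0.5, ...]

x = torch.randn(8, 64)

# Unfused: base path plus an adapter branch on every forward call
y_unfused = x @ W.T + scale * (x @ A.T @ B.T)

# Fused: fold the update into the weight once, then run a single matmul
W_fused = W + scale * (B @ A)
y_fused = x @ W_fused.T

# The outputs match, so the adapter branch can be dropped before compiling
assert torch.allclose(y_unfused, y_fused, atol=1e-4)
```

With the branch folded away, the compiled UNet sees only static weights, which is what `mode="reduce-overhead"` and `fullgraph=True` benefit from.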

0 commit comments