
Conversation

dxqb commented on Dec 26, 2025

What does this PR do?

Using an attention backend (https://huggingface.co/docs/diffusers/main/optimization/attention_backends) that does not support attention masks with a model that passes them yields incorrect results, because the mask is not applied and no error is raised.

This is already checked in parallel backends...

raise ValueError("`attn_mask` is not yet supported for flash-attn 2.")

...but not yet in the regular ones.

This PR adds the same check to the regular attention backends.
Fixes #12605
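
To illustrate the shape of the check (a minimal sketch, not the actual diffusers dispatch code): a backend wrapper that cannot honor `attn_mask` should raise instead of silently computing unmasked attention. The function name `sage_attention_forward` and the SDPA fallback are illustrative assumptions.

```python
from typing import Optional

import torch
import torch.nn.functional as F


def sage_attention_forward(  # hypothetical backend wrapper name
    query: torch.Tensor,
    key: torch.Tensor,
    value: torch.Tensor,
    attn_mask: Optional[torch.Tensor] = None,
) -> torch.Tensor:
    if attn_mask is not None:
        # Fail loudly, mirroring the existing parallel-backend check,
        # rather than returning incorrect (unmasked) attention results.
        raise ValueError("`attn_mask` is not yet supported for this attention backend.")
    # Plain PyTorch SDPA stands in for the real backend kernel here,
    # purely for illustration.
    return F.scaled_dot_product_attention(query, key, value)
```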

Who can review?

@yiyixuxu and @asomoza
CC @zzlol63 @tolgacangoz

