
Conversation

HIT-cwh (Collaborator) commented on Apr 24, 2024

No description provided.

HIT-cwh added 2 commits on April 24, 2024:

* do not set attn_implementation to flash_attention_2 or sdpa if users already set it
* check cfg: If we want to use varlen attn or sequence parallel, we should set attn_implementation to flash_attention_2 or do not set this attribute.
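For context, the attn_implementation these commits refer to is the attention backend keyword that XTuner forwards to Transformers when building the model. Below is a hypothetical, simplified config fragment (an mmengine-style dict, as XTuner configs use) in which a user sets it explicitly; with this fix, such an explicit choice is respected rather than overwritten. The model name and surrounding keys are illustrative, not copied from a real config.

```python
# Hypothetical XTuner-style config fragment; surrounding keys and the model
# name are illustrative, not copied from a real config.
from transformers import AutoModelForCausalLM

model = dict(
    llm=dict(
        type=AutoModelForCausalLM.from_pretrained,
        pretrained_model_name_or_path='internlm/internlm2-chat-7b',
        trust_remote_code=True,
        # Explicit user choice: after this fix, XTuner keeps it rather
        # than overriding it with flash_attention_2 / sdpa.
        attn_implementation='flash_attention_2',
    ),
)
```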
HIT-cwh changed the title: [Fix] Do not set attn_implementation to flash_attention_2 or sdpa if users already set it (Apr 24, 2024)
pppppM merged commit 60e0cc9 into InternLM:main on Apr 25, 2024
llkn-2 pushed a commit to llkn-2/xtuner that referenced this pull request Jul 31, 2024
…users already set it in XTuner configs. (InternLM#609)

* do not set attn_implementation to flash_attention_2 or sdpa if users already set it

* check cfg: If we want to use varlen attn or sequence parallel, we should set attn_implementation to flash_attention_2 or do not set this attribute.
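A minimal sketch of the two behaviours the commit messages describe: fill in attn_implementation only when the user has not set it, and reject configs that ask for varlen attention or sequence parallel together with an incompatible attention backend. The helper names and the plain-dict config handling are hypothetical illustrations rather than XTuner's actual API, and config keys such as use_varlen_attn and sequence_parallel_size are assumptions here.

```python
# Hypothetical helpers sketching the fix; not XTuner's actual functions.

def set_attn_implementation_if_unset(llm_cfg: dict) -> dict:
    """Pick an attention backend only when the user left it unset."""
    if 'attn_implementation' not in llm_cfg:
        try:
            import flash_attn  # noqa: F401  # prefer FlashAttention-2 if installed
            llm_cfg['attn_implementation'] = 'flash_attention_2'
        except ImportError:
            llm_cfg['attn_implementation'] = 'sdpa'
    return llm_cfg


def check_attn_cfg(cfg: dict) -> None:
    """Varlen attention / sequence parallel require flash_attention_2
    (or leaving attn_implementation unset so it can be chosen automatically)."""
    attn_impl = cfg.get('model', {}).get('llm', {}).get('attn_implementation')
    needs_flash = (cfg.get('use_varlen_attn', False)
                   or cfg.get('sequence_parallel_size', 1) > 1)
    if needs_flash and attn_impl not in (None, 'flash_attention_2'):
        raise ValueError(
            'varlen attention / sequence parallel require '
            "attn_implementation='flash_attention_2' or leaving it unset, "
            f'but got {attn_impl!r}')
```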

Labels: None yet

2 participants