
Conversation

@HIT-cwh commented Apr 9, 2024

Optimized the API design for increased generality.

@pppppM merged commit 376188a into InternLM:main Apr 15, 2024
@HIT-cwh deleted the refine_sp branch April 25, 2024 06:33
llkn-2 pushed a commit to llkn-2/xtuner that referenced this pull request Jul 31, 2024
* refine the split_for_sequence_parallel API: the tensor to be split need not have shape (bs, seq_len, dim) (sketched after this list)

* Expose the two interfaces, pre_process_for_sequence_parallel_attn and post_process_for_sequence_parallel_attn, to the user (see the sketch after this list)

* Assert PyTorch version != 2.1

* Remove the PyTorch version restriction when using sequence parallel

* refine all_to_all op

* split the sequence in sft rather than in data_collate_fn

* add sequence communications

* fix lint

* move all_to_all to communications

* make the compute_sequence_parallel_loss method private (reduction sketched after this list)

* fix sp docs

* rename

* add an explanation of why grad_scale is needed (see the note after this list)

* refine

* add docstring

* rename communications to comm
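
The generalized split from the first commit can be illustrated with a minimal sketch. The signature `split_for_sequence_parallel(input, dim, sp_group)` and the divisibility check are assumptions about the shape of the API, not a copy of xtuner's implementation:

```python
import torch.distributed as dist

def split_for_sequence_parallel(input, dim, sp_group):
    """Keep only this rank's chunk of ``input`` along ``dim``.

    Accepting an arbitrary ``dim`` is what makes the API general: the
    tensor no longer has to be laid out as (bs, seq_len, dim).
    """
    sp_size = dist.get_world_size(sp_group)
    if sp_size == 1:
        return input
    rank = dist.get_rank(sp_group)
    assert input.size(dim) % sp_size == 0, (
        f'dim {dim} (size {input.size(dim)}) must be divisible by the '
        f'sequence-parallel world size {sp_size}')
    return input.chunk(sp_size, dim=dim)[rank].contiguous()
```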
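The exposed attention interfaces rest on the refined all_to_all op: before attention, head shards are traded for sequence shards so each rank attends over the full sequence for a subset of heads, and the exchange is reversed afterwards. A forward-only sketch, assuming a (bs, seq_len, num_heads, head_dim) layout, head and sequence dims divisible by the group size, and an `sp_group` handle; a real op would wrap the collective in a torch.autograd.Function so that backward runs the inverse exchange:

```python
import torch
import torch.distributed as dist

def all_to_all(input, sp_group, scatter_dim, gather_dim):
    """Re-shard ``input``: ``scatter_dim`` becomes split across ranks
    while ``gather_dim`` becomes whole on every rank."""
    sp_size = dist.get_world_size(sp_group)
    if sp_size == 1:
        return input
    # Assumes input.size(scatter_dim) is divisible by sp_size.
    inputs = [t.contiguous() for t in input.chunk(sp_size, dim=scatter_dim)]
    outputs = [torch.empty_like(inputs[0]) for _ in inputs]
    dist.all_to_all(outputs, inputs, group=sp_group)
    return torch.cat(outputs, dim=gather_dim)

def pre_process_for_sequence_parallel_attn(q, k, v, sp_group):
    # (bs, seq_len/sp, heads, d) -> (bs, seq_len, heads/sp, d):
    # scatter heads, gather the full sequence before attention.
    return tuple(
        all_to_all(x, sp_group, scatter_dim=2, gather_dim=1)
        for x in (q, k, v))

def post_process_for_sequence_parallel_attn(attn_output, sp_group):
    # Reverse the exchange: back to (bs, seq_len/sp, heads, d).
    return all_to_all(attn_output, sp_group, scatter_dim=1, gather_dim=2)
```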
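On why grad_scale is needed: when a tensor is gathered in the forward pass and its gradient split in the backward pass, the gradient's magnitude can come out off by a factor of the group size, depending on whether the loss is summed or averaged across ranks, so the backward pass has to compensate. The helper below is hypothetical; its name, the `'up'`/`'down'` values, and the signature are illustrative, not xtuner's API:

```python
import torch
import torch.distributed as dist

class GatherForwardSplitBackward(torch.autograd.Function):
    """Hypothetical helper: gather shards in forward, split the gradient
    in backward, rescaling it so the gradient magnitude stays consistent
    with how the loss was reduced elsewhere."""

    @staticmethod
    def forward(ctx, input, dim, sp_group, grad_scale):
        ctx.dim, ctx.sp_group, ctx.grad_scale = dim, sp_group, grad_scale
        sp_size = dist.get_world_size(sp_group)
        outputs = [torch.empty_like(input) for _ in range(sp_size)]
        dist.all_gather(outputs, input.contiguous(), group=sp_group)
        return torch.cat(outputs, dim=dim)

    @staticmethod
    def backward(ctx, grad_output):
        sp_size = dist.get_world_size(ctx.sp_group)
        rank = dist.get_rank(ctx.sp_group)
        grad = grad_output.chunk(sp_size, dim=ctx.dim)[rank]
        if ctx.grad_scale == 'up':      # compensate for mean-reduction
            grad = grad * sp_size
        elif ctx.grad_scale == 'down':  # compensate for sum-reduction
            grad = grad / sp_size
        return grad, None, None, None
```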
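Finally, compute_sequence_parallel_loss (made private above) has to reduce the loss across ranks that hold different numbers of valid label tokens. A sketch of the reduction such a helper typically performs; the function name comes from the commit, while the arguments and shapes are assumptions:

```python
import torch
import torch.distributed as dist

def _compute_sequence_parallel_loss(loss_sum, num_tokens, sp_group):
    # Each rank sees only its shard of the sequence, and shards may hold
    # different numbers of valid (non-padded) label tokens. Averaging
    # per rank first would weight shards unevenly, so reduce the raw
    # sums across the group and divide once.
    pack = torch.stack([loss_sum, num_tokens.float()])
    dist.all_reduce(pack, op=dist.ReduceOp.SUM, group=sp_group)
    return pack[0] / pack[1]
```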
