
Conversation

@HIT-cwh commented Apr 9, 2024

Optimized the API design for increased generality.

@pppppM merged commit 376188a into InternLM:main Apr 15, 2024
@HIT-cwh deleted the refine_sp branch April 25, 2024 06:33
llkn-2 pushed a commit to llkn-2/xtuner that referenced this pull request Jul 31, 2024
* refine the split_for_sequence_parallel API: the tensor to be split need not have shape (bs, seq_len, dim) (sketched after this list)

* Expose the two interfaces, pre_process_for_sequence_parallel_attn and post_process_for_sequence_parallel_attn, to the user (see the sketch after this list)

* Assert PyTorch version != 2.1

* Remove the PyTorch version restriction when using sequence parallel

* refine all_to_all op

* split the sequence in sft rather than in data_collate_fn

* add sequence communications

* fix lint

* move all_to_all to communications

* make the compute_sequence_parallel_loss method private (reduction sketched after this list)

* fix sp docs

* rename

* add an explanation of why grad_scale is needed (see the note after this list)

* refine

* add docstring

* rename communications to comm
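
The generalized split from the first commit can be illustrated with a minimal sketch. The signature `split_for_sequence_parallel(input, dim, sp_group)` and the divisibility check are assumptions about the shape of the API, not a copy of xtuner's implementation:

```python
import torch.distributed as dist

def split_for_sequence_parallel(input, dim, sp_group):
    """Keep only this rank's chunk of ``input`` along ``dim``.

    Accepting an arbitrary ``dim`` is what makes the API general: the
    tensor no longer has to be laid out as (bs, seq_len, dim).
    """
    sp_size = dist.get_world_size(sp_group)
    if sp_size == 1:
        return input
    rank = dist.get_rank(sp_group)
    assert input.size(dim) % sp_size == 0, (
        f'dim {dim} (size {input.size(dim)}) must be divisible by the '
        f'sequence-parallel world size {sp_size}')
    return input.chunk(sp_size, dim=dim)[rank].contiguous()
```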
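The exposed attention interfaces rest on the refined all_to_all op: before attention, head shards are traded for sequence shards so each rank attends over the full sequence for a subset of heads, and the exchange is reversed afterwards. A forward-only sketch, assuming a (bs, seq_len, num_heads, head_dim) layout, head and sequence dims divisible by the group size, and an `sp_group` handle; a real op would wrap the collective in a torch.autograd.Function so that backward runs the inverse exchange:

```python
import torch
import torch.distributed as dist

def all_to_all(input, sp_group, scatter_dim, gather_dim):
    """Re-shard ``input``: ``scatter_dim`` becomes split across ranks
    while ``gather_dim`` becomes whole on every rank."""
    sp_size = dist.get_world_size(sp_group)
    if sp_size == 1:
        return input
    # Assumes input.size(scatter_dim) is divisible by sp_size.
    inputs = [t.contiguous() for t in input.chunk(sp_size, dim=scatter_dim)]
    outputs = [torch.empty_like(inputs[0]) for _ in inputs]
    dist.all_to_all(outputs, inputs, group=sp_group)
    return torch.cat(outputs, dim=gather_dim)

def pre_process_for_sequence_parallel_attn(q, k, v, sp_group):
    # (bs, seq_len/sp, heads, d) -> (bs, seq_len, heads/sp, d):
    # scatter heads, gather the full sequence before attention.
    return tuple(
        all_to_all(x, sp_group, scatter_dim=2, gather_dim=1)
        for x in (q, k, v))

def post_process_for_sequence_parallel_attn(attn_output, sp_group):
    # Reverse the exchange: back to (bs, seq_len/sp, heads, d).
    return all_to_all(attn_output, sp_group, scatter_dim=1, gather_dim=2)
```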
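On why grad_scale is needed: when a tensor is gathered in the forward pass and its gradient split in the backward pass, the gradient's magnitude can come out off by a factor of the group size, depending on whether the loss is summed or averaged across ranks, so the backward pass has to compensate. The helper below is hypothetical; its name, the `'up'`/`'down'` values, and the signature are illustrative, not xtuner's API:

```python
import torch
import torch.distributed as dist

class GatherForwardSplitBackward(torch.autograd.Function):
    """Hypothetical helper: gather shards in forward, split the gradient
    in backward, rescaling it so the gradient magnitude stays consistent
    with how the loss was reduced elsewhere."""

    @staticmethod
    def forward(ctx, input, dim, sp_group, grad_scale):
        ctx.dim, ctx.sp_group, ctx.grad_scale = dim, sp_group, grad_scale
        sp_size = dist.get_world_size(sp_group)
        outputs = [torch.empty_like(input) for _ in range(sp_size)]
        dist.all_gather(outputs, input.contiguous(), group=sp_group)
        return torch.cat(outputs, dim=dim)

    @staticmethod
    def backward(ctx, grad_output):
        sp_size = dist.get_world_size(ctx.sp_group)
        rank = dist.get_rank(ctx.sp_group)
        grad = grad_output.chunk(sp_size, dim=ctx.dim)[rank]
        if ctx.grad_scale == 'up':      # compensate for mean-reduction
            grad = grad * sp_size
        elif ctx.grad_scale == 'down':  # compensate for sum-reduction
            grad = grad / sp_size
        return grad, None, None, None
```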
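Finally, compute_sequence_parallel_loss (made private above) has to reduce the loss across ranks that hold different numbers of valid label tokens. A sketch of the reduction such a helper typically performs; the function name comes from the commit, while the arguments and shapes are assumptions:

```python
import torch
import torch.distributed as dist

def _compute_sequence_parallel_loss(loss_sum, num_tokens, sp_group):
    # Each rank sees only its shard of the sequence, and shards may hold
    # different numbers of valid (non-padded) label tokens. Averaging
    # per rank first would weight shards unevenly, so reduce the raw
    # sums across the group and divide once.
    pack = torch.stack([loss_sum, num_tokens.float()])
    dist.all_reduce(pack, op=dist.ReduceOp.SUM, group=sp_group)
    return pack[0] / pack[1]
```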
