[Quantization][MoE] remove unused ep logic from moe marlin #31571

jinzhen-lin · 2025-12-31T09:18:20Z

After #29642, EP (expert parallelism) support in the MoE Marlin kernel is no longer required, so this dead code can be removed.

Signed-off-by: Jinzhen Lin <jinzhen.ljz@antgroup.com>

gemini-code-assist

Code Review

This pull request removes unused is_ep (expert parallelism) logic from the MoE Marlin kernel. The changes are consistent across Python, C++, and CUDA files, simplifying the codebase by removing dead code. This refactoring is a good improvement for maintainability, as it makes the kernel logic more straightforward. The assumption is that expert parallelism details are now handled before the kernel is invoked, which is a sound design choice. The changes appear correct and well-executed.
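To illustrate what "handled before the kernel is invoked" could mean in practice, here is a minimal, hypothetical sketch in plain Python. It is not taken from this PR or from #29642. The helper names `build_expert_map` and `remap_topk_ids`, the even split of experts across ranks, and the use of -1 as a "not on this rank" marker are all assumptions made for illustration.

```python
from typing import List


def build_expert_map(num_global_experts: int, ep_size: int, ep_rank: int) -> List[int]:
    """Map each global expert id to a local id on this rank, or -1 if it lives elsewhere.

    Assumption: experts are split evenly and contiguously across ranks.
    """
    if num_global_experts % ep_size != 0:
        raise ValueError("this sketch assumes experts divide evenly across ranks")
    experts_per_rank = num_global_experts // ep_size
    start = ep_rank * experts_per_rank
    end = start + experts_per_rank
    return [g - start if start <= g < end else -1 for g in range(num_global_experts)]


def remap_topk_ids(topk_ids: List[List[int]], expert_map: List[int]) -> List[List[int]]:
    """Translate routed global expert ids to local ids before any kernel launch.

    Entries equal to -1 mark (token, expert) pairs owned by another rank, so a
    downstream kernel would only ever see local ids and needs no EP flag.
    """
    return [[expert_map[e] for e in row] for row in topk_ids]


# Example: 8 global experts, 2 EP ranks, viewed from rank 1 (owns experts 4..7).
expert_map = build_expert_map(num_global_experts=8, ep_size=2, ep_rank=1)
local_ids = remap_topk_ids([[0, 5], [4, 7], [2, 3]], expert_map)
print(expert_map)  # [-1, -1, -1, -1, 0, 1, 2, 3]
print(local_ids)   # [[-1, 1], [0, 3], [-1, -1]]
```

With remapping of this kind done on the host side, the kernel itself would not need a separate expert-parallel code path, which is the design the review describes.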

remove unused ep logic from moe marlin

a50eafd

Signed-off-by: Jinzhen Lin <jinzhen.ljz@antgroup.com>

jinzhen-lin requested review from mgoin and pavanimajety as code owners December 31, 2025 09:18

gemini-code-assist bot reviewed Dec 31, 2025
