[Kernel] optimize moe_align_block_size for cuda graph and large num_experts (e.g. DeepSeek-V3) #12222

Merged
simon-mo merged 7 commits into vllm-project:main from jinzhen-lin:optimize_moe_align_block_size on Jan 21, 2025

Commits

Commits on Jan 20, 2025