I've been trying out DeepSpeed for MoE inference, but I found that the code at lines 269-272 of DeepSpeed/deepspeed/inference/engine.py might be buggy.
As depicted in the figure below, at line 269, if dist.get_world_size() is smaller than moe_ep_size (say there are 2 GPUs and 8 experts, with moe_ep_size=8), num_ep_groups would be 0. In that case, the if branch at line 270 would never be reached.
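For clarity, here is a minimal sketch of the arithmetic I believe is happening. The function name and surrounding structure are my reconstruction, not the actual engine code; only the integer division and the loop bound mirror the lines cited above:

```python
# Hypothetical reconstruction of the expert-parallel group setup around
# deepspeed/inference/engine.py lines 269-272 (names mirror the source,
# structure is assumed for illustration).
def create_ep_groups(world_size: int, moe_ep_size: int):
    groups = []
    # Integer division: with world_size=2 and moe_ep_size=8 this is 0,
    # so the loop body (the "if" branch at line 270) never executes.
    num_ep_groups = world_size // moe_ep_size
    for i in range(num_ep_groups):
        ranks = list(range(i * moe_ep_size, (i + 1) * moe_ep_size))
        groups.append(ranks)
    return groups

print(create_ep_groups(2, 8))  # -> []  (no expert-parallel group is created)
print(create_ep_groups(8, 8))  # -> [[0, 1, 2, 3, 4, 5, 6, 7]]
```

With 2 GPUs and moe_ep_size=8, the result is an empty list, which is exactly the silent no-op I'm asking about.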
I wonder whether this is a redundant design, or whether DeepSpeed simply does not support the case where #devices < #experts (i.e., one device holds multiple experts) in MoE inference, even though I'm fairly sure this case is supported in training.
Besides, I'd like to know whether there are any off-the-shelf, open-source pre-trained MoE models that can be used with DeepSpeed inference to test expert parallelism.
Thanks. Any replies would be appreciated! @tjruwase