
COMET: Accelerating AI with Efficient MoE Communication
Solving the communication bottleneck in trillion-parameter AI models
COMET introduces a fine-grained computation-communication overlapping system that significantly reduces communication overhead in Mixture-of-Experts (MoE) models.
- Addresses a critical bottleneck where inter-device communication can consume 47% of execution time in large MoE models
- Employs data dependency analysis and fine-grained task rescheduling to overlap communication with expert computation (a simplified sketch of the idea follows this list)
- Achieves up to 1.76x speedup in MoE layer execution compared to state-of-the-art methods
- Enables more efficient scaling of trillion-parameter language models without proportional increases in computational costs
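To make the overlapping idea concrete, here is a minimal PyTorch-style sketch of chunked dispatch-compute pipelining, the coarse-grained analogue of what COMET does. It is illustrative only: the function name `moe_layer_chunked_overlap`, the `expert_mlp` callable, and the even token split are assumptions, and COMET itself fuses communication and computation at thread-block granularity inside GPU kernels rather than pipelining at the Python level.

```python
import torch
import torch.distributed as dist


def moe_layer_chunked_overlap(tokens, expert_mlp, group, num_chunks=4):
    """Illustrative pipeline: overlap the all-to-all dispatch of later
    token chunks with expert computation on earlier chunks.

    tokens     : [num_tokens, hidden] activations, assumed already routed
                 and evenly divisible so a plain all-to-all is valid.
    expert_mlp : callable applying the local experts to one chunk
                 (hypothetical stand-in for the expert FFN).
    group      : torch.distributed process group used for expert parallelism.
    """
    chunks = tokens.chunk(num_chunks, dim=0)
    dispatched = [torch.empty_like(c) for c in chunks]
    handles = []

    # Launch the asynchronous dispatch (all-to-all) for every chunk up front;
    # NCCL processes them in order on its communication stream.
    for src, dst in zip(chunks, dispatched):
        handles.append(
            dist.all_to_all_single(dst, src, group=group, async_op=True)
        )

    outputs = []
    for i, handle in enumerate(handles):
        # Wait only for the chunk we are about to compute on; later chunks
        # keep communicating while this one runs through the experts.
        handle.wait()
        outputs.append(expert_mlp(dispatched[i]))

    # The combine step (reverse all-to-all) is omitted for brevity; it can
    # be pipelined symmetrically.
    return torch.cat(outputs, dim=0)
```

The point of the pipeline is that while the local experts are computing on one chunk, the next chunk's all-to-all transfer is already in flight, so communication latency is hidden behind computation instead of being serialized before it.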
This research is crucial for engineering teams building large-scale AI systems, as it provides a practical approach to mitigate communication overhead—one of the primary bottlenecks in distributed MoE deployment.
Paper: Comet: Fine-grained Computation-communication Overlapping for Mixture-of-Experts