COMET: Accelerating AI with Efficient MoE Communication

Solving the communication bottleneck in trillion-parameter AI models

COMET introduces a fine-grained computation-communication overlapping system that significantly reduces communication overhead in Mixture-of-Experts (MoE) models; a minimal sketch of the core overlapping idea follows the list below.

  • Addresses a critical bottleneck where inter-device communication can consume 47% of execution time in large MoE models
  • Employs data dependency analysis and task rescheduling to overlap expert computation with communication (see the second sketch after this list)
  • Achieves up to 1.76x speedup in MoE layer execution compared to state-of-the-art methods
  • Enables more efficient scaling of trillion-parameter language models without proportional increases in computational costs
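
To make the overlapping idea concrete, here is a minimal single-GPU sketch, not COMET's actual implementation: the token batch is split into chunks, and a dedicated CUDA stream stages each chunk (a stand-in for the real inter-device all-to-all dispatch) while the expert GEMM for an already-staged chunk runs on a second stream. The tensor shapes, the chunk count, and the clone used in place of real communication are all illustrative assumptions.

```python
# Minimal sketch of fine-grained computation-communication overlap for a
# single MoE expert. NOT COMET's implementation: real inter-device
# communication (the all-to-all token dispatch) is stood in for by a clone
# issued on a separate CUDA stream.
import torch

assert torch.cuda.is_available(), "illustrative sketch requires a CUDA device"

comm_stream = torch.cuda.Stream()   # stream for data movement ("communication")
comp_stream = torch.cuda.Stream()   # stream for expert computation

tokens = torch.randn(8192, 4096, device="cuda")     # routed token activations (illustrative shape)
expert_w = torch.randn(4096, 4096, device="cuda")   # one expert's weight matrix (illustrative shape)

num_chunks = 4                                      # finer chunks -> more overlap, more launch overhead
chunks = tokens.chunk(num_chunks, dim=0)
staged = [None] * num_chunks
outputs = [None] * num_chunks
ready = [torch.cuda.Event() for _ in range(num_chunks)]

# Let the side streams see the fully initialized inputs from the default stream.
comm_stream.wait_stream(torch.cuda.current_stream())
comp_stream.wait_stream(torch.cuda.current_stream())

for i in range(num_chunks):
    with torch.cuda.stream(comm_stream):
        staged[i] = chunks[i].clone()       # "dispatch" chunk i
        ready[i].record(comm_stream)

    with torch.cuda.stream(comp_stream):
        comp_stream.wait_event(ready[i])    # compute on chunk i only waits for chunk i's dispatch,
        outputs[i] = staged[i] @ expert_w   # so the dispatch of chunk i+1 can run concurrently

torch.cuda.current_stream().wait_stream(comp_stream)
result = torch.cat(outputs, dim=0)          # combined expert output for all chunks
```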

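As a hedged illustration of the task-rescheduling point above (again, the general technique rather than COMET's kernel-level scheduler), this second sketch issues two asynchronous all-to-all dispatches up front and starts each expert GEMM as soon as its own data dependency, its dispatched chunk, has arrived, so the computation on the first chunk overlaps with the second dispatch. It assumes an initialized NCCL process group, equal split sizes, and a hypothetical helper name `moe_layer_overlapped`.

```python
# Hedged sketch of task rescheduling in an MoE layer: both all-to-all
# dispatches are issued asynchronously up front, and each expert GEMM starts
# as soon as its own chunk (its only data dependency) has arrived.
# Assumes an initialized NCCL process group; the helper name, 2-way split,
# and shapes are illustrative, not part of COMET's API.
import torch
import torch.distributed as dist

def moe_layer_overlapped(local_tokens: torch.Tensor,
                         expert_w: torch.Tensor,
                         group=None) -> torch.Tensor:
    first, second = local_tokens.chunk(2, dim=0)
    recv_first = torch.empty_like(first)
    recv_second = torch.empty_like(second)

    # Launch both dispatches without blocking; they queue on the NCCL stream.
    work_first = dist.all_to_all_single(recv_first, first, group=group, async_op=True)
    work_second = dist.all_to_all_single(recv_second, second, group=group, async_op=True)

    work_first.wait()
    out_first = recv_first @ expert_w     # overlaps with the second dispatch still in flight

    work_second.wait()
    out_second = recv_second @ expert_w

    return torch.cat([out_first, out_second], dim=0)
```
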
This research is directly relevant to engineering teams building large-scale AI systems, as it provides a practical approach to mitigating communication overhead, one of the primary bottlenecks in distributed MoE deployment.

Comet: Fine-grained Computation-communication Overlapping for Mixture-of-Experts
