
Making LLMs More Efficient
Scaling 300B-parameter models without premium hardware
The Ling team has developed an optimized approach for training massive Mixture-of-Experts (MoE) language models with standard computing resources.
Key innovations:
- Created two MoE models: Ling-Lite (16.8B parameters) and Ling-Plus (290B parameters); see the MoE sketch after this list
- Achieved efficient training through optimizations to the model architecture, training process, and anomaly handling
- Eliminated the need for premium GPUs while maintaining comparable performance
- Demonstrated a cost-effective scaling approach for large language models
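To make the Mixture-of-Experts term above concrete, here is a minimal sketch of a top-k routed MoE layer in PyTorch. The expert count, hidden sizes, and routing scheme are illustrative assumptions only and do not reflect the actual Ling architecture or the team's training optimizations.

```python
# Minimal sketch of a top-k routed Mixture-of-Experts (MoE) layer.
# All sizes below (d_model, d_ff, num_experts, top_k) are assumptions
# for illustration, not the Ling models' real configuration.
import torch
import torch.nn as nn
import torch.nn.functional as F


class MoELayer(nn.Module):
    def __init__(self, d_model=512, d_ff=2048, num_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        # Router assigns each token a score per expert.
        self.router = nn.Linear(d_model, num_experts, bias=False)
        # Each expert is a small feed-forward block; only top_k run per token.
        self.experts = nn.ModuleList(
            [
                nn.Sequential(
                    nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model)
                )
                for _ in range(num_experts)
            ]
        )

    def forward(self, x):  # x: (batch, seq, d_model)
        scores = self.router(x)                        # (batch, seq, num_experts)
        weights, indices = scores.topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)           # normalize over chosen experts
        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = indices[..., slot] == e         # tokens routed to expert e
                if mask.any():
                    out[mask] += weights[..., slot][mask].unsqueeze(-1) * expert(x[mask])
        return out


if __name__ == "__main__":
    layer = MoELayer()
    tokens = torch.randn(2, 16, 512)
    print(layer(tokens).shape)  # torch.Size([2, 16, 512])
```

The point of the sketch is that only top_k experts run for each token, which is why an MoE model with hundreds of billions of total parameters activates only a fraction of its weights on any forward pass.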
Why it matters: This research democratizes access to cutting-edge LLM technology by reducing hardware barriers and computational costs, enabling more organizations to develop advanced AI capabilities.
Every FLOP Counts: Scaling a 300B Mixture-of-Experts LING LLM without Premium GPUs