Optimizing LLMs for Edge Computing

Addressing the challenges of deploying powerful AI on resource-constrained devices

This survey examines how large language models (LLMs) can be deployed effectively on edge devices despite those devices' computational and memory limitations.

Key findings:

  • Edge LLMs require resource-efficient model designs to function on devices with limited processing power and memory
  • Pre-deployment strategies and runtime inference optimizations are critical for practical implementation
  • Solutions must address hardware heterogeneity across diverse edge environments
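One of the pre-deployment strategies surveyed is weight quantization, which shrinks a model's memory footprint before it ever reaches the device. The sketch below is a minimal, illustrative example of symmetric int8 post-training quantization; the function names and per-tensor scaling scheme are assumptions for illustration, not the API of any particular framework.

```python
# Minimal sketch of symmetric int8 post-training weight quantization,
# a common pre-deployment strategy for reducing LLM memory footprints.
# Names and the per-tensor scaling scheme are illustrative assumptions.
import numpy as np

def quantize_int8(weights: np.ndarray):
    """Map float weights to int8 values plus a per-tensor scale factor."""
    scale = np.max(np.abs(weights)) / 127.0  # largest magnitude maps to 127
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover an approximation of the original float weights."""
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.normal(size=(4, 4)).astype(np.float32)
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)
# int8 storage is 4x smaller than float32; the per-weight error
# is bounded by half the quantization step (scale / 2)
print(np.max(np.abs(w - w_hat)) <= scale / 2 + 1e-6)  # → True
```

In practice, production schemes use finer granularity (per-channel or per-group scales) and lower bit widths, but the core trade of precision for memory is the same.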

This research is significant for engineering teams developing AI applications for smartphones, IoT devices, and other edge computing scenarios where on-device AI processing provides privacy, latency, and connectivity advantages.

A Review on Edge Large Language Models: Design, Execution, and Applications
