Optimizing LLMs for Edge Devices

Novel memory-efficient techniques for low-power environments

MEADOW introduces a memory-efficient approach for running Large Language Models on edge devices under tight memory and power budgets.

  • Combines an innovative dataflow architecture with custom data packing to minimize memory requirements (see the packing sketch after this list)
  • Achieves up to 3.21x reduction in model operation costs compared to traditional GEMM-based methods
  • Implements dynamic resource allocation to maximize inference throughput on FPGA platforms
  • Demonstrates practical deployment of LLMs in resource-constrained environments without sacrificing performance
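
The slide names custom data packing but not its format, so the sketch below is only a minimal illustration of the general idea: storing two signed 4-bit quantized weights per byte to halve the weight-memory footprint. The function names, the int4 format, and the NumPy implementation are assumptions for illustration, not the paper's actual scheme.

```python
# Minimal illustration only: MEADOW's real packing scheme is not given on this
# slide. Here, two signed 4-bit (int4) weights are packed into each uint8,
# halving the memory needed to store a quantized weight tensor.
import numpy as np

def pack_int4(weights: np.ndarray) -> np.ndarray:
    """Pack an even-length array of int4 values (-8..7) into bytes, two per byte."""
    assert weights.size % 2 == 0, "pad to an even count before packing"
    nibbles = (weights.astype(np.int8) & 0x0F).astype(np.uint8)  # two's-complement nibbles
    return (nibbles[0::2] << 4) | nibbles[1::2]                  # high nibble | low nibble

def unpack_int4(packed: np.ndarray) -> np.ndarray:
    """Inverse of pack_int4: recover signed int4 values from packed bytes."""
    out = np.empty(packed.size * 2, dtype=np.int8)
    out[0::2] = packed >> 4                                      # high nibble
    out[1::2] = packed & 0x0F                                    # low nibble
    return np.where(out > 7, out - 16, out).astype(np.int8)     # sign-extend

weights = np.array([-3, 7, 0, -8], dtype=np.int8)
packed = pack_int4(weights)          # 2 bytes instead of 4
assert np.array_equal(unpack_int4(packed), weights)
```

In an edge deployment, the packed layout would typically be chosen to match the accelerator's dataflow, so weights can be unpacked on the fly without extra off-chip memory traffic.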

This research enables wider adoption of AI capabilities in IoT, mobile, and embedded systems where power and memory are severely limited, opening new possibilities for edge computing applications.

MEADOW: Memory-efficient Dataflow and Data Packing for Low Power Edge LLMs
