
Optimizing LLMs for Edge Devices
Novel memory-efficient techniques for low-power environments
MEADOW (Memory-efficient Dataflow and Data Packing for Low Power Edge LLMs) introduces an approach for running Large Language Models on edge devices under tight memory and power constraints.
- Combines a memory-efficient dataflow architecture with custom data packing to minimize memory requirements (see the packing sketch after this list)
- Achieves up to 3.21x reduction in model operation costs compared to traditional GEMM-based methods
- Implements dynamic resource allocation to maximize inference throughput on FPGA platforms
- Demonstrates practical deployment of LLMs in resource-constrained environments without sacrificing performance
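The summary above does not include reference code, so here is a minimal NumPy sketch of the general idea behind low-bit weight packing (two 4-bit codes stored per byte). The quantization scheme, function names, and sizes are illustrative assumptions, not MEADOW's actual packing format.

```python
import numpy as np

def pack_int4(codes: np.ndarray) -> np.ndarray:
    """Pack an even-length array of 4-bit codes (0..15) into bytes, two per byte."""
    assert codes.size % 2 == 0
    pairs = codes.astype(np.uint8).reshape(-1, 2)
    return (pairs[:, 0] | (pairs[:, 1] << 4)).astype(np.uint8)

def unpack_int4(packed: np.ndarray) -> np.ndarray:
    """Recover the interleaved 4-bit codes from the packed byte array."""
    lo = packed & 0x0F
    hi = packed >> 4
    return np.stack([lo, hi], axis=1).reshape(-1)

# Toy example: quantize fp32 weights to 4-bit codes (simple symmetric
# quantization, shifted to the unsigned range 0..15), then pack.
rng = np.random.default_rng(0)
w_fp32 = rng.standard_normal(1024).astype(np.float32)
scale = np.abs(w_fp32).max() / 7.0
codes = np.clip(np.round(w_fp32 / scale), -8, 7).astype(np.int8) + 8

packed = pack_int4(codes)
restored = (unpack_int4(packed).astype(np.int8) - 8) * scale

print(f"fp32 size:   {w_fp32.nbytes} bytes")  # 4096 bytes
print(f"packed size: {packed.nbytes} bytes")  # 512 bytes, an 8x reduction
print(f"max error:   {np.abs(w_fp32 - restored).max():.4f}")
```

The point of the sketch is the memory arithmetic: packed 4-bit weights occupy one eighth of the fp32 footprint, which is what makes it feasible to keep more of a model resident in the small on-chip memory of an FPGA or microcontroller.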
This research enables wider adoption of AI capabilities in IoT, mobile, and embedded systems where power and memory are severely limited, opening new possibilities for edge computing applications.