Optimizing LLMs for Edge Devices

Novel memory-efficient techniques for low-power environments

MEADOW introduces a memory-efficient approach for running Large Language Models on edge devices under tight memory and power budgets.

  • Combines an innovative dataflow architecture with custom data packing to minimize memory requirements (see the packing sketch after this list)
  • Achieves up to 3.21x reduction in model operation costs compared to traditional GEMM-based methods
  • Implements dynamic resource allocation to maximize inference throughput on FPGA platforms
  • Demonstrates practical deployment of LLMs in resource-constrained environments without sacrificing performance
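
The slide names custom data packing but not its format, so the sketch below is only a minimal illustration of the general idea: storing two signed 4-bit quantized weights per byte to halve the weight-memory footprint. The function names, the int4 format, and the NumPy implementation are assumptions for illustration, not the paper's actual scheme.

```python
# Minimal illustration only: MEADOW's real packing scheme is not given on this
# slide. Here, two signed 4-bit (int4) weights are packed into each uint8,
# halving the memory needed to store a quantized weight tensor.
import numpy as np

def pack_int4(weights: np.ndarray) -> np.ndarray:
    """Pack an even-length array of int4 values (-8..7) into bytes, two per byte."""
    assert weights.size % 2 == 0, "pad to an even count before packing"
    nibbles = (weights.astype(np.int8) & 0x0F).astype(np.uint8)  # two's-complement nibbles
    return (nibbles[0::2] << 4) | nibbles[1::2]                  # high nibble | low nibble

def unpack_int4(packed: np.ndarray) -> np.ndarray:
    """Inverse of pack_int4: recover signed int4 values from packed bytes."""
    out = np.empty(packed.size * 2, dtype=np.int8)
    out[0::2] = packed >> 4                                      # high nibble
    out[1::2] = packed & 0x0F                                    # low nibble
    return np.where(out > 7, out - 16, out).astype(np.int8)     # sign-extend

weights = np.array([-3, 7, 0, -8], dtype=np.int8)
packed = pack_int4(weights)          # 2 bytes instead of 4
assert np.array_equal(unpack_int4(packed), weights)
```

In an edge deployment, the packed layout would typically be chosen to match the accelerator's dataflow, so weights can be unpacked on the fly without extra off-chip memory traffic.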

This research enables wider adoption of AI capabilities in IoT, mobile, and embedded systems where power and memory are severely limited, opening new possibilities for edge computing applications.

MEADOW: Memory-efficient Dataflow and Data Packing for Low Power Edge LLMs
