Optimizing Vision-Language Models for Edge Devices

This comprehensive survey explores how Vision-Language Models (VLMs) can be optimized for deployment on resource-constrained edge devices.

Examines techniques for adapting powerful VLMs to operate within processing, memory, and energy constraints
Covers applications across domains including autonomous vehicles, smart surveillance, and healthcare
Analyzes optimization approaches that enable real-time visual understanding in edge networks

For the healthcare sector, the techniques surveyed could support medical imaging analysis, remote patient monitoring, and diagnostic support directly on edge devices without requiring cloud connectivity, which can improve privacy and reduce latency for critical applications.

Vision-Language Models for Edge Networks: A Comprehensive Survey