
MLLMs for the Visually Impaired
How Multimodal LLMs Are Transforming Visual Interpretation Tools
This study examines how blind and low vision (BLV) individuals use and experience visual interpretation applications powered by multimodal large language models (MLLMs).
- BLV users increasingly rely on AI-powered tools for daily visual interpretation needs
- MLLMs offer more descriptive and contextual visual interpretations than previous technologies
- Users show growing trust in these applications, including for safety-critical tasks such as medication identification
- Research reveals both the promise and potential risks of MLLM adoption in assistive technology
Medical Impact: The findings highlight important safety considerations: as BLV individuals increasingly trust these applications for medication identification and dosage interpretation, there is a growing need for improved reliability and appropriate user guidance in medical contexts.