
MLLMs for the Visually Impaired
How Multimodal LLMs Are Transforming Visual Interpretation Tools
This study examines how blind and low vision (BLV) individuals use and experience visual interpretation applications powered by multimodal large language models (MLLMs).
- BLV users increasingly rely on AI-powered tools for daily visual interpretation needs
- MLLMs offer more descriptive and contextual visual interpretations than previous technologies
- Users show growing trust in these applications, including for safety-critical tasks such as medication identification
- Research reveals both the promise and potential risks of MLLM adoption in assistive technology
Medical Impact: The findings highlight important safety considerations: as BLV individuals increasingly trust these applications for medication identification and dosage interpretation, there is a growing need for improved reliability and appropriate user guidance in medical contexts.