BadVision: The Backdoor Threat to Vision Language Models

How stealthy attacks can induce hallucinations in LVLMs

This research reveals a novel backdoor attack that exploits self-supervised learning (SSL) vision encoders to compromise Large Vision Language Models (LVLMs), causing them to hallucinate visual content.

  • Introduces stealthy attack vectors targeting widely shared pre-trained vision encoders (see the sketch after this list)
  • Demonstrates how malicious actors can induce visual hallucinations in LVLMs
  • Shows these backdoor attacks remain effective even after additional fine-tuning
  • Highlights the security implications for models using shared vision encoders
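
To make the threat model concrete, below is a minimal, hypothetical sketch of the core idea behind such encoder-level backdoors: optimizing a small, visually inconspicuous trigger so that a frozen, shared SSL vision encoder maps triggered images near an attacker-chosen target embedding, which a downstream LVLM built on that encoder would then describe instead of the real image content. This is not the paper's actual method; the function name `optimize_trigger` and variables `encoder`, `images`, `target_embedding`, and `epsilon` are illustrative assumptions.

```python
# Hypothetical sketch: learning a universal, bounded trigger against a frozen
# SSL vision encoder. All names are illustrative, not taken from the paper.
import torch
import torch.nn.functional as F


def optimize_trigger(encoder, images, target_embedding,
                     epsilon=8 / 255, steps=200, lr=1e-2):
    """Learn a small additive trigger, bounded by `epsilon` for stealthiness."""
    encoder.eval()
    for p in encoder.parameters():
        p.requires_grad_(False)  # the shared encoder itself is not modified

    trigger = torch.zeros_like(images[0], requires_grad=True)  # one universal patch
    optimizer = torch.optim.Adam([trigger], lr=lr)

    for _ in range(steps):
        # Apply the bounded trigger to every clean image in the batch.
        poisoned = torch.clamp(images + torch.clamp(trigger, -epsilon, epsilon), 0, 1)
        embeddings = encoder(poisoned)  # frozen SSL encoder, no labels needed

        # Pull triggered embeddings toward the attacker-chosen target embedding,
        # so an LVLM built on this encoder "sees" the target content instead.
        loss = 1 - F.cosine_similarity(
            embeddings, target_embedding.expand_as(embeddings)
        ).mean()

        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

    return torch.clamp(trigger.detach(), -epsilon, epsilon)
```

Because the trigger is optimized purely against the encoder's embedding space, the sketch also suggests why such backdoors can survive downstream fine-tuning: the poisoned behavior lives in the shared visual representation rather than in any task-specific head.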

This work matters for security teams: it exposes a vulnerability in a foundational component of modern LVLMs and motivates new defensive strategies against induced visual hallucinations in security-critical applications.

Paper: Stealthy Backdoor Attack in Self-Supervised Learning Vision Encoders for Large Vision Language Models
