
BadVision: The Backdoor Threat to Vision Language Models
How stealthy attacks can induce hallucinations in LVLMs
This research reveals a novel backdoor attack that exploits self-supervised learning (SSL) vision encoders to compromise Large Vision Language Models, causing them to hallucinate visual content.
- Introduces stealthy attack vectors targeting widely shared pre-trained vision encoders
- Demonstrates how malicious actors can induce visual hallucinations in LVLMs
- Shows these backdoor attacks remain effective even after additional fine-tuning
- Highlights the security implications for models using shared vision encoders
This work is critically important for security teams as it exposes vulnerabilities in a foundational component of modern LVLMs, requiring new defensive strategies to protect against visual hallucinations in security-critical applications.