BadVision: The Backdoor Threat to Vision Language Models

How stealthy attacks can induce hallucinations in LVLMs

This research reveals a novel backdoor attack that exploits self-supervised learning (SSL) vision encoders to compromise Large Vision Language Models (LVLMs), causing them to hallucinate visual content.

  • Introduces stealthy attack vectors targeting widely shared pre-trained vision encoders (see the sketch after this list)
  • Demonstrates how malicious actors can induce visual hallucinations in LVLMs
  • Shows these backdoor attacks remain effective even after additional fine-tuning
  • Highlights the security implications for models using shared vision encoders
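
To make the threat model concrete, below is a minimal, hypothetical sketch of the core idea behind such encoder-level backdoors: optimizing a small, visually inconspicuous trigger so that a frozen, shared SSL vision encoder maps triggered images near an attacker-chosen target embedding, which a downstream LVLM built on that encoder would then describe instead of the real image content. This is not the paper's actual method; the function name `optimize_trigger` and variables `encoder`, `images`, `target_embedding`, and `epsilon` are illustrative assumptions.

```python
# Hypothetical sketch: learning a universal, bounded trigger against a frozen
# SSL vision encoder. All names are illustrative, not taken from the paper.
import torch
import torch.nn.functional as F


def optimize_trigger(encoder, images, target_embedding,
                     epsilon=8 / 255, steps=200, lr=1e-2):
    """Learn a small additive trigger, bounded by `epsilon` for stealthiness."""
    encoder.eval()
    for p in encoder.parameters():
        p.requires_grad_(False)  # the shared encoder itself is not modified

    trigger = torch.zeros_like(images[0], requires_grad=True)  # one universal patch
    optimizer = torch.optim.Adam([trigger], lr=lr)

    for _ in range(steps):
        # Apply the bounded trigger to every clean image in the batch.
        poisoned = torch.clamp(images + torch.clamp(trigger, -epsilon, epsilon), 0, 1)
        embeddings = encoder(poisoned)  # frozen SSL encoder, no labels needed

        # Pull triggered embeddings toward the attacker-chosen target embedding,
        # so an LVLM built on this encoder "sees" the target content instead.
        loss = 1 - F.cosine_similarity(
            embeddings, target_embedding.expand_as(embeddings)
        ).mean()

        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

    return torch.clamp(trigger.detach(), -epsilon, epsilon)
```

Because the trigger is optimized purely against the encoder's embedding space, the sketch also suggests why such backdoors can survive downstream fine-tuning: the poisoned behavior lives in the shared visual representation rather than in any task-specific head.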

This work matters for security teams: it exposes a vulnerability in a foundational component of modern LVLMs and motivates new defensive strategies against induced visual hallucinations in security-critical applications.

Paper: Stealthy Backdoor Attack in Self-Supervised Learning Vision Encoders for Large Vision Language Models
