
Securing Vision-Language Models Against Noise Attacks
New defenses against jailbreak attacks using noisy images
This research identifies a critical security gap in Vision-Language Models (VLMs) that leaves them vulnerable to jailbreak attacks using noise-augmented images, and proposes defense mechanisms to close it.
- VLMs are susceptible to perturbation-based attacks in which Gaussian noise added to an image can bypass safety measures (see the sketch after this list)
- The paper introduces Robust-VLGuard and DiffPure-VLM, two defense strategies that significantly improve security against these attacks (a diffusion-purification sketch appears at the end of this note)
- Testing across multiple VLMs showed these defenses can reduce attack success rates by up to 90%
- Implementation requires minimal computational overhead while maintaining model performance
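To make the first point concrete, the sketch below adds Gaussian noise to an image before it reaches a VLM, which is one way to probe how stable a model's safety behavior is under perturbation. This is a minimal illustration under assumptions, not the paper's attack pipeline; `vlm_generate` and the sigma values are placeholders.

```python
import torch
from PIL import Image
from torchvision import transforms

def add_gaussian_noise(image: Image.Image, sigma: float = 0.1) -> torch.Tensor:
    """Return the image as a tensor in [0, 1] with zero-mean Gaussian noise added."""
    x = transforms.ToTensor()(image)      # C x H x W, values in [0, 1]
    noise = torch.randn_like(x) * sigma   # sigma controls perturbation strength
    return (x + noise).clamp(0.0, 1.0)

def noise_sweep(image: Image.Image, prompt: str, vlm_generate, sigmas=(0.05, 0.1, 0.2)):
    """Query a VLM with increasingly noisy versions of the same image.

    `vlm_generate(image_tensor, prompt)` is a hypothetical wrapper around
    whatever model is under test; comparing its responses across sigmas shows
    whether safety behavior degrades as the perturbation grows.
    """
    return {sigma: vlm_generate(add_gaussian_noise(image, sigma), prompt)
            for sigma in sigmas}
```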
For security professionals, this research highlights the importance of building noise robustness into VLM training and deployment, closing a significant vulnerability in AI systems that process both visual and textual content.
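At deployment time, one defense in the spirit of DiffPure-VLM is diffusion purification: the incoming image is partially diffused with forward noise and then denoised with a pretrained diffusion model before it reaches the VLM, washing out adversarial perturbations. The sketch below illustrates the general idea using the Hugging Face `diffusers` library; the checkpoint name, timestep, and value ranges are assumptions for illustration, not the paper's implementation.

```python
import torch
from diffusers import DDPMScheduler, UNet2DModel

def diffusion_purify(image: torch.Tensor, t_star: int = 100,
                     model_id: str = "google/ddpm-cat-256") -> torch.Tensor:
    """Purify a batch of images (B x C x H x W, values scaled to [-1, 1]).

    The input is diffused to timestep `t_star` and then denoised back to
    timestep 0 with a pretrained unconditional DDPM, so small adversarial
    perturbations are absorbed into the added noise and removed.
    """
    model = UNet2DModel.from_pretrained(model_id)
    scheduler = DDPMScheduler.from_pretrained(model_id)
    scheduler.set_timesteps(scheduler.config.num_train_timesteps)

    # Forward process: jump directly to the noise level of timestep t_star.
    noise = torch.randn_like(image)
    noisy = scheduler.add_noise(image, noise, torch.tensor([t_star]))

    # Reverse process: step from t_star back down to 0.
    sample = noisy
    for t in scheduler.timesteps:
        if t > t_star:
            continue
        with torch.no_grad():
            residual = model(sample, t).sample
        sample = scheduler.step(residual, t, sample).prev_sample
    return sample.clamp(-1.0, 1.0)

# The purified tensor is then rescaled and handed to the VLM in place of the
# raw input, so an attack must survive the denoiser to reach the model.
```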