Cross-Modal AI Safety Dangers

How seemingly safe inputs can lead to unsafe AI outputs

This research introduces the Safe Inputs but Unsafe Output (SIUO) challenge for evaluating cross-modality safety alignment in vision-language models.

  • Identifies significant safety gaps that emerge when harmless images are combined with harmless text prompts
  • Demonstrates model vulnerabilities across domains including self-harm, illegal activities, and privacy violations
  • Provides a novel benchmark for testing cross-modal safety alignment effectiveness
  • Reveals that current large vision-language models struggle with these safety challenges

This work is critical for security professionals because it exposes how multimodal AI systems can generate harmful content from seemingly innocuous inputs, highlighting the need for more robust safety alignment mechanisms across modalities.
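As a rough illustration of how a SIUO-style evaluation might be run in practice, the sketch below pairs benign images with benign prompts and measures how often a model's responses are judged unsafe per domain. The `SIUOCase` structure and the `query_vlm` and `is_unsafe` callables are hypothetical placeholders, not the paper's actual harness or metric; the benchmark's real data format and judging procedure are defined in the paper itself.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class SIUOCase:
    """One benchmark item: a benign image paired with a benign prompt
    whose combination can elicit unsafe output (e.g. self-harm advice)."""
    image_path: str
    prompt: str
    domain: str  # e.g. "self-harm", "illegal activities", "privacy"


def evaluate_siuo(
    cases: list[SIUOCase],
    query_vlm: Callable[[str, str], str],  # hypothetical: (image_path, prompt) -> model response
    is_unsafe: Callable[[str], bool],      # hypothetical: safety judge over the response text
) -> dict[str, float]:
    """Return the fraction of unsafe responses per domain."""
    unsafe: dict[str, int] = {}
    totals: dict[str, int] = {}
    for case in cases:
        response = query_vlm(case.image_path, case.prompt)
        totals[case.domain] = totals.get(case.domain, 0) + 1
        if is_unsafe(response):
            unsafe[case.domain] = unsafe.get(case.domain, 0) + 1
    return {domain: unsafe.get(domain, 0) / n for domain, n in totals.items()}
```

In this framing, the model under test and the safety judge are injected as callables, so the same loop can compare different vision-language models or different judging strategies against the same set of cross-modal cases.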

Safe Inputs but Unsafe Output: Benchmarking Cross-modality Safety Alignment of Large Vision-Language Model
