Cross-Modal Safety Vulnerabilities in AI

Cross-Modal Safety Vulnerabilities in AI

When safe inputs still produce unsafe outputs in vision-language models

This research introduces the Safe Inputs but Unsafe Output (SIUO) challenge, exposing critical safety alignment gaps in large vision-language models.

  • Models can generate unsafe content even when individual inputs appear harmless
  • Cross-modal interactions create new vulnerability patterns that bypass traditional safety guardrails
  • Researchers developed a comprehensive benchmark spanning security, ethical, legal, and medical domains
  • Current models demonstrate concerning safety gaps when processing multi-modal inputs

For security professionals, this work highlights urgent safety alignment needs in deployed AI systems that handle both visual and textual information, as seemingly harmless inputs can trigger harmful outputs when combined.

Safe Inputs but Unsafe Output: Benchmarking Cross-modality Safety Alignment of Large Vision-Language Model

3 | 100