Cross-Modal Safety Vulnerabilities in AI

This research introduces the Safe Inputs but Unsafe Output (SIUO) challenge, exposing critical safety alignment gaps in large vision-language models.

Models can generate unsafe content even when individual inputs appear harmless
Cross-modal interactions create new vulnerability patterns that bypass traditional safety guardrails
Researchers developed a comprehensive benchmark spanning security, ethical, legal, and medical domains
Current models demonstrate concerning safety gaps when processing multi-modal inputs

For security professionals, this work highlights urgent safety alignment needs in deployed AI systems that handle both visual and textual information, as seemingly harmless inputs can trigger harmful outputs when combined.

Safe Inputs but Unsafe Output: Benchmarking Cross-modality Safety Alignment of Large Vision-Language Model