Flowchart-Based Security Exploit in Vision-Language Models

Novel attack vectors bypass safety guardrails in leading LVLMs

Researchers have discovered a significant vulnerability in Large Vision-Language Models (LVLMs): automatically generated flowcharts can be used to bypass their safety mechanisms.

  • FC-Attack leverages flowcharts whose steps embed partially harmful content to induce models to generate unsafe responses (see the sketch after this list)
  • Successfully tested across multiple leading LVLMs including GPT-4V and Claude
  • The attack achieves a success rate of up to 92.2% on certain models
  • Existing defense mechanisms prove insufficient against this novel attack strategy

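To make the attack pipeline concrete, below is a minimal sketch of the flowchart-rendering stage, assuming Graphviz for layout. The step texts are benign placeholders, and `steps_to_flowchart` and `query_vlm` are illustrative names, not the authors' implementation; in the described attack, the rendered image would carry the decomposed steps while the accompanying text prompt stays innocuous.

```python
# Minimal sketch of a flowchart-to-image pipeline, assuming the Graphviz
# Python package. Names and content here are illustrative placeholders,
# not the FC-Attack authors' code.
import graphviz

def steps_to_flowchart(steps: list[str], path: str = "flowchart") -> str:
    """Render an ordered list of step descriptions as a top-down flowchart PNG."""
    dot = graphviz.Digraph(graph_attr={"rankdir": "TB"})
    for i, step in enumerate(steps):
        dot.node(str(i), f"Step {i + 1}: {step}", shape="box")
        if i > 0:
            dot.edge(str(i - 1), str(i))  # connect each step to the next
    # Returns the rendered file path, e.g. "flowchart.png"
    return dot.render(path, format="png", cleanup=True)

def query_vlm(image_path: str, prompt: str) -> str:
    """Hypothetical stand-in for an LVLM API call (image + text prompt)."""
    raise NotImplementedError("wire up your model's vision API here")

# Benign placeholder steps; the attack pairs such an image with an
# innocuous text prompt asking the model to elaborate on each step.
image = steps_to_flowchart(["Gather ingredients", "Mix", "Bake"])
# response = query_vlm(image, "Describe how to complete the task in this flowchart.")
```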
This research highlights critical security implications for AI deployment in enterprise environments, demonstrating how visual content processing remains a vulnerable attack surface even in safety-aligned systems.

FC-Attack: Jailbreaking Large Vision-Language Models via Auto-Generated Flowcharts
