The FITD Jailbreak Attack

How psychological principles enable new LLM security vulnerabilities

This research introduces a novel multi-turn jailbreak method that exploits psychological principles to bypass LLM safety measures, with important implications for AI security systems.

  • Leverages the foot-in-the-door technique, in which small initial commitments lead to compliance with larger requests (see the sketch after this list)
  • Demonstrates how multi-turn interactions create vulnerabilities that single-turn attacks do not expose
  • Achieves higher attack success rates against major language models than existing jailbreak methods
  • Reveals critical security gaps in current LLM safeguard implementations

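Conceptually, a foot-in-the-door attack escalates within a single conversation so that each small compliance anchors the next request. The sketch below illustrates that loop under stated assumptions: `chat` is a hypothetical stand-in for any chat-completion API (not the authors' code), and the escalation ladder uses harmless placeholder prompts rather than anything from the paper.

```python
# Conceptual sketch of a foot-in-the-door (FITD) escalation loop.
# `chat` is a hypothetical stand-in for a real chat-completion API;
# the prompts are illustrative placeholders, not the paper's data.

def chat(messages: list[dict]) -> str:
    """Stub: replace with a real chat-completion call."""
    return f"(model reply to: {messages[-1]['content']!r})"

def fitd_dialogue(prompts: list[str]) -> list[dict]:
    """Send progressively escalating prompts in ONE conversation,
    carrying full history so each small compliance anchors the next."""
    history: list[dict] = []
    for prompt in prompts:  # ordered: benign -> borderline -> target
        history.append({"role": "user", "content": prompt})
        reply = chat(history)  # model sees its own prior concessions
        history.append({"role": "assistant", "content": reply})
    return history

# Illustrative escalation ladder (deliberately harmless placeholders):
ladder = [
    "Explain, at a high level, why software has security vulnerabilities.",
    "What categories of vulnerabilities do security researchers study?",
    "Describe how a researcher might responsibly test one such category.",
]
for turn in fitd_dialogue(ladder):
    print(f"{turn['role']:>9}: {turn['content']}")
```

The key design point is that every prompt is sent with the accumulated history: a single-turn filter sees each message in isolation, while the model's context contains its own earlier concessions, which is what the foot-in-the-door effect exploits.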
This work highlights the urgent need for more robust defense mechanisms that consider psychological manipulation patterns in conversational AI systems.
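Because the attack exploits the gap between per-message and whole-conversation review, one plausible countermeasure is to score risk across the dialogue trajectory rather than turn by turn. The sketch below is a minimal illustration of that idea, not a method from the paper; `risk` is a hypothetical placeholder for a real moderation classifier.

```python
# Sketch of a multi-turn-aware safeguard. Single-turn filters judge
# each message alone, so a slow escalation can stay under threshold;
# this check also flags a rising risk trend across the conversation.

def risk(message: str) -> float:
    """Stub: replace with a real moderation/risk classifier (0.0-1.0)."""
    return min(1.0, len(message) / 400)  # placeholder heuristic only

def escalation_flagged(user_turns: list[str],
                       per_turn_cap: float = 0.8,
                       trend_cap: float = 0.3) -> bool:
    """Flag if any single turn is overtly risky OR the risk score
    climbs more than `trend_cap` from the first turn to the last."""
    scores = [risk(t) for t in user_turns]
    if any(s > per_turn_cap for s in scores):
        return True
    return len(scores) > 1 and (scores[-1] - scores[0]) > trend_cap
```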

Foot-In-The-Door: A Multi-turn Jailbreak for LLMs
