Security Blind Spots in Small Language Models

This research provides the first comprehensive security assessment of Small Language Models (SLMs), revealing critical vulnerabilities despite their efficiency advantages.

SLMs are highly susceptible to jailbreak attacks, with success rates approaching 90% for some models
Popular SLM development techniques like distillation and quantization significantly degrade security robustness
Standard defense mechanisms show limited effectiveness against targeted attacks
Security evaluations should be mandatory before deployment on edge devices

As SLMs become increasingly deployed in resource-constrained environments like mobile devices and IoT systems, understanding these security risks is essential for developing safer AI systems.

Behind the Tip of Efficiency: Uncovering the Submerged Threats of Jailbreak Attacks in Small Language Models