PromptGuard: Securing AI-Generated Images

A Novel Soft Prompt Approach for Safer Text-to-Image Models

PromptGuard introduces an innovative safety mechanism for text-to-image models, adapting the system-prompt concept from LLMs to prevent the generation of unsafe content.

  • Implements safety soft prompts that invisibly guide image generation toward safe outputs
  • Creates a safety layer without requiring model retraining or architecture changes
  • Effectively blocks NSFW content while preserving creative capabilities
  • Represents a significant advancement in responsible AI deployment
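The core idea above can be illustrated with a toy sketch. This is not the paper's implementation: the names (`SAFETY_SOFT_PROMPT`, `prepend_safety_prompt`, `embed_tokens`), the tiny embedding size, and the simplified encoder are all hypothetical stand-ins. The sketch shows the mechanism only: frozen, pre-optimized safety embeddings are prepended to the user prompt's embeddings before they reach the diffusion model, so no weights or architecture change at inference time.

```python
# Hypothetical sketch of soft-prompt prepending; all names and sizes are
# illustrative, not taken from the PromptGuard codebase.
from typing import List

Vector = List[float]

EMBED_DIM = 4          # toy embedding width; real text encoders use 768+
SAFETY_PROMPT_LEN = 2  # number of learned safety soft-prompt vectors
MAX_SEQ_LEN = 8        # text encoder's context length (token budget)

# Stand-in for safety soft-prompt embeddings. In the real method these are
# optimized offline against a safety objective, then frozen for inference.
SAFETY_SOFT_PROMPT: List[Vector] = [
    [0.1] * EMBED_DIM for _ in range(SAFETY_PROMPT_LEN)
]

def embed_tokens(tokens: List[str]) -> List[Vector]:
    """Toy deterministic token embedder (placeholder for a text encoder)."""
    return [[(sum(map(ord, t)) % 100) / 100.0] * EMBED_DIM for t in tokens]

def prepend_safety_prompt(tokens: List[str]) -> List[Vector]:
    """Prepend frozen safety embeddings to the user prompt's embeddings.

    The combined sequence is truncated to the encoder's context length,
    so the safety vectors consume part of the token budget, but the
    generator itself is untouched: no retraining, no architecture edits.
    """
    user_embeddings = embed_tokens(tokens)
    combined = SAFETY_SOFT_PROMPT + user_embeddings
    return combined[:MAX_SEQ_LEN]

seq = prepend_safety_prompt(["a", "photo", "of", "a", "cat"])
```

At inference, the downstream diffusion model conditions on `seq` exactly as it would on an ordinary prompt embedding, which is why this acts as a drop-in safety layer.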

This research addresses critical security concerns in generative AI: it offers a practical way to prevent misuse while preserving model functionality, helping organizations deploy creative AI tools with greater confidence.

PromptGuard: Soft Prompt-Guided Unsafe Content Moderation for Text-to-Image Models