Prompt Theft Detection

Protecting Proprietary System Prompts from Unauthorized Use

Prompt Detective is a statistical method for determining whether a third-party LLM is using a given system prompt. It works by statistically comparing the outputs a model produces under different system prompts, and it addresses a gap in prompt privacy protection.

  • Creates a membership inference framework to detect unauthorized prompt usage
  • Achieves high detection accuracy across various LLM architectures and contexts
  • Introduces techniques that withstand evasion attempts by adversaries
  • Establishes a foundation for prompt copyright protection in the emerging AI economy
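The membership inference idea in the list above can be illustrated with a two-sample permutation test. The following is a minimal sketch under stated assumptions, not necessarily the paper's exact procedure: the function name `permutation_p_value`, the cosine-distance statistic, and the synthetic vectors standing in for embeddings of model responses are all hypothetical choices made for illustration. In practice, one group would hold embeddings of responses from the third-party service and the other would hold embeddings of responses generated with the proprietary prompt.

```python
import numpy as np


def permutation_p_value(group_a, group_b, n_permutations=1000, seed=0):
    """P-value for the null hypothesis that both groups share one distribution.

    group_a, group_b: 2-D arrays with one embedding vector per row.
    Statistic (an assumption of this sketch): cosine distance between
    the two group mean vectors.
    """
    rng = np.random.default_rng(seed)

    def statistic(x, y):
        mean_x, mean_y = x.mean(axis=0), y.mean(axis=0)
        cosine = mean_x @ mean_y / (np.linalg.norm(mean_x) * np.linalg.norm(mean_y))
        return 1.0 - cosine

    observed = statistic(group_a, group_b)
    pooled = np.vstack([group_a, group_b])
    n_a = len(group_a)

    # Count how often a random relabeling looks at least as extreme.
    extreme = 0
    for _ in range(n_permutations):
        order = rng.permutation(len(pooled))
        if statistic(pooled[order[:n_a]], pooled[order[n_a:]]) >= observed:
            extreme += 1

    # Add-one correction keeps the p-value strictly positive.
    return (extreme + 1) / (n_permutations + 1)


# Synthetic stand-ins for response embeddings (hypothetical data).
rng = np.random.default_rng(42)
base = np.ones(16)
shift = np.concatenate([np.full(8, 0.5), np.zeros(8)])

ours = base + rng.normal(0.0, 0.1, size=(50, 16))                 # our prompt
same_prompt = base + rng.normal(0.0, 0.1, size=(50, 16))          # same prompt
other_prompt = base + shift + rng.normal(0.0, 0.1, size=(50, 16)) # different prompt

p_same = permutation_p_value(ours, same_prompt)
p_diff = permutation_p_value(ours, other_prompt)
print(f"same prompt: p = {p_same:.3f}; different prompt: p = {p_diff:.3f}")
```

A small p-value is evidence that the two groups of responses come from different prompts, while a large p-value means the test found no detectable difference. How many responses are needed and where to set the significance threshold depend on the deployment and are not specified here.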

Why it matters: Proprietary system prompts are becoming valuable intellectual property. This research gives security professionals a concrete way to detect when a prompt has been taken and reused, helping companies protect their prompt engineering investments.

Has My System Prompt Been Used? Large Language Model Prompt Membership Inference
