Finding Hidden Memories in Large Language Models

Automated Detection of Privacy Vulnerabilities at Scale

MemHunter introduces a novel, efficient approach to detecting when LLMs memorize and can reproduce sensitive training data, addressing a critical privacy concern.

  • Uses dataset-level prompting patterns rather than sample-specific probes
  • Enables automated verification of each detected memorization instance (a minimal sketch of this verification step follows this list)
  • Achieves substantially higher efficiency than existing per-sample detection methods
  • Offers a practical tool for comprehensive privacy auditing of LLMs

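To make the verification step concrete, here is a minimal Python sketch under stated assumptions: it prompts a model with the leading tokens of a candidate sample and checks whether greedy decoding reproduces the held-back suffix verbatim, the standard exact-match regurgitation test. This is not MemHunter's actual algorithm, and it omits the dataset-level prompt discovery that produces candidates; the "gpt2" model, the is_memorized helper, and the samples list are illustrative placeholders.

    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Placeholder model; a real audit would load the LLM under test.
    tokenizer = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")

    def is_memorized(sample: str, prefix_tokens: int = 32) -> bool:
        """Return True if greedy decoding reproduces the sample's suffix verbatim."""
        ids = tokenizer(sample, return_tensors="pt").input_ids[0]
        if len(ids) <= prefix_tokens:
            return False  # too short to split into prefix and suffix
        prefix, suffix = ids[:prefix_tokens], ids[prefix_tokens:]
        out = model.generate(
            prefix.unsqueeze(0),
            max_new_tokens=len(suffix),
            do_sample=False,                   # greedy: deterministic check
            pad_token_id=tokenizer.eos_token_id,
        )
        continuation = out[0, prefix_tokens:]  # tokens generated after the prefix
        return continuation.tolist() == suffix.tolist()

    # Hypothetical candidate texts flagged by an upstream detection pass.
    samples = ["..."]
    flagged = [s for s in samples if is_memorized(s)]

Greedy decoding keeps the check deterministic and cheap; relaxing the exact-match criterion to, say, a high token-overlap threshold would trade precision for recall when auditing at scale.
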
This research provides security professionals with scalable methods to identify and mitigate privacy risks in deployed AI systems, helping prevent inadvertent exposure of sensitive information through language models.

MemHunter: Automated and Verifiable Memorization Detection at Dataset-scale in LLMs
