
Finding Hidden Memories in Large Language Models
Automated Detection of Privacy Vulnerabilities at Scale
MemHunter introduces a novel, efficient approach to detecting when LLMs memorize and can reproduce sensitive training data, addressing a critical privacy concern.
- Uses dataset-level prompting patterns rather than sample-specific approaches
- Enables automated verification of detected memorization instances (a sketch of such a verification check follows this list)
- Achieves significantly higher efficiency than existing detection methods
- Offers a practical tool for comprehensive privacy auditing of LLMs
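To make the verification bullet concrete, the sketch below shows a standard prefix-continuation memorization check, the kind of test commonly used to confirm that a flagged sample is actually reproduced by the model: prompt the model with the beginning of a training sample and test whether greedy decoding emits the true continuation verbatim. This is an illustrative assumption, not MemHunter's actual code; the model name, function names, and prefix/suffix lengths are placeholders.

```python
# Illustrative sketch of a verbatim prefix-continuation memorization check.
# Not MemHunter's implementation; model, lengths, and names are assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "gpt2"  # placeholder; substitute the model under audit
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)
model.eval()


def check_memorization(sample: str, prefix_len: int = 50, suffix_len: int = 50) -> bool:
    """Prompt the model with a prefix of `sample` and check whether
    greedy decoding reproduces the true suffix token-for-token."""
    ids = tokenizer(sample, return_tensors="pt").input_ids[0]
    if ids.numel() < prefix_len + suffix_len:
        return False  # sample too short for this prefix/suffix split
    prefix = ids[:prefix_len].unsqueeze(0)
    true_suffix = ids[prefix_len : prefix_len + suffix_len]
    with torch.no_grad():
        out = model.generate(
            prefix,
            max_new_tokens=suffix_len,
            do_sample=False,  # greedy decoding keeps the check reproducible
            pad_token_id=tokenizer.eos_token_id,
        )
    gen_suffix = out[0, prefix_len : prefix_len + suffix_len]
    # Verbatim match of the generated and true continuations counts as memorized
    return torch.equal(gen_suffix, true_suffix)
```

Running a check like this over every training sample is exactly the per-sample cost that dataset-level prompting is meant to avoid; in that framing, the verifier only has to run on the instances the detector surfaces.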
This research provides security professionals with scalable methods to identify and mitigate privacy risks in deployed AI systems, helping prevent inadvertent exposure of sensitive information through language models.
MemHunter: Automated and Verifiable Memorization Detection at Dataset-scale in LLMs