Finding Hidden Memories in Large Language Models

Automated Detection of Privacy Vulnerabilities at Scale

MemHunter introduces a novel, efficient approach to detecting when LLMs memorize and can reproduce sensitive training data, addressing a critical privacy concern.

  • Uses dataset-level prompting patterns rather than sample-specific probes
  • Enables automated verification of each detected memorization instance (a minimal sketch of this verification step follows this list)
  • Achieves substantially higher efficiency than existing per-sample detection methods
  • Offers a practical tool for comprehensive privacy auditing of LLMs

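To make the verification step concrete, here is a minimal Python sketch under stated assumptions: it prompts a model with the leading tokens of a candidate sample and checks whether greedy decoding reproduces the held-back suffix verbatim, the standard exact-match regurgitation test. This is not MemHunter's actual algorithm, and it omits the dataset-level prompt discovery that produces candidates; the "gpt2" model, the is_memorized helper, and the samples list are illustrative placeholders.

    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Placeholder model; a real audit would load the LLM under test.
    tokenizer = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")

    def is_memorized(sample: str, prefix_tokens: int = 32) -> bool:
        """Return True if greedy decoding reproduces the sample's suffix verbatim."""
        ids = tokenizer(sample, return_tensors="pt").input_ids[0]
        if len(ids) <= prefix_tokens:
            return False  # too short to split into prefix and suffix
        prefix, suffix = ids[:prefix_tokens], ids[prefix_tokens:]
        out = model.generate(
            prefix.unsqueeze(0),
            max_new_tokens=len(suffix),
            do_sample=False,                   # greedy: deterministic check
            pad_token_id=tokenizer.eos_token_id,
        )
        continuation = out[0, prefix_tokens:]  # tokens generated after the prefix
        return continuation.tolist() == suffix.tolist()

    # Hypothetical candidate texts flagged by an upstream detection pass.
    samples = ["..."]
    flagged = [s for s in samples if is_memorized(s)]

Greedy decoding keeps the check deterministic and cheap; relaxing the exact-match criterion to, say, a high token-overlap threshold would trade precision for recall when auditing at scale.
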
This research provides security professionals with scalable methods to identify and mitigate privacy risks in deployed AI systems, helping prevent inadvertent exposure of sensitive information through language models.

MemHunter: Automated and Verifiable Memorization Detection at Dataset-scale in LLMs
