Privacy at Risk: Stealing Personal Data from LLMs

New technique extracts personally identifiable information from language models

Researchers demonstrate a novel two-step attack method (R.R.: Recollect and Rank) that reconstructs personally identifiable information (PII) memorized from LLM training data.

  • Uses a recollection step to generate candidate PII values
  • Applies a ranking module to select the most likely correct candidate
  • Achieves significantly higher attack success rates than previous methods
  • Highlights critical privacy vulnerabilities in current LLM architectures
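The two-step pipeline above can be sketched in miniature. This is a hedged illustration only: `toy_recollect` and `toy_score` are hypothetical stand-ins for sampling completions from the target LLM and scoring them by model likelihood; the paper's actual prompts and ranking criterion differ.

```python
def toy_recollect(prompt, n=4):
    # Stand-in for the recollection step: a real attack would sample n
    # completions from the target LLM given a PII-eliciting prompt.
    return [f"candidate-{i}@example.com" for i in range(n)]

def toy_score(prompt, candidate):
    # Stand-in for the ranking signal: a real attack would use a
    # model-derived score such as the candidate's log-likelihood.
    return -len(candidate)

def recollect_and_rank(prompt, n=4):
    """Two-step reconstruction: generate candidates, then rank them."""
    candidates = toy_recollect(prompt, n)
    # The top-ranked candidate becomes the attack's guess for the PII.
    return max(candidates, key=lambda c: toy_score(prompt, c))
```

The key design point is the separation of concerns: generation casts a wide net over plausible PII, and ranking concentrates the attack on the candidate the model itself finds most probable.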

This research exposes serious privacy risks for organizations deploying LLMs trained on sensitive data, underscoring the need for stronger privacy safeguards and careful handling of training data.

R.R.: Unveiling LLM Training Privacy through Recollection and Ranking
