
Selective Forgetting for LLMs
A benchmark for removing specific content from AI models
The LUME benchmark introduces a structured approach to evaluating how effectively large language models can be made to "forget" specific information without complete retraining.
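To make the idea concrete, below is a minimal sketch of one common unlearning baseline, gradient ascent on the forget set (pushing the model's loss on the targeted content upward). This is illustrative only, not LUME's prescribed method; the model path and forget-set text are hypothetical placeholders.

```python
# Sketch: gradient-ascent unlearning on a forget set, assuming a
# Hugging Face-style causal LM. Checkpoint path and data are placeholders.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("path/to/fine-tuned-model")
tokenizer = AutoTokenizer.from_pretrained("path/to/fine-tuned-model")
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)

forget_texts = ["memorized passage the model should no longer reproduce"]

model.train()
for text in forget_texts:
    batch = tokenizer(text, return_tensors="pt")
    # Standard LM loss on the forget example...
    loss = model(**batch, labels=batch["input_ids"]).loss
    # ...but negated, so the optimizer *increases* loss on forgotten content.
    (-loss).backward()
    optimizer.step()
    optimizer.zero_grad()
```

In practice this baseline is usually paired with a retain-set term to keep general capabilities intact, which is exactly the trade-off a benchmark like LUME is designed to measure.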
- Measures unlearning across three key tasks: synthetic creative writing, synthetic biographies containing sensitive personal information, and real public biographical information
- Provides fine-tuned 1B- and 7B-parameter models as targets for unlearning experiments
- Establishes evaluation metrics that measure both how thoroughly targeted content is forgotten and how well general model utility is preserved (see the sketch after this list)
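A core forgetting metric in this setting is regurgitation: does the model still complete a prompt with the verbatim text it was supposed to unlearn? Below is a minimal sketch of such a check using ROUGE-L recall, assuming a Hugging Face-style model and the `rouge-score` package; the checkpoint path and forget-set examples are hypothetical, and this is not LUME's exact scoring code.

```python
# Sketch: forget-set regurgitation check via ROUGE-L recall.
# Lower recall against the forgotten reference = more successful unlearning.
from transformers import AutoModelForCausalLM, AutoTokenizer
from rouge_score import rouge_scorer

MODEL_NAME = "path/to/unlearned-model"  # hypothetical checkpoint

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)
scorer = rouge_scorer.RougeScorer(["rougeL"], use_stemmer=True)

def regurgitation_score(prompt: str, reference: str) -> float:
    """ROUGE-L recall between the model's greedy continuation and the
    reference text it was supposed to forget."""
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=128, do_sample=False)
    # Keep only newly generated tokens, dropping the echoed prompt.
    completion = tokenizer.decode(
        output_ids[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
    )
    return scorer.score(reference, completion)["rougeL"].recall

# Placeholder forget set: prompts paired with memorized continuations.
forget_set = [("Once upon a time in Aldermoor,", "the clockmaker's daughter fled north.")]
avg = sum(regurgitation_score(p, r) for p, r in forget_set) / len(forget_set)
print(f"Mean forget-set ROUGE-L recall: {avg:.3f}")
```

A complete evaluation would run the same scoring on a retain set (where high recall is desired) and on a held-out utility benchmark, so that forgetting is not achieved by simply degrading the whole model.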
Security Impact: LUME enables rigorous evaluation of techniques for removing copyrighted, private, or sensitive content from deployed AI systems, addressing critical data privacy concerns and strengthening model governance capabilities.