Selective Forgetting for LLMs

A benchmark for removing specific content from AI models

The LUME benchmark provides a structured way to evaluate how effectively large language models can be made to "forget" specific information without retraining from scratch.

  • Measures unlearning across three key tasks: creative content, synthetic sensitive biographies, and public biographical information
  • Provides fine-tuned 1B- and 7B-parameter models as targets for unlearning experiments
  • Establishes comprehensive evaluation metrics to measure unlearning effectiveness
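In spirit, such an evaluation measures two things at once: how little of the forget-set content the model still reproduces, and how much retain-set utility it keeps. Below is a minimal, hedged sketch of that idea; the word-overlap scorer, the `forget_set`/`retain_set` data, and the toy model are illustrative stand-ins, not LUME's actual datasets or metrics.

```python
def overlap_score(prediction: str, reference: str) -> float:
    """Fraction of reference tokens reproduced in the prediction (0..1).
    A stand-in for a real regurgitation metric such as ROUGE."""
    ref_tokens = reference.lower().split()
    if not ref_tokens:
        return 0.0
    pred_tokens = set(prediction.lower().split())
    return sum(t in pred_tokens for t in ref_tokens) / len(ref_tokens)

def evaluate_unlearning(model, forget_set, retain_set):
    """Lower forget score = better forgetting; higher retain score = utility kept."""
    forget = [overlap_score(model(p), ref) for p, ref in forget_set]
    retain = [overlap_score(model(p), ref) for p, ref in retain_set]
    return {
        "forget_regurgitation": sum(forget) / len(forget),
        "retain_utility": sum(retain) / len(retain),
    }

# Toy "model": has forgotten the sensitive fact but kept public knowledge.
toy_model = {
    "Alice's ID number is": "not something I can share",
    "The capital of France is": "Paris",
}.get

forget_set = [("Alice's ID number is", "123-45-6789")]   # hypothetical data
retain_set = [("The capital of France is", "Paris")]     # hypothetical data

print(evaluate_unlearning(toy_model, forget_set, retain_set))
```

A successfully unlearned model drives the forget score toward 0 while keeping the retain score near its pre-unlearning level; collapsing both scores would indicate the model was damaged rather than selectively edited.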

Security Impact: LUME advances techniques to remove copyrighted, private, or sensitive content from deployed AI systems, addressing critical data privacy concerns and enhancing model governance capabilities.

LUME: LLM Unlearning with Multitask Evaluations