Machine Unlearning for LLMs

Research on methods that make LLMs forget specific knowledge or information, with the goal of enhancing privacy and security and addressing copyright concerns
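Many of the papers collected below build on the same basic recipe: push the model's loss up on a "forget" set while keeping it low on a "retain" set. The sketch below is a minimal, illustrative version of that idea for a Hugging Face causal language model; the model name, example texts, step count, and retain-loss weight are placeholders, not details drawn from any particular paper listed here.

```python
# Minimal sketch (not any specific paper's method): gradient ascent on a
# forget set, regularized by ordinary LM loss on a retain set.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder; any causal LM checkpoint works
tok = AutoTokenizer.from_pretrained(model_name)
if tok.pad_token is None:
    tok.pad_token = tok.eos_token  # GPT-2-style tokenizers have no pad token
model = AutoModelForCausalLM.from_pretrained(model_name)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)

forget_texts = ["<examples the model should no longer reproduce>"]
retain_texts = ["<examples of general knowledge to preserve>"]

def lm_loss(texts):
    """Average next-token prediction loss on a list of strings."""
    batch = tok(texts, return_tensors="pt", padding=True, truncation=True)
    labels = batch["input_ids"].clone()
    labels[batch["attention_mask"] == 0] = -100  # ignore padding positions
    return model(**batch, labels=labels).loss

for step in range(100):  # illustrative number of unlearning steps
    optimizer.zero_grad()
    # Negative sign = ascend on the forget-set loss; the retain term
    # (weight 1.0 here) limits collateral damage to general capabilities.
    loss = -lm_loss(forget_texts) + 1.0 * lm_loss(retain_texts)
    loss.backward()
    optimizer.step()
```

How the forget and retain terms are weighted, and how the forget set is constructed, is exactly where the methods surveyed below differ.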


Strategic Model Forgetting for LLM Security

How Manipulating Internal Representations Makes LLMs Forget

Selective Forgetting for Safer LLMs

A robust approach to removing sensitive knowledge from language models

The Unlearning Illusion in AI Safety

Why removing hazardous knowledge from LLMs may be harder than we think

Rethinking LLM Unlearning Evaluations

Current benchmarks overestimate unlearning effectiveness

Beyond Greedy Decoding: A Probabilistic Approach to LLMs

Improving security evaluation for unlearning and alignment

Simplifying LLM Unlearning

A more effective approach to removing unwanted content from AI models

Erasing the Unwanted in LLMs

Machine Unlearning as a Solution for Data Privacy and Legal Compliance

Unmasking LLM Memory Erasure

Evaluating if harmful information is truly removed from language models

The Illusion of Forgetting in LLMs

How quantization can resurrect 'unlearned' knowledge in language models
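One way to probe for this effect (a hypothetical check, not necessarily the paper's exact protocol) is to compare an unlearned checkpoint's loss on the forget set at full precision against the same checkpoint after post-hoc 4-bit quantization; a markedly lower loss after quantization suggests the "forgotten" content is still latent in the weights. The checkpoint path and forget examples below are placeholders, and the 4-bit load assumes a GPU with bitsandbytes installed.

```python
# Hypothetical before/after-quantization check on an unlearned checkpoint.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

ckpt = "./unlearned-model"  # placeholder path to an unlearned checkpoint
tok = AutoTokenizer.from_pretrained(ckpt)
if tok.pad_token is None:
    tok.pad_token = tok.eos_token
forget_texts = ["<content the model was supposed to forget>"]

def forget_loss(model):
    """Mean LM loss on the forget set; lower means better recall of it."""
    batch = tok(forget_texts, return_tensors="pt", padding=True, truncation=True)
    batch = {k: v.to(model.device) for k, v in batch.items()}
    with torch.no_grad():
        return model(**batch, labels=batch["input_ids"]).loss.item()

full = AutoModelForCausalLM.from_pretrained(
    ckpt, torch_dtype=torch.float16, device_map="auto"
)
quantized = AutoModelForCausalLM.from_pretrained(
    ckpt,
    quantization_config=BitsAndBytesConfig(load_in_4bit=True),
    device_map="auto",
)
print("fp16  forget-set loss:", forget_loss(full))
print("4-bit forget-set loss:", forget_loss(quantized))
```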

Strategic Unlearning in LLMs

Precision-targeted weight modification for secure AI models

The Efficacy of LLM Unlearning

A critical evaluation of techniques to remove harmful information from AI models

Strengthening LLM Unlearning Security

Making Representation Misdirection methods robust against backdoor-like vulnerabilities

Selective Forgetting in AI

A Framework for Agentic LLM Unlearning

Forgetting What Tools Know

A Novel Framework for Security-Focused Tool Unlearning in LLMs

Safer AI Through Selective Forgetting

Precision-targeted knowledge removal in large language models

Preventing Copyright Violations in LLMs

A lightweight solution to disrupt memorized content generation

Selective Forgetting in Language Models

A novel approach to removing private information from LLMs

Selective Forgetting in MLLMs

A Novel Approach to Multimodal Machine Unlearning

Smarter Forgetting for AI Models

A novel approach to targeted unlearning in LLMs without sacrificing performance

Optimizing LLM Unlearning: The Retain Set Perspective

Strategic data retention for effective entity unlearning in LLMs

SafeEraser: Making AI Forget Harmful Content

Advancing safety through multimodal machine unlearning

Protecting IP in the Age of LLMs

Selective memory suppression without compromising performance

Balanced Data Forgetting in LLMs

Selective unlearning without compromising model utility

Selective Forgetting for LLMs

A benchmark for removing specific content from AI models

The Illusion of Forgetting in LLMs

Why soft token attacks fail as reliable auditing tools for machine unlearning

Securing MLLMs Through Smart Forgetting

Novel neuron pruning technique for targeted information removal in multimodal AI

Enhancing LLM Unlearning

A framework to remove sensitive data while preserving model utility

Advancing LLM Security through Better Unlearning

A comprehensive framework for auditing knowledge removal in large language models

Surgical Knowledge Removal in LLMs

A gradient-based approach to selective unlearning without compromising model integrity

Securing Knowledge Erasure in LLMs

Beyond Surface Deletion: Comprehensive Unlearning for True Knowledge Forgetting

Erasing Sensitive Data from AI Models

A More Effective Approach to AI Unlearning

Forgetting by Design: LLM Unlearning

Selectively removing sensitive data without full retraining

Selective Memory: Unlearning Sensitive Content in LLMs

Parameter-Efficient Techniques for Enhanced AI Privacy

Enhanced LLM Unlearning for Security

Beyond Forgetting: Removing Related Knowledge for Complete Unlearning

Selective Unlearning in LLMs

Efficiently Removing Sensitive Data Without Full Retraining

Balancing Unlearning & Retention in LLMs

A Gradient-Based Approach to Selective Knowledge Removal

Safer AI: Selective Memory Control for LLMs

Targeted knowledge removal without compromising overall performance

Securing MLLMs Through Machine Unlearning

A novel benchmark to evaluate privacy protection in multimodal AI

Selective Concept Unlearning for AI Security

Fine-grained knowledge removal in vision-language models using sparse autoencoders

Privacy-Preserving AI: Making Models Forget

A novel contrastive unlearning framework for language models

Selective Knowledge Erasure in LLMs

Advancing Security Through Model Merging Techniques

Selective Skill Unlearning in LLMs

Training-free techniques to control model capabilities

Selective Forgetting in AI Art

Teaching GANs to unlearn problematic content without retraining

Erasing Sensitive Data from LLMs

A Systematic Approach to Making AI Forget

Selective Forgetting in AI

Not all data needs to be unlearned with equal priority

Understanding Unlearning Difficulty in LLMs

A neuro-inspired approach to selective knowledge removal

The Coreset Effect in LLM Unlearning

Why current unlearning benchmarks may be easier than they appear

Selective Forgetting in AI Models

A Novel Approach to Privacy-Compliant Unlearning

Selective Memory Wiping for AI

Precision-targeted unlearning keeps LLMs both safe and smart

Key Takeaways

Summary of Research on Machine Unlearning for LLMs