Attacking the Knowledge Base

This research introduces MARAGE, a novel approach for extracting private data from Retrieval-Augmented Generation (RAG) systems through automated, transferable adversarial attacks.

Optimizes adversarial prompts that extract information from RAG knowledge bases
Demonstrates high transferability across different LLMs and retrieval systems
Achieves up to 3× higher success rates than manually crafted attacks
Shows that RAG systems remain vulnerable despite being designed to reduce hallucinations

This work highlights critical security vulnerabilities in RAG deployments that use private or sensitive data and underscores the urgent need for defensive measures, particularly in enterprise settings.

MARAGE: Transferable Multi-Model Adversarial Attack for Retrieval-Augmented Generation Data Extraction