
Attacking the Knowledge Base
New transferable adversarial attacks against RAG systems
This research introduces MARAGE, a novel approach for extracting private data from Retrieval-Augmented Generation (RAG) systems through automated, transferable adversarial attacks.
- Optimizes adversarial prompts that extract content from a RAG system's knowledge base
- Transfers well across different LLMs and retrieval systems
- Achieves up to 3× higher success rates than manual attacks
- Shows that RAG systems, although designed to reduce hallucinations, remain vulnerable to data extraction
This work highlights critical security vulnerabilities in RAG deployments that use private or sensitive data, and it underscores the urgent need for defensive measures, particularly in enterprise settings.
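To make "extraction" concrete, one simple way to quantify it is to check how much of a retrieved chunk reappears verbatim in the model's response. The sketch below is illustrative only and is not the metric or method used by MARAGE: the function names (`leakage_ratio`, `flag_leaks`), the word 5-gram overlap measure, and the 0.5 threshold are all assumptions chosen for the example.

```python
def ngrams(tokens, n):
    """Return the set of contiguous n-token windows in a token list."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def leakage_ratio(retrieved_chunk: str, response: str, n: int = 5) -> float:
    """Fraction of the chunk's word n-grams that reappear verbatim in the response.

    Hypothetical metric for illustration: 0.0 means no n-gram overlap,
    1.0 means every n-gram of the chunk was reproduced.
    """
    chunk_grams = ngrams(retrieved_chunk.lower().split(), n)
    if not chunk_grams:
        # Chunk is shorter than n words, so there is nothing to compare.
        return 0.0
    response_grams = ngrams(response.lower().split(), n)
    return len(chunk_grams & response_grams) / len(chunk_grams)


def flag_leaks(chunks, response, threshold=0.5, n=5):
    """Return indices of retrieved chunks whose overlap meets the (assumed) threshold."""
    return [i for i, chunk in enumerate(chunks)
            if leakage_ratio(chunk, response, n) >= threshold]


if __name__ == "__main__":
    chunks = [
        "the quarterly revenue for acme corp was 4.2 million dollars in 2023",
        "employee onboarding requires a signed confidentiality agreement before the first day",
    ]
    leaked = "Sure, here is the context: " + chunks[0]
    print(flag_leaks(chunks, leaked))                  # indices of leaked chunks
    print(flag_leaks(chunks, "I cannot share that."))  # no overlap expected
```

A check like this could run on the defender's side as an output filter or as an evaluation harness. Exact n-gram matching will miss paraphrased leaks, so it should be read as a lower bound on what a response reveals.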