Attacking the Knowledge Base

New transferable adversarial attacks against RAG systems

This research introduces MARAGE, a novel approach for extracting private data from Retrieval-Augmented Generation (RAG) systems through automated, transferable adversarial attacks.

  • Optimizes adversarial prompts that extract information from RAG knowledge bases
  • Demonstrates high transferability across different LLMs and retrieval systems
  • Achieves up to 3× higher success rates than manual attacks
  • Shows that RAG systems, although designed to reduce hallucinations, remain vulnerable to data extraction

This work highlights critical security vulnerabilities in RAG deployments that hold private or sensitive data, and it points to an urgent need for defensive measures in enterprise settings.

MARAGE: Transferable Multi-Model Adversarial Attack for Retrieval-Augmented Generation Data Extraction
