
Detecting RAG Hallucinations
Using LLMs' Internal States to Improve AI Reliability
This research introduces a novel approach to detecting when large language models (LLMs) in retrieval-augmented generation (RAG) systems hallucinate information instead of grounding their responses in the retrieved documents.
Key Findings:
- Focuses on closed-domain hallucinations specific to RAG applications
- Uses the LLM's internal states to detect when it fabricates information (see the sketch after this list)
- Creates a specialized dataset for training hallucination detection systems
- Offers a systematic approach for improving RAG system reliability and security
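
The paper's exact probing method is not detailed here, but a common way to exploit internal states is to train a lightweight classifier (a probe) on the model's hidden activations for a generated answer. The sketch below is a minimal illustration under assumptions: it uses a placeholder model (`gpt2`), a logistic-regression probe over the last-layer hidden state of the final answer token, and hypothetical labelled examples; the paper's actual architecture, features, and dataset may differ.

```python
# Minimal sketch: probe an LLM's hidden states for hallucination signals.
# Assumptions (not from the paper): last-layer hidden state of the final
# answer token as the feature, logistic regression as the probe, and a
# tiny hypothetical labelled set (1 = hallucinated, 0 = grounded).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from sklearn.linear_model import LogisticRegression

MODEL_NAME = "gpt2"  # placeholder; any causal LM exposing hidden states works

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME, output_hidden_states=True)
model.eval()

def final_token_state(context: str, answer: str) -> torch.Tensor:
    """Return the last-layer hidden state of the final answer token."""
    text = f"Context: {context}\nAnswer: {answer}"
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        outputs = model(**inputs)
    # hidden_states[-1] has shape (batch, seq_len, hidden_dim)
    return outputs.hidden_states[-1][0, -1]

# Hypothetical training examples: (retrieved context, generated answer, label)
examples = [
    ("The capital of France is Paris.", "Paris is the capital of France.", 0),
    ("The capital of France is Paris.", "Lyon is the capital of France.", 1),
]

X = torch.stack([final_token_state(c, a) for c, a, _ in examples]).numpy()
y = [label for _, _, label in examples]

probe = LogisticRegression(max_iter=1000).fit(X, y)
print(probe.predict(X))  # per-answer hallucination predictions
```

In practice the probe would be trained on a purpose-built dataset of grounded and fabricated answers (as the paper does with its specialized dataset) and applied at inference time to flag responses whose internal activations look like fabrication.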
Business Security Impact: By identifying when AI systems invent facts not present in source documents, organizations can prevent the spread of misinformation, reduce reputational risk, and support compliance with emerging AI regulations.