
Combating LVLM Hallucinations
A new benchmark for detecting AI visual falsehoods
HALLUCINOGEN is a novel benchmark that evaluates hallucinations in Large Vision-Language Models (LVLMs) through contextual reasoning prompts.
- Systematically probes models' tendency to generate false information about visual content
- Identifies critical weaknesses in leading models, including GPT-4V and Gemini
- Provides a structured methodology for quantifying and categorizing different types of visual hallucinations (a minimal scoring sketch follows this list)
- Highlights particular concerns in security, medical, and educational contexts
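The summary above describes the evaluation only at a high level. As a rough illustration of how such a benchmark might score a model, the sketch below computes a hallucination rate over object-existence probes. The `query_lvlm` wrapper, the `HallucinationProbe` structure, and the affirmative-answer check are hypothetical stand-ins for illustration, not HALLUCINOGEN's actual protocol.

```python
# Illustrative sketch only: scoring an LVLM's hallucination rate on
# object-existence probes. `query_lvlm` and the probe set are hypothetical
# stand-ins, not the HALLUCINOGEN implementation.
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class HallucinationProbe:
    image_path: str       # image shown to the model
    prompt: str           # contextual-reasoning question about the image
    object_present: bool  # ground truth: does the queried object exist?


def hallucination_rate(
    probes: List[HallucinationProbe],
    query_lvlm: Callable[[str, str], str],  # (image_path, prompt) -> model answer
) -> float:
    """Fraction of absent-object probes where the model asserts the object anyway."""
    absent = [p for p in probes if not p.object_present]
    if not absent:
        return 0.0
    hallucinated = 0
    for p in absent:
        answer = query_lvlm(p.image_path, p.prompt).lower()
        # Naive check: an affirmative answer about a missing object counts as a hallucination.
        if "yes" in answer or "there is" in answer:
            hallucinated += 1
    return hallucinated / len(absent)
```

A real evaluation would use a more robust answer parser and separate metrics per prompt category (e.g., localization vs. counterfactual reasoning), but the core idea is the same: measure how often the model confidently describes things the image does not contain.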
For security professionals, this research is vital as it reveals how visual AI systems can be manipulated to fabricate information, potentially leading to security breaches when these systems are used in critical infrastructure, surveillance, or authentication applications.
Towards a Systematic Evaluation of Hallucinations in Large-Vision Language Models