
Zero-Shot Anomaly Detection with MLLMs
Detecting anomalies without prior training data
This research introduces an approach that leverages Multimodal Large Language Models (MLLMs) to detect and reason about anomalies without requiring training on normal samples.
- Establishes a new paradigm for anomaly detection that works with limited data
- Creates MM-RAD, the first multimodal reasoning anomaly detection benchmark
- Evaluates 12 state-of-the-art MLLMs on anomaly detection capabilities
- Demonstrates MLLMs can identify and explain abnormalities across security, medical, and engineering domains
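The zero-shot setup described above boils down to prompting an off-the-shelf MLLM directly, with no normal reference samples. The sketch below is illustrative only: the prompt wording and the JSON reply format are assumptions for demonstration, not the paper's actual interface or the MM-RAD protocol.

```python
import json

# Hypothetical zero-shot anomaly-detection prompt for an off-the-shelf MLLM.
# No training on normal samples is needed: the model is asked directly
# whether the image deviates from a typical instance of the object.
PROMPT = (
    "You are an anomaly inspector. Examine the attached image and answer "
    'in JSON: {"anomalous": true or false, "explanation": "<one sentence>"}.'
)

def parse_verdict(reply: str) -> tuple[bool, str]:
    """Parse the MLLM's JSON reply into (is_anomalous, explanation)."""
    data = json.loads(reply)
    return bool(data["anomalous"]), str(data["explanation"])

# Illustrative reply (invented for this sketch, not real model output):
reply = '{"anomalous": true, "explanation": "The capsule surface is cracked."}'
flag, why = parse_verdict(reply)
```

In practice the prompt and image would be sent to any multimodal model API; the parsing step above is what turns the model's free-form reply into a usable detection verdict plus a human-readable explanation.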
For security applications, this approach enables rapid anomaly detection in scenarios where collecting large training datasets is impractical or impossible, potentially transforming threat detection and surveillance systems.
Towards Zero-Shot Anomaly Detection and Reasoning with Multimodal Large Language Models