
Combating Visual Disinformation in News
Using Vision-Language Models to Verify Cross-modal Entity Consistency
This research introduces an approach for detecting misleading news by verifying whether the entities mentioned in the text match what appears in the accompanying images.
- Focuses on identifying inconsistencies between specific entities (people, places, events) across text and images
- Leverages vision-language models to detect manipulated or false information (a simplified sketch of the idea follows this list)
- Creates a benchmark dataset for evaluating cross-modal entity verification
- Demonstrates improved accuracy in identifying misleading content compared to previous approaches
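As a rough illustration of the idea, and not the paper's actual pipeline, the sketch below uses an off-the-shelf CLIP checkpoint from Hugging Face to score how plausibly a named entity from an article is depicted in its image. The model name, prompt template, distractor entities, and decision threshold are all illustrative assumptions.

```python
# Illustrative sketch only: score cross-modal entity consistency with CLIP.
# Model name, prompts, distractors, and threshold are assumptions for demonstration.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

MODEL_NAME = "openai/clip-vit-base-patch32"  # assumed off-the-shelf checkpoint
model = CLIPModel.from_pretrained(MODEL_NAME)
processor = CLIPProcessor.from_pretrained(MODEL_NAME)

def entity_image_consistency(image_path: str, entity: str, distractors: list[str]) -> float:
    """Return the probability that `entity` (vs. the distractors) matches the image."""
    image = Image.open(image_path).convert("RGB")
    prompts = [f"a news photo of {name}" for name in [entity, *distractors]]
    inputs = processor(text=prompts, images=image, return_tensors="pt", padding=True)
    with torch.no_grad():
        logits = model(**inputs).logits_per_image  # shape: (1, num_prompts)
    probs = logits.softmax(dim=-1)[0]
    return probs[0].item()  # probability mass on the claimed entity

# Example: flag the article if the claimed person is unlikely to be depicted.
score = entity_image_consistency(
    "article_photo.jpg",
    entity="Angela Merkel",
    distractors=["Emmanuel Macron", "Boris Johnson"],
)
if score < 0.5:  # threshold is an illustrative assumption
    print(f"Possible cross-modal inconsistency (score={score:.2f})")
```

A full system would typically extract entities with a named-entity recognizer and verify them with a larger vision-language model, but the contrastive scoring above conveys the core cross-modal consistency check.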
For security professionals, this research offers a way to automatically check that the entities named in a story are consistent with its images, helping counter increasingly sophisticated disinformation campaigns.
Verifying Cross-modal Entity Consistency in News using Vision-language Models