Combating Multilingual Hallucinations

Poly-FEVER is the first large-scale multilingual benchmark designed specifically to detect hallucinations in Large Language Models (LLMs) across 11 languages.

Addresses a critical gap by extending hallucination detection beyond English-centric evaluation
Enables systematic assessment of AI systems' factual reliability across diverse linguistic contexts
Reveals significant performance disparities between high- and low-resource languages
Provides essential tools for building more reliable multilingual AI applications

Why it matters: As LLMs expand globally, this benchmark provides infrastructure for checking that AI systems deliver factual, trustworthy information regardless of language, which supports both linguistic integrity and security.

Poly-FEVER: A Multilingual Fact Verification Benchmark for Hallucination Detection in Large Language Models