
ECG-Expert-QA: Advancing Heart Disease Diagnosis with AI
A new benchmark for evaluating medical LLMs in electrocardiogram interpretation
ECG-Expert-QA provides a comprehensive multimodal dataset for evaluating how well AI systems can interpret electrocardiograms (ECGs) and diagnose heart conditions.
- Combines 47,211 expert-validated QA pairs covering 12 essential diagnostic tasks
- Includes both real-world clinical ECG data and systematically generated synthetic cases
- Supports evaluation of complex reasoning involving rare conditions and temporal changes
- Enables rigorous assessment of medical LLMs' clinical reasoning capabilities
Why it matters: This benchmark addresses a critical need in healthcare AI evaluation by providing a standardized way to assess whether LLMs can accurately interpret ECGs across diverse clinical scenarios. ECG interpretation is a fundamental skill for cardiologists, and models that perform it reliably could help expand access to expert-level cardiac care.
ECG-Expert-QA: A Benchmark for Evaluating Medical Large Language Models in Heart Disease Diagnosis