
LLM-Generated Data: Not a Silver Bullet for Misinformation Detection
Evaluating limitations of AI-augmented training data for COVID-19 stance detection
This research evaluates whether using large language models (LLMs) to generate additional training data improves the accuracy of systems that detect stances toward COVID-19 misinformation claims.
- LLM-based data augmentation showed limited effectiveness at improving stance detection models
- The quality of synthetic data was inconsistent across different misinformation claims
- Simpler, traditional augmentation methods often outperformed the more sophisticated LLM-generated examples (both styles are sketched below)
- The research highlights the importance of domain-specific evaluation before deploying AI-powered solutions
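To make the contrast between the two augmentation styles concrete, here is a minimal sketch. The synonym lexicon, the `call_llm` placeholder, and the prompt wording are illustrative assumptions, not the paper's actual pipeline.

```python
import random

# Tiny illustrative synonym lexicon (assumption); real pipelines typically
# draw on WordNet or embedding neighbors rather than a hand-written dict.
SYNONYMS = {
    "vaccine": ["vaccination", "jab", "shot"],
    "effective": ["protective", "useful"],
    "harm": ["damage", "injury"],
}

def synonym_replace(text: str, p: float = 0.3, seed: int = 0) -> str:
    """Traditional augmentation: randomly swap listed words for synonyms."""
    rng = random.Random(seed)
    words = []
    for word in text.split():
        key = word.lower().strip(".,!?")
        if key in SYNONYMS and rng.random() < p:
            words.append(rng.choice(SYNONYMS[key]))
        else:
            words.append(word)
    return " ".join(words)

def call_llm(prompt: str) -> str:
    """Placeholder for a completion API call (assumption); wiring up a
    concrete model is left open."""
    raise NotImplementedError

def llm_augment(claim: str, tweet: str, stance: str) -> str:
    """LLM-based augmentation: ask a model for a new example that keeps
    the same stance toward the same claim."""
    prompt = (
        f"Claim: {claim}\n"
        f"Tweet with stance '{stance}': {tweet}\n"
        f"Write a new tweet expressing the same stance toward the claim."
    )
    return call_llm(prompt)

if __name__ == "__main__":
    print(synonym_replace("The vaccine is effective and causes no harm.", seed=42))
```

The traditional path is cheap and label-preserving by construction; the LLM path can produce more diverse text but, per the findings above, offers no guarantee the generated example keeps the intended stance.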
These findings matter for healthcare information security: they show that detecting pandemic misinformation requires more than simply generating additional AI training data, underscoring the need for specialized approaches to protecting public health information integrity.
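The domain-specific evaluation the findings call for can also be sketched: before adopting LLM augmentation, train the same model with and without synthetic data and compare scores separately per claim, since quality varied across claims. Everything below (the toy tweets, labels, and claim identifiers) is hypothetical and stands in for real labeled data, not the paper's protocol.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score

# Toy placeholder data (assumption); real experiments would use labeled
# tweets grouped by misinformation claim.
train_texts = ["vaccines alter dna", "vaccines are safe and tested",
               "masks do not work at all", "masks reduce transmission"]
train_labels = ["support", "deny", "support", "deny"]  # stance toward the claim
synthetic_texts = ["the shot rewrites your genes", "face coverings cut spread"]
synthetic_labels = ["support", "deny"]

test_by_claim = {
    "vaccine-dna": (["the jab changes your genome"], ["support"]),
    "masks-useless": (["cloth masks lower infection rates"], ["deny"]),
}

def train_and_score(texts, labels):
    """Fit one classifier, then report macro-F1 for each claim separately."""
    vec = TfidfVectorizer().fit(texts)
    clf = LogisticRegression().fit(vec.transform(texts), labels)
    return {
        claim: f1_score(y, clf.predict(vec.transform(x)), average="macro")
        for claim, (x, y) in test_by_claim.items()
    }

# Per-claim comparison: augmentation that helps on one claim may hurt another.
print("baseline: ", train_and_score(train_texts, train_labels))
print("augmented:", train_and_score(train_texts + synthetic_texts,
                                    train_labels + synthetic_labels))
```

The point of breaking scores out per claim is that an aggregate metric can hide exactly the inconsistency the study reports.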
Paper: Limited Effectiveness of LLM-based Data Augmentation for COVID-19 Misinformation Stance Detection