LLMs in Public Health: Performance Evaluation

This research evaluates how effectively Large Language Models can support public health professionals in analyzing and classifying health-related text data.

Combines 13 datasets (6 external, 7 new) to test LLM performance across public health tasks.
Assesses how well LLMs identify health burdens, epidemiological risk factors, and public health interventions in text.
Provides a systematic evaluation framework for determining LLM suitability in health applications.
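
As a rough illustration of what scoring LLM output per dataset could look like, the sketch below groups predictions by dataset and computes a macro-averaged F1 for each. The metric choice, the function names (macro_f1, score_by_dataset), and the example dataset and label names are assumptions made for illustration; the summary does not say which metrics or data formats the study uses.

```python
from collections import defaultdict


def macro_f1(gold, pred):
    """Macro-averaged F1 over every label seen in gold or pred."""
    labels = sorted(set(gold) | set(pred))
    scores = []
    for label in labels:
        tp = sum(1 for g, p in zip(gold, pred) if g == label and p == label)
        fp = sum(1 for g, p in zip(gold, pred) if g != label and p == label)
        fn = sum(1 for g, p in zip(gold, pred) if g == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the label never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores) if scores else 0.0


def score_by_dataset(records):
    """records: iterable of (dataset_name, gold_label, predicted_label)."""
    grouped = defaultdict(lambda: ([], []))
    for dataset, gold, pred in records:
        grouped[dataset][0].append(gold)
        grouped[dataset][1].append(pred)
    return {name: macro_f1(g, p) for name, (g, p) in grouped.items()}


# Hypothetical records: dataset names and labels are placeholders only.
records = [
    ("burden_demo", "relevant", "relevant"),
    ("burden_demo", "not_relevant", "relevant"),
    ("risk_factor_demo", "smoking", "smoking"),
    ("risk_factor_demo", "diet", "diet"),
]
print(score_by_dataset(records))
```

Reporting one score per dataset, rather than a single pooled number, keeps weak performance on a small or difficult task from being hidden by strong performance elsewhere.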

Practical applications include enhanced disease surveillance, more efficient health data processing, and support for evidence-based public health decision-making, potentially reducing expert workload while maintaining accuracy.

Evaluating Large Language Models for Public Health Classification and Extraction Tasks