LLM Battle: Llama3 vs DeepSeekR1 in Medical Text Analysis

This research evaluates two leading open-source large language models on biomedical text classification in zero-shot settings.

Llama3-70B and DeepSeekR1-distill-Llama3-70B were compared across six biomedical tasks.
Tests covered both social media health content and clinical notes from electronic health records.
Performance was measured using precision, recall, and F1 scores with 95% confidence intervals.
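
To make the reported metrics concrete, here is a minimal sketch of how precision, recall, and F1 can be computed for a binary task, with 95% confidence intervals estimated by a percentile bootstrap. The summary does not say how the study derived its intervals, so the bootstrap, the function names, and the toy labels below are all illustrative assumptions rather than the authors' method.

```python
import random


def precision_recall_f1(y_true, y_pred, positive=1):
    """Return (precision, recall, F1) for the given positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def bootstrap_ci(y_true, y_pred, n_resamples=1000, alpha=0.05, seed=0):
    """Percentile-bootstrap intervals for precision, recall, and F1.

    Assumption: examples are resampled with replacement; the study may
    have used a different interval method.
    """
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    for _ in range(n_resamples):
        idx = [rng.randrange(n) for _ in range(n)]
        stats.append(
            precision_recall_f1([y_true[i] for i in idx],
                                [y_pred[i] for i in idx])
        )
    intervals = []
    for k in range(3):  # 0 = precision, 1 = recall, 2 = F1
        values = sorted(s[k] for s in stats)
        lo = values[int((alpha / 2) * n_resamples)]
        hi = values[min(n_resamples - 1, int((1 - alpha / 2) * n_resamples))]
        intervals.append((lo, hi))
    return intervals


# Toy gold labels and model predictions (hypothetical, not study data).
gold = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
pred = [1, 1, 0, 1, 0, 1, 0, 0, 1, 0]

point = precision_recall_f1(gold, pred)
cis = bootstrap_ci(gold, pred)
for name, value, (lo, hi) in zip(("precision", "recall", "F1"), point, cis):
    print(f"{name}: {value:.2f} (95% CI {lo:.2f} to {hi:.2f})")
```

With only ten examples the intervals are very wide, which is the point of reporting them: two models with similar F1 scores may not be distinguishable on a small test set.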

This research matters because it helps healthcare professionals identify which LLMs perform best on specific medical text analysis tasks when no labeled training data is available, which could improve clinical decision support systems and health monitoring solutions.

Comparing Llama3 and DeepSeekR1 on Biomedical Text Classification Tasks