
When LLMs Clash: Evidence vs. Internal Knowledge
Measuring how models balance retrieved information against their pretrained knowledge
This research introduces a framework to quantify how LLMs resolve conflicts between their internal knowledge and external evidence in retrieval-augmented systems.
- Evidence-Model Balance: Introduces ClashEval, measuring how models weigh external information against internal priors
- Medical Safety: Reveals critical insights on how models handle incorrect drug dosage information
- Benchmark Creation: Developed comprehensive datasets specifically for testing information conflicts
- Practical Implications: Demonstrates when models trust or reject retrieved information, crucial for high-stakes applications
For medical AI applications, this research provides critical safety guidance by showing when models might propagate potentially harmful misinformation about treatments or dosages versus when they correctly rely on built-in medical knowledge.
ClashEval: Quantifying the tug-of-war between an LLM's internal prior and external evidence