
Hidden Biases Against Mental Health Groups in LLMs
How AI language models propagate stigma against vulnerable populations
This research shows that Large Language Models can generate unprovoked attack narratives targeting vulnerable mental health groups, and it introduces a framework for understanding how such bias propagates.
- Discovered differential treatment of mental health conditions, with some disorders facing markedly more severe stigmatization than others
- Developed a network-based framework to analyze how biases propagate through LLM-generated content (see the sketch after this list)
- Identified emergent patterns of harmful narratives that were not explicitly present in the training data
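
The network-based analysis can be pictured, in rough terms, as a co-occurrence graph that links mental health conditions to stigmatizing language in model outputs and then ranks conditions by how much stigma attaches to them. The sketch below is a minimal illustration of that idea, not the authors' implementation: the condition list, stigma lexicon, sentence-level co-occurrence window, and function names (`build_bias_network`, `stigma_load`) are all assumptions introduced here.

```python
# Minimal sketch of a co-occurrence-network analysis over LLM-generated text.
# The condition list, stigma lexicon, and sentence-level co-occurrence window
# are illustrative assumptions, not the paper's actual method.
import itertools
import re
from collections import Counter

import networkx as nx

CONDITIONS = ["schizophrenia", "depression", "anxiety", "bipolar disorder"]
STIGMA_TERMS = ["dangerous", "violent", "unstable", "incapable", "burden"]


def build_bias_network(passages):
    """Link each condition to stigma terms that co-occur in the same sentence."""
    graph = nx.Graph()
    graph.add_nodes_from(CONDITIONS, kind="condition")
    graph.add_nodes_from(STIGMA_TERMS, kind="stigma")
    weights = Counter()
    for passage in passages:
        for sentence in re.split(r"[.!?]+", passage.lower()):
            present_conditions = [c for c in CONDITIONS if c in sentence]
            present_terms = [t for t in STIGMA_TERMS if t in sentence]
            for cond, term in itertools.product(present_conditions, present_terms):
                weights[(cond, term)] += 1
    for (cond, term), w in weights.items():
        graph.add_edge(cond, term, weight=w)
    return graph


def stigma_load(graph):
    """Rank conditions by the total weight of stigma edges attached to them."""
    loads = {
        node: sum(data["weight"] for _, _, data in graph.edges(node, data=True))
        for node, kind in graph.nodes(data="kind")
        if kind == "condition"
    }
    return sorted(loads.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    sample = [
        "People with schizophrenia are often portrayed as violent and unstable.",
        "She manages her anxiety well and leads a full life.",
    ]
    network = build_bias_network(sample)
    for condition, load in stigma_load(network):
        print(f"{condition}: {load}")
```

Under the same assumptions, the graph structure is what makes differential treatment visible: conditions whose nodes accumulate heavier stigma-term edges stand out, and tracking how edges appear across successive generations of model output gives one way to study propagation.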
This work is important for the medical community because it exposes how AI systems may perpetuate and amplify harmful stereotypes about mental health conditions, potentially affecting both patient care and public perception.