
Visual Intelligence in Cognitive Therapy
Enhancing Mental Health Support with Multimodal AI
This research introduces a novel approach to cognitive reframing therapy that integrates visual evidence with language models to deliver more effective psychological support.
- Created the M2CoSC dataset, which pairs GPT-4 conversations with relevant visual elements
- Developed a multi-hop reasoning system that identifies emotions, analyzes visual cues, and generates appropriate therapeutic responses
- Demonstrated improved performance over text-only methods through comprehensive evaluation
- Showed potential for real-world therapy applications where non-verbal communication is crucial
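The multi-hop reasoning described above (identify the emotion, analyze the visual cues, then generate a response) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the names `ClientTurn`, `identify_emotion`, `analyze_visual_cues`, `generate_response`, and `multi_hop_reframe` are hypothetical, the keyword lookup and string templates stand in for the language and vision models, and none of it reproduces the paper's actual prompts or implementation.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class ClientTurn:
    """One client message plus visual evidence (hypothetical structure)."""
    text: str
    visual_tags: List[str]  # stand-in for cues a vision model would extract


# Toy keyword table; a real system would use a model, not string matching.
EMOTION_KEYWORDS = {
    "sadness": ("hopeless", "alone", "worthless"),
    "anxiety": ("worried", "nervous", "afraid"),
}


def identify_emotion(text: str) -> str:
    """Hop 1: guess the client's emotion from the text."""
    lowered = text.lower()
    for emotion, keywords in EMOTION_KEYWORDS.items():
        if any(word in lowered for word in keywords):
            return emotion
    return "neutral"


def analyze_visual_cues(visual_tags: List[str], emotion: str) -> str:
    """Hop 2: summarize the visual evidence in light of the emotion from hop 1."""
    if not visual_tags:
        return f"no visual evidence available for {emotion}"
    return f"visual cues ({', '.join(visual_tags)}) considered alongside {emotion}"


def generate_response(emotion: str, visual_summary: str) -> str:
    """Hop 3: draft a reframing prompt conditioned on both earlier hops."""
    return (
        f"It sounds like you may be feeling {emotion}. "
        f"Based on {visual_summary}, what is another way to look at this situation?"
    )


def multi_hop_reframe(turn: ClientTurn) -> Dict[str, str]:
    """Chain the three hops so each step can use the previous step's output."""
    emotion = identify_emotion(turn.text)
    visual_summary = analyze_visual_cues(turn.visual_tags, emotion)
    response = generate_response(emotion, visual_summary)
    return {"emotion": emotion, "visual_summary": visual_summary, "response": response}


if __name__ == "__main__":
    example = ClientTurn(text="I feel so alone lately", visual_tags=["slumped posture"])
    print(multi_hop_reframe(example)["response"])
```

The point of the sketch is the chaining: each hop receives the previous hop's output, so the final response is conditioned on both the inferred emotion and the visual evidence, unlike a single text-only step.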
This work matters for healthcare because it narrows the gap between text-only AI therapy and real-world treatment, where visual cues often provide essential context for effective intervention.
Multimodal Cognitive Reframing Therapy via Multi-hop Psychotherapeutic Reasoning