
AI-Powered Visual Grounding in Medical Imaging
Automating the connection between radiological text and image locations
This research develops a vision-language model that automatically links findings described in radiology reports to their corresponding locations in PET/CT images.
- Created an automated pipeline to generate weakly-supervised labels from existing reports (see the first sketch after this list)
- Trained a specialized 3D vision-language model for visual grounding in medical imaging (see the second sketch after this list)
- Demonstrated potential for improving radiology workflow by linking text descriptions to image findings
- Applied across multiple radiotracer types (FDG, DCFPyL, DOTATATE, Fluciclovine)
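To make the label-generation step concrete, below is a minimal sketch of one way such a weakly-supervised pipeline could work: positive-finding sentences are extracted from report text, lesion candidates are segmented from the PET volume by SUV thresholding, and sentences are matched to candidates via anatomical keywords. The keyword bins, SUV cutoff, and helper names are illustrative assumptions, not the pipeline described in the paper.

```python
import re

import numpy as np
from scipy import ndimage

# Hypothetical anatomical keyword -> axial z-range (as fractions of scan
# height). Placeholder values for illustration only.
REGION_BINS = {
    "lung": (0.55, 0.80),
    "liver": (0.40, 0.60),
    "pelvis": (0.05, 0.30),
}

def extract_positive_findings(report_text):
    """Keep report sentences that mention uptake and are not negated."""
    sentences = re.split(r"(?<=[.!?])\s+", report_text)
    return [
        s for s in sentences
        if "uptake" in s.lower()
        and not re.search(r"\b(no|without|negative for)\b", s.lower())
    ]

def candidate_lesions(pet_suv, suv_threshold=2.5):
    """Segment lesion candidates as connected components above an SUV cutoff."""
    labels, n = ndimage.label(pet_suv > suv_threshold)
    return [labels == i for i in range(1, n + 1)]

def weak_labels(finding, lesions, volume_shape):
    """Pair a finding sentence with candidates whose centroid lies in the
    axial range implied by an anatomical keyword in the sentence."""
    matched = []
    for keyword, (z_lo, z_hi) in REGION_BINS.items():
        if keyword in finding.lower():
            for lesion in lesions:
                z_centroid = np.argwhere(lesion)[:, 0].mean() / volume_shape[0]
                if z_lo <= z_centroid <= z_hi:
                    matched.append(lesion)
    return matched
```

Each resulting (sentence, lesion mask) pair can then serve as a weak training example for the grounding model, one version of which is sketched next.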
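And here is a minimal PyTorch sketch of what a 3D vision-language grounding head could look like: a small 3D CNN produces per-patch embeddings from stacked PET/CT volumes, a pooled text embedding is compared against them, and the cosine-similarity map serves as the grounding heatmap. The architecture, dimensions, and tokenization are illustrative assumptions; this is not a reproduction of the paper's model.

```python
import torch
import torch.nn as nn

class Grounding3D(nn.Module):
    """Sketch: 3D image encoder yields per-patch embeddings; a text embedding
    is compared against them to produce a similarity (grounding) map."""
    def __init__(self, embed_dim=128, vocab_size=5000):
        super().__init__()
        # Two-channel input: PET and CT volumes stacked along the channel axis.
        self.image_encoder = nn.Sequential(
            nn.Conv3d(2, 32, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.Conv3d(32, 64, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.Conv3d(64, embed_dim, kernel_size=1),
        )
        # Mean-pooled token embeddings stand in for a real text encoder.
        self.text_embed = nn.EmbeddingBag(vocab_size, embed_dim)

    def forward(self, volume, token_ids):
        # volume: (B, 2, D, H, W); token_ids: (B, T) report-phrase tokens
        patch_feats = nn.functional.normalize(self.image_encoder(volume), dim=1)
        text_feats = nn.functional.normalize(self.text_embed(token_ids), dim=1)
        # Cosine similarity between the phrase and every 3D patch location.
        return torch.einsum("bcdhw,bc->bdhw", patch_feats, text_feats)

model = Grounding3D()
vol = torch.randn(1, 2, 64, 64, 64)
tokens = torch.randint(0, 5000, (1, 12))
heatmap = model(vol, tokens)  # (1, 16, 16, 16) grounding map
```

Under this setup, the similarity map could be trained against downsampled weak lesion masks with, for example, a soft Dice or BCE loss.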
This approach addresses a key gap in medical AI by enabling precise localization of lesions and abnormalities without extensive manual annotation, which could improve diagnostic accuracy and radiologist efficiency.
Vision-Language Modeling in PET/CT for Visual Grounding of Positive Findings