Vision-Enhanced LLMs for Safer Autonomous Driving

Combining Visual Processing with LLM Reasoning for Complex Road Scenarios

This research introduces a novel autonomous driving assistance system that integrates vision capabilities with LLM reasoning to improve decision-making in challenging road situations.

  • Combines YOLOv4 and Vision Transformer (ViT) for comprehensive visual feature extraction
  • Leverages GPT to reason about spatial relationships in driving scenarios (see the pipeline sketch after this list)
  • Outperforms traditional autonomous systems in complex, unexpected scenarios
  • Addresses critical safety concerns by improving decision quality in potentially hazardous driving situations

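To make the described pipeline concrete, below is a minimal sketch (not the authors' implementation) of how the components could be wired together: YOLOv4 detections and a ViT-derived scene summary are serialized into a text prompt that an LLM then reasons over. The helpers `run_yolov4`, `run_vit`, and `query_llm` are hypothetical stand-ins for the actual models and API calls.

```python
# Minimal sketch of a vision-to-LLM driving-assistance pipeline.
# All three model wrappers below are hypothetical placeholders.

from dataclasses import dataclass
from typing import List


@dataclass
class Detection:
    label: str          # object class, e.g. "pedestrian"
    confidence: float   # detector score in [0, 1]
    box: tuple          # (x, y, w, h) in pixels


def run_yolov4(frame) -> List[Detection]:
    """Hypothetical wrapper around a YOLOv4 object detector."""
    return [Detection("pedestrian", 0.91, (412, 220, 60, 140))]


def run_vit(frame) -> str:
    """Hypothetical wrapper that maps a ViT scene embedding to a
    coarse scene description (e.g. via a small classification head)."""
    return "wet road, low visibility"


def build_prompt(detections: List[Detection], scene_tag: str) -> str:
    """Serialize visual features into text the LLM can reason over."""
    objects = "; ".join(
        f"{d.label} (conf {d.confidence:.2f}) at box {d.box}"
        for d in detections
    )
    return (
        f"Scene context: {scene_tag}.\n"
        f"Detected objects: {objects}.\n"
        "As a driving assistant, describe the key hazards and "
        "recommend a safe maneuver."
    )


def query_llm(prompt: str) -> str:
    """Hypothetical LLM call (e.g. a GPT chat-completion endpoint)."""
    return "Slow down and yield: pedestrian ahead on a wet road."


def assist(frame) -> str:
    """Run one frame through the full perception-to-reasoning loop."""
    prompt = build_prompt(run_yolov4(frame), run_vit(frame))
    return query_llm(prompt)
```

The key design point this sketch illustrates is the interface: visual features are converted into structured text so that the LLM can apply general-purpose reasoning to spatial relationships the detectors alone cannot interpret.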
This approach represents a significant advancement for vehicle safety systems by enabling more human-like understanding of road conditions, improving trust in autonomous technology, and potentially reducing accidents in edge cases where traditional systems fail.

Vision-Integrated LLMs for Autonomous Driving Assistance: Human Performance Comparison and Trust Evaluation