Evaluating MLLMs for Autonomous Driving

A systematic framework for assessing multimodal AI capabilities in self-driving vehicles

This research introduces a structured evaluation framework for assessing how Multimodal Large Language Models (MLLMs) perform in autonomous driving scenarios.

  • Combines domain-independent knowledge with context-specific guidance
  • Evaluates MLLMs across perception, reasoning, and planning capabilities (see the sketch after this list)
  • Provides a systematic approach beyond proof-of-concept applications
  • Addresses key security and engineering considerations for real-world implementation
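To make the capability dimensions above concrete, the sketch below shows one way such a capability-driven evaluation harness could be organized in Python. It is a minimal illustration under assumed names: ScenarioItem, EvaluationReport, evaluate, and the naive string-match scoring are all illustrative placeholders, not the framework's actual implementation.

    # Minimal sketch of a capability-driven evaluation harness.
    # All names, structures, and the scoring rule are illustrative assumptions.
    from dataclasses import dataclass, field
    from typing import Callable, Dict, List

    # Capability dimensions named on this slide.
    CAPABILITIES = ("perception", "reasoning", "planning")

    @dataclass
    class ScenarioItem:
        """One driving-scene question probing a single capability."""
        capability: str   # one of CAPABILITIES
        image_path: str   # camera frame of the driving scene
        question: str     # e.g. "Is a pedestrian about to cross?"
        expected: str     # reference answer used for scoring

    @dataclass
    class EvaluationReport:
        """Per-capability accuracy scores."""
        scores: Dict[str, float] = field(default_factory=dict)

    def evaluate(model: Callable[[str, str], str],
                 items: List[ScenarioItem]) -> EvaluationReport:
        """Query the MLLM on each scenario item and aggregate accuracy per capability."""
        correct = {c: 0 for c in CAPABILITIES}
        total = {c: 0 for c in CAPABILITIES}
        for item in items:
            answer = model(item.image_path, item.question)
            total[item.capability] += 1
            # Placeholder metric: substring match against the reference answer.
            if item.expected.lower() in answer.lower():
                correct[item.capability] += 1
        report = EvaluationReport()
        for c in CAPABILITIES:
            report.scores[c] = correct[c] / total[c] if total[c] else 0.0
        return report

    # Example usage with a stub model that always answers "yes".
    if __name__ == "__main__":
        stub = lambda image, question: "yes"
        items = [ScenarioItem("perception", "frame_001.png",
                              "Is a pedestrian visible?", "yes")]
        print(evaluate(stub, items).scores)

Keeping the model behind a plain callable keeps the harness independent of any particular MLLM API; the same loop can wrap different models or prompting strategies while the per-capability scoring stays fixed.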

This framework gives engineering teams developing autonomous systems a standardized way to assess AI capabilities before deploying them in safety-critical driving environments.

A Framework for a Capability-driven Evaluation of Scenario Understanding for Multimodal Large Language Models in Autonomous Driving
