Benchmarking Spatial Intelligence for Self-Driving Cars

This research introduces NuScenes-SpatialQA, the first large-scale benchmark to evaluate how well vision-language models (VLMs) understand and reason about spatial relationships in autonomous driving scenarios.

Creates a ground-truth-based question-answer dataset specifically for driving scenarios
Systematically evaluates VLMs' spatial reasoning capabilities, filling a critical gap in current benchmarks
Tests models on real-world driving situations where spatial understanding directly impacts safety
Provides a foundation for improving AI systems that power autonomous vehicles
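
To make "ground-truth-based" concrete, here is a minimal sketch of how spatial question-answer pairs could be derived from annotated 3D object positions. It is an illustration only, not the benchmark's actual generation pipeline: the `Box3D` class, the ego-vehicle coordinate frame, the question templates, and the function names are all assumptions made for this example.

```python
import math
from dataclasses import dataclass


@dataclass
class Box3D:
    """Hypothetical annotated object: a label plus a 3D center in meters.

    Coordinates are assumed to be in the ego-vehicle frame, with the
    ego vehicle at the origin.
    """
    label: str
    center: tuple[float, float, float]


def distance_qa(a: Box3D, b: Box3D) -> dict:
    """Quantitative question: Euclidean distance between two object centers."""
    d = math.dist(a.center, b.center)
    return {
        "question": f"How far is the {a.label} from the {b.label}, in meters?",
        "answer": round(d, 1),
    }


def closer_qa(a: Box3D, b: Box3D) -> dict:
    """Qualitative question: which object is nearer to the ego vehicle."""
    dist_a = math.hypot(*a.center)  # distance from the origin (ego vehicle)
    dist_b = math.hypot(*b.center)
    return {
        "question": f"Is the {a.label} closer to the ego vehicle than the {b.label}?",
        "answer": "yes" if dist_a < dist_b else "no",
    }


# Example with made-up positions (not real dataset values).
car = Box3D("car", (3.0, 4.0, 0.0))
pedestrian = Box3D("pedestrian", (3.0, 0.0, 0.0))
qa_pairs = [distance_qa(car, pedestrian), closer_qa(car, pedestrian)]
```

Because the answers are computed from annotations rather than written by hand, a model's response can be scored against them automatically, which is the general appeal of building a benchmark on ground truth.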

Why it matters: Robust spatial reasoning is essential for autonomous vehicle safety and reliability. This benchmark helps identify current limitations in AI models and provides a pathway to develop more trustworthy self-driving systems.

NuScenes-SpatialQA: A Spatial Understanding and Reasoning Benchmark for Vision-Language Models in Autonomous Driving