Benchmarking Spatial Intelligence for Self-Driving Cars

First comprehensive test for vision-language models in autonomous driving

This research introduces NuScenes-SpatialQA, the first large-scale benchmark to evaluate how well vision-language models (VLMs) understand and reason about spatial relationships in autonomous driving scenarios.

  • Creates a ground-truth-based question-answer dataset specifically for driving scenarios
  • Systematically evaluates VLMs' spatial reasoning capabilities, an area that current benchmarks leave largely untested
  • Tests models on real-world driving situations where spatial understanding directly impacts safety
  • Provides a foundation for improving the AI systems that power autonomous vehicles
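
To make the idea of a ground-truth-based question concrete, here is a minimal sketch of how a spatial question-answer pair could be derived from annotated object positions. It is an illustration only, not the benchmark's actual generation pipeline: the `SceneObject` class, the question template, and the coordinate convention (x forward, y left, in metres, relative to the ego vehicle) are all assumptions made for this example.

```python
import math
from dataclasses import dataclass


@dataclass
class SceneObject:
    """Hypothetical annotated object; positions are assumed to be in metres."""
    name: str
    x: float  # assumed: distance ahead of the ego vehicle
    y: float  # assumed: distance to the left of the ego vehicle


def distance_to_ego(obj: SceneObject) -> float:
    """Planar Euclidean distance from the ego vehicle (at the origin) to the object."""
    return math.hypot(obj.x, obj.y)


def closer_question(a: SceneObject, b: SceneObject) -> dict:
    """Build one yes/no question whose answer is computed from the annotations."""
    question = f"Is the {a.name} closer to the ego vehicle than the {b.name}?"
    answer = "yes" if distance_to_ego(a) < distance_to_ego(b) else "no"
    return {"question": question, "answer": answer}


pedestrian = SceneObject("pedestrian", x=5.0, y=1.0)
truck = SceneObject("truck", x=20.0, y=-3.0)
qa = closer_question(pedestrian, truck)
print(qa["question"], "->", qa["answer"])
```

Because the answer is computed from the annotations rather than written by hand, a model's reply can be scored automatically against it.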

Why it matters: Robust spatial reasoning is essential for autonomous vehicle safety and reliability. This benchmark helps identify the current limitations of AI models and offers a path toward more trustworthy self-driving systems.

NuScenes-SpatialQA: A Spatial Understanding and Reasoning Benchmark for Vision-Language Models in Autonomous Driving
