Vision-Powered PDDL Generation

Using Vision-Language Models to Automate Planning for Robots in Education

Image2PDDL is a framework that uses Vision-Language Models (VLMs) to automatically convert visual scenes into PDDL planning problems, enabling robots to understand and plan from what they perceive.

  • Integrates vision capabilities with Planning Domain Definition Language (PDDL) generation
  • Bridges the gap between visual perception and symbolic action planning
  • Evaluated in robot-assisted teaching scenarios
  • Shows promise for supporting students with Autism Spectrum Disorder

This research opens new possibilities for educational robots that can perceive classroom environments, understand learning contexts, and plan appropriate instructional activities without extensive manual programming.
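To make the idea concrete, the sketch below shows one way a VLM could be prompted to turn a single image into a PDDL problem file. This is not the Image2PDDL implementation: the OpenAI client, the gpt-4o model name, the blocks-world domain, and the prompt wording are all illustrative assumptions.

# Minimal sketch: asking a vision-language model to emit a PDDL problem
# for a photo of a tabletop scene. Illustrative only; the actual
# Image2PDDL pipeline, prompts, and models may differ.
import base64
from openai import OpenAI  # assumes the official openai Python client is installed

# A tiny blocks-world domain the generated problem must conform to
# (hypothetical example, not taken from the paper).
DOMAIN_PDDL = """
(define (domain blocksworld)
  (:predicates (on ?x ?y) (ontable ?x) (clear ?x) (handempty) (holding ?x))
  (:action pick-up
    :parameters (?x)
    :precondition (and (clear ?x) (ontable ?x) (handempty))
    :effect (and (holding ?x) (not (ontable ?x)) (not (clear ?x)) (not (handempty)))))
"""

def image_to_pddl_problem(image_path: str, model: str = "gpt-4o") -> str:
    """Ask a VLM to describe the scene in the image as a PDDL problem for DOMAIN_PDDL."""
    with open(image_path, "rb") as f:
        image_b64 = base64.b64encode(f.read()).decode("utf-8")

    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    response = client.chat.completions.create(
        model=model,
        messages=[
            {"role": "system",
             "content": "You translate photos of tabletop scenes into PDDL problem files."},
            {"role": "user",
             "content": [
                 {"type": "text",
                  "text": ("Given this domain:\n" + DOMAIN_PDDL +
                           "\nReturn ONLY a (define (problem ...)) block whose objects, "
                           ":init, and :goal describe the objects visible in the image.")},
                 {"type": "image_url",
                  "image_url": {"url": f"data:image/jpeg;base64,{image_b64}"}},
             ]},
        ],
    )
    return response.choices[0].message.content

if __name__ == "__main__":
    # Hypothetical input image of a classroom manipulation task.
    print(image_to_pddl_problem("classroom_scene.jpg"))

In a complete pipeline the returned problem string would be validated against the domain and handed to an off-the-shelf PDDL planner; here it is simply printed so the generated structure can be inspected.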
