Making Robots Understand Human Intent Naturally

Combining Voice Commands with Pointing Gestures through LLM Integration

This research introduces a multimodal interaction framework that enables more intuitive human-robot communication by combining verbal commands with natural pointing gestures.

  • Addresses the difficulties elderly users face with complex command syntax and traditional gesture-only interfaces
  • Integrates voice commands with deictic posture information (pointing)
  • Leverages Large Language Models to interpret the combined multimodal inputs (see the sketch after this list)
  • Creates a more accessible interface for service robots in healthcare and elderly care settings
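
The paper does not spell out its pipeline here, so the following is a minimal, self-contained sketch of how a transcribed voice command and a pointing ray could be fused into a single LLM query. The scene objects, the alignment heuristic, the prompt wording, and the `query_llm` stub are all illustrative assumptions, not the authors' implementation.

```python
# Hedged sketch: fuse a voice transcript with a deictic pointing ray and ask an
# LLM to resolve words like "that" to a concrete object. All names are illustrative.
import math
from dataclasses import dataclass

@dataclass
class SceneObject:
    name: str
    position: tuple  # (x, y, z) in the robot's frame, metres

def pointing_score(origin, direction, target):
    """Cosine of the angle between the pointing ray and the ray toward the object.
    Values near 1.0 mean the user is pointing almost directly at the object."""
    to_obj = [t - o for t, o in zip(target, origin)]
    obj_norm = math.sqrt(sum(c * c for c in to_obj)) or 1e-9
    dir_norm = math.sqrt(sum(c * c for c in direction)) or 1e-9
    return sum(a * b for a, b in zip(direction, to_obj)) / (obj_norm * dir_norm)

def build_prompt(utterance, origin, direction, scene):
    # Rank candidate objects by gesture alignment, then hand both modalities
    # to the LLM as plain text so it can ground the deictic reference.
    ranked = sorted(scene, key=lambda o: -pointing_score(origin, direction, o.position))
    candidates = ", ".join(
        f"{o.name} (alignment {pointing_score(origin, direction, o.position):.2f})"
        for o in ranked
    )
    return (
        "You are a service-robot task planner.\n"
        f"Voice command: \"{utterance}\"\n"
        f"Objects ranked by pointing-gesture alignment: {candidates}\n"
        "Resolve deictic words (this, that, there) to one object and reply "
        "with a single action, e.g. BRING(cup)."
    )

def query_llm(prompt):
    # Placeholder for a real LLM API call; returns a canned action for the demo.
    print(prompt)
    return "BRING(water_bottle)"

if __name__ == "__main__":
    scene = [SceneObject("water_bottle", (1.0, 0.2, 0.8)),
             SceneObject("tv_remote", (0.4, -1.1, 0.5)),
             SceneObject("medicine_box", (1.1, 0.3, 0.8))]
    # Shoulder-to-wrist ray from a skeleton tracker (illustrative values).
    action = query_llm(build_prompt("bring me that, please",
                                    origin=(0.0, 0.0, 1.2),
                                    direction=(0.9, 0.25, -0.3),
                                    scene=scene))
    print("Planned action:", action)
```

In this kind of design, the geometric ranking keeps the LLM grounded in what the user is physically indicating, while the language model handles the loose, natural phrasing that elderly users are more comfortable with.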

Business Impact: As populations age globally, this technology could significantly improve elderly care by making service robots more accessible to users with limited technological familiarity or physical capabilities.

Natural Multimodal Fusion-Based Human-Robot Interaction: Application With Voice and Deictic Posture via Large Language Model
