Revolutionizing Human-Robot Interaction with Gaze and Speech

FAM-HRI integrates gaze tracking and speech recognition through foundation models to create more intuitive and accessible human-robot interaction systems.

Combines eye-tracking data and natural language inputs for efficient robot control
Leverages foundation models to interpret multimodal human inputs with higher accuracy
Reduces interaction ambiguity compared to traditional gesture-only or voice-only systems
Enables seamless control for users with physical impairments or limited mobility
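
To make the idea of gaze resolving an ambiguous spoken command concrete, here is a minimal sketch. It is not FAM-HRI's method, which relies on foundation models. It is a simple nearest-to-gaze heuristic, and every name in it (SceneObject, resolve_referent, the scene coordinates) is a hypothetical chosen for illustration.

```python
from dataclasses import dataclass
from math import hypot


@dataclass
class SceneObject:
    name: str   # hypothetical object identifier
    label: str  # object category, e.g. "cup"
    x: float    # position in normalized image coordinates (assumed)
    y: float


def resolve_referent(command, gaze_xy, objects):
    """Pick the object whose label appears in the command and that
    lies closest to the gaze point. Returns None if no label matches."""
    words = command.lower().split()
    candidates = [o for o in objects if o.label in words]
    if not candidates:
        return None
    gx, gy = gaze_xy
    return min(candidates, key=lambda o: hypot(o.x - gx, o.y - gy))


# Two cups make "the cup" ambiguous from speech alone.
scene = [
    SceneObject("cup_left", "cup", 0.2, 0.5),
    SceneObject("cup_right", "cup", 0.8, 0.5),
    SceneObject("bottle", "bottle", 0.75, 0.55),
]

target = resolve_referent("pick up the cup", (0.78, 0.52), scene)
print(target.name)  # cup_right
```

Speech narrows the candidates to the two cups, and the gaze point selects between them. A real system would also need to handle noisy fixations, synonyms, and commands that name no visible object.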

Medical Impact: Because it does not rely on physical gestures, this approach makes assistive robots more accessible to patients with motor impairments, and it could improve rehabilitation and daily assistance for people with disabilities.

Source: FAM-HRI: Foundation-Model Assisted Multi-Modal Human-Robot Interaction Combining Gaze and Speech