Enhancing Gaze Estimation with AI

Using text-guided multimodal learning to improve accuracy and applications

GazeCLIP pairs language with vision to improve gaze estimation accuracy over traditional image-only approaches.

  • Combines the transferability of the CLIP model with linguistic input
  • Improves accuracy across diverse application scenarios
  • Enables more reliable multimodal gaze tracking systems
  • Shows particular promise for security applications
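
The idea of text guidance can be illustrated with a minimal sketch. It is not the GazeCLIP architecture: the prompt wording, the similarity-weighted fusion, the linear regression head, and every name below are assumptions made for illustration, and small hand-written vectors stand in for the embeddings that CLIP's image and text encoders would produce.

```python
import math

# Hypothetical coarse gaze-direction prompts (wording is an assumption).
PROMPTS = [
    "A photo of a face gazing left.",
    "A photo of a face gazing right.",
    "A photo of a face gazing up.",
    "A photo of a face gazing down.",
]


def normalize(v):
    """Scale a vector to unit length."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse(image_emb, text_embs, temperature=0.07):
    """Weight each prompt embedding by its cosine similarity to the image,
    then add the weighted text feature to the image feature."""
    img = normalize(image_emb)
    txt = [normalize(t) for t in text_embs]
    weights = softmax([dot(img, t) / temperature for t in txt])
    guided = [sum(w * t[i] for w, t in zip(weights, txt)) for i in range(len(img))]
    fused = normalize([a + b for a, b in zip(img, guided)])
    return fused, weights


def regress_gaze(fused, head):
    """Hypothetical linear head mapping the fused feature to (yaw, pitch)."""
    return tuple(dot(row, fused) for row in head)


# Stand-in embeddings: one axis per prompt, purely illustrative.
text_embs = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
image_emb = [0.1, 0.9, 0.2, 0.0]  # closest to the "right" prompt
head = [[0.5, -0.5, 0.0, 0.0], [0.0, 0.0, 0.5, -0.5]]  # untrained, made up

fused, weights = fuse(image_emb, text_embs)
yaw, pitch = regress_gaze(fused, head)
best_prompt = PROMPTS[weights.index(max(weights))]
```

In this sketch the prompt most similar to the image dominates the text feature, so the linguistic signal nudges the fused representation toward a coarse direction before the regression head refines it. In a real system the embeddings would come from a pretrained vision-language model and the head would be trained on gaze labels.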

In security contexts, more accurate gaze estimation could improve user authentication, attention monitoring, and surveillance precision, supporting security systems that are more reliable and less intrusive.

GazeCLIP: Enhancing Gaze Estimation Through Text-Guided Multimodal Learning
