25/04/2020

Enhancing Mobile Voice Assistants with WorldGaze

Sven Mayer, Gierad Laput, Chris Harrison

Keywords: worldgaze, interaction techniques, mobile interaction

Abstract: Contemporary voice assistants require that objects of inter-est be specified in spoken commands. Of course, users are often looking directly at the object or place of interest ? fine-grained, contextual information that is currently unused. We present WorldGaze, a software-only method for smartphones that provides the real-world gaze location of a user that voice agents can utilize for rapid, natural, and precise interactions. We achieve this by simultaneously opening the front and rear cameras of a smartphone. The front-facing camera is used to track the head in 3D, including estimating its direction vector. As the geometry of the front and back cameras are fixed and known, we can raycast the head vector into the 3D world scene as captured by the rear-facing camera. This allows the user to intuitively define an object or region of interest using their head gaze. We started our investigations with a qualitative exploration of competing methods, before developing a functional, real-time implementation. We conclude with an evaluation that shows WorldGaze can be quick and accurate, opening new multimodal gaze+voice interactions for mobile voice agents.

The video of this talk cannot be embedded. You can watch it here:
https://www.youtube.com/watch?v=BtT-0rIzoV4
(Link will open in new window)
 0
 0
 0
 0
This is an embedded video. Talk and the respective paper are published at CHI 2020 virtual conference. If you are one of the authors of the paper and want to manage your upload, see the question "My papertalk has been externally embedded..." in the FAQ section.

Comments

Post Comment
no comments yet
code of conduct: tbd Characters remaining: 140

Similar Papers