Enabling Voice-Accompanying Hand-to-Face Gesture Recognition with Cross-Device Sensing

by Zisu Li, et al.

Gestures that accompany speech are essential to voice interaction because they convey complementary semantics for interaction purposes such as wake-up state and input modality. In this paper, we investigated voice-accompanying hand-to-face (VAHF) gestures for voice interaction. We targeted hand-to-face gestures because they relate closely to speech and yield significant acoustic features (e.g., by impeding voice propagation). We conducted a user study to explore the design space of VAHF gestures: we first gathered candidate gestures and then applied a structural analysis to them along several dimensions (e.g., contact position and type), yielding a total of 8 VAHF gestures with good usability and the least confusion. To facilitate VAHF gesture recognition, we proposed a novel cross-device sensing method that leverages heterogeneous channels of data (vocal, ultrasound, and IMU) from commodity devices (earbuds, watches, and rings). Our recognition model achieved an accuracy of 97.3% for recognizing 3 gestures and 91.5% for recognizing 8 gestures (excluding the "empty" gesture), demonstrating its high applicability. A quantitative analysis also sheds light on the recognition capability of each sensor channel and of their different combinations. Finally, we illustrated feasible use cases and their design principles to demonstrate the applicability of our system in various scenarios.
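To make the idea of combining heterogeneous channels concrete, the sketch below shows one simple way such data could be fused and compared across channel combinations. It is not the recognition model described in the paper. It assumes each channel has already been reduced to a short feature vector, concatenates those vectors (feature-level fusion), and classifies with a nearest-centroid rule. The feature values, the gesture label `cover_mouth`, and all function names are hypothetical and chosen only for illustration.

```python
from itertools import combinations
from math import dist

# Channel names follow the abstract; everything else here is an assumption.
CHANNELS = ("vocal", "ultrasound", "imu")


def fuse(sample, channels=CHANNELS):
    """Concatenate the per-channel feature vectors of one sample."""
    fused = []
    for name in channels:
        fused.extend(sample[name])
    return fused


def fit_centroids(samples, labels, channels=CHANNELS):
    """Average the fused vectors of each gesture label."""
    sums, counts = {}, {}
    for sample, label in zip(samples, labels):
        vec = fuse(sample, channels)
        if label not in sums:
            sums[label] = [0.0] * len(vec)
            counts[label] = 0
        sums[label] = [s + v for s, v in zip(sums[label], vec)]
        counts[label] += 1
    return {
        label: [s / counts[label] for s in total]
        for label, total in sums.items()
    }


def predict(centroids, sample, channels=CHANNELS):
    """Return the label whose centroid is closest to the fused sample."""
    vec = fuse(sample, channels)
    return min(centroids, key=lambda label: dist(centroids[label], vec))


def channel_subsets(channels=CHANNELS):
    """Yield every non-empty combination of channels, for ablation runs."""
    for size in range(1, len(channels) + 1):
        yield from combinations(channels, size)


# Toy, made-up feature vectors (not data from the paper).
train_samples = [
    {"vocal": [0.9, 0.1], "ultrasound": [0.8], "imu": [0.2, 0.1]},
    {"vocal": [0.8, 0.2], "ultrasound": [0.7], "imu": [0.3, 0.1]},
    {"vocal": [0.1, 0.9], "ultrasound": [0.1], "imu": [0.9, 0.8]},
    {"vocal": [0.2, 0.8], "ultrasound": [0.2], "imu": [0.8, 0.9]},
]
train_labels = ["cover_mouth", "cover_mouth", "empty", "empty"]
query = {"vocal": [0.85, 0.15], "ultrasound": [0.75], "imu": [0.25, 0.1]}

if __name__ == "__main__":
    for subset in channel_subsets():
        centroids = fit_centroids(train_samples, train_labels, subset)
        print(subset, predict(centroids, query, subset))
```

Looping over `channel_subsets()` mirrors, in a very reduced form, the kind of per-channel and per-combination comparison the abstract mentions. A real system would replace the toy vectors with features extracted from synchronized earbud, watch, and ring signals, and would likely use a learned classifier.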




Acoustic Sensing-based Hand Gesture Detection for Wearable Device Interaction

Hand gesture recognition attracts great attention for interaction since ...

VGPN: Voice-Guided Pointing Robot Navigation for Humans

Pointing gestures are widely used in robot navigation approaches nowadays...

TeethTap: Recognizing Discrete Teeth Gestures Using Motion and Acoustic Sensing on an Earpiece

Teeth gestures become an alternative input modality for different situat...

Learning to recognize touch gestures: recurrent vs. convolutional features and dynamic sampling

We propose a fully automatic method for learning gestures on big touch d...

Transfer: Cross Modality Knowledge Transfer using Adversarial Networks – A Study on Gesture Recognition

Knowledge transfer across sensing technology is a novel concept that has...

Continuous interaction with a smart speaker via low-dimensional embeddings of dynamic hand pose

This paper presents a new continuous interaction strategy with visual fe...

OESense: Employing Occlusion Effect for In-ear Human Sensing

Smart earbuds are recognized as a new wearable platform for personal-sca...
