Multi-modal Fusion for Single-Stage Continuous Gesture Recognition

11/10/2020
by Harshala Gammulle, et al.

Gesture recognition is a much-studied research area with myriad real-world applications, including robotics and human-machine interaction. Current gesture recognition methods have focused heavily on isolated gestures, and existing continuous gesture recognition methods are limited by a two-stage approach in which independent models are required for detection and classification, with the performance of the latter constrained by detection performance. In contrast, we introduce a single-stage continuous gesture recognition model that can detect and classify multiple gestures in a single video via a single model. This approach learns the natural transitions between gestures and non-gestures without the need for a pre-processing segmentation stage to detect individual gestures. To enable this, we introduce a multi-modal fusion mechanism that supports the integration of important information flowing from multi-modal inputs and is scalable to any number of modes. Additionally, we propose Unimodal Feature Mapping (UFM) and Multi-modal Feature Mapping (MFM) models to map uni-modal features and the fused multi-modal features, respectively. To further enhance performance, we propose a mid-point based loss function that encourages smooth alignment between the ground truth and the prediction. We demonstrate the utility of the proposed framework, which handles variable-length input videos and outperforms the state of the art on two challenging datasets, EgoGesture and IPN Hand. Furthermore, ablative experiments show the importance of the different components of the proposed framework.
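The overall shape of the pipeline described above (per-mode UFM, scalable fusion, MFM producing per-frame gesture/non-gesture labels, plus a mid-point notion for alignment) can be sketched with toy stand-ins. This is a minimal illustration only: the `ufm` scaling, concatenation-based `fuse`, nearest-prototype `mfm`, and the `midpoint` helper are all hypothetical simplifications, not the paper's learned networks or its actual loss.

```python
from typing import List

Vector = List[float]

def ufm(frames: List[Vector], scale: float) -> List[Vector]:
    """Unimodal Feature Mapping (toy stand-in): map one modality's
    per-frame features; a real UFM would be a learned network."""
    return [[scale * x for x in f] for f in frames]

def fuse(modes: List[List[Vector]]) -> List[Vector]:
    """Fuse any number of modalities by per-frame concatenation, so the
    mechanism scales to N modes (the paper's fusion may differ)."""
    n_frames = len(modes[0])
    assert all(len(m) == n_frames for m in modes), "modes must be frame-aligned"
    return [sum((m[t] for m in modes), []) for t in range(n_frames)]

def mfm(fused: List[Vector], class_protos: List[Vector]) -> List[int]:
    """Multi-modal Feature Mapping (toy stand-in): per-frame
    highest-scoring prototype, with class 0 reserved for 'non-gesture'."""
    def score(f: Vector, p: Vector) -> float:
        return sum(a * b for a, b in zip(f, p))
    return [max(range(len(class_protos)), key=lambda c: score(f, class_protos[c]))
            for f in fused]

def midpoint(labels: List[int], cls: int) -> float:
    """Mid-point (frame index) of the span labelled `cls`; a mid-point
    based loss could, e.g., penalise |predicted mid - ground-truth mid|."""
    idx = [t for t, l in enumerate(labels) if l == cls]
    return (idx[0] + idx[-1]) / 2 if idx else -1.0

# Two modalities (e.g. RGB and depth), 4 frames, 2 features each.
rgb   = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
depth = [[0.8, 0.2], [0.7, 0.3], [0.2, 0.8], [0.3, 0.7]]

fused = fuse([ufm(rgb, 1.0), ufm(depth, 0.5)])
protos = [[1, 0, 0.5, 0], [0, 1, 0, 0.5]]  # class 0 = non-gesture, class 1 = a gesture
labels = mfm(fused, protos)
print(labels)              # per-frame labels over the whole video
print(midpoint(labels, 1)) # mid-point of the detected gesture span
```

Because classification runs over every frame of a variable-length video in one pass, gesture boundaries fall out of the per-frame labels themselves, with no separate detection stage.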


Related research

- BEAT: A Large-Scale Semantic and Emotional Multi-Modal Dataset for Conversational Gestures Synthesis (03/10/2022). Achieving realistic, vivid, and human-like synthesized conversational ge...
- Multi-Task and Multi-Modal Learning for RGB Dynamic Gesture Recognition (10/29/2021). Gesture recognition is getting more and more popular due to various appl...
- Progression Modelling for Online and Early Gesture Detection (09/14/2019). Online and Early detection of gestures is crucial for building touchless...
- A Generic Multi-modal Dynamic Gesture Recognition System using Machine Learning (09/16/2018). Human computer interaction facilitates intelligent communication between...
- A Non-Anatomical Graph Structure for isolated hand gesture separation in continuous gesture sequences (07/15/2022). Continuous Hand Gesture Recognition (CHGR) has been extensively studied ...
- Let's face it: Probabilistic multi-modal interlocutor-aware generation of facial gestures in dyadic settings (06/11/2020). To enable more natural face-to-face interactions, conversational agents ...
- Multi-modal data fusion of Voice and EMG data for Robotic Control (01/06/2022). Wearable electronic equipment is constantly evolving and is increasing t...
