Modality Mixer for Multi-modal Action Recognition

08/24/2022
by Sumin Lee, et al.
In multi-modal action recognition, it is important to consider not only the complementary nature of different modalities but also global action content. In this paper, we propose a novel network, named the Modality Mixer (M-Mixer) network, that leverages complementary information across modalities and the temporal context of an action for multi-modal action recognition. We also introduce a simple yet effective recurrent unit, called the Multi-modal Contextualization Unit (MCU), which is a core component of M-Mixer. The MCU temporally encodes a sequence of one modality (e.g., RGB) with action content features of other modalities (e.g., depth, IR). This process encourages M-Mixer to exploit global action content and also to supplement complementary information from other modalities. As a result, our proposed method outperforms state-of-the-art methods on the NTU RGB+D 60, NTU RGB+D 120, and NW-UCLA datasets. Moreover, we demonstrate the effectiveness of M-Mixer through comprehensive ablation studies.
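
The abstract describes the MCU only at a high level, so the following is a minimal illustrative sketch, not the authors' implementation. It assumes a GRU-style gated cell whose input at each time step is concatenated with a fixed context vector, and it assumes that the "action content features" of the other modalities can be approximated by temporal average pooling. The class name `ContextGatedRecurrentUnit`, the helper `global_content`, and all feature dimensions are hypothetical.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def global_content(sequence):
    # Assumption: summarize a (time, features) sequence by temporal mean pooling.
    return sequence.mean(axis=0)


class ContextGatedRecurrentUnit:
    """Hypothetical GRU-style cell conditioned on a cross-modal context vector."""

    def __init__(self, input_dim, context_dim, hidden_dim, seed=0):
        rng = np.random.default_rng(seed)
        concat_dim = input_dim + context_dim + hidden_dim
        scale = 1.0 / np.sqrt(concat_dim)
        # One weight matrix each for the update gate, reset gate, and candidate state.
        self.W_z = rng.normal(0.0, scale, (concat_dim, hidden_dim))
        self.W_r = rng.normal(0.0, scale, (concat_dim, hidden_dim))
        self.W_h = rng.normal(0.0, scale, (concat_dim, hidden_dim))
        self.hidden_dim = hidden_dim

    def step(self, x_t, context, h_prev):
        # Gates see the current frame, the other modalities' context, and the past state.
        gate_in = np.concatenate([x_t, context, h_prev])
        z = sigmoid(gate_in @ self.W_z)  # update gate
        r = sigmoid(gate_in @ self.W_r)  # reset gate
        cand_in = np.concatenate([x_t, context, r * h_prev])
        h_tilde = np.tanh(cand_in @ self.W_h)  # candidate state
        return (1.0 - z) * h_prev + z * h_tilde

    def encode(self, sequence, context):
        # Run the cell over every time step, starting from a zero state.
        h = np.zeros(self.hidden_dim)
        for x_t in sequence:
            h = self.step(x_t, context, h)
        return h


# Toy per-frame features for three modalities (8 frames each; sizes are arbitrary).
rng = np.random.default_rng(1)
rgb = rng.normal(size=(8, 16))
depth = rng.normal(size=(8, 12))
ir = rng.normal(size=(8, 10))

# Encode the RGB sequence using pooled content from the other two modalities.
context = np.concatenate([global_content(depth), global_content(ir)])
cell = ContextGatedRecurrentUnit(input_dim=16, context_dim=22, hidden_dim=32)
h_rgb = cell.encode(rgb, context)
```

In this sketch the same pooled context is fed at every time step, which is one simple way to let a per-modality encoder see a global summary of the other streams. Whether the actual MCU pools, gates, or combines features this way is not stated in the abstract.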

research
04/20/2019

EV-Action: Electromyography-Vision Multi-Modal Action Dataset

Multi-modal human motion analysis is a critical and attractive research ...
research
01/31/2020

Modality Compensation Network: Cross-Modal Adaptation for Action Recognition

With the prevalence of RGB-D cameras, multi-modal video data have become...
research
08/22/2019

EPIC-Fusion: Audio-Visual Temporal Binding for Egocentric Action Recognition

We focus on multi-modal fusion for egocentric action recognition, and pr...
research
01/05/2021

Trear: Transformer-based RGB-D Egocentric Action Recognition

In this paper, we propose a Transformer-based RGB-D egocentric action re...
research
04/17/2018

PM-GANs: Discriminative Representation Learning for Action Recognition Using Partial-modalities

Data of different modalities generally convey complementary but heteroge...
research
06/15/2021

Imitation and Mirror Systems in Robots through Deep Modality Blending Networks

Learning to interact with the environment not only empowers the agent wi...
research
02/25/2022

On Modality Bias Recognition and Reduction

Making each modality in multi-modal data contribute is of vital importan...
