Motion Guided Attention Fusion to Recognize Interactions from Videos

04/01/2021
by Tae Soo Kim, et al.

We present a dual-pathway approach for recognizing fine-grained interactions from videos. We build on the success of prior dual-stream approaches, but make the distinction between static and dynamic representations of objects and their interactions explicit by introducing separate motion and object detection pathways. Using our new Motion-Guided Attention Fusion module, we then fuse the bottom-up features in the motion pathway with features captured from object detections to learn the temporal aspects of an action. We show that our approach generalizes effectively across appearance and recognizes actions in which an actor interacts with previously unseen objects. We validate our approach on the compositional action recognition task from the Something-Something-v2 dataset, where we outperform existing state-of-the-art methods. We also show that our method generalizes well to real-world tasks, achieving state-of-the-art performance on recognizing humans assembling IKEA furniture on the IKEA-ASM dataset.
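The abstract does not spell out how the fusion module works, so the following is only an illustrative sketch of one way motion features could guide attention over object-detection features. It assumes a scaled dot-product attention in which each per-frame motion feature acts as the query and the per-object features of that frame act as keys and values, and it assumes the result is concatenated with the motion feature. The function name `motion_guided_fusion`, the feature shapes, and the concatenation step are all hypothetical choices for illustration, not the authors' implementation.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def motion_guided_fusion(motion_feats, object_feats):
    """Hypothetical fusion step (an assumption, not the paper's module).

    motion_feats: list of T vectors, each of length D (one per frame).
    object_feats: list of T lists, each holding N object vectors of length D.
    Returns a list of T vectors of length 2 * D: the motion feature
    concatenated with the attention-weighted sum of object features.
    """
    fused = []
    for query, objects in zip(motion_feats, object_feats):
        dim = len(query)
        # Scaled dot-product score between the motion query and each object.
        scores = [
            sum(q * k for q, k in zip(query, obj)) / math.sqrt(dim)
            for obj in objects
        ]
        weights = softmax(scores)
        # Attention-weighted sum of the object features for this frame.
        attended = [
            sum(w * obj[i] for w, obj in zip(weights, objects))
            for i in range(dim)
        ]
        fused.append(list(query) + attended)
    return fused


# Toy input: one frame, two detected objects, 2-dimensional features.
motion = [[1.0, 0.0]]
objects = [[[1.0, 0.0], [0.0, 1.0]]]
fused = motion_guided_fusion(motion, objects)
```

In this toy input the first object is aligned with the motion feature, so it receives the larger attention weight. A learned version would add projection matrices for queries, keys, and values, which are omitted here to keep the sketch short.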


research
12/20/2019

Something-Else: Compositional Action Recognition with Spatial-Temporal Interaction Networks

Human action is naturally compositional: humans can easily recognize and...
research
01/04/2018

What have we learned from deep representations for action recognition?

As the success of deep models has led to their deployment in all areas o...
research
12/03/2020

SAFCAR: Structured Attention Fusion for Compositional Action Recognition

We present a general framework for compositional action recognition – i....
research
10/17/2019

Making Third Person Techniques Recognize First-Person Actions in Egocentric Videos

We focus on first-person action recognition from egocentric videos. Unli...
research
01/18/2017

Action Recognition: From Static Datasets to Moving Robots

Deep learning models have achieved state-of-the-art performance in reco...
research
06/22/2019

Baidu-UTS Submission to the EPIC-Kitchens Action Recognition Challenge 2019

In this report, we present the Baidu-UTS submission to the EPIC-Kitchens...
research
04/19/2012

Dynamic Template Tracking and Recognition

In this paper we address the problem of tracking non-rigid objects whose...