Mutual Context Network for Jointly Estimating Egocentric Gaze and Actions

01/07/2019
by Yifei Huang, et al.

In this work, we address two coupled tasks in egocentric videos, gaze prediction and action recognition, by exploring their mutual context. Our assumption is that while a person performs a manipulation task, what the person is doing determines where the person is looking, and the gaze point reveals gaze and non-gaze regions that contain important and complementary information about the ongoing action. We propose a novel mutual context network (MCN) that jointly learns action-dependent gaze prediction and gaze-guided action recognition in an end-to-end manner. Experiments on public egocentric video datasets demonstrate that our MCN achieves state-of-the-art performance on both gaze prediction and action recognition.
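
To make the notion of gaze and non-gaze regions concrete, here is a minimal sketch of how a gaze probability map might split a feature map into two complementary descriptors. It is an illustration under stated assumptions, not the MCN architecture itself: the function name `gaze_guided_pool`, the nested-list feature layout, and the weighted-average pooling are all hypothetical choices made for this example.

```python
def gaze_guided_pool(features, gaze_map):
    """Split a feature map into gaze and non-gaze descriptors.

    Hypothetical illustration, not the paper's actual module.
    features: H x W x C nested lists of floats.
    gaze_map: H x W nested lists with values in [0, 1].
    Returns (gaze_vec, non_gaze_vec), each a list of C floats.
    """
    channels = len(features[0][0])
    gaze_sum = [0.0] * channels
    rest_sum = [0.0] * channels
    gaze_total = 0.0
    rest_total = 0.0
    for feat_row, gaze_row in zip(features, gaze_map):
        for feat, g in zip(feat_row, gaze_row):
            # Accumulate the weight mass of each region.
            gaze_total += g
            rest_total += 1.0 - g
            for c in range(channels):
                # Gaze region is weighted by g, non-gaze region by (1 - g).
                gaze_sum[c] += g * feat[c]
                rest_sum[c] += (1.0 - g) * feat[c]
    eps = 1e-8  # avoids division by zero when one region is empty
    gaze_vec = [s / (gaze_total + eps) for s in gaze_sum]
    non_gaze_vec = [s / (rest_total + eps) for s in rest_sum]
    return gaze_vec, non_gaze_vec
```

In a setup like this, an action classifier could consume both vectors, so the gaze estimate shapes which evidence the action prediction sees; how MCN actually couples the two tasks is described in the full text.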


research
05/31/2020

In the Eye of the Beholder: Gaze and Actions in First Person Video

We address the task of jointly determining what a person is doing and wh...
research
03/24/2018

Predicting Gaze in Egocentric Video by Learning Task-dependent Attention Transition

We present a new computational model for gaze prediction in egocentric v...
research
09/15/2019

Multitask Learning to Improve Egocentric Action Recognition

In this work we employ multitask learning to capitalize on the structure...
research
07/25/2016

DeepWarp: Photorealistic Image Resynthesis for Gaze Manipulation

In this work, we consider the task of generating highly-realistic images...
research
10/01/2019

Action Anticipation for Collaborative Environments: The Impact of Contextual Information and Uncertainty-Based Prediction

For effectively interacting with humans in collaborative environments, m...
research
10/11/2019

Interaction Relational Network for Mutual Action Recognition

Person-person mutual action recognition (also referred to as interaction...
research
10/15/2020

Boosting Image-based Mutual Gaze Detection using Pseudo 3D Gaze

Mutual gaze detection, i.e., predicting whether or not two people are lo...
