Predicting Gaze in Egocentric Video by Learning Task-dependent Attention Transition

03/24/2018
by Yifei Huang, et al.

We present a new computational model for gaze prediction in egocentric videos that exploits patterns in the temporal shift of gaze fixations (attention transition), which depend on the egocentric manipulation task being performed. Our assumption is that the high-level context of how a task is carried out strongly influences attention transition and should be modeled for gaze prediction in natural dynamic scenes. Specifically, we propose a hybrid model based on deep neural networks that integrates task-dependent attention transition with bottom-up saliency prediction. The task-dependent attention transition is learned with a recurrent neural network to exploit the temporal context of gaze fixations, e.g., looking at a cup after moving gaze away from a grasped bottle. Experiments on public egocentric activity datasets show that our model significantly outperforms state-of-the-art gaze prediction methods and learns meaningful transitions of human attention.
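To make the idea of a hybrid model concrete, the following toy sketch combines a bottom-up saliency map with a map peaked at a fixation predicted from attention transition. It is not the authors' implementation: the abstract does not say how the two components are integrated, so the weighted average here is an assumption, and all names (`gaussian_map`, `fuse`, `weight`, `sigma`) are hypothetical. The recurrent network itself is not modeled; its output is stood in for by a fixed predicted location.

```python
import math


def gaussian_map(height, width, center, sigma=1.0):
    """Build a height x width map peaked at center=(row, col), summing to 1.

    Stands in for the output of an attention-transition predictor
    (hypothetical; the real model would produce this with an RNN).
    """
    cy, cx = center
    grid = [
        [math.exp(-((y - cy) ** 2 + (x - cx) ** 2) / (2 * sigma ** 2))
         for x in range(width)]
        for y in range(height)
    ]
    return normalize(grid)


def normalize(grid):
    """Scale a non-negative 2D map so that its entries sum to 1."""
    total = sum(sum(row) for row in grid)
    if total <= 0:
        raise ValueError("map must have positive total mass")
    return [[v / total for v in row] for row in grid]


def fuse(saliency, transition, weight=0.5):
    """Convex combination of two maps (assumed fusion rule, not the paper's).

    weight=0 uses only bottom-up saliency; weight=1 uses only the
    transition-based map.
    """
    s = normalize(saliency)
    t = normalize(transition)
    return [
        [(1 - weight) * a + weight * b for a, b in zip(row_s, row_t)]
        for row_s, row_t in zip(s, t)
    ]


def argmax2d(grid):
    """Return the (row, col) of the largest entry."""
    _, y, x = max((v, y, x) for y, row in enumerate(grid)
                  for x, v in enumerate(row))
    return (y, x)


# Toy 4x4 frame: saliency peaks at the top-left corner (e.g. a bright object),
# while the transition predictor expects the next fixation near (2, 3).
saliency = [[0.1] * 4 for _ in range(4)]
saliency[0][0] = 1.0
transition = gaussian_map(4, 4, center=(2, 3), sigma=1.0)

gaze_saliency_only = argmax2d(fuse(saliency, transition, weight=0.0))
gaze_hybrid = argmax2d(fuse(saliency, transition, weight=0.8))
```

In this toy setting, relying on saliency alone places the predicted gaze at the visually conspicuous corner, whereas weighting the transition map more heavily moves it to the location suggested by the preceding fixation. In a learned model the balance between the two would be fit from data rather than set by hand.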


