Supersaliency: Predicting Smooth Pursuit-Based Attention with Slicing CNNs Improves Fixation Prediction for Naturalistic Videos

01/26/2018
by Mikhail Startsev, et al.

Predicting attention is a popular topic at the intersection of human and computer vision, but video saliency prediction has only recently begun to benefit from deep learning-based approaches. Even though most of the available video-based saliency data sets and models claim to target human observers' fixations, they fail to differentiate fixations from smooth pursuit (SP), a major eye movement type that is unique to the perception of dynamic scenes. In this work, we aim to make this distinction explicit. To this end, we (i) use both algorithmic and manual annotations of SP traces and other eye movements for two well-established video saliency data sets, (ii) train Slicing Convolutional Neural Networks (S-CNN) for saliency prediction on either fixation- or SP-salient locations, and (iii) evaluate our models and over 20 popular published saliency models on the two annotated data sets for predicting both SP and fixations, as well as on another data set of human fixations. Our proposed model, trained on an independent set of videos, outperforms the state-of-the-art saliency models in the task of SP prediction on all considered data sets. Moreover, this model also demonstrates superior performance in the prediction of "classical" fixation-based saliency. Our results emphasize the importance of selectively approaching training set construction for attention modelling.


