Tidying Deep Saliency Prediction Architectures

03/10/2020
by Navyasri Reddy, et al.

Learning computational models for visual attention (saliency estimation) is an effort to inch machines/robots closer to human visual cognitive abilities. Data-driven efforts have dominated the landscape since the introduction of deep neural network architectures. In deep learning research, architecture design choices are often empirical and frequently lead to models that are more complex than necessary. This complexity, in turn, makes it harder to meet application requirements. In this paper, we identify four key components of saliency models: input features, multi-level integration, readout architecture, and loss functions. We review the existing state-of-the-art models along these four components and propose novel and simpler alternatives. As a result, we propose two novel end-to-end architectures, SimpleNet and MDNSal, which are neater, minimal, and more interpretable, and which achieve state-of-the-art performance on public saliency benchmarks. SimpleNet is an optimized encoder-decoder architecture that brings notable performance gains on the SALICON dataset (the largest saliency benchmark). MDNSal is a parametric model that directly predicts the parameters of a GMM distribution and aims to bring more interpretability to the prediction maps. The proposed saliency models run inference at 25 fps, making them suitable for real-time applications. Code and pre-trained models are available at https://github.com/samyak0210/saliency.
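To make the parametric idea behind MDNSal concrete, the sketch below shows one way a small set of GMM parameters could be rendered into a saliency map. It is not the authors' implementation: the function name `gmm_saliency_map`, the normalized [0, 1] coordinate convention, and the restriction to axis-aligned (diagonal-covariance) Gaussians are all assumptions made for illustration, since the abstract only states that the model predicts the parameters of a GMM distribution.

```python
import math


def gmm_saliency_map(weights, means, sigmas, height, width):
    """Render a 2D Gaussian mixture on a pixel grid (illustrative sketch).

    Assumptions (not taken from the paper): components are axis-aligned,
    and means/sigmas are given as (x, y) pairs in normalized [0, 1] coords.
    Returns a height x width list of lists that sums to 1.
    """
    # Pixel centers in normalized coordinates.
    xs = [(i + 0.5) / width for i in range(width)]
    ys = [(j + 0.5) / height for j in range(height)]

    saliency = [[0.0] * width for _ in range(height)]
    for w, (mx, my), (sx, sy) in zip(weights, means, sigmas):
        # A diagonal-covariance Gaussian factorizes into x and y terms.
        scale = w / (2.0 * math.pi * sx * sy)
        gx = [math.exp(-0.5 * ((x - mx) / sx) ** 2) for x in xs]
        gy = [math.exp(-0.5 * ((y - my) / sy) ** 2) for y in ys]
        for j in range(height):
            for i in range(width):
                saliency[j][i] += scale * gy[j] * gx[i]

    # Normalize so the map can be read as a discrete probability distribution.
    total = sum(map(sum, saliency))
    if total > 0:
        saliency = [[v / total for v in row] for row in saliency]
    return saliency


# Two hypothetical components: a sharp one on the left, a broad one on the right.
demo_map = gmm_saliency_map(
    weights=[0.7, 0.3],
    means=[(0.25, 0.5), (0.75, 0.5)],
    sigmas=[(0.05, 0.05), (0.10, 0.10)],
    height=48,
    width=64,
)
```

In a representation like this, each predicted component can be read directly as a fixation cluster with a location, spread, and weight, which is one plausible sense in which a parametric output is easier to interpret than a dense per-pixel map.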

research
12/11/2020

AViNet: Diving Deep into Audio-Visual Saliency Prediction

We propose the AViNet architecture for audiovisual saliency prediction. ...
research
08/25/2020

FastSal: a Computationally Efficient Network for Visual Saliency Prediction

This paper focuses on the problem of visual saliency prediction, predict...
research
04/05/2018

End-to-End Saliency Mapping via Probability Distribution Prediction

Most saliency estimation methods aim to explicitly model low-level consp...
research
05/02/2018

EML-NET:An Expandable Multi-Layer NETwork for Saliency Prediction

In this work, we apply state-of-the-art Convolutional Neural Network(CNN...
research
03/11/2020

Unified Image and Video Saliency Modeling

Visual saliency modeling for images and videos is treated as two indepen...
research
07/11/2017

SaltiNet: Scan-path Prediction on 360 Degree Images using Saliency Volumes

We introduce SaltiNet, a deep neural network for scanpath prediction tra...
research
07/03/2019

Simple vs complex temporal recurrences for video saliency prediction

This paper investigates modifying an existing neural network architectur...
