Unified Image and Video Saliency Modeling

03/11/2020
by Richard Droste, et al.

Visual saliency modeling for images and videos is treated as two independent tasks in recent computer vision literature. On the one hand, image saliency modeling is a well-studied problem, and progress on benchmarks like SALICON and MIT300 is slowing. For video saliency prediction, on the other hand, rapid gains have been achieved on the recent DHF1K benchmark through network architectures that are optimized for this task. Here, we take a step back and ask: Can image and video saliency modeling be approached via a unified model, with mutual benefit? We find that, for effective joint modeling, it is crucial to model the domain shift between image and video saliency data and between different video saliency datasets. We identify different sources of domain shift and address them through four novel domain adaptation techniques - Domain-Adaptive Priors, Domain-Adaptive Fusion, Domain-Adaptive Smoothing and Bypass-RNN - in addition to an improved formulation of learned Gaussian priors. We integrate these techniques into a simple and lightweight encoder-RNN-decoder-style network, UNISAL, and train the entire network simultaneously with image and video saliency data. We evaluate our method on the video saliency datasets DHF1K, Hollywood-2 and UCF-Sports, as well as the image saliency datasets SALICON and MIT300. With one set of parameters, our method achieves state-of-the-art performance on all video saliency datasets and is on par with the state of the art for image saliency prediction, despite a 5- to 20-fold reduction in model size and the fastest runtime among all competing deep models. We provide retrospective analyses and ablation studies that demonstrate the importance of the domain shift modeling. The code is available at https://github.com/rdroste/unisal.
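To make the idea of domain-specific Gaussian priors more concrete, here is a minimal NumPy sketch. It is not taken from the UNISAL code base: the function names, the `DOMAIN_PRIORS` table, and all parameter values are hypothetical, and the parameters are fixed constants here, whereas in a trained model they would presumably be learned. The sketch only illustrates keeping one set of Gaussian prior parameters per dataset and selecting it by domain.

```python
import numpy as np


def gaussian_prior_map(height, width, mu, sigma):
    """Axis-aligned 2D Gaussian evaluated on normalized [0, 1] coordinates."""
    ys = np.linspace(0.0, 1.0, height)[:, None]  # column vector of y positions
    xs = np.linspace(0.0, 1.0, width)[None, :]   # row vector of x positions
    mu_y, mu_x = mu
    sigma_y, sigma_x = sigma
    return np.exp(-0.5 * (((ys - mu_y) / sigma_y) ** 2
                          + ((xs - mu_x) / sigma_x) ** 2))


# Hypothetical per-domain parameters: a list of (mean, std) pairs per dataset.
# These values are placeholders chosen for illustration only.
DOMAIN_PRIORS = {
    "DHF1K": [((0.5, 0.5), (0.2, 0.3))],
    "SALICON": [((0.5, 0.5), (0.3, 0.4))],
}


def priors_for_domain(domain, height, width):
    """Stack the prior maps of one domain into an array of shape (n, H, W)."""
    maps = [gaussian_prior_map(height, width, mu, sigma)
            for mu, sigma in DOMAIN_PRIORS[domain]]
    return np.stack(maps, axis=0)


video_priors = priors_for_domain("DHF1K", 9, 13)
image_priors = priors_for_domain("SALICON", 9, 13)
```

In a network, such maps could be concatenated to the feature channels before decoding, with the domain key chosen according to the dataset each batch comes from. That wiring is likewise an assumption of this sketch.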

research
08/25/2020

FastSal: a Computationally Efficient Network for Visual Saliency Prediction

This paper focuses on the problem of visual saliency prediction, predict...
research
08/28/2018

Temporal Saliency Adaptation in Egocentric Videos

This work adapts a deep neural model for image saliency prediction to th...
research
03/10/2020

Tidying Deep Saliency Prediction Architectures

Learning computational models for visual attention (saliency estimation)...
research
01/11/2023

TinyHD: Efficient Video Saliency Prediction with Heterogeneous Decoders using Hierarchical Maps Distillation

Video saliency prediction has recently attracted attention of the resear...
research
10/08/2018

Saliency Prediction in the Deep Learning Era: An Empirical Investigation

Visual saliency models have enjoyed a big leap in performance in recent ...
research
07/11/2017

SaltiNet: Scan-path Prediction on 360 Degree Images using Saliency Volumes

We introduce SaltiNet, a deep neural network for scanpath prediction tra...
research
05/26/2021

Calibrated prediction in and out-of-domain for state-of-the-art saliency modeling

Since 2014 transfer learning has become the key driver for the improveme...
