A domain adaptive deep learning solution for scanpath prediction of paintings

by   Mohamed Amine Kerkouri, et al.

Cultural heritage understanding and preservation is an important issue for society as it represents a fundamental aspect of its identity. Paintings represent a significant part of cultural heritage, and are the subject of study continuously. However, the way viewers perceive paintings is strictly related to the so-called HVS (Human Vision System) behaviour. This paper focuses on the eye-movement analysis of viewers during the visual experience of a certain number of paintings. In further details, we introduce a new approach to predicting human visual attention, which impacts several cognitive functions for humans, including the fundamental understanding of a scene, and then extend it to painting images. The proposed new architecture ingests images and returns scanpaths, a sequence of points featuring a high likelihood of catching viewers' attention. We use an FCNN (Fully Convolutional Neural Network), in which we exploit a differentiable channel-wise selection and Soft-Argmax modules. We also incorporate learnable Gaussian distributions onto the network bottleneck to simulate visual attention process bias in natural scene images. Furthermore, to reduce the effect of shifts between different domains (i.e. natural images, painting), we urge the model to learn unsupervised general features from other domains using a gradient reversal classifier. The results obtained by our model outperform existing state-of-the-art ones in terms of accuracy and efficiency.


page 6

page 7


An Inter-observer consistent deep adversarial training for visual scanpath prediction

The visual scanpath is a sequence of points through which the human gaze...

An HVS-Oriented Saliency Map Prediction Modeling

Visual attention is one of the most significant characteristics for sele...

Depthwise Convolution is All You Need for Learning Multiple Visual Domains

There is a growing interest in designing models that can deal with image...

Neural Models of the Psychosemantics of `Most'

How are the meanings of linguistic expressions related to their use in c...

A Two-stream End-to-End Deep Learning Network for Recognizing Atypical Visual Attention in Autism Spectrum Disorder

Eye movements have been widely investigated to study the atypical visual...

Guiding human gaze with convolutional neural networks

The eye fixation patterns of human observers are a fundamental indicator...

Attention Diversification for Domain Generalization

Convolutional neural networks (CNNs) have demonstrated gratifying result...

Please sign up or login with your details

Forgot password? Click here to reset