Predicting Human Attention using Computational Attention

03/16/2023
by   Zhibo Yang, et al.

Most models of visual attention predict either top-down or bottom-up control, as studied using different visual search and free-viewing tasks. We propose the Human Attention Transformer (HAT), a single model that predicts both forms of attention control. HAT is the new state-of-the-art (SOTA) in predicting the scanpath of fixations made during target-present and target-absent search, and matches or exceeds SOTA in predicting taskless free-viewing fixation scanpaths. HAT achieves this by combining a novel transformer-based architecture with a simplified foveated retina, which together create a spatio-temporal awareness akin to the dynamic visual working memory of humans. Previous methods rely on a coarse grid of fixation cells and lose information when fixations are discretized; HAT instead uses a dense-prediction architecture that outputs a dense heatmap for each fixation, thus avoiding fixation discretization. HAT sets a new standard in computational attention, one that emphasizes both effectiveness and generality. HAT's demonstrated scope and applicability will likely inspire the development of new attention models that can better predict human behavior in various attention-demanding scenarios.
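To make the discretization point concrete, the following minimal sketch contrasts a fixation snapped to a coarse grid cell with the same fixation encoded as a dense per-pixel heatmap. It is an illustration only, not HAT's architecture: the image size, grid size, Gaussian width, and all function names are assumptions chosen for the example.

```python
import math

def quantize_to_grid(x, y, width, height, cols, rows):
    """Snap a fixation (x, y), in pixels, to the center of its grid cell."""
    cell_w = width / cols
    cell_h = height / rows
    col = min(int(x // cell_w), cols - 1)
    row = min(int(y // cell_h), rows - 1)
    return ((col + 0.5) * cell_w, (row + 0.5) * cell_h)

def gaussian_heatmap(x, y, width, height, sigma):
    """Dense heatmap (list of rows) with a Gaussian bump centered on (x, y)."""
    return [
        [math.exp(-((px - x) ** 2 + (py - y) ** 2) / (2 * sigma ** 2))
         for px in range(width)]
        for py in range(height)
    ]

def heatmap_peak(heatmap):
    """Return the (x, y) pixel with the highest heatmap value."""
    best_value, best_xy = -1.0, (0, 0)
    for py, row in enumerate(heatmap):
        for px, value in enumerate(row):
            if value > best_value:
                best_value, best_xy = value, (px, py)
    return best_xy

# Hypothetical setup: a 64x40 image, an 8x5 grid (8x8-pixel cells),
# and one fixation at pixel (13, 22).
fixation = (13, 22)
grid_estimate = quantize_to_grid(*fixation, width=64, height=40, cols=8, rows=5)
dense_estimate = heatmap_peak(gaussian_heatmap(*fixation, width=64, height=40, sigma=3.0))

grid_error = math.dist(fixation, grid_estimate)    # nonzero: position was rounded
dense_error = math.dist(fixation, dense_estimate)  # zero: pixel position preserved
```

In this toy setup the grid representation can only recover the cell center, so any fixation away from the center carries a localization error bounded by the cell size, whereas the per-pixel heatmap keeps the fixation at pixel resolution.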

research
08/09/2023

Hierarchical Representations for Spatio-Temporal Visual Attention Modeling and Understanding

This PhD. Thesis concerns the study and development of hierarchical repr...
research
07/04/2022

Target-absent Human Attention

The prediction of human gaze behavior is important for building human-co...
research
03/27/2023

Gazeformer: Scalable, Effective and Fast Prediction of Goal-Directed Human Attention

Predicting human gaze is important in Human-Computer Interaction (HCI). ...
research
10/27/2022

Predicting Visual Attention and Distraction During Visual Search Using Convolutional Neural Networks

Most studies in computational modeling of visual attention encompass tas...
research
11/25/2018

Visual Attention on the Sun: What Do Existing Models Actually Predict?

Visual attention prediction is a classic problem that seems to be well a...
research
10/17/2019

Predicting retrosynthetic pathways using a combined linguistic model and hyper-graph exploration strategy

We present an extension of our Molecular Transformer architecture combin...
research
07/13/2022

Entry-Flipped Transformer for Inference and Prediction of Participant Behavior

Some group activities, such as team sports and choreographed dances, inv...