Interaction-aware Joint Attention Estimation Using People Attributes

08/10/2023
by   Chihiro Nakatani, et al.

This paper proposes a method for estimating joint attention in a single image. Unlike related work, which employs only the gaze-related attributes of people and treats them independently, our method (i) also employs people's locations and actions as contextual cues for weighting their attributes and (ii) explicitly models the interactions among all of these attributes. For the interaction modeling, we propose a novel Transformer-based attention network that encodes joint attention as low-dimensional features. We add a specialized MLP head with positional embedding to the Transformer so that it predicts the pixelwise confidence of joint attention, from which the confidence heatmap is generated. This pixelwise prediction improves heatmap accuracy by avoiding the ill-posed problem of predicting a high-dimensional heatmap from low-dimensional features. The estimated joint attention is further improved by integrating it with general image-based attention estimation. In comparative experiments, our method quantitatively outperforms state-of-the-art (SOTA) methods. Code: https://anonymous.4open.science/r/anonymized_codes-ECA4.
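
To make the pixelwise prediction idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes a low-dimensional joint-attention feature is already available (here a hand-written 8-dimensional vector standing in for the Transformer output), and it queries a small MLP once per pixel with that feature concatenated to a positional encoding of the pixel coordinate. The sinusoidal encoding, the layer sizes, the random untrained weights, and all function names (`positional_encoding`, `init_mlp`, `mlp_confidence`, `pixelwise_heatmap`) are assumptions made for illustration only.

```python
import math
import random


def positional_encoding(x, y, num_freqs=4):
    """Sinusoidal encoding of a normalized pixel coordinate (x, y) in [0, 1].

    The sinusoidal form is an assumption; the abstract only says
    "positional embedding".
    """
    feats = []
    for k in range(num_freqs):
        freq = (2.0 ** k) * math.pi
        feats += [math.sin(freq * x), math.cos(freq * x),
                  math.sin(freq * y), math.cos(freq * y)]
    return feats  # length is 4 * num_freqs


def init_mlp(in_dim, hidden_dim, seed=0):
    """Random (untrained) weights for a two-layer MLP with one scalar output."""
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(in_dim)
    w1 = [[rng.uniform(-scale, scale) for _ in range(in_dim)]
          for _ in range(hidden_dim)]
    b1 = [0.0] * hidden_dim
    w2 = [rng.uniform(-scale, scale) for _ in range(hidden_dim)]
    b2 = 0.0
    return w1, b1, w2, b2


def mlp_confidence(inputs, params):
    """Hidden ReLU layer followed by a sigmoid, giving a confidence in (0, 1)."""
    w1, b1, w2, b2 = params
    hidden = [max(0.0, sum(w * v for w, v in zip(row, inputs)) + b)
              for row, b in zip(w1, b1)]
    logit = sum(w * h for w, h in zip(w2, hidden)) + b2
    return 1.0 / (1.0 + math.exp(-logit))


def pixelwise_heatmap(feature, params, height, width, num_freqs=4):
    """Query the MLP once per pixel instead of regressing the whole map at once."""
    heatmap = []
    for row in range(height):
        y = row / max(height - 1, 1)
        line = []
        for col in range(width):
            x = col / max(width - 1, 1)
            inputs = feature + positional_encoding(x, y, num_freqs)
            line.append(mlp_confidence(inputs, params))
        heatmap.append(line)
    return heatmap


# Hypothetical 8-dimensional feature standing in for the Transformer output.
feature = [0.2, -0.5, 0.1, 0.7, -0.3, 0.4, 0.0, 0.9]
params = init_mlp(in_dim=len(feature) + 4 * 4, hidden_dim=16)
heatmap = pixelwise_heatmap(feature, params, height=6, width=8)
```

In this sketch the MLP output is always a single scalar, so the output dimension does not grow with the heatmap resolution; only the number of queries does. That is one way to read the abstract's point about avoiding a direct mapping from low-dimensional features to a high-dimensional heatmap.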

research
08/09/2023

Joint-Relation Transformer for Multi-Person Motion Prediction

Multi-person motion prediction is a challenging problem due to the depen...
research
07/27/2018

Connecting Gaze, Scene, and Attention: Generalized Attention Estimation via Joint Modeling of Gaze and Scene Saliency

This paper addresses the challenging problem of estimating the general v...
research
11/25/2022

Interaction Visual Transformer for Egocentric Action Anticipation

Human-object interaction is one of the most important visual cues that h...
research
04/10/2018

Discovery and usage of joint attention in images

Joint visual attention is characterized by two or more individuals looki...
research
12/20/2016

From Images to 3D Shape Attributes

Our goal in this paper is to investigate properties of 3D shape that can...
research
06/22/2022

SpA-Former: Transformer image shadow detection and removal via spatial attention

In this paper, we propose an end-to-end SpA-Former to recover a shadow-f...
research
04/03/2022

Region-aware Attention for Image Inpainting

Recent attention-based image inpainting methods have made inspiring prog...
