Sequential Cross Attention Based Multi-task Learning

09/06/2022
by Sunkyung Kim, et al.

In multi-task learning (MTL) for visual scene understanding, it is crucial to transfer useful information between multiple tasks with minimal interference. In this paper, we propose a novel architecture that effectively transfers informative features by applying the attention mechanism to the multi-scale features of the tasks. Since applying the attention module directly to all possible features across every scale and task incurs high computational complexity, we propose to apply the attention module sequentially, first across tasks and then across scales. The cross-task attention module (CTAM) is applied first to facilitate the exchange of relevant information between the features of multiple tasks at the same scale. The cross-scale attention module (CSAM) then aggregates useful information from feature maps at different resolutions within the same task. We also attempt to capture long-range dependencies through a self-attention module in the feature extraction network. Extensive experiments demonstrate that our method achieves state-of-the-art performance on the NYUD-v2 and PASCAL-Context datasets.
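
The sequential ordering described in the abstract (cross-task attention within each scale, followed by cross-scale attention within each task) can be illustrated with a minimal NumPy sketch. This is not the authors' CTAM or CSAM implementation: the function names, the single-head dot-product attention without learned projections, the residual additions, and the flattened `(tokens, channels)` feature layout are all assumptions made for illustration only.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def cross_attention(query, context):
    """Single-head dot-product attention with no learned projections (assumption).

    query:   (Nq, C) tokens that receive information
    context: (Nk, C) tokens that provide information
    returns: (Nq, C)
    """
    scores = query @ context.T / np.sqrt(query.shape[-1])
    return softmax(scores, axis=-1) @ context


def cross_task_step(features):
    """Hypothetical stand-in for a cross-task step.

    features[t][s] has shape (N_s, C). At each scale s, every task attends
    to the features of the other tasks at that same scale.
    """
    num_tasks, num_scales = len(features), len(features[0])
    out = []
    for t in range(num_tasks):
        per_scale = []
        for s in range(num_scales):
            others = np.concatenate(
                [features[u][s] for u in range(num_tasks) if u != t], axis=0
            )
            # Residual connection is an illustrative choice.
            per_scale.append(features[t][s] + cross_attention(features[t][s], others))
        out.append(per_scale)
    return out


def cross_scale_step(features):
    """Hypothetical stand-in for a cross-scale step.

    Within each task, every scale attends to the other scales of that task.
    """
    out = []
    for task_feats in features:
        per_scale = []
        for s, f in enumerate(task_feats):
            others = np.concatenate(
                [g for r, g in enumerate(task_feats) if r != s], axis=0
            )
            per_scale.append(f + cross_attention(f, others))
        out.append(per_scale)
    return out


def sequential_cross_attention(features):
    # Tasks first, then scales, following the order given in the abstract.
    return cross_scale_step(cross_task_step(features))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Two tasks, three scales with 64, 16 and 4 tokens, 8 channels each.
    feats = [[rng.standard_normal((n, 8)) for n in (64, 16, 4)] for _ in range(2)]
    fused = sequential_cross_attention(feats)
    print([f.shape for f in fused[0]])
```

The sketch requires at least two tasks and two scales. In this toy setting, each attention call only compares tokens that share a scale (first step) or share a task (second step), rather than comparing every token against every other token across all tasks and scales at once, which is the kind of saving the abstract gives as the motivation for the sequential design.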

research
06/08/2023

InvPT++: Inverted Pyramid Multi-Task Transformer for Visual Scene Understanding

Multi-task scene understanding aims to design models that can simultaneo...
research
05/04/2023

MTLSegFormer: Multi-task Learning with Transformers for Semantic Segmentation in Precision Agriculture

Multi-task learning has proven to be effective in improving the performa...
research
08/10/2023

Vision Backbone Enhancement via Multi-Stage Cross-Scale Attention

Convolutional neural networks (CNNs) and vision transformers (ViTs) have...
research
04/28/2021

Exploring Relational Context for Multi-Task Dense Prediction

The timeline of computer vision research is marked with advances in lear...
research
03/22/2023

Road Extraction with Satellite Images and Partial Road Maps

Road extraction is a process of automatically generating road maps mainl...
research
03/01/2022

Automatic Depression Detection via Learning and Fusing Features from Visual Cues

Depression is one of the most prevalent mental disorders, which seriousl...
research
04/07/2020

Multi-Task Learning via Co-Attentive Sharing for Pedestrian Attribute Recognition

Learning to predict multiple attributes of a pedestrian is a multi-task ...
