FAN: Focused Attention Networks

05/27/2019
by Chu Wang, et al.

Attention networks show promise for both vision and language tasks by emphasizing relationships between constituent elements through appropriate weighting functions. Such elements could be regions in an image output by a region proposal network, or words in a sentence represented by word embeddings. Thus far, however, the learning of attention weights has been driven solely by the minimization of task-specific loss functions. Here we introduce a method of learning attention weights that better emphasizes informative pairwise relations between entities. The key idea is a novel center-mass cross-entropy loss, which can be applied in conjunction with the task-specific losses. We then introduce a focused attention backbone that learns these attention weights for general tasks. We demonstrate that the focused attention module leads to a new state of the art for recovering relations in a relationship proposal task. Our experiments show that it also boosts performance on diverse vision and language tasks, including object detection, scene categorization and document classification.
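The abstract's exact center-mass cross-entropy loss is not given here, but the general idea it describes, supervising attention weights with an auxiliary cross-entropy term alongside the task loss so that attention mass concentrates on informative pairs, can be illustrated with a minimal numpy sketch. Everything below (function names, dot-product scoring, the uniform target over labeled pairs) is an illustrative assumption, not the paper's actual formulation.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention_weights(features):
    """Pairwise attention: dot-product scores between all entity pairs
    (e.g. region or word embeddings), softmax-normalized per row."""
    scores = features @ features.T            # (N, N) pairwise affinities
    return softmax(scores, axis=-1)

def attention_supervision_loss(attn, informative_mask, eps=1e-8):
    """Illustrative auxiliary loss (stand-in for the paper's center-mass
    cross-entropy): cross-entropy between each row of the attention
    matrix and a uniform target over the pairs marked informative."""
    target = informative_mask / informative_mask.sum(axis=-1, keepdims=True)
    return -np.mean(np.sum(target * np.log(attn + eps), axis=-1))

# Toy example: 4 entities with 8-dim embeddings; pairs (0,1) and (2,3)
# are labeled as informative relations.
rng = np.random.default_rng(0)
feats = rng.normal(size=(4, 8))
mask = np.zeros((4, 4))
mask[0, 1] = mask[1, 0] = mask[2, 3] = mask[3, 2] = 1.0

attn = attention_weights(feats)
aux_loss = attention_supervision_loss(attn, mask)
# In training, aux_loss would be added to the task-specific loss.
```

In a real model this auxiliary term would be weighted and summed with the task loss (detection, classification, etc.), so the attention weights are shaped by both objectives rather than by the task loss alone.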

