Not All Attention Is Needed: Gated Attention Network for Sequence Data

12/01/2019
by Lanqing Xue, et al.

Although deep neural networks generally have fixed network structures, dynamic mechanisms have drawn increasing attention in recent years. Attention mechanisms compute input-dependent dynamic attention weights for aggregating a sequence of hidden states. Dynamic network configuration in convolutional neural networks (CNNs) selectively activates only part of the network at a time for different inputs. In this paper, we combine these two dynamic mechanisms for text classification tasks. Traditional attention mechanisms attend to the whole sequence of hidden states for an input sentence, but in most cases not all attention is needed, especially for long sequences. We propose a novel method called Gated Attention Network (GA-Net) that dynamically selects a subset of elements to attend to using an auxiliary network, and computes attention weights to aggregate only the selected elements. It avoids a significant amount of unnecessary computation on unattended elements and allows the model to focus on the important parts of the sequence. Experiments on various datasets show that the proposed method achieves better performance than all baseline models with global or local attention, while requiring less computation and offering better interpretability. The idea also extends naturally to more complex attention-based models, such as transformers and sequence-to-sequence models.
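The core idea described above can be sketched in a few lines: an auxiliary gate network scores each position of the sequence, a hard binary gate keeps only a subset of positions, and softmax attention is computed over the kept positions alone. The sketch below is illustrative only; all names (`gated_attention`, `W_gate`, `q`) are invented here, it uses a simple sigmoid threshold for the hard gates rather than the paper's trained end-to-end gating, and it omits the fallback policies a real model would need.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D array.
    e = np.exp(x - x.max())
    return e / e.sum()

def gated_attention(H, W_gate, b_gate, q):
    """Minimal GA-Net-style sketch (hypothetical names and parameters).

    H      : (T, d) sequence of hidden states
    W_gate : (d,)   weights of the auxiliary gate network
    b_gate : scalar bias of the gate network
    q      : (d,)   attention query vector
    """
    # Auxiliary network: per-position keep probability via a sigmoid.
    gate_logits = H @ W_gate + b_gate               # (T,)
    keep = 1.0 / (1.0 + np.exp(-gate_logits)) > 0.5  # hard binary gates

    if not keep.any():
        # Degenerate case: nothing selected, fall back to full attention.
        keep = np.ones(len(H), dtype=bool)

    # Attention weights are computed only over the selected positions,
    # skipping the score computation for unattended elements entirely.
    scores = H[keep] @ q                            # (T_kept,)
    weights = softmax(scores)
    context = weights @ H[keep]                     # (d,) aggregated state
    return context, keep, weights

# Example usage with random data.
rng = np.random.default_rng(0)
H = rng.normal(size=(6, 4))
W_gate = rng.normal(size=(4,))
q = rng.normal(size=(4,))
context, keep, weights = gated_attention(H, W_gate, 0.0, q)
```

Note that a hard threshold like this is not differentiable; training such gates end-to-end typically requires a continuous relaxation or a policy-gradient estimator, which is where the auxiliary network in the paper comes in.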


Related research

- 09/18/2020: An Enhanced Convolutional Neural Network in Side-Channel Attacks and Its Visualization
- 12/01/2016: Temporal Attention-Gated Model for Robust Sequence Classification
- 06/09/2021: DGA-Net Dynamic Gaussian Attention Network for Sentence Semantic Matching
- 06/06/2018: Attention Incorporate Network: A network can adapt various data size
- 06/02/2018: SCAN: Sliding Convolutional Attention Network for Scene Text Recognition
- 11/27/2018: GaterNet: Dynamic Filter Selection in Convolutional Neural Network via a Dedicated Global Gating Network
- 02/09/2016: A Convolutional Attention Network for Extreme Summarization of Source Code
