Modeling Localness for Self-Attention Networks

10/24/2018
by Baosong Yang, et al.

Self-attention networks have proven to be of profound value for their strength in capturing global dependencies. In this work, we propose to model localness for self-attention networks, which enhances their ability to capture useful local context. We cast localness modeling as a learnable Gaussian bias, which indicates the center and scope of the local region to which more attention should be paid. The bias is then incorporated into the original attention distribution to form a revised distribution. To maintain the strength of capturing long-distance dependencies while enhancing the ability to capture short-range dependencies, we apply localness modeling only to the lower layers of self-attention networks. Quantitative and qualitative analyses on Chinese-English and English-German translation tasks demonstrate the effectiveness and universality of the proposed approach.
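To make the mechanism concrete, below is a minimal PyTorch sketch of the idea as stated in the abstract: a learnable Gaussian bias, whose center and scope are predicted per query, is added to the attention logits before the softmax to form the revised distribution. The class name `LocalSelfAttention`, the projections `center_proj` and `scope_proj`, and the sigmoid-based mapping of center and scope into the sequence range are illustrative assumptions, not the paper's actual code; the paper's exact parameterization (e.g., multi-head variants) may differ.

```python
# Sketch of localness modeling: add a learnable Gaussian bias to the
# self-attention logits before the softmax. Hypothetical names; the
# parameterization is an assumption based on the abstract.
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

class LocalSelfAttention(nn.Module):
    def __init__(self, d_model: int):
        super().__init__()
        self.q_proj = nn.Linear(d_model, d_model)
        self.k_proj = nn.Linear(d_model, d_model)
        self.v_proj = nn.Linear(d_model, d_model)
        # Predict the center P_i and scope D_i of the local region
        # from each query position (assumed parameterization).
        self.center_proj = nn.Linear(d_model, 1)
        self.scope_proj = nn.Linear(d_model, 1)
        self.d_model = d_model

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model)
        batch, seq_len, _ = x.shape
        q, k, v = self.q_proj(x), self.k_proj(x), self.v_proj(x)
        logits = q @ k.transpose(-2, -1) / math.sqrt(self.d_model)

        # Map predictions into [0, seq_len) so the center and scope are
        # valid positions/widths for this sequence.
        center = torch.sigmoid(self.center_proj(x)) * seq_len  # (batch, seq_len, 1)
        scope = torch.sigmoid(self.scope_proj(x)) * seq_len    # (batch, seq_len, 1)
        sigma = scope / 2.0

        # Gaussian bias G[i, j] = -(j - P_i)^2 / (2 * sigma_i^2): zero at
        # the predicted center, increasingly negative away from it.
        positions = torch.arange(seq_len, device=x.device).float()
        gauss_bias = -((positions.view(1, 1, seq_len) - center) ** 2) / (
            2.0 * sigma ** 2 + 1e-9
        )

        # Revised attention distribution: bias the logits, then softmax.
        attn = F.softmax(logits + gauss_bias, dim=-1)
        return attn @ v

x = torch.randn(2, 10, 64)
print(LocalSelfAttention(64)(x).shape)  # torch.Size([2, 10, 64])
```

Per the abstract, such a layer would replace only the lower layers of the encoder stack, leaving the higher layers as standard self-attention so the model retains its strength at capturing long-distance dependencies.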

Related research

Context-Aware Self-Attention Networks (02/15/2019)
Self-attention models have shown their flexibility in parallel computation ...

Convolutional Self-Attention Network (10/31/2018)
Self-attention network (SAN) has recently attracted increasing interest ...

Structure-Regularized Attention for Deformable Object Representation (06/12/2021)
Capturing contextual dependencies has proven useful to improve the repre...

Improving Context Modeling in Neural Topic Segmentation (10/07/2020)
Topic segmentation is critical in key NLP tasks and recent works favor h...

Towards Better Modeling Hierarchical Structure for Self-Attention with Ordered Neurons (09/04/2019)
Recent studies have shown that a hybrid of self-attention networks (SANs...

Self-Attentional Acoustic Models (03/26/2018)
Self-attention is a method of encoding sequences of vectors by relating ...

Why Self-Attention? A Targeted Evaluation of Neural Machine Translation Architectures (08/27/2018)
Recently, non-recurrent architectures (convolutional, self-attentional) ...
