Context-Aware Self-Attention Networks

02/15/2019
by Baosong Yang, et al.

Self-attention models have shown flexibility in parallel computation and effectiveness in modeling both long- and short-term dependencies. However, they calculate the dependencies between representations without considering contextual information, which has proven useful for modeling dependencies among neural representations in various natural language tasks. In this work, we focus on improving self-attention networks by capturing the richness of context. To maintain the simplicity and flexibility of self-attention networks, we propose to contextualize the transformations of the query and key layers, which are used to calculate the relevance between elements. Specifically, we leverage internal representations that embed both global and deep contexts, thus avoiding reliance on external resources. Experimental results on the WMT14 English-German and WMT17 Chinese-English translation tasks demonstrate the effectiveness and universality of the proposed methods. Furthermore, we conducted extensive analyses to quantify how the context vectors participate in the self-attention model.
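The core idea is to enrich the query and key transformations with context drawn from the layer's own representations. Below is a minimal PyTorch sketch of one such contextualization, assuming a global context vector obtained by mean-pooling the layer input and a learned gate that interpolates it into the query and key projections; the class and parameter names (ContextAwareSelfAttention, d_model, n_heads) and the pooling choice are illustrative assumptions, not the authors' released implementation.

```python
# Sketch (not the paper's official code) of context-aware self-attention:
# the query/key transforms are mixed with a global context vector via a
# learned gate before standard scaled dot-product attention.
import math
import torch
import torch.nn as nn


class ContextAwareSelfAttention(nn.Module):
    def __init__(self, d_model: int, n_heads: int):
        super().__init__()
        assert d_model % n_heads == 0
        self.d_model, self.n_heads = d_model, n_heads
        self.d_head = d_model // n_heads
        # Standard projections for queries, keys, values, and output.
        self.w_q = nn.Linear(d_model, d_model)
        self.w_k = nn.Linear(d_model, d_model)
        self.w_v = nn.Linear(d_model, d_model)
        self.w_o = nn.Linear(d_model, d_model)
        # Projections of the global context vector into query/key space.
        self.u_q = nn.Linear(d_model, d_model)
        self.u_k = nn.Linear(d_model, d_model)
        # Gates deciding how much context flows into queries and keys.
        self.gate_q = nn.Linear(2 * d_model, d_model)
        self.gate_k = nn.Linear(2 * d_model, d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model)
        b, t, _ = x.shape
        q, k, v = self.w_q(x), self.w_k(x), self.w_v(x)

        # Global context: mean over the sequence (one illustrative choice of
        # "internal representation"; the paper also considers deep contexts).
        c = x.mean(dim=1, keepdim=True)            # (b, 1, d_model)
        cq, ck = self.u_q(c), self.u_k(c)          # projected context

        # Learned interpolation between the original transform and the
        # context-enriched one: q_hat = (1 - lam) * q + lam * c_q
        lam_q = torch.sigmoid(self.gate_q(torch.cat([q, cq.expand(-1, t, -1)], dim=-1)))
        lam_k = torch.sigmoid(self.gate_k(torch.cat([k, ck.expand(-1, t, -1)], dim=-1)))
        q = (1 - lam_q) * q + lam_q * cq
        k = (1 - lam_k) * k + lam_k * ck

        # Standard multi-head scaled dot-product attention from here on.
        def split(t_):  # (b, seq, d_model) -> (b, heads, seq, d_head)
            return t_.view(b, -1, self.n_heads, self.d_head).transpose(1, 2)

        qh, kh, vh = split(q), split(k), split(v)
        scores = qh @ kh.transpose(-2, -1) / math.sqrt(self.d_head)
        out = torch.softmax(scores, dim=-1) @ vh
        out = out.transpose(1, 2).contiguous().view(b, t, self.d_model)
        return self.w_o(out)


# Quick shape check.
layer = ContextAwareSelfAttention(d_model=512, n_heads=8)
print(layer(torch.randn(2, 10, 512)).shape)  # torch.Size([2, 10, 512])
```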


Related research

10/24/2018  Modeling Localness for Self-Attention Networks
Self-attention networks have proven to be of profound value for its stre...

06/12/2021  Structure-Regularized Attention for Deformable Object Representation
Capturing contextual dependencies has proven useful to improve the repre...

10/07/2020  Improving Context Modeling in Neural Topic Segmentation
Topic segmentation is critical in key NLP tasks and recent works favor h...

04/05/2019  Convolutional Self-Attention Networks
Self-attention networks (SANs) have drawn increasing interest due to the...

02/18/2020  Conditional Self-Attention for Query-based Summarization
Self-attention mechanisms have achieved great success on a variety of NL...

09/09/2021  Is Attention Better Than Matrix Decomposition?
As an essential ingredient of modern deep learning, attention mechanism,...

06/16/2020  Untangling tradeoffs between recurrence and self-attention in neural networks
Attention and self-attention mechanisms, inspired by cognitive processes...
