How Does Selective Mechanism Improve Self-Attention Networks?

05/03/2020
by   Xinwei Geng, et al.

Self-attention networks (SANs) with a selective mechanism have produced substantial improvements in various NLP tasks by concentrating on a subset of input words. However, the underlying reasons for their strong performance have not been well explained. In this paper, we bridge the gap by assessing the strengths of selective SANs (SSANs), which are implemented with a flexible and universal Gumbel-Softmax. Experimental results on several representative NLP tasks, including natural language inference, semantic role labelling, and machine translation, show that SSANs consistently outperform the standard SANs. Through well-designed probing experiments, we empirically validate that the improvement of SSANs can be attributed in part to mitigating two commonly cited weaknesses of SANs: word order encoding and structure modeling. Specifically, the selective mechanism improves SANs by paying more attention to content words that contribute to the meaning of the sentence. The code and data are released at https://github.com/xwgeng/SSAN.
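
To make the idea of a Gumbel-Softmax-based selector more concrete, the following is a minimal NumPy sketch, not the authors' implementation (the released repository is the reference for that). It assumes a two-way relaxation (skip vs. select) for each query-key pair and uses the resulting soft gate to reweight ordinary scaled dot-product attention. The names `gumbel_softmax`, `selective_attention`, and `select_logits`, as well as the way the gate is combined with the attention weights, are illustrative assumptions.

```python
import numpy as np


def gumbel_softmax(logits, tau=1.0, rng=None):
    """Relaxed categorical sample over the last axis (Gumbel-Softmax)."""
    rng = np.random.default_rng() if rng is None else rng
    # Sample Gumbel(0, 1) noise; the lower bound avoids log(0).
    u = rng.uniform(low=1e-9, high=1.0, size=logits.shape)
    gumbel = -np.log(-np.log(u))
    y = (logits + gumbel) / tau
    y = y - y.max(axis=-1, keepdims=True)  # numerical stability
    e = np.exp(y)
    return e / e.sum(axis=-1, keepdims=True)


def selective_attention(q, k, v, select_logits, tau=1.0, rng=None):
    """Scaled dot-product attention reweighted by soft selection gates.

    q, k, v:        arrays of shape (n, d)
    select_logits:  array of shape (n, n, 2); the last axis holds
                    [skip, select] logits (an assumed parameterisation)
    """
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)
    # Soft probability of "select" for every query-key pair.
    gates = gumbel_softmax(select_logits, tau=tau, rng=rng)[..., 1]
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores) * gates
    # Renormalise so each query's weights sum to (approximately) one.
    weights = weights / (weights.sum(axis=-1, keepdims=True) + 1e-9)
    return weights @ v, weights, gates


rng = np.random.default_rng(0)
n, d = 5, 8
q, k, v = (rng.normal(size=(n, d)) for _ in range(3))
select_logits = rng.normal(size=(n, n, 2))
out, weights, gates = selective_attention(q, k, v, select_logits, tau=1.0, rng=rng)
```

In this sketch, a lower temperature `tau` pushes the gates toward hard 0/1 decisions, so each query attends to a smaller subset of positions; in a trained model the selection logits would be produced by learned parameters rather than sampled at random.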


