Attention Meets Perturbations: Robust and Interpretable Attention with Adversarial Training

09/25/2020
by Shunsuke Kitada, et al.

In recent years, there has been growing emphasis on the interpretability and robustness of deep learning models. The attention mechanism is an important technique that contributes to both and is widely used, especially in the natural language processing (NLP) field. Adversarial training (AT) is a powerful regularization technique for enhancing the robustness of neural networks and has been successful in many applications. Applying AT to the attention mechanism is expected to be highly effective, but it has received little research attention. In this paper, we propose two new general training techniques for NLP tasks: AT for attention (Attention AT) and more interpretable AT for attention (Attention iAT). Our proposals improve both the prediction performance and the interpretability of the model by applying AT to the attention mechanism. In particular, Attention iAT strengthens these advantages by introducing an adversarial perturbation that enlarges the differences in attention between words in sentences where it is unclear which words are important. We evaluated our techniques on ten open datasets covering various NLP tasks and compared them with a recent model using attention mechanisms. Our experiments revealed that AT for attention mechanisms, especially Attention iAT, achieved (1) the best prediction performance in nine out of ten tasks and (2) more interpretable attention (i.e., attention that correlated more strongly with gradient-based word importance) in all tasks. Additionally, our techniques are (3) much less sensitive to the perturbation size used in AT. Our code and more results are available at https://github.com/shunk031/attention-meets-perturbation
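
The core idea of Attention AT, adding a worst-case (gradient-direction) perturbation to the pre-softmax attention scores during training, can be sketched as follows. This is a minimal illustrative sketch in PyTorch, not the authors' implementation (see the repository above for that): the toy model, the class and function names (ToyAttentionClassifier, attention_at_loss), and the default epsilon are all our own assumptions. Attention iAT further changes how the perturbation is constructed, which is omitted here.

```python
# Minimal sketch of adversarial training on attention (Attention AT).
# Assumptions: a toy additive-attention classifier; perturbation is the
# L2-normalized gradient of the loss w.r.t. the pre-softmax attention scores.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyAttentionClassifier(nn.Module):
    def __init__(self, emb_dim=32, n_classes=2):
        super().__init__()
        self.score = nn.Linear(emb_dim, 1)    # attention scorer
        self.out = nn.Linear(emb_dim, n_classes)

    def forward(self, h, attn_perturbation=None):
        # h: (batch, seq_len, emb_dim) token representations
        scores = self.score(h).squeeze(-1)     # (batch, seq_len)
        if attn_perturbation is not None:
            scores = scores + attn_perturbation  # perturb pre-softmax scores
        alpha = F.softmax(scores, dim=-1)      # attention weights
        context = torch.einsum("bs,bsd->bd", alpha, h)
        return self.out(context), scores

def attention_at_loss(model, h, y, epsilon=1.0):
    """Clean loss plus adversarial loss from perturbed attention scores."""
    logits, scores = model(h)
    clean_loss = F.cross_entropy(logits, y)
    # Gradient of the loss w.r.t. the attention scores (worst-case direction)
    grad = torch.autograd.grad(clean_loss, scores, retain_graph=True)[0]
    r_adv = epsilon * grad / (grad.norm(dim=-1, keepdim=True) + 1e-12)
    # Second forward pass with the (detached) adversarial perturbation
    adv_logits, _ = model(h, attn_perturbation=r_adv.detach())
    return clean_loss + F.cross_entropy(adv_logits, y)

# Usage on random toy data:
model = ToyAttentionClassifier()
h = torch.randn(4, 10, 32)          # batch of 4 sequences, 10 tokens each
y = torch.randint(0, 2, (4,))
loss = attention_at_loss(model, h, y)
loss.backward()
```

Note that the perturbation is detached before the second forward pass, so gradients do not flow through its construction; only the model parameters are updated, which is the standard fast-gradient-style AT setup.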


Related research

05/08/2018  Interpretable Adversarial Perturbation in Input Embedding Space for Text
Following great success in the image processing field, the idea of adver...

09/24/2019  Attention Interpretability Across NLP Tasks
The attention layer in a neural network model provides insights into the...

01/08/2022  Clustering Text Using Attention
Clustering Text has been an important problem in the domain of Natural L...

11/23/2022  SEAT: Stable and Explainable Attention
Currently, attention mechanism becomes a standard fixture in most state-...

10/14/2021  The Irrationality of Neural Rationale Models
Neural rationale models are popular for interpretable predictions of NLP...

03/24/2023  Quadratic Graph Attention Network (Q-GAT) for Robust Construction of Gene Regulatory Networks
Gene regulatory relationships can be abstracted as a gene regulatory net...

12/30/2022  On the Interpretability of Attention Networks
Attention mechanisms form a core component of several successful deep le...
