Attention Head Masking for Inference Time Content Selection in Abstractive Summarization

04/06/2021
by Shuyang Cao, et al.

How can we effectively inform content selection in Transformer-based abstractive summarization models? In this work, we present a simple yet effective attention head masking technique, which is applied to encoder-decoder attentions to pinpoint salient content at inference time. Using attention head masking, we are able to reveal the relation between encoder-decoder attentions and the content selection behaviors of summarization models. We then demonstrate its effectiveness on three document summarization datasets in both in-domain and cross-domain settings. Importantly, our models outperform prior state-of-the-art models on the CNN/DailyMail and New York Times datasets. Moreover, our inference-time masking technique is also data-efficient, requiring only 20% of the training samples to outperform BART fine-tuned on the full CNN/DailyMail dataset.
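The abstract does not spell out the masking mechanism, so the following is only a minimal sketch of one plausible reading: in a chosen subset of encoder-decoder attention heads, the pre-softmax scores of source tokens not marked as salient are set to negative infinity, so those heads can attend only to the selected content. The function name `masked_head_attention`, its arguments, and the toy inputs are hypothetical and are not taken from the paper.

```python
import math


def masked_head_attention(scores, salient, masked_heads):
    """Apply a content-selection mask to selected attention heads (sketch).

    scores[h][t][s]: raw attention score of head h at decoder step t
                     for source token s (assumed layout).
    salient[s]:      True if source token s was selected as salient
                     (output of some hypothetical content selector).
    masked_heads:    set of head indices to which the mask is applied.
    Returns softmax-normalized attention weights with the same shape.
    """
    if not any(salient):
        raise ValueError("at least one source token must be salient")
    weights = []
    for h, head in enumerate(scores):
        head_weights = []
        for row in head:
            if h in masked_heads:
                # Unselected tokens get -inf, i.e. zero weight after softmax.
                row = [x if keep else -math.inf for x, keep in zip(row, salient)]
            peak = max(row)
            exps = [math.exp(x - peak) for x in row]
            total = sum(exps)
            head_weights.append([e / total for e in exps])
        weights.append(head_weights)
    return weights


# Toy input: 2 heads, 1 decoder step, 3 source tokens; only head 0 is masked.
toy_scores = [[[1.0, 2.0, 3.0]], [[1.0, 2.0, 3.0]]]
toy_weights = masked_head_attention(toy_scores, [True, False, True], {0})
```

In this toy call, head 0 assigns zero weight to the second source token, while head 1 is left untouched. Which heads to mask and how salient tokens are chosen are left open here, since the abstract does not specify them.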

