Bidirectional Attentional Encoder-Decoder Model and Bidirectional Beam Search for Abstractive Summarization

09/18/2018
by Kamal Al-Sabahi, et al.

Sequence generative models with RNN variants such as LSTM and GRU show promising performance on abstractive document summarization. However, they still have issues that limit their performance, especially when dealing with long sequences. One such issue is that, to the best of our knowledge, all current models employ a unidirectional decoder, which reasons only about the past and cannot use future context when making a prediction; as a result, these models tend to generate unbalanced outputs. Moreover, unidirectional attention-based document summarization can capture only partial aspects of attentional regularities, owing to the inherent challenges of document summarization. To this end, we propose an end-to-end trainable bidirectional RNN model to tackle these issues. The model has a bidirectional encoder-decoder architecture in which both the encoder and the decoder are bidirectional LSTMs. The forward decoder is initialized with the last hidden state of the backward encoder, while the backward decoder is initialized with the last hidden state of the forward encoder. In addition, a bidirectional beam search mechanism is proposed as an approximate inference algorithm for generating output summaries from the bidirectional model. This enables the model to reason about both the past and the future and, as a result, to generate balanced outputs. Experimental results on the CNN/Daily Mail dataset show that the proposed model outperforms current abstractive state-of-the-art models by a considerable margin.
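The cross-initialization between the two encoder directions and the two decoder directions is the architectural core here, so a concrete sketch may help. Below is a minimal PyTorch illustration, assuming single-layer LSTMs and arbitrary dimensions; it omits the attention mechanism and the bidirectional beam search, and the class and variable names are ours, not the authors'.

```python
# Minimal sketch of the cross-initialized bidirectional encoder-decoder.
# All names and sizes are illustrative assumptions, not the paper's code.
import torch
import torch.nn as nn

class BiEncoderBiDecoder(nn.Module):
    def __init__(self, vocab_size, emb_dim=128, hid_dim=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        # Bidirectional encoder: one forward and one backward LSTM pass.
        self.encoder = nn.LSTM(emb_dim, hid_dim,
                               bidirectional=True, batch_first=True)
        # Two unidirectional decoders, one per generation direction.
        self.fwd_decoder = nn.LSTM(emb_dim, hid_dim, batch_first=True)
        self.bwd_decoder = nn.LSTM(emb_dim, hid_dim, batch_first=True)

    def forward(self, src, tgt):
        _, (h_n, c_n) = self.encoder(self.embed(src))
        # h_n and c_n have shape (2, batch, hid_dim):
        # index 0 = forward encoder (state after the last source token),
        # index 1 = backward encoder (state after the first source token).
        h_fwd, h_bwd = h_n[0:1], h_n[1:2]
        c_fwd, c_bwd = c_n[0:1], c_n[1:2]
        tgt_emb = self.embed(tgt)
        # Forward decoder starts from the BACKWARD encoder's last state ...
        fwd_out, _ = self.fwd_decoder(tgt_emb, (h_bwd, c_bwd))
        # ... while the backward decoder starts from the FORWARD encoder's
        # last state and consumes the target sequence reversed in time.
        bwd_out, _ = self.bwd_decoder(tgt_emb.flip(1), (h_fwd, c_fwd))
        return fwd_out, bwd_out
```

Intuitively, each decoder is seeded with the encoder state computed nearest to its own starting end of the document; at inference time the two decoders' hypotheses would then be reconciled by the proposed bidirectional beam search.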


Related research

Cutting-off Redundant Repeating Generations for Neural Abstractive Summarization (12/31/2016)
This paper tackles the reduction of redundant repeating generation that ...

Bidirectional Beam Search: Forward-Backward Inference in Neural Sequence Models for Fill-in-the-Blank Image Captioning (05/24/2017)
We develop the first approximate inference algorithm for 1-Best (and M-Best) ...

Attention Head Masking for Inference Time Content Selection in Abstractive Summarization (04/06/2021)
How can we effectively inform content selection in Transformer-based abs...

Forward-Backward Decoding for Regularizing End-to-End TTS (07/18/2019)
Neural end-to-end TTS can generate very high-quality synthesized speech,...

Towards Abstraction from Extraction: Multiple Timescale Gated Recurrent Unit for Summarization (07/04/2016)
In this work, we introduce temporal hierarchies to the sequence to seque...

A Mixed Hierarchical Attention based Encoder-Decoder Approach for Standard Table Summarization (04/20/2018)
Structured data summarization involves generation of natural language su...

Few-Shot Learning of an Interleaved Text Summarization Model by Pretraining with Synthetic Data (03/08/2021)
Interleaved texts, where posts belonging to different threads occur in a...
