Long-Span Dependencies in Transformer-based Summarization Systems

05/08/2021
by Potsawee Manakul, et al.

Transformer-based models have achieved state-of-the-art results in a wide range of natural language processing (NLP) tasks, including document summarization. Typically these systems are trained by fine-tuning a large pre-trained model on the target task. One issue with these transformer-based models is that they do not scale well in terms of memory and compute requirements as the input length grows. Thus, for long-document summarization, it can be challenging to train or fine-tune these models. In this work, we exploit large pre-trained transformer-based models and address long-span dependencies in abstractive summarization using two methods: local self-attention and explicit content selection. These approaches are compared on a range of network configurations. Experiments are carried out on standard long-span summarization tasks, including the Spotify Podcast, arXiv, and PubMed datasets. We demonstrate that by combining these methods, we can achieve state-of-the-art results on all three tasks in terms of ROUGE scores. Moreover, our approach can achieve comparable or better results than existing approaches without requiring a large-scale GPU card.
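The first of the two methods, local self-attention, restricts each token to attending only within a fixed-size window, so memory and compute grow linearly with input length rather than quadratically. A minimal single-head sketch is below; it is a hypothetical illustration (the paper's actual models add learned projections, multiple heads, and a full encoder-decoder stack), and the function name and `window` parameter are ours, not the authors'.

```python
import numpy as np

def local_self_attention(x, window):
    """Single-head self-attention restricted to a local window.

    Each position attends only to positions at most `window` steps away,
    so cost scales as O(n * window) instead of O(n^2) in sequence length n.
    Minimal sketch: Q = K = V = x; real models use learned projections.
    """
    n, d = x.shape
    scores = x @ x.T / np.sqrt(d)                        # (n, n) similarities
    idx = np.arange(n)
    mask = np.abs(idx[:, None] - idx[None, :]) > window  # True = out of window
    scores[mask] = -np.inf                               # block long-range links
    # Numerically stable softmax over the unmasked positions of each row.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ x                                   # (n, d) outputs
```

With `window=1` each output mixes only a token and its immediate neighbours; the second method, explicit content selection, would instead shorten the input before attention by keeping only the most salient sentences.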


