HIBERT: Document Level Pre-training of Hierarchical Bidirectional Transformers for Document Summarization

05/16/2019
by   Xingxing Zhang, et al.

Neural extractive summarization models usually employ a hierarchical encoder for document encoding, and they are trained with sentence-level labels that are created heuristically using rule-based methods. Training the hierarchical encoder with these inaccurate labels is challenging. Inspired by recent work on pre-training Transformer sentence encoders (Devlin et al., 2018), we propose Hibert (short for HIerarchical Bidirectional Encoder Representations from Transformers) for document encoding, along with a method to pre-train it using unlabeled data. We apply the pre-trained Hibert to our summarization model, and it outperforms its randomly initialized counterpart by 1.25 ROUGE on the CNN/Dailymail dataset and by 2.0 ROUGE on a version of the New York Times dataset. We also achieve state-of-the-art performance on these two datasets.
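At a high level, a hierarchical document encoder of the kind described above stacks two Transformers: a sentence-level encoder that turns each sentence's tokens into a single vector, and a document-level encoder that contextualizes those vectors across the document. The PyTorch sketch below illustrates that structure only; the class name, first-token pooling choice, and hyperparameters are illustrative assumptions rather than the paper's released configuration, and it omits positional embeddings, padding masks, and the masked-sentence pre-training objective.

```python
import torch
import torch.nn as nn

class HierarchicalEncoder(nn.Module):
    """Minimal sketch of a hierarchical (sentence-then-document) encoder.

    Assumptions, not the paper's released code: first-token pooling for
    sentence vectors, shared model width for both levels, and no
    positional embeddings or padding masks.
    """
    def __init__(self, vocab_size, d_model=768, nhead=12, num_layers=6):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        sent_layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        doc_layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.sent_encoder = nn.TransformerEncoder(sent_layer, num_layers)
        self.doc_encoder = nn.TransformerEncoder(doc_layer, num_layers)

    def forward(self, docs):
        # docs: (batch, n_sents, n_tokens) integer token ids
        b, s, t = docs.shape
        tok = self.embed(docs.view(b * s, t))            # (b*s, t, d)
        sent_states = self.sent_encoder(tok)             # token-level encoding
        sent_vecs = sent_states[:, 0, :].view(b, s, -1)  # one vector per sentence
        return self.doc_encoder(sent_vecs)               # (b, s, d) sentence reps

# For extractive summarization, a linear layer over the returned sentence
# representations would score each sentence for inclusion in the summary.
```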


Related research

04/04/2020
STEP: Sequence-to-Sequence Transformer Pre-training for Document Summarization
Abstractive summarization aims to rewrite a long document to its shorter...

10/16/2020
Unsupervised Extractive Summarization by Pre-training Hierarchical Transformers
Unsupervised extractive document summarization aims to select important ...

01/26/2019
Language Model Pre-training for Hierarchical Document Representations
Hierarchical neural architectures are often used to capture long-distanc...

06/09/2023
FPDM: Domain-Specific Fast Pre-training Technique using Document-Level Metadata
Pre-training Transformers has shown promising results on open-domain and...

08/16/2019
Bidirectional Context-Aware Hierarchical Attention Network for Document Understanding
The Hierarchical Attention Network (HAN) has made great strides, but it ...

10/06/2020
Stepwise Extractive Summarization and Planning with Structured Transformers
We propose encoder-centric stepwise models for extractive summarization ...

08/22/2018
Neural Latent Extractive Document Summarization
Extractive summarization models need sentence level labels, which are us...
