An Efficient Coarse-to-Fine Facet-Aware Unsupervised Summarization Framework based on Semantic Blocks

08/17/2022
by   Xinnian Liang, et al.

Unsupervised summarization methods have achieved remarkable results by incorporating representations from pre-trained language models. However, existing methods fail to achieve efficiency and effectiveness at the same time when the input document is extremely long. To tackle this problem, we propose an efficient Coarse-to-Fine Facet-Aware Ranking (C2F-FAR) framework for unsupervised long document summarization, which is based on the semantic block. A semantic block is a run of consecutive sentences in the document that describe the same facet. Specifically, we convert the one-step ranking method into a hierarchical, multi-granularity, two-stage ranking. In the coarse-level stage, we propose a new segmentation algorithm that splits the document into facet-aware semantic blocks and then filters out insignificant blocks. In the fine-level stage, we select salient sentences in each block and then extract the final summary from the selected sentences. We evaluate our framework on four long document summarization datasets: Gov-Report, BillSum, arXiv, and PubMed. C2F-FAR achieves new state-of-the-art unsupervised summarization results on Gov-Report and BillSum. In addition, our method is 4 to 28 times faster than previous methods.[<https://github.com/xnliang98/c2f-far>]
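The two-stage idea described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: bag-of-words cosine similarity stands in for pre-trained language model representations, the threshold-based `segment` function stands in for the paper's segmentation algorithm, and simple centrality stands in for its ranking. All function names and parameters (`threshold`, `keep_blocks`, `per_block`, `summary_len`) are assumptions made for illustration.

```python
import math
from collections import Counter


def bow(sentence):
    """Lowercased bag-of-words vector (assumed stand-in for a PLM embedding)."""
    return Counter(sentence.lower().split())


def cosine(a, b):
    """Cosine similarity between two Counter vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def segment(sentences, threshold=0.2):
    """Group consecutive sentence indices into blocks. A new block starts
    when similarity to the previous sentence drops below `threshold`
    (a hypothetical rule, not the paper's algorithm)."""
    blocks = []
    for i, s in enumerate(sentences):
        if blocks and cosine(bow(sentences[i - 1]), bow(s)) >= threshold:
            blocks[-1].append(i)
        else:
            blocks.append([i])
    return blocks


def centrality(vectors):
    """Score each vector by its summed similarity to all the others."""
    return [
        sum(cosine(v, w) for j, w in enumerate(vectors) if j != i)
        for i, v in enumerate(vectors)
    ]


def summarize(sentences, keep_blocks=2, per_block=1, summary_len=2, threshold=0.2):
    vecs = [bow(s) for s in sentences]
    blocks = segment(sentences, threshold)

    # Coarse stage: rank blocks against each other, keep the most central.
    block_vecs = [sum((vecs[i] for i in b), Counter()) for b in blocks]
    block_scores = centrality(block_vecs)
    top = sorted(range(len(blocks)), key=lambda k: -block_scores[k])[:keep_blocks]

    # Fine stage: pick the most central sentences inside each kept block.
    candidates = []
    for k in top:
        b = blocks[k]
        scores = centrality([vecs[i] for i in b])
        best = sorted(range(len(b)), key=lambda j: -scores[j])[:per_block]
        candidates.extend(b[j] for j in best)

    # Final selection: rank the candidates against the whole document,
    # then restore document order.
    doc_vec = sum(vecs, Counter())
    final = sorted(candidates, key=lambda i: -cosine(vecs[i], doc_vec))[:summary_len]
    return [sentences[i] for i in sorted(final)]
```

Calling `summarize(list_of_sentences)` returns at most `summary_len` sentences in document order. Because the fine stage only compares sentences within a kept block, the number of pairwise comparisons is much smaller than in one-step ranking over the whole document, which is the efficiency argument the abstract makes.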

research
02/24/2023

Improving Sentence Similarity Estimation for Unsupervised Extractive Summarization

Unsupervised extractive summarization aims to extract salient sentences ...
research
11/09/2022

Unsupervised Extractive Summarization with Heterogeneous Graph Embeddings for Chinese Document

In the scenario of unsupervised extractive summarization, learning high-...
research
10/16/2021

PRIMER: Pyramid-based Masked Sentence Pre-training for Multi-document Summarization

Recently proposed pre-trained generation models achieve strong performan...
research
03/13/2022

SummaReranker: A Multi-Task Mixture-of-Experts Re-ranking Framework for Abstractive Summarization

Sequence-to-sequence neural networks have recently achieved great succes...
research
08/21/2022

GRETEL: Graph Contrastive Topic Enhanced Language Model for Long Document Extractive Summarization

Recently, neural topic models (NTMs) have been incorporated into pre-tra...
research
08/19/2022

Sparse Optimization for Unsupervised Extractive Summarization of Long Documents with the Frank-Wolfe Algorithm

We address the problem of unsupervised extractive document summarization...
research
07/07/2019

Joint Lifelong Topic Model and Manifold Ranking for Document Summarization

Due to the manifold ranking method has a significant effect on the ranki...
