Unsupervised Summarization by Jointly Extracting Sentences and Keywords

09/16/2020
by Zongyi Li, et al.
We present RepRank, an unsupervised graph-based ranking model for extractive multi-document summarization in which the similarities between words, between sentences, and between words and sentences are estimated from the distances between their vector representations in a unified vector space. To obtain suitable representations, we propose a self-attention-based learning method that represents a sentence as the weighted sum of its word embeddings, with the weights concentrated on the words expected to better reflect the content of a document. We show that salient sentences and keywords can be extracted in a joint, mutually reinforcing process using the learned representations, and we prove that this process always converges to a unique solution, which leads to improved performance. We also describe a variant of the absorbing random walk, together with a corresponding sampling-based algorithm, that avoids redundancy and increases diversity in the summaries. Experimental results on multiple benchmark datasets show that RepRank achieves the best or comparable ROUGE scores.
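The joint extraction idea can be illustrated with a minimal sketch. It is not the RepRank implementation: the abstract gives no equations, so the function names (`sentence_vectors`, `cosine_affinity`, `mutual_reinforcement`), the hand-supplied attention weights, the clipped cosine affinity, and the HITS-style update are all assumptions chosen for illustration. The sketch builds sentence vectors as weighted sums of word vectors, measures sentence-to-word similarity in the shared space, and alternates score updates until they stop changing.

```python
import numpy as np


def sentence_vectors(word_vecs, sentences, attn_weights):
    """Represent each sentence as a normalized weighted sum of its word vectors.

    `sentences` holds lists of word indices; `attn_weights` holds matching
    non-negative weights (assumed given here, not learned).
    """
    rows = []
    for words, weights in zip(sentences, attn_weights):
        w = np.asarray(weights, dtype=float)
        w = w / w.sum()
        rows.append(w @ word_vecs[words])
    return np.vstack(rows)


def cosine_affinity(a, b, eps=1e-6):
    """Sentence-to-word affinity: cosine similarity clipped to be positive.

    The small `eps` keeps every entry strictly positive, which is what makes
    the iteration below converge to a single fixed point in this sketch.
    """
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return np.clip(a @ b.T, 0.0, None) + eps


def mutual_reinforcement(affinity, iters=200, tol=1e-10):
    """Alternate sentence and word score updates (a HITS-style assumption)."""
    n_sent = affinity.shape[0]
    s = np.full(n_sent, 1.0 / n_sent)
    for _ in range(iters):
        w = affinity.T @ s          # words score high if in salient sentences
        w /= w.sum()
        s_new = affinity @ w        # sentences score high if close to keywords
        s_new /= s_new.sum()
        converged = np.abs(s_new - s).sum() < tol
        s = s_new
        if converged:
            break
    w = affinity.T @ s
    w /= w.sum()
    return s, w


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    word_vecs = rng.normal(size=(6, 4))          # toy embeddings, 6 words
    sentences = [[0, 1, 2], [2, 3], [3, 4, 5]]   # word indices per sentence
    attn = [[1, 1, 1], [1, 3], [2, 1, 1]]        # toy attention weights
    sent_vecs = sentence_vectors(word_vecs, sentences, attn)
    s, w = mutual_reinforcement(cosine_affinity(sent_vecs, word_vecs))
    print("sentence ranking:", np.argsort(-s))
    print("keyword ranking:", np.argsort(-w))
```

Because the affinity matrix here is strictly positive, the alternating update is a power iteration with a unique positive fixed point, which mirrors (but does not reproduce) the convergence property claimed in the abstract. The redundancy-avoiding absorbing random walk is not covered by this sketch.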


