Multi-Document Keyphrase Extraction: A Literature Review and the First Dataset

10/03/2021
by Ori Shapira, et al.

Keyphrase extraction has been comprehensively researched within the single-document setting, with an abundance of methods and a wealth of datasets. In contrast, multi-document keyphrase extraction has rarely been studied, despite its utility for describing sets of documents and its use in summarization. Moreover, no dataset has existed for multi-document keyphrase extraction, which has hindered progress on the task. Recent advances in multi-text processing make the task an even more appealing challenge to pursue. To initiate this pursuit, we present here the first literature review and the first dataset for the task, MK-DUC-01, which can serve as a new benchmark. We test several keyphrase extraction baselines on our data and report their results.


