
Multi-Document Keyphrase Extraction: A Literature Review and the First Dataset

by Ori Shapira, et al.

Keyphrase extraction has been comprehensively researched in the single-document setting, with an abundance of methods and a wealth of datasets. In contrast, multi-document keyphrase extraction has rarely been studied, despite its utility for describing sets of documents and its use in summarization. Moreover, no dataset has existed for multi-document keyphrase extraction, which has hindered progress on the task. Recent advances in multi-text processing make the task an even more appealing challenge to pursue. To initiate this pursuit, we present the first literature review and the first dataset for the task, MK-DUC-01, which can serve as a new benchmark. We test several keyphrase extraction baselines on our data and report their results.



