Keyphrase Extraction with Span-based Feature Representations

02/13/2020
by Funan Mu, et al.

Keyphrases provide semantic metadata that characterizes a document and gives an overview of its content. Because keyphrase extraction facilitates the management, categorization, and retrieval of information, it has received much attention in recent years. There are three approaches to keyphrase extraction: (i) the traditional two-step ranking method, (ii) sequence labeling, and (iii) generation using neural networks. The two-step ranking approach is based on feature engineering, which is labor intensive and domain dependent. Sequence labeling cannot handle overlapping phrases. Generation methods (i.e., sequence-to-sequence neural network models) overcome these shortcomings, so they have been widely studied and achieve state-of-the-art performance. However, generation methods cannot utilize context information effectively. In this paper, we propose a novel Span Keyphrase Extraction model that extracts a span-based feature representation of each keyphrase directly from all the content tokens. In this way, our model obtains a representation for each keyphrase and further learns to capture the interaction between keyphrases in one document to get better ranking results. In addition, because it operates on tokens, our model is able to extract overlapping keyphrases. Experimental results on the benchmark datasets show that our proposed model outperforms the existing methods by a large margin.
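To illustrate why a span-based formulation can represent overlapping phrases where one-tag-per-token sequence labeling cannot, here is a minimal sketch in Python. It is not the authors' model: it only enumerates candidate token spans, and the names `enumerate_spans`, `spans_overlap`, and the `max_len` parameter are hypothetical choices made for this example.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # (start, end) token indices, end exclusive


def enumerate_spans(tokens: List[str], max_len: int = 3) -> List[Span]:
    """Return every contiguous token span of at most max_len tokens.

    Hypothetical helper: a span-based extractor would score each of
    these candidates; the scoring step is omitted here.
    """
    spans: List[Span] = []
    for start in range(len(tokens)):
        last_end = min(start + max_len, len(tokens))
        for end in range(start + 1, last_end + 1):
            spans.append((start, end))
    return spans


def spans_overlap(a: Span, b: Span) -> bool:
    """True if two half-open spans share at least one token."""
    return a[0] < b[1] and b[0] < a[1]


# Toy input, chosen only for illustration.
tokens = ["neural", "keyphrase", "extraction", "model"]
spans = enumerate_spans(tokens, max_len=3)
phrases = [" ".join(tokens[s:e]) for s, e in spans]

# Both "keyphrase extraction" and "neural keyphrase extraction" are
# separate candidates, even though they share tokens.
print(phrases)
```

Because every span is its own candidate, two phrases that share tokens can both be selected, whereas a tagging scheme that assigns a single label to each token has to commit each token to at most one phrase.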

research
05/26/2022

Jointly Learning Span Extraction and Sequence Labeling for Information Extraction from Business Documents

This paper introduces a new information extraction model for business do...
research
08/04/2020

Select, Extract and Generate: Neural Keyphrase Generation with Syntactic Guidance

In recent years, deep neural sequence-to-sequence framework has demonstr...
research
03/16/2022

FormNet: Structural Encoding beyond Sequential Modeling in Form Document Information Extraction

Sequence modeling has demonstrated state-of-the-art performance on natur...
research
08/17/2023

Enhancing Phrase Representation by Information Bottleneck Guided Text Diffusion Process for Keyphrase Extraction

Keyphrase extraction (KPE) is an important task in Natural Language Proc...
research
06/06/2021

Extractive Research Slide Generation Using Windowed Labeling Ranking

Presentation slides describing the content of scientific and technical p...
research
08/01/2016

Keyphrase Extraction using Sequential Labeling

Keyphrases efficiently summarize a document's content and are used in va...
research
01/05/2018

Learning Feature Representations for Keyphrase Extraction

In supervised approaches for keyphrase extraction, a candidate phrase is...
