Enhancing Phrase Representation by Information Bottleneck Guided Text Diffusion Process for Keyphrase Extraction

08/17/2023
by Yuanzhen Luo, et al.

Keyphrase extraction (KPE) is an important Natural Language Processing task that aims to extract the keyphrases present in a given document. Many existing supervised methods treat KPE as a sequence labeling, span-level classification, or generation task. However, these methods cannot utilize keyphrase information, which may bias their results. In this study, we propose Diff-KPE, which leverages a supervised Variational Information Bottleneck (VIB) to guide the text diffusion process for generating enhanced keyphrase representations. Diff-KPE first generates the desired keyphrase embeddings conditioned on the entire document and then injects the generated keyphrase embeddings into each phrase representation. A ranking network and the VIB are then optimized jointly, with a rank loss and a classification loss, respectively. This design allows Diff-KPE to rank each candidate phrase using information from both the keyphrases and the document. Experiments show that Diff-KPE outperforms existing KPE methods on OpenKP, a large open-domain keyphrase extraction benchmark, and on KP20K, a scientific-domain dataset.
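The injection and ranking steps described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: it assumes that "injecting" means concatenating a generated keyphrase embedding onto each phrase vector, that the ranking network is a single linear scorer, and that the rank loss is a pairwise margin (hinge) loss. The diffusion module and the VIB classification loss are omitted, and all function names, dimensions, and candidate phrases are hypothetical.

```python
import random


def inject(phrase_vec, keyphrase_vec):
    """Assumed injection: append the generated keyphrase embedding to a phrase vector."""
    return phrase_vec + keyphrase_vec


def score(vec, weights):
    """Toy ranking network: a single linear layer (dot product)."""
    return sum(v * w for v, w in zip(vec, weights))


def margin_rank_loss(pos_scores, neg_scores, margin=1.0):
    """Average hinge loss over all (keyphrase, non-keyphrase) score pairs."""
    pairs = [(p, n) for p in pos_scores for n in neg_scores]
    return sum(max(0.0, margin - p + n) for p, n in pairs) / len(pairs)


rng = random.Random(0)
dim = 4

# Random stand-in for the keyphrase embedding a diffusion module would generate.
keyphrase_vec = [rng.gauss(0, 1) for _ in range(dim)]

# Hypothetical candidate phrases with random stand-in representations.
phrases = {
    name: [rng.gauss(0, 1) for _ in range(dim)]
    for name in ("text diffusion", "given document", "keyphrase extraction")
}
gold = {"text diffusion", "keyphrase extraction"}  # hypothetical labels

# Untrained scorer weights over the concatenated (phrase + keyphrase) vector.
weights = [rng.gauss(0, 1) for _ in range(2 * dim)]

scores = {name: score(inject(vec, keyphrase_vec), weights) for name, vec in phrases.items()}
ranked = sorted(scores, key=scores.get, reverse=True)

loss = margin_rank_loss(
    [s for name, s in scores.items() if name in gold],
    [s for name, s in scores.items() if name not in gold],
)
print(ranked, loss)
```

With untrained weights the ranking is arbitrary; the sketch only shows how a keyphrase embedding can enter each candidate's score and how a pairwise loss would push labeled keyphrases above the other candidates.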


research
07/16/2018

Theme-weighted Ranking of Keywords from Text Documents using Phrase Embeddings

Keyword extraction is a fundamental task in natural language processing ...
research
08/01/2016

Keyphrase Extraction using Sequential Labeling

Keyphrases efficiently summarize a document's content and are used in va...
research
05/04/2022

Hyperbolic Relevance Matching for Neural Keyphrase Extraction

Keyphrase extraction is a fundamental task in natural language processin...
research
10/19/2021

Importance Estimation from Multiple Perspectives for Keyphrase Extraction

Keyphrase extraction is a fundamental task in Natural Language Processin...
research
02/21/2017

Systèmes du LIA à DEFT'13

The 2013 Défi de Fouille de Textes (DEFT) campaign is interested in two ...
research
02/13/2020

Keyphrase Extraction with Span-based Feature Representations

Keyphrases are capable of providing semantic metadata characterizing doc...
research
04/18/2021

Unsupervised Deep Keyphrase Generation

Keyphrase generation aims to summarize long documents with a collection ...
