A Joint Learning Approach based on Self-Distillation for Keyphrase Extraction from Scientific Documents

10/22/2020
by Tuan Manh Lai, et al.

Keyphrase extraction is the task of extracting a small set of phrases that best describe a document. Most existing benchmark datasets for the task contain only a limited number of annotated documents, which makes it challenging to train increasingly complex neural networks. In contrast, digital libraries store millions of scientific articles online, covering a wide range of topics. While a significant portion of these articles include keyphrases provided by their authors, most others lack such annotations. To make effective use of this large pool of unlabeled articles, we propose a simple and efficient joint learning approach based on self-distillation. Experimental results show that our approach consistently improves the performance of baseline models for keyphrase extraction. Furthermore, our best models outperform previous methods for the task, achieving new state-of-the-art results on two public benchmarks: Inspec and SemEval-2017.

