Keyphrase Extraction from Scholarly Articles as Sequence Labeling using Contextualized Embeddings

10/19/2019
by Dhruva Sahrawat, et al.

In this paper, we formulate keyphrase extraction from scholarly articles as a sequence labeling task solved using a BiLSTM-CRF, where the words in the input text are represented using deep contextualized embeddings. We evaluate the proposed architecture using both contextualized and fixed word embedding models on three benchmark datasets (Inspec, SemEval 2010, SemEval 2017) and compare it with popular existing unsupervised and supervised techniques. Our results quantify the benefits of (a) using contextualized embeddings (e.g., BERT) over fixed word embeddings (e.g., GloVe); (b) using a BiLSTM-CRF architecture with contextualized word embeddings over fine-tuning the contextualized word embedding model directly; and (c) using genre-specific contextualized embeddings (SciBERT). Through error analysis, we also provide insights into why particular models work better than others. Lastly, we present a case study in which we analyze different self-attention layers of the two best models (BERT and SciBERT) to better understand the predictions each makes for keyphrase extraction.
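To make the sequence labeling formulation concrete, the following is a minimal sketch of how a tokenized text and its gold keyphrases might be turned into per-token labels, and how predicted labels might be turned back into phrases. It assumes a B/I/O tagging scheme (B for the first token of a keyphrase, I for a continuation, O for everything else); the abstract does not state the exact label set, and the function names `bio_tags` and `decode_spans` are hypothetical. The BiLSTM-CRF itself is not shown: it would sit between these two steps, predicting one tag per token.

```python
from typing import List, Sequence


def bio_tags(tokens: Sequence[str], keyphrases: Sequence[Sequence[str]]) -> List[str]:
    """Label each token B, I, or O (assumed scheme, not confirmed by the abstract)."""
    tags = ["O"] * len(tokens)
    lowered = [t.lower() for t in tokens]
    # Match longer phrases first so a short phrase never overwrites a longer one.
    for phrase in sorted(keyphrases, key=len, reverse=True):
        target = [w.lower() for w in phrase]
        n = len(target)
        if n == 0:
            continue
        for start in range(len(tokens) - n + 1):
            span_free = all(tag == "O" for tag in tags[start:start + n])
            if span_free and lowered[start:start + n] == target:
                tags[start] = "B"
                tags[start + 1:start + n] = ["I"] * (n - 1)
    return tags


def decode_spans(tokens: Sequence[str], tags: Sequence[str]) -> List[str]:
    """Turn a tag sequence back into keyphrase strings."""
    phrases: List[str] = []
    current: List[str] = []
    for token, tag in zip(tokens, tags):
        if tag == "B":
            # A new phrase starts; flush any phrase still being built.
            if current:
                phrases.append(" ".join(current))
            current = [token]
        elif tag == "I" and current:
            current.append(token)
        else:
            # O tag, or a stray I with no preceding B: close the open phrase.
            if current:
                phrases.append(" ".join(current))
            current = []
    if current:
        phrases.append(" ".join(current))
    return phrases


tokens = "We use a BiLSTM-CRF for keyphrase extraction".split()
gold = [["keyphrase", "extraction"], ["BiLSTM-CRF"]]
tags = bio_tags(tokens, gold)
print(list(zip(tokens, tags)))
print(decode_spans(tokens, tags))
```

In this sketch, matching is case-insensitive and exact on tokens, which is a simplification; real datasets typically need stemming or normalization to align annotated keyphrases with the text.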

research
09/03/2019

Aspect Detection using Word and Char Embeddings with (Bi)LSTM and CRF

We proposed a new accurate aspect extraction method that makes use of bo...
research
07/26/2019

Investigating Self-Attention Network for Chinese Word Segmentation

Neural network has become the dominant method for Chinese word segmentat...
research
12/16/2019

Predicting the Outcome of Judicial Decisions made by the European Court of Human Rights

In this study, machine learning models were constructed to predict wheth...
research
11/29/2020

Improved Semantic Role Labeling using Parameterized Neighborhood Memory Adaptation

Deep neural models achieve some of the best results for semantic role la...
research
03/30/2022

Detecting Unassimilated Borrowings in Spanish: An Annotated Corpus and Approaches to Modeling

This work presents a new resource for borrowing identification and analy...
research
11/13/2021

Keyphrase Extraction Using Neighborhood Knowledge Based on Word Embeddings

Keyphrase extraction is the task of finding several interesting phrases ...
research
09/06/2020

MIDAS at SemEval-2020 Task 10: Emphasis Selection using Label Distribution Learning and Contextual Embeddings

This paper presents our submission to the SemEval 2020 - Task 10 on emph...
