Redefining Absent Keyphrases and their Effect on Retrieval Effectiveness

03/23/2021
by Florian Boudin, et al.

Neural keyphrase generation models have recently attracted much interest due to their ability to output absent keyphrases, that is, keyphrases that do not appear in the source text. In this paper, we discuss the usefulness of absent keyphrases from an Information Retrieval (IR) perspective, and show that the commonly drawn distinction between present and absent keyphrases is not made explicit enough. We introduce a finer-grained categorization scheme that sheds more light on the impact of absent keyphrases on scientific document retrieval. Under this scheme, we find that only a fraction (around 20%) of the words that make up keyphrases actually serves as document expansion, but that this small fraction of words is behind much of the gains observed in retrieval effectiveness. We also discuss how the proposed scheme can offer a new angle to evaluate the output of neural keyphrase generation models.
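To make the present/absent distinction and the notion of document expansion concrete, here is a minimal Python sketch. It is not the paper's categorization scheme, whose categories are not spelled out in this abstract; it only illustrates the basic idea under simplifying assumptions: lowercase tokenization with no stemming, a keyphrase counted as present when its tokens occur contiguously in the text, and a keyphrase word counted as expansion when it does not occur anywhere in the text. The function names `tokenize`, `is_present`, and `expansion_ratio` are hypothetical.

```python
import re


def tokenize(text):
    """Lowercase and split on non-alphanumeric characters (assumption: no stemming)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def is_present(keyphrase, doc_tokens):
    """Return True if the keyphrase tokens occur as a contiguous run in the document."""
    kp = tokenize(keyphrase)
    n = len(kp)
    if n == 0:
        return False
    return any(doc_tokens[i:i + n] == kp for i in range(len(doc_tokens) - n + 1))


def expansion_ratio(keyphrases, text):
    """Fraction of keyphrase words that do not occur in the text at all.

    Only these words add new vocabulary to the document, i.e. act as expansion.
    """
    vocab = set(tokenize(text))
    words = [w for kp in keyphrases for w in tokenize(kp)]
    if not words:
        return 0.0
    new_words = [w for w in words if w not in vocab]
    return len(new_words) / len(words)
```

Note that under these assumptions a keyphrase can be absent as a phrase while all of its words already occur in the text, in which case it contributes nothing to the expansion ratio. This is one way to see why a binary present/absent split says little about which keyphrases actually expand a document.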


