Evaluating Entity Disambiguation and the Role of Popularity in Retrieval-Based NLP

06/12/2021
by Anthony Chen, et al.

Retrieval is a core component of open-domain NLP tasks. In these tasks, multiple entities can share a name, making disambiguation an inherent yet under-explored problem. We propose an evaluation benchmark for assessing the entity disambiguation capabilities of retrievers, which we call Ambiguous Entity Retrieval (AmbER) sets. We define an AmbER set as a collection of entities that share a name, along with queries about those entities. By covering the set of entities for polysemous names, AmbER sets act as a challenging test of entity disambiguation. We create AmbER sets for three popular open-domain tasks (fact checking, slot filling, and question answering) and evaluate a diverse set of retrievers. We find that the retrievers exhibit popularity bias, significantly under-performing on rarer entities that share a name: for example, they are twice as likely to retrieve erroneous documents for queries about the less popular entity under the same name. These experiments show the utility of AmbER sets as an evaluation tool and highlight the weaknesses of popular retrieval systems.
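
To make the definition concrete, the following is a minimal sketch of how an AmbER set and a popularity-sliced retrieval score could be represented. It is an illustration under stated assumptions, not the authors' implementation: the class and function names (Entity, Query, AmberSet, accuracy_by_popularity), the integer popularity field, the head/tail split, and the use of top-1 accuracy are all hypothetical choices made for this example, and the toy data is invented.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Entity:
    doc_id: str      # identifier of the gold document for this entity
    popularity: int  # hypothetical popularity score (the measure is an assumption)


@dataclass
class Query:
    text: str
    gold_doc_id: str  # document the retriever should return for this query


@dataclass
class AmberSet:
    name: str               # the name shared by all entities in the set
    entities: List[Entity]
    queries: List[Query]


def accuracy_by_popularity(
    amber_sets: List[AmberSet],
    retrieve: Callable[[str], str],
) -> Dict[str, float]:
    """Top-1 retrieval accuracy, split into queries about the most popular
    entity of each set ("head") and queries about all other entities ("tail")."""
    hits = {"head": 0, "tail": 0}
    totals = {"head": 0, "tail": 0}
    for amber_set in amber_sets:
        head_id = max(amber_set.entities, key=lambda e: e.popularity).doc_id
        for query in amber_set.queries:
            bucket = "head" if query.gold_doc_id == head_id else "tail"
            totals[bucket] += 1
            hits[bucket] += int(retrieve(query.text) == query.gold_doc_id)
    # Skip buckets with no queries to avoid dividing by zero.
    return {b: hits[b] / totals[b] for b in totals if totals[b] > 0}


# Toy data, invented for illustration only.
toy_set = AmberSet(
    name="shared name",
    entities=[Entity("doc_popular", 1000), Entity("doc_rare", 10)],
    queries=[
        Query("question about the popular entity", "doc_popular"),
        Query("question about the rare entity", "doc_rare"),
    ],
)


def always_popular(query_text: str) -> str:
    """A deliberately biased stand-in retriever that ignores the query."""
    return "doc_popular"


scores = accuracy_by_popularity([toy_set], always_popular)
print(scores)
```

Reporting head and tail accuracy separately, rather than as one averaged number, is what lets a gap between popular and rare entities show up in the score.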

research
04/18/2022

TABi: Type-Aware Bi-Encoders for Open-Domain Entity Retrieval

Entity retrieval–retrieving information about entity mentions in a query...
research
01/23/2018

Entity Retrieval and Text Mining for Online Reputation Monitoring

Online Reputation Monitoring (ORM) is concerned with the use of computat...
research
08/24/2021

Robustness Evaluation of Entity Disambiguation Using Prior Probes: the Case of Entity Overshadowing

Entity disambiguation (ED) is the last step of entity linking (EL), when...
research
10/07/2022

A Unified Encoder-Decoder Framework with Entity Memory

Entities, as important carriers of real-world knowledge, play a key role...
research
09/04/2020

KILT: a Benchmark for Knowledge Intensive Language Tasks

Challenging problems such as open-domain question answering, fact checki...
research
10/02/2020

Autoregressive Entity Retrieval

Entities are at the center of how we represent and aggregate knowledge. ...
research
05/19/2023

QUEST: A Retrieval Dataset of Entity-Seeking Queries with Implicit Set Operations

Formulating selective information needs results in queries that implicit...
