High-Throughput and Language-Agnostic Entity Disambiguation and Linking on User Generated Data

03/13/2017
by Preeti Bhargava, et al.

The Entity Disambiguation and Linking (EDL) task matches entity mentions in text to a unique Knowledge Base (KB) identifier, such as a Wikipedia or Freebase id. It plays a critical role in constructing a high-quality information network, and its output can be leveraged for a variety of information retrieval and NLP tasks such as text categorization and document tagging. EDL is a complex and challenging problem because mentions are ambiguous and real-world text is multilingual. Moreover, EDL systems need high throughput and should be lightweight in order to scale to large datasets and run on off-the-shelf machines. More importantly, these systems need to extract and disambiguate dense annotations from the data so that Information Retrieval or Extraction tasks running on that data are more efficient and accurate. To address all these challenges, we present the Lithium EDL system and algorithm: a high-throughput, lightweight, language-agnostic EDL system that extracts and correctly disambiguates 75% more entities than state-of-the-art EDL systems and is significantly faster than them.

research
06/15/2023

DaMuEL: A Large Multilingual Dataset for Entity Linking

We present DaMuEL, a large Multilingual Dataset for Entity Linking conta...
research
07/13/2017

Lithium NLP: A System for Rich Information Extraction from Noisy User Generated Text on Social Media

In this paper, we describe the Lithium Natural Language Processing (NLP)...
research
08/13/2019

Linking Graph Entities with Multiplicity and Provenance

Entity linking is a fundamental database problem with applications in dat...
research
12/21/2021

Multimodal Entity Tagging with Multimodal Knowledge Base

To enhance research on multimodal knowledge base and multimodal informat...
research
04/19/2019

OpenTapioca: Lightweight Entity Linking for Wikidata

We propose a simple Named Entity Linking system that can be trained from...
research
04/15/2023

Neural Approaches to Entity-Centric Information Extraction

Artificial Intelligence (AI) has huge impact on our daily lives with app...
research
05/04/2020

Understanding Scanned Receipts

Tasking machines with understanding receipts can have important applicat...
