Pangloss: Fast Entity Linking in Noisy Text Environments

07/16/2018
by Michael Conover, et al.

Entity linking is the task of mapping potentially ambiguous terms in text to their constituent entities in a knowledge base like Wikipedia. This is useful for organizing content, extracting structured data from textual documents, and in machine learning relevance applications like semantic search, knowledge graph construction, and question answering. Traditionally, this work has focused on well-formed text, like news articles, but in common real-world datasets such as messaging, resumes, or short-form social media, non-grammatical, loosely structured text adds a new dimension to this problem. This paper presents Pangloss, a production system for entity disambiguation on noisy text. Pangloss combines a probabilistic linear-time key phrase identification algorithm with a semantic similarity engine based on context-dependent document embeddings to achieve better than state-of-the-art results (>5 ... systems). In addition, Pangloss leverages a local embedded database with a tiered architecture to house its statistics and metadata, which allows rapid disambiguation in streaming contexts and on-device disambiguation in low-memory environments such as mobile phones.
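
To make the two stages named in the abstract concrete (spotting candidate phrases, then choosing an entity by similarity to the surrounding context), here is a minimal toy sketch. It is not the Pangloss implementation: the knowledge base `KB`, the helper names, and the entity identifiers are all hypothetical, a dictionary-based greedy longest-match scan stands in for the paper's probabilistic key phrase identification, and bag-of-words cosine similarity stands in for its context-dependent document embeddings.

```python
import math
from collections import Counter

# Hypothetical toy knowledge base: surface form -> {entity id: description}.
# A real system would back this with a full knowledge base such as Wikipedia.
KB = {
    "python": {
        "Python_(programming_language)": "programming language code software developer",
        "Python_(snake)": "snake reptile animal species",
    },
    "java": {
        "Java_(programming_language)": "programming language code software virtual machine",
        "Java_(island)": "island indonesia volcano jakarta",
    },
    "machine learning": {
        "Machine_learning": "algorithms data models training statistics",
    },
}
MAX_PHRASE_LEN = 3  # assumed upper bound on phrase length, in tokens


def tokenize(text):
    """Lowercase whitespace tokenizer (no punctuation handling)."""
    return text.lower().split()


def spot_phrases(tokens):
    """Greedy longest-match scan; returns (start, end, surface) spans.

    Each position tries at most MAX_PHRASE_LEN lookups, so the scan is
    linear in the number of tokens for a fixed MAX_PHRASE_LEN.
    """
    spans, i = [], 0
    while i < len(tokens):
        for n in range(min(MAX_PHRASE_LEN, len(tokens) - i), 0, -1):
            surface = " ".join(tokens[i:i + n])
            if surface in KB:
                spans.append((i, i + n, surface))
                i += n
                break
        else:
            i += 1  # no known phrase starts here
    return spans


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def link(text):
    """Map each spotted surface form to its most context-similar entity."""
    tokens = tokenize(text)
    links = {}
    for start, end, surface in spot_phrases(tokens):
        # Context is every token outside the mention itself.
        context = Counter(tokens[:start] + tokens[end:])
        candidates = KB[surface]
        links[surface] = max(
            candidates,
            key=lambda e: cosine(context, Counter(candidates[e].split())),
        )
    return links


print(link("wrote python code for a software job"))
print(link("saw a python snake at the zoo"))
```

The same ambiguous term resolves differently depending on its neighbors, which is the core idea of context-based disambiguation. Swapping the count vectors for learned embeddings would change only `cosine`'s inputs, not the overall flow.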

research
04/22/2020

ParsEL 1.0: Unsupervised Entity Linking in Persian Social Media Texts

In recent years, social media data has exponentially increased, which ca...
research
02/27/2022

Improving Candidate Retrieval with Entity Profile Generation for Wikidata Entity Linking

Entity linking (EL) is the task of linking entity mentions in a document...
research
08/23/2018

Mapping Text to Knowledge Graph Entities using Multi-Sense LSTMs

This paper addresses the problem of mapping natural language text to kno...
research
05/09/2022

BLINK with Elasticsearch for Efficient Entity Linking in Business Conversations

An Entity Linking system aligns the textual mentions of entities in a te...
research
12/05/2018

A Knowledge Graph Based Solution for Entity Discovery and Linking in Open-Domain Questions

Named entity discovery and linking is the fundamental and core component...
research
07/14/2019

TWEETQA: A Social Media Focused Question Answering Dataset

With social media becoming increasingly popular on which lots of news a...
research
01/07/2021

Read, Retrospect, Select: An MRC Framework to Short Text Entity Linking

Entity linking (EL) for the rapidly growing short text (e.g. search quer...
