Distant Learning for Entity Linking with Automatic Noise Detection

05/17/2019
by Phong Le, et al.

Accurate entity linkers have been produced for domains and languages where annotated data (i.e., texts linked to a knowledge base) is available. However, little progress has been made for the settings where no or very limited amounts of labeled data are present (e.g., legal or most scientific domains). In this work, we show how we can learn to link mentions without having any labeled examples, only a knowledge base and a collection of unannotated texts from the corresponding domain. In order to achieve this, we frame the task as a multi-instance learning problem and rely on surface matching to create initial noisy labels. As the learning signal is weak and our surrogate labels are noisy, we introduce a noise detection component in our model: it lets the model detect and disregard examples which are likely to be noisy. Our method, jointly learning to detect noise and link entities, greatly outperforms the surface matching baseline and for a subset of entity categories even approaches the performance of supervised learning.
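As a rough illustration of the surface-matching step described above, the sketch below builds a candidate set (a "bag" in multi-instance learning terms) for a mention by matching its string against entity names in a knowledge base. This is not the authors' implementation: the toy knowledge base, the exact-match rule, and the names `build_name_index` and `candidate_bag` are all assumptions made for this example. The point is only that such bags are noisy labels: a bag may hold several entities, or none, and nothing guarantees the correct entity is among them.

```python
from collections import defaultdict


def build_name_index(kb):
    """Map each lowercased entity name to the set of entity ids that carry it."""
    index = defaultdict(set)
    for entity_id, names in kb.items():
        for name in names:
            index[name.lower()].add(entity_id)
    return index


def candidate_bag(mention, index):
    """Return the sorted entity ids whose name matches the mention string.

    The match is a case-insensitive exact match, a deliberately crude
    stand-in for surface matching. An empty result means the heuristic
    found no candidate for this mention.
    """
    return sorted(index.get(mention.lower(), set()))


# Hypothetical toy knowledge base: entity id -> known names.
kb = {
    "E1": ["Paris", "City of Paris"],
    "E2": ["Paris Hilton", "Paris"],
    "E3": ["London"],
}

index = build_name_index(kb)
ambiguous = candidate_bag("PARIS", index)   # two entities share this name
unmatched = candidate_bag("Berlin", index)  # no entity carries this name
```

In this sketch the mention "PARIS" yields a bag with two candidates, so the label is ambiguous, while "Berlin" yields an empty bag. A model trained on such bags needs some way to discount examples whose bag is likely wrong, which is the role the abstract assigns to the noise detection component.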


research
06/15/2023

DaMuEL: A Large Multilingual Dataset for Entity Linking

We present DaMuEL, a large Multilingual Dataset for Entity Linking conta...
research
07/12/2022

Effective Few-Shot Named Entity Linking by Meta-Learning

Entity linking aims to link ambiguous mentions to their corresponding en...
research
05/04/2019

Learning to Denoise Distantly-Labeled Data for Entity Typing

Distantly-labeled data can be used to scale up training of statistical m...
research
08/07/2017

Corpus-level Fine-grained Entity Typing

This paper addresses the problem of corpus-level entity typing, i.e., in...
research
01/14/2021

Better Together – An Ensemble Learner for Combining the Results of Ready-made Entity Linking Systems

Entity linking (EL) is the task of automatically identifying entity ment...
research
05/05/2023

Uncertainty-Aware Bootstrap Learning for Joint Extraction on Distantly-Supervised Data

Jointly extracting entity pairs and their relations is challenging when ...
research
06/05/2023

CoSiNES: Contrastive Siamese Network for Entity Standardization

Entity standardization maps noisy mentions from free-form text to standa...
