Deep Transfer Learning for Multi-source Entity Linkage via Domain Adaptation

10/27/2021
by   Di Jin, et al.

Multi-source entity linkage focuses on integrating knowledge from multiple sources by linking the records that represent the same real-world entity. This is critical in high-impact applications such as data cleaning and user stitching. State-of-the-art entity linkage pipelines mainly depend on supervised learning, which requires large amounts of training data. However, collecting well-labeled training data becomes expensive when data from many sources arrives incrementally over time. Moreover, the trained models can easily overfit to specific data sources and thus fail to generalize to new sources, because data and label distributions differ significantly across sources. To address these challenges, we present AdaMEL, a deep transfer learning framework that learns generic high-level knowledge to perform multi-source entity linkage. AdaMEL models the importance of each attribute used to match entities through an attribute-level self-attention mechanism, and leverages the massive unlabeled data from new data sources through domain adaptation to make the model generic and data-source agnostic. In addition, AdaMEL can incorporate an additional set of labeled data to more accurately integrate data sources with different attribute importance. Extensive experiments show that our framework achieves state-of-the-art results, with an 8.21% improvement on average over methods based on supervised learning. It is also more stable when handling different sets of data sources, and it requires less runtime.
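To make the idea of attribute importance concrete, the following is a minimal sketch, not AdaMEL's actual architecture. It assumes a fixed, hand-set logit per attribute (in the paper these weights are learned by self-attention and adapted to new sources), and a simple token-overlap similarity per attribute. The names `softmax`, `token_jaccard`, `match_score`, and `attr_logits` are hypothetical and chosen only for illustration.

```python
import math


def softmax(scores):
    """Turn raw attribute logits into weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def token_jaccard(a, b):
    """Token-level Jaccard similarity between two attribute values."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 0.0  # assumption: two missing values carry no evidence
    return len(ta & tb) / len(ta | tb)


def match_score(rec_a, rec_b, attr_logits):
    """Importance-weighted similarity of two records.

    attr_logits maps attribute name -> importance logit. Here the logits
    are given by hand; a learned model would produce them instead.
    """
    attrs = sorted(attr_logits)
    weights = softmax([attr_logits[k] for k in attrs])
    sims = [token_jaccard(rec_a.get(k, ""), rec_b.get(k, "")) for k in attrs]
    score = sum(w * s for w, s in zip(weights, sims))
    return score, dict(zip(attrs, weights))


# Two records from different sources describing the same song.
rec_a = {"title": "Hey Jude", "artist": "The Beatles", "year": "1968"}
rec_b = {"title": "Hey Jude", "artist": "Beatles", "year": ""}
attr_logits = {"title": 2.0, "artist": 1.0, "year": 0.0}

score, weights = match_score(rec_a, rec_b, attr_logits)
print(weights)
print(score)
```

Because the title carries the largest weight here, the missing year in the second record lowers the score only slightly. A source where the year is reliably populated and discriminative would call for different weights, which is the kind of shift in attribute importance that the abstract says domain adaptation is meant to handle.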


