CollaborER: A Self-supervised Entity Resolution Framework Using Multi-features Collaboration

08/18/2021
by   Congcong Ge, et al.

Entity Resolution (ER) aims to identify whether two tuples refer to the same real-world entity and is well known to be labor-intensive. It is a prerequisite to anomaly detection, as comparing the attribute values of two matched tuples from two different datasets provides one effective way to detect anomalies. Existing ER approaches cannot achieve stable performance, either because they discover too few features or because their inherent characteristics are error-prone. In this paper, we present CollaborER, a self-supervised entity resolution framework via multi-features collaboration. It is capable of (i) obtaining reliable ER results with zero human annotations and (ii) discovering adequate tuple features in a fault-tolerant manner. CollaborER consists of two phases: automatic label generation (ALG) and collaborative ER training (CERT). In the first phase, ALG generates a set of positive tuple pairs and a set of negative tuple pairs. ALG guarantees the high quality of the generated tuple pairs and hence ensures the training quality of the subsequent CERT. In the second phase, CERT learns the matching signals by discovering graph features and sentence features of tuples collaboratively. Extensive experimental results over eight real-world ER benchmarks show that CollaborER outperforms all existing unsupervised ER approaches and is comparable or even superior to the state-of-the-art supervised ER methods.
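
To make the first phase more concrete, the sketch below shows one simple way that positive and negative tuple pairs could be generated without human annotations. It is an illustration only, not the ALG procedure from the paper: the abstract does not say how ALG scores or selects pairs, so the token-level Jaccard similarity, the mutual-best-match rule, the function name generate_labels, and the thresholds pos_threshold and neg_threshold are all assumptions made for this example.

```python
from typing import Dict, List, Set, Tuple

Record = Dict[str, str]


def tokens(record: Record) -> Set[str]:
    """Serialize a tuple's attribute values into a lowercase token set."""
    return set(" ".join(record.values()).lower().split())


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity of two token sets (0.0 when both are empty)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def generate_labels(
    left: List[Record],
    right: List[Record],
    pos_threshold: float = 0.6,  # assumed value, not from the paper
    neg_threshold: float = 0.2,  # assumed value, not from the paper
) -> Tuple[List[Tuple[int, int]], List[Tuple[int, int]]]:
    """Return (positives, negatives) as lists of (left_index, right_index)."""
    if not left or not right:
        return [], []

    # Pairwise similarity between every left tuple and every right tuple.
    sims = [[jaccard(tokens(l), tokens(r)) for r in right] for l in left]

    # Best match in each direction.
    best_for_left = [
        max(range(len(right)), key=lambda j: row[j]) for row in sims
    ]
    best_for_right = [
        max(range(len(left)), key=lambda i: sims[i][j])
        for j in range(len(right))
    ]

    # Pseudo-positives: mutual best matches that are also similar enough.
    positives = [
        (i, j)
        for i, j in enumerate(best_for_left)
        if best_for_right[j] == i and sims[i][j] >= pos_threshold
    ]

    # Pseudo-negatives: pairs that are clearly dissimilar.
    negatives = [
        (i, j)
        for i in range(len(left))
        for j in range(len(right))
        if sims[i][j] <= neg_threshold
    ]
    return positives, negatives


# Hypothetical toy data, invented for this example.
left = [
    {"title": "apple iphone 12 64gb", "brand": "apple"},
    {"title": "samsung galaxy s21", "brand": "samsung"},
]
right = [
    {"title": "samsung galaxy s21 phone", "brand": "samsung"},
    {"title": "iphone 12 64gb", "brand": "apple"},
]

positives, negatives = generate_labels(left, right)
print(positives)  # [(0, 1), (1, 0)]
print(negatives)  # [(0, 0), (1, 1)]
```

Requiring a match to be the best candidate in both directions trades recall for precision, which suits a setting where the generated pairs serve as training labels for a later phase. A framework such as the one described would presumably rely on learned representations rather than token overlap.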


