DeepER -- Deep Entity Resolution

10/02/2017
by Muhammad Ebraheem, et al.

Entity Resolution (ER) is a fundamental problem with many applications. Machine learning (ML)-based and rule-based approaches have been studied for decades, with much of the effort going into which features/attributes to select, which similarity functions to employ, and which blocking function to use, all of which complicates deploying ER as a turn-key system. In this paper, we present DeepER, a turn-key ER system powered by deep learning (DL) techniques. The central idea is that distributed representations and representation learning from DL can reduce the human effort needed to tune existing ER systems. DeepER makes several notable contributions: encoding a tuple as a distributed representation of its attribute values, building classifiers on these representations, a semantics-aware blocking scheme based on locality-sensitive hashing (LSH), and learning and tuning the distributed representations for ER. We evaluate our algorithms on multiple benchmark datasets and achieve competitive results while requiring minimal interaction with experts.
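
To make the two core ideas in the abstract concrete, here is a minimal Python sketch of (1) turning a tuple into a single vector and (2) grouping tuples into candidate blocks with LSH. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not say how token vectors are composed, so averaging is assumed here; random-hyperplane hashing is one common LSH scheme and may differ from the one used in the paper; and `toy_embedding` is a hypothetical stand-in for real pre-trained word embeddings.

```python
import random
from collections import defaultdict


def toy_embedding(token, dim=16):
    """Hypothetical stand-in for a pre-trained word embedding lookup.

    Returns a deterministic pseudo-random vector seeded by the token, so the
    same token always maps to the same vector. It carries no real semantics.
    """
    rng = random.Random(token)
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]


def embed_tuple(record, dim=16):
    """Encode a record (dict of attribute -> value) as one vector.

    Assumption: the tuple vector is the average of its token vectors.
    """
    tokens = [t for value in record.values() for t in str(value).lower().split()]
    if not tokens:
        return [0.0] * dim
    vectors = [toy_embedding(t, dim) for t in tokens]
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def make_hyperplanes(num_bits, dim, seed=0):
    """Draw num_bits random hyperplanes (one normal vector each)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]


def lsh_signature(vector, hyperplanes):
    """One bit per hyperplane: which side of the plane the vector falls on."""
    return tuple(
        int(sum(v * h for v, h in zip(vector, plane)) >= 0.0)
        for plane in hyperplanes
    )


def block(records, num_bits=4, dim=16):
    """Group record ids by LSH signature; each bucket is a candidate block."""
    planes = make_hyperplanes(num_bits, dim)
    buckets = defaultdict(list)
    for record_id, record in records.items():
        signature = lsh_signature(embed_tuple(record, dim), planes)
        buckets[signature].append(record_id)
    return dict(buckets)


if __name__ == "__main__":
    records = {
        "a": {"name": "Apple iPhone 7", "brand": "Apple"},
        "b": {"name": "Apple iPhone 7", "brand": "Apple"},
        "c": {"name": "Samsung Galaxy S8", "brand": "Samsung"},
    }
    for signature, ids in block(records).items():
        print(signature, ids)
```

Only records that share a bucket would be passed on to the pairwise classifier, which is what keeps the number of comparisons manageable. With real embeddings, tuples that are semantically close tend to fall on the same side of most hyperplanes; the toy vectors here only guarantee that identical token sequences collide.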

research
12/07/2019

AutoBlock: A Hands-off Blocking Framework for Entity Matching

Entity matching seeks to identify data records over one or multiple data...
research
09/15/2020

CorDEL: A Contrastive Deep Learning Approach for Entity Linkage

Entity linkage (EL) is a critical problem in data cleaning and integrati...
research
02/07/2016

ERBlox: Combining Matching Dependencies with Machine Learning for Entity Resolution

Entity resolution (ER), an important and common data cleaning problem, i...
research
07/08/2022

Sudowoodo: Contrastive Self-supervised Learning for Multi-purpose Data Integration and Preparation

Machine learning (ML) is playing an increasingly important role in data ...
research
10/07/2022

Key Information Extraction in Purchase Documents using Deep Learning and Rule-based Corrections

Deep Learning (DL) is dominating the fields of Natural Language Processi...
research
04/24/2023

Pre-trained Embeddings for Entity Resolution: An Experimental Analysis [Experiment, Analysis Benchmark]

Many recent works on Entity Resolution (ER) leverage Deep Learning techn...
research
10/21/2017

Superposed Episodic and Semantic Memory via Sparse Distributed Representation

The abilities to perceive, learn, and use generalities, similarities, cl...
