End-to-End Entity Resolution for Big Data: A Survey

05/15/2019
by Vassilis Christophides, et al.

Entity Resolution (ER) is one of the most important tasks for improving data quality and the reliability of data analytics results. ER aims to identify different descriptions that refer to the same real-world entity, and it remains a challenging problem. Previous works have studied specific aspects of ER, mostly in traditional settings. In this survey, we provide for the first time an end-to-end view of modern ER workflows and of the novel aspects of entity indexing and matching methods that cope with more than one of the Big Data characteristics simultaneously. We present the basic concepts, processing steps, and execution strategies proposed by different communities (database, Semantic Web, and machine learning) to cope with the loose structuredness, extreme diversity, high speed, and large scale of the entity descriptions used by real-world applications. Finally, we provide a synthetic discussion of the existing approaches and conclude with a detailed presentation of open research directions.
