Enabling Quality Control for Entity Resolution: A Human and Machine Cooperative Framework

09/30/2017
by Zhaoqiang Chen, et al.

Even though many machine algorithms have been proposed for entity resolution (ER), finding a solution with quality guarantees remains very challenging. In this paper, we propose a novel HUman and Machine cOoperative (HUMO) framework for ER, which divides an ER workload between machine and human. HUMO enables a quality-control mechanism that can flexibly enforce both precision and recall levels. We formulate the optimization problem of HUMO, minimizing human cost subject to a given quality requirement, and present three optimization approaches: a conservative baseline based purely on the monotonicity assumption of precision, a more aggressive approach based on sampling, and a hybrid approach that combines the strengths of the other two. Finally, extensive experiments on real and synthetic datasets demonstrate that HUMO achieves high-quality results with a reasonable return on investment (ROI) in terms of human cost, and that it performs considerably better than the state-of-the-art alternative in quality control.
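
The optimization problem stated above (minimize human cost subject to precision and recall requirements) can be illustrated with a small sketch. It is not one of the three approaches from the paper. It assumes, hypothetically, that candidate pairs are sorted by a similarity score and split into three zones (machine-labelled unmatch, human-labelled, machine-labelled match), that human labels are always correct, and that ground truth is available so an exhaustive search is possible. The function names, thresholds, and toy data are all illustrative.

```python
from typing import List, Tuple

Pair = Tuple[float, bool]  # (similarity score, is_true_match); illustrative type


def evaluate(pairs: List[Pair], lo: int, hi: int) -> Tuple[float, float]:
    """Precision and recall for one split of pairs sorted by ascending similarity.

    Assumed zones: [0, lo) machine-labelled unmatch, [lo, hi) human-labelled,
    [hi, n) machine-labelled match. Human labels are assumed to be correct.
    """
    tp = sum(1 for _, m in pairs[lo:] if m)      # matches found by human or machine
    fp = sum(1 for _, m in pairs[hi:] if not m)  # non-matches the machine calls match
    fn = sum(1 for _, m in pairs[:lo] if m)      # matches the machine calls unmatch
    precision = tp / (tp + fp) if tp + fp else 1.0
    recall = tp / (tp + fn) if tp + fn else 1.0
    return precision, recall


def min_human_zone(pairs: List[Pair], alpha: float, beta: float) -> Tuple[int, int]:
    """Exhaustively find the smallest human zone [lo, hi) meeting both requirements.

    This is an oracle search that uses ground truth; a real system would have
    to estimate precision and recall instead.
    """
    n = len(pairs)
    best = (0, n)  # sending every pair to humans is always feasible
    for lo in range(n + 1):
        for hi in range(lo, n + 1):
            precision, recall = evaluate(pairs, lo, hi)
            if precision >= alpha and recall >= beta and hi - lo < best[1] - best[0]:
                best = (lo, hi)
    return best


# Toy workload, already sorted by similarity.
pairs = [(0.1, False), (0.2, False), (0.3, True), (0.4, False), (0.5, True),
         (0.6, False), (0.7, True), (0.8, True), (0.9, True)]
lo, hi = min_human_zone(pairs, alpha=0.9, beta=0.9)
human_cost = hi - lo  # number of pairs a human must label
```

On this toy workload the search sends only the ambiguous middle of the similarity range to humans, which is the trade-off the abstract describes: stricter quality requirements widen the human zone, looser ones shrink it.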

research
09/30/2017

Enabling Quality Control for Entity Resolution: A Human and Machine Cooperation Framework

Even though many machine algorithms have been proposed for entity resolu...
research
03/15/2018

r-HUMO: A Risk-Aware Human-Machine Cooperation Framework for Entity Resolution with Quality Guarantees

Even though many approaches have been proposed for entity resolution (ER...
research
03/15/2018

i-HUMO: An Interactive Human and Machine Cooperation Framework for Entity Resolution with Quality Guarantees

Even though many approaches have been proposed for entity resolution (ER...
research
05/31/2018

Improving the Results of Machine-based Entity Resolution with Limited Human Effort: A Risk Perspective

Pure machine-based solutions usually struggle in challenging classificat...
research
12/06/2019

Towards Interpretable and Learnable Risk Analysis for Entity Resolution

Machine-learning-based entity resolution has been widely studied. Howeve...
research
06/06/2022

On Efficient Approximate Queries over Machine Learning Models

The question of answering queries over ML predictions has been gaining a...
research
05/15/2019

End-to-End Entity Resolution for Big Data: A Survey

One of the most important tasks for improving data quality and the relia...
