Improving Machine-based Entity Resolution with Limited Human Effort: A Risk Perspective

05/31/2018
by Zhaoqiang Chen, et al.

Pure machine-based solutions usually struggle in challenging classification tasks such as entity resolution (ER). To alleviate this problem, a recent trend is to involve humans in the resolution process, most notably through crowdsourcing. However, it remains very challenging to effectively improve machine-based entity resolution with limited human effort. In this paper, we investigate the problem of human and machine cooperation for ER from a risk perspective. We propose to select the machine-labeled instances at high risk of being mislabeled for manual verification. For this task, we present a risk model that takes into consideration the human-labeled instances as well as the output of machine resolution. Finally, we evaluate the performance of the proposed risk model on real data. Our experiments demonstrate that it can identify mislabeled instances with considerably higher accuracy than the existing alternatives. Given the same human cost budget, it also achieves better resolution quality than the state-of-the-art approach based on active learning.
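
To make the selection step concrete, the following is a minimal sketch of budget-limited verification, not the risk model proposed in the paper. It assumes each candidate pair comes with a machine-estimated match probability and uses a simple stand-in risk score, `min(p, 1 - p)`, which is the estimated chance that a label assigned at a 0.5 threshold is wrong. The function names, the threshold, and the `oracle` callback (standing in for a human verifier) are all hypothetical.

```python
from typing import Callable, List


def select_for_verification(match_probs: List[float], budget: int) -> List[int]:
    """Return indices of the `budget` pairs whose machine label looks riskiest."""
    # Stand-in risk score (an assumption, not the paper's model): the estimated
    # probability that a label assigned at a 0.5 threshold is wrong.
    risks = [min(p, 1.0 - p) for p in match_probs]
    # Rank pairs from highest to lowest risk; sorted() is stable, so ties keep input order.
    ranked = sorted(range(len(match_probs)), key=lambda i: risks[i], reverse=True)
    return ranked[:max(budget, 0)]


def resolve(match_probs: List[float], budget: int,
            oracle: Callable[[int], bool]) -> List[bool]:
    """Label every pair by machine, then overwrite the riskiest ones with human labels."""
    # Machine labels: a pair is declared a match when its probability is at least 0.5.
    labels = [p >= 0.5 for p in match_probs]
    # Spend the human budget only on the pairs selected as high risk.
    for i in select_for_verification(match_probs, budget):
        labels[i] = oracle(i)
    return labels


if __name__ == "__main__":
    probs = [0.95, 0.55, 0.10, 0.48, 0.80]
    truth = [True, False, False, True, True]  # hypothetical ground truth
    print(select_for_verification(probs, budget=2))
    print(resolve(probs, budget=2, oracle=lambda i: truth[i]))
```

In this toy setting the two pairs closest to the decision threshold are sent for verification. The paper's risk model differs in that it also takes the human-labeled instances into account rather than relying on the machine output alone.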
