Combining Data-driven Supervision with Human-in-the-loop Feedback for Entity Resolution

11/20/2021
by Wenpeng Yin, et al.

The distribution gap between training datasets and the data encountered in production is widely acknowledged. Training datasets are often constructed over a fixed period of time, with the data to be labeled carefully curated. As a result, they may not contain all the variations of data that arise in real-world production environments. When we were tasked with building an entity resolution system (a model that identifies and consolidates data points that represent the same person), our first model exhibited a clear training-production performance gap. In this case study, we discuss our human-in-the-loop, data-centric solution for closing that gap. We conclude with takeaways that apply to data-centric learning at large.
