An Adaptive Method for Weak Supervision with Drifting Data

06/02/2023
by Alessio Mazzetto, et al.

We introduce an adaptive method with formal quality guarantees for weak supervision in a non-stationary setting. Our goal is to infer the unknown labels of a sequence of data points using weak supervision sources that provide independent noisy signals of the correct classification for each data point. This setting includes crowdsourcing and programmatic weak supervision. We focus on the non-stationary case, where the accuracy of the weak supervision sources can drift over time, e.g., because of changes in the underlying data distribution. Because of this drift, older data can provide misleading information when inferring the label of the current data point. Previous work relied on a priori assumptions about the magnitude of the drift to decide how much past data to use. In contrast, our algorithm requires no assumptions on the drift and adapts based on the input. In particular, at each step, our algorithm guarantees an estimate of the current accuracies of the weak supervision sources over a window of past observations that minimizes a trade-off between the error due to the variance of the estimate and the error due to the drift. Experiments on synthetic and real-world labelers show that our approach adapts to the drift. Unlike fixed-window-size strategies, it dynamically chooses a window size that lets it consistently maintain good performance.
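
The variance-versus-drift trade-off described above can be illustrated with a small sketch. This is not the authors' algorithm: it is a generic doubling heuristic, and the function names, the doubling schedule, and the noise band `threshold_scale * sqrt(1 / r)` are all assumptions chosen for illustration. The input `xs` is assumed to be a stream of 0/1 observations (for example, whether two weak sources agreed on each data point), ordered from oldest to newest.

```python
import math


def window_mean(xs, r):
    """Mean of the r most recent observations in xs."""
    tail = xs[-r:]
    return sum(tail) / len(tail)


def choose_window(xs, threshold_scale=1.0):
    """Pick a window size by doubling (illustrative heuristic, not the paper's rule).

    The window keeps doubling while the estimate on the doubled window
    stays within a noise band of the estimate on the current window.
    A larger gap is treated as evidence of drift, so growth stops there.
    """
    r = 1
    n = len(xs)
    while 2 * r <= n:
        gap = abs(window_mean(xs, 2 * r) - window_mean(xs, r))
        # Assumed noise band: shrinks like 1/sqrt(r) as the window grows.
        if gap > threshold_scale * math.sqrt(1.0 / r):
            break
        r *= 2
    return r


# Stationary stream: nothing contradicts older data, so the window spans everything.
stationary = [1] * 64
# Abrupt drift: the last 16 observations come from a different regime.
drifted = [0] * 64 + [1] * 16

print(choose_window(stationary))  # 64
print(choose_window(drifted))     # 16
```

On the stationary stream the window grows to cover all 64 observations, which minimizes variance. On the drifted stream it stops at 16, the length of the most recent regime, because doubling further would mix in stale data.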

research
05/03/2023

An Adaptive Algorithm for Learning with Unknown Distribution Drift

We develop and analyze a general technique for learning with an unknown ...
research
04/14/2022

Stream-based Active Learning with Verification Latency in Non-stationary Environments

Data stream classification is an important problem in the field of machi...
research
02/05/2023

Nonparametric Density Estimation under Distribution Drift

We study nonparametric density estimation in non-stationary drift settin...
research
05/11/2022

Weak Supervision with Incremental Source Accuracy Estimation

Motivated by the desire to generate labels for real-time data we develop...
research
02/27/2020

Fast and Three-rious: Speeding Up Weak Supervision with Triplet Methods

Weak supervision is a popular method for building machine learning model...
research
10/02/2019

Concept Drift Detection and Adaptation with Weak Supervision on Streaming Unlabeled Data

Concept drift in learning and classification occurs when the statistical...
research
01/09/2022

Weak Supervision for Affordable Modeling of Electrocardiogram Data

Analysing electrocardiograms (ECGs) is an inexpensive and non-invasive, ...
