Stream-based Active Learning with Verification Latency in Non-stationary Environments

04/14/2022
by Andrea Castellani, et al.

Data stream classification is an important problem in machine learning. Because the data are non-stationary, with an underlying distribution that changes over time (concept drift), the model needs to adapt continuously to new data statistics. Stream-based Active Learning (AL) approaches address this problem by interactively querying a human expert for labels of the most recent samples, within a limited budget. Existing AL strategies assume that labels are immediately available, whereas in a real-world scenario the expert needs time to provide a queried label (verification latency), and by the time the requested labels arrive they may no longer be relevant. In this article, we investigate how finite, time-variable, and unknown verification latency affects AL approaches in the presence of concept drift. We propose PRopagate (PR), a latency-independent utility estimator that also predicts the requested, but not yet known, labels. Furthermore, we propose a drift-dependent dynamic budget strategy, which distributes the labeling budget unevenly over time after a detected drift. We conduct and analyze a thorough experimental evaluation on both synthetic and real-world non-stationary datasets, under different settings of verification latency and budget. We empirically show that the proposed method consistently outperforms the state-of-the-art. Additionally, we demonstrate that allocating the budget variably over time can boost the performance of AL strategies without increasing the overall labeling budget.
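To make the two ideas in the abstract concrete, the following is a minimal sketch, not the authors' method. It assumes a single drift at a known step, a budget expressed as a per-step query probability, and uniformly random delays. The names budget_schedule and simulate_delayed_labels, the boost and window parameters, and the random query rule (a stand-in for a utility estimator such as PR) are all hypothetical.

```python
import heapq
import random


def budget_schedule(n_steps, base_budget, drift_step, boost, window):
    """Per-step query probability (hypothetical scheme, not from the paper).

    The budget is raised to base_budget * boost for `window` steps after the
    drift and lowered everywhere else, so that the mean over the whole
    stream stays equal to base_budget.
    """
    high = min(1.0, base_budget * boost)
    n_high = max(0, min(window, n_steps - drift_step))
    n_low = n_steps - n_high
    if n_low == 0:
        raise ValueError("window must not cover the whole stream")
    low = (base_budget * n_steps - high * n_high) / n_low
    if low < 0:
        raise ValueError("boost/window too large for this budget")
    return [high if drift_step <= t < drift_step + n_high else low
            for t in range(n_steps)]


def simulate_delayed_labels(labels, budgets, max_delay, seed=0):
    """Simulate verification latency for randomly queried samples.

    A label queried at step t arrives at step t + delay, where delay is
    drawn uniformly from 1..max_delay (an assumption). Returns two lists of
    (arrival_step, query_step, label): labels that arrived before the stream
    ended, and labels still pending at the end.
    """
    rng = random.Random(seed)
    pending = []  # min-heap ordered by arrival step
    arrived = []
    for t, y in enumerate(labels):
        # Deliver every label whose delay has elapsed by step t.
        while pending and pending[0][0] <= t:
            arrived.append(heapq.heappop(pending))
        # Placeholder query rule: spend the budget at random.
        if rng.random() < budgets[t]:
            delay = rng.randint(1, max_delay)
            heapq.heappush(pending, (t + delay, t, y))
    return arrived, sorted(pending)


if __name__ == "__main__":
    budgets = budget_schedule(n_steps=1000, base_budget=0.1,
                              drift_step=500, boost=3.0, window=100)
    labels = [t % 2 for t in range(1000)]
    arrived, pending = simulate_delayed_labels(labels, budgets, max_delay=50)
    print(len(arrived), "labels arrived;", len(pending), "still pending")
```

In this sketch the learner would update only on the arrived labels, each of which describes the stream as it was up to max_delay steps earlier; that staleness is the problem the abstract describes. The schedule keeps the total expected number of queries unchanged while concentrating them right after the drift.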

