Stream-based active learning with linear models

07/20/2022
by Davide Cacciarelli, et al.

The proliferation of automated data collection schemes and advances in sensor technology are increasing the amount of data we can monitor in real time. However, because annotation is costly and quality inspections are time-consuming, data is often available only in unlabeled form. This is fostering the use of active learning for developing soft sensors and predictive models. In production, instead of performing random inspections to obtain product information, labels are collected by evaluating the information content of the unlabeled data. Several query strategy frameworks for regression have been proposed in the literature, but most of the focus has been on the static pool-based scenario. In this work, we propose a new strategy for the stream-based scenario, where instances are offered to the learner sequentially and the learner must decide immediately whether to perform the quality check to obtain the label or to discard the instance. The approach is inspired by optimal experimental design theory, and the iterative nature of the decision-making process is handled by setting a threshold on the informativeness of the unlabeled data points. The proposed approach is evaluated using numerical simulations and the Tennessee Eastman Process simulator. The results confirm that selecting the examples suggested by the proposed algorithm yields a faster reduction in prediction error.
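To make the threshold idea concrete, the following is a minimal sketch of stream-based selection for a linear model. It is not the authors' algorithm: it assumes, purely for illustration, that informativeness is scored as x^T (X^T X)^{-1} x (a quantity from optimal design that is proportional to the prediction variance at x), that the threshold is fixed in advance, and that a small ridge term keeps the information matrix invertible. The names `stream_active_learning`, `oracle`, `threshold`, and `ridge` are hypothetical.

```python
import numpy as np


def stream_active_learning(stream_X, oracle, X_init, y_init, threshold, ridge=1e-6):
    """Query a label only when an instance's informativeness exceeds `threshold`.

    Assumption (not from the paper): informativeness is x^T (X^T X)^{-1} x.
    `oracle` is a hypothetical callable standing in for the quality check.
    Returns the fitted coefficients and the number of labels requested.
    """
    X = np.asarray(X_init, dtype=float)
    y = np.asarray(y_init, dtype=float)
    d = X.shape[1]

    # Inverse of the (slightly regularized) information matrix X^T X.
    A_inv = np.linalg.inv(X.T @ X + ridge * np.eye(d))
    b = X.T @ y  # running X^T y
    n_queried = 0

    for x in stream_X:
        x = np.asarray(x, dtype=float)
        # Score the incoming instance before seeing its label.
        score = float(x @ A_inv @ x)
        if score > threshold:
            label = oracle(x)  # perform the (costly) inspection
            # Sherman-Morrison rank-one update of the inverse after adding x.
            Ax = A_inv @ x
            A_inv = A_inv - np.outer(Ax, Ax) / (1.0 + score)
            b = b + label * x
            n_queried += 1
        # Otherwise the instance is discarded and never revisited.

    beta = A_inv @ b  # least-squares coefficients on all labeled data
    return beta, n_queried


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    beta_true = np.array([1.0, -2.0, 0.5])  # synthetic ground truth
    X0 = rng.normal(size=(5, 3))
    y0 = X0 @ beta_true
    stream = rng.normal(size=(200, 3))
    beta_hat, n = stream_active_learning(
        stream, lambda x: float(x @ beta_true), X0, y0, threshold=0.5
    )
    print("labels requested:", n)
    print("estimated coefficients:", beta_hat)
```

Because each accepted point shrinks (X^T X)^{-1}, fewer later instances clear a fixed threshold, so the labeling rate falls as the model becomes better determined. How the threshold should be chosen is left open in this sketch.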


research
02/17/2023

A survey on online active learning

Online active learning is a paradigm in machine learning that aims to se...
research
04/16/2021

Data Shapley Valuation for Efficient Batch Active Learning

Annotating the right set of data amongst all available data points is a ...
research
12/26/2022

Online Active Learning for Soft Sensor Development using Semi-Supervised Autoencoders

Data-driven soft sensors are extensively used in industrial and chemical...
research
06/05/2021

Low Budget Active Learning via Wasserstein Distance: An Integer Programming Approach

Given restrictions on the availability of data, active learning is the p...
research
07/31/2022

Deep Active Learning with Budget Annotation

Digital data collected over the decades and data currently being produce...
research
10/18/2020

Exploiting Context for Robustness to Label Noise in Active Learning

Several works in computer vision have demonstrated the effectiveness of ...
research
10/19/2020

Online Active Model Selection for Pre-trained Classifiers

Given k pre-trained classifiers and a stream of unlabeled data examples,...
