Instance exploitation for learning temporary concepts from sparsely labeled drifting data streams

09/20/2020
by Łukasz Korycki, et al.

Continual learning from streaming data sources is becoming increasingly popular as the number of online tools and systems grows. Dynamic, open-ended problems pose new challenges for which traditional batch-based offline algorithms are insufficient in terms of both computational time and predictive performance. One of the most crucial limitations is that we cannot assume access to a finite and complete data set: we must always be ready for new data that may complement our model. This raises the critical problem of providing labels for potentially unbounded streams. In the real world, labeling budgets are very strict, so we will most likely face a scarcity of annotated instances, which are essential in supervised learning. In our work, we emphasize this problem and propose a novel instance exploitation technique. We show that when (i) the data is characterized by temporary non-stationary concepts, and (ii) there are very few labels spread across a long time horizon, it is actually better to risk overfitting and adapt models more aggressively by exploiting the only labeled instances we have, instead of sticking to a standard learning mode and suffering from severe underfitting. We present different strategies and configurations for our methods, as well as an ensemble algorithm that attempts to maintain a sweet spot between risky and normal adaptation. Finally, we conduct an in-depth comparative analysis of our methods against state-of-the-art streaming algorithms relevant to the given problem.
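
The contrast between standard and risky adaptation can be illustrated with a small sketch. This is not the authors' algorithm: the abstract does not specify how instances are exploited, so the example assumes the simplest possible form, in which each scarce labeled instance is replayed several times through an online learner. The class `OnlineLogReg`, the helper `update`, the `repeats` parameter, and the toy data are all hypothetical names and values chosen for illustration.

```python
import math


class OnlineLogReg:
    """Tiny online logistic regression trained with per-instance SGD (illustrative only)."""

    def __init__(self, n_features, lr=0.1):
        self.w = [0.0] * n_features
        self.b = 0.0
        self.lr = lr

    def predict_proba(self, x):
        # Probability of the positive class for feature vector x.
        z = self.b + sum(wi * xi for wi, xi in zip(self.w, x))
        return 1.0 / (1.0 + math.exp(-z))

    def learn_one(self, x, y):
        # One gradient step on the log loss for a single labeled instance.
        err = self.predict_proba(x) - y
        self.w = [wi - self.lr * err * xi for wi, xi in zip(self.w, x)]
        self.b -= self.lr * err


def update(model, x, y, repeats=1):
    # Assumption: "exploitation" is modeled as replaying the same labeled
    # instance `repeats` times; repeats=1 is the standard learning mode.
    for _ in range(repeats):
        model.learn_one(x, y)


# A handful of labeled instances, standing in for a sparsely labeled stream.
labeled = [([1.0, 0.0], 1), ([0.0, 1.0], 0), ([1.0, 0.2], 1)]

standard = OnlineLogReg(n_features=2)
risky = OnlineLogReg(n_features=2)
for x, y in labeled:
    update(standard, x, y, repeats=1)   # normal adaptation
    update(risky, x, y, repeats=10)     # aggressive adaptation

print(standard.predict_proba([1.0, 0.0]), risky.predict_proba([1.0, 0.0]))
```

With so few labels, the model updated once per instance barely moves away from an uninformed prediction, while the replayed model commits much more strongly to the labels it has seen. Whether that extra commitment helps or overfits depends on the stream, which is the trade-off the abstract describes.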

research
12/21/2021

Mining Drifting Data Streams on a Budget: Combining Active Learning with Self-Labeling

Mining data streams poses a number of challenges, including the continuo...
research
06/14/2021

Automated Machine Learning Techniques for Data Streams

Automated machine learning techniques benefited from tremendous research...
research
05/26/2021

Continual Learning for Real-World Autonomous Systems: Algorithms, Challenges and Frameworks

Continual learning is essential for all real-world applications, as froz...
research
04/12/2022

Continual Predictive Learning from Videos

Predictive learning ideally builds the world model of physical processes...
research
12/19/2021

Active Weighted Aging Ensemble for Drifted Data Stream Classification

One of the significant problems of streaming data classification is the ...
research
05/15/2023

Autoencoder-based Anomaly Detection in Streaming Data with Incremental Learning and Concept Drift Adaptation

In our digital universe nowadays, enormous amount of data are produced i...
research
12/12/2022

Learning on non-stationary data with re-weighting

Many real-world learning scenarios face the challenge of slow concept dr...
