A Clustering-based Framework for Classifying Data Streams

06/22/2021
by   Xuyang Yan, et al.

The non-stationary nature of data streams strongly challenges traditional machine learning techniques. Although some solutions have been proposed to extend traditional machine learning techniques to data streams, these approaches either require an initial label set or rely on specialized design parameters. The overlap among classes and the labeling of data streams are further major challenges for classifying data streams. In this paper, we propose a clustering-based data stream classification framework that handles non-stationary data streams without an initial label set. A density-based stream clustering procedure with a dynamic threshold captures novel concepts, and an effective active label querying strategy continuously learns the new concepts from the data stream. The sub-cluster structure of each cluster is explored to handle the overlap among classes. Experimental results and quantitative comparison studies show that the proposed method performs statistically better than, or comparably to, existing methods.
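The general idea in the abstract (cluster the stream online, treat a point that falls outside every cluster as a possible new concept, and ask for a label only then) can be illustrated with a toy sketch. This is not the authors' algorithm: it swaps the density-based clustering for a simple nearest-centroid rule, it omits the sub-cluster handling of class overlap, and every name in it (`StreamClusterClassifier`, `oracle`, `scale`, `min_threshold`) is a hypothetical chosen for illustration. The threshold rule is likewise an assumption, not the one used in the paper.

```python
import math


class StreamClusterClassifier:
    """Toy stand-in: nearest-centroid clusters, a data-driven distance
    threshold, and a label query whenever a new cluster is opened."""

    def __init__(self, oracle, scale=2.0, min_threshold=1.0):
        self.oracle = oracle              # hypothetical callable: point -> label
        self.scale = scale                # assumed multiplier on the mean distance
        self.min_threshold = min_threshold
        self.clusters = []                # dicts with centroid, count, label
        self.mean_dist = 0.0              # running mean of assignment distances
        self.n_assigned = 0
        self.queries = 0                  # number of labels requested so far

    def threshold(self):
        # Assumed rule: the threshold grows with the typical assignment
        # distance seen so far, but never drops below a fixed floor.
        return max(self.min_threshold, self.scale * self.mean_dist)

    def process(self, x):
        """Consume one point and return its predicted label."""
        best, best_dist = None, math.inf
        for cluster in self.clusters:
            d = math.dist(x, cluster["centroid"])
            if d < best_dist:
                best, best_dist = cluster, d

        if best is None or best_dist > self.threshold():
            # Point fits no existing cluster: open one and query its label.
            label = self.oracle(x)
            self.queries += 1
            self.clusters.append({"centroid": list(x), "count": 1, "label": label})
            return label

        # Point fits an existing cluster: update the centroid incrementally.
        best["count"] += 1
        n = best["count"]
        best["centroid"] = [c + (xi - c) / n for c, xi in zip(best["centroid"], x)]
        self.n_assigned += 1
        self.mean_dist += (best_dist - self.mean_dist) / self.n_assigned
        return best["label"]


def run_stream(points, oracle):
    """Feed a sequence of points through a fresh model."""
    model = StreamClusterClassifier(oracle)
    predictions = [model.process(p) for p in points]
    return predictions, model
```

In this sketch the number of label queries equals the number of clusters opened, which is the property that removes the need for an initial label set. A real implementation would also need to forget stale clusters and to query labels inside clusters that mix several classes.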


research
07/16/2020

Data Stream Clustering: A Review

Number of connected devices is steadily increasing and these devices con...
research
08/15/2019

Double-Coupling Learning for Multi-Task Data Stream Classification

Data stream classification methods demonstrate promising performance on ...
research
10/06/2016

A Robust Framework for Classifying Evolving Document Streams in an Expert-Machine-Crowd Setting

An emerging challenge in the online classification of social media data ...
research
02/08/2023

Combining self-labeling and demand based active learning for non-stationary data streams

Learning from non-stationary data streams is a research direction that g...
research
08/07/2018

Deep Stacked Stochastic Configuration Networks for Non-Stationary Data Streams

The concept of stochastic configuration networks (SCNs) offers a solid f...
research
04/21/2015

The adaptable buffer algorithm for high quantile estimation in non-stationary data streams

The need to estimate a particular quantile of a distribution is an impor...
research
07/18/2018

Evolving Large-Scale Data Stream Analytics based on Scalable PANFIS

Many distributed machine learning frameworks have recently been built to...
