Identifying and Alleviating Concept Drift in Streaming Tensor Decomposition

04/25/2018
by   Ravdeep Pasricha, et al.

Tensor decompositions are used in data mining applications ranging from social networks to medicine, and are extremely useful for discovering latent structures, or concepts, in the data. Many real-world applications are dynamic in nature, and so are their data. To deal with this dynamic nature, a variety of online tensor decomposition algorithms exist. A central assumption in all of those algorithms is that the number of latent concepts remains fixed throughout the entire stream. However, this need not be the case. Every incoming batch in the stream may have a different number of latent concepts, and the difference in latent concepts from one tensor batch to another can provide insights into how our findings in a particular application behave and deviate over time. In this paper, we define "concept" and "concept drift" in the context of streaming tensor decomposition as the manifestation of the variability of latent concepts throughout the stream. Furthermore, we introduce SeekAndDestroy, an algorithm that detects concept drift in streaming tensor decomposition and produces results that are robust to that drift. To the best of our knowledge, this is the first work that investigates concept drift in streaming tensor decomposition. We extensively evaluate SeekAndDestroy on synthetic datasets that exhibit a wide variety of realistic drift. Our experiments demonstrate the effectiveness of SeekAndDestroy both in detecting concept drift and in alleviating its effects, producing results of similar quality to decomposing the entire tensor in one shot. Additionally, on real datasets, SeekAndDestroy outperforms other streaming baselines while discovering novel, useful components.
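To make the notion of a varying number of latent concepts concrete, the following is a minimal illustrative sketch and not SeekAndDestroy itself. It assumes noise-free synthetic batches built as sums of rank-one terms, and it swaps in a crude proxy for the number of concepts: the numerical rank of each batch's mode-1 unfolding. The helper names (`cp_tensor`, `estimate_num_concepts`, `random_batch`), the batch sizes, and the tolerance are all hypothetical choices made for illustration.

```python
import numpy as np

def cp_tensor(A, B, C):
    """Build a 3-mode tensor as a sum of rank-one terms from factor matrices."""
    return np.einsum("ir,jr,kr->ijk", A, B, C)

def estimate_num_concepts(batch, tol=1e-8):
    """Crude proxy (an assumption, not the paper's method): numerical rank
    of the mode-1 unfolding, counted via relative singular values."""
    unfolded = batch.reshape(batch.shape[0], -1)
    s = np.linalg.svd(unfolded, compute_uv=False)
    return int(np.sum(s > tol * s[0]))

rng = np.random.default_rng(0)
I, J, K = 20, 15, 10  # K = number of slices per batch along the streaming mode

def random_batch(rank):
    """Hypothetical noise-free batch with `rank` latent concepts."""
    return cp_tensor(rng.random((I, rank)),
                     rng.random((J, rank)),
                     rng.random((K, rank)))

# A stream whose batches carry 3, 3, 5, and 2 latent concepts.
stream = [random_batch(r) for r in (3, 3, 5, 2)]
ranks = [estimate_num_concepts(b) for b in stream]

# Flag a change whenever the estimated count differs from the previous batch.
drift = [curr != prev for prev, curr in zip(ranks, ranks[1:])]
print(ranks, drift)
```

On noisy real data this unfolding-rank proxy would not be reliable, and a change in the count is only one way drift can show up; the sketch only illustrates why a fixed-rank assumption can fail from one batch to the next.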


research
02/23/2023

Streaming probabilistic tensor train decomposition

The Bayesian streaming tensor decomposition method is a novel method to ...
research
05/06/2022

PARAFAC2×N: Coupled Decomposition of Multi-modal Data with Drift in N Modes

Reliable analysis of comprehensive two-dimensional gas chromatography - ...
research
11/07/2020

Enhash: A Fast Streaming Algorithm For Concept Drift Detection

We propose Enhash, a fast ensemble learner that detects concept drift in...
research
06/24/2022

SECLEDS: Sequence Clustering in Evolving Data Streams via Multiple Medoids and Medoid Voting

Sequence clustering in a streaming environment is challenging because it...
research
12/07/2020

Passive Approach for the K-means Problem on Streaming Data

Currently the amount of data produced worldwide is increasing beyond mea...
research
03/11/2015

Automatic Unsupervised Tensor Mining with Quality Assessment

A popular tool for unsupervised modelling and mining multi-aspect data i...
research
10/10/2022

Modeling and Mining Multi-Aspect Graphs With Scalable Streaming Tensor Decomposition

Graphs emerge in almost every real-world application domain, ranging fro...
