Oversampling for Imbalanced Time Series Data

04/14/2020
by Tuanfei Zhu, et al.

Many important real-world applications involve time-series data with skewed class distributions. Compared with conventional imbalanced learning problems, classifying imbalanced time-series data is more challenging because of high dimensionality and high inter-variable correlation. This paper proposes OHIT, a structure-preserving Oversampling method for High-dimensional Imbalanced Time-series classification. OHIT first uses a density-ratio based shared nearest neighbor clustering algorithm to capture the modes of the minority class in high-dimensional space. For each mode, it then applies a shrinkage technique for large-dimensional covariance matrices to obtain an accurate and reliable estimate of the covariance structure. Finally, OHIT generates structure-preserving synthetic samples from multivariate Gaussian distributions parameterized by the estimated covariance matrices. Experimental results on several publicly available time-series datasets (both unimodal and multi-modal) show that OHIT outperforms state-of-the-art oversampling algorithms in terms of F-value, G-mean, and AUC.
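
The last two steps described in the abstract (per-mode covariance shrinkage followed by Gaussian sampling) can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes the minority-class modes have already been found and are passed in as `mode_labels`, so the density-ratio based shared nearest neighbor clustering step is omitted. The names `shrunk_covariance` and `oversample_by_mode`, the fixed shrinkage intensity `alpha`, the scaled-identity shrinkage target, and the size-proportional allocation of synthetic samples are all assumptions made for illustration.

```python
import numpy as np


def shrunk_covariance(X, alpha=0.1):
    """Shrink the sample covariance of X (n_samples, n_features) toward a
    scaled identity matrix.

    alpha is a hypothetical, fixed shrinkage intensity in [0, 1]. A
    data-driven estimator would choose it from the data instead.
    """
    X = np.asarray(X, dtype=float)
    n, d = X.shape
    centered = X - X.mean(axis=0)
    sample_cov = centered.T @ centered / max(n - 1, 1)
    mu = np.trace(sample_cov) / d  # average variance across dimensions
    return (1.0 - alpha) * sample_cov + alpha * mu * np.eye(d)


def oversample_by_mode(X_min, mode_labels, n_new, alpha=0.1, seed=0):
    """Draw n_new synthetic minority samples, one Gaussian per mode.

    mode_labels is assumed to come from a separate clustering step.
    Samples are allocated to modes in proportion to mode size, which is
    an assumption of this sketch.
    """
    rng = np.random.default_rng(seed)
    X_min = np.asarray(X_min, dtype=float)
    mode_labels = np.asarray(mode_labels)

    modes, counts = np.unique(mode_labels, return_counts=True)
    quota = np.floor(n_new * counts / counts.sum()).astype(int)
    quota[: n_new - quota.sum()] += 1  # hand out the rounding remainder

    synthetic = []
    for mode, k in zip(modes, quota):
        members = X_min[mode_labels == mode]
        mean = members.mean(axis=0)
        cov = shrunk_covariance(members, alpha)
        synthetic.append(rng.multivariate_normal(mean, cov, size=k))
    return np.vstack(synthetic)
```

Shrinkage matters here because a mode typically holds far fewer series than each series has time points, so the raw sample covariance is rank-deficient. Mixing in a scaled identity keeps the estimate positive definite, which makes the Gaussian sampling well defined in every dimension.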

research
10/10/2021

Time Series Classification Using Convolutional Neural Network On Imbalanced Datasets

Time Series Classification (TSC) has drawn a lot of attention in literat...
research
08/07/2023

DOMINO: Domain-invariant Hyperdimensional Classification for Multi-Sensor Time Series Data

With the rapid evolution of the Internet of Things, many real-world appl...
research
11/27/2017

OSTSC: Over Sampling for Time Series Classification in R

The OSTSC package is a powerful oversampling approach for classifying un...
research
05/30/2019

Efficient Covariance Estimation from Temporal Data

Estimating the covariance structure of multivariate time series is a fun...
research
06/01/2023

Learning Gaussian Mixture Representations for Tensor Time Series Forecasting

Tensor time series (TTS) data, a generalization of one-dimensional time ...
research
07/04/2016

Automatic Generation of Probabilistic Programming from Time Series Data

Probabilistic programming languages represent complex data with intermin...
research
02/16/2022

HDC-MiniROCKET: Explicit Time Encoding in Time Series Classification with Hyperdimensional Computing

Classification of time series data is an important task for many applica...
