Towards a Near Universal Time Series Data Mining Tool: Introducing the Matrix Profile

11/05/2018
by Chin-Chia Michael Yeh, et al.

The last decade has seen a flurry of research on all-pairs-similarity-search (or self-join) for text, DNA, and a handful of other data types, and these systems have been applied to many diverse data mining problems. Surprisingly, however, little progress has been made on this problem for time series subsequences. In this thesis, we introduce a near universal time series data mining tool called the matrix profile, which solves the all-pairs-similarity-search problem and caches the output in an easy-to-access fashion. The proposed algorithm is not only parameter-free, exact, and scalable, but also applicable to both single-dimensional and multidimensional time series. By building time series data mining methods on top of the matrix profile, many time series data mining tasks (e.g., motif discovery, discord discovery, shapelet discovery, semantic segmentation, and clustering) can be solved efficiently. Because the same matrix profile can be shared by a diverse set of time series data mining methods, it is a versatile, compute-once, use-many-times data structure. We demonstrate the utility of the matrix profile for many time series data mining problems, including motif discovery, discord discovery, weakly labeled time series classification, and representation learning, on domains as diverse as seismology, entomology, music processing, bioinformatics, human activity monitoring, electrical power-demand monitoring, and medicine. We hope the matrix profile is not the end but the beginning of many more time series data mining projects.
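
To make the idea of an all-pairs-similarity-search over subsequences concrete, the following is a minimal brute-force sketch in Python. It is not the scalable algorithm proposed in the thesis; it only illustrates what the cached output could look like. The function names, the use of z-normalized Euclidean distance, and the exclusion zone of half the subsequence length (used to skip trivial matches of a subsequence with its own neighbors) are assumptions made for this illustration.

```python
import numpy as np


def znorm(x):
    """Z-normalize a subsequence; a constant subsequence maps to all zeros."""
    sd = x.std()
    return (x - x.mean()) / sd if sd > 0 else np.zeros_like(x)


def naive_matrix_profile(ts, m):
    """Brute-force self-join, O(n^2 * m) time. Illustration only.

    Returns (profile, index): profile[i] is the distance from the
    subsequence starting at i to its nearest non-trivial neighbor,
    and index[i] is where that neighbor starts.
    """
    ts = np.asarray(ts, dtype=float)
    n = len(ts) - m + 1                      # number of subsequences
    subs = np.array([znorm(ts[i:i + m]) for i in range(n)])
    excl = max(1, m // 2)                    # assumed exclusion zone width
    profile = np.full(n, np.inf)
    index = np.full(n, -1)
    for i in range(n):
        d = np.linalg.norm(subs - subs[i], axis=1)
        lo, hi = max(0, i - excl), min(n, i + excl + 1)
        d[lo:hi] = np.inf                    # ignore trivial matches near i
        j = int(np.argmin(d))
        profile[i], index[i] = d[j], j
    return profile, index


def top_motif(profile, index):
    """Start positions of the closest pair of subsequences (a motif)."""
    i = int(np.argmin(profile))
    return i, int(index[i])


def top_discord(profile):
    """Start position of the subsequence farthest from its nearest neighbor."""
    finite = np.where(np.isfinite(profile), profile, -np.inf)
    return int(np.argmax(finite))
```

Once the two arrays are computed, both helper functions read their answers directly from them without touching the raw series again, which is the compute-once, use-many-times property the abstract describes.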

research
12/24/2021

Error-bounded Approximate Time Series Joins using Compact Dictionary Representations of Time Series

The matrix profile is an effective data mining tool that provides simila...
research
02/15/2018

Admissible Time Series Motif Discovery with Missing Data

The discovery of time series motifs has emerged as one of the most usefu...
research
09/16/2020

Matrix Profile XXII: Exact Discovery of Time Series Motifs under DTW

Over the last decade, time series motif discovery has emerged as a usefu...
research
04/17/2020

Exploring time-series motifs through DTW-SOM

Motif discovery is a fundamental step in data mining tasks for time-seri...
research
03/25/2020

FastDTW is approximate and Generally Slower than the Algorithm it Approximates

Many time series data mining problems can be solved with repeated use of...
research
06/16/2023

Calculating the matrix profile from noisy data

The matrix profile (MP) is a data structure computed from a time series ...
research
04/03/2020

Modeling Rare Interactions in Time Series Data Through Qualitative Change: Application to Outcome Prediction in Intensive Care Units

Many areas of research are characterised by the deluge of large-scale hi...
