A Review and Evaluation of Elastic Distance Functions for Time Series Clustering

05/30/2022
by Chris Holder, et al.

Time series clustering is the task of grouping time series data without recourse to labels. Algorithms that cluster time series fall into two groups: those that employ a time series specific distance measure, and those that derive features from the time series. Both approaches usually rely on traditional clustering algorithms such as k-means. Our focus is on distance-based time series clustering with elastic distance measures, i.e. distances that perform some kind of realignment whilst measuring distance. We describe nine commonly used elastic distance measures and compare their performance with k-means and k-medoids clustering. Our findings are surprising. With k-means, the most popular technique, dynamic time warping (DTW), performs worse than Euclidean distance and, even when tuned, is no better. Using k-medoids rather than k-means improved the clusterings for all nine distance measures, although with k-medoids DTW is still not significantly better than Euclidean distance. Generally, distance measures that employ editing in conjunction with warping perform better, and one distance measure, the move-split-merge (MSM) method, is the best performing measure in this study. We also compare against clustering with DTW using barycentre averaging (DBA). We find that DBA does improve DTW k-means, but that standard DBA is still worse than using MSM. Our conclusion is to recommend MSM with k-medoids as the benchmark algorithm for clustering time series with elastic distance measures. We provide implementations, results and guidance on reproducing results in the associated GitHub repository.
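
To make the idea of clustering with an elastic distance concrete, the following is a minimal sketch and not the implementation from the associated repository. It assumes full-window DTW with a squared pointwise cost (one common convention; the paper's exact parameterisation may differ) and a simple alternating k-medoids on a precomputed distance matrix. The function names `dtw_distance`, `pairwise_distances` and `k_medoids` are hypothetical and chosen for illustration only.

```python
import numpy as np


def dtw_distance(x, y):
    """DTW with no window constraint and squared pointwise cost (assumed convention)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n, m = len(x), len(y)
    # cost[i, j] = cheapest alignment of the first i points of x with the first j of y
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = (x[i - 1] - y[j - 1]) ** 2
            cost[i, j] = d + min(cost[i - 1, j - 1], cost[i - 1, j], cost[i, j - 1])
    return cost[n, m]


def pairwise_distances(series, dist):
    """Symmetric matrix of distances between every pair of series."""
    k = len(series)
    D = np.zeros((k, k))
    for a in range(k):
        for b in range(a + 1, k):
            D[a, b] = D[b, a] = dist(series[a], series[b])
    return D


def k_medoids(D, k, n_iter=100, seed=0):
    """Alternating k-medoids: assign to nearest medoid, then re-pick each medoid."""
    rng = np.random.default_rng(seed)
    medoids = rng.choice(len(D), size=k, replace=False)
    for _ in range(n_iter):
        labels = np.argmin(D[:, medoids], axis=1)
        new_medoids = medoids.copy()
        for c in range(k):
            members = np.where(labels == c)[0]
            if len(members) > 0:
                # The medoid is the member with the smallest total distance to the others
                within = D[np.ix_(members, members)].sum(axis=1)
                new_medoids[c] = members[np.argmin(within)]
        if np.array_equal(new_medoids, medoids):
            break
        medoids = new_medoids
    labels = np.argmin(D[:, medoids], axis=1)
    return labels, medoids


# Toy data (made up for illustration): two series near 0 and two near 10.
toy_series = [np.zeros(5), np.zeros(5) + 0.1, np.full(5, 10.0), np.full(5, 10.1)]
toy_labels, toy_medoids = k_medoids(pairwise_distances(toy_series, dtw_distance), k=2)
```

Because k-medoids only needs pairwise distances, any other elastic measure (for example an MSM implementation) could be passed to `pairwise_distances` in place of `dtw_distance` without changing the clustering code. This is one reason the medoid-based approach pairs naturally with elastic distances, whereas k-means needs an averaging step that is consistent with the chosen distance.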
