SOMTimeS: Self Organizing Maps for Time Series Clustering and its Application to Serious Illness Conversations

08/26/2021
by   Ali Javed, et al.

There is an increasing demand for scalable algorithms capable of clustering and analyzing large time series datasets. The Kohonen self-organizing map (SOM) is a type of unsupervised artificial neural network for visualizing and clustering complex data, reducing the dimensionality of data, and selecting influential features. Like all clustering methods, the SOM requires a measure of similarity between input data (in this work, time series). Dynamic time warping (DTW) is one such measure, and a top performer because it accommodates distortions when aligning time series. Despite its use in clustering, DTW is limited in practice because its runtime is quadratic in the length of the time series. To address this, we present a new DTW-based clustering method, called SOMTimeS (a Self-Organizing Map for TIME Series), that scales better and runs faster than other DTW-based clustering algorithms while achieving similar accuracy. The computational performance of SOMTimeS stems from its ability to prune unnecessary DTW computations during the SOM's training phase. For comparison, we also implemented a similar pruning strategy for K-means, one of the top-performing clustering algorithms. We evaluated the pruning effectiveness, accuracy, execution time, and scalability on 112 benchmark time series datasets from the University of California, Riverside classification archive. We showed that for similar accuracy, the speed-up achieved for SOMTimeS and K-means was 1.8x on average; however, rates varied between 1x and 18x depending on the dataset. SOMTimeS and K-means pruned 43 respectively. We applied SOMTimeS to natural language conversation data collected as part of a large healthcare cohort study of patient-clinician serious illness conversations to demonstrate the algorithm's utility with complex, temporally sequenced phenomena.
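To make the quadratic cost of DTW and the idea of pruning concrete, the following is a minimal Python sketch, not the authors' implementation. The abstract does not say which bounds SOMTimeS uses, so the sketch substitutes a simple endpoint lower bound as a stand-in (an assumption): every warping path must align the first points and the last points of the two series, so their combined cost can never exceed the full DTW cost. The function names `dtw_distance`, `endpoint_lower_bound`, and `best_match` are hypothetical, and the squared point cost is one common choice among several.

```python
import math


def dtw_distance(a, b):
    """Classic DTW with squared point cost; O(len(a) * len(b)) time."""
    n, m = len(a), len(b)
    inf = math.inf
    # prev[j] is the best accumulated cost of aligning the rows seen so far
    # with the first j points of b.
    prev = [inf] * (m + 1)
    prev[0] = 0.0
    for i in range(1, n + 1):
        curr = [inf] * (m + 1)
        for j in range(1, m + 1):
            cost = (a[i - 1] - b[j - 1]) ** 2
            # Extend the cheapest of the three allowed predecessor cells.
            curr[j] = cost + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr
    return math.sqrt(prev[m])


def endpoint_lower_bound(a, b):
    """Cheap O(1) lower bound on dtw_distance(a, b) (illustrative stand-in).

    Any warping path starts at (a[0], b[0]) and ends at (a[-1], b[-1]).
    """
    first = (a[0] - b[0]) ** 2
    if len(a) == 1 and len(b) == 1:
        return math.sqrt(first)  # start and end are the same cell
    last = (a[-1] - b[-1]) ** 2
    return math.sqrt(first + last)


def best_match(series, prototypes):
    """Find the closest prototype, skipping DTW calls the bound rules out."""
    best_index, best_dist, full_computations = -1, math.inf, 0
    for k, proto in enumerate(prototypes):
        if endpoint_lower_bound(series, proto) >= best_dist:
            continue  # cannot beat the current best; skip the quadratic call
        full_computations += 1
        d = dtw_distance(series, proto)
        if d < best_dist:
            best_index, best_dist = k, d
    return best_index, best_dist, full_computations


if __name__ == "__main__":
    query = [0, 1, 2, 1, 0]
    candidates = [[0, 1, 1, 2, 1, 0], [5, 6, 7, 6, 5], [0, 0, 1, 2, 1, 0, 0]]
    print(best_match(query, candidates))
```

In a SOM or K-means setting, `prototypes` would play the role of the map's weight vectors or the cluster centroids, and the fraction of skipped `dtw_distance` calls is the kind of pruning rate the abstract reports. Tighter bounds prune more; the endpoint bound is used here only because it is easy to verify.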

