A Self-Supervised Learning-based Approach to Clustering Multivariate Time-Series Data with Missing Values (SLAC-Time): An Application to Traumatic Brain Injury Phenotyping

02/27/2023
by Hamid Ghaderi, et al.

Self-supervised learning approaches provide a promising direction for clustering multivariate time-series data. However, real-world time-series data often include missing values, and existing approaches require imputing those values before clustering, which can be computationally expensive, introduce noise, and lead to invalid interpretations. To address these challenges, we present a Self-supervised Learning-based Approach to Clustering multivariate Time-series data with missing values (SLAC-Time). SLAC-Time is a Transformer-based clustering method that uses time-series forecasting as a proxy task to leverage unlabeled data and learn more robust time-series representations. The method jointly learns the neural network parameters and the cluster assignments of the learned representations: it iteratively clusters the learned representations with K-means and then uses the resulting cluster assignments as pseudo-labels to update the model parameters. To evaluate the approach, we applied it to clustering and phenotyping Traumatic Brain Injury (TBI) patients in the TRACK-TBI dataset. Our experiments demonstrate that SLAC-Time outperforms the baseline K-means clustering algorithm in terms of the silhouette coefficient, Calinski-Harabasz index, Dunn index, and Davies-Bouldin index. We identified three TBI phenotypes that are distinct from one another in terms of clinically significant variables as well as clinical outcomes, including the Extended Glasgow Outcome Scale (GOSE) score, Intensive Care Unit (ICU) length of stay, and mortality rate. The experiments show that the TBI phenotypes identified by SLAC-Time could potentially be used to develop targeted clinical trials and therapeutic strategies.
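The alternation between K-means clustering and pseudo-label training described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: SLAC-Time uses a Transformer encoder, forecasting-based pre-training, and handles missing values, none of which appear here. The sketch assumes a one-layer tanh encoder, fully observed synthetic data, and hypothetical names (`kmeans`, `fit_pseudo_labels`, `d_rep`, `n_rounds`) chosen only for illustration.

```python
import numpy as np

def kmeans(x, k, n_iter=20, seed=0):
    """Plain Lloyd's K-means; returns (labels, centroids)."""
    rng = np.random.default_rng(seed)
    centroids = x[rng.choice(len(x), size=k, replace=False)]
    for _ in range(n_iter):
        # Assign each point to its nearest centroid (squared Euclidean distance).
        dists = ((x[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        # Recompute centroids; an empty cluster keeps its previous centroid.
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = x[labels == j].mean(axis=0)
    return labels, centroids

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # subtract row max for stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def fit_pseudo_labels(x, k=3, n_rounds=5, n_steps=50, lr=0.1, d_rep=4, seed=0):
    """Alternate between clustering representations and training on pseudo-labels."""
    rng = np.random.default_rng(seed)
    w = rng.normal(scale=0.1, size=(x.shape[1], d_rep))  # toy encoder (assumption)
    for _ in range(n_rounds):
        # Step 1: cluster the current representations to obtain pseudo-labels.
        labels, _ = kmeans(np.tanh(x @ w), k, seed=seed)
        onehot = np.eye(k)[labels]
        # Cluster indices are arbitrary across rounds, so the classification
        # head is re-initialised each round (a design choice of this sketch).
        v = rng.normal(scale=0.1, size=(d_rep, k))
        # Step 2: update encoder and head with cross-entropy on the pseudo-labels.
        for _ in range(n_steps):
            h = np.tanh(x @ w)
            p = softmax(h @ v)
            g_logits = (p - onehot) / len(x)
            g_v = h.T @ g_logits
            g_w = x.T @ ((g_logits @ v.T) * (1.0 - h ** 2))
            v -= lr * g_v
            w -= lr * g_w
    labels, _ = kmeans(np.tanh(x @ w), k, seed=seed)
    return labels, w

# Synthetic, fully observed data: three Gaussian blobs in five dimensions.
data_rng = np.random.default_rng(42)
x = np.concatenate([data_rng.normal(loc=c, scale=0.3, size=(20, 5))
                    for c in (-2.0, 0.0, 2.0)])
labels, w = fit_pseudo_labels(x, k=3)
```

In this sketch the final `labels` array plays the role of the cluster assignments and `w` the role of the learned encoder parameters. Replacing the toy encoder with a sequence model and adding a mechanism for missing observations would be needed before it resembled the method described above.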

research
07/29/2021

Self-supervised Transformer for Multivariate Clinical Time-Series with Missing Values

Multivariate time-series (MVTS) data are frequently observed in critical...
research
03/23/2023

Self-Supervised Clustering of Multivariate Time-Series Data for Identifying TBI Physiological States

Determining clinically relevant physiological states from multivariate t...
research
04/11/2020

Clustering Time Series Data through Autoencoder-based Deep Learning Models

Machine learning and in particular deep learning algorithms are the emer...
research
08/03/2017

Detecting early signs of depressive and manic episodes in patients with bipolar disorder using the signature-based model

Recurrent major mood episodes and subsyndromal mood instability cause su...
research
12/02/2022

Clustering individuals based on multivariate EMA time-series data

In the field of psychopathology, Ecological Momentary Assessment (EMA) m...
research
09/09/2022

Autoencoder Based Iterative Modeling and Multivariate Time-Series Subsequence Clustering Algorithm

This paper introduces an algorithm for the detection of change-points an...
