COHORTNEY: Deep Clustering for Heterogeneous Event Sequences

04/03/2021
by Vladislav Zhuzhel, et al.

Event sequences are attracting growing attention. In particular, clustering of event sequences is widely applicable in domains such as healthcare, marketing, and finance. Use cases include analyzing website visits, hospital visits, and bank transactions. Unlike traditional time series, event sequences tend to be sparse and unevenly spaced in time. As a result, they exhibit different properties, which are essential to account for when developing state-of-the-art methods. The community has paid little attention to the specifics of heterogeneous event sequences. Existing research in clustering primarily focuses on classic time series data. It is unclear whether the methods proposed in the literature generalize well to event sequences. Here we propose COHORTNEY, a novel deep learning method for clustering heterogeneous event sequences. Our contributions include (i) a novel method that combines an LSTM with the EM algorithm, together with a code implementation; (ii) a comparison of this method to previous research on time series and event sequence clustering; (iii) a performance benchmark of different approaches on a new dataset from the finance industry and fourteen additional datasets. Our results show that COHORTNEY vastly outperforms the state-of-the-art algorithm for clustering event sequences in both speed and cluster quality.


research
11/22/2020

Time series classification for predictive maintenance on event logs

Time series classification (TSC) gained a lot of attention in the past d...
research
02/20/2021

nTreeClus: a Tree-based Sequence Encoder for Clustering Categorical Series

The overwhelming presence of categorical/sequential data in diverse doma...
research
04/20/2020

A Benchmark Study on Time Series Clustering

This paper presents the first time series clustering benchmark utilizing...
research
01/31/2021

Synergetic Learning of Heterogeneous Temporal Sequences for Multi-Horizon Probabilistic Forecasting

Time-series is ubiquitous across applications, such as transportation, f...
research
04/27/2019

Temporal-Clustering Invariance in Irregular Healthcare Time Series

Electronic records contain sequences of events, some of which take place...
research
04/28/2022

COSTI: a New Classifier for Sequences of Temporal Intervals

Classification of sequences of temporal intervals is a part of time seri...
research
12/06/2018

Time-Discounting Convolution for Event Sequences with Ambiguous Timestamps

This paper proposes a method for modeling event sequences with ambiguous...
