Scalable Hybrid HMM with Gaussian Process Emission for Sequential Time-series Data Clustering

by Yohan Jung, et al.

A Hidden Markov Model (HMM) combined with Gaussian Process (GP) emissions can be used to estimate hidden states from a sequence of observations with complex input-output relations. When the spectral mixture (SM) kernel is used for the GP emission, we call the model a hybrid HMM-GPSM. This model can effectively model sequences of time-series data. However, because the SM kernel has a large number of parameters, the model cannot be trained efficiently on a large volume of data having (1) a long sequence of state transitions and (2) a large number of time-series data points in each sequence. This paper proposes a scalable learning method for HMM-GPSM. To train the model efficiently on a long sequence, the proposed method employs a Stochastic Variational Inference (SVI) approach. To process the large number of data points in each time series, we approximate the SM kernel using Reparametrized Random Fourier Features (R-RFF). The combination of these two techniques significantly reduces the training time. We validate the proposed learning method in terms of its hidden-state estimation accuracy and computation time using large-scale synthetic and real data sets with missing values.
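
The kernel approximation mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the standard one-dimensional SM kernel, k(tau) = sum_q w_q exp(-2 pi^2 tau^2 v_q) cos(2 pi mu_q tau), and assumes that "reparametrized" means drawing each spectral frequency as s = mu_q + sqrt(v_q) * eps with eps ~ N(0, 1), so that the noise eps stays fixed while the kernel parameters vary. The function names `sm_kernel` and `rrff_features` and all parameter values are hypothetical.

```python
import numpy as np

def sm_kernel(x1, x2, weights, means, variances):
    """Exact 1-D spectral mixture kernel matrix (assumed standard form)."""
    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=float)
    tau = x1[:, None] - x2[None, :]
    k = np.zeros_like(tau)
    for w, mu, v in zip(weights, means, variances):
        k += w * np.exp(-2.0 * np.pi**2 * tau**2 * v) * np.cos(2.0 * np.pi * mu * tau)
    return k

def rrff_features(x, weights, means, variances, eps):
    """Random Fourier features for the SM kernel.

    eps has shape (Q, M): M standard-normal draws per mixture component.
    Frequencies are reparametrized as mu_q + sqrt(v_q) * eps (an assumption
    about the form of R-RFF, not taken from the paper).
    Returns an (N, 2 * Q * M) feature matrix phi with phi @ phi.T ~ K.
    """
    x = np.asarray(x, dtype=float)
    weights = np.asarray(weights, dtype=float)
    means = np.asarray(means, dtype=float)
    variances = np.asarray(variances, dtype=float)
    n_samples = eps.shape[1]
    freqs = means[:, None] + np.sqrt(variances)[:, None] * eps      # (Q, M)
    phase = 2.0 * np.pi * x[:, None, None] * freqs[None, :, :]      # (N, Q, M)
    scale = np.sqrt(weights / n_samples)[None, :, None]             # (1, Q, 1)
    feats = np.concatenate([scale * np.cos(phase), scale * np.sin(phase)], axis=-1)
    return feats.reshape(len(x), -1)

# Hypothetical two-component kernel on 30 inputs.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 30)
w = np.array([0.6, 0.4])
mu = np.array([1.0, 3.0])
v = np.array([0.5, 0.2])
eps = rng.standard_normal((2, 5000))

phi = rrff_features(x, w, mu, v, eps)
K_exact = sm_kernel(x, x, w, mu, v)
K_approx = phi @ phi.T
max_abs_error = np.max(np.abs(K_approx - K_exact))
```

With N data points and D = 2QM features, working with `phi` instead of the full N x N kernel matrix replaces the cubic cost in N of exact GP inference with a cost that is linear in N, which is the usual motivation for this kind of approximation.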




The Automatic Statistician: A Relational Perspective

Gaussian Processes (GPs) provide a general and analytically tractable wa...

Bayesian nonparametric shared multi-sequence time series segmentation

In this paper, we introduce a method for segmenting time series data usi...

Attentive Gaussian processes for probabilistic time-series generation

The transduction of sequence has been mostly done by recurrent networks,...

Capturing Structure Implicitly from Time-Series having Limited Data

Scientific fields such as insider-threat detection and highway-safety pl...

Fast nonparametric clustering of structured time-series

In this publication, we combine two Bayesian non-parametric models: the ...

Linear Multiple Low-Rank Kernel Based Stationary Gaussian Processes Regression for Time Series

Gaussian processes (GP) for machine learning have been studied systemati...

Symbolic Analysis-based Reduced Order Markov Modeling of Time Series Data

This paper presents a technique for reduced-order Markov modeling for co...