Scalable Hybrid HMM with Gaussian Process Emission for Sequential Time-series Data Clustering

01/07/2020
by Yohan Jung, et al.

A Hidden Markov Model (HMM) combined with Gaussian Process (GP) emission can be used to estimate the hidden state from a sequence of observations with complex input-output relations. When the spectral mixture (SM) kernel is used for the GP emission, we call the model a hybrid HMM-GPSM. This model can effectively model sequences of time-series data. However, because the SM kernel has a large number of parameters, the model cannot be trained efficiently on a large volume of data that has (1) a long sequence for state transition and (2) a large amount of time-series data in each sequence. This paper proposes a scalable learning method for HMM-GPSM. To train the model efficiently on a long sequence, the proposed method employs a Stochastic Variational Inference (SVI) approach. To process the large number of data points in each time series efficiently, we approximate the SM kernel using the Reparametrized Random Fourier Feature (R-RFF). The combination of these two techniques significantly reduces the training time. We validate the proposed learning method in terms of its hidden-state estimation accuracy and computation time using large-scale synthetic and real data sets with missing values.
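
To make the kernel approximation concrete, here is a minimal sketch of random Fourier features for a one-dimensional SM kernel. It is not the paper's implementation. It assumes the standard SM kernel form k(tau) = sum_q w_q exp(-2 pi^2 tau^2 sigma_q^2) cos(2 pi mu_q tau), and it assumes that "reparametrized" means drawing each frequency as mu_q + sigma_q * eps with eps ~ N(0, 1), so the random draws stay fixed while the kernel parameters change. The function names and all parameter values are hypothetical.

```python
import numpy as np


def sm_kernel(x1, x2, weights, means, scales):
    """Exact 1-D spectral mixture kernel matrix between x1 and x2."""
    tau = x1[:, None] - x2[None, :]
    k = np.zeros_like(tau)
    for w, mu, sigma in zip(weights, means, scales):
        k += w * np.exp(-2.0 * np.pi**2 * tau**2 * sigma**2) * np.cos(2.0 * np.pi * mu * tau)
    return k


def rff_features(x, weights, means, scales, eps):
    """Random Fourier features whose inner product approximates the SM kernel.

    eps has shape (Q, M): M standard-normal draws per mixture component.
    Frequencies are built as mu + sigma * eps (assumed reparametrization).
    """
    n_features = eps.shape[1]
    blocks = []
    for w, mu, sigma, e in zip(weights, means, scales, eps):
        freqs = mu + sigma * e                      # shape (M,)
        phase = 2.0 * np.pi * x[:, None] * freqs[None, :]
        scale = np.sqrt(w / n_features)
        blocks.append(scale * np.hstack([np.cos(phase), np.sin(phase)]))
    return np.hstack(blocks)                        # shape (N, 2 * Q * M)


# Hypothetical two-component kernel, chosen only for illustration.
rng = np.random.default_rng(0)
weights = np.array([1.0, 0.5])
means = np.array([0.5, 2.0])
scales = np.array([0.1, 0.3])

x = np.linspace(0.0, 5.0, 50)
eps = rng.standard_normal((2, 2000))

phi = rff_features(x, weights, means, scales, eps)
k_approx = phi @ phi.T
k_exact = sm_kernel(x, x, weights, means, scales)
max_err = np.max(np.abs(k_exact - k_approx))
```

With N data points and D features in total, a GP likelihood built on `phi` works with a D x D matrix instead of the N x N kernel matrix, which is the usual reason feature approximations of this kind reduce cost when N is large.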

11/26/2015

The Automatic Statistician: A Relational Perspective

Gaussian Processes (GPs) provide a general and analytically tractable wa...
01/27/2020

Bayesian nonparametric shared multi-sequence time series segmentation

In this paper, we introduce a method for segmenting time series data usi...
02/10/2021

Attentive Gaussian processes for probabilistic time-series generation

The transduction of sequence has been mostly done by recurrent networks,...
03/15/2018

Capturing Structure Implicitly from Time-Series having Limited Data

Scientific fields such as insider-threat detection and highway-safety pl...
01/08/2014

Fast nonparametric clustering of structured time-series

In this publication, we combine two Bayesian non-parametric models: the ...
04/21/2019

Linear Multiple Low-Rank Kernel Based Stationary Gaussian Processes Regression for Time Series

Gaussian processes (GP) for machine learning have been studied systemati...
09/26/2017

Symbolic Analysis-based Reduced Order Markov Modeling of Time Series Data

This paper presents a technique for reduced-order Markov modeling for co...