Kernel distance measures for time series, random fields and other structured data

09/29/2021
by Srinjoy Das, et al.

This paper introduces kdiff, a novel kernel-based measure for estimating distances between instances of time series, random fields and other forms of structured data. The measure is based on the idea of matching distributions that overlap over only a portion of their region of support. It is inspired by MPdist, a measure previously proposed for such datasets. MPdist is constructed using Euclidean metrics, whereas kdiff is constructed using non-linear kernel distances. In addition, kdiff accounts for both self-similarities and cross-similarities across the instances and is defined using a lower quantile of the distance distribution. Comparing cross-similarity to self-similarity yields similarity measures that are more robust to noise and partial occlusions of the relevant signals. kdiff is a more general form of the well-known kernel-based Maximum Mean Discrepancy (MMD) distance estimated over the embeddings. Some theoretical results are provided for separability conditions using kdiff as a distance measure for clustering and classification problems where the embedding distributions can be modeled as two-component mixtures. Applications are demonstrated for clustering of synthetic and real-life time series and image data, and the performance of kdiff is compared to competing distance measures for clustering.
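For orientation, the MMD baseline mentioned in the abstract can be sketched as follows. This is not the paper's kdiff, whose exact definition is not given in the abstract. It is only a minimal illustration of the standard biased estimate of squared MMD between two sets of embeddings. The sliding-window embedding, the Gaussian (RBF) kernel, the bandwidth value, and the function names `sliding_windows`, `rbf_kernel` and `mmd_squared` are all assumptions made for this example.

```python
import numpy as np


def sliding_windows(series, width):
    """Embed a 1-D series as overlapping windows of length `width` (assumed embedding)."""
    series = np.asarray(series, dtype=float)
    return np.lib.stride_tricks.sliding_window_view(series, width)


def rbf_kernel(a, b, bandwidth):
    """Gaussian kernel matrix between the rows of `a` and the rows of `b`."""
    sq_dist = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-np.maximum(sq_dist, 0.0) / (2.0 * bandwidth**2))


def mmd_squared(a, b, bandwidth=1.0):
    """Biased (V-statistic) estimate of squared MMD between two embedding sets.

    Combines the two self-similarity terms with the cross-similarity term.
    """
    k_aa = rbf_kernel(a, a, bandwidth).mean()
    k_bb = rbf_kernel(b, b, bandwidth).mean()
    k_ab = rbf_kernel(a, b, bandwidth).mean()
    return k_aa + k_bb - 2.0 * k_ab


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    t = np.linspace(0.0, 8.0 * np.pi, 200)
    sine = sliding_windows(np.sin(t), 16)
    noise = sliding_windows(rng.normal(size=200), 16)
    print("sine vs sine :", mmd_squared(sine, sine))
    print("sine vs noise:", mmd_squared(sine, noise))
```

The estimate averages over all pairs of embeddings. According to the abstract, kdiff instead uses a lower quantile of the distance distribution, which is what allows it to match instances that overlap over only part of their support.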


research
05/30/2022

A Review and Evaluation of Elastic Distance Functions for Time Series Clustering

Time series clustering is the act of grouping time series data without r...
research
06/12/2018

A review on distance based time series classification

Time series classification is an increasing research topic due to the va...
research
03/24/2023

Clustering Multivariate Time Series using Energy Distance

A novel methodology is proposed for clustering multivariate time series ...
research
10/17/2021

Noise-robust Clustering

This paper presents noise-robust clustering techniques in unsupervised m...
research
11/04/2019

Novel semi-metrics for multivariate change point analysis and anomaly detection

This paper proposes a new method for determining similarity and anomalie...
research
01/10/2016

On Clustering Time Series Using Euclidean Distance and Pearson Correlation

For time series comparisons, it has often been observed that z-score nor...
research
07/09/2014

Identifying Cover Songs Using Information-Theoretic Measures of Similarity

This paper investigates methods for quantifying similarity between audio...