The Lernaean Hydra of Data Series Similarity Search: An Experimental Evaluation of the State of the Art

06/20/2020
by Karima Echihabi, et al.

Increasingly large data series collections are becoming commonplace across many domains and applications. A key operation in the analysis of such collections is similarity search, which has attracted considerable attention and effort over the past two decades. Although several relevant approaches have been proposed in the literature, none of the existing studies provides a detailed evaluation against the available alternatives. The lack of comparative results is further exacerbated by the non-standard use of terminology, which has led to confusion and misconceptions. In this paper, we provide definitions for the different flavors of similarity search that have been studied in the past, and present the first systematic experimental evaluation of the efficiency of data series similarity search techniques. Based on the experimental results, we describe the strengths and weaknesses of each approach and recommend the best approach for typical use cases. Finally, by identifying the shortcomings of each method, our findings lay the groundwork for further developments in the field.

research
09/02/2020

Data Series Indexing Gone Parallel

Data series similarity search is a core operation for several data serie...
research
06/20/2020

Return of the Lernaean Hydra: Experimental Evaluation of Data Series Approximate Similarity Search

Data series are a special type of multidimensional data present in numer...
research
10/14/2021

Fast Data Series Indexing for In-Memory Data

Data series similarity search is a core operation for several data serie...
research
09/22/2020

Effective and Efficient Variable-Length Data Series Analytics

In the last twenty years, data series similarity search has emerged as a...
research
01/04/2022

Elastic Product Quantization for Time Series

Analyzing numerous or long time series is difficult in practice due to t...
research
04/17/2023

Dumpy: A Compact and Adaptive Index for Large Data Series Collections

Data series indexes are necessary for managing and analyzing the increas...
research
06/05/2020

MISIM: An End-to-End Neural Code Similarity System

Code similarity systems are integral to a range of applications from cod...
