Data Series Indexing Gone Parallel

09/02/2020
by Botao Peng, et al.

Data series similarity search is a core operation for several data series analysis applications across many different domains. However, state-of-the-art techniques fail to deliver the time performance required for interactive exploration or analysis of large data series collections. In this Ph.D. work, we present the first data series indexing solutions, for both on-disk and in-memory data, that are designed to inherently take advantage of multi-core architectures in order to accelerate similarity search. Our experiments on a variety of synthetic and real data demonstrate that our approaches are up to orders of magnitude faster than the alternatives. More specifically, our on-disk solution can answer exact similarity search queries on 100GB datasets in a few seconds, and our in-memory solution can do so in a few milliseconds, which enables real-time, interactive data exploration on very large data series collections.

research
09/02/2020

MESSI: In-Memory Data Series Indexing

Data series similarity search is a core operation for several data serie...
research
06/20/2020

The Lernaean Hydra of Data Series Similarity Search: An Experimental Evaluation of the State of the Art

Increasingly large data series collections are becoming commonplace acro...
research
06/20/2020

Return of the Lernaean Hydra: Experimental Evaluation of Data Series Approximate Similarity Search

Data series are a special type of multidimensional data present in numer...
research
09/22/2020

Effective and Efficient Variable-Length Data Series Analytics

In the last twenty years, data series similarity search has emerged as a...
research
03/09/2019

RadegastXDB - Prototype of Native XML Database Management System: Technical Report

A lot of advances in the processing of XML data have been proposed in la...
research
03/09/2019

RadegastXDB - Prototype of a Native XML Database Management System: Technical Report

A lot of advances in the processing of XML data have been proposed in th...
research
12/26/2022

ProS: Data Series Progressive k-NN Similarity Search and Classification with Probabilistic Quality Guarantees

Existing systems dealing with the increasing volume of data series canno...
