Compact Representations of Event Sequences

We introduce a new technique for efficiently managing large sequences of multidimensional data. It exploits regularities that arise in real-world datasets and supports several types of aggregation queries. More importantly, the representation is flexible: the relevant dimensions and queries can guide the construction process, yielding a space-time tradeoff tuned to the queries that matter in a given domain. We provide two alternative representations for sequences of multidimensional data and describe how to store the datasets efficiently and how to perform aggregation queries over the compressed representation. An experimental evaluation on realistic datasets shows the space efficiency and query capabilities of our proposal.
