CPOI: A Compact Method to Archive Versioned RDF Triple-Sets

02/11/2019
by Maria Psaraki, et al.

Large amounts of RDF/S data have been produced and published in recent years, and several modern applications require versioning and archiving services over such datasets. In this paper we propose a novel storage index for archiving versions of such datasets, called CPOI (compact partial order index). It exploits the fact that an RDF Knowledge Base (KB) is a graph (or, equivalently, a set of triples) and therefore, unlike text, has no unique serialization. Storing several versions thus amounts to storing multiple sets of triples. CPOI is a data structure for storing such sets that aims to reduce storage space, which matters not only for lowering storage costs but also for lowering communication costs and for hosting large quantities of data in main memory, where they can be processed efficiently. CPOI is based on a partial order structure over sets of triple identifiers, where the triple identifiers are represented in a gapped form using variable-length encoding schemes. For this index we evaluate, analytically and experimentally, various identifier assignment techniques and the space savings they yield. The results show significant storage savings: specifically, the storage space of the compressed sets in large and realistic synthetic datasets is about the 8
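
The "gapped form using variable length encoding" mentioned in the abstract can be illustrated with a minimal sketch. It assumes triple identifiers are positive integers and uses a variable-byte (base-128) code as the variable-length scheme. The abstract does not say which scheme CPOI uses, so that choice, along with every function name below, is an assumption made for illustration and not the authors' implementation. The partial order structure itself is not modeled here.

```python
def gap_encode(ids):
    """Sort the IDs and replace each one by its distance from the previous ID.

    The first gap is the smallest ID itself (distance from 0).
    """
    gaps = []
    prev = 0
    for i in sorted(set(ids)):
        gaps.append(i - prev)
        prev = i
    return gaps


def gap_decode(gaps):
    """Inverse of gap_encode: running sums of the gaps."""
    ids = []
    total = 0
    for g in gaps:
        total += g
        ids.append(total)
    return ids


def vbyte_encode(numbers):
    """Variable-byte code (assumed scheme): 7 payload bits per byte,
    high bit set on every byte except the last byte of a number."""
    out = bytearray()
    for n in numbers:
        if n < 0:
            raise ValueError("only non-negative integers can be encoded")
        while n >= 0x80:
            out.append((n & 0x7F) | 0x80)
            n >>= 7
        out.append(n)
    return bytes(out)


def vbyte_decode(data):
    """Inverse of vbyte_encode."""
    numbers = []
    n = 0
    shift = 0
    for b in data:
        n |= (b & 0x7F) << shift
        if b & 0x80:
            shift += 7          # more bytes follow for this number
        else:
            numbers.append(n)   # last byte of this number
            n = 0
            shift = 0
    return numbers


def compress_ids(ids):
    """Hypothetical helper: a set of triple IDs -> compact byte string."""
    return vbyte_encode(gap_encode(ids))


def decompress_ids(data):
    """Hypothetical helper: byte string -> sorted list of triple IDs."""
    return gap_decode(vbyte_decode(data))


if __name__ == "__main__":
    version = {1010, 1000, 1003, 1001}   # made-up triple IDs of one version
    blob = compress_ids(version)
    print(gap_encode(version), len(blob), decompress_ids(blob))
```

Small gaps fit in a single byte, so how identifiers are assigned affects the compressed size: sets whose identifiers are close together compress better. This is presumably why the abstract evaluates different identifier assignment techniques.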

research
11/19/2019

Extending General Compact Querieable Representations to GIS Applications

The raster model is commonly used for the representation of images in ma...
research
02/21/2018

RStore: A Distributed Multi-version Document Store

We address the problem of compactly storing a large number of versions (...
research
02/21/2018

Managing and Querying Multi-versioned Documents using a Distributed Key-Value Store

We address the problem of compactly storing a large number of versions (...
research
08/03/2020

Failure Probability Analysis for Partial Extraction from Invertible Bloom Filters

Invertible Bloom Filter (IBF) is a data structure, which employs a small...
research
11/20/2022

Graceful Forgetting II. Data as a Process

Data are rapidly growing in size and importance for society, a trend mot...
research
08/28/2019

Techniques for Inverted Index Compression

The data structure at the core of large-scale search engines is the inve...
research
06/20/2020

Coconut: a scalable bottom-up approach for building data series indexes

Many modern applications produce massive amounts of data series that nee...
