Analyzing the Download Time of Availability Codes

12/20/2019
by Mehmet Fatih Aktas, et al.

Availability codes have recently been proposed to facilitate efficient storage, management, and retrieval of frequently accessed data in distributed storage systems. Such codes provide multiple disjoint recovery groups for each data object, which makes it possible for multiple users to access the same object in a non-overlapping way. However, in the presence of server-side performance variability, downloading an object using a recovery group takes longer than using a single server hosting the object. Therefore, it is not immediately clear whether availability codes reduce the latency of accessing hot data. Accordingly, the goal of this paper is to analyze, using a queueing-theoretic approach, the download time in storage systems that employ availability codes. For data access, we consider the widely adopted Fork-Join model with redundancy. In this model, each request arrival splits into multiple copies and completes as soon as any one of the copies finishes service. We first carry out the analysis under the low-traffic regime, in which the system contains at most one download request at any time. In this setting, we compare the download time in systems with availability, maximum distance separable (MDS), and replication codes. Our results indicate that availability codes can reduce download time in some settings, but are not always optimal. When the low-traffic assumption does not hold, the system consists of multiple inter-dependent Fork-Join queues operating in parallel, which makes the exact analysis intractable. For this case, we present upper and lower bounds on the download time. These bounds yield insight into system performance with respect to varying popularities of the stored objects. We also derive an M/G/1 queue approximation for the system, and show with simulations that it performs well in estimating the actual system performance.
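The low-traffic setting described above can be illustrated with a small Monte Carlo sketch. This is not the paper's model or code: it assumes i.i.d. exponential service times with rate `mu`, one server holding the object directly, and `t` disjoint recovery groups of `r` servers each, where a group finishes only when all of its servers finish. The names `download_time`, `mean_download_time`, `t`, `r`, and `mu` are hypothetical and chosen for illustration.

```python
import random


def download_time(rng, t, r, mu=1.0):
    """Sample one download under the assumed low-traffic model.

    The request is forked to the server holding the object and to t
    recovery groups. A group of r servers completes when its slowest
    server completes; the request completes when the first copy does.
    """
    direct = rng.expovariate(mu)  # single server hosting the object
    groups = [
        max(rng.expovariate(mu) for _ in range(r))  # slowest server in a group
        for _ in range(t)
    ]
    return min([direct] + groups)


def mean_download_time(t, r, mu=1.0, n=100_000, seed=0):
    """Estimate the mean download time from n independent samples."""
    rng = random.Random(seed)  # fixed seed for reproducibility
    return sum(download_time(rng, t, r, mu) for _ in range(n)) / n


if __name__ == "__main__":
    print("no redundancy      :", mean_download_time(t=0, r=2))
    print("one group, r = 2   :", mean_download_time(t=1, r=2))
    print("two groups, r = 2  :", mean_download_time(t=2, r=2))
```

Under these assumptions, a single recovery group with `r = 2` on its own has a mean completion time of 1.5/`mu`, which is slower than the direct server at 1/`mu`. Racing it against the direct server still lowers the mean, to 2/(3 `mu`) for `t = 1`. This is one simple instance of the trade-off the abstract describes, and it says nothing about the queueing effects that arise outside the low-traffic regime.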


research
04/17/2018

Simplex Queues for Hot-Data Download

In cloud storage systems, hot data is usually replicated over multiple n...
research
05/06/2021

Download time analysis for distributed storage systems with node failures

We consider a distributed storage system which stores several hot (popul...
research
07/06/2018

Faster Data-access in Large-scale Systems: Network-scale Latency Analysis under General Service-time Distributions

In cloud storage systems with a large number of servers, files are typic...
research
09/18/2018

Local Reconstruction Codes: A Class of MDS-PIR Capacity-Achieving Codes

We prove that a class of distance-optimal local reconstruction codes (LR...
research
06/14/2023

RAID Organizations for Improved Reliability and Performance: A Not Entirely Unbiased Tutorial

This is a followup to the 1994 tutorial by Berkeley RAID researchers who...
research
10/20/2016

Non-Asymptotic Delay Bounds for Multi-Server Systems with Synchronization Constraints

Multi-server systems have received increasing attention with important i...
research
06/04/2020

Access-optimal Linear MDS Convertible Codes for All Parameters

In large-scale distributed storage systems, erasure codes are used to ac...
