Bandwidth Cost of Code Conversions in Distributed Storage: Fundamental Limits and Optimal Constructions

08/28/2020
by Francisco Maturana, et al.

Erasure codes have become an integral part of distributed storage systems as a tool for providing data reliability and durability under the constant threat of device failures. In such systems, an [n, k] code over a finite field 𝔽_q encodes k message symbols into n codeword symbols from 𝔽_q, which are then stored on n different nodes in the system. Recent work has shown that significant savings in storage space can be obtained by tuning n and k to variations in device failure rates. Such tuning necessitates code conversion: the process of converting data already encoded under an initial [n^I, k^I] code to its equivalent under a final [n^F, k^F] code. The default approach to conversion is to re-encode the data, which places a significant burden on system resources. Convertible codes are a recently proposed class of codes for enabling resource-efficient conversions. Existing work on convertible codes has focused on minimizing access cost, i.e., the number of code symbols accessed during conversion. Bandwidth, which corresponds to the amount of data read and transferred, is another important resource to optimize. In this paper, we initiate the study of the fundamental limits on the bandwidth used during code conversion and present constructions for bandwidth-optimal convertible codes. First, we model the code conversion problem using network information flow graphs with variable-capacity edges. Second, focusing on MDS codes and an important parameter regime called the merge regime, we derive tight lower bounds on the bandwidth cost of conversion. The derived bounds show that bandwidth cost can be significantly reduced even in regimes where access cost cannot be reduced compared to the default approach. Third, we present a new construction for MDS convertible codes that matches the derived lower bound and is thus bandwidth-optimal during conversion.
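
To make the cost of the default approach concrete, the following minimal sketch counts the symbols touched when conversion is done by full re-encoding. It is illustrative only and rests on assumptions not stated in the abstract: the merge regime is taken to mean k^F = λ·k^I for an integer λ ≥ 2 (λ initial codewords merged into one final codeword), the code is assumed systematic so that data symbols stay in place, and the function name `default_conversion_cost` is hypothetical.

```python
def default_conversion_cost(n_i, k_i, n_f, k_f):
    """Count symbols read/written by naive re-encoding (assumed merge regime).

    Assumptions (not from the abstract): k_f = lam * k_i with integer lam >= 2,
    and a systematic code whose data symbols are left untouched on their nodes.
    """
    if not (0 < k_i < n_i and 0 < k_f < n_f):
        raise ValueError("expected 0 < k < n for both codes")
    if k_f % k_i != 0 or k_f // k_i < 2:
        raise ValueError("merge regime assumed: k_f must be lam * k_i, lam >= 2")

    lam = k_f // k_i          # number of initial codewords being merged
    r_f = n_f - k_f           # number of parity symbols in the final codeword

    # Re-encoding needs every message symbol: k_i symbols from each codeword.
    symbols_read = lam * k_i
    # Only the new parities are written under the systematic assumption.
    symbols_written = r_f

    return {
        "initial_codewords": lam,
        "read": symbols_read,
        "written": symbols_written,
    }


# Example: merging two [7, 4] codewords into one [11, 8] codeword.
print(default_conversion_cost(7, 4, 11, 8))
```

Under these assumptions, the default approach reads as many symbols as there are message symbols in the final codeword; this is the baseline against which access cost and bandwidth cost are compared.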

research
05/13/2022

Bandwidth Cost of Code Conversions in the Split Regime

Distributed storage systems must store large amounts of data over long p...
research
07/30/2019

Convertible Codes: Efficient Conversion of Coded Data in Distributed Storage

Large-scale distributed storage systems typically use erasure codes to p...
research
06/04/2020

Access-optimal Linear MDS Convertible Codes for All Parameters

In large-scale distributed storage systems, erasure codes are used to ac...
research
08/13/2023

Locally repairable convertible codes with optimal access costs

Modern large-scale distributed storage systems use erasure codes to prot...
research
01/16/2019

An Exponential Lower Bound on the Sub-Packetization of MSR Codes

An (n,k,ℓ)-vector MDS code is a F-linear subspace of (F^ℓ)^n (for some f...
research
05/25/2020

Update Bandwidth for Distributed Storage

In this paper, we consider the update bandwidth in distributed storage s...
research
09/07/2022

Explicit Low-Bandwidth Evaluation Schemes for Weighted Sums of Reed-Solomon-Coded Symbols

Motivated by applications in distributed storage, distributed computing,...
