Alpha Entanglement Codes: Practical Erasure Codes to Archive Data in Unreliable Environments

by Vero Estrada-Galinanes, et al.

Data centres that use consumer-grade disk drives and distributed peer-to-peer systems are unreliable environments in which to archive data without enough redundancy. Most redundancy schemes are not completely effective at providing high availability, durability and integrity in the long term. We propose alpha entanglement codes, a mechanism that creates a virtual layer of highly interconnected storage devices to propagate redundant information across a large-scale storage system. Our motivation is to design flexible and practical erasure codes with high fault tolerance to improve data durability and availability even in catastrophic scenarios. By flexible and practical, we mean code settings that can be adapted to future requirements and practical implementations with reasonable trade-offs between security, resource usage and performance. The codes have three parameters. Alpha increases storage overhead linearly but increases the possible paths to recover data exponentially. Two other parameters increase fault tolerance even further without the need for additional storage. As a result, an entangled storage system can provide high availability and durability and offer additional integrity: it is more difficult to modify data undetectably. We evaluate how several redundancy schemes perform in unreliable environments and show that alpha entanglement codes are flexible and practical codes. Remarkably, they excel at code locality; hence, they reduce repair costs and become less dependent on storage locations with poor availability. Our solution outperforms Reed-Solomon codes in many disaster recovery scenarios.
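
The abstract does not spell out the construction, so the following is only a minimal sketch of the general idea of entangling data with previously stored parities. It assumes the simplest possible setting: a single chain in which each new parity is the XOR of a data block and the previous parity. The function names (`entangle_chain`, `repair_data`) and the all-zero starting parity are hypothetical choices made for illustration; they are not taken from the paper, and the full codes with their three parameters are more elaborate than this.

```python
from typing import List


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("blocks must have equal length")
    return bytes(x ^ y for x, y in zip(a, b))


def entangle_chain(data_blocks: List[bytes], block_size: int) -> List[bytes]:
    """Build a single entanglement chain (illustrative assumption).

    parities[0] is an all-zero block; parities[i + 1] = data[i] XOR parities[i].
    """
    parities = [bytes(block_size)]  # hypothetical all-zero starting parity
    for block in data_blocks:
        parities.append(xor_bytes(block, parities[-1]))
    return parities


def repair_data(parities: List[bytes], i: int) -> bytes:
    """Recover data block i from the two parities adjacent to it in the chain."""
    return xor_bytes(parities[i], parities[i + 1])


# Usage: lose data block 1, then rebuild it from two neighbouring parities.
data = [b"abcd", b"efgh", b"ijkl"]
parities = entangle_chain(data, block_size=4)
recovered = repair_data(parities, 1)
```

In this sketch a lost data block is rebuilt from only two parity blocks, which gives a feel for the locality property the abstract highlights. Adding further chains per data block would add storage and more recovery paths, which is one way to read the role of alpha, though that reading is an assumption here.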







Wedge-Lifted Codes

We define wedge-lifted codes, a variant of lifted codes, and we study th...

Investigating the Reliability in Three RAID Storage Models and Effect of Ordering Replicas on Disks

One of the most important parts of cloud computing is storage devices, a...

Rack-Aware Regenerating Codes for Data Centers

Erasure coding is widely used for massive storage in data centers to ach...

Simplex Queues for Hot-Data Download

In cloud storage systems, hot data is usually replicated over multiple n...

A Range Matching CAM for Hierarchical Defect Tolerance Technique in NRAM Structures

Due to the small size of nanoscale devices, they are highly prone to pro...

On the data persistency of replicated erasure codes in distributed storage systems

This paper studies the fundamental problem of data persistency for a gen...

Codes for Distributed Storage

This chapter deals with the topic of designing reliable and efficient co...