Harmonia: Near-Linear Scalability for Replicated Storage with In-Network Conflict Detection

by Hang Zhu, et al.

Distributed storage employs replication to mask failures and improve availability. However, these systems typically exhibit a hard tradeoff between consistency and performance. Ensuring consistency introduces coordination overhead, and as a result the system throughput does not scale with the number of replicas. We present Harmonia, a replicated storage architecture that exploits the capability of new-generation programmable switches to obviate this tradeoff by providing near-linear scalability without sacrificing consistency. To achieve this goal, Harmonia detects read-write conflicts in the network, which enables any replica to serve reads for objects with no pending writes. Harmonia implements this functionality at line rate, thus imposing no performance overhead. We have implemented a prototype of Harmonia on a cluster of commodity servers connected by a Barefoot Tofino switch, and have integrated it with Redis. We demonstrate the generality of our approach by supporting a variety of replication protocols, including primary-backup, chain replication, Viewstamped Replication, and NOPaxos. Experimental results show that Harmonia improves the throughput of these protocols by up to 10X for a replication factor of 10, providing near-linear scalability up to the limit of our testbed.
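The routing rule described in the abstract (any replica may serve a read for an object with no pending writes) can be illustrated with a minimal sketch. This is not the switch data-plane implementation and not the authors' code. The class `ConflictDetector`, its method names, the per-object pending-write counter, and the choice to send conflicting reads to a single designated `primary` are all assumptions made for illustration; the abstract does not specify these details.

```python
import random


class ConflictDetector:
    """Toy model of in-network read-write conflict detection (hypothetical names)."""

    def __init__(self, replicas, primary, seed=None):
        if primary not in replicas:
            raise ValueError("primary must be one of the replicas")
        self.replicas = list(replicas)
        self.primary = primary          # assumed target for writes and conflicting reads
        self.pending = {}               # object key -> number of in-flight writes
        self._rng = random.Random(seed)

    def on_write(self, key):
        """Record a pending write and route it to the primary."""
        self.pending[key] = self.pending.get(key, 0) + 1
        return self.primary

    def on_write_complete(self, key):
        """Clear one pending write once the replication protocol has finished it."""
        remaining = self.pending.get(key, 0) - 1
        if remaining > 0:
            self.pending[key] = remaining
        else:
            self.pending.pop(key, None)

    def on_read(self, key):
        """Send reads with a pending write to the primary, all others to any replica."""
        if self.pending.get(key, 0) > 0:
            return self.primary
        return self._rng.choice(self.replicas)


if __name__ == "__main__":
    detector = ConflictDetector(["r0", "r1", "r2"], primary="r0", seed=1)
    detector.on_write("x")
    print(detector.on_read("x"))   # "x" has a pending write, so this prints r0
    detector.on_write_complete("x")
    print(detector.on_read("x"))   # no pending write, so any replica may be chosen
```

In this sketch, reads that do not conflict with a pending write are spread over all replicas, which is the source of the read scalability the abstract claims, while conflicting reads take the conservative path so that consistency is preserved.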




