RepNet: Cutting Tail Latency in Data Center Networks with Flow Replication

07/04/2014
by   Shuhao Liu, et al.

Data center networks need to provide low latency, especially at the tail, as demanded by many interactive applications. To improve tail latency, existing approaches require modifications to switch hardware and/or end-host operating systems, making them difficult to deploy. We present the design, implementation, and evaluation of RepNet, an application-layer transport that can be deployed today. RepNet exploits the fact that only a few paths among many are congested at any moment in the network, and applies simple flow replication to mice flows to opportunistically use a less congested path. RepNet has two designs for flow replication: (1) RepSYN, which replicates only SYN packets and uses the first connection that finishes TCP handshaking for data transmission, and (2) RepFlow, which replicates the entire mouse flow. We implement RepNet on Node.js, one of the most commonly used platforms for networked interactive applications. Node's single-threaded event loop and non-blocking I/O make flow replication highly efficient. Performance evaluation on a real network testbed and in Mininet shows that RepNet reduces the tail latency of mice flows, as well as application completion times, by more than 50%.
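
The RepSYN idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' Node.js implementation: it is written in Python with asyncio for brevity, and the function name `rep_syn_connect` and its `replicas` and `connect` parameters are hypothetical names chosen for this illustration. It assumes that opening several connections to the same destination gives the network a chance to route them over different paths; the sketch opens the connections concurrently, keeps the first one whose handshake completes, and discards the rest.

```python
import asyncio


async def rep_syn_connect(host, port, replicas=2, connect=asyncio.open_connection):
    """Race `replicas` connection attempts and keep the first that succeeds.

    `connect` is a hypothetical hook (default: asyncio.open_connection) that
    returns a (reader, writer) pair once the TCP handshake has completed.
    """
    pending = {asyncio.ensure_future(connect(host, port)) for _ in range(replicas)}
    winner = None
    first_error = None

    while pending and winner is None:
        done, pending = await asyncio.wait(
            pending, return_when=asyncio.FIRST_COMPLETED
        )
        for task in done:
            error = task.exception()
            if error is not None:
                # Remember the failure, but keep waiting for other replicas.
                first_error = first_error or error
            elif winner is None:
                winner = task.result()
            else:
                # A second handshake finished in the same round: close it.
                _, extra_writer = task.result()
                extra_writer.close()

    # Abandon the slower attempts that are still handshaking.
    for task in pending:
        task.cancel()

    if winner is None:
        raise first_error or ConnectionError("all replicated connections failed")
    return winner
```

A caller would use the returned `(reader, writer)` pair exactly as it would use the result of a single `asyncio.open_connection` call. Replicating the whole flow, as RepFlow does, would instead send the same data on every connection and accept whichever copy completes first; that variant is not shown here.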
