SFC: Near-Source Congestion Signaling and Flow Control

04/30/2023
by Yanfang Le, et al.

On their own, state-of-the-art congestion control algorithms for data centers do not cope well with transient congestion and high traffic bursts. To address this, we revisit the concept of direct backward feedback from switches and propose Back-to-Sender (BTS) signaling to many concurrent incast senders. Combining it with our novel approach to in-network caching, we achieve near-source, sub-RTT congestion signaling. Source Flow Control (SFC) combines these two simple signaling mechanisms to instantly pause traffic sources, thereby avoiding the head-of-line blocking problem of conventional hop-by-hop flow control. Our prototype system and scale simulations demonstrate that near-source signaling can significantly reduce the message completion time of various workloads in the presence of incast, complementing existing congestion control algorithms. Our results show that SFC can reduce 99th-percentile flow completion times by 1.2-6× and peak switch buffer usage by 2-3× compared to recent incast solutions.
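
The idea of pausing sources directly from a congested switch can be illustrated with a toy model. The sketch below is not the authors' implementation: the single-switch topology, the tick-based clock, the queue threshold, the pause-time formula, and the names `Sender` and `simulate` are all assumptions made for illustration. It compares peak queue depth during an incast with and without a backward pause signal sent straight to the sender.

```python
import math
from dataclasses import dataclass


@dataclass
class Sender:
    """Hypothetical traffic source that sends one packet per tick unless paused."""
    name: str
    paused_until: int = 0  # first tick at which sending is allowed again

    def can_send(self, now: int) -> bool:
        return now >= self.paused_until

    def pause(self, now: int, ticks: int) -> None:
        # The signal arrives after this tick's packet has left, so the
        # pause covers the next `ticks` ticks.
        self.paused_until = max(self.paused_until, now + 1 + ticks)


def simulate(num_senders: int, ticks: int, threshold: int,
             drain_per_tick: int, signaling: bool) -> int:
    """Return the peak queue depth (in packets) at a single congested switch.

    Assumed model: when `signaling` is True and a packet arrives while the
    queue is above `threshold`, the switch tells that packet's sender to
    pause for roughly the time needed to drain the excess.
    """
    senders = [Sender(f"s{i}") for i in range(num_senders)]
    queue = 0
    peak = 0
    for now in range(ticks):
        for sender in senders:
            if not sender.can_send(now):
                continue
            queue += 1  # packet arrives at the congested port
            if signaling and queue > threshold:
                excess = queue - threshold
                sender.pause(now, math.ceil(excess / drain_per_tick))
        peak = max(peak, queue)
        queue = max(0, queue - drain_per_tick)  # switch forwards packets
    return peak


if __name__ == "__main__":
    base = simulate(8, 50, threshold=4, drain_per_tick=2, signaling=False)
    paused = simulate(8, 50, threshold=4, drain_per_tick=2, signaling=True)
    print(f"peak queue without signaling: {base}")
    print(f"peak queue with source pause: {paused}")
```

In this toy model the queue grows without bound when eight senders share a port that drains two packets per tick, while the pause signal keeps the peak queue bounded. The numbers it produces say nothing about the results reported above; they only show the qualitative effect of pausing sources instead of buffering in the network.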

research
09/22/2019

Backpressure Flow Control

Effective congestion control in a multi-tenant data center is becoming i...
research
01/11/2022

Congestion Control Mechanisms for Inter-Datacenter Networks

Applications running in geographically distributed setting are becoming ...
research
07/22/2022

Impact of RoCE Congestion Control Policies on Distributed Training of DNNs

RDMA over Converged Ethernet (RoCE) has gained significant attraction fo...
research
07/29/2013

RepFlow: Minimizing Flow Completion Times with Replicated Flows in Data Centers

Short TCP flows that are critical for many interactive applications in d...
research
05/28/2018

Dart: Divide and Specialize for Fast Response to Congestion in RDMA-based Datacenter Networks

Though Remote Direct Memory Access (RDMA) promises to reduce datacenter ...
research
06/21/2018

Revisiting Network Support for RDMA

The advent of RoCE (RDMA over Converged Ethernet) has led to a significa...
research
08/09/2023

GraphCC: A Practical Graph Learning-based Approach to Congestion Control in Datacenters

Congestion Control (CC) plays a fundamental role in optimizing traffic i...