High-Quality Fault Resiliency in Fat Trees

11/23/2022
by John Gliksberg, et al.

Coupling regular topologies with optimised routing algorithms is key in pushing the performance of interconnection networks of supercomputers. In this paper we present Dmodc, a fast deterministic routing algorithm for Parallel Generalised Fat-Trees (PGFTs) which minimises congestion risk even under massive network degradation caused by equipment failure. Dmodc computes forwarding tables with a closed-form arithmetic formula by relying on a fast preprocessing phase. This allows complete re-routing of networks with tens of thousands of nodes in less than a second. In turn, this greatly helps centralised fabric management react to faults with high-quality routing tables and no impact on running applications in current and future very large-scale HPC clusters.
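To give a feel for what a closed-form forwarding-table computation can look like, here is a minimal sketch of a generic destination-modulo rule for the upward direction of a fat-tree. It is not the Dmodc formula from the paper, and it ignores faults and the preprocessing phase entirely. The function names, the `up_ports_per_level` parameter, and the convention that level 0 is the leaf-switch level are all assumptions made for this illustration.

```python
from math import prod


def up_port(dest: int, level: int, up_ports_per_level: list[int]) -> int:
    """Return the up-port index used to reach `dest` from a switch at `level`.

    Hypothetical convention: level 0 is the leaf-switch level and
    up_ports_per_level[l] is the number of up-ports on a level-l switch.
    """
    # Dividing by the product of the lower levels' up-port counts makes
    # each level spread destinations over a different "digit" of dest.
    divider = prod(up_ports_per_level[:level])
    return (dest // divider) % up_ports_per_level[level]


def forwarding_table(level: int, num_dests: int,
                     up_ports_per_level: list[int]) -> list[int]:
    """Build the upward part of a forwarding table: one port per destination."""
    return [up_port(d, level, up_ports_per_level) for d in range(num_dests)]


if __name__ == "__main__":
    ports = [2, 2]  # assumed: two up-ports on each of two switch levels
    print(forwarding_table(0, 8, ports))
    print(forwarding_table(1, 8, ports))
```

Because each entry depends only on the destination identifier and a few per-level constants, a table of this kind can be filled in one pass with no path search, which is the general property that makes closed-form routing fast to recompute.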

research
11/21/2022

High-Quality Fault-Resiliency in Fat-Tree Networks (Extended Abstract)

Coupling regular topologies with optimized routing algorithms is key in ...
research
11/21/2022

Node-Type-Based Load-Balancing Routing for Parallel Generalized Fat-Trees

High-Performance Computing (HPC) clusters are made up of a variety of no...
research
03/04/2023

Electrical Flows for Polylogarithmic Competitive Oblivious Routing

Oblivious routing is a well-studied distributed paradigm that uses stati...
research
08/20/2020

An In-Depth Analysis of the Slingshot Interconnect

The interconnect is one of the most critical components in large scale c...
research
09/17/2019

Mitigating Network Noise on Dragonfly Networks through Application-Aware Routing

System noise can negatively impact the performance of HPC systems, and t...
research
01/30/2020

Routing-Led Placement of VNFs in Arbitrary Networks

The ever increasing demand for computing resources has led to the creati...
research
04/19/2018

VeriTable: Fast Equivalence Verification of Multiple Large Forwarding Tables

Due to network practices such as traffic engineering and multi-homing, t...
