Mitigating Network Noise on Dragonfly Networks through Application-Aware Routing

09/17/2019
by Daniele De Sensi, et al.

System noise can degrade the performance of HPC systems, and the interconnection network is one of the main contributors to this problem. To mitigate it, adaptive routing sends packets on non-minimal paths when those paths are less congested. However, while this can reduce interference caused by congestion, it also generates more traffic because packets traverse additional hops, which in turn causes congestion for other applications and for the application itself. In this paper, we first describe how to estimate network noise. Following these guidelines, we show that noise can be reduced by using routing algorithms that select minimal paths with higher probability. We exploit this knowledge to design an algorithm that changes the probability of selecting minimal paths according to the application's characteristics. We validate our solution on microbenchmarks and real-world applications on two systems that rely on a Dragonfly interconnection network, showing noise reduction and performance improvement.
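
The idea of selecting minimal paths with a higher probability can be illustrated with a toy model. The sketch below is not the algorithm from the paper: the function names, the additive `minimal_bias` parameter, and the uniformly random load values are all assumptions made for illustration. It only shows that adding a bias in favor of the minimal path raises the fraction of packets routed minimally.

```python
import random


def choose_path(minimal_load, non_minimal_load, minimal_bias):
    """Pick a path from two congestion estimates (hypothetical cost model).

    The non-minimal path is chosen only if it still looks less congested
    after `minimal_bias` is added to its estimated load.
    """
    if minimal_load <= non_minimal_load + minimal_bias:
        return "minimal"
    return "non-minimal"


def minimal_fraction(minimal_bias, trials=10_000, seed=0):
    """Fraction of packets routed minimally under random synthetic loads.

    Loads are drawn uniformly from [0, 10]; this is an assumption for the
    toy model, not a measurement of any real network.
    """
    rng = random.Random(seed)
    picks = sum(
        choose_path(rng.uniform(0, 10), rng.uniform(0, 10), minimal_bias) == "minimal"
        for _ in range(trials)
    )
    return picks / trials


if __name__ == "__main__":
    for bias in (0.0, 2.0, 4.0, 10.0):
        print(f"bias={bias:4.1f} -> minimal fraction {minimal_fraction(bias):.3f}")
```

An application-aware policy in this toy setting would amount to choosing `minimal_bias` per application, for example a larger bias for applications that suffer from the extra traffic of non-minimal routes. How the paper actually derives that choice is not stated in the abstract.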

research
12/14/2020

Application-aware Congestion Mitigation for High-Performance Computing Systems

High-performance computing (HPC) systems frequently experience congestio...
research
11/20/2020

Hop-Constrained Oblivious Routing

We prove the existence of an oblivious routing scheme that is poly(log n...
research
03/19/2023

Efficient deadlock avoidance for 2D mesh NoCs that use OQ or VOQ routers

Network-on-chips (NoCs) are currently a widely used approach for achievi...
research
06/22/2023

Analysing Mechanisms for Virtual Channel Management in Low-Diameter networks

To interconnect their growing number of servers, current supercomputers ...
research
12/28/2018

Stability of Adversarial Routing with Feedback

We consider the impact of scheduling disciplines on performance of routi...
research
11/23/2022

High-Quality Fault Resiliency in Fat Trees

Coupling regular topologies with optimised routing algorithms is key in ...
research
10/10/2019

Remote Control: A Simple Deadlock Avoidance Scheme for Modular System on Chip

The increase in design cost and complexity have motivated designers to a...
