Network Partitioning and Avoidable Contention

05/28/2020
by Yishai Oltchik, et al.

Network contention frequently dominates the run time of parallel algorithms and limits scaling performance. Most previous studies mitigate or eliminate contention through one of several approaches: communication-minimizing algorithms; hotspot-avoiding routing schemes; topology-aware task mapping; or improving global network properties, such as bisection bandwidth, edge-expansion, partitioning, and network diameter. In practice, however, parallel jobs often use only a fraction of a host system, which raises the question: how do processor allocation policies affect contention within a partition? We use edge-isoperimetric analysis of network graphs to determine whether a network partition has optimal internal bisection. Increasing the internal bisection allows more efficient use of network resources, reducing or completely eliminating link contention. We first study torus networks and characterize the partition geometries that maximize internal bisection bandwidth. We then examine the allocation policies of Mira and JUQUEEN, the two largest publicly accessible Blue Gene/Q torus-based supercomputers. Our analysis demonstrates that the bisection bandwidth of their current partitions can often be improved by changing the partitions' geometries. Such changes can yield up to a 2x speedup for contention-bound workloads. Benchmarking experiments validate the predictions. Our analysis also applies to the allocation policies of other networks.
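
To make the dependence of internal bisection on partition geometry concrete, the following is a minimal sketch and not the paper's edge-isoperimetric analysis. It assumes a partition that is itself a fully wrapped torus, and it estimates the bisection by counting the links severed when the partition is halved by a cut perpendicular to its longest dimension. The function name `torus_bisection_links` and the example geometries are hypothetical and chosen only for illustration.

```python
from math import prod


def torus_bisection_links(dims):
    """Estimate the bisection (in links) of a fully wrapped torus partition.

    Assumption: the partition is halved by a cut perpendicular to its
    longest dimension, and that dimension is even and at least 4.
    """
    longest = max(dims)
    if longest < 4 or longest % 2:
        raise ValueError("sketch assumes an even longest dimension of at least 4")
    # Halving a ring severs it in two places (one direct, one wrap-around),
    # and there are prod(dims) / longest parallel rings along that dimension.
    return 2 * prod(dims) // longest


# Three hypothetical geometries with the same node count (512).
for geometry in [(8, 8, 8), (16, 8, 4), (32, 4, 4)]:
    print(geometry, torus_bisection_links(geometry))
```

Under these assumptions, the cube-like 8 x 8 x 8 geometry has twice as many links across the cut as 16 x 8 x 4 and four times as many as 32 x 4 x 4, even though all three contain the same number of nodes. This is the kind of geometric effect the abstract describes.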


