Efficient Intra-Rack Resource Disaggregation for HPC Using Co-Packaged DWDM Photonics

01/09/2023
by George Michelogiannakis, et al.

The diversity of workload requirements and increasing hardware heterogeneity in emerging high performance computing (HPC) systems motivate resource disaggregation. Disaggregation allows compute and memory resources to be allocated to each workload individually, as required. However, it is unclear how to realize these benefits while cost-effectively meeting the stringent bandwidth and latency requirements of HPC applications. To that end, we describe how modern photonic components can be co-designed with modern HPC racks to implement flexible intra-rack resource disaggregation and fully meet the bit error rate (BER) and high escape bandwidth requirements of all chip types in modern HPC racks with negligible power overhead. Our photonic-based disaggregated rack provides an average application speedup of 11 benchmarks compared to a similar system that instead uses modern electronic switches for disaggregation. Using observed resource usage from a production system, we estimate that an iso-performance intra-rack disaggregated HPC system using photonics would require 4x fewer memory modules and 2x fewer NICs than a non-disaggregated baseline.
