FFT, FMM, and Multigrid on the Road to Exascale: performance challenges and opportunities

10/28/2018
by Huda Ibeid, et al.

FFT, FMM, and multigrid methods are widely used, fast, and highly scalable solvers for elliptic PDEs. However, emerging large-scale computing systems introduce challenges beyond those of current petascale computers. Recent efforts have identified several constraints on the design of exascale software, including massive concurrency, resilience management, exploiting the high performance of heterogeneous systems, energy efficiency, and using the deeper and more complex memory hierarchy expected at exascale. In this paper, we perform a model-based comparison of the FFT, FMM, and multigrid methods in the context of these projected constraints. In addition, we use performance models to predict the expected performance on upcoming exascale system configurations based on current technology trends.
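
As a rough illustration of what a model-based comparison can look like, the sketch below contrasts only the textbook arithmetic costs: roughly 5 N log2 N flops for a radix-2 complex FFT, and a cost linear in N for FMM and multigrid. It is not the performance model used in the paper. The function names and the per-point constant of 100 flops are hypothetical, chosen purely for illustration.

```python
import math


def fft_flops(n: int) -> float:
    """Conventional flop estimate for a radix-2 complex FFT of size n."""
    return 5.0 * n * math.log2(n)


def linear_flops(n: int, flops_per_point: float) -> float:
    """Flop estimate for an O(N) method such as FMM or multigrid.

    flops_per_point is a hypothetical constant, not a measured value.
    """
    return flops_per_point * n


def crossover_size(flops_per_point: float) -> float:
    """Problem size at which 5 N log2 N equals flops_per_point * N."""
    return 2.0 ** (flops_per_point / 5.0)


if __name__ == "__main__":
    assumed_cost = 100.0  # hypothetical flops per point for an O(N) solver
    for exponent in (10, 20, 30):
        n = 2 ** exponent
        ratio = fft_flops(n) / linear_flops(n, assumed_cost)
        print(f"N = 2^{exponent}: FFT / O(N) flop ratio = {ratio:.2f}")
    print(f"crossover at N = {crossover_size(assumed_cost):.0f}")
```

Flop counts alone leave out communication, memory traffic, resilience, and energy, which are the constraints the abstract singles out, so a sketch like this is at most a starting point for such a comparison.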

