HPC AI500: A Benchmark Suite for HPC AI Systems

by ZiHan Jiang, et al.
The University of Texas at Austin
Hunan University
University at Buffalo

In recent years, deep learning (DL) has been increasingly applied in high-performance scientific computing, and the unique characteristics of emerging DL workloads in HPC raise great challenges in designing and implementing HPC AI systems. The community needs a new yardstick for evaluating future HPC systems. In this paper, we propose HPC AI500, a benchmark suite for evaluating HPC systems that run scientific DL workloads. The suite covers the most representative scientific fields, and each workload is based on a real-world scientific DL application. Currently, we choose 14 scientific DL benchmarks from the perspectives of application scenarios, data sets, and software stacks. We propose a set of metrics for comprehensively evaluating HPC AI systems, considering accuracy and performance as well as power and cost. We provide a scalable reference implementation of HPC AI500. HPC AI500 is part of the open-source AIBench project; the specification and source code are publicly available from <http://www.benchcouncil.org/AIBench/index.html>.


