Deploying Scientific AI Networks at Petaflop Scale on Secure Large Scale HPC Production Systems with Containers

05/20/2020
by   David Brayford, et al.
0

There is an ever-increasing need for computational power to train complex artificial intelligence (AI) machine learning (ML) models to tackle large scientific problems. High performance computing (HPC) resources are required to efficiently compute and scale complex models across tens of thousands of compute nodes. In this paper, we discuss the issues associated with the deployment of machine learning frameworks on large scale secure HPC systems and how we successfully deployed a standard machine learning framework on a secure large scale HPC production system, to train a complex three-dimensional convolutional GAN (3DGAN), with petaflop performance. 3DGAN is an example from the high energy physics domain, designed to simulate the energy pattern produced by showers of secondary particles inside a particle detector on various HPC systems.

READ FULL TEXT
POST COMMENT

Comments

There are no comments yet.

Authors

page 1

page 2

page 3

page 4

05/24/2019

Deploying AI Frameworks on Secure HPC Systems with Containers

The increasing interest in the usage of Artificial Intelligence techniqu...
07/14/2021

Higgs Boson Classification: Brain-inspired BCPNN Learning with StreamBrain

One of the most promising approaches for data analysis and exploration o...
11/23/2020

Integrating Deep Learning in Domain Sciences at Exascale

This paper presents some of the current challenges in designing deep lea...
12/05/2019

Merlin: Enabling Machine Learning-Ready HPC Ensembles

With the growing complexity of computational and experimental facilities...
03/02/2022

Hyperparameter optimization of data-driven AI models on HPC systems

In the European Center of Excellence in Exascale computing "Research on ...
12/09/2021

Is Disaggregation possible for HPC Cognitive Simulation?

Cognitive simulation (CogSim) is an important and emerging workflow for ...
06/09/2021

StreamBrain: An HPC Framework for Brain-like Neural Networks on CPUs, GPUs and FPGAs

The modern deep learning method based on backpropagation has surged in p...
This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.