
Exascale Deep Learning for Climate Analytics

by Thorsten Kurth, et al.
Oak Ridge National Laboratory · UC Berkeley · Berkeley Lab

We extract pixel-level masks of extreme weather patterns using variants of Tiramisu and DeepLabv3+ neural networks. We describe improvements to the software frameworks, input pipeline, and the network training algorithms necessary to efficiently scale deep learning on the Piz Daint and Summit systems. The Tiramisu network scales to 5300 P100 GPUs with a sustained throughput of 21.0 PF/s and a parallel efficiency of 79.0%. The DeepLabv3+ network scales to 27360 V100 GPUs with a sustained throughput of 325.8 PF/s and a parallel efficiency of 90.7% in single precision. By taking advantage of the FP16 Tensor Cores, a half-precision version of the DeepLabv3+ network achieves a peak and sustained throughput of 1.13 EF/s and 999.0 PF/s respectively.
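The parallel-efficiency figures quoted above relate measured aggregate throughput to ideal linear scaling. A minimal sketch of that arithmetic, assuming the standard weak-scaling definition (the per-GPU baseline below is inferred from the reported Tiramisu numbers, not stated in the abstract):

```python
def parallel_efficiency(measured_throughput, n_workers, per_worker_baseline):
    """Weak-scaling efficiency: measured throughput over ideal linear scaling."""
    ideal = n_workers * per_worker_baseline
    return measured_throughput / ideal

# Back out the implied per-GPU baseline from the Tiramisu run
# (21.0 PF/s sustained on 5300 P100 GPUs at 79.0% efficiency):
per_gpu = 21.0e15 / (5300 * 0.79)   # ~5.0 TF/s per P100 GPU

# Plugging it back in recovers the reported efficiency by construction:
eff = parallel_efficiency(21.0e15, 5300, per_gpu)   # 0.79
```

The same relation lets a reader sanity-check the DeepLabv3+ numbers (325.8 PF/s on 27360 V100 GPUs at 90.7%) against an inferred single-GPU rate.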


