A Hybrid Parallelization Approach for Distributed and Scalable Deep Learning

04/11/2021
by Samson B. Akintoye, et al.

Recently, Deep Neural Networks (DNNs) have achieved great success in medical and other complex classification tasks. However, as the sizes of DNN models and of the available datasets grow, training becomes more complex and computationally intensive, and usually takes longer to complete. In this work, we propose a generic, full end-to-end hybrid parallelization approach that combines model and data parallelism for efficient, distributed, and scalable training of DNN models. We also propose a Genetic Algorithm based heuristic Resource Allocation mechanism (GABRA) that optimally distributes model partitions across the available GPUs to optimize computing performance. We apply the proposed approach to a real use case: a 3D Residual Attention Deep Neural Network (3D-ResAttNet) for efficient Alzheimer's Disease (AD) diagnosis, trained on multiple GPUs. The experimental evaluation shows that the proposed approach is efficient and scalable, achieving an almost linear speedup with little or no difference in accuracy compared with existing non-parallel DNN models.
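
The abstract names two techniques that can be illustrated with short sketches. First, a minimal sketch of the model-parallel half of a hybrid scheme in PyTorch, assuming a toy two-stage network split across two devices; the device names, layer sizes, and the `TwoStageNet` class are illustrative assumptions, not the paper's 3D-ResAttNet implementation. In a full hybrid setup, each stage would additionally be replicated to process different shards of the batch (data parallelism).

```python
import torch
import torch.nn as nn

# Model parallelism sketch: a toy two-stage network split across two devices.
# Falls back to CPU so the example runs on machines without two GPUs.
dev0 = "cuda:0" if torch.cuda.device_count() > 0 else "cpu"
dev1 = "cuda:1" if torch.cuda.device_count() > 1 else "cpu"

class TwoStageNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.stage1 = nn.Sequential(nn.Linear(64, 128), nn.ReLU()).to(dev0)
        self.stage2 = nn.Linear(128, 10).to(dev1)

    def forward(self, x):
        h = self.stage1(x.to(dev0))
        return self.stage2(h.to(dev1))  # activations cross the device boundary

model = TwoStageNet()
out = model(torch.randn(8, 64))  # one mini-batch through both stages
```

Second, a minimal sketch of a genetic-algorithm-based partition allocator in the spirit of GABRA: assign N model partitions to G GPUs so that the estimated per-GPU load is balanced. The cost model, fitness function, and GA hyperparameters below are assumptions for illustration; the paper's actual GABRA formulation may differ.

```python
import random

def fitness(assignment, costs, num_gpus):
    """Negative load imbalance across GPUs: higher is better (balanced = 0)."""
    loads = [0.0] * num_gpus
    for part, gpu in enumerate(assignment):
        loads[gpu] += costs[part]
    return -(max(loads) - min(loads))

def crossover(a, b):
    """Single-point crossover of two partition-to-GPU assignments."""
    point = random.randrange(1, len(a))
    return a[:point] + b[point:]

def mutate(assignment, num_gpus, rate=0.1):
    """Randomly reassign each partition to a new GPU with probability `rate`."""
    return [random.randrange(num_gpus) if random.random() < rate else g
            for g in assignment]

def gabra_sketch(costs, num_gpus, pop_size=50, generations=200):
    n = len(costs)
    population = [[random.randrange(num_gpus) for _ in range(n)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=lambda a: fitness(a, costs, num_gpus), reverse=True)
        survivors = population[: pop_size // 2]  # elitist selection
        children = [mutate(crossover(random.choice(survivors),
                                     random.choice(survivors)), num_gpus)
                    for _ in range(pop_size - len(survivors))]
        population = survivors + children
    return max(population, key=lambda a: fitness(a, costs, num_gpus))

if __name__ == "__main__":
    # Hypothetical per-partition compute costs: 8 partitions over 4 GPUs.
    costs = [4.0, 2.5, 3.0, 1.5, 2.0, 3.5, 1.0, 2.5]
    print(gabra_sketch(costs, num_gpus=4))
```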

Related research

09/08/2018
Efficient and Robust Parallel DNN Training through Model Parallelism on Multi-GPU Platform
The training process of Deep Neural Network (DNN) is compute-intensive, ...

04/19/2021
An Oracle for Guiding Large-Scale Model/Hybrid Parallel Training of Convolutional Neural Networks
Deep Neural Network (DNN) frameworks use distributed training to enable ...

04/16/2020
TensorOpt: Exploring the Tradeoffs in Distributed DNN Training with Auto-Parallelism
A good parallelization strategy can significantly improve the efficiency...

11/12/2019
HyPar-Flow: Exploiting MPI and Keras for Scalable Hybrid-Parallel DNN Training using TensorFlow
The enormous amount of data and computation required to train DNNs have ...

11/20/2021
HeterPS: Distributed Deep Learning With Reinforcement Learning Based Scheduling in Heterogeneous Environments
Deep neural networks (DNNs) exploit many layers and a large number of pa...

01/18/2022
An efficient and flexible inference system for serving heterogeneous ensembles of deep neural networks
Ensembles of Deep Neural Networks (DNNs) have achieved qualitative predic...

07/22/2022
Layer-Wise Partitioning and Merging for Efficient and Scalable Deep Learning
Deep Neural Network (DNN) models are usually trained sequentially from o...
