A Generic Performance Model for Deep Learning in a Distributed Environment

05/19/2023
by   Tulasi Kavarakuntla, et al.
0

Performance modelling of a deep learning application is essential to improve and quantify the efficiency of the model framework. However, existing performance models are mostly case-specific, with limited capability for the new deep learning frameworks/applications. In this paper, we propose a generic performance model of an application in a distributed environment with a generic expression of the application execution time that considers the influence of both intrinsic factors/operations (e.g. algorithmic parameters/internal operations) and extrinsic scaling factors (e.g. the number of processors, data chunks and batch size). We formulate it as a global optimization problem and solve it using regularization on a cost function and differential evolution algorithm to find the best-fit values of the constants in the generic expression to match the experimentally determined computation time. We have evaluated the proposed model on three deep learning frameworks (i.e., TensorFlow, MXnet, and Pytorch). The experimental results show that the proposed model can provide accurate performance predictions and interpretability. In addition, the proposed work can be applied to any distributed deep neural network without instrumenting the code and provides insight into the factors affecting performance and scalability.

READ FULL TEXT

page 1

page 5

page 8

page 9

research
09/04/2019

Performance Analysis and Comparison of Distributed Machine Learning Systems

Deep learning has permeated through many aspects of computing/processing...
research
11/16/2017

Performance Modeling and Evaluation of Distributed Deep Learning Frameworks on GPUs

Deep learning frameworks have been widely deployed on GPU servers for de...
research
10/21/2018

Runtime Concurrency Control and Operation Scheduling for High Performance Neural Network Training

Training neural network often uses a machine learning framework such as ...
research
07/16/2018

Scheduling Computation Graphs of Deep Learning Models on Manycore CPUs

For a deep learning model, efficient execution of its computation graph ...
research
04/04/2020

Predicting Respiratory Anomalies and Diseases Using Deep Learning Models

In this paper, robust deep learning frameworks are introduced, aims to d...
research
08/14/2018

CosmoFlow: Using Deep Learning to Learn the Universe at Scale

Deep learning is a promising tool to determine the physical model that d...
research
11/12/2020

Empirical Performance Analysis of Conventional Deep Learning Models for Recognition of Objects in 2-D Images

Artificial Neural Networks, an essential part of Deep Learning, are deri...

Please sign up or login with your details

Forgot password? Click here to reset