PROFET: Profiling-based CNN Training Latency Prophet for GPU Cloud Instances

08/10/2022
by Sungjae Lee, et al.

Training a Convolutional Neural Network (CNN) model typically requires significant computing power, and cloud computing resources are widely used as a training environment. However, because cloud services evolve quickly, it is difficult for CNN algorithm developers to keep up with system updates and apply them to their training environments. It is therefore important for cloud computing service vendors to design and deliver an optimal training environment for various training tasks, reducing the system operation and management overhead borne by algorithm developers. To achieve this goal, we propose PROFET, which predicts the training latency of arbitrary CNN implementations across various Graphics Processing Unit (GPU) devices, minibatch sizes, and input image pixel sizes, to help develop a cost-effective and time-efficient cloud training environment. Unlike previous training latency prediction work, PROFET does not rely on the implementation details of the CNN architecture, which makes it suitable for use in a public cloud environment. Thorough evaluations and demonstrations show the practicality of PROFET, which improves accuracy by 35% and 84% over two prior approaches, the second of which is MLPredict.
