Predicting the Computational Cost of Deep Learning Models

11/28/2018
by   Daniel Justus, et al.

Deep learning is rapidly becoming a go-to tool for many artificial intelligence problems due to its ability to outperform other approaches, and even humans, at many tasks. Despite its popularity, we are still unable to accurately predict the time required to train a deep learning network to solve a given problem. This training time can be seen as the product of the training time per epoch and the number of epochs needed to reach the desired level of accuracy. Some work has been carried out on predicting the training time per epoch, most of it based on the assumption that training time is linearly related to the number of floating point operations required. However, this assumption breaks down in practice, and the discrepancy grows in cases where other activities start to dominate the execution time, such as loading data from memory or losses due to non-optimal parallel execution. In this work we propose an alternative approach in which we train a deep learning network to predict the execution time of individual parts of a deep learning network. Timings for these individual parts can then be combined to predict the execution time of the whole network. This has advantages over linear approaches, as it can model more complex scenarios and predict execution times for configurations unseen in the training data. Our approach can therefore be used not only to infer the execution time for a batch or an entire epoch, but also to support a well-informed choice of hardware and model.
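The combination step described above, summing predicted per-layer times into batch, epoch, and full-training estimates, can be sketched as follows. This is a minimal illustration, not the paper's implementation: the per-layer predictor here is a hypothetical linear model over FLOPs and memory traffic with made-up coefficients, whereas the paper trains a neural network for this role.

```python
import numpy as np

# Hypothetical per-layer feature vectors: (batch_size, input_dim, output_dim).
# A real predictor would also include hardware descriptors (GPU model, etc.).
layers = [
    (32, 784, 512),
    (32, 512, 256),
    (32, 256, 10),
]

def predict_layer_time(features, weights):
    """Stand-in for a trained per-layer execution-time predictor.

    Uses a simple linear model over two cost drivers (FLOPs and memory
    traffic) purely for illustration; in the paper this predictor is
    itself a deep learning model.
    """
    batch, d_in, d_out = features
    flops = 2.0 * batch * d_in * d_out            # multiply-accumulate count
    mem = batch * (d_in + d_out) + d_in * d_out   # activations + parameters
    x = np.array([1.0, flops, mem])               # bias term + cost drivers
    return float(x @ weights)                     # predicted time (ms)

# Hypothetical coefficients, as if fitted on benchmark timings.
w = np.array([0.05, 1e-8, 2e-7])

# Combine per-layer predictions into batch-, epoch-, and training-level estimates.
time_per_batch = sum(predict_layer_time(f, w) for f in layers)
batches_per_epoch = 60_000 // 32   # e.g. an MNIST-sized dataset
epochs = 30
total_training_time = time_per_batch * batches_per_epoch * epochs
```

The same summation could be repeated for several candidate hardware configurations (different weight vectors per device) to compare predicted training times before committing to a platform.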

