Optimising the Performance of Convolutional Neural Networks across Computing Systems using Transfer Learning

10/20/2020
by Rik Mulder et al.

The choice of convolutional routines (primitives) used to implement neural networks has a tremendous impact on their inference performance (execution speed) on a given hardware platform. To optimise a neural network by primitive selection, the optimal primitive must be identified for each layer of the network. This process requires a lengthy profiling stage that iterates over all available primitives for each layer configuration, measuring their execution times on the target platform. Because each primitive exploits the hardware in a different way, new profiling is needed to obtain the best performance when moving to another platform. In this work, we propose to replace this prohibitively expensive profiling stage with a machine-learning-based performance model, which drastically speeds up optimisation. After training, our performance model can estimate the performance of convolutional primitives for any layer configuration. The time to optimise the execution of large neural networks via primitive selection is reduced from hours to just seconds. Our performance model is also easily transferable to other target platforms: we demonstrate this by training it on an Intel platform and performing transfer learning to AMD and ARM processor devices with minimal profiled samples.
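The workflow the abstract describes can be sketched as follows: train a performance model on profiled (layer, primitive, runtime) samples, then select the fastest predicted primitive per layer instead of profiling every combination. This is a minimal illustrative sketch only; the primitive names, layer features, and the 1-nearest-neighbour estimator are assumptions for demonstration, not the paper's actual model.

```python
# Sketch of performance-model-based primitive selection, replacing
# exhaustive per-layer profiling. PRIMITIVES, featurise() and the
# 1-nearest-neighbour estimator are illustrative assumptions.
import math

PRIMITIVES = ["im2col_gemm", "winograd", "direct", "fft"]  # hypothetical set

def featurise(layer):
    # Encode a layer configuration as a numeric feature vector.
    return (layer["c_in"], layer["c_out"], layer["kernel"], layer["image"])

class PerfModel:
    """Estimates a primitive's runtime for a layer configuration from
    previously profiled samples (here via 1-nearest neighbour)."""

    def __init__(self):
        self.samples = {p: [] for p in PRIMITIVES}

    def add_profile(self, layer, primitive, runtime_ms):
        # One profiled measurement: this primitive took runtime_ms
        # on this layer configuration on the current platform.
        self.samples[primitive].append((featurise(layer), runtime_ms))

    def predict(self, layer, primitive):
        # Return the runtime of the closest profiled configuration.
        f = featurise(layer)
        return min(self.samples[primitive],
                   key=lambda s: math.dist(s[0], f))[1]

def select_primitives(model, layers):
    # Per layer, pick the primitive with the lowest predicted runtime:
    # model inference takes seconds, versus hours of on-device profiling.
    return [min(PRIMITIVES, key=lambda p: model.predict(layer, p))
            for layer in layers]
```

Under this framing, the transfer-learning step in the abstract corresponds to adapting an already-trained model to a new platform (AMD, ARM) using only a small number of freshly profiled samples, rather than re-profiling every primitive for every layer configuration.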


Related research

- 05/21/2020  TASO: Time and Space Optimization for Memory-Constrained DNN Inference
  Convolutional neural networks (CNNs) are used in many embedded applicati...
- 06/04/2019  Performance Modelling of Deep Learning on Intel Many Integrated Core Architectures
  Many complex problems, such as natural language processing or visual obj...
- 10/21/2020  Performance Prediction for Convolutional Neural Networks in Edge Devices
  Running Convolutional Neural Network (CNN) based applications on edge de...
- 10/03/2017  Optimal DNN Primitive Selection with Partitioned Boolean Quadratic Programming
  Deep Neural Networks (DNNs) require very large amounts of computation bo...
- 03/19/2023  Evaluation of Convolution Primitives for Embedded Neural Networks on 32-bit Microcontrollers
  Deploying neural networks on constrained hardware platforms such as 32-b...
- 08/27/2021  Using Graph Neural Networks to model the performance of Deep Neural Networks
  With the unprecedented proliferation of machine learning software, there...
- 06/07/2019  Lightweight Parallel Foundations: a model-compliant communication layer
  We present the Lightweight Parallel Foundations (LPF), an interoperable ...
