## 1 Introduction

Artificial Neural Networks (ANNs) have recently gained a lot of attention for solving highly non-linear problems such as recognition, classification, and segmentation. The solution is mostly obtained using a network of deep convolutional and/or fully connected layers with many filters in each layer [zurada1992introduction], [han2016eie]. As the size of the network grows, the size of the dataset becomes a key factor in the performance of the trained network. The larger the dataset, the less likely overfitting becomes, assuming sufficient variance exists in the data. Training ANNs on a large dataset is computationally expensive if one uses online learning. On the other hand, batch learning requires massive memory, which is often not feasible for today's popular datasets. Mini-batch learning provides a middle ground; however, the final accuracy and convergence time highly depend on properly shuffling the data when filling each mini-batch. Therefore, larger (and deeper) neural networks introduce new challenges: extensive training time, computational cost, and overfitting.

Training time is one of the biggest concerns for today's ANNs, particularly for Deep Neural Networks (DNNs). As the depth of a DNN increases, so does the number of parameters to be updated; as a result, training takes longer. Although modern computer systems with several GPUs are used for training, convergence for deep models such as ResNet [he2016deep], VGG [simonyan2014very], and DenseNet [huang2016deep] can still take multiple weeks on a big dataset (e.g., the ImageNet dataset [krizhevsky2012imagenet]). The computational cost and memory bandwidth demands of training are typically an order of magnitude higher than those of inference, and this overhead grows with the size of the data and the depth of the network. Thus, training DNN models on a large dataset remains a major concern in machine learning. A great number of efforts exist to reduce the training overhead of DNNs, such as PruneTrain [lym2019prunetrain], a cost-efficient technique that uses pruning to reduce the computation cost of training. However, pruning may have side effects, including difficulties in generalization once the network is trained. Modern DNNs learn complex patterns by updating several million parameters while examining the training data. While this allows the models to fit the training dataset, their ability to generalize to unseen data may suffer, a phenomenon known as overfitting. Many methods have been proposed to address overfitting, such as regularization, data augmentation, and Dropout [hinton2012improving].

On the other hand, shallower neural networks can be trained efficiently within a reasonable time without overfitting problems. However, they may not capture complex patterns in the dataset that are essential for applications dealing with large amounts of unseen data. Therefore, there is a dilemma between shallower and deeper neural networks, and many challenging trade-offs in the landscape of machine learning: overfitting, generalization, training time, dataset size, number of parameters, and computational cost.

In this paper, we propose two new training methods that try to find a balance among the above-mentioned trade-offs without degrading the accuracy of the network. Our approach is to keep the general architecture of a deep neural network as is (i.e., without pruning), while selectively choosing the parameters to be updated during training based on the improvement of accuracy, which shows temporal variation. The proposed selective update approaches greatly reduce the time spent on training without reducing the accuracy and without adversely impacting the generalization of the network. We perform experiments on four state-of-the-art convolutional networks (AlexNet, VGG-11, VGG-16, and ResNet-18) with the CIFAR datasets. The results show that, on average, our two proposed methods, called WUS and WUS+LR, reduce the training time by 54% and 50% with average accuracy losses of 0.71% and 0.59%, respectively, on CIFAR-10; and by 43% and 35% with average accuracy losses of 1.05% and 0.81%, respectively, on CIFAR-100. The details of WUS and WUS+LR are discussed in Section 3.

## 2 Background and Motivation

Numerous works have been proposed to improve the training of deep neural networks. One of the earlier works, early stopping [prechelt1998early], attempts to avoid overfitting by stopping the training when the validation error starts to increase. Other popular methods build on the idea that not all units in a deep network need to participate in every training epoch. Dropout [hinton2012improving] randomly removes hidden units by multiplying each unit by a Bernoulli random variable, thereby reducing overfitting by effectively thinning the network. In contrast to these earlier works, our proposed method keeps the network structure intact but reduces the number of weight updates (and eliminates the associated gradient calculations) in backpropagation when the accuracy improvement appears to flatten. In particular, training does not end, as opposed to early stopping; rather, our method exploits bias updates to keep learning going, while minimizing the number of parameter updates during periods in which accuracy seems stagnant. In a sense, our proposed method dynamically adjusts the degree of effort put into weight updates, minimizing the computational burden when it does not pay off, and thus reducing the overall training time while fulfilling accuracy constraints.
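The core mechanism of keeping the structure intact while eliminating weight gradients can be sketched in PyTorch. This is a minimal illustration under our own assumptions, not the paper's implementation; the toy model and the helper name are ours:

```python
import torch
import torch.nn as nn

def set_wus_phase(model: nn.Module, enabled: bool) -> None:
    """Freeze weight tensors (skipping their gradient computation) while
    leaving bias tensors trainable; enabled=False restores normal training."""
    for name, param in model.named_parameters():
        if name.endswith(".weight"):
            param.requires_grad = not enabled
            if enabled:
                param.grad = None  # drop any stale gradient buffer

# Toy network for illustration only.
model = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 2))
set_wus_phase(model, enabled=True)

# In the WUS phase only the bias parameters remain trainable.
trainable = [n for n, p in model.named_parameters() if p.requires_grad]
```

Because autograd never builds weight-gradient buffers for frozen tensors, the backward pass skips that work entirely rather than computing and discarding it.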

Stochastic Depth [huang2016deep] uses the dropout idea (omitting hidden nodes); however, it makes the network shallower by removing entire layers at a time, replacing them with the identity function, to reduce training time. FreezeOut [brock2017freezeout] also tries to reduce training time by gradually decreasing the learning rate to zero (i.e., ending training). It maintains a separate learning rate for each layer and reduces the learning rates starting from the first layer and moving to subsequent layers. A layer whose learning rate reaches zero becomes frozen (i.e., it no longer receives parameter updates). FreezeOut achieves a 20% speed-up with a 3% accuracy loss. In contrast, our approach does not maintain a separate learning rate for each layer, nor does it keep layers frozen indefinitely; rather, it selectively allows (and disallows) parameter updates for layers temporarily. Our approach achieves around 50% speed-up with less than 1% accuracy loss.

Batch Normalization [ioffe2015batch] significantly accelerates training by normalizing the mean and variance of each hidden layer over each mini-batch. PruneTrain [lym2019prunetrain] uses three optimization techniques to accelerate training: lasso regularization, channel union, and dynamic mini-batch adjustment. It applies pruning from the start of training to reduce the cost of computation and memory accesses. Eyeriss [chen2016eyeriss] uses a row-stationary dataflow on a spatial architecture to reduce the energy consumption of convolutional neural networks. All these efforts have their own merit and are orthogonal to our approach.

The goal of this paper is to reduce training time and the pressure on the computational resources available in the system. The intuition behind the proposed selective weight update methods is as follows. We observe that the values of the weights tend to remain almost the same once training reaches a certain point. Fig. 1 shows the distribution of weight and bias values and their gradients for the eighth layer (which has 2,359,296 weights and 512 biases) of VGG-11 during 200 epochs of training. In Fig. 1, the values of the weights remain almost the same after 15 epochs. The bias values, on the other hand, keep changing for many more epochs. Therefore, we find that by updating the biases alone during the epochs in which the weights change at a minuscule scale (compared to earlier epochs), one can achieve almost the same accuracy. The intuition is that the weights converge sooner than the biases. Therefore, by skipping weight updates (and their associated gradient calculations) and updating the biases alone, we can reduce the training time without sacrificing accuracy and without adversely impacting the generalization of the network (as the number of trainable parameters remains intact).
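The stagnation described above can be quantified with a simple relative-change metric between weight snapshots taken at consecutive epochs. The sketch below is illustrative; the tensors and the late-training drift scale are hypothetical, not measurements from Fig. 1:

```python
import torch

def relative_weight_change(prev: torch.Tensor, curr: torch.Tensor) -> float:
    """Mean absolute change of a weight tensor between two epoch snapshots,
    normalized by the mean magnitude of the earlier snapshot."""
    denom = prev.abs().mean().clamp_min(1e-12)  # guard against all-zero weights
    return (curr - prev).abs().mean().div(denom).item()

# Hypothetical snapshots: after the early epochs the per-epoch updates shrink
# to a minuscule scale, which is the signal that motivates skipping them.
torch.manual_seed(0)
w_epoch_5 = torch.randn(512, 512)
w_epoch_6 = w_epoch_5 + 1e-4 * torch.randn(512, 512)  # tiny late-training drift
change = relative_weight_change(w_epoch_5, w_epoch_6)
```

A monitor like this could run once per epoch per layer; when the metric stays near zero for a while, weight updates in that layer are candidates for skipping.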

## 3 Weight Update Skipping (WUS)

The proposed method, called Weight Update Skipping (WUS), is based on the observation that once training reaches a certain point, the accuracy often sees only minimal improvement from updating the weights (as the weights themselves show minuscule changes). During such periods, the accuracy can still be improved by updating the biases alone while skipping weight updates. Thus, WUS effectively reduces the training time, as it does not perform the gradient calculations needed for weight updates during such periods. Updating the biases alone may increase the accuracy for a while, but eventually it is no longer sufficient. At that point, WUS falls back to normal training, in which both weights and biases are updated. This approach effectively defines two distinct phases for the training process: i) the normal training phase (both weights and biases are updated), and ii) the WUS phase (only biases are updated). During training, we may switch between these two phases based on the accuracy improvement we observe over time.

To effectively reduce training time without adversely impacting accuracy, we have to figure out: i) when we should switch from the normal training phase to the WUS phase for the first time; ii) when we should switch between the WUS phase and the normal training phase to keep training moving (i.e., improving the accuracy); and iii) which biases should be updated (and which weights kept the same) during the WUS phase. Each of these decisions is a function of the dataset and the model being trained, so no static decision can be made. We therefore monitor the accuracy improvement over time and check when it starts to lose its momentum (which is generally characterized by minuscule weight updates).

### 3.1 Identifying Initial Epoch to Switch to WUS

We use the standard deviation of the validation accuracy to decide when to switch from the normal training phase to the WUS phase. We monitor the validation accuracy and compute its standard deviation. When two successive epochs exhibit standard deviations of the validation accuracy less than a predetermined threshold (in our evaluations we use a value of 0.71 for this threshold, obtained empirically over different models and datasets), we choose the next epoch as the initial epoch for switching from the normal training phase to the WUS phase.
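This trigger can be sketched as follows. Only the 0.71 threshold and the two-successive-epochs condition come from the text; the sliding-window length used to compute each epoch's standard deviation is our own illustrative choice:

```python
from statistics import pstdev

def should_switch_to_wus(acc_history, window=5, threshold=0.71):
    """Return True when the sliding-window standard deviations of the
    validation accuracy fall below `threshold` for two successive epochs.
    `window` is an illustrative choice; 0.71 is the paper's empirical threshold."""
    if len(acc_history) < window + 1:
        return False
    std_prev = pstdev(acc_history[-window - 1:-1])  # window ending one epoch ago
    std_curr = pstdev(acc_history[-window:])        # window ending this epoch
    return std_prev < threshold and std_curr < threshold

# Accuracy still climbing steeply: large spread, no switch yet.
early = [40.0, 52.0, 61.0, 67.0, 71.0, 74.0]
# Accuracy flattening out: both recent windows have small spread.
late = early + [74.5, 74.8, 74.6, 74.9, 74.7, 74.8]
```

With a history like `late`, the epoch after the trigger fires would become the initial epoch for the first switch to the WUS phase.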

### 3.2 Switching Between Phases

As a variant of the WUS approach, we also propose another method that uses the learning rate scheduler to decide when to switch back to the normal training phase; in short, we call it Weight Update Skipping with Learning Rate Scheduler (WUS+LR). In the following subsections, we provide a detailed discussion of both WUS and WUS+LR.

#### 3.2.1 Weight Update Skipping (WUS)

After switching from the normal training phase to the WUS phase for the first time, we still need to keep monitoring how the accuracy improves over time. When updating the biases alone (in the WUS phase) is no longer sufficient to keep improving the accuracy, we need to switch back to the normal training phase. Therefore, in the basic WUS variant, if the best validation accuracy does not improve for a certain number of epochs (we empirically set this number to 7), we switch back to the normal training phase for only one epoch to update the weights and biases of the whole network. Algorithm 1 shows how these decisions are made, and Fig. 2 illustrates the switching between phases.

Algorithm 1

```
training():
    // initially WUS = False (i.e., normal training phase)
    ...
    // switch from WUS to Normal
    if (WUS == True) and (epoch > initial_epoch):
        bias/weight.requires_grad = True
        if (epoch - previous_epoch > 1):
            WUS = False
    // switch from Normal to WUS
    if (WUS == False) and (epoch > initial_epoch):
        bias/weight.requires_grad = False
        bias/weight.grad = None
    ...

validation():
    ...
    // check if we need to switch between WUS and normal training phases
    WUS = WeightUpdateSkipping(7, validation_accuracy, 0, epoch)
    ...

WeightUpdateSkipping(patience, accuracy, delta, epoch):
    // patience is 7 and delta is 0 in our evaluations
    if best_accuracy is None:
        best_accuracy = accuracy
    else if accuracy < best_accuracy + delta:
        counter += 1
        if counter >= patience:
            WUS = True
            counter = 0
            previous_epoch = epoch
    else:
        best_accuracy = accuracy
        counter = 0
    return WUS
```

#### 3.2.2 Weight Update Skipping with Learning Rate Scheduler (WUS+LR)

This variant of WUS switches between phases based on the learning rate during training. Recent works advocate using a non-monotonic learning rate instead of a constant value, claiming that this accelerates convergence, and various schemes have been proposed to dynamically change the learning rate during training. In WUS+LR, we switch from the WUS phase back to the normal training phase whenever the learning rate changes.
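Detecting a learning-rate change with a standard PyTorch scheduler can be sketched as below. The `StepLR` schedule mirrors the baseline setup described later in Section 4.1 (initial rate 0.1, divided by 10 every 30 epochs); the per-epoch training work is elided, and the loop only records where WUS+LR would resume normal training:

```python
import torch
from torch.optim.lr_scheduler import StepLR

model = torch.nn.Linear(4, 2)                      # placeholder model
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
scheduler = StepLR(optimizer, step_size=30, gamma=0.1)

switch_epochs = []                                 # epochs where the LR drops
prev_lr = optimizer.param_groups[0]["lr"]
for epoch in range(90):
    # ... one epoch of forward/backward/optimizer.step() would go here ...
    scheduler.step()
    lr = optimizer.param_groups[0]["lr"]
    if lr != prev_lr:                              # LR changed: leave WUS phase
        switch_epochs.append(epoch + 1)
    prev_lr = lr
```

Any scheduler works with this pattern, since the trigger reads the effective rate from the optimizer rather than assuming a particular schedule.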

### 3.3 Choosing Layers To Apply WUS

After identifying the initial epoch for switching to the WUS phase, we need to decide whether to apply WUS to all layers or to some subset of the layers. The obvious tradeoff in this decision is between accuracy and training time. Intuitively, the training time should drop further as we skip weight updates in more layers. On the other hand, the accuracy may not improve enough if an insufficient number of weights is updated. To determine the impact of the number of layers subject to WUS, we performed the following analysis. Starting from the last hidden layer, we applied WUS to more layers in an incremental fashion and monitored both training time and accuracy. For the first configuration, we update only the biases of the last layer during the backward pass (assuming we have switched to the WUS phase), while skipping weight and bias updates for all other layers (down to the very first layer). This is illustrated in Fig. 2(a), where the red lines indicate the biases of the last layer that are updated in the WUS phase. We then repeated the same analysis while updating the biases of the last two layers and skipping all weight and bias updates for the remaining layers (again, down to the very first layer), as illustrated in Fig. 2(b) (as in Fig. 2(a), red lines indicate the biases that are updated; here, the biases of the last two layers). We repeated this analysis until we reached the very first layer of the models used in this paper (i.e., AlexNet, VGG-11, and VGG-16). To differentiate these configurations, we use the labels 1L, 2L, 3L, ..., where 1L indicates the configuration in which only the last layer's biases are updated, 2L indicates the configuration in which only the last two layers' biases are updated during the WUS phase, and so on. The corresponding training time reduction (and validation accuracy) results are shown in Figs. 2(c), 2(d), and 2(e) for AlexNet, VGG-11, and VGG-16, respectively (each experiment is repeated 15 times and averages are reported; the CIFAR-10 dataset is used). It is evident that the more layers in which we skip weight updates, the higher the training time reduction (more than 50% for 1L), while the accuracy remains relatively insensitive to the number of layers subject to weight update skipping. This observation motivates skipping weight updates in more layers during the WUS phase (as the accuracy seems less sensitive) to reduce the training time to the extent possible (as in the case of 1L).
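The kL configurations can be sketched as a helper that freezes everything except the biases of the last k parameterized layers. This is illustrative; the three-layer toy model and the helper name are our own:

```python
import torch.nn as nn

def apply_wus_to_last_k(model: nn.Module, k: int) -> list:
    """During the WUS phase, keep trainable only the biases of the last `k`
    parameterized layers ("kL" in the text): freeze every weight, and freeze
    biases in all earlier layers. Returns the names of trainable parameters."""
    layers = [m for m in model.modules()
              if hasattr(m, "weight") and m.weight is not None]
    keep_bias = {id(m.bias) for m in layers[-k:] if m.bias is not None}
    trainable = []
    for name, p in model.named_parameters():
        p.requires_grad = id(p) in keep_bias
        if p.requires_grad:
            trainable.append(name)
    return trainable

model = nn.Sequential(nn.Linear(8, 8), nn.Linear(8, 8), nn.Linear(8, 2))
names_1l = apply_wus_to_last_k(model, 1)  # "1L": last layer's bias only
names_2l = apply_wus_to_last_k(model, 2)  # "2L": biases of the last two layers
```

Sweeping `k` from 1 up to the depth of the network reproduces the 1L, 2L, 3L, ... configurations compared in Figs. 2(c)-(e).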

## 4 Evaluation

### 4.1 Implementation Details

We empirically evaluated the effectiveness of the proposed Weight Update Skipping (WUS) optimization on several state-of-the-art models (i.e., ResNet-18, VGG-16, VGG-11, and AlexNet) trained on two datasets: CIFAR-10 and CIFAR-100 [krizhevsky2009learning]. The CIFAR-10 dataset consists of 60,000 32-by-32 color images categorized into 10 classes; the training and validation sets consist of 50,000 and 10,000 images, respectively. The CIFAR-100 dataset is similar to CIFAR-10, except that it is categorized into 100 classes. For both datasets, we use the same experimental settings and train without data augmentation. We use batch normalization for the ResNet-18 model. We used an x86_64 system (Intel(R) Xeon(R) CPU E5-2680 v4 @ 2.40GHz) with 10 cores and three cache levels, and the PyTorch framework for all evaluations. All baseline models (i.e., without WUS) are trained with Stochastic Gradient Descent (SGD) with a mini-batch size of 128 for 200 epochs. We use an initial learning rate of 0.1 and divide it by a factor of 10 after 30 epochs. The weight decay is … and the momentum is 0.9 in our evaluations. For the WUS and WUS+LR evaluations, we keep the network structure and all configuration parameters the same as the baseline. All evaluations are repeated 15 times, and we report the averages (both for training time and accuracy).

### 4.2 Training Time

We compare the efficiency of WUS and WUS+LR with the baseline. Table 1 presents the training time (in seconds) of the two proposed techniques and the baseline for both CIFAR-10 and CIFAR-100. These results come from a setting in which a model employs WUS or WUS+LR in the last layer. On average, our proposed methods WUS and WUS+LR reduce the training time (compared to the baseline) by 54% and 50%, respectively, on CIFAR-10; and by 43% and 35%, respectively, on CIFAR-100.

| Model | CIFAR-10 time, s (% reduction) | CIFAR-100 time, s (% reduction) |
|---|---|---|
| VGG-11 (Baseline) | 50239.42 | 54382.03 |
| VGG-11 (WUS) | 20632.78 (58.93%) | 27376.36 (49.66%) |
| VGG-11 (WUS+LR) | 24303.06 (51.6%) | 30826.08 (43.32%) |
| VGG-16 (Baseline) | 71624.71 | 77125.61 |
| VGG-16 (WUS) | 35777.20 (50.05%) | 48480.81 (37.14%) |
| VGG-16 (WUS+LR) | 38238.55 (46.61%) | 50864.71 (34.05%) |
| AlexNet (Baseline) | 27978.22 | 26496.81 |
| AlexNet (WUS) | 11609.07 (58.5%) | 15249.07 (42.45%) |
| AlexNet (WUS+LR) | 12768.27 (54.36%) | 15894.22 (40.01%) |
| ResNet-18 (Baseline) | 114167.84 | 124305.02 |
| ResNet-18 (WUS) | 57665.51 (49.49%) | 71163.48 (42.75%) |
| ResNet-18 (WUS+LR) | 59131.44 (48.21%) | 94423.88 (24.04%) |

### 4.3 Accuracy

The accuracy results for WUS and WUS+LR are shown in Table 2. As with training time, these results come from a setting in which a model employs WUS or WUS+LR in the last layer. As can be seen, WUS and WUS+LR obtain almost the same accuracy as the baseline, with less than a 1% drop on average. One may notice that WUS+LR provides slightly better accuracy than WUS (by 0.18%), although WUS reduces the training time more than WUS+LR (by 6%). The better accuracy of WUS+LR can be explained by its ability to capture the stagnation of accuracy earlier than WUS and to react to it by switching to normal training in a timely manner. However, this ability comes at the cost of slightly lower performance (i.e., higher training time) compared to WUS, as it causes training to spend more time in the normal training phase (and thus perform more weight updates).

Fig. 4 shows how the validation and training accuracy of the four models evolve over time on CIFAR-10 when WUS and WUS+LR are employed. As can be seen for the AlexNet model, the WUS and WUS+LR methods avoid overfitting by decreasing the gap between training and validation accuracy. Furthermore, since these two methods show less fluctuation than the baseline accuracy, they converge faster than baseline training. Similarly, Fig. 5 shows how the validation and training accuracy of the four models evolve over time on CIFAR-100; the very same trends are observed as in the case of CIFAR-10.

| Model | CIFAR-10 accuracy, % (drop) | CIFAR-100 accuracy, % (drop) |
|---|---|---|
| VGG-11 (Baseline) | 91.30 | 67.85 |
| VGG-11 (WUS) | 90.63 (0.67%) | 66.00 (1.85%) |
| VGG-11 (WUS+LR) | 90.75 (0.55%) | 66.36 (1.49%) |
| VGG-16 (Baseline) | 92.68 | 70.82 |
| VGG-16 (WUS) | 91.84 (0.84%) | 69.6 (1.22%) |
| VGG-16 (WUS+LR) | 92.00 (0.68%) | 69.79 (1.03%) |
| AlexNet (Baseline) | 86.86 | 60.27 |
| AlexNet (WUS) | 86.27 (0.59%) | 59.97 (0.3%) |
| AlexNet (WUS+LR) | 86.39 (0.47%) | 60.20 (0.07%) |
| ResNet-18 (Baseline) | 95.25 | 77.39 |
| ResNet-18 (WUS) | 94.51 (0.74%) | 76.56 (0.83%) |
| ResNet-18 (WUS+LR) | 94.58 (0.67%) | 76.73 (0.66%) |

### 4.4 Training Parameters

This section compares the number of parameters that are updated when using different training approaches for different network models and datasets.

The total number of parameter updates over training, say $P_{total}$, can be calculated by adding up the parameters updated in the normal training phase and in the WUS phase. This is shown in Equation 1, where $e_{wus}$ and $e_{normal}$ are the numbers of epochs in which the WUS phase (i.e., only the biases of the last $x$ layers are updated, where $x$ is 1 for 1L, 2 for 2L, and so on) and the normal training phase (i.e., both weights and biases of all layers are updated) are employed, respectively. Likewise, $W$ and $B$ are the numbers of weights and biases in the network, and $B_x$ is the number of biases in the last $x$ layers.

$$P_{total} = e_{wus} \cdot B_x + e_{normal} \cdot (W + B) \qquad (1)$$

Table 3 shows the percentage reduction in updated parameters for the proposed WUS and WUS+LR compared to baseline training on the CIFAR datasets. On CIFAR-10, the number of updated parameters is reduced by 72% on ResNet-18 and by 74% on the other networks when WUS is employed; and by 71% on VGG-16 and ResNet-18, 72% on AlexNet, and 73% on VGG-11 when WUS+LR is employed. Likewise, on CIFAR-100, the number of updated parameters is reduced by 56% on AlexNet, 60% on VGG-16, 65% on ResNet-18, and 76% on VGG-11 when WUS is employed; and by 54% on ResNet-18, 56% on VGG-16, 73% on VGG-11, and 74% on AlexNet when WUS+LR is employed.
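Equation 1 can be evaluated with a few lines of arithmetic. The parameter counts and epoch split below are hypothetical, chosen only to show a reduction in the same ballpark as Table 3, not the paper's measured values:

```python
def updated_parameters(e_wus, e_normal, weights, biases, biases_last_x):
    """Total parameter updates over training (Equation 1): WUS epochs touch
    only the biases of the last x layers; normal epochs touch every weight
    and bias in the network."""
    return e_wus * biases_last_x + e_normal * (weights + biases)

# Hypothetical network and schedule: 9M weights, 5K biases, a 200-epoch run
# that spends 150 epochs in the WUS phase in a "1L"-style setting (10 biases).
baseline = updated_parameters(0, 200, weights=9_000_000, biases=5_000,
                              biases_last_x=0)
wus = updated_parameters(150, 50, weights=9_000_000, biases=5_000,
                         biases_last_x=10)
reduction = 1 - wus / baseline  # fraction of updates avoided
```

With this split, roughly three quarters of all parameter updates are avoided, which is the order of magnitude the table reports.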

This considerable reduction in the number of updated parameters is likely to yield higher energy efficiency (along with the reduced training time), which would allow larger neural networks to fit within a given power budget and eventually either increase accuracy or help tackle more complex problems. We leave the energy efficiency analysis of the proposed training methodology for future work.

| Model | CIFAR-10 WUS | CIFAR-10 WUS+LR | CIFAR-100 WUS | CIFAR-100 WUS+LR |
|---|---|---|---|---|
| VGG-11 | 74% | 73% | 76% | 73% |
| VGG-16 | 74% | 71% | 60% | 56% |
| AlexNet | 74% | 72% | 56% | 74% |
| ResNet-18 | 72% | 71% | 65% | 54% |

## 5 Conclusions

We propose a new training methodology called Weight Update Skipping (WUS), based on the observation that the improvement of accuracy shows temporal variation, motivating us to react to this time-dependent variation by identifying which parameters should be updated. In essence, WUS skips weight updates (along with the corresponding gradient calculations) when the accuracy improvement is minuscule. Although the weights remain the same, the biases are kept updated, ensuring that training still makes forward progress and avoids overfitting. While reducing the training time considerably, WUS keeps the accuracy virtually intact compared to baseline training on various state-of-the-art neural network models on the CIFAR datasets.
