Exploiting Adam-like Optimization Algorithms to Improve the Performance of Convolutional Neural Networks

03/26/2021
by Loris Nanni, et al.

Stochastic gradient descent (SGD) is the main approach for training deep networks: it moves towards the optimum of the cost function by iteratively updating the parameters of a model along the negative gradient of the loss evaluated on a minibatch. Several variants of SGD have been proposed that adapt the step size for each parameter (adaptive gradient) and take previous updates into account (momentum). Among these alternatives, the most popular are AdaGrad, AdaDelta, RMSProp and Adam, which scale the coordinates of the gradient by the square roots of some form of average of the squared coordinates of past gradients, automatically adjusting the learning rate on a per-parameter basis. In this work, we compare Adam-based variants in which the step size of each parameter is adjusted according to the difference between the present and past gradients. We benchmark the proposed methods on several medical image datasets. The experiments use the ResNet50 architecture. Moreover, we test ensembles of networks and their fusion with ResNet50 trained with standard SGD. To combine the set of ResNet50 networks, the simple sum rule is applied. The proposed ensemble obtains very high performance, with accuracy comparable to or better than the current state of the art. To improve reproducibility and research efficiency, the MATLAB source code used for this research is available on GitHub: https://github.com/LorisNanni.
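As a concrete illustration of the Adam-like variants discussed above, the following is a minimal, self-contained sketch of a diffGrad-style update, in which a sigmoid of the difference between the past and present gradients modulates the Adam step. Variable names, the toy loss, and the hyperparameter values are illustrative assumptions, not the authors' released code.

```matlab
% Minimal sketch of a diffGrad-style Adam variant (illustrative only).
lr = 1e-3; beta1 = 0.9; beta2 = 0.999; epsilon = 1e-8;

w = randn(10, 1);                              % toy parameter vector
m = zeros(size(w)); v = zeros(size(w));        % Adam moment estimates
g_prev = zeros(size(w));                       % previous gradient

for t = 1:1000
    g = w;                                     % gradient of the toy loss f(w) = 0.5*||w||^2

    m = beta1 * m + (1 - beta1) * g;           % first moment (momentum)
    v = beta2 * v + (1 - beta2) * (g .^ 2);    % second moment (adaptive step size)
    m_hat = m / (1 - beta1 ^ t);               % bias-corrected estimates
    v_hat = v / (1 - beta2 ^ t);

    % Friction coefficient: sigmoid of the absolute difference between the
    % past and present gradients, so the step shrinks where the gradient
    % changes little (near an optimum) and stays close to Adam elsewhere.
    xi = 1 ./ (1 + exp(-abs(g_prev - g)));

    w = w - lr * (xi .* m_hat) ./ (sqrt(v_hat) + epsilon);
    g_prev = g;                                % keep the gradient for the next step
end
```

The sum-rule fusion mentioned above simply adds (or averages) the class scores produced by the individual networks before taking the argmax over classes, e.g. [~, pred] = max(sum(scores, 3), [], 2); for a samples-by-classes-by-networks score array.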


