Stochastic batch size for adaptive regularization in deep network optimization

04/14/2020
by Kensuke Nakamura, et al.

We propose a first-order stochastic optimization algorithm that incorporates adaptive regularization for machine learning problems in the deep learning framework. The adaptive regularization is imposed by a stochastic process that determines the batch size for each model parameter at each optimization iteration. The stochastic batch size is governed by the update probability of each parameter, which follows the distribution of gradient norms while accounting for their local and global properties in the network architecture, where the range of gradient norms may vary within and across layers. We empirically demonstrate the effectiveness of our algorithm on an image classification task with conventional network models applied to commonly used benchmark datasets. The quantitative evaluation indicates that our algorithm outperforms state-of-the-art optimization algorithms in generalization while being less sensitive to the choice of batch size, which often plays a critical role in optimization, thus achieving greater robustness to the choice of regularity.
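To make the idea in the abstract concrete, the following is a minimal NumPy sketch of one plausible reading: each parameter is updated stochastically, with an update probability derived from the distribution of per-parameter gradient magnitudes, normalized both within each layer (local) and across the whole network (global). The blending weight alpha, the probability floor p_min, the normalization by maximum gradient magnitude, and the toy quadratic objective are illustrative assumptions, not the authors' exact formulation.

```python
import numpy as np

rng = np.random.default_rng(0)

def stochastic_update_mask(grads_per_layer, alpha=0.5, p_min=0.1):
    """Return a {0,1} update mask for each layer's parameters.

    grads_per_layer: list of gradient arrays, one per layer.
    alpha: assumed blend between layer-local and network-global normalization.
    p_min: assumed lower bound on the per-parameter update probability.
    """
    all_mags = np.concatenate([np.abs(g).ravel() for g in grads_per_layer])
    global_max = all_mags.max() + 1e-12
    masks = []
    for g in grads_per_layer:
        mag = np.abs(g)
        local_max = mag.max() + 1e-12
        # Update probability grows with the gradient magnitude, measured
        # relative to the layer (local) and to the whole network (global).
        p = alpha * (mag / local_max) + (1.0 - alpha) * (mag / global_max)
        p = np.clip(p, p_min, 1.0)
        masks.append((rng.random(g.shape) < p).astype(g.dtype))
    return masks

# Toy problem: two "layers" of parameters minimizing a separable quadratic.
params = [rng.normal(size=(4,)), rng.normal(size=(3,))]
targets = [np.zeros(4), np.ones(3)]
lr = 0.1

for step in range(200):
    grads = [2.0 * (w - t) for w, t in zip(params, targets)]  # d/dw (w - t)^2
    masks = stochastic_update_mask(grads)
    # Only the sampled subset of parameters is updated at this iteration,
    # which acts as a stochastic, per-parameter form of regularization.
    params = [w - lr * m * g for w, m, g in zip(params, masks, grads)]

print([np.round(w, 3) for w in params])
```

In this sketch, parameters with large gradients are updated almost every iteration, while parameters with small gradients are updated only occasionally, which mimics the per-parameter stochastic batch-size effect described in the abstract under the stated assumptions.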


Related research

07/23/2019 - Adaptive Regularization via Residual Smoothing in Deep Learning Optimization
07/21/2019 - Adaptive Weight Decay for Deep Neural Networks
02/29/2020 - Conjugate-gradient-based Adam for stochastic optimization and its application to deep learning
02/21/2019 - Interplay Between Optimization and Generalization of Stochastic Gradient Descent with Covariance Noise
05/17/2022 - Hyper-Learning for Gradient-Based Batch Size Adaptation
02/06/2023 - Target-based Surrogates for Stochastic Optimization
02/13/2023 - Symbolic Discovery of Optimization Algorithms
