A Resizable Mini-batch Gradient Descent based on a Randomized Weighted Majority

11/17/2017
by Seong Jin Cho, et al.

Determining the appropriate batch size for mini-batch gradient descent is time consuming, as it typically relies on grid search. This paper considers a resizable mini-batch gradient descent (RMGD) algorithm, inspired by the randomized weighted majority algorithm, that aims to match the best performance attainable by grid search by selecting a batch size at each epoch with a probability defined as a function of its previous successes or failures and the validation error. This probability encourages exploration of different batch sizes early on and exploitation of batch sizes with a history of success later. At each epoch, the RMGD samples a batch size from its probability distribution and uses it for mini-batch gradient descent. After obtaining the validation error for that epoch, the probability distribution is updated to reflect the effectiveness of the sampled batch size. The RMGD thus helps the learning process explore the space of possible batch sizes and exploit successful ones. Experimental results show that the RMGD achieves better performance than the best-performing single batch size, and does so in less time than that batch size requires. Surprisingly, the RMGD also outperforms grid search.
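As a rough illustration of the selection loop described above, the Python sketch below keeps a multiplicative weight per candidate batch size, samples one batch size each epoch, and penalizes it when the validation error fails to improve. The names train_one_epoch and validation_error are hypothetical placeholders for the user's own training and evaluation routines, and the penalty rule is a simplified stand-in for the paper's update; this is a minimal sketch under those assumptions, not the authors' implementation.

import numpy as np

def rmgd_train(model, candidate_batch_sizes, train_one_epoch, validation_error,
               epochs=100, eta=0.5, seed=0):
    """RMGD-style loop: multiplicative weights over candidate batch sizes.

    train_one_epoch(model, batch_size) and validation_error(model) are
    user-supplied callables (hypothetical interface, not from the paper).
    """
    rng = np.random.default_rng(seed)
    weights = np.ones(len(candidate_batch_sizes))  # start with a uniform distribution
    best_val = float("inf")

    for epoch in range(epochs):
        probs = weights / weights.sum()              # current sampling distribution
        idx = rng.choice(len(candidate_batch_sizes), p=probs)
        batch_size = candidate_batch_sizes[idx]

        train_one_epoch(model, batch_size)           # one epoch of mini-batch SGD
        val = validation_error(model)

        if val < best_val:                           # success: keep this size's weight
            best_val = val
        else:                                        # failure: shrink this size's weight
            weights[idx] *= np.exp(-eta)

    return model, weights

Because only the sampled batch size is evaluated in a given epoch, only its weight changes; with a uniform start this gives broad exploration early and increasingly concentrated exploitation of sizes that keep improving the validation error, matching the behavior described in the abstract.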


Related research

MBGDT: Robust Mini-Batch Gradient Descent (06/14/2022)
In high dimensions, most machine learning method perform fragile even th...

Convergence of the mini-batch SIHT algorithm (09/29/2022)
The Iterative Hard Thresholding (IHT) algorithm has been considered exte...

Optimal Mini-Batch Size Selection for Fast Gradient Descent (11/15/2019)
This paper presents a methodology for selecting the mini-batch size that...

Disparity Between Batches as a Signal for Early Stopping (07/14/2021)
We propose a metric for evaluating the generalization ability of deep ne...

Meta-Learning Mini-Batch Risk Functionals (01/27/2023)
Supervised learning typically optimizes the expected value risk function...

Learning Two-Layer Neural Networks, One (Giant) Step at a Time (05/29/2023)
We study the training dynamics of shallow neural networks, investigating...

Mini-batch k-means terminates within O(d/ε) iterations (04/02/2023)
We answer the question: "Does local progress (on batches) imply global p...
