Stochastic Re-weighted Gradient Descent via Distributionally Robust Optimization

06/15/2023
by Ramnath Kumar, et al.

We develop a re-weighted gradient descent technique for boosting the performance of deep neural networks. Our algorithm involves importance weighting of data points during each optimization step. Our approach is inspired by distributionally robust optimization with f-divergences, which is known to result in models with improved generalization guarantees. Our re-weighting scheme is simple, computationally efficient, and can be combined with popular optimization algorithms such as SGD and Adam. Empirically, we demonstrate our approach's superiority on various tasks, including vanilla classification, classification with label imbalance, noisy labels, domain adaptation, and tabular representation learning. Notably, we obtain improvements of +0.7% and +1.44% over state-of-the-art methods on the DomainBed and tabular benchmarks, respectively. Moreover, our algorithm boosts the performance of BERT on the GLUE benchmark by +1.94%. These results demonstrate the effectiveness of the proposed approach, indicating its potential for improving performance in diverse domains.
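
As a minimal sketch of the idea (not the paper's exact algorithm): the KL-divergence instance of f-divergence DRO yields per-example weights w_i proportional to exp(loss_i / tau), i.e., a softmax over the per-example losses, so harder examples are up-weighted at each step. The PyTorch snippet below illustrates this; the function name `reweighted_loss` and the temperature `tau` are illustrative assumptions, not names from the paper.

```python
import torch
import torch.nn.functional as F

def reweighted_loss(logits, targets, tau=1.0):
    """Importance-weighted batch loss, a sketch of KL-DRO re-weighting.

    Weights w_i proportional to exp(loss_i / tau) up-weight hard examples;
    `tau` is an illustrative temperature, not a value from the paper.
    """
    per_example = F.cross_entropy(logits, targets, reduction="none")
    # Softmax over the (detached) per-example losses is the worst-case
    # distribution of KL-regularized DRO; detaching treats the weights
    # as constants, so gradients flow only through the losses themselves.
    weights = torch.softmax(per_example.detach() / tau, dim=0)
    return (weights * per_example).sum()
```

Because the re-weighting only replaces the usual mean over the batch, this loss drops into a standard training loop unchanged and works with any optimizer, e.g. `loss = reweighted_loss(model(x), y); loss.backward(); opt.step()`.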

Related research

03/11/2019 · Gradient Descent based Optimization Algorithms for Deep Learning Models Training
In this paper, we aim at providing an introduction to the gradient desce...

05/13/2014 · Accelerating Minibatch Stochastic Gradient Descent using Stratified Sampling
Stochastic Gradient Descent (SGD) is a popular optimization method which...

01/21/2023 · Genetically Modified Wolf Optimization with Stochastic Gradient Descent for Optimising Deep Neural Networks
When training Convolutional Neural Networks (CNNs) there is a large emph...

10/30/2019 · Lsh-sampling Breaks the Computation Chicken-and-egg Loop in Adaptive Stochastic Gradient Estimation
Stochastic Gradient Descent or SGD is the most popular optimization algo...

05/25/2022 · Mirror Descent Maximizes Generalized Margin and Can Be Implemented Efficiently
Driven by the empirical success and wide use of deep neural networks, un...

04/07/2020 · Weighted Aggregating Stochastic Gradient Descent for Parallel Deep Learning
This paper investigates the stochastic optimization problem with a focus...

12/03/2019 · Improving upon NBA point-differential rankings
For some time, point-differential has been thought to be a better predic...
