Multiplicative update rules for accelerating deep learning training and increasing robustness

07/14/2023
by   Manos Kirtas, et al.

Even today, when Deep Learning (DL) has achieved state-of-the-art performance in a wide range of research domains, accelerating training and building robust DL models remain challenging tasks. To this end, generations of researchers have pursued robust methods for training DL architectures that are less sensitive to weight distributions, model architectures, and loss landscapes. However, such methods are limited to adaptive learning rate optimizers, initialization schemes, and gradient clipping, without investigating the fundamental parameter update rule itself. Although multiplicative updates contributed significantly to the early development of machine learning and are backed by strong theoretical guarantees, to the best of our knowledge, this is the first work that investigates them in the context of DL training acceleration and robustness. In this work, we propose an optimization framework that fits a wide range of optimization algorithms and enables alternative update rules to be applied. Within this framework, we propose a novel multiplicative update rule and extend its capabilities by combining it with the traditional additive update term in a novel hybrid update method. We claim that the proposed framework accelerates training while leading to more robust models than the traditionally used additive update rule, and we experimentally demonstrate its effectiveness across a wide range of tasks and optimization methods, ranging from convex and non-convex optimization to difficult image classification benchmarks, using a variety of commonly used optimizers and Deep Neural Network (DNN) architectures.
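The abstract does not give the exact update rule, but the general idea of mixing an additive (gradient-descent) step with a multiplicative (exponentiated-gradient-style) step can be illustrated with a minimal sketch. The function below, its name, and the mixing coefficient `alpha` are assumptions for illustration only, not the paper's actual formulation.

```python
import torch

def hybrid_update(param, grad, lr=0.01, alpha=0.5):
    """Illustrative hybrid parameter update (not the paper's exact rule).

    Combines a classic additive SGD step with a multiplicative,
    exponentiated-gradient-style step; `alpha` is a hypothetical
    mixing coefficient between the two update terms.
    """
    additive = param - lr * grad                    # standard additive step
    multiplicative = param * torch.exp(-lr * grad)  # multiplicative (exponentiated) step
    return alpha * multiplicative + (1 - alpha) * additive


# Usage: manually apply the update to a single weight tensor.
w = torch.randn(10, requires_grad=True)
loss = (w ** 2).sum()
loss.backward()
with torch.no_grad():
    w.copy_(hybrid_update(w, w.grad))
```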


