Accelerating Training in Artificial Neural Networks with Dynamic Mode Decomposition

06/18/2020
by   Mauricio E. Tano, et al.

Training of deep neural networks (DNNs) frequently involves optimizing several million, or even billions, of parameters. Even on modern computing architectures, the computational expense of DNN training can inhibit, for instance, network architecture design optimization, hyper-parameter studies, and integration into scientific research cycles. The key factor limiting performance is that every weight update requires both a feed-forward evaluation of the network and a back-propagation of the error. In this work, we propose a method to decouple the evaluation of the update rule at each weight. First, Proper Orthogonal Decomposition (POD) is used to identify a current estimate of the principal directions along which the weights of each layer evolve, based on the evolution observed over a few backpropagation steps. Then, Dynamic Mode Decomposition (DMD) is used to learn the dynamics of the weights in each layer along these principal directions. The DMD model is used to evaluate an approximate converged state of the network. Afterward, a number of backpropagation steps are performed starting from the DMD estimate, leading to an updated set of principal directions and an updated DMD model. This iterative process is repeated until convergence. By tuning the number of backpropagation steps used for each DMD model estimation, a significant reduction in the number of operations required to train the neural network can be achieved. In this paper, the DMD acceleration method is explained in detail, along with a theoretical justification for the acceleration it provides. The method is illustrated on a regression problem of key interest to the scientific machine learning community: the prediction of a pollutant concentration field in a diffusion-advection-reaction problem.
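To make the procedure concrete, below is a minimal NumPy sketch of one POD/DMD extrapolation step for a single layer's (flattened) weight snapshots. This is not the authors' implementation: the function name dmd_extrapolate, the truncation rank, the forecast horizon, and the damping of spuriously unstable eigenvalues are illustrative assumptions layered on top of the standard exact-DMD construction.

```python
import numpy as np

def dmd_extrapolate(snapshots, rank=10, horizon=1000):
    """Fit a DMD model to flattened weight snapshots and extrapolate ahead.

    snapshots : array of shape (n_params, n_snapshots); each column is the
                layer's weight vector recorded after one backpropagation step.
    rank      : number of POD modes retained (truncation level).
    horizon   : how many steps ahead to forecast the weight dynamics.
    """
    # Shifted snapshot matrices: Y collects the states one step after X.
    X, Y = snapshots[:, :-1], snapshots[:, 1:]

    # POD of the snapshot matrix: the truncated left singular vectors are
    # the principal directions along which the weights evolve.
    U, s, Vh = np.linalg.svd(X, full_matrices=False)
    r = min(rank, len(s))
    Ur, sr, Vr = U[:, :r], s[:r], Vh[:r, :].conj().T

    # Reduced linear operator advancing the weights one step in POD coordinates.
    A_tilde = Ur.conj().T @ Y @ Vr / sr

    # Eigen-decomposition of the reduced operator and exact DMD modes
    # lifted back to the full weight space.
    eigvals, W = np.linalg.eig(A_tilde)
    Phi = Y @ Vr / sr @ W

    # Mode amplitudes fitted to the first snapshot.
    b = np.linalg.lstsq(Phi, snapshots[:, 0].astype(complex), rcond=None)[0]

    # Damp eigenvalues outside the unit circle so the long-horizon forecast
    # cannot blow up (a pragmatic choice, not taken from the paper).
    lam = np.where(np.abs(eigvals) > 1.0, eigvals / np.abs(eigvals), eigvals)

    # Forecast far ahead: contracting modes decay, leaving an estimate of
    # the (approximately) converged weights.
    return (Phi @ (lam ** horizon * b)).real
```

In the scheme described in the abstract, the extrapolated weights would then seed the next short run of backpropagation steps, after which the POD basis and the DMD operator are refitted, and the cycle repeats until convergence.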
