The Underlying Correlated Dynamics in Neural Training

12/18/2022
by Rotem Turjeman, et al.

Training neural networks is computationally intensive. As ever larger networks are trained, understanding and modeling the training dynamics becomes increasingly important. In this work we propose a model based on the correlation of the parameters' dynamics, which dramatically reduces the dimensionality. We refer to our algorithm as correlation mode decomposition (CMD). It splits the parameter space into groups of parameters (modes) that behave in a highly correlated manner over the epochs. With this approach, networks such as ResNet-18, transformers, and GANs, which contain millions of parameters, can be modeled well using just a few modes. We observe that the typical time profile of each mode is spread throughout the network, across all layers. Moreover, our model induces regularization, which yields better generalization on the test set. This representation enhances the understanding of the underlying training dynamics and can pave the way for designing better acceleration techniques.
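
To make the idea of grouping parameters by correlated dynamics concrete, here is a minimal sketch in Python with NumPy. It is not the authors' implementation: the abstract does not specify how modes are selected or fitted, so the function name `correlation_modes`, the greedy choice of reference trajectories, and the affine fit of each parameter to its reference are all assumptions made for illustration.

```python
import numpy as np

def correlation_modes(trajectories, num_modes):
    """Group parameter trajectories into modes by correlation (illustrative only).

    trajectories: array of shape (num_params, num_epochs), one row per parameter.
    Returns mode labels, reference indices, affine coefficients (a, b),
    and the reconstruction a_i * reference(t) + b_i for every parameter.
    """
    W = np.asarray(trajectories, dtype=float)
    # Center and normalize rows so that a dot product equals Pearson correlation.
    centered = W - W.mean(axis=1, keepdims=True)
    norms = np.linalg.norm(centered, axis=1, keepdims=True)
    norms[norms == 0] = 1.0  # avoid division by zero for constant trajectories
    Z = centered / norms

    # Assumption: pick references greedily, starting from parameter 0 and then
    # adding the parameter least correlated with the references chosen so far.
    ref_idx = [0]
    while len(ref_idx) < num_modes:
        best = np.max(np.abs(Z @ Z[ref_idx].T), axis=1)
        ref_idx.append(int(np.argmin(best)))
    ref_idx = np.array(ref_idx)

    # Assign each parameter to the reference with the highest absolute correlation.
    corr = Z @ Z[ref_idx].T  # shape (num_params, num_modes)
    labels = np.argmax(np.abs(corr), axis=1)

    # Assumption: model each parameter as an affine function of its reference,
    # fitted by least squares over the epochs.
    refs = W[ref_idx[labels]]
    ref_centered = refs - refs.mean(axis=1, keepdims=True)
    denom = np.sum(ref_centered ** 2, axis=1)
    denom[denom == 0] = 1.0
    a = np.sum(ref_centered * centered, axis=1) / denom
    b = W.mean(axis=1) - a * refs.mean(axis=1)
    reconstruction = a[:, None] * refs + b[:, None]
    return labels, ref_idx, a, b, reconstruction
```

Under these assumptions, a network with millions of parameters would be summarized by a handful of reference trajectories plus two scalars per parameter, which is one way to read the dimensionality reduction the abstract describes.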

research
03/20/2021

Train Deep Neural Networks in 40-D Subspaces

Although there are massive parameters in deep neural networks, the train...
research
05/16/2019

Reduced-order modeling using Dynamic Mode Decomposition and Least Angle Regression

Dynamic Mode Decomposition (DMD) yields a linear, approximate model of a...
research
08/21/2022

Emergence of hierarchical modes from deep learning

Large-scale deep neural networks consume expensive training costs, but t...
research
07/18/2022

Interpolation, extrapolation, and local generalization in common neural networks

There has been a long history of works showing that neural networks have...
research
02/26/2020

Dimensionality Reduction of Movement Primitives in Parameter Space

Movement primitives are an important policy class for real-world robotic...
research
08/26/2021

Disentangling ODE parameters from dynamics in VAEs

Deep networks have become increasingly of interest in dynamical system p...
research
11/25/2020

Automatic Identification of MHD Modes in Magnetic Fluctuations Spectrograms using Deep Learning Techniques

The control and mitigation of MHD oscillations modes is an open problem ...