InRank: Incremental Low-Rank Learning

06/20/2023
by Jiawei Zhao, et al.

The theory of greedy low-rank learning (GLRL) aims to explain the impressive generalization capabilities of deep learning. It proves that stochastic gradient-based training implicitly regularizes neural networks towards low-rank solutions through a gradual increase of the rank during training. However, a gap remains between theory and practice: GLRL requires an infinitesimal initialization of the weights, which is impractical because it is a saddle point. In this work, we remove the assumption of infinitesimal initialization by focusing on cumulative weight updates. We prove that the cumulative weight updates follow an incremental low-rank trajectory for arbitrary orthogonal initialization of the weights in a three-layer linear network. Empirically, we demonstrate that our theory holds for a broad range of neural networks (e.g., transformers) and standard training algorithms (e.g., SGD, Adam). However, existing training algorithms do not exploit the low-rank property to improve computational efficiency, as the networks are not parameterized in low-rank form. To remedy this, we design a new training algorithm, Incremental Low-Rank Learning (InRank), which explicitly expresses cumulative weight updates as low-rank matrices while incrementally augmenting their ranks during training. We evaluate InRank on GPT-2, and our results indicate that InRank achieves prediction performance comparable to the full-rank counterpart while requiring at most 33% of the total ranks throughout training. We also propose an efficient version of InRank that achieves a reduction of 20% in total training time and 37% in memory usage when training GPT-medium on WikiText-103 from scratch.
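To make the mechanism concrete, here is a minimal PyTorch sketch of the core idea: freeze the initialization W0 and parameterize only the cumulative weight update as a low-rank product U V, growing the rank when the current budget looks saturated. The class name InRankLinear, the initial rank, and the growth heuristic are illustrative assumptions for this sketch, not the paper's exact algorithm.

```python
import torch
import torch.nn as nn

class InRankLinear(nn.Module):
    """Linear layer whose cumulative weight update W - W0 is kept low-rank."""

    def __init__(self, in_features, out_features, init_rank=4):
        super().__init__()
        # Frozen orthogonal initialization; only the low-rank update trains.
        w0 = torch.empty(out_features, in_features)
        nn.init.orthogonal_(w0)
        self.register_buffer("w0", w0)
        # Factors of the cumulative update, started near zero.
        self.u = nn.Parameter(torch.randn(out_features, init_rank) * 1e-3)
        self.v = nn.Parameter(torch.randn(init_rank, in_features) * 1e-3)

    def forward(self, x):
        # Effective weight: W = W0 + U V, so the update W - W0 has rank <= r.
        return x @ (self.w0 + self.u @ self.v).t()

    @torch.no_grad()
    def maybe_grow_rank(self, grow_by=2, threshold=0.9):
        # Illustrative trigger: if the top r-1 singular values of the update
        # no longer explain most of its top-r spectrum, the smallest retained
        # direction is doing real work, so append fresh factor directions.
        r = self.u.shape[1]
        s = torch.linalg.svdvals(self.u @ self.v)[:r]
        if s[: r - 1].sum() / s.sum() < threshold:
            out_f, in_f = self.u.shape[0], self.v.shape[1]
            self.u = nn.Parameter(torch.cat(
                [self.u, 1e-3 * torch.randn(out_f, grow_by, device=self.u.device)], dim=1))
            self.v = nn.Parameter(torch.cat(
                [self.v, 1e-3 * torch.randn(grow_by, in_f, device=self.v.device)], dim=0))
```

Note that calling maybe_grow_rank periodically replaces the Parameter objects, so any optimizer state tied to the old factors must be rebuilt after a growth step. Tracking the singular values of W - W0 over training is also a direct way to check the incremental low-rank claim on one's own models.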

Related research

- Algorithms for Efficiently Learning Low-Rank Neural Networks (02/02/2022): We study algorithms for learning low-rank neural networks – networks whe...
- Transformers learn through gradual rank increase (06/12/2023): We identify incremental learning dynamics in transformers, where the dif...
- projUNN: efficient method for training deep networks with unitary matrices (03/10/2022): In learning with recurrent or very deep feed-forward networks, employing...
- Rank Diminishing in Deep Neural Networks (06/13/2022): The rank of neural networks measures information flowing across layers. ...
- Incremental Task Learning with Incremental Rank Updates (07/19/2022): Incremental Task learning (ITL) is a category of continual learning that...
- On the Initialisation of Wide Low-Rank Feedforward Neural Networks (01/31/2023): The edge-of-chaos dynamics of wide randomly initialized low-rank feedfor...
- Energy-efficient and Robust Cumulative Training with Net2Net Transformation (03/02/2020): Deep learning has achieved state-of-the-art accuracies on several comput...
