projUNN: efficient method for training deep networks with unitary matrices

03/10/2022
by   Bobak Kiani, et al.

In learning with recurrent or very deep feed-forward networks, employing unitary matrices in each layer can be very effective at maintaining long-range stability. However, restricting network parameters to be unitary typically comes at the cost of expensive parameterizations or increased training runtime. We propose instead an efficient method based on rank-k updates, or their rank-k approximation, that maintains performance at a nearly optimal training runtime. We introduce two variants of this method, named Direct (projUNN-D) and Tangent (projUNN-T) projected Unitary Neural Networks, that can parameterize full N-dimensional unitary or orthogonal matrices with a training runtime scaling as O(kN^2). Our method either projects low-rank gradients onto the closest unitary matrix (projUNN-T) or transports unitary matrices in the direction of the low-rank gradient (projUNN-D). Even in the fastest setting (k=1), projUNN can train a model's unitary parameters to performance comparable to baseline implementations. By integrating our projUNN algorithm into both recurrent and convolutional neural networks, our models can closely match or exceed benchmarked results from state-of-the-art algorithms.
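The closest-unitary projection at the heart of these updates has a classical closed form: for a matrix with SVD A = W Σ V^H, the nearest unitary matrix in Frobenius norm is the polar factor W V^H. Below is a minimal NumPy sketch of a projection-style rank-k update in that spirit; the function names are illustrative rather than the paper's API, and for clarity the sketch uses full SVDs (O(N^3)) instead of the O(kN^2) low-rank routines that constitute the paper's actual contribution.

```python
import numpy as np

def closest_unitary(a):
    # Polar projection: for a = W @ diag(s) @ Vh, the unitary matrix
    # closest to `a` in Frobenius norm is W @ Vh.
    w, _, vh = np.linalg.svd(a)
    return w @ vh

def low_rank_approx(g, k):
    # Best rank-k approximation of g via truncated SVD.
    w, s, vh = np.linalg.svd(g, full_matrices=False)
    return (w[:, :k] * s[:k]) @ vh[:k, :]

def projunn_step(u, grad, lr=1e-3, k=1):
    # Hypothetical projection-style update: take a Euclidean step along
    # the rank-k gradient approximation, then project the result back
    # onto the unitary (here, real orthogonal) group.
    return closest_unitary(u - lr * low_rank_approx(grad, k))

# Toy usage: start from a random orthogonal matrix and take one step.
rng = np.random.default_rng(0)
u = closest_unitary(rng.standard_normal((64, 64)))
grad = rng.standard_normal((64, 64))
u = projunn_step(u, grad, lr=1e-2, k=1)
```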

Related research

02/02/2022 · Algorithms for Efficiently Learning Low-Rank Neural Networks
We study algorithms for learning low-rank neural networks – networks whe...

06/20/2023 · InRank: Incremental Low-Rank Learning
The theory of greedy low-rank learning (GLRL) aims to explain the impres...

07/11/2023 · Stack More Layers Differently: High-Rank Training Through Low-Rank Updates
Despite the dominance and effectiveness of scaling, resulting in large n...

07/06/2022 · Scaling Private Deep Learning with Low-Rank and Sparse Gradients
Applying Differentially Private Stochastic Gradient Descent (DPSGD) to t...

01/24/2020 · Low-rank Gradient Approximation For Memory-Efficient On-device Training of Deep Neural Network
Training machine learning models on mobile devices has the potential of ...

11/30/2021 · Pixelated Butterfly: Simple and Efficient Sparse Training for Neural Network Models
Overparameterized neural networks generalize well but are expensive to t...

06/16/2021 · Simultaneous Training of Partially Masked Neural Networks
For deploying deep learning models to lower end devices, it is necessary...
