Maestro: Uncovering Low-Rank Structures via Trainable Decomposition

08/28/2023
by Samuel Horvath, et al.

Deep Neural Networks (DNNs) have been a major driver and enabler of AI breakthroughs in recent years. These models have been growing larger in the pursuit of higher accuracy and new use cases, including AR/VR and intelligent assistants. However, training such large models is costly and time-consuming, and typically yields a single model intended to fit all targets. To mitigate this, various techniques have been proposed in the literature, including pruning, sparsification, and quantization of the model weights and updates. While these can achieve high compression rates, they often incur computational overheads or accuracy penalties. Alternatively, factorization methods have been leveraged to incorporate low-rank compression into the training process. Such techniques (e.g., SVD), however, frequently rely on computationally expensive decompositions of layers and are potentially sub-optimal for non-linear models such as DNNs. In this work, we take a further step toward designing efficient low-rank models and propose Maestro, a framework for trainable low-rank layers. Instead of repeatedly applying a priori decompositions such as SVD, the low-rank structure is built into the training process through a generalized variant of Ordered Dropout, which imposes an importance ordering via sampling on the decomposed DNN structure. Our theoretical analysis demonstrates that our method recovers the SVD decomposition of a linear mapping on uniformly distributed data and PCA for linear autoencoders. We further apply our technique to DNNs and empirically illustrate that Maestro enables the extraction of lower-footprint models that preserve model performance while allowing a graceful accuracy-latency tradeoff for deployment to devices of different capabilities.
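To make the idea concrete, below is a minimal PyTorch sketch, not the authors' implementation, of a linear layer factored into two trainable matrices and trained while sampling how many leading factor columns are kept at each step, so that earlier columns become more important and the factors can later be truncated to a smaller rank. The class name `LowRankLinear`, the uniform rank sampler, and the toy regression objective are illustrative assumptions rather than details taken from the paper.

```python
# Minimal sketch (assumed names, not the authors' code): a low-rank linear
# layer trained with an ordered-dropout-style sampling over ranks.
import torch
import torch.nn as nn


class LowRankLinear(nn.Module):
    """Computes y = x V[:, :r] U[:, :r]^T + b, i.e. W is factored as U V^T."""

    def __init__(self, in_features: int, out_features: int, max_rank: int):
        super().__init__()
        self.U = nn.Parameter(torch.randn(out_features, max_rank) * 0.02)
        self.V = nn.Parameter(torch.randn(in_features, max_rank) * 0.02)
        self.bias = nn.Parameter(torch.zeros(out_features))
        self.max_rank = max_rank

    def forward(self, x, rank=None):
        # Ordered-dropout-style truncation: keep only the first `rank` factor
        # columns, so leading columns are trained more often and carry more
        # of the mapping (an importance ordering over ranks).
        r = self.max_rank if rank is None else rank
        return (x @ self.V[:, :r]) @ self.U[:, :r].T + self.bias


# Toy training loop: sample a rank uniformly at each step (one possible
# sampling scheme; the actual distribution is a design choice).
layer = LowRankLinear(in_features=64, out_features=32, max_rank=16)
opt = torch.optim.SGD(layer.parameters(), lr=1e-2)
teacher = nn.Linear(64, 32)  # stand-in target mapping to regress against

for step in range(100):
    x = torch.randn(128, 64)
    r = int(torch.randint(1, layer.max_rank + 1, (1,)))  # sampled rank
    loss = ((layer(x, rank=r) - teacher(x).detach()) ** 2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()

# At deployment time, truncate to the first k columns to obtain a
# lower-footprint model.
k = 4
with torch.no_grad():
    W_k = layer.U[:, :k] @ layer.V[:, :k].T
```

In this sketch the uniform rank sampler merely stands in for the generalized Ordered Dropout sampling the paper builds on; the point it illustrates is that truncating the trained factors yields progressively smaller models, enabling the accuracy-latency tradeoff described above.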


Related research:

- Learning Low-rank Deep Neural Networks via Singular Vector Orthogonality Regularization and Singular Value Sparsification (04/20/2020): Modern deep neural networks (DNNs) often require high memory consumption...
- Trained Rank Pruning for Efficient Deep Neural Networks (12/06/2018): The performance of Deep Neural Networks (DNNs) keeps elevating in recent...
- TRP: Trained Rank Pruning for Efficient Deep Neural Networks (04/30/2020): To enable DNNs on edge devices like mobile phones, low-rank approximatio...
- Scalable Deep Neural Networks via Low-Rank Matrix Factorization (10/29/2019): Compressing deep neural networks (DNNs) is important for real-world appl...
- STN: Scalable Tensorizing Networks via Structure-Aware Training and Adaptive Compression (05/30/2022): Deep neural networks (DNNs) have delivered a remarkable performance in m...
- Pufferfish: Communication-efficient Models At No Extra Cost (03/05/2021): To mitigate communication overheads in distributed model training, sever...
- Coordinating Filters for Faster Deep Neural Networks (03/28/2017): Very large-scale Deep Neural Networks (DNNs) have achieved remarkable su...
