Structured Transforms for Small-Footprint Deep Learning

by   Vikas Sindhwani, et al.

We consider the task of building compact deep learning pipelines suitable for deployment on storage- and power-constrained mobile devices. We propose a unified framework to learn a broad family of structured parameter matrices that are characterized by the notion of low displacement rank. Our structured transforms admit fast function and gradient evaluation, and span a rich range of parameter-sharing configurations whose statistical modeling capacity can be explicitly tuned along a continuum from structured to unstructured. Experimental results show that these transforms can significantly accelerate inference and forward/backward passes during training, and offer superior accuracy-compactness-speed tradeoffs in comparison to a number of existing techniques. In keyword spotting applications in mobile speech recognition, our methods are much more effective than standard linear low-rank bottleneck layers and nearly retain the performance of state-of-the-art models, while providing more than 3.5-fold compression.
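To make "low displacement rank" and "fast function evaluation" concrete, here is a minimal sketch using a Toeplitz matrix, a standard textbook example of a structured matrix. It is not taken from the paper: the function names `toeplitz_dense`, `toeplitz_matvec` and `displacement_rank` are hypothetical, and the displacement operator used here (T minus its shift along the diagonal) is one common choice, assumed for illustration. An n x n Toeplitz matrix is described by 2n - 1 numbers instead of n^2, and it can be applied to a vector in O(n log n) time with the FFT.

```python
import numpy as np

def toeplitz_dense(c, r):
    """Build the dense n x n Toeplitz matrix with first column c and first row r."""
    n = len(c)
    T = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            T[i, j] = c[i - j] if i >= j else r[j - i]
    return T

def toeplitz_matvec(c, r, x):
    """Compute T @ x in O(n log n) by embedding T in a 2n x 2n circulant matrix."""
    n = len(x)
    # First column of the circulant: [c_0..c_{n-1}, 0, r_{n-1}..r_1]
    col = np.concatenate([c, [0.0], r[:0:-1]])
    x_pad = np.concatenate([x, np.zeros(n)])
    # A circulant matrix-vector product is a circular convolution.
    y = np.fft.ifft(np.fft.fft(col) * np.fft.fft(x_pad))
    return y[:n].real

def displacement_rank(M):
    """Rank of M - Z M Z^T, where Z is the lower shift matrix (assumed operator)."""
    n = M.shape[0]
    Z = np.eye(n, k=-1)
    return np.linalg.matrix_rank(M - Z @ M @ Z.T)

rng = np.random.default_rng(0)
n = 8
c = rng.standard_normal(n)
r = rng.standard_normal(n)
r[0] = c[0]  # first row and first column share the diagonal entry
x = rng.standard_normal(n)

T = toeplitz_dense(c, r)
fast = toeplitz_matvec(c, r, x)
slow = T @ x
```

Under this operator the displacement of a Toeplitz matrix is nonzero only in its first row and first column, so `displacement_rank(T)` is at most 2, whereas a generic dense matrix of the same size typically has full displacement rank.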


Learning Compressed Transforms with Low Displacement Rank

The low displacement rank (LDR) framework for structured matrices repres...

Low-rank Gradient Approximation For Memory-Efficient On-device Training of Deep Neural Network

Training machine learning models on mobile devices has the potential of ...

A Highly Effective Low-Rank Compression of Deep Neural Networks with Modified Beam-Search and Modified Stable Rank

Compression has emerged as one of the essential deep learning research t...

Learning Fast Algorithms for Linear Transforms Using Butterfly Factorizations

Fast linear transforms are ubiquitous in machine learning, including the...

TinySpeech: Attention Condensers for Deep Speech Recognition Neural Networks on Edge Devices

Advances in deep learning have led to state-of-the-art performance acros...

Kaleidoscope: An Efficient, Learnable Representation For All Structured Linear Maps

Modern neural network architectures use structured linear transformation...

AntMan: Sparse Low-Rank Compression to Accelerate RNN inference

Wide adoption of complex RNN based models is hindered by their inference...