StrassenNets: Deep learning with a multiplication budget

by Michael Tschannen, et al.

A large fraction of the arithmetic operations required to evaluate deep neural networks (DNNs) are due to matrix multiplications, both in convolutional and fully connected layers. Matrix multiplications can be cast as 2-layer sum-product networks (SPNs) (arithmetic circuits), disentangling multiplications and additions. We leverage this observation for end-to-end learning of low-cost (in terms of multiplications) approximations of linear operations in DNN layers. Specifically, we propose to replace matrix multiplication operations by SPNs, with widths corresponding to the budget of multiplications we want to allocate to each layer, and learning the edges of the SPNs from data. Experiments on CIFAR-10 and ImageNet show that this method applied to ResNet yields significantly higher accuracy than existing methods for a given multiplication budget, or leads to the same or higher accuracy compared to existing methods while using significantly fewer multiplications. Furthermore, our approach allows fine-grained control of the tradeoff between arithmetic complexity and accuracy of DNN models. Finally, we demonstrate that the proposed framework is able to rediscover Strassen's matrix multiplication algorithm, i.e., it can learn to multiply 2 × 2 matrices using only 7 multiplications instead of 8.
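
To make the "2-layer sum-product network" view concrete, here is a minimal sketch of how the classical Strassen algorithm for 2 × 2 matrices can be written in that form: two ternary matrices form linear combinations of the entries of A and B, the 7 resulting pairs are multiplied elementwise, and a third ternary matrix combines the products into the entries of C. The names `W_a`, `W_b`, `W_c`, and `strassen_spn`, as well as the row-major vectorization, are illustrative assumptions; the coefficients are the textbook Strassen ones, not necessarily those the learned network converges to.

```python
import numpy as np

# Row-major vectorization is assumed: vec(A) = [a11, a12, a21, a22].
# Each row of W_a / W_b defines one of the 7 Strassen products M1..M7.
W_a = np.array([
    [ 1, 0, 0,  1],   # M1: a11 + a22
    [ 0, 0, 1,  1],   # M2: a21 + a22
    [ 1, 0, 0,  0],   # M3: a11
    [ 0, 0, 0,  1],   # M4: a22
    [ 1, 1, 0,  0],   # M5: a11 + a12
    [-1, 0, 1,  0],   # M6: a21 - a11
    [ 0, 1, 0, -1],   # M7: a12 - a22
])
W_b = np.array([
    [ 1, 0, 0,  1],   # M1: b11 + b22
    [ 1, 0, 0,  0],   # M2: b11
    [ 0, 1, 0, -1],   # M3: b12 - b22
    [-1, 0, 1,  0],   # M4: b21 - b11
    [ 0, 0, 0,  1],   # M5: b22
    [ 1, 1, 0,  0],   # M6: b11 + b12
    [ 0, 0, 1,  1],   # M7: b21 + b22
])
# Each row of W_c combines the 7 products into one entry of C.
W_c = np.array([
    [1,  0, 0, 1, -1, 0, 1],   # c11 = M1 + M4 - M5 + M7
    [0,  0, 1, 0,  1, 0, 0],   # c12 = M3 + M5
    [0,  1, 0, 1,  0, 0, 0],   # c21 = M2 + M4
    [1, -1, 1, 0,  0, 1, 0],   # c22 = M1 - M2 + M3 + M6
])

def strassen_spn(A, B):
    """Multiply two 2x2 matrices using 7 scalar multiplications.

    The hidden layer width (7) is the multiplication budget; the
    ternary matrices only contribute additions and subtractions.
    """
    hidden = (W_a @ A.reshape(4)) * (W_b @ B.reshape(4))  # 7 multiplications
    return (W_c @ hidden).reshape(2, 2)

rng = np.random.default_rng(0)
A = rng.standard_normal((2, 2))
B = rng.standard_normal((2, 2))
print(np.allclose(strassen_spn(A, B), A @ B))
```

In this formulation, the width of the hidden layer plays the role of the per-layer multiplication budget described in the abstract, and learning the entries of the three matrices from data corresponds to learning the edges of the SPN.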





Matrix Multiplication with Less Arithmetic Complexity and IO Complexity

After Strassen presented the first sub-cubic matrix multiplication algor...

OverSketch: Approximate Matrix Multiplication for the Cloud

We propose OverSketch, an approximate algorithm for distributed matrix m...

Arithmetic Distribution Neural Network for Background Subtraction

We propose a new Arithmetic Distribution Neural Network (ADNN) for learn...

Pseudospectral Shattering, the Sign Function, and Diagonalization in Nearly Matrix Multiplication Time

We exhibit a randomized algorithm which given a square n × n complex matr...

Constant-Depth and Subcubic-Size Threshold Circuits for Matrix Multiplication

Boolean circuits of McCulloch-Pitts threshold gates are a classic model ...

SME: ReRAM-based Sparse-Multiplication-Engine to Squeeze-Out Bit Sparsity of Neural Network

Resistive Random-Access-Memory (ReRAM) crossbar is a promising technique...

Effects of Approximate Multiplication on Convolutional Neural Networks

This paper analyzes the effects of approximate multiplication when perfo...