Neural Networks with Few Multiplications

10/11/2015
by Zhouhan Lin, et al.

For most deep learning algorithms, training is notoriously time-consuming. Since most of the computation in training neural networks is typically spent on floating-point multiplications, we investigate an approach to training that eliminates the need for most of these. Our method consists of two parts: first, we stochastically binarize weights to convert the multiplications involved in computing hidden states into sign changes. Second, while back-propagating error derivatives, in addition to binarizing the weights, we quantize the representations at each layer to convert the remaining multiplications into binary shifts. Experimental results on three popular datasets (MNIST, CIFAR10, SVHN) show that this approach not only does not hurt classification performance but can even outperform standard stochastic gradient descent training, paving the way to fast, hardware-friendly training of neural networks.
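As a rough illustration of the two ideas in the abstract (stochastic weight binarization and power-of-two quantization of representations), here is a minimal NumPy sketch. It is an assumption-laden toy, not the authors' exact training scheme: the sampling probability, quantization range, and bit width used below are illustrative choices.

```python
import numpy as np

def stochastic_binarize(W):
    """Stochastically binarize weights to {-1, +1}.

    Each weight is mapped to +1 with probability p = clip((w + 1) / 2, 0, 1),
    so multiplying by the binarized matrix reduces to sign changes plus adds.
    (Illustrative sampling rule; the paper's exact scheme may differ.)
    """
    p = np.clip((W + 1.0) / 2.0, 0.0, 1.0)
    return np.where(np.random.rand(*W.shape) < p, 1.0, -1.0)

def quantize_pow2(x, bits=4):
    """Quantize values to signed powers of two.

    Multiplying by a power of two is just a binary shift in fixed-point
    arithmetic, which is what removes the remaining multiplications in the
    backward pass. Bit width of 4 is an arbitrary choice for the sketch.
    """
    sign = np.sign(x)
    mag = np.abs(x)
    exp = np.round(np.log2(np.maximum(mag, 1e-8)))      # nearest exponent
    exp = np.clip(exp, -(2 ** (bits - 1)), 2 ** (bits - 1) - 1)
    return sign * (2.0 ** exp)

# Toy forward pass: the product Wb @ x needs no multiplications, since
# Wb's entries are +/-1; quantize_pow2(h) shows the shift-friendly activations.
np.random.seed(0)
W = np.random.uniform(-1, 1, size=(4, 3))
x = np.random.normal(size=(3,))
Wb = stochastic_binarize(W)
h = Wb @ x
print(h)
print(quantize_pow2(h))
```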


