Training wide residual networks for deployment using a single bit for each weight

02/23/2018
by Mark D. McDonnell, et al.

For fast and energy-efficient deployment of trained deep neural networks on resource-constrained embedded hardware, each learned weight parameter should ideally be represented and stored using a single bit. Error rates usually increase when this requirement is imposed. Here, we report large improvements in error rates on multiple datasets for deep convolutional neural networks deployed with 1-bit-per-weight. Using wide residual networks as our main baseline, our approach simplifies existing methods that binarize weights by applying the sign function in training; we apply scaling factors for each layer with constant, unlearned values equal to the layer-specific standard deviations used for initialization. For CIFAR-10, CIFAR-100 and ImageNet, with models using 1-bit-per-weight and requiring less than 10 MB of parameter memory, we achieve error rates as low as 3.9% (CIFAR-10). We also considered MNIST, SVHN and ImageNet32, achieving 1-bit-per-weight test error rates as low as 0.27% (MNIST). Our error rates halve previously reported values and are within about 1% of the error rates for the same networks with full-precision weights. For networks that overfit, we also show significant improvements in error rate by not learning batch-normalization scale and offset parameters; this applies to both full-precision and 1-bit-per-weight networks. Using a warm-restart learning-rate schedule, we found that training for 1-bit-per-weight is just as fast as for full-precision networks, with better accuracy than standard schedules, and achieved about 98% of peak performance for CIFAR-10/100. For full training code and trained models in MATLAB, Keras and PyTorch, see https://github.com/McDonnell-Lab/1-bit-per-weight/ .
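
The abstract's core recipe, binarizing weights with the sign function during training while applying a fixed, unlearned per-layer scale equal to the initialization standard deviation, can be sketched in a few lines of PyTorch. The snippet below is an illustrative reconstruction rather than the authors' reference code (that lives in the linked repository); the class names, the straight-through gradient estimator, and the batch-normalization and warm-restart usage at the end are assumptions made for this example.

```python
# Minimal sketch (assumption-laden, not the paper's reference implementation) of
# training a conv layer whose weights are deployed with a single bit each.
import math
import torch
import torch.nn as nn
import torch.nn.functional as F


class SignSTE(torch.autograd.Function):
    """Binarize to {-1, +1} in the forward pass; straight-through gradient in backward."""

    @staticmethod
    def forward(ctx, w):
        return torch.where(w >= 0, torch.ones_like(w), -torch.ones_like(w))

    @staticmethod
    def backward(ctx, grad_output):
        return grad_output  # pass the gradient through to the full-precision weights


class Conv2d1Bit(nn.Conv2d):
    """Convolution whose learned weights can be stored as one bit each at deployment.

    The per-layer scale is a constant equal to the He-initialization standard
    deviation, sqrt(2 / fan_in); it is never learned, matching the scheme described
    in the abstract.
    """

    def __init__(self, in_channels, out_channels, kernel_size, **kwargs):
        super().__init__(in_channels, out_channels, kernel_size, bias=False, **kwargs)
        fan_in = in_channels * self.kernel_size[0] * self.kernel_size[1]
        self.scale = math.sqrt(2.0 / fan_in)   # fixed, unlearned per-layer scale
        nn.init.normal_(self.weight, mean=0.0, std=self.scale)

    def forward(self, x):
        w_bin = SignSTE.apply(self.weight)      # values in {-1, +1}
        return F.conv2d(x, self.scale * w_bin, None,
                        self.stride, self.padding, self.dilation, self.groups)


# Illustrative usage: a residual-style sub-block with batch normalization whose
# scale/offset are not learned (affine=False), plus a warm-restart cosine schedule
# in the spirit of the training setup mentioned above (step the scheduler per epoch).
block = nn.Sequential(
    Conv2d1Bit(64, 64, kernel_size=3, padding=1),
    nn.BatchNorm2d(64, affine=False),
    nn.ReLU(inplace=True),
)
optimizer = torch.optim.SGD(block.parameters(), lr=0.1, momentum=0.9, weight_decay=5e-4)
scheduler = torch.optim.lr_scheduler.CosineAnnealingWarmRestarts(optimizer, T_0=10, T_mult=2)
```

Under these assumptions, only the sign bits of the trained weights need to be stored for deployment; the scale is a single constant per layer, so no per-weight floating-point values remain.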

research
07/07/2021

S^3: Sign-Sparse-Shift Reparametrization for Effective Training of Low-bit Shift Networks

Shift neural networks reduce computation complexity by removing expensiv...
research
11/20/2016

Efficient Stochastic Inference of Bitwise Deep Neural Networks

Recently published methods enable training of bitwise neural networks wh...
research
02/03/2020

Towards Explainable Bit Error Tolerance of Resistive RAM-Based Binarized Neural Networks

Non-volatile memory, such as resistive RAM (RRAM), is an emerging energy...
research
12/09/2018

Binary Input Layer: Training of CNN models with binary input data

For the efficient execution of deep convolutional neural networks (CNN) ...
research
05/24/2019

Magnetoresistive RAM for error resilient XNOR-Nets

We trained three Binarized Convolutional Neural Network architectures (L...
research
07/16/2019

Single-bit-per-weight deep convolutional neural networks without batch-normalization layers for embedded systems

Batch-normalization (BN) layers are thought to be an integrally importan...
research
02/16/2021

SiMaN: Sign-to-Magnitude Network Binarization

Binary neural networks (BNNs) have attracted broad research interest due...
