Multiplexed gradient descent: Fast online training of modern datasets on hardware neural networks without backpropagation

03/05/2023
by Adam N. McCaughan, et al.

We present multiplexed gradient descent (MGD), a gradient descent framework designed to easily train analog or digital neural networks in hardware. MGD uses zero-order optimization techniques for online training of hardware neural networks. We demonstrate its ability to train neural networks on modern machine learning datasets, including CIFAR-10 and Fashion-MNIST, and compare its performance to backpropagation. Assuming realistic timescales and hardware parameters, our results indicate that these optimization techniques can train a network on emerging hardware platforms in orders of magnitude less wall-clock time than training via backpropagation on a standard GPU, even in the presence of imperfect weight updates or device-to-device variations in the hardware. We additionally describe how MGD can be applied to existing hardware as part of chip-in-the-loop training, or integrated directly at the hardware level. Crucially, the MGD framework is highly flexible, and its gradient descent process can be optimized to compensate for specific hardware limitations such as slow parameter-update speeds or limited input bandwidth.
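The core idea behind zero-order training of this kind is that the gradient can be estimated without an analytic backward pass: perturb the weights, measure the resulting change in cost on the network itself, and descend along the estimate. As a rough illustration of that principle only (not the authors' implementation), the sketch below applies a simultaneous random perturbation to every weight of a toy model, SPSA-style; the model, cost function, and hyperparameter names are illustrative assumptions.

```python
# Minimal sketch of zero-order (perturbative) gradient descent.
# All names and hyperparameters are illustrative assumptions,
# not the MGD paper's implementation.
import numpy as np

rng = np.random.default_rng(0)

def cost(w, x, y):
    # Toy forward pass: a linear model with mean squared error.
    # On a hardware network this would be a physical cost measurement.
    return np.mean((x @ w - y) ** 2)

def perturbative_step(w, x, y, amplitude=1e-3, lr=1e-2):
    # Perturb every parameter simultaneously by +/- amplitude.
    delta = amplitude * rng.choice([-1.0, 1.0], size=w.shape)
    # Measure the change in cost caused by the perturbation.
    dC = cost(w + delta, x, y) - cost(w, x, y)
    # One-sided SPSA estimate: since delta_i = +/- amplitude,
    # dC * delta / amplitude**2 equals dC / delta_i elementwise.
    g_est = dC * delta / amplitude**2
    return w - lr * g_est

# Usage: recover w_true from zero-order updates alone.
w_true = np.array([2.0, -1.0, 0.5])
x = rng.normal(size=(256, 3))
y = x @ w_true
w = np.zeros(3)
for _ in range(2000):
    w = perturbative_step(w, x, y)
print(w)  # approaches w_true without any analytic gradients
```

Because the cost change is something the physical network can measure directly, an update of this form can run on-chip, which is what removes the need for backpropagation.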

Related research

11/17/2020
ZORB: A Derivative-Free Backpropagation Algorithm for Neural Networks
Gradient descent and backpropagation have enabled neural networks to ach...

06/24/2018
In-situ Stochastic Training of MTJ Crossbar based Neural Networks
Owing to high device density, scalability and non-volatility, Magnetic T...

08/15/2019
Accelerated CNN Training Through Gradient Approximation
Training deep convolutional neural networks such as VGG and ResNet by gr...

07/06/2017
High-Performance FPGA Implementation of Equivariant Adaptive Separation via Independence Algorithm for Independent Component Analysis
Independent Component Analysis (ICA) is a dimensionality reduction techn...

04/17/2018
Joint Quantizer Optimization based on Neural Quantizer for Sum-Product Decoder
A low-precision analog-to-digital converter (ADC) is required to impleme...

02/10/2020
Pairwise Neural Networks (PairNets) with Low Memory for Fast On-Device Applications
A traditional artificial neural network (ANN) is normally trained slowly...

02/10/2020
Reducing the Computational Burden of Deep Learning with Recursive Local Representation Alignment
Training deep neural networks on large-scale datasets requires significa...
