Beyond Backprop: Alternating Minimization with co-Activation Memory

06/24/2018
by Anna Choromanska et al.

We propose a novel online algorithm for training deep feedforward neural networks that employs alternating minimization (block-coordinate descent) between the weights and activation variables. It extends offline alternating minimization approaches to online, continual learning, and improves over stochastic gradient descent (SGD) with backpropagation in several ways: it avoids the vanishing-gradient issue, it allows for non-differentiable nonlinearities, and it permits parallel weight updates across the layers. Unlike SGD, our approach employs a co-activation memory inspired by the online sparse coding algorithm of [Mairal et al., 2009]. Furthermore, local iterative optimization with explicit activation updates is a potentially more biologically plausible learning mechanism than backpropagation. We provide a theoretical convergence analysis and promising empirical results on several datasets.
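
To make the training scheme concrete, below is a minimal NumPy sketch of online alternating minimization with per-layer co-activation memories, in the spirit of the abstract. It is an illustration under simplifying assumptions, not the authors' exact algorithm: the two-layer ReLU architecture, the quadratic activation objective, the linear ridge solve for the weights, and every name and hyperparameter (AMNet, mu, lam, lr) are hypothetical.

```python
# Illustrative sketch only: online alternating minimization with co-activation
# memory, under simplifying assumptions. Not the authors' exact algorithm.
import numpy as np

rng = np.random.default_rng(0)

def relu(z):
    return np.maximum(z, 0.0)

class AMNet:
    """Two-layer ReLU network trained by block-coordinate descent over
    (activations, weights). Each layer keeps co-activation memory matrices
    A_l = sum a_in a_in^T and B_l = sum a_out a_in^T, analogous to the
    sufficient statistics in Mairal et al.'s online sparse coding."""

    def __init__(self, dims, lam=1e-2):
        self.W = [rng.standard_normal((dims[i + 1], dims[i])) * 0.1
                  for i in range(len(dims) - 1)]
        self.lam = lam  # ridge regularizer for the per-layer weight solve
        self.A = [np.zeros((dims[i], dims[i])) for i in range(len(dims) - 1)]
        self.B = [np.zeros((dims[i + 1], dims[i])) for i in range(len(dims) - 1)]

    def forward(self, x):
        acts = [x]
        for W in self.W:
            acts.append(relu(W @ acts[-1]))
        return acts

    def update_activations(self, acts, y, mu=1.0, steps=5, lr=0.1):
        # Activation block: a few projected gradient steps on a local objective
        # 0.5*||relu(W1 a1) - y||^2 + 0.5*mu*||a1 - relu(W0 x)||^2 (an assumed,
        # simplified layer objective).
        a1 = acts[1].copy()
        for _ in range(steps):
            pred = self.W[1] @ a1
            mask = (pred > 0).astype(float)          # ReLU derivative
            grad = mu * (a1 - relu(self.W[0] @ acts[0])) \
                 + self.W[1].T @ ((relu(pred) - y) * mask)
            a1 -= lr * grad
            a1 = np.maximum(a1, 0.0)                 # keep activations ReLU-feasible
        return [acts[0], a1, y]                      # top "activation" pinned to the target

    def update_weights(self, acts):
        # Weight block: accumulate co-activation memory and solve a ridge
        # regression per layer (a linear surrogate that ignores the output
        # nonlinearity). Each layer's solve uses only its own memory.
        for l, (a_in, a_out) in enumerate(zip(acts[:-1], acts[1:])):
            self.A[l] += np.outer(a_in, a_in)
            self.B[l] += np.outer(a_out, a_in)
            d = self.A[l].shape[0]
            self.W[l] = self.B[l] @ np.linalg.inv(self.A[l] + self.lam * np.eye(d))

    def step(self, x, y):
        acts = self.forward(x)                       # initialize activations
        acts = self.update_activations(acts, y)      # activation block update
        self.update_weights(acts)                    # weight block update

# Toy usage: stream random (x, y) pairs through the online learner.
net = AMNet(dims=[8, 16, 4])
for _ in range(100):
    x = rng.standard_normal(8)
    y = rng.random(4)
    net.step(x, y)
```

Because each layer's weight solve depends only on that layer's own memory matrices and the activations on either side of it, the weight updates can in principle be carried out in parallel across layers, which is one of the advantages the abstract highlights.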

Related research

06/16/2020 · Learning to Learn with Feedback and Local Plasticity
Interest in biologically inspired alternatives to backpropagation is dri...

08/08/2023 · Improving Performance in Continual Learning Tasks using Bio-Inspired Architectures
The ability to learn continuously from an incoming data stream without c...

05/01/2021 · Stochastic Block-ADMM for Training Deep Networks
In this paper, we propose Stochastic Block-ADMM as an approach to train ...

08/17/2022 · Learning with Local Gradients at the Edge
To enable learning on edge devices with fast convergence and low memory,...

01/30/2021 · Inertial Proximal Deep Learning Alternating Minimization for Efficient Neutral Network Training
In recent years, the Deep Learning Alternating Minimization (DLAM), whic...

02/20/2022 · Personalized Federated Learning with Exact Stochastic Gradient Descent
In Federated Learning (FL), datasets across clients tend to be heterogen...

03/13/2019 · Understanding Straight-Through Estimator in Training Activation Quantized Neural Nets
Training activation quantized neural networks involves minimizing a piec...
