Structural Credit Assignment with Coordinated Exploration

07/25/2023
by   Stephen Chung, et al.

A biologically plausible method for training an Artificial Neural Network (ANN) is to treat each unit as a stochastic Reinforcement Learning (RL) agent, so that the network is viewed as a team of agents. Every unit can then learn via REINFORCE, a local learning rule modulated by a global reward signal, which aligns more closely with biologically observed forms of synaptic plasticity. However, this learning method is slow and scales poorly with network size. The inefficiency arises from two factors that impede effective structural credit assignment: (i) all units explore the network independently, and (ii) a single reward is used to evaluate the actions of all units. Accordingly, methods that improve structural credit assignment fall into two categories. The first includes algorithms that enable coordinated exploration among units, such as MAP propagation. The second encompasses algorithms that compute a more specific reward signal for each unit in the network, such as Weight Maximization and its variants. This research report focuses on the first category. We propose using Boltzmann machines or a recurrent network for coordinated exploration, and we show that the negative phase typically required to train Boltzmann machines can be removed. The resulting learning rules resemble the reward-modulated Hebbian learning rule. Experimental results demonstrate that coordinated exploration substantially outperforms independent exploration in training speed for networks of stochastic and discrete units trained with REINFORCE, even surpassing straight-through estimator (STE) backpropagation.
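To make the baseline scheme concrete, the following minimal NumPy sketch trains a single layer of Bernoulli units with independent exploration and a single global reward, using the three-factor, reward-modulated REINFORCE update the abstract describes. The toy task (reproducing a fixed binary target), the layer sizes, the learning rate, and the running reward baseline are illustrative assumptions, not the paper's experimental setup.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy task: a layer of Bernoulli units must reproduce a fixed binary target.
# Each unit acts as an independent RL agent; one scalar reward modulates a
# local Hebbian-like REINFORCE update: dW ∝ (r - baseline) * (h - p) * x.
n_in, n_hidden = 4, 8
W = rng.normal(0.0, 0.1, size=(n_hidden, n_in))  # unit weights
x = rng.normal(size=n_in)                        # fixed input pattern
target = (rng.random(n_hidden) > 0.5).astype(float)
lr, baseline = 0.5, 0.0

for step in range(2000):
    p = sigmoid(W @ x)                                 # firing probabilities
    h = (rng.random(n_hidden) < p).astype(float)       # independent exploration
    r = -np.mean((h - target) ** 2)                    # single global reward
    W += lr * (r - baseline) * np.outer(h - p, x)      # three-factor local rule
    baseline = 0.9 * baseline + 0.1 * r                # running reward baseline

final_p = sigmoid(W @ x)
```

Because every unit samples its activation independently and is judged by the same scalar reward, each unit's update is a noisy estimate of its contribution; this is exactly the credit-assignment bottleneck that coordinated exploration (e.g., sampling units jointly via a Boltzmann machine) is meant to alleviate.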


