Techniques for Learning Binary Stochastic Feedforward Neural Networks

06/11/2014
by   Tapani Raiko, et al.
0

Stochastic binary hidden units in a multi-layer perceptron (MLP) network give at least three potential benefits when compared to deterministic MLP networks. (1) They allow to learn one-to-many type of mappings. (2) They can be used in structured prediction problems, where modeling the internal structure of the output is important. (3) Stochasticity has been shown to be an excellent regularizer, which makes generalization performance potentially better in general. However, training stochastic networks is considerably more difficult. We study training using M samples of hidden activations per input. We show that the case M=1 leads to a fundamentally different behavior where the network tries to avoid stochasticity. We propose two new estimators for the training gradient and propose benchmark tests for comparing training algorithms. Our experiments confirm that training stochastic networks is difficult and show that the proposed two estimators perform favorably among all the five known estimators.

READ FULL TEXT
research
05/16/2020

An Effective and Efficient Training Algorithm for Multi-layer Feedforward Neural Networks

Network initialization is the first and critical step for training neura...
research
03/24/2015

Universal Approximation of Markov Kernels by Shallow Stochastic Feedforward Networks

We establish upper bounds for the minimal number of hidden units for whi...
research
04/29/2009

Adaptive Learning with Binary Neurons

A efficient incremental learning algorithm for classification tasks, cal...
research
06/04/2020

Path Sample-Analytic Gradient Estimators for Stochastic Binary Networks

In networks with binary activations and or binary weights the training b...
research
06/20/2018

Reinforcement Learning using Augmented Neural Networks

Neural networks allow Q-learning reinforcement learning agents such as d...
research
08/11/2023

Automated Sizing and Training of Efficient Deep Autoencoders using Second Order Algorithms

We propose a multi-step training method for designing generalized linear...
research
01/16/2018

Empirical Explorations in Training Networks with Discrete Activations

We present extensive experiments training and testing hidden units in de...

Please sign up or login with your details

Forgot password? Click here to reset