Practical recommendations for gradient-based training of deep architectures

06/24/2012
by Yoshua Bengio, et al.

Learning algorithms related to artificial neural networks, and Deep Learning in particular, may seem to involve many bells and whistles, called hyper-parameters. This chapter is meant as a practical guide with recommendations for some of the most commonly used hyper-parameters, in particular in the context of learning algorithms based on back-propagated gradients and gradient-based optimization. It also discusses how to cope with the fact that more interesting results can be obtained when many hyper-parameters are allowed to vary. Overall, it describes elements of the practice used to successfully and efficiently train and debug large-scale and often deep multi-layer neural networks. It closes with open questions about the training difficulties observed with deeper architectures.
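The objects the abstract refers to — back-propagated gradients and the hyper-parameters governing their use — can be made concrete with a minimal stochastic gradient descent loop. The sketch below is not from the chapter itself; the toy regression task and the specific hyper-parameter values are illustrative assumptions, chosen only to show where each hyper-parameter enters the training loop.

```python
import random

# Illustrative toy data (assumed, not from the chapter):
# a noiseless 1-D linear target y = 3x - 1.
random.seed(0)
data = [(x, 3.0 * x - 1.0) for x in [i / 10 for i in range(-20, 21)]]

# Three of the hyper-parameters the chapter discusses, with
# arbitrary illustrative values:
learning_rate = 0.01   # step size of each gradient update
batch_size = 8         # examples used per gradient estimate
n_epochs = 100         # passes over the training set

w, b = 0.0, 0.0        # parameters to be learned

def mse(w, b):
    """Mean squared error of the linear model over the full data set."""
    return sum((w * x + b - y) ** 2 for x, y in data) / len(data)

initial_loss = mse(w, b)
for _ in range(n_epochs):
    random.shuffle(data)                        # fresh minibatch order
    for start in range(0, len(data), batch_size):
        batch = data[start:start + batch_size]
        # Gradient of the minibatch MSE w.r.t. w and b.
        gw = sum(2 * (w * x + b - y) * x for x, y in batch) / len(batch)
        gb = sum(2 * (w * x + b - y) for x, y in batch) / len(batch)
        # The SGD update: move against the gradient, scaled by the
        # learning rate.
        w -= learning_rate * gw
        b -= learning_rate * gb
final_loss = mse(w, b)

print("loss before:", initial_loss, "after:", final_loss)
```

Each of the three values at the top is a knob of exactly the kind the chapter gives recommendations for: too large a learning rate makes the updates diverge, too small makes progress slow, and the minibatch size trades gradient-estimate variance against per-update cost.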


