A Bootstrap Algorithm for Fast Supervised Learning

05/04/2023
by Michael A. Kouritzin, et al.

Training a neural network (NN) typically relies on a curve-following method such as gradient descent (GD) or stochastic gradient descent (SGD), ADADELTA, ADAM, or a limited-memory algorithm. Convergence for these algorithms usually requires access to a large number of observations to reach high accuracy and, for certain classes of functions, they may need multiple epochs over the data before they catch on. Herein, a different technique with the potential for dramatically faster convergence, especially for shallow networks, is explored: it does not follow the error surface but instead 'decouples' the hidden layers and updates their weighted connections through bootstrapping, resampling, and linear regression. Using resampled observations, this process is empirically shown to converge remarkably fast and to require fewer data points: in particular, our experiments show that a fraction of the observations needed by traditional neural-network training methods suffices to approximate various classes of functions.
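The abstract does not spell out the algorithm, but the general idea of fixing ("decoupling") a hidden layer and obtaining the output weights by linear regression over bootstrap resamples can be illustrated with a short sketch. Everything below (the function fit_shallow_bootstrap, its hyperparameters, and the averaging of least-squares solutions) is an assumption made for illustration only, not the paper's actual method.

```python
import numpy as np

def fit_shallow_bootstrap(X, y, n_hidden=50, n_resamples=20, rng=None):
    """Toy sketch: fit a one-hidden-layer network without gradient descent.

    The hidden-layer weights are drawn at random and kept fixed ("decoupling"
    the hidden layer from the output layer); the output weights are then the
    average of linear least-squares solutions computed on bootstrap resamples
    of the observations. Hypothetical illustration, not the paper's algorithm.
    """
    rng = np.random.default_rng(rng)
    n, d = X.shape

    # Randomly initialized, fixed hidden layer.
    W_hidden = rng.normal(size=(d, n_hidden))
    b_hidden = rng.normal(size=n_hidden)
    H = np.tanh(X @ W_hidden + b_hidden)          # hidden activations

    # Output weights: average of least-squares fits over bootstrap resamples.
    w_out = np.zeros(n_hidden)
    for _ in range(n_resamples):
        idx = rng.integers(0, n, size=n)          # bootstrap resample indices
        w, *_ = np.linalg.lstsq(H[idx], y[idx], rcond=None)
        w_out += w
    w_out /= n_resamples

    def predict(X_new):
        return np.tanh(X_new @ W_hidden + b_hidden) @ w_out

    return predict

# Usage: approximate a simple target function from a small sample.
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(200, 2))
y = np.sin(np.pi * X[:, 0]) + X[:, 1] ** 2
model = fit_shallow_bootstrap(X, y, rng=1)
print("train MSE:", np.mean((model(X) - y) ** 2))
```

Because the output weights come from closed-form least squares rather than iterative curve-following, each bootstrap resample contributes a full solution; this is the intuition behind the abstract's claim of fast convergence from comparatively few observations.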


