Revisiting Recursive Least Squares for Training Deep Neural Networks

09/07/2021
by Chunyuan Zhang, et al.

Recursive least squares (RLS) algorithms were once widely used for training small-scale neural networks because of their fast convergence. However, previous RLS algorithms are unsuitable for training deep neural networks (DNNs), since they have high computational complexity and too many preconditions. In this paper, to overcome these drawbacks, we propose three novel RLS optimization algorithms for training feedforward neural networks, convolutional neural networks, and recurrent neural networks (including long short-term memory networks), by using error backpropagation and our average-approximation RLS method, together with the equivalent gradients of the linear least squares loss function with respect to the linear outputs of the hidden layers. Compared with previous RLS optimization algorithms, our algorithms are simple and elegant. They can be viewed as an improved stochastic gradient descent (SGD) algorithm that uses the inverse autocorrelation matrix of each layer as an adaptive learning rate. Their time and space complexities are only several times those of SGD. They only require the loss function to be the mean squared error and the activation function of the output layer to be invertible. In fact, our algorithms can also be used in combination with other first-order optimization algorithms without requiring these two preconditions. In addition, we present two improved methods for our algorithms. Finally, we demonstrate their effectiveness compared to the Adam algorithm on the MNIST, CIFAR-10, and IMDB datasets, and experimentally investigate the influence of their hyperparameters.
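The abstract's framing of RLS as SGD with each layer's inverse autocorrelation matrix acting as an adaptive learning rate can be illustrated for a single linear layer. The sketch below is not the paper's algorithm; the forgetting factor lam, the matrix P, and the single-sample update rule are standard textbook RLS quantities assumed here for illustration.

import numpy as np

# Illustrative RLS-style update for one linear layer y = W x.
# P approximates the inverse input-autocorrelation matrix and plays the role
# of a per-layer adaptive learning rate. The forgetting factor lam and this
# single-sample form are standard RLS conventions, not details from the paper.

rng = np.random.default_rng(0)
d_in, d_out = 8, 4
W = rng.normal(scale=0.1, size=(d_out, d_in))
P = np.eye(d_in) * 10.0   # initial inverse autocorrelation estimate
lam = 0.99                # forgetting factor

def rls_step(W, P, x, target):
    """One RLS update: gain k = P x / (lam + x^T P x), then rank-1 update of P."""
    y = W @ x                          # layer's linear output
    err = target - y                   # error measured on the linear output
    Px = P @ x
    k = Px / (lam + x @ Px)            # Kalman-style gain vector
    W = W + np.outer(err, k)           # weight update scaled by the gain
    P = (P - np.outer(k, Px)) / lam    # Sherman-Morrison update of P
    return W, P

# toy regression: fit a random linear target map
W_true = rng.normal(size=(d_out, d_in))
for _ in range(200):
    x = rng.normal(size=d_in)
    W, P = rls_step(W, P, x, W_true @ x)

print("max weight error:", np.abs(W - W_true).max())

Replacing the scalar learning rate of SGD with the matrix gain P x is what gives RLS its fast convergence on each layer's linear subproblem; the paper's contribution is making this tractable and precondition-free for deep networks.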

Related research

Stochastic Runge-Kutta methods and adaptive SGD-G2 stochastic gradient descent (02/20/2020)
The minimization of the loss function is of paramount importance in deep...

Optimizing Neural Networks in the Equivalent Class Space (02/11/2018)
It has been widely observed that many activation functions and pooling m...

On Equivalent Optimization of Machine Learning Methods (02/17/2023)
At the core of many machine learning methods resides an iterative optimi...

Selective Memory Recursive Least Squares: Uniformly Allocated Approximation Capabilities of RBF Neural Networks in Real-Time Learning (11/15/2022)
When performing real-time learning tasks, the radial basis function neur...

New optimization algorithms for neural network training using operator splitting techniques (04/29/2019)
In the following paper we present a new type of optimization algorithms ...

One-Pass Learning via Bridging Orthogonal Gradient Descent and Recursive Least-Squares (07/28/2022)
While deep neural networks are capable of achieving state-of-the-art per...

Ensemble Neural Networks (ENN): A gradient-free stochastic method (08/03/2019)
In this study, an efficient stochastic gradient-free method, the ensembl...