Successive Affine Learning for Deep Neural Networks

05/13/2023
by Yuesheng Xu, et al.

This paper introduces a successive affine learning (SAL) model for constructing deep neural networks (DNNs). Traditionally, a DNN is built by solving a non-convex optimization problem, which is often challenging to solve numerically due to its non-convexity and the large number of layers involved. To address this challenge, inspired by the human education system, the multi-grade deep learning (MGDL) model was recently initiated by the author of this paper. The MGDL model learns a DNN in several grades, in each of which one constructs a shallow network consisting of a small number of layers. The MGDL model still requires solving several non-convex optimization problems. The proposed SAL model evolves from the MGDL model. Noting that each layer of a DNN consists of an affine map followed by an activation function, we propose to learn the affine map by solving a quadratic/convex optimization problem; the activation function comes into play only after the weight matrix and the bias vector of the current layer have been trained. In the context of function approximation, for a given function the SAL model generates an orthogonal expansion of the function with adaptive basis functions in the form of DNNs. We establish the Pythagorean identity and the Parseval identity for the orthogonal system generated by the SAL model. Moreover, we provide a convergence theorem for the SAL process: either it terminates after a finite number of grades, or the norms of its optimal error functions strictly decrease to a limit as the grade number increases to infinity. Furthermore, we present proof-of-concept numerical examples demonstrating that the proposed SAL model significantly outperforms the traditional deep learning model.
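The core idea in the abstract — train each grade's affine map by a convex problem, apply the activation only afterwards, and expand the target function over the resulting adaptive basis — can be illustrated with a minimal sketch. Everything below (ReLU activation, ordinary least squares as the convex subproblem, a single scalar basis function per grade, the greedy residual update) is a simplifying assumption for illustration, not the paper's exact algorithm:

```python
import numpy as np

def sal_sketch(X, y, num_grades=3):
    """Illustrative sketch of successive affine learning (SAL).

    Each grade fits an affine map x -> A x + b to the current residual
    by ordinary least squares (a quadratic/convex problem); the
    activation is applied only after the affine parameters are trained.
    The activated output is then scaled to best reduce the residual,
    mimicking an orthogonal-expansion step. All concrete choices here
    are assumptions made for this sketch.
    """
    residual = y.astype(float).copy()
    approx = np.zeros_like(residual)
    for grade in range(num_grades):
        # Convex subproblem: least-squares fit of the affine map
        # to the current residual (bias handled via augmentation).
        A = np.hstack([X, np.ones((X.shape[0], 1))])
        coef, *_ = np.linalg.lstsq(A, residual, rcond=None)
        affine = A @ coef
        # Activation applied only after the affine map is trained.
        basis = np.maximum(affine, 0.0)  # ReLU
        # Optimal scaling of the new basis function: this projection
        # guarantees the residual norm does not increase.
        denom = basis @ basis
        alpha = (basis @ residual) / denom if denom > 0 else 0.0
        approx += alpha * basis
        residual = y - approx
    return approx, residual
```

Because each grade adds the best scalar multiple of its activated basis function, the residual norm is non-increasing across grades, loosely mirroring the monotone decrease of the optimal error norms stated in the convergence theorem.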


Related research

- Multi-Grade Deep Learning (02/01/2023): The current deep learning model is of a single-grade, that is, it learns...
- Multi-Grade Deep Learning for Partial Differential Equations with Applications to the Burgers Equation (09/14/2023): We develop in this paper a multi-grade deep learning method for solving ...
- Adaptive Normalized Risk-Averting Training For Deep Neural Networks (06/08/2015): This paper proposes a set of new error criteria and learning approaches,...
- CDiNN - Convex Difference Neural Networks (03/31/2021): Neural networks with ReLU activation function have been shown to be univ...
- Regularized deep learning with non-convex penalties (09/11/2019): Regularization methods are often employed in deep learning neural networ...
- Deep Neural Networks with Multi-Branch Architectures Are Less Non-Convex (06/06/2018): Several recently proposed architectures of neural networks such as ResNe...
- Non-asymptotic estimates for TUSLA algorithm for non-convex learning with applications to neural networks with ReLU activation function (07/19/2021): We consider non-convex stochastic optimization problems where the object...
