The Backpropagation algorithm for a math student

01/22/2023
by Saeed Damadi et al.

A Deep Neural Network (DNN) is a composite function of vector-valued functions, and in order to train a DNN, it is necessary to calculate the gradient of the loss function with respect to all parameters. This calculation can be a non-trivial task because the loss function of a DNN is a composition of several nonlinear functions, each with numerous parameters. The Backpropagation (BP) algorithm leverages the composite structure of the DNN to efficiently compute the gradient. As a result, the number of layers in the network does not significantly impact the complexity of the calculation. The objective of this paper is to express the gradient of the loss function in terms of a matrix multiplication using the Jacobian operator. This can be achieved by considering the total derivative of each layer with respect to its parameters and expressing it as a Jacobian matrix. The gradient can then be represented as the matrix product of these Jacobian matrices. This approach is valid because the chain rule can be applied to a composition of vector-valued functions, and the use of Jacobian matrices allows for the incorporation of multiple inputs and outputs. By providing concise mathematical justifications, the results can be made understandable and useful to a broad audience from various disciplines.
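The abstract's central claim — that the gradient of the loss is a matrix product of per-layer Jacobians — can be illustrated with a minimal NumPy sketch. The two-layer sigmoid network, dimensions, and variable names below are illustrative assumptions, not taken from the paper; the gradient with respect to the first-layer weights is assembled purely by multiplying Jacobian matrices and then checked against a finite-difference approximation.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def dsigmoid(z):
    s = sigmoid(z)
    return s * (1.0 - s)

rng = np.random.default_rng(0)
x = rng.normal(size=2)        # input, dimensions chosen for illustration
t = np.array([1.0])           # target
W1 = rng.normal(size=(3, 2))  # first-layer weights
W2 = rng.normal(size=(1, 3))  # second-layer weights

def forward(W1, W2):
    z1 = W1 @ x
    a1 = sigmoid(z1)
    z2 = W2 @ a1
    y = sigmoid(z2)
    loss = 0.5 * np.sum((y - t) ** 2)
    return z1, a1, z2, y, loss

z1, a1, z2, y, loss = forward(W1, W2)

# Per-layer Jacobians; the chain rule becomes a matrix product.
J_loss_y = (y - t).reshape(1, -1)  # dL/dy,    1 x 1
J_y_z2   = np.diag(dsigmoid(z2))   # dy/dz2,   1 x 1 (diagonal)
J_z2_a1  = W2                      # dz2/da1,  1 x 3
J_a1_z1  = np.diag(dsigmoid(z1))   # da1/dz1,  3 x 3 (diagonal)
# dz1/dvec(W1) for row-major flattening of W1: kron(I, x^T), 3 x 6
J_z1_W1  = np.kron(np.eye(3), x.reshape(1, -1))

# Gradient of the loss w.r.t. W1 as one chain of matrix products
grad_W1 = (J_loss_y @ J_y_z2 @ J_z2_a1 @ J_a1_z1 @ J_z1_W1).reshape(W1.shape)

# Sanity check against forward finite differences
eps = 1e-6
fd = np.zeros_like(W1)
for i in range(W1.shape[0]):
    for j in range(W1.shape[1]):
        Wp = W1.copy()
        Wp[i, j] += eps
        fd[i, j] = (forward(Wp, W2)[-1] - loss) / eps
print(np.allclose(grad_W1, fd, atol=1e-5))
```

Note that the Jacobians of the elementwise sigmoid layers are diagonal, which is why practical backpropagation implementations replace those matrix products with cheap elementwise multiplications.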


