Stability of Accuracy for the Training of DNNs Via the Uniform Doubling Condition

10/16/2022
by Yitzchak Shmalo, et al.

We study the stability of accuracy for the training of deep neural networks (DNNs). Here the training of a DNN is performed via the minimization of a cross-entropy loss function, and the performance metric is the accuracy (the proportion of objects classified correctly). While training amounts to a decrease of the loss, the accuracy does not necessarily increase during training. A recent result by Berlyand, Jabin and Safsten introduces a doubling condition on the training data which ensures the stability of accuracy during training for DNNs with the absolute value activation function. For training data in ℝ^n, this doubling condition is formulated using slabs in ℝ^n, and it depends on the choice of the slabs. The goal of this paper is twofold. First, to make the doubling condition uniform, that is, independent of the choice of slabs, leading to sufficient conditions for stability in terms of the training data only. Second, to extend the original stability results for the absolute value activation function to a broader class of piecewise linear activation functions with finitely many critical points, such as the popular Leaky ReLU.
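The claim that the loss can decrease while the accuracy does not increase, and may even drop, can be checked with a small numerical illustration. The sketch below is not from the paper; the toy logits and helper names are made up for illustration. It computes softmax cross-entropy and accuracy for a "before" and an "after" prediction on a two-sample, two-class problem: the average loss goes down while one sample flips to the wrong class, so accuracy falls from 1.0 to 0.5.

import math

def softmax(logits):
    # numerically stable softmax over a list of logits
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]

def cross_entropy(logits, label):
    # negative log-probability assigned to the true class
    return -math.log(softmax(logits)[label])

def accuracy(batch_logits, labels):
    # fraction of samples whose argmax matches the true label
    correct = sum(1 for z, y in zip(batch_logits, labels)
                  if max(range(len(z)), key=lambda i: z[i]) == y)
    return correct / len(labels)

labels = [0, 0]                       # both samples belong to class 0
before = [[0.2, 0.0], [0.4, 0.0]]     # both classified correctly, low confidence
after  = [[-0.1, 0.0], [3.0, 0.0]]    # first sample now misclassified, second very confident

for name, batch in [("before", before), ("after", after)]:
    loss = sum(cross_entropy(z, y) for z, y in zip(batch, labels)) / len(labels)
    print(name, "loss =", round(loss, 3), "accuracy =", accuracy(batch, labels))
# prints roughly: before loss = 0.556 accuracy = 1.0
#                 after  loss = 0.396 accuracy = 0.5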


