Classification with Deep Neural Networks and Logistic Loss

07/31/2023
by Zihan Zhang, et al.

Deep neural networks (DNNs) trained with the logistic loss (i.e., the cross-entropy loss) have achieved impressive advances in various binary classification tasks. However, generalization analysis for binary classification with DNNs and the logistic loss remains scarce. The unboundedness of the target function of the logistic loss is the main obstacle to deriving satisfactory generalization bounds. In this paper, we aim to fill this gap by establishing a novel and elegant oracle-type inequality, which enables us to handle the boundedness restriction on the target function, and by using it to derive sharp convergence rates for fully connected ReLU DNN classifiers trained with the logistic loss. In particular, we obtain optimal convergence rates (up to log factors) requiring only Hölder smoothness of the conditional class probability η of the data. Moreover, we consider a compositional assumption that requires η to be a composition of several vector-valued functions, each component function of which is either a maximum-value function or a Hölder smooth function depending on only a small number of its input variables. Under this assumption, we derive optimal convergence rates (up to log factors) that are independent of the input dimension of the data. This result explains why DNN classifiers can perform well in practical high-dimensional classification problems. Besides the novel oracle-type inequality, the sharp convergence rates in our paper also stem from a tight error bound for approximating the natural logarithm function near zero (where it is unbounded) by ReLU DNNs. In addition, we justify our claims about the optimality of the rates by proving corresponding minimax lower bounds. All these results are new to the literature and deepen our theoretical understanding of classification with DNNs.
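To make the setting concrete, the following is a minimal sketch (not the paper's construction) of the two objects the abstract refers to: a fully connected ReLU network computing a real-valued score f(x), and the logistic loss φ(t) = log(1 + e^{-t}) applied to the margin y·f(x) for labels y in {-1, +1}. All function and variable names here are illustrative.

```python
import math

def relu(x):
    """Componentwise ReLU activation."""
    return [max(0.0, v) for v in x]

def forward(x, layers):
    """Fully connected network: list of (weights, biases) layers,
    with ReLU on every layer except the last; returns a scalar score f(x)."""
    h = x
    for i, (W, b) in enumerate(layers):
        h = [sum(w * v for w, v in zip(row, h)) + bi for row, bi in zip(W, b)]
        if i < len(layers) - 1:
            h = relu(h)
    return h[0]

def logistic_loss(margin):
    """Logistic (cross-entropy) loss phi(y * f(x)) = log(1 + exp(-y * f(x)))."""
    return math.log1p(math.exp(-margin))

# Example: a tiny 1-hidden-layer ReLU network computing |x| = relu(x) + relu(-x),
# and the logistic loss at margin 0, which equals log 2.
layers = [([[1.0], [-1.0]], [0.0, 0.0]),   # hidden layer: 2 units
          ([[1.0, 1.0]], [0.0])]           # output layer: scalar score
score = forward([2.0], layers)             # -> 2.0
loss_at_zero = logistic_loss(0.0)          # -> log(2) ~ 0.693
```

Note that φ is unbounded as the margin tends to -∞; this is the unboundedness issue for the target function that the paper's oracle-type inequality is designed to handle.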


