The Loss Surfaces of Neural Networks with General Activation Functions

by Nicholas P. Baskerville, et al.

We present results extending the foundational work of Choromanska et al. (2015) on the complexity of the loss surfaces of multi-layer neural networks. We remove the strict reliance on ReLU activation functions and obtain broadly the same results for general activation functions. This is achieved with piece-wise linear approximations to general activation functions, Kac-Rice calculations akin to those of Auffinger, Ben Arous and Černý (2013), and asymptotic analysis made possible by supersymmetric methods. Our results strengthen the case for the conclusions of Choromanska et al. (2015), and the calculations contain various novel details required to deal with certain perturbations to the classical spin-glass calculations.
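
To make the phrase "piece-wise linear approximations to general activation functions" concrete, here is a minimal sketch of approximating a smooth activation (tanh) by linear interpolation between equally spaced knots. The abstract does not specify the construction used in the paper, so the equal spacing, the interval [-4, 4], and the helper names `piecewise_linear` and `max_error` are all assumptions made for illustration only.

```python
import math
from bisect import bisect_right


def piecewise_linear(f, lo, hi, pieces):
    """Build a piece-wise linear interpolant of f on [lo, hi].

    Assumption for illustration: `pieces` equal-width segments.
    Outside [lo, hi] the first or last segment is extended linearly.
    """
    knots = [lo + (hi - lo) * i / pieces for i in range(pieces + 1)]
    values = [f(x) for x in knots]

    def approx(x):
        # Index of the segment containing x, clamped to a valid segment.
        i = min(max(bisect_right(knots, x) - 1, 0), pieces - 1)
        slope = (values[i + 1] - values[i]) / (knots[i + 1] - knots[i])
        return values[i] + slope * (x - knots[i])

    return approx


def max_error(f, g, lo, hi, samples=2001):
    """Largest absolute gap between f and g on an even grid over [lo, hi]."""
    xs = [lo + (hi - lo) * k / (samples - 1) for k in range(samples)]
    return max(abs(f(x) - g(x)) for x in xs)


# Compare a coarse and a fine approximation of tanh.
coarse = piecewise_linear(math.tanh, -4.0, 4.0, 8)
fine = piecewise_linear(math.tanh, -4.0, 4.0, 64)
err_coarse = max_error(math.tanh, coarse, -4.0, 4.0)
err_fine = max_error(math.tanh, fine, -4.0, 4.0)
print(f"8 pieces:  max error {err_coarse:.4f}")
print(f"64 pieces: max error {err_fine:.4f}")
```

Increasing the number of pieces shrinks the gap to the original activation, which is the sense in which a general activation can be traded for a piece-wise linear one in this sketch.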




Complexity Measures for Neural Networks with General Activation Functions Using Path-based Norms

A simple approach is proposed to obtain complexity controls for neural n...

A Closer Look at Double Backpropagation

In recent years, an increasing number of neural network models have incl...

Neural Networks on Groups

Recent work on neural networks has shown that allowing them to build int...

L*ReLU: Piece-wise Linear Activation Functions for Deep Fine-grained Visual Categorization

Deep neural networks paved the way for significant improvements in image...

Efficient Neural Network Robustness Certification with General Activation Functions

Finding minimum distortion of adversarial examples and thus certifying r...

Approximating Activation Functions

ReLU is widely seen as the default choice for activation functions in ne...

Almost Sure Convergence of Dropout Algorithms for Neural Networks

We investigate the convergence and convergence rate of stochastic traini...