The Expressivity and Training of Deep Neural Networks: toward the Edge of Chaos?

10/11/2019
by Gege Zhang, et al.

Expressivity is one of the most significant issues in assessing neural networks. In this paper, we provide a quantitative analysis of expressivity from the perspective of dynamic models, employing Hilbert space to analyze convergence and criticality. From the Hermite-polynomial feature mapping of several widely used activation functions, we found sharp declines and even saddle points in the feature space, which stagnate information transfer in deep neural networks; we then present an activation function design based on Hermite polynomials for better utilization of the spatial representation. Moreover, we analyze information transfer in deep neural networks, emphasizing the convergence problem caused by the mismatch between the input and the topological structure. We also study the effects of input perturbations and regularization operators on critical expressivity. Finally, we verify the proposed method on multivariate time series prediction. The results show that the optimized DeepESN provides higher predictive performance, especially for long-term prediction. Our theoretical analysis reveals that deep neural networks use spatial domains for information representation and evolve toward the edge of chaos as depth increases. In actual training, whether a particular network ultimately reaches that regime depends on its ability to overcome convergence problems and to pass information to the required network depth.
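As a rough illustration of the Hermite-polynomial feature mapping the abstract refers to (a minimal sketch, not the authors' implementation), the snippet below expands common activations in probabilists' Hermite polynomials under a standard Gaussian input and reports the coefficient decay; the activation choices (tanh, ReLU), truncation order, and quadrature settings are illustrative assumptions.

```python
import math
import numpy as np
from numpy.polynomial import hermite_e as He  # probabilists' Hermite polynomials He_k


def hermite_coeffs(f, order=10, quad_points=200):
    """Coefficients c_k with f(z) ~ sum_k c_k He_k(z) for z ~ N(0, 1).

    Uses c_k = E[f(Z) He_k(Z)] / k!, estimated by Gauss-HermiteE quadrature.
    """
    x, w = He.hermegauss(quad_points)   # nodes/weights for the weight exp(-x^2 / 2)
    w = w / np.sqrt(2.0 * np.pi)        # rescale weights to the N(0, 1) density
    fx = f(x)
    return np.array([
        np.sum(w * fx * He.hermeval(x, [0.0] * k + [1.0])) / math.factorial(k)
        for k in range(order + 1)
    ])


relu = lambda z: np.maximum(z, 0.0)
for name, act in [("tanh", np.tanh), ("relu", relu)]:
    c = hermite_coeffs(act)
    # Fast decay of |c_k| means the activation excites only a few directions of the
    # Hermite feature space; slower decay indicates a richer spatial representation.
    print(name, np.round(c, 4))
```

Under these assumptions, comparing how quickly the coefficients vanish for different activations gives a concrete handle on the "utilization of spatial representation" discussed above.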

