
On Learnability via Gradient Method for Two-Layer ReLU Neural Networks in Teacher-Student Setting
Deep learning empirically achieves high performance in many applications...

FitNets: Hints for Thin Deep Nets
While depth tends to improve network performances, it also makes gradien...

Luck Matters: Understanding Training Dynamics of Deep ReLU Networks
We analyze the dynamics of training deep ReLU networks and their implica...

A new role for circuit expansion for learning in neural networks
Many sensory pathways in the brain rely on sparsely active populations o...

Ternary Neural Networks for Resource-Efficient AI Applications
The computation and storage requirements for Deep Neural Networks (DNNs)...

On the geometry of solutions and on the capacity of multilayer neural networks with ReLU activations
Rectified Linear Units (ReLU) have become the main model for the neural ...

Active online learning in the binary perceptron problem
The binary perceptron is the simplest artificial neural network formed b...
From complex to simple: hierarchical free-energy landscape renormalized in deep neural networks
We develop a statistical mechanical approach based on the replica method to study the solution space of deep neural networks. Specifically, we analyze the configuration space of the synaptic weights in a simple feed-forward perceptron network within a Gaussian approximation for two scenarios: a setting with random inputs/outputs and a teacher-student setting. As the strength of the constraints increases, i.e. as the number of imposed patterns grows, successive second-order glass transitions (random inputs/outputs) or second-order crystalline transitions (teacher-student setting) take place layer by layer, starting next to the input/output boundaries and going deeper into the bulk. For a deep enough network, the central part of the network remains in the liquid phase. We argue that in systems of finite width, a weak bias field remains in the central part and plays the role of a symmetry-breaking field that connects the opposite sides of the system. In the setting with random inputs/outputs, the successive glass transitions bring about a hierarchical free-energy landscape with ultrametricity, which evolves in space: it is most complex close to the boundaries but is renormalized into progressively simpler landscapes in deeper layers. These observations provide clues to understanding why deep neural networks operate efficiently. Finally, we present results of a set of numerical simulations to examine the theoretical predictions.
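The teacher-student setting referenced in the abstract can be illustrated with a minimal sketch (an assumption of this note, not the authors' code): a fixed "teacher" perceptron generates labels for random input patterns, a "student" is trained on those patterns, and the teacher-student weight overlap serves as the order parameter that distinguishes the aligned (crystalline) phase from the disordered one.

```python
import numpy as np

# Hypothetical minimal example of a single-layer teacher-student perceptron;
# sizes and the training rule are illustrative choices, not the paper's setup.
rng = np.random.default_rng(0)

N = 100   # input dimension (network width)
P = 500   # number of imposed input/output patterns (constraint strength)

teacher = rng.standard_normal(N)        # fixed teacher weights
X = rng.standard_normal((P, N))         # random input patterns
y = np.sign(X @ teacher)                # labels produced by the teacher

# Train a student to reproduce the teacher's labels; the classic
# perceptron learning rule stands in for the learning dynamics.
student = np.zeros(N)
for _ in range(100):                    # training sweeps
    for x, t in zip(X, y):
        if np.sign(x @ student) != t:
            student += t * x / N        # update toward the misclassified label

# Teacher-student overlap: the order parameter of the crystalline transition.
overlap = (teacher @ student) / (
    np.linalg.norm(teacher) * np.linalg.norm(student)
)
print(round(float(overlap), 2))
```

As the load P/N grows, the trained student's overlap with the teacher increases, which is the single-layer analogue of the layer-wise crystalline ordering described above.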