We study the loss surface of DNNs with L_2 regularization. We show that...
For deep linear networks (DLNs), various hyperparameters alter the dynami...
We study how permutation symmetries in overparameterized multi-layer neu...
Modern mathematics is built on the idea that proofs should be translatab...
We study the risk (i.e. generalization error) of Kernel Ridge Regression...
Random Feature (RF) models are used as efficient parametric approximatio...
The dynamics of DNNs during gradient descent is described by the so-call...
In this paper, we analyze a number of architectural features of Deep Neu...
We provide a description for the evolution of the generalization perform...
At initialization, artificial neural networks (ANNs) are equivalent to G...