
Globally Optimal Gradient Descent for a ConvNet with Gaussian Inputs
Deep learning models are often successfully trained using gradient desce...

When is a Convolutional Filter Easy To Learn?
We analyze the convergence of (stochastic) gradient descent algorithm fo...

Learning Over-Parametrized Two-Layer ReLU Neural Networks beyond NTK
We consider the dynamic of gradient descent for learning a two-layer neu...

Learning ReLU Networks via Alternating Minimization
We propose and analyze a new family of algorithms for training neural ne...

On the geometry of solutions and on the capacity of multilayer neural networks with ReLU activations
Rectified Linear Units (ReLU) have become the main model for the neural ...

Tensor Switching Networks
We present a novel neural network algorithm, the Tensor Switching (TS) n...

Local Geometry of One-Hidden-Layer Neural Networks for Logistic Regression
We study the local geometry of a one-hidden-layer fully-connected neural...

Learning One-hidden-layer ReLU Networks via Gradient Descent
We study the problem of learning one-hidden-layer neural networks with the Rectified Linear Unit (ReLU) activation function, where the inputs are sampled from a standard Gaussian distribution and the outputs are generated by a noisy teacher network. We analyze the performance of gradient descent for training such networks via empirical risk minimization, and provide algorithm-dependent guarantees. In particular, we prove that tensor initialization followed by gradient descent converges to the ground-truth parameters at a linear rate, up to some statistical error. To the best of our knowledge, this is the first work to characterize the recovery guarantee for practical learning of one-hidden-layer ReLU networks with multiple neurons. Numerical experiments verify our theoretical findings.
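The setup described in the abstract can be sketched in a few lines. The sketch below assumes a squared-loss empirical risk and a teacher of the form y = Σ_j ReLU(w_j* · x) + noise, details the abstract does not spell out; it also replaces the paper's tensor initialization with a small random start for brevity, so recovery of the ground-truth weights is not guaranteed here, only that plain gradient descent drives the empirical risk down.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative problem sizes: d input dimensions, k hidden neurons, n samples.
d, k, n = 10, 3, 5000

def relu(z):
    return np.maximum(z, 0.0)

# Ground-truth teacher weights W_star (k x d); the assumed teacher output
# is the sum of k ReLU units plus Gaussian noise.
W_star = rng.standard_normal((k, d))

# Gaussian inputs and noisy teacher labels, as in the abstract.
X = rng.standard_normal((n, d))
y = relu(X @ W_star.T).sum(axis=1) + 0.01 * rng.standard_normal(n)

def empirical_risk(W):
    return 0.5 * np.mean((relu(X @ W.T).sum(axis=1) - y) ** 2)

def gradient(W):
    H = X @ W.T                      # pre-activations, n x k
    r = relu(H).sum(axis=1) - y      # residuals, n
    # d/dW_j of the risk: mean over samples of r * 1{h_j > 0} * x
    return ((H > 0) * r[:, None]).T @ X / n

# Plain gradient descent from a small random start (the paper uses
# tensor initialization instead; random initialization is for illustration).
W = 0.1 * rng.standard_normal((k, d))
loss_init = empirical_risk(W)
for _ in range(500):
    W -= 0.05 * gradient(W)
loss_final = empirical_risk(W)
```

Note that the hidden units are recovered only up to a permutation at best, which is why the sketch tracks the empirical risk rather than the distance to W_star.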