Global Convergence of Frank Wolfe on One Hidden Layer Networks

02/06/2020
by   Alexandre d'Aspremont, et al.

We derive global convergence bounds for the Frank-Wolfe algorithm when training one-hidden-layer neural networks. When using the ReLU activation function, and under tractable preconditioning assumptions on the sample data set, the linear minimization oracle used to incrementally form the solution can be solved explicitly as a second-order cone program. The classical Frank-Wolfe algorithm then converges with rate O(1/T), where T is both the number of neurons and the number of calls to the oracle.
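Below is a minimal sketch of how classical Frank-Wolfe can incrementally build such a network, adding one neuron per oracle call. It assumes a squared loss, and the function and variable names (lmo, frank_wolfe_relu, radius) are illustrative only; the lmo routine here is a crude random-search placeholder, whereas the paper solves the linear minimization oracle exactly as a second-order cone program, which is not reproduced here.

import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def lmo(residual, X):
    # Placeholder oracle (assumption, not the paper's method): returns a
    # unit-norm weight vector w maximizing |<relu(X w), residual>| over a
    # set of random candidate directions.
    best_w, best_val = None, -np.inf
    for _ in range(200):                       # crude random search, illustration only
        w = np.random.randn(X.shape[1])
        w /= np.linalg.norm(w)
        val = abs(relu(X @ w) @ residual)
        if val > best_val:
            best_w, best_val = w, val
    return best_w

def frank_wolfe_relu(X, y, T=50, radius=10.0):
    # Classical Frank-Wolfe on the squared loss 0.5*||f - y||^2, where f is
    # the network output on the data; one neuron is added per iteration.
    n = X.shape[0]
    neurons, coeffs = [], []
    f = np.zeros(n)
    for t in range(T):
        residual = f - y                       # gradient of the loss w.r.t. f
        w = lmo(residual, X)                   # new neuron from the oracle
        a = relu(X @ w)
        sign = -np.sign(a @ residual)
        s = radius * sign * a                  # vertex of the (scaled) atom set
        gamma = 2.0 / (t + 2.0)                # classical step size, gives O(1/T) rate
        f = (1.0 - gamma) * f + gamma * s
        coeffs = [(1.0 - gamma) * c for c in coeffs]
        neurons.append(w)
        coeffs.append(gamma * radius * sign)
    return neurons, coeffs

The returned network evaluates as f(x) = sum_j coeffs[j] * relu(x @ neurons[j]); after T iterations it has exactly T neurons, one per call to the oracle, matching the rate statement above.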
