Learning ReLU Networks via Alternating Minimization

06/20/2018
by Gauri Jagatap, et al.

We propose and analyze a new family of algorithms for training neural networks with ReLU activations. Our algorithms are based on the technique of alternating minimization: estimating the activation pattern of each ReLU for all given samples, interleaved with weight updates via a least-squares step. We consider three cases of this model: (i) a single ReLU; (ii) 1-hidden-layer networks with k hidden ReLUs; and (iii) 2-hidden-layer networks. We show that under standard distributional assumptions on the input data, our algorithm provably recovers the true "ground truth" parameters at a linear convergence rate; furthermore, our method requires only O(d) samples in the single-ReLU case and O(dk^2) samples in the 1-hidden-layer case. We also extend this framework to deeper networks and empirically demonstrate its convergence to a global minimum.
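To make the alternating scheme concrete, below is a minimal sketch of the idea for the single-ReLU case: alternate between estimating which samples activate the ReLU under the current weight estimate and a least-squares weight update on those samples. The function name, the least-squares initialization, and the stopping rule are illustrative assumptions, not the paper's exact algorithm or initialization.

import numpy as np

def altmin_single_relu(X, y, iters=50):
    """Sketch: fit y ~ ReLU(X @ w) by alternating minimization.

    Assumptions (not from the paper): least-squares initialization,
    a fixed iteration budget, and noiseless observations.
    """
    n, d = X.shape
    # Initialize weights; the paper uses a carefully chosen initialization,
    # a plain least-squares fit is used here as a stand-in.
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    for _ in range(iters):
        # Step 1: estimate the ReLU activation pattern of every sample
        # under the current weight estimate.
        active = (X @ w) > 0
        if not active.any():
            break
        # Step 2: least-squares weight update on the samples estimated
        # as active (inactive samples contribute zero output).
        w, *_ = np.linalg.lstsq(X[active], y[active], rcond=None)
    return w

# Usage: recover a planted single-ReLU model from Gaussian samples.
rng = np.random.default_rng(0)
d, n = 10, 500
w_true = rng.normal(size=d)
X = rng.normal(size=(n, d))
y = np.maximum(X @ w_true, 0.0)
w_hat = altmin_single_relu(X, y)
print(np.linalg.norm(w_hat - w_true))  # small if recovery succeeds

In the 1- and 2-hidden-layer cases the same alternation applies per hidden unit, with the activation patterns of all k ReLUs re-estimated jointly before each least-squares step.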


Related research:

06/20/2018 · Learning One-hidden-layer ReLU Networks via Gradient Descent
We study the problem of learning one-hidden-layer neural networks with R...

02/06/2020 · Global Convergence of Frank Wolfe on One Hidden Layer Networks
We derive global convergence bounds for the Frank Wolfe algorithm when t...

08/14/2018 · Learning ReLU Networks on Linearly Separable Data: Algorithm, Optimality, and Generalization
Neural networks with ReLU activations have achieved great empirical succ...

05/31/2023 · Alternating Minimization for Regression with Tropical Rational Functions
We propose an alternating minimization heuristic for regression over the...

10/17/2018 · Finite sample expressive power of small-width ReLU networks
We study universal finite sample expressivity of neural networks, define...

07/05/2019 · Prior Activation Distribution (PAD): A Versatile Representation to Utilize DNN Hidden Units
In this paper, we introduce the concept of Prior Activation Distribution...

11/20/2020 · A global universality of two-layer neural networks with ReLU activations
In the present study, we investigate a universality of neural networks, ...
