1 Introduction
In this paper, we analyze vector-output two-layer ReLU neural networks from an optimization perspective. These networks, while simple, are the building blocks of deep networks, which have been found to perform tremendously well on a variety of tasks. We find that vector-output networks regularized with standard weight decay have a convex semi-infinite strong dual: a convex program with infinitely many constraints. However, this strong dual has a finite parameterization, though expressing this parameterization is nontrivial. In particular, we find that expressing a vector-output neural network as a convex program requires taking the convex hull of completely positive matrices. Thus, we find an intimate, novel connection between neural network training and copositive programs, i.e. programs over the set of completely positive matrices (anjos2011handbook). We describe algorithms which find the global minimum of the neural network training problem in polynomial time for data matrices of fixed rank, a setting which holds for convolutional architectures. We also demonstrate that under certain conditions we can provably find the optimal solution to the neural network training problem using soft-thresholded Singular Value Decomposition (SVD). In the general case, we introduce a relaxation to parameterize the neural network training problem, which in practice we find to be tight in many circumstances.
1.1 Related work
Our analysis focuses on the optima of finite-width neural networks. This approach contrasts with approaches which analyze infinite-width neural networks, such as the Neural Tangent Kernel (jacot2018neural). Despite advancements in this direction, infinite-width neural networks do not exactly correspond to their finite-width counterparts, and thus this method of analysis is insufficient for fully explaining their success (arora2019exact).
Other works optimize neural networks under assumptions on the data distribution. Of particular interest is (ge2018learning)
, which demonstrates that a polynomial number of samples generated from a planted neural network model is sufficient for extracting its parameters using tensor methods, assuming the inputs are drawn from a symmetric distribution. If the input distribution to a simple convolutional neural network with one filter is Gaussian, it has also been shown that gradient descent can find the global optimum in polynomial time
(brutzkus2017globally). In contrast to these works, we seek general principles for learning two-layer ReLU networks, regardless of the data distribution and without planted model assumptions. Another line of work aims to understand the success of neural networks via implicit regularization, which analyzes how models trained with Stochastic Gradient Descent (SGD) find solutions which generalize well, even without explicit control of the optimization objective (gunasekar2017implicit; neyshabur2014search). In contrast, we consider the setting of explicit regularization, which is often used in practice in the form of weight decay. Weight decay regularizes the sum of squared norms of the network weights with a single regularization parameter, and can be critical for neural network performance (golatkar2019time).
Our approach of analyzing finitewidth neural networks with a fixed training dataset has been explored for networks with a scalar output (pilanci2020neural; ergen2020aistats; ergen2020training). In fact, our work here can be considered a generalization of these results. We consider a ReLUactivation twolayer network with neurons:
(1) 
where the function denotes the ReLU activation, are the first-layer weights of the network, and are the second-layer weights. In the scalar-output case, the second-layer weights are scalars. pilanci2020neural find that the neural network training problem in this setting corresponds to a finite-dimensional convex program.
However, the setting of scalar-output networks is limited. In particular, it cannot account for tasks such as multi-class classification or multi-dimensional regression, which are among the most common uses of neural networks. In contrast, the vector-output setting is quite general, and even greedily training and stacking such shallow vector-output networks can match or even exceed the performance of deeper networks on large classification datasets (belilovsky2019greedy). We find that extending the scalar case to the vector-output case is exceedingly nontrivial and generates novel insights. Thus, generalizing the results of pilanci2020neural is an important step toward a more complete understanding of the behavior of neural networks in practice.
Certain works have also considered technical problems which arise in our analysis, though their applications are entirely different. Among these is analysis of cone-constrained PCA, as explored by deshpande2014cone and asteris2014nonnegative. They consider the following optimization problem
(2) 
This problem is in general NP-hard. asteris2014nonnegative provide an algorithm which finds the exact solution to (2) for a symmetric matrix, with running time exponential in general. We leverage this result to show that the optimal value of the vector-output neural network training problem can be found in the worst case in exponential time, while in the case of a fixed-rank data matrix our algorithm is polynomial-time. In particular, convolutional networks with fixed filter sizes (e.g., convolutional kernels) correspond to the fixed-rank data case. In search of a polynomial-time approximation to (2), deshpande2014cone evaluate a relaxation of the above problem, given as
(3) 
While the relaxation is not tight in all cases, the authors find that in practice it works quite well for approximating the solution to the original optimization problem. This relaxation corresponds to what we call a copositive relaxation, because it consists of a relaxation of the set . When and the norm constraint is removed, this is the set of completely positive matrices (dur2010copositive). Optimizing over the set of completely positive matrices is NP-hard, as is optimizing over its convex hull:
Thus, optimizing over is a convex optimization problem which is nevertheless NP-hard. Various relaxations have been proposed, such as the copositive relaxation used by deshpande2014cone above:
In fact, this relaxation is tight, given that and (burer2015gentle; kogan1993characterization). However, for , so the copositive relaxation provides a lower bound in the general case. These theoretical results prove insightful for understanding the neural network training objective.
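To make the cone-constrained PCA problem (2) concrete, the following is a minimal sketch of an exact brute-force solver for the special case of nonnegative PCA, maximizing u^T A u subject to ||u||_2 = 1 and u >= 0. All function and variable names here are ours, and this is a simplified stand-in for the exact algorithm of asteris2014nonnegative, not their method:

```python
import numpy as np
from itertools import combinations

def nonneg_pca_bruteforce(A):
    """Exact nonnegative PCA: max u^T A u  s.t.  ||u||_2 = 1, u >= 0.

    At an optimum with support S, stationarity implies u_S is an
    eigenvector of the principal submatrix A[S, S]; enumerating all
    supports and eigenvectors is exponential in n but exact.
    """
    n = A.shape[0]
    best_val, best_u = -np.inf, None
    for k in range(1, n + 1):
        for S in combinations(range(n), k):
            S = list(S)
            # eigh returns real eigenpairs of the symmetric submatrix
            vals, vecs = np.linalg.eigh(A[np.ix_(S, S)])
            for i in range(len(S)):
                v = vecs[:, i]
                if np.all(v <= 1e-12):   # flip sign if entirely nonpositive
                    v = -v
                if np.all(v >= -1e-12):  # feasible: nonnegative direction
                    u = np.zeros(n)
                    u[S] = np.clip(v, 0.0, None)
                    u /= np.linalg.norm(u)
                    val = u @ A @ u
                    if val > best_val:
                        best_val, best_u = val, u
    return best_val, best_u

# With A = [[1, -2], [-2, 1]], the unconstrained top eigenvalue is 3
# (direction (1, -1)/sqrt(2)), but nonnegativity forces the optimum
# onto a single coordinate, giving value 1.
A = np.array([[1.0, -2.0], [-2.0, 1.0]])
val, u = nonneg_pca_bruteforce(A)
print(val)  # -> 1.0
```

The enumeration over supports makes this method exponential in the dimension, mirroring the worst-case hardness discussed above.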
1.2 Contributions

We find the semi-infinite convex strong dual of the vector-output two-layer ReLU neural network training problem, and prove that it has a finite-dimensional exact convex optimization representation.

We establish a new connection between vector-output neural networks, copositive programs, and cone-constrained PCA problems, yielding new insights into the nature of vector-output neural network training which extend the results of the scalar-output case.

We provide methods that globally solve the vector-output neural network training problem in polynomial time for data matrices of a fixed rank; for the full-rank case, the complexity is necessarily exponential in , assuming .

We provide conditions on the training data and labels under which we can find a closed-form expression for the optimal weights of a vector-output ReLU neural network using soft-thresholded SVD.

We propose a copositive relaxation to establish a heuristic for solving the neural network training problem. This copositive relaxation is often tight in practice.
2 Preliminaries
In this work, we consider fitting labels from inputs with a two-layer neural network with ReLU activation and neurons in the hidden layer. This network is trained with weight-decay regularization on all of its weights, with associated parameter
. For some general loss function
, this gives us the nonconvex primal optimization problem
(4)
In the simplest case, with a fully-connected network trained with squared loss (Appendix A.6 contains extensions to general convex loss functions), this becomes:
(5) 
However, alternative models can be considered. For example, ergen2020training consider two-layer CNNs with global average pooling, for which we can define the patch matrices , the patches upon which individual convolutions operate. Then, the vector-output neural network training problem with global average pooling becomes
(6) 
We will show that in this convolutional setting, because the rank of the set of patches
cannot exceed the filter size of the convolutions, there exists an algorithm, polynomial in all problem dimensions, which finds the global optimum of this problem. We note that such patch matrices typically exhibit rapid singular value decay due to spatial correlations, which may also motivate replacing them with approximations of much smaller rank.
In the following section, we will demonstrate that the vector-output neural network problem has a convex semi-infinite strong dual. To understand how to parameterize this semi-infinite dual in a finite fashion, we must introduce the concept of hyperplane arrangements. We consider the set of diagonal matrices
This is a finite set of diagonal matrices, dependent on the data matrix , which encode the possible activation patterns of the ReLU nonlinearity: a value of 1 indicates that the neuron is active, while 0 indicates that it is inactive. In particular, we can enumerate the set of sign patterns as , where depends on but is in general bounded by
for (pilanci2020neural; stanley2004introduction). Note that for a fixed rank , such as in the convolutional case above, is polynomial in . Using these sign patterns, we can completely characterize the range space of the first layer after the ReLU:
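These sign patterns can be probed numerically by sampling random directions u and recording the indicator patterns 1[Xu >= 0]. A hedged sketch (all names ours; random sampling can only lower-bound the true number of patterns, which for data in general position is capped by the classical hyperplane-arrangement count):

```python
import numpy as np
from math import comb

def sample_sign_patterns(X, n_samples=10000, seed=0):
    """Sample hyperplane-arrangement patterns 1[Xu >= 0] for random u."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    patterns = set()
    for _ in range(n_samples):
        u = rng.standard_normal(d)
        patterns.add(tuple((X @ u >= 0).astype(int)))
    return patterns

n, d = 8, 2
X = np.random.default_rng(1).standard_normal((n, d))
P = sample_sign_patterns(X)
# Classical bound on the number of regions cut out by n hyperplanes
# through the origin in R^d: 2 * sum_{k=0}^{d-1} C(n-1, k)
bound = 2 * sum(comb(n - 1, k) for k in range(d))
print(len(P), bound)
```

For fixed d this bound is polynomial in n, which is the source of the fixed-rank polynomial-time results discussed throughout the paper.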
We also introduce a class of data matrices for which the analysis of scalar-output neural networks simplifies greatly, as shown in (ergen2020convex). These matrices are called spike-free matrices. In particular, a matrix is called spike-free if it holds that
(7) 
When is spike-free, the set of sign patterns reduces to a single sign pattern, , because of the identity in (7). The set of spike-free matrices includes (but is not limited to) diagonal matrices and whitened matrices for which , such as the output of Zero-phase Component Analysis (ZCA) whitening. Whitening the data matrix has been shown to improve the performance of neural networks, even in deeper settings where the whitening transformation is applied to batches of data at each layer (huang2018decorrelated). We will see that spike-free matrices admit algorithms, polynomial in both and , for finding the global optimum of the neural network training problem (ergen2020convex), though the same does not hold for vector-output networks.
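As a quick numerical illustration of the whitened case, the following sketch (our own construction) builds a ZCA-style row whitening of a random data matrix and checks the identity XX^T = I; this identity is our reading of the elided whitening condition, valid when the number of samples is at most the dimension and the data has full row rank:

```python
import numpy as np

def zca_whiten_rows(X):
    """Row whitening via W = (X X^T)^{-1/2}, so that the whitened
    matrix Xw = W X satisfies Xw Xw^T = I (requires n <= d and full
    row rank)."""
    G = X @ X.T                                 # n x n Gram matrix
    vals, vecs = np.linalg.eigh(G)              # symmetric eigendecomposition
    W = vecs @ np.diag(vals ** -0.5) @ vecs.T   # inverse matrix square root
    return W @ X

rng = np.random.default_rng(0)
X = rng.standard_normal((5, 20))                # n=5 samples, d=20 features
Xw = zca_whiten_rows(X)
print(np.allclose(Xw @ Xw.T, np.eye(5)))        # -> True
```

The inverse square root is computed through the eigendecomposition of the Gram matrix, which is the standard construction behind ZCA-type transforms.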
2.1 Warm-Up: Scalar-Output Networks
We first present strong duality results for the scalar-output case, i.e. the case where .
Theorem (pilanci2020neural) There exists an such that if , for all , the neural network training problem (5) has a convex semi-infinite strong dual, given by
(8) 
Furthermore, the neural network training problem has a convex, finitedimensional strong bidual, given by
(9) 
This is a convex program with variables and linear inequalities. Solving this problem with standard interior point solvers thus has a complexity of , which is exponential in , but polynomial in for a fixed rank.
In the case of a spike-free , however, the dual problem simplifies to a single sign pattern constraint . Then the convex strong bidual becomes (ergen2020convex)
(10) 
This convex problem has a much simpler form, with only linear inequality constraints and variables, and therefore a complexity of . We will see that these scalar-output results are a special case of the vector-output results.
3 Strong Duality
3.1 Convex Semi-infinite Duality
Theorem 1
There exists an such that if , for all , the neural network training problem (5) has a convex semi-infinite strong dual, given by
(11) 
Furthermore, the neural network training problem has a convex, finitedimensional strong bidual, given by
(12) 
for convex sets
(13) 
The strong dual given in (11) is convex, albeit with infinitely many constraints. In contrast, (12) is a convex problem with finitely many constraints. This convex model learns a sparse set of locally linear models
which are constrained to lie in a convex set, for which group sparsity and low-rankness over hyperplane arrangements are induced by the sum-of-nuclear-norms penalty. The emergence of the nuclear norm penalty is particularly interesting, since similar norms have also been used for rank minimization problems
(candes2010power; recht2010guaranteed), proposed as implicit regularizers for matrix factorization models (gunasekar2017implicit), and are similar to nuclear norm regularization in multi-task learning (argyriou2008convex; abernethy2009new) and trace Lasso (grave2011trace). We note the similarity of this result to that of pilanci2020neural, whose formulation is a special case of this result with , where the sets reduce to a simpler form, from which we can obtain the convex program presented by pilanci2020neural. Further, this result extends to CNNs with global average pooling, which is discussed in Appendix A.3.2.
Remark 1.1
It is interesting to observe that the convex program (12) can be interpreted as a piecewise low-rank model that is partitioned according to the set of hyperplane arrangements of the data matrix. In other words, a two-layer ReLU network with vector output is precisely a linear learner over the features , where convex constraints and group nuclear norm regularization are applied to the linear model weights. In the case of CNNs, the piecewise low-rank model is over the smaller-dimensional patch matrices , which yield significantly fewer hyperplane arrangements and, therefore, fewer local low-rank models.
3.2 Provably Solving the Neural Network Training Problem
In this section, we present a procedure for minimizing the convex program (12) for general output dimension . This procedure relies on Algorithm 5 for cone-constrained PCA from (asteris2014nonnegative), and the Frank-Wolfe algorithm for constrained convex optimization (frank1956algorithm). Unlike SGD, which is a heuristic method applied to a nonconvex training problem, this approach is built upon results from convex optimization and provably finds the global minimum of the objective. In particular, we can solve the problem in epigraph form,
(14) 
where we can perform bisection over in an outer loop to determine the overall optimal value of (12). Then, we have the following algorithm to solve the inner minimization problem of (14):
Algorithm 1:

Initialize .

For steps :

For each solve the following subproblem:
And define the pairs to be the argmaxes of the above subproblems. This is a form of semi-nonnegative matrix factorization (semi-NMF) on the residual at step (ding2008convex). It can be solved via cone-constrained PCA in time, where .

For the semi-NMF factorization obtaining the largest objective value, , form . For all other , simply let .

For step size , update

The derivations of the method and complexity of Algorithm 1 are found in Appendix A.4. We have thus described a Frank-Wolfe algorithm which provably minimizes the convex dual problem, where each step requires a semi-NMF operation, which can be performed in time.
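To illustrate the Frank-Wolfe template that Algorithm 1 instantiates, here is a minimal sketch on a simplified surrogate: minimizing a squared loss over a nuclear-norm ball, where the linear minimization oracle is a top singular pair of the gradient. In Algorithm 1 proper, the oracle is the cone-constrained semi-NMF step instead; we omit the cone constraints here, and all names are ours:

```python
import numpy as np

def frank_wolfe_nuclear(X, Y, t, steps=500):
    """Frank-Wolfe for  min_V ||X V - Y||_F^2  s.t.  ||V||_* <= t.

    The linear minimization oracle over the nuclear-norm ball is
    -t * u1 v1^T, where (u1, v1) is the top singular pair of the
    gradient."""
    n, d = X.shape
    c = Y.shape[1]
    V = np.zeros((d, c))
    for k in range(steps):
        grad = 2 * X.T @ (X @ V - Y)
        U, s, Vt = np.linalg.svd(grad, full_matrices=False)
        S = -t * np.outer(U[:, 0], Vt[0, :])  # extreme point minimizing <grad, .>
        gamma = 2.0 / (k + 2.0)               # standard Frank-Wolfe step size
        V = (1 - gamma) * V + gamma * S
    return V

rng = np.random.default_rng(0)
X = rng.standard_normal((30, 10))
V_true = np.outer(rng.standard_normal(10), rng.standard_normal(4))  # rank-1 target
Y = X @ V_true
V = frank_wolfe_nuclear(X, Y, t=np.linalg.svd(V_true, compute_uv=False).sum())
# Relative residual; shrinks toward 0 as the number of steps grows.
print(np.linalg.norm(X @ V - Y) / np.linalg.norm(Y))
```

Each iteration adds one rank-one atom, mirroring how Algorithm 1 activates one sign pattern and one rank-one factor per step.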
3.3 Spikefree Data Matrices and ClosedForm Solutions
As discussed in Section 2, if is spike-free, the set of sign patterns is reduced to the single pattern . Then, the convex program (12) becomes
(15) 
This problem can also be solved with Algorithm 1. However, the asymptotic complexity of this algorithm is unchanged, due to the cone-constrained PCA step. If the constraint on were removed, (15) would be identical to optimizing a linear-activation network. However, the additional cone constraint on allows for a more complex representation, which demonstrates that even in the spike-free case, a ReLU-activation network is quite different from a linear-activation network.
Recalling that whitened data matrices where are spike-free, we can find a closed-form expression for the optimal weights for a further simplified class of data and label matrices.
Theorem 2
Consider a whitened data matrix where , and labels with SVD . If the left-singular vectors of satisfy , there exists a closed-form solution for the optimal in problem (15), given by
(16) 
The resulting model is a soft-thresholded SVD of , which arises as the solution of maximum-margin matrix factorization (srebro2005maximum). The scenario that all of the left-singular vectors of satisfy the affine constraints occurs, for example, when all of the left-singular vectors are nonnegative, which is the case when
is a one-hot-encoded matrix. In this scenario, where the left-singular vectors of
satisfy , we note that the ReLU constraint on is not active, and therefore the solution of the ReLU-activation network training problem is identical to that of the linear-activation network. This linear-activation setting has been well studied, such as in matrix factorization models by (cabral2013unifying; li2017geometry), and in the context of the implicit bias of dropout by (mianjy2018implicit; mianjy2019dropout). This theorem thus provides a setting in which the ReLU-activation neural network performs identically to a linear-activation network.
4 Neural Networks and Copositive Programs
4.1 An equivalent copositive program
We now present an alternative representation of the neural network training problem with squared loss, which has ties to copositive programming.
Theorem 3
For all , the neural network training problem (5) has a convex strong dual, given by
(17) 
for convex sets , given by
(18) 
This is a minimization problem with a convex objective over sets of convex, completely positive cones: a copositive program, which is NP-hard. There exists a cutting plane algorithm which solves this problem in , which is polynomial in for data matrices of rank (see Appendix A.5). This formulation provides a framework for viewing ReLU neural networks as implicit copositive programs, and we can find conditions under which certain relaxations provide optimal solutions.
4.2 A copositive relaxation
We consider the copositive relaxation of the sets from (17). We denote this set
In general, , with equality when (kogan1993characterization; dickinson2013copositive). We define the relaxed program as
(19) 
Because of the enumeration over sign patterns, this relaxed program still has a complexity of to solve, and thus does not improve upon the asymptotic complexity presented in Section 3.
4.3 Spike-free data matrices
If is restricted to be spike-free, the convex program (19) becomes
(20) 
With spike-free data matrices, the copositive relaxation provides a heuristic algorithm for the neural network training problem which is polynomial in both and . This contrasts with the exact formulations (12) and (17), for which the neural network training problem is exponential even for a spike-free . Table 1 summarizes the complexities of the neural network training problem.
Scalar-output  Vector-output (exact)  Vector-output (relaxation)

Spike-free
General
5 Experiments
5.1 Does SGD always find the global optimum for neural networks?
While SGD applied to the nonconvex neural network training objective is a heuristic which works quite well in many cases, there may exist pathological cases where SGD fails to find the global minimum. Using Algorithm 1, we can now verify whether SGD finds the global optimum. In this experiment, we present one case where SGD has trouble finding the optimal solution in certain circumstances. In particular, we generate random inputs , where the elements of
are drawn from an i.i.d. standard Gaussian distribution:
. We then randomly initialize a data-generating neural network with 100 hidden neurons and an output dimension of 5, and generate labels using this model. We then attempt to fit these labels using a neural network with squared loss and . We compare the results of training this network for 5 trials with 10 and 50 neurons to the global optimum found by Algorithm 1. In this circumstance, with 10 neurons, none of the realizations of SGD converge to the global optimum found by Algorithm 1, but with 50 neurons, the loss is nearly identical to that found by Algorithm 1.
5.2 Maximum-Margin Matrix Factorization
In this section, we evaluate the performance of the soft-thresholded SVD closed-form solution presented in Theorem 2. To evaluate this method, we take a subset of 3000 points from the CIFAR10 and CIFAR100 datasets
(krizhevsky2009learning). For each dataset, we first demean the data matrix , then whiten it using ZCA whitening. We seek to fit one-hot-encoded labels representing the class labels of these datasets. We compare the results of our algorithm from Theorem 2 and that of SGD in Fig. 6. As we can see, in both cases the soft-thresholded SVD solution finds the same value as SGD, yet in far less time. Appendix A.1.4 contains further details about this experiment, including test accuracy plots.
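The closed form evaluated here is a soft-thresholded SVD. As a sketch of the underlying operation (the precise constants of Theorem 2 are elided in our copy, so we show the generic proximal form min_Z 0.5||Z - Y||_F^2 + beta ||Z||_*, with names ours):

```python
import numpy as np

def soft_thresholded_svd(Y, beta):
    """Closed-form solution of  min_Z 0.5*||Z - Y||_F^2 + beta*||Z||_* :
    soft-threshold the singular values of Y by beta."""
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    return U @ np.diag(np.maximum(s - beta, 0.0)) @ Vt

rng = np.random.default_rng(0)
Y = rng.standard_normal((8, 5))
beta = 0.5
Z = soft_thresholded_svd(Y, beta)

def objective(Z_):
    return 0.5 * np.linalg.norm(Z_ - Y) ** 2 \
        + beta * np.linalg.svd(Z_, compute_uv=False).sum()

# The closed form is the unique minimizer, so it beats perturbations of itself.
for _ in range(5):
    assert objective(Z) <= objective(Z + 1e-3 * rng.standard_normal(Z.shape))
```

Since singular value shrinkage is the proximal operator of the nuclear norm, the whole fit reduces to one SVD, which is why it runs in a fraction of a second in the experiments above.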
5.3 Effectiveness of the Copositive Program
In this section, we compare the objective values obtained by SGD, Algorithm 1, and the copositive program defined in (17). We use an artificially-generated spiral dataset, with and 3 classes (see Fig. 9(a) for an illustration). In this case, since , we note that the copositive relaxation (19) is tight. Across different values of , we compare the solutions found by these three methods. As shown in Fig. 9, the copositive relaxation, the solution found by SGD, and the solution found by Algorithm 1 all coincide, obtaining the same loss across various values of . This empirically verifies the equivalence of (5), (12), and (19).
6 Conclusion
We studied the vector-output ReLU neural network training problem, and designed the first algorithms for finding the global optimum of this problem which are polynomial-time in the number of samples for a fixed data rank. We found novel connections between this vector-output ReLU neural network problem and a variety of other problems, including semi-NMF, cone-constrained PCA, soft-thresholded SVD, and copositive programming. These connections yield insights into the neural network training problem and leave room for further exploration. Of particular interest is extending these results to deeper networks, which would further explain the performance of neural networks as they are often used in practice. One way to extend the results in this paper to deeper networks is to greedily train and stack two-layer networks to create one deeper network, which has been shown to mimic the performance of deep networks trained end-to-end. An alternative approach would be to analyze deeper networks directly, which would require a hierarchical representation of sign patterns at each hidden layer and would complicate our analysis significantly. Some preliminary results on convex program equivalents of deeper training problems are presented under whitened input data assumptions in (ergen2020convexdeep). Another interesting research direction is investigating efficient relaxations of our vector-output convex programs for larger-scale simulations. Convex relaxations for scalar-output ReLU networks with approximation guarantees were studied in (bartan2019convex; ergen2019convexshallow; ergen2019convexcutting; d2020global). Furthermore, the landscapes of vector-output neural networks and the dynamics of gradient-descent-type methods can be analyzed by leveraging our results. In (lacotte2020all), an analysis of the landscape for scalar-output networks based on the convex formulation was given, which establishes a direct mapping between the nonconvex and convex objective landscapes.
Finally, our copositive programming and semi-NMF representations of ReLU networks can be used to develop more interpretable neural models. An investigation of scalar-output convex neural models for neural image reconstruction was given in (sahiner2020convex). Future works can explore these interesting and novel directions.
References
Appendix A Appendix
A.1 Additional experimental details
All neural networks in the experiments were trained using the PyTorch deep learning library
(NEURIPS2019_9015), using a single NVIDIA GeForce GTX 1080 Ti GPU. Algorithm 1 was run on a CPU with 256 GB of RAM, as was the maximum-margin matrix factorization. Unless otherwise stated, the Frank-Wolfe method from Algorithm 1 used a step size of , and all methods were trained to minimize squared loss. Unless otherwise stated, the neural networks were trained until full training loss convergence with SGD with a momentum parameter of 0.95 and a batch size equal to the size of the training set (i.e. full-batch gradient descent), and the learning rate was reduced by a factor of 2 whenever the training loss reached a plateau. The initial learning rate was set as high as possible without causing the training to diverge. All neural networks were initialized with Kaiming uniform initialization (he2015delving).
A.1.1 Additional Experiment: Copositive relaxation when
The copositive relaxation for the neural network training problem described in (19) is not guaranteed to exactly correspond to the objective when . However, we find that in practice, this relaxation is tight even in such settings. To demonstrate such an instance, we consider the problem of generating images from noise.
In particular, we initialize elementwise from an i.i.d. standard Gaussian distribution. To analyze the spike-free setting, we whiten using ZCA whitening. Then, we attempt to fit images
from the MNIST handwritten digits dataset
(lecun1998gradient) and CIFAR10 dataset (krizhevsky2009learning), respectively. From each dataset, we select 100 random images with 10 samples from each class and flatten them into vectors, to form and . We allow the noise inputs to have the same shape as the output. Clearly, in these cases, with and respectively, the copositive relaxation (20) is not guaranteed to correspond exactly to the neural network training optimum. However, we find, across a variety of regularization parameters , that the solution found by SGD and this copositive relaxation correspond exactly, as demonstrated in Figure 12.
While for the lowest value of , the copositive relaxation does not exactly correspond with the value obtained by SGD, we note that we showed the objective value of the copositive relaxation to be a lower bound of the neural network training objective, meaning that the differences seen in this plot are likely due to a numerical optimization issue rather than a fundamental one. Nonconvex SGD was trained for 60,000 epochs with 1000 neurons with a learning rate of
, while the copositive relaxation was trained using Adam (kingma2014adam) with the Geotorch library for constrained and manifold optimization for deep learning in PyTorch, which allowed us to express the PSD constraint, with an additional hinge loss to penalize violations of the affine constraints. The copositive relaxation was trained for 60,000 epochs with a learning rate of for CIFAR10 and for MNIST, and , and as parameters for Adam.
A.1.2 Additional Experiment: Comparing ReLU Activation to Linear Activation in the Case of Spike-Free Matrices
As discussed in Section 3.3, if the data matrix is spike-free, the resulting convex ReLU model (15) is similar to a linear-activation network, with the only difference being an additional cone constraint on the weight matrix . One may wonder whether, in the case of spike-free data matrices, the use of a ReLU network is necessary at all, or whether a linear-activation network would perform equally well.
In this experiment, we compare the performance of a ReLU-activation network to a linear-activation one, and demonstrate that even in the spike-free case there exist instances in which the ReLU-activation network is preferable. In particular, we take as our training data 3000 demeaned and ZCA-whitened images from the CIFAR10 dataset to form our spike-free training data . We then generate continuous labels from a randomly-initialized two-layer ReLU network with 4000 hidden units. We use this same label-generating neural network to generate labels for images from the full 10,000-sample test set of CIFAR10 as well, after the test images are preprocessed with the same whitening transformation used on the training data.
Across different values of , we measured the training and generalization performance of both ReLU-activation and linear-activation two-layer neural networks trained with SGD on this dataset. Both networks used 4000 hidden units, and were trained for 400 epochs with a learning rate of and momentum of . Our results are displayed in Figure 15.
As we can see, for all values of , while the linear-activation network has equal or lesser training loss than the ReLU-activation network, the ReLU-activation network generalizes significantly better, achieving orders of magnitude better test loss. We note that for values of and above, both networks learn the zero network (i.e. all weights at the optimum are zero), so their training and test losses are identical. We can also observe that the best-case test loss for the linear-activation network is achieved by simply learning the zero network, whereas for a value of the ReLU-activation network can learn to generalize better than the zero network (achieving a test loss of 63038, compared to a test loss of 125383 for the zero network).
These results demonstrate that even for spike-free data matrices, there are reasons to prefer a ReLU-activation network to a linear-activation network. In particular, because of the cone constraint on the dual weights , the ReLU network is induced to learn a more complex representation than the linear network, which would explain its better generalization performance.
The CIFAR10 dataset consists of 50,000 training images and 10,000 test images with 3 RGB channels, across 10 classes (krizhevsky2009learning)
. These images were normalized by the per-channel training set mean and standard deviation. To form our training set, we selected 3,000 training images from this dataset at random, with each class equally represented. This data was then feature-wise demeaned and transformed using ZCA. The same training class mean and ZCA transformation were then also used on the 10,000 testing points for evaluation.
A.1.3 Does SGD always find the global optimum for neural networks?
For these experiments, SGD was trained with an initial learning rate of for 20,000 epochs. We used a regularization penalty value of . The value of for Algorithm 1 was found by first starting at the value of the regularization penalty from the SGD solution, then refining this value via manual tuning. A final value of was chosen. For this experiment, there were sign patterns. Algorithm 1 was run for 30,000 iterations, and took X seconds to solve.
A.1.4 Maximum-Margin Matrix Factorization
The CIFAR10 and CIFAR100 datasets consist of 50,000 training images and 10,000 test images with 3 RGB channels, with 10 and 100 classes respectively (krizhevsky2009learning). These images were normalized by the per-channel training set mean and standard deviation. To form our training set, we selected 3,000 training images from these datasets at random, with each class equally represented. This data was then feature-wise demeaned and transformed using ZCA. The same training class mean and ZCA transformation were then also used on the 10,000 testing points for evaluation. For CIFAR10, we used a regularization parameter value of , whereas for CIFAR100, we used a value of .
SGD was trained for 400 epochs with a learning rate of , using 1000 neurons, one-hot encoded labels, and squared loss. Figure 18 displays the test accuracy of the learned networks. Surprisingly, the classifiers trained on only 3,000 whitened images generalize quite well in both settings, far exceeding the performance of the null classifier. For the CIFAR-10 experiments, the algorithm from Theorem 2 took only 0.018 seconds to solve, whereas for CIFAR-100 it took 0.36 seconds.
A.1.5 Effectiveness of the Copositive Program
For this classification problem, we use one-hot encoded labels and squared loss. For , SGD used a learning rate of , and otherwise a learning rate of . SGD was trained for 8,000 epochs with 1000 neurons, while Algorithm 1 ran for 1,000 iterations. The copositive relaxation was optimized with CVXPY using a first-order solver, with a maximum of 20,000 iterations, on a CPU with 256 GB of RAM (diamond2016cvxpy). This dataset had sign patterns. The value of for Algorithm 1 was chosen as the regularization penalty from the SGD solution.
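For intuition on why a relaxation is needed here: a matrix is completely positive if it factors as $B B^\top$ with $B$ entrywise nonnegative, and optimizing over this cone is intractable in general. A standard tractable outer approximation keeps two necessary conditions, positive semidefiniteness and entrywise nonnegativity (the doubly nonnegative cone). The numpy sketch below is our own illustration of this containment, not the paper's exact program:

```python
import numpy as np

rng = np.random.default_rng(2)
# a completely positive matrix: Z = B B^T with B entrywise nonnegative
B = rng.random((5, 3))
Z = B @ B.T

# the doubly nonnegative relaxation keeps two necessary conditions:
# (i) positive semidefiniteness, (ii) entrywise nonnegativity
eigvals = np.linalg.eigvalsh(Z)
is_psd = bool(eigvals.min() >= -1e-10)
is_nonneg = bool(Z.min() >= 0)
```

Every completely positive matrix passes both checks; the converse can fail for matrices of size 5 or larger (the two cones coincide up to size 4), which is why the relaxation can in principle be loose even though it is observed to be tight in practice.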
A.2 Note on Data Matrices of Fixed Rank
Consider the neural network training problem
(21) $\min_{\{u_j, v_j\}_{j=1}^m} \mathcal{L}\Big(\sum_{j=1}^m (X u_j)_+ v_j^\top,\, Y\Big) + \frac{\beta}{2} \sum_{j=1}^m \big(\|u_j\|_2^2 + \|v_j\|_2^2\big)$
Let $X = U \Sigma V^\top$ be the compact SVD of $X$ with rank $r$, where $U \in \mathbb{R}^{n \times r}$, $\Sigma \in \mathbb{R}^{r \times r}$, and $V \in \mathbb{R}^{d \times r}$. Let $w_j = V^\top u_j$ and $z_j = V_\perp^\top u_j$, where $V_\perp$ spans the orthogonal complement of the column space of $V$. We note that $X u_j = U \Sigma w_j$ and $\|u_j\|_2^2 = \|w_j\|_2^2 + \|z_j\|_2^2$. Then, we can reparameterize the problem as
(22) $\min_{\{w_j, z_j, v_j\}} \mathcal{L}\Big(\sum_{j=1}^m (U \Sigma w_j)_+ v_j^\top,\, Y\Big) + \frac{\beta}{2} \sum_{j=1}^m \big(\|w_j\|_2^2 + \|z_j\|_2^2 + \|v_j\|_2^2\big)$
We note that $z_j$ appears only in the regularization term. Minimizing over $z_j$ thus means simply setting it to 0. Then, we have
(23) $\min_{\{w_j, v_j\}} \mathcal{L}\Big(\sum_{j=1}^m (U \Sigma w_j)_+ v_j^\top,\, Y\Big) + \frac{\beta}{2} \sum_{j=1}^m \big(\|w_j\|_2^2 + \|v_j\|_2^2\big)$
We note that $w_j \in \mathbb{R}^r$ and $U\Sigma \in \mathbb{R}^{n \times r}$. Thus, for $X$ of rank $r$, we can effectively reduce the dimension of the neural network training problem without loss of generality. This thus holds for all results concerning the complexity of the neural network training problem with data matrices of a fixed rank.
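This dimension reduction is easy to verify numerically: a first-layer weight acts on rank-$r$ data only through its $r$-dimensional projection onto the row space of the data matrix. In the numpy sketch below (variable names and dimensions are our own), a ReLU feature computed from the full $d$-dimensional problem matches the same feature computed in the reduced $r$-dimensional parameterization:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, r = 50, 20, 3
X = rng.standard_normal((n, r)) @ rng.standard_normal((r, d))  # rank-r data matrix

# compact SVD: only r singular values are (numerically) nonzero
U, S, Vt = np.linalg.svd(X, full_matrices=False)
U, S, Vt = U[:, :r], S[:r], Vt[:r]

# a first-layer weight w enters only through Vt @ w, an r-dimensional vector
w = rng.standard_normal(d)
relu_full = np.maximum(X @ w, 0)                  # original d-dimensional problem
relu_reduced = np.maximum((U * S) @ (Vt @ w), 0)  # reduced r-dimensional problem
```

The two feature vectors agree to numerical precision, so the training problem over $d$-dimensional weights collapses to one over $r$-dimensional weights.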
A.3 Proofs
A.3.1 Proof of Theorem 1
We begin with the primal problem (5), repeated here for convenience:
(24) $\min_{\{u_j, v_j\}_{j=1}^m} \mathcal{L}\Big(\sum_{j=1}^m (X u_j)_+ v_j^\top,\, Y\Big) + \frac{\beta}{2} \sum_{j=1}^m \big(\|u_j\|_2^2 + \|v_j\|_2^2\big)$
We start by rescaling the weights to obtain a slightly different but equivalent objective, a technique used previously in (pilanci2020neural; savarese2019infinite).
Lemma 4
The primal problem is equivalent to the following optimization problem:
(25) $\min_{\{u_j, v_j\}:\ \|u_j\|_2 \le 1} \mathcal{L}\Big(\sum_{j=1}^m (X u_j)_+ v_j^\top,\, Y\Big) + \beta \sum_{j=1}^m \|v_j\|_2$
Proof: Note that for any $\alpha_j > 0$, we can rescale the parameters as $u_j \to \alpha_j u_j$, $v_j \to v_j / \alpha_j$. Noting that the network output is unchanged by this rescaling, we have the equivalent problem
(26) $\min_{\{u_j, v_j\},\ \alpha_j > 0} \mathcal{L}\Big(\sum_{j=1}^m (X u_j)_+ v_j^\top,\, Y\Big) + \frac{\beta}{2} \sum_{j=1}^m \big(\alpha_j^2 \|u_j\|_2^2 + \alpha_j^{-2} \|v_j\|_2^2\big)$
Minimizing with respect to each $\alpha_j$ (by the AM-GM inequality, the minimum of $\alpha_j^2 \|u_j\|_2^2 + \alpha_j^{-2} \|v_j\|_2^2$ is $2\|u_j\|_2 \|v_j\|_2$), we thus end up with
(27) $\min_{\{u_j, v_j\}} \mathcal{L}\Big(\sum_{j=1}^m (X u_j)_+ v_j^\top,\, Y\Big) + \beta \sum_{j=1}^m \|u_j\|_2 \|v_j\|_2$
We can thus set $\|u_j\|_2 = 1$ without loss of generality. Further, relaxing this constraint to $\|u_j\|_2 \le 1$ does not change the optimal solution. In particular, for the problem
(28) $\min_{\{u_j, v_j\}:\ \|u_j\|_2 \le 1} \mathcal{L}\Big(\sum_{j=1}^m (X u_j)_+ v_j^\top,\, Y\Big) + \beta \sum_{j=1}^m \|v_j\|_2$
the constraint will be active for all nonzero $v_j$: if $\|u_j\|_2 < 1$ with $v_j \neq 0$, the rescaling $u_j \to u_j / \|u_j\|_2$, $v_j \to \|u_j\|_2 v_j$ preserves the output and strictly decreases the penalty. Thus, relaxing the constraint will not change the objective. This proves the Lemma.
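The key step in this proof, minimizing over the rescaling, is a direct AM-GM argument: for fixed weight norms, the minimum over the positive scale of the averaged penalty equals the product of the two norms. The numpy sketch below (with arbitrary vectors of our own choosing) checks this identity on a grid:

```python
import numpy as np

rng = np.random.default_rng(1)
u = rng.standard_normal(5)
v = rng.standard_normal(7)

# scan the rescaling alpha: penalty (alpha^2 ||u||^2 + alpha^-2 ||v||^2) / 2
alphas = np.linspace(0.1, 10.0, 100_000)
penalty = 0.5 * (alphas**2 * (u @ u) + (v @ v) / alphas**2)

# AM-GM: the minimum is ||u||_2 * ||v||_2, attained at alpha^2 = ||v||_2 / ||u||_2
best = np.linalg.norm(u) * np.linalg.norm(v)
```

The grid minimum of `penalty` matches `best` to within the grid resolution, which is why the squared-norm weight decay collapses to the product-of-norms penalty in (27).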
Now, we are ready to prove the first part of Theorem 1, i.e. the equivalence to the semiinfinite program (11).
Lemma 5
For all , the primal neural network training problem (25) has a strong dual, in the form of
(29) 
Proof: We form the Lagrangian of the primal problem by first reparameterizing the problem as
(30) 
and then forming the Lagrangian as
(31) 
By Sion’s minimax theorem, we can switch the inner maximum and minimum, and minimize over and . This produces the following problem:
(32) 
We then simply need to interchange the max and min to obtain the desired form. Note that this interchange does not change the objective value, due to semi-infinite strong duality. In particular, for any , this problem is strictly feasible (simply let ) and the objective value is bounded by . Then, by Theorem 2.2 of (shapiro2009semi), we know that strong duality holds, and
(33) 
as desired.
Furthermore, by (shapiro2009semi), for a signed measure , we obtain the following strong dual of the dual program (11):
(34) 
where defines the unit ball. By discretization arguments in Section 3 of (shapiro2009semi), and by Helly’s theorem, there exists some such that this is equivalent to
(35) 
Minimizing with respect to , we obtain
(36) 
which we can minimize with respect to to obtain the finite parameterization
(37) 
This proves that the semi-infinite dual admits a solution of finite support, with at most nonzero neurons. Thus, if the number of neurons of the primal problem is at least this number, strong duality holds. Now, we seek to show the second part of Theorem 1, namely the equivalence to (12). Starting from (11), we have that the dual constraint is given by
(38) 
Using the concept of the dual norm, we can introduce a variable to further re-express this constraint as
(39) 
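The dual-norm step above rests on the variational identity $\|z\| = \max_{\|v\|_* \le 1} v^\top z$; for the Euclidean norm, which is self-dual, the maximizer is simply $z / \|z\|_2$. A small numpy check of this identity (an illustration with vectors of our own choosing):

```python
import numpy as np

rng = np.random.default_rng(3)
z = rng.standard_normal(6)

# self-dual Euclidean norm: max_{||v||_2 <= 1} v^T z = ||z||_2,
# attained at v = z / ||z||_2
v_star = z / np.linalg.norm(z)
best = v_star @ z

# random unit vectors never exceed the dual-norm value
vs = rng.standard_normal((1000, 6))
vs /= np.linalg.norm(vs, axis=1, keepdims=True)
others = vs @ z
```

Here `best` equals the Euclidean norm of `z`, and no unit vector attains a larger inner product, which is exactly what licenses replacing the norm in the constraint with a maximization over the unit ball.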
Then, enumerating over sign patterns , we have
(40) 
Now, we express this in terms of an inner product.
(41) 
Letting :
(42) 
Now, we can take the convex hull of the constraint set, noting that since the objective is affine, this does not change the objective value.