Global Optimality Beyond Two Layers: Training Deep ReLU Networks via Convex Programs

10/11/2021
by   Tolga Ergen, et al.

Understanding the fundamental mechanism behind the success of deep neural networks is one of the key challenges in the modern machine learning literature. Despite numerous attempts, a solid theoretical analysis has yet to be developed. In this paper, we develop a novel unified framework to reveal a hidden regularization mechanism through the lens of convex optimization. We first show that training multiple three-layer ReLU sub-networks with weight decay regularization can be equivalently cast as a convex optimization problem in a higher-dimensional space, where sparsity is enforced via group ℓ_1-norm regularization. Consequently, ReLU networks can be interpreted as high-dimensional feature selection methods. More importantly, we then prove that the equivalent convex problem can be globally optimized by a standard convex optimization solver in time polynomial in the number of samples and the data dimension when the network width is fixed. Finally, we numerically validate our theoretical results via experiments on both synthetic and real datasets.
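The paper's convex program is stated for three-layer sub-networks, but its structure (a convex loss plus a group ℓ_1-norm penalty over variables indexed by enumerated ReLU activation patterns, with cone constraints keeping each pattern consistent) can be illustrated with the simpler two-layer analogue from the same line of work. The sketch below is a minimal cvxpy rendering of that idea, assuming a squared-loss regression task and random subsampling of activation patterns; the names (masks, V, W, beta) and the subsampling are illustrative choices, not the paper's exact construction.

```python
# Minimal sketch of a convex reformulation of two-layer ReLU training,
# assuming: squared loss, weight decay beta, and a random subsample of
# hyperplane-arrangement (ReLU activation) patterns. Requires cvxpy.
import numpy as np
import cvxpy as cp

np.random.seed(0)
n, d, P = 20, 3, 30                  # samples, features, sampled patterns
X = np.random.randn(n, d)
y = np.random.randn(n)
beta = 0.1                           # weight-decay strength

# Candidate activation patterns D_i = diag(1[X u_i >= 0]), generated from
# random directions u_i; the exact formulation enumerates all patterns.
U = np.random.randn(d, P)
masks = (X @ U >= 0).astype(float)   # n x P, column i encodes diag(D_i)

V = cp.Variable((d, P))              # positive-branch neuron weights
W = cp.Variable((d, P))              # negative-branch neuron weights

# Network output under fixed patterns: sum_i D_i X (v_i - w_i).
pred = cp.sum(cp.multiply(masks, X @ (V - W)), axis=1)

# Group l1 penalty: sum of per-neuron l2 norms, inducing neuron sparsity.
group_l1 = cp.sum(cp.norm(V, 2, axis=0)) + cp.sum(cp.norm(W, 2, axis=0))

# Cone constraints keep each branch consistent with its activation pattern.
constraints = [
    cp.multiply(2 * masks - 1, X @ V) >= 0,
    cp.multiply(2 * masks - 1, X @ W) >= 0,
]

prob = cp.Problem(
    cp.Minimize(0.5 * cp.sum_squares(pred - y) + beta * group_l1),
    constraints,
)
prob.solve()
print("optimal objective:", prob.value)
```

Because the penalty sums the ℓ_2 norms of per-neuron weight groups, the solver drives most columns of V and W exactly to zero, which is the feature-selection interpretation the abstract describes.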

Related research

10/18/2021
Path Regularization: A Convexity and Sparsity Inducing Regularization for Parallel ReLU Networks
Despite several attempts, the fundamental mechanisms behind the success ...

03/02/2021
Demystifying Batch Normalization in ReLU Networks: Equivalent Convex Optimization Models and Implicit Regularization
Batch Normalization (BN) is a commonly used technique to accelerate and ...

03/29/2020
High-dimensional Neural Feature using Rectified Linear Unit and Random Matrix Instance
We design a ReLU-based multilayer neural network to generate a rich high...

11/20/2022
Convexifying Transformers: Improving optimization and understanding of transformer networks
Understanding the fundamental mechanism behind the success of transforme...

09/30/2022
Overparameterized ReLU Neural Networks Learn the Simplest Models: Neural Isometry and Exact Recovery
The practice of deep learning has shown that neural networks generalize ...

02/02/2022
Fast Convex Optimization for Two-Layer ReLU Networks: Equivalent Model Classes and Cone Decompositions
We develop fast algorithms and robust software for convex optimization o...

12/09/2020
Convex Regularization Behind Neural Reconstruction
Neural networks have shown tremendous potential for reconstructing high-...
