Parallel Deep Neural Networks Have Zero Duality Gap

10/13/2021
by   Yifei Wang, et al.

Training deep neural networks is a well-known, highly non-convex optimization problem. Recent work has shown that there is no duality gap for regularized two-layer neural networks with ReLU activation, which enables global optimization via convex programs. For multi-layer linear networks with vector outputs, we formulate convex dual problems and demonstrate that the duality gap is non-zero for networks of depth three and deeper. However, by modifying these deep networks into more powerful parallel architectures, we show that the duality gap is exactly zero. Strong convex duality therefore holds, so there exist equivalent convex programs that enable training these deep networks to global optimality. We also show, via closed-form expressions, that weight decay regularization on the parameters explicitly encourages low-rank solutions. For three-layer non-parallel ReLU networks, we show that strong duality holds for rank-one data matrices; however, the duality gap is non-zero for whitened data matrices. As before, transforming the network architecture into a corresponding parallel version makes the duality gap vanish.
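
To see why weight decay encourages low rank, the depth-two linear case is instructive. The following is a minimal sketch of the standard variational identity relating squared Frobenius penalties on the factors to the nuclear norm of their product; it illustrates the mechanism behind the abstract's claim but is not the paper's general closed-form expression, and the loss L, data X, Y, and regularization weight beta are illustrative notation:

```latex
% Squared Frobenius penalties on the factors of Z = W_2 W_1
% induce the nuclear norm (sum of singular values) of Z:
\[
  \|Z\|_* \;=\; \min_{W_2 W_1 = Z} \tfrac{1}{2}\left( \|W_1\|_F^2 + \|W_2\|_F^2 \right).
\]
% Hence the weight-decay-regularized training problem
\[
  \min_{W_1, W_2} \; \mathcal{L}(W_2 W_1 X, Y) + \frac{\beta}{2}\left( \|W_1\|_F^2 + \|W_2\|_F^2 \right)
\]
% is equivalent to a convex, nuclear-norm-regularized problem in Z = W_2 W_1,
% whose regularizer is well known to promote low-rank solutions:
\[
  \min_{Z} \; \mathcal{L}(Z X, Y) + \beta \|Z\|_* .
\]
```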
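To make the architectural distinction concrete, here is a minimal PyTorch sketch of a standard deep linear network versus the parallel (multi-branch) variant the paper analyzes, in which the output is a sum of independent deep linear branches. Class names, widths, and hyperparameters are illustrative assumptions, not taken from the paper:

```python
# Sketch (assumes PyTorch): standard deep linear network vs. the parallel
# architecture, f(x) = sum_k branch_k(x), with weight decay on all parameters.
import torch
import torch.nn as nn

class DeepLinear(nn.Module):
    """Standard depth-L linear network: f(x) = W_L ... W_1 x."""
    def __init__(self, d_in, d_hidden, d_out, depth):
        super().__init__()
        dims = [d_in] + [d_hidden] * (depth - 1) + [d_out]
        self.layers = nn.ModuleList(
            nn.Linear(dims[i], dims[i + 1], bias=False) for i in range(depth)
        )

    def forward(self, x):
        for layer in self.layers:
            x = layer(x)
        return x

class ParallelDeepLinear(nn.Module):
    """Parallel architecture: sums the outputs of independent deep linear branches."""
    def __init__(self, d_in, d_hidden, d_out, depth, num_branches):
        super().__init__()
        self.branches = nn.ModuleList(
            DeepLinear(d_in, d_hidden, d_out, depth) for _ in range(num_branches)
        )

    def forward(self, x):
        return sum(branch(x) for branch in self.branches)

# Weight decay on the parameters plays the role of the regularizer in the paper.
model = ParallelDeepLinear(d_in=10, d_hidden=4, d_out=3, depth=3, num_branches=8)
opt = torch.optim.SGD(model.parameters(), lr=1e-2, weight_decay=1e-3)
```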


