Implicit Convex Regularizers of CNN Architectures: Convex Optimization of Two- and Three-Layer Networks in Polynomial Time

06/26/2020
by   Tolga Ergen, et al.
We study the training of Convolutional Neural Networks (CNNs) with ReLU activations and introduce exact convex optimization formulations with polynomial complexity in the number of data samples, the number of neurons, and the data dimension. More specifically, we develop a convex analytic framework utilizing semi-infinite duality to obtain equivalent convex optimization problems for several two- and three-layer CNN architectures. We first prove that two-layer CNNs can be globally optimized via an ℓ_2 norm regularized convex program. We then show that three-layer CNN training problems are equivalent to an ℓ_1 regularized convex program that encourages sparsity in the spectral domain. We also extend these results to multi-layer CNN architectures, including three-layer networks with two ReLU layers and deeper circular convolutions with a single ReLU layer. Furthermore, we present extensions of our approach to different pooling methods, which elucidates the implicit architectural bias as convex regularizers.
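The core idea behind this line of work (convex reformulation of ReLU network training via enumerated activation patterns) can be illustrated with a toy sketch. The snippet below uses the simpler unconstrained "gated ReLU" variant rather than the exact constrained program from the paper: it freezes a set of activation patterns D_j induced by random gates, then minimizes a group-ℓ_2-regularized convex least-squares objective in the per-pattern weights u_j via plain gradient descent. All names, sizes, and hyperparameters are illustrative assumptions, not from the paper.

```python
import numpy as np

# Sketch of the convex-reformulation idea: with activation patterns fixed,
# two-layer ReLU training becomes a group-regularized convex program.
rng = np.random.default_rng(0)
n, d, m = 40, 5, 16                     # samples, input dim, number of patterns
X = rng.standard_normal((n, d))
y = rng.standard_normal(n)

# Sample random gates and freeze the induced ReLU activation patterns D_j.
G = rng.standard_normal((d, m))
D = (X @ G >= 0).astype(float)          # n x m; column j holds diag(D_j)

lam, eps, lr = 1e-2, 1e-8, 1e-3         # regularization, smoothing, step size
U = np.zeros((d, m))                    # convex variables u_j (one per pattern)

def predict(U):
    # Model output sum_j D_j X u_j, computed for all patterns at once.
    return np.einsum('nm,nd,dm->n', D, X, U)

for _ in range(3000):
    r = predict(U) - y                  # residual
    # Gradient of 0.5 * ||r||^2 with respect to U.
    grad = np.einsum('n,nm,nd->dm', r, D, X)
    # Gradient of the smoothed group penalty lam * sum_j ||u_j||_2,
    # which encourages entire patterns (columns of U) to switch off.
    norms = np.sqrt((U ** 2).sum(axis=0) + eps)
    grad += lam * U / norms
    U -= lr * grad

loss = 0.5 * np.sum((predict(U) - y) ** 2)
```

Because the objective is convex in U, gradient descent converges to a global optimum of this surrogate; the paper's exact formulations add cone constraints tying each D_j to a genuine ReLU sign pattern.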


Related research

- 02/24/2020: Neural Networks are Convex Regularizers: Exact Polynomial-time Convex Optimization Formulations for Two-Layer Networks
- 10/18/2021: Path Regularization: A Convexity and Sparsity Inducing Regularization for Parallel ReLU Networks
- 03/02/2021: Demystifying Batch Normalization in ReLU Networks: Equivalent Convex Optimization Models and Implicit Regularization
- 07/18/2023: Convex Geometry of ReLU-layers, Injectivity on the Ball and Local Reconstruction
- 01/06/2022: Efficient Global Optimization of Two-layer ReLU Networks: Quadratic-time Algorithms and Adversarial Training
- 11/07/2016: Trusting SVM for Piecewise Linear CNNs
- 12/27/2021: Distributionally Robust Bootstrap Optimization
