The Curious Case of Convex Networks

06/09/2020
by Sarath Sivaprasad, et al.

In this paper, we investigate a constrained formulation of neural networks in which the output is a convex function of the input. We show that the convexity constraints can be enforced on both fully connected and convolutional layers, making them applicable to most architectures. The constraints consist of restricting the weights of all but the first layer to be non-negative and using a non-decreasing convex activation function. Albeit simple, these constraints have profound implications for the generalization ability of the network. We draw three valuable insights: (a) Input Output Convex Networks (IOC-NNs) self-regularize and all but eliminate the problem of overfitting; (b) although heavily constrained, they come close to the performance of their base architectures; and (c) an ensemble of convex networks can match or outperform its non-convex counterparts. We demonstrate the efficacy of the proposed idea through thorough experiments and ablation studies on the MNIST, CIFAR-10, and CIFAR-100 datasets with three different neural network architectures. The code for this project is publicly available at: <https://github.com/sarathsp1729/Convex-Networks>.
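The two constraints are enough to guarantee convexity: the first layer is an arbitrary affine map, so applying a convex, non-decreasing activation yields convex pre-activations, and every subsequent layer takes a non-negative combination of convex functions, which stays convex under a further convex, non-decreasing activation. A minimal PyTorch sketch of this construction is given below; the hypothetical `IOCMLP` class, its layer sizes, the choice of ReLU, and the clamp-after-each-step projection are illustrative assumptions rather than the authors' exact implementation.

```python
import torch
import torch.nn as nn

class IOCMLP(nn.Module):
    """A fully connected network whose output is convex in its input."""

    def __init__(self, in_dim: int, hidden_dim: int, out_dim: int):
        super().__init__()
        self.first = nn.Linear(in_dim, hidden_dim)       # weights unconstrained
        self.hidden = nn.Linear(hidden_dim, hidden_dim)  # weights kept >= 0
        self.out = nn.Linear(hidden_dim, out_dim)        # weights kept >= 0
        self.act = nn.ReLU()  # convex and non-decreasing

    def clamp_weights(self) -> None:
        # Project the constrained layers onto the non-negative orthant.
        # Calling this after every optimizer step is one simple way to
        # maintain the constraint (an assumption, not necessarily the
        # paper's enforcement scheme).
        with torch.no_grad():
            self.hidden.weight.clamp_(min=0.0)
            self.out.weight.clamp_(min=0.0)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = self.act(self.first(x))   # affine in x, then ReLU: convex
        h = self.act(self.hidden(h))  # non-negative mix of convex functions
        return self.out(h)            # non-negative output layer: convex
```

Under these assumptions, a training step would read `loss.backward(); optimizer.step(); model.clamp_weights()`, so the weights remain feasible throughout training; biases are left unconstrained, since adding a constant does not affect convexity.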

