Principled Deep Neural Network Training through Linear Programming

10/07/2018
by   Daniel Bienstock, et al.

Deep Learning has received significant attention due to its impressive performance in many state-of-the-art learning tasks. Unfortunately, while very powerful, Deep Learning is not well understood theoretically; in particular, results on the complexity of training deep neural networks have been obtained only recently. In this work we show that large classes of deep neural networks with various architectures (e.g., DNNs, CNNs, binary neural networks, and ResNets), activation functions (e.g., ReLUs and leaky ReLUs), and loss functions (e.g., hinge loss and Euclidean loss) can be trained to near optimality, with a desired target accuracy, using linear programming, in time that is exponential in the size of the architecture and polynomial in the size of the data set; this is the best one can hope for given the NP-hardness of the problem, and it is in line with previous work. In particular, we obtain polynomial-time training algorithms for any fixed network architecture. Our work applies more broadly to empirical risk minimization problems, which allows us to generalize various previous results and obtain new complexity results for previously unstudied architectures in the proper learning setting.
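To make the linear-programming angle concrete, the base case is easy to state: empirical risk minimization with hinge loss for a plain linear model is itself a linear program, with one slack variable per data point, so the LP's size grows only polynomially in the dataset. The sketch below is a toy illustration of that base case using SciPy's linprog; the function name hinge_erm_lp and the data are ours, and this is not the paper's lifted LP construction for deep networks.

```python
# A minimal sketch, assuming SciPy is available: ERM with hinge loss for a
# linear classifier, written directly as a linear program. This is only the
# base case of the idea in the abstract, not the paper's construction.

import numpy as np
from scipy.optimize import linprog

def hinge_erm_lp(X, y):
    """Solve min_{w,b} (1/N) * sum_i max(0, 1 - y_i (w.x_i + b)) as an LP.

    Variables are (w, b, xi) with slacks xi_i >= 1 - y_i (w.x_i + b) and
    xi_i >= 0; the LP has d + 1 + N variables and N inequality rows, so
    its size is polynomial in the number of samples N.
    """
    N, d = X.shape
    # Objective: average of the slack variables xi (w and b cost nothing).
    c = np.concatenate([np.zeros(d + 1), np.full(N, 1.0 / N)])
    # Constraints: -y_i (w.x_i + b) - xi_i <= -1 for each sample i.
    A_ub = np.hstack([-y[:, None] * X, -y[:, None], -np.eye(N)])
    b_ub = -np.ones(N)
    bounds = [(None, None)] * (d + 1) + [(0, None)] * N
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
    w, b = res.x[:d], res.x[d]
    return w, b, res.fun  # weights, bias, optimal empirical hinge risk

# Tiny linearly separable example with labels in {-1, +1}.
X = np.array([[1.0, 1.0], [2.0, 0.5], [-1.0, -1.0], [-2.0, -0.5]])
y = np.array([1.0, 1.0, -1.0, -1.0])
w, b, risk = hinge_erm_lp(X, y)
print(f"w = {w}, b = {b:.3f}, empirical hinge risk = {risk:.4f}")
```

The paper's result can be read as extending this kind of tractability to deep architectures: the corresponding LP lives in a lifted space whose size is exponential in the (fixed) architecture but still polynomial in the data.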

Related research

11/15/2021 · Neural networks with linear threshold activations: structure and algorithms
04/10/2017 · On the Fine-Grained Complexity of Empirical Risk Minimization: Kernel Methods and Neural Networks
06/24/2020 · Retrospective Loss: Looking Back to Improve Training of Deep Neural Networks
01/22/2019 · On Connected Sublevel Sets in Deep Learning
05/18/2021 · The Computational Complexity of ReLU Network Training Parameterized by Data Dimensionality
06/09/2015 · Learning to Linearize Under Uncertainty
04/14/2023 · The R-mAtrIx Net
