slimTrain – A Stochastic Approximation Method for Training Separable Deep Neural Networks

09/28/2021
by   Elizabeth Newman, et al.

Deep neural networks (DNNs) have shown their success as high-dimensional function approximators in many applications; however, training DNNs can be challenging in general. DNN training is commonly phrased as a stochastic optimization problem whose challenges include non-convexity, non-smoothness, insufficient regularization, and complicated data distributions. Hence, the performance of DNNs on a given task depends crucially on tuning hyperparameters, especially learning rates and regularization parameters. In the absence of theoretical guidelines or prior experience on similar tasks, this requires solving many training problems, which can be time-consuming and demanding on computational resources. This can limit the applicability of DNNs to problems with non-standard, complex, and scarce datasets, e.g., those arising in many scientific applications. To remedy the challenges of DNN training, we propose slimTrain, a stochastic optimization method for training DNNs with reduced sensitivity to the choice of hyperparameters and fast initial convergence. The central idea of slimTrain is to exploit the separability inherent in many DNN architectures; that is, we separate the DNN into a nonlinear feature extractor followed by a linear model. This separability allows us to leverage recent advances made for solving large-scale, linear, ill-posed inverse problems. Crucially, for the linear weights, slimTrain does not require a learning rate and automatically adapts the regularization parameter. Since our method operates on mini-batches, its computational overhead per iteration is modest. In our numerical experiments, slimTrain outperforms existing DNN training methods with the recommended hyperparameter settings and reduces the sensitivity of DNN training to the remaining hyperparameters.
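To illustrate the separability idea, the following is a minimal sketch (not the authors' implementation) of training a separable network f(x) = W phi(x; theta): on each mini-batch, the linear weights W are obtained by solving a Tikhonov-regularized least-squares problem in closed form, with no learning rate, while the nonlinear feature extractor phi takes a standard stochastic step. slimTrain itself uses a sampled limited-memory Tikhonov scheme with an automatically adapted regularization parameter; here the parameter lam, the synthetic regression task, and all dimensions are fixed, hypothetical choices.

    # Minimal sketch of a separable training loop, assuming PyTorch.
    # Not slimTrain itself: lam is fixed here, not adapted.
    import torch
    import torch.nn as nn

    torch.manual_seed(0)

    d, r, m = 10, 32, 1          # input dim, feature dim, output dim (assumed)
    # Nonlinear feature extractor phi(x; theta): R^d -> R^r
    phi = nn.Sequential(nn.Linear(d, 64), nn.Tanh(), nn.Linear(64, r), nn.Tanh())
    W = torch.zeros(r, m)        # linear weights, solved in closed form (no learning rate)
    opt = torch.optim.Adam(phi.parameters(), lr=1e-3)
    lam = 1e-3                   # Tikhonov regularization parameter (fixed; adaptive in slimTrain)

    def target(x):               # synthetic regression task for illustration
        return torch.sin(x.sum(dim=1, keepdim=True))

    for it in range(200):
        x = torch.randn(64, d)   # mini-batch
        y = target(x)

        # 1) Linear weights: solve min_W ||Z W - y||^2 + lam ||W||^2 on the batch
        with torch.no_grad():
            Z = phi(x)                            # batch features, 64 x r
            A = Z.T @ Z + lam * torch.eye(r)      # normal equations with Tikhonov term
            W = torch.linalg.solve(A, Z.T @ y)    # closed-form solve; no step size

        # 2) Feature extractor: standard stochastic gradient step
        opt.zero_grad()
        loss = ((phi(x) @ W - y) ** 2).mean()
        loss.backward()
        opt.step()

        if it % 50 == 0:
            print(f"iter {it:4d}  loss {loss.item():.4f}")

The alternating structure is the point: because the loss is linear in W once the features are fixed, the inner problem is a classical linear inverse problem, which is what lets slimTrain import regularization-parameter selection techniques from that literature.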


