A Convergent ADMM Framework for Efficient Neural Network Training

12/22/2021
by Junxiang Wang, et al.

As a well-known optimization framework, the Alternating Direction Method of Multipliers (ADMM) has achieved tremendous success in many classification and regression applications. Recently, it has attracted the attention of deep learning researchers and is considered a potential substitute for Gradient Descent (GD). As an emerging approach, however, several challenges remain unsolved: 1) the lack of global convergence guarantees, 2) slow convergence toward solutions, and 3) cubic time complexity with respect to feature dimensions. In this paper, we propose a novel optimization framework, dlADMM, which solves a general neural network training problem via ADMM and addresses these challenges simultaneously. Specifically, the parameters in each layer are updated backward and then forward, so that parameter information is exchanged efficiently across layers. When dlADMM is applied to specific architectures, the time complexity of the subproblems is reduced from cubic to quadratic through a dedicated algorithm design that uses quadratic approximations and backtracking. Last but not least, we provide the first proof that an ADMM-type method (dlADMM) converges sublinearly to a critical point under mild conditions. Experiments on seven benchmark datasets demonstrate the convergence, efficiency, and effectiveness of the proposed dlADMM algorithm.
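
To make the "quadratic approximation and backtracking" idea concrete, the following is a minimal sketch (not the authors' released implementation) of how a weight subproblem of the form min_W (rho/2)||a - W z||^2 can be updated without a cubic-cost matrix inversion: the objective is replaced by a gradient-plus-quadratic surrogate whose minimizer is a simple step, and the curvature parameter theta is increased by backtracking until the surrogate majorizes the objective at the trial point. The names a, z, rho, theta, and backtrack_update are illustrative assumptions, not the paper's notation.

# Sketch of a quadratic-approximation + backtracking weight update,
# in the spirit of dlADMM-style subproblem solvers (illustrative only).
import numpy as np

def phi(W, a, z, rho):
    """ADMM weight subproblem objective: (rho/2) * ||a - W @ z||_F^2."""
    r = a - W @ z
    return 0.5 * rho * np.sum(r * r)

def grad_phi(W, a, z, rho):
    """Gradient of phi with respect to W (only matrix products, no d x d inverse)."""
    return -rho * (a - W @ z) @ z.T

def backtrack_update(W, a, z, rho, theta=1.0, eta=2.0, max_iter=50):
    """Minimize the surrogate Q(V; W) = phi(W) + <grad phi(W), V - W>
    + (theta/2) * ||V - W||_F^2, whose minimizer is V = W - grad/theta,
    and increase theta until phi(V) <= Q(V; W), i.e. the surrogate
    majorizes phi at the trial point. Each trial costs only matrix
    products, so the update is quadratic, not cubic, in the layer size."""
    f0 = phi(W, a, z, rho)
    g = grad_phi(W, a, z, rho)
    for _ in range(max_iter):
        V = W - g / theta
        diff = V - W
        Q = f0 + np.sum(g * diff) + 0.5 * theta * np.sum(diff * diff)
        if phi(V, a, z, rho) <= Q:
            return V, theta
        theta *= eta  # surrogate too loose: increase curvature and retry
    return V, theta

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d_out, d_in, n = 64, 128, 256
    z = rng.standard_normal((d_in, n))    # previous-layer activations
    a = rng.standard_normal((d_out, n))   # ADMM auxiliary (pre-activation) variable
    W = rng.standard_normal((d_out, d_in))
    for _ in range(20):                   # a few inner passes of the W update
        W, _ = backtrack_update(W, a, z, rho=1.0)
    print("subproblem objective:", phi(W, a, z, rho=1.0))

Each call to backtrack_update decreases the subproblem objective, which is the monotonicity property such majorize-minimize steps rely on; the full dlADMM scheme applies updates of this kind to every layer, backward and then forward, within each ADMM iteration.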

Related research

05/31/2019  ADMM for Efficient Deep Learning with Global Convergence
Alternating Direction Method of Multipliers (ADMM) has been used success...

02/06/2019  A Convergence Analysis of Nonlinearly Constrained ADMM in Deep Learning
Efficient training of deep neural networks (DNNs) is a challenge due to ...

12/14/2021  Efficient differentiable quadratic programming layers: an ADMM approach
Recent advances in neural-network architecture allow for seamless integr...

09/06/2020  An Analysis of Alternating Direction Method of Multipliers for Feed-forward Neural Networks
In this work, we present a hardware compatible neural network training a...

12/16/2020  ADMM and inexact ALM: the QP case
Embedding randomization procedures in the Alternating Direction Method o...

06/10/2020  ADMMiRNN: Training RNN with Stable Convergence via An Efficient ADMM Approach
It is hard to train Recurrent Neural Network (RNN) with stable convergen...

01/27/2019  Large-Scale Classification using Multinomial Regression and ADMM
We present a novel method for learning the weights in multinomial logist...
