ADMM for Efficient Deep Learning with Global Convergence

05/31/2019
by   Junxiang Wang, et al.

Alternating Direction Method of Multipliers (ADMM) has been used successfully in many conventional machine learning applications and is considered a useful alternative to Stochastic Gradient Descent (SGD) as a deep learning optimizer. However, as an emerging domain, several challenges remain: 1) the lack of global convergence guarantees, 2) slow convergence toward solutions, and 3) cubic time complexity in the feature dimensions. In this paper, we propose a novel optimization framework for deep learning via ADMM (dlADMM) that addresses these challenges simultaneously. The parameters in each layer are updated backward and then forward, so that parameter information is exchanged efficiently across layers. The time complexity is reduced from cubic to quadratic in the (latent) feature dimensions via a dedicated algorithm design for the subproblems that uses iterative quadratic approximations and backtracking. Finally, we provide the first proof of global convergence for an ADMM-based method (dlADMM) on a deep neural network problem under mild conditions. Experiments on benchmark datasets demonstrate that the proposed dlADMM algorithm outperforms most of the comparison methods.
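The abstract mentions solving each subproblem by iterative quadratic approximation with backtracking. A minimal, generic sketch of that idea (not the paper's exact update rule; `f`, `grad_f`, and the parameter names here are illustrative assumptions): the objective is modeled by a quadratic around the current point, and the curvature coefficient is increased until the model upper-bounds the true objective, guaranteeing descent.

```python
import numpy as np

def quadratic_backtracking_step(f, grad_f, W, theta0=1.0, beta=2.0, max_iter=50):
    """One subproblem update via a quadratic approximation with backtracking.

    Around W, f is modeled as
        Q(V; theta) = f(W) + <grad_f(W), V - W> + (theta / 2) * ||V - W||^2,
    whose minimizer is V = W - grad_f(W) / theta. The coefficient theta is
    increased (backtracking) until f(V) <= Q(V; theta), so each accepted
    step decreases f. This is a hedged sketch of the technique named in
    the abstract, not the authors' exact dlADMM subroutine.
    """
    g = grad_f(W)
    fW = f(W)
    theta = theta0
    for _ in range(max_iter):
        V = W - g / theta
        # Value of the quadratic model at the candidate point V.
        Q = fW + np.sum(g * (V - W)) + 0.5 * theta * np.sum((V - W) ** 2)
        if f(V) <= Q:
            return V, theta
        theta *= beta  # larger theta -> smaller, more conservative step
    return W, theta  # fall back to no move if backtracking never succeeds

# Toy usage: minimize f(W) = ||W||^2 / 2, whose minimizer is W = 0.
f = lambda W: 0.5 * np.sum(W ** 2)
grad_f = lambda W: W
W = np.array([3.0, -4.0])
for _ in range(20):
    W, _ = quadratic_backtracking_step(f, grad_f, W)
```

Because the acceptance test compares against a majorizing quadratic rather than requiring exact curvature knowledge, the same loop applies to each of dlADMM's per-layer subproblems without computing full Hessians, which is what keeps the per-iteration cost quadratic rather than cubic in the feature dimensions.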


