Message Passing Descent for Efficient Machine Learning

02/16/2021
by Francesco Concetti, et al.

We propose a new iterative optimization method for the Data-Fitting (DF) problem in Machine Learning, e.g., Neural Network (NN) training. The approach relies on a Graphical Model (GM) representation of the DF problem, in which the variables are the fitting parameters and the factors are associated with the Input-Output (IO) data. The GM leads to Belief Propagation equations considered in the Large Deviation Limit, corresponding to the practically important case where the number of IO samples is much larger than the number of fitting parameters. We suggest the Message Passing Descent (MPD) algorithm, which relies on a piece-wise-polynomial representation of the model DF function. In contrast with the popular gradient descent and related algorithms, our MPD algorithm relies on analytic (not automatic) differentiation and, most importantly, descends through the rugged DF landscape by making non-local updates of the parameters at each iteration. The non-locality guarantees that MPD is not trapped in local minima, resulting in better performance than locally updated algorithms of the gradient-descent type. We illustrate the superior performance of the algorithm on a Feed-Forward NN with a single hidden layer and a piece-wise-linear activation function.
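For concreteness, the sketch below (NumPy, illustrative only; all names and the update rule are assumptions, not the authors' code) sets up the ingredients the abstract describes: a single-hidden-layer network with a piece-wise-linear (ReLU) activation, a data-fitting objective written as a sum of per-sample factors, and an exact, non-local minimization of the loss along one parameter, which is piece-wise quadratic because the model is piece-wise linear. It is not the MPD algorithm itself; it only shows how piece-wise-polynomial structure enables analytic, non-local parameter updates.

```python
# Minimal NumPy sketch (NOT the authors' code and NOT the MPD algorithm):
# a single-hidden-layer, piece-wise-linear (ReLU) network, a data-fitting
# objective that sums one factor per Input-Output sample, and an illustrative
# *non-local* coordinate update exploiting the piece-wise-quadratic structure
# of the loss along a single parameter. All names are hypothetical.
import numpy as np

rng = np.random.default_rng(0)

# Toy IO data: each of the N samples contributes one factor over the
# shared fitting parameters (N much larger than the number of parameters).
N, d_in, d_hid = 200, 3, 5
X = rng.normal(size=(N, d_in))
y = np.sin(X @ rng.normal(size=d_in))

# Fitting parameters of the single-hidden-layer model.
W1 = 0.5 * rng.normal(size=(d_hid, d_in))
b1 = np.zeros(d_hid)
w2 = 0.5 * rng.normal(size=d_hid)
b2 = 0.0

def predict(X):
    return np.maximum(X @ W1.T + b1, 0.0) @ w2 + b2   # ReLU hidden layer

def total_loss():
    return 0.5 * np.sum((predict(X) - y) ** 2)         # sum of per-sample factors

def exact_coordinate_update(j, k):
    """Globally minimize the loss over the single scalar theta = W1[j, k].

    With everything else fixed, each sample's prediction is piece-wise linear
    in theta (one ReLU breakpoint per sample), so the total loss is piece-wise
    quadratic and can be minimized analytically segment by segment -- a
    non-local update, unlike a local gradient step."""
    xk = X[:, k]
    c = X @ W1[j] + b1[j] - W1[j, k] * xk              # pre-activation = theta*xk + c
    other = np.maximum(np.delete(X @ W1.T + b1, j, axis=1), 0.0) \
        @ np.delete(w2, j) + b2                        # output of the other hidden units
    bp = np.sort(-c[xk != 0] / xk[xk != 0])            # ReLU breakpoints in theta
    edges = np.concatenate(([-np.inf], bp, [np.inf]))
    best_theta, best_val = W1[j, k], np.inf
    for lo, hi in zip(edges[:-1], edges[1:]):
        mid = (0.5 * (lo + hi) if np.isfinite(lo) and np.isfinite(hi)
               else lo + 1.0 if np.isfinite(lo)
               else hi - 1.0 if np.isfinite(hi) else 0.0)
        active = (mid * xk + c) > 0                    # fixed ReLU pattern on this segment
        a = w2[j] * active * xk                        # residual_i = a_i*theta + d_i - y_i
        d = w2[j] * active * c + other
        denom = np.sum(a * a)
        theta = np.sum(a * (y - d)) / denom if denom > 0 else mid
        theta = np.clip(theta, lo, hi)                 # stay inside the segment
        val = 0.5 * np.sum((a * theta + d - y) ** 2)
        if val < best_val:
            best_theta, best_val = theta, val
    W1[j, k] = best_theta

print("loss before:", total_loss())
for j in range(d_hid):                                 # one sweep of non-local updates
    for k in range(d_in):
        exact_coordinate_update(j, k)
print("loss after :", total_loss())
```

Because each update finds the global minimum of the one-dimensional restriction of the loss (scanning all ReLU breakpoints instead of taking a local step), the loss decreases monotonically over the sweep. The actual MPD algorithm of the paper operates on the GM/Belief Propagation representation rather than on single coordinates; the sketch only conveys the flavor of analytic, non-local updates.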


