Lifted Bregman Training of Neural Networks

08/18/2022
by Xiaoyu Wang, et al.

We introduce a novel mathematical formulation for the training of feed-forward neural networks with (potentially non-smooth) proximal maps as activation functions. This formulation is based on Bregman distances, and a key advantage is that its partial derivatives with respect to the network's parameters do not require the computation of derivatives of the network's activation functions. Instead of estimating the parameters with a combination of a first-order optimisation method and back-propagation (as is the state of the art), we propose the use of non-smooth first-order optimisation methods that exploit the specific structure of the novel formulation. We present several numerical results that demonstrate that these training approaches can be equally well or even better suited for the training of neural network-based classifiers and (denoising) autoencoders with sparse coding, compared to more conventional training frameworks.
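To make the two central ideas of the abstract concrete, the sketch below first shows that common activation functions are proximal maps (ReLU is the projection onto the non-negative orthant; soft-thresholding, which produces sparse codes, is the proximal map of a scaled ℓ1-norm), and then builds a lifted-Bregman-style layer penalty for the ReLU case whose gradient in the pre-activation contains no derivative of the activation function. The penalty formula used here is an illustrative assumption in the spirit of the formulation described above, not necessarily the paper's exact expression:

```python
import numpy as np

# An activation as a proximal map: prox_f(y) = argmin_x 0.5*||x - y||^2 + f(x).

def relu(y):
    """ReLU is the proximal map of the indicator of {x >= 0},
    i.e. the Euclidean projection onto the non-negative orthant."""
    return np.maximum(y, 0.0)

def soft_threshold(y, lam):
    """Soft-thresholding is the proximal map of lam*||x||_1;
    it yields the sparse codes mentioned in the abstract."""
    return np.sign(y) * np.maximum(np.abs(y) - lam, 0.0)

# A lifted-Bregman-style penalty for one ReLU layer (a sketch under the
# assumption stated in the lead-in): with Psi the indicator of {x >= 0},
#     B(x, y) = Psi(x) + 0.5*||x||^2 + 0.5*||relu(y)||^2 - <x, y>,
# where y = W x_prev + b is the pre-activation. Its gradient in y is
# relu(y) - x: no derivative of the activation function appears.

def lifted_penalty(x, y):
    # x is assumed feasible (x >= 0), so Psi(x) = 0 is omitted.
    return 0.5 * x @ x + 0.5 * relu(y) @ relu(y) - x @ y

def lifted_penalty_grad_y(x, y):
    # Derivative-free in the activation: only the prox (ReLU) itself is used.
    return relu(y) - x

rng = np.random.default_rng(0)
x = relu(rng.standard_normal(5))   # feasible target activations, x >= 0
y = rng.standard_normal(5)         # pre-activations

# Finite-difference check that the closed-form gradient is correct.
eps = 1e-6
fd = np.array([
    (lifted_penalty(x, y + eps * e) - lifted_penalty(x, y - eps * e)) / (2 * eps)
    for e in np.eye(5)
])
assert np.allclose(fd, lifted_penalty_grad_y(x, y), atol=1e-5)
```

The finite-difference check passing with the closed-form expression `relu(y) - x` illustrates why such formulations remain usable with non-smooth activations: the update direction is expressed through the proximal map itself rather than through its (possibly non-existent) derivative.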


