A Bregman Learning Framework for Sparse Neural Networks

05/10/2021
by Leon Bungert, et al.

We propose a learning framework based on stochastic Bregman iterations to train sparse neural networks with an inverse scale space approach. We derive a baseline algorithm called LinBreg, an accelerated version using momentum, and AdaBreg, which is a Bregmanized generalization of the Adam algorithm. In contrast to established methods for sparse training, the proposed family of algorithms constitutes a regrowth strategy for neural networks that is solely optimization-based, without additional heuristics. Our Bregman learning framework starts the training with very few initial parameters and successively adds only significant ones, yielding a sparse and expressive network. The proposed approach is extremely simple and efficient, yet supported by the rich mathematical theory of inverse scale space methods. We derive a statistically grounded sparse parameter initialization strategy and provide a rigorous stochastic convergence analysis of the loss decay, together with additional convergence proofs in the convex regime. Using only 3.4% of the parameters of a ResNet-18 we achieve 90.2% test accuracy on CIFAR-10, compared to 93.6% using the dense network. Our algorithm also unveils an autoencoder architecture for a denoising task. The proposed framework also has huge potential for integrating sparse backpropagation and resource-friendly training.
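
To make the inverse-scale-space mechanism concrete, here is a minimal sketch of one stochastic linearized Bregman (LinBreg-style) update, not the authors' reference implementation. It assumes an elastic-net regularizer J(theta) = lam * ||theta||_1 + 1/(2*delta) * ||theta||_2^2, for which the primal variable is recovered from the dual variable by scaled soft-thresholding; the names soft_threshold and linbreg_step and all hyperparameter values are illustrative.

```python
# Sketch of a stochastic linearized Bregman update (LinBreg-style),
# assuming J(theta) = lam*||theta||_1 + 1/(2*delta)*||theta||_2^2.
# Hypothetical names and values, for illustration only.
import numpy as np

def soft_threshold(v, lam):
    """Soft-thresholding (shrinkage), the prox of lam*||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - lam, 0.0)

def linbreg_step(v, grad, tau=0.1, lam=1.0, delta=1.0):
    """One stochastic linearized Bregman step for a parameter tensor.

    v    : dual (subgradient) variable, same shape as the parameters
    grad : stochastic gradient of the loss at the current parameters
    Returns the updated dual variable and the new sparse parameters.
    """
    v = v - tau * grad                       # gradient step on the dual variable
    theta = delta * soft_threshold(v, lam)   # primal parameters stay sparse
    return v, theta

# Toy usage: all parameters start at zero (sparse initialization) and are
# regrown only once accumulated gradient information becomes significant.
rng = np.random.default_rng(0)
v = np.zeros(10)
for _ in range(100):
    grad = rng.normal(size=10)   # stand-in for a minibatch gradient
    v, theta = linbreg_step(v, grad)
```

Because theta = delta * soft_threshold(v, lam), a parameter remains exactly zero until the magnitude of its accumulated dual variable exceeds lam. This is the inverse-scale-space behavior the abstract describes: training starts from a very sparse network and only significant parameters are added over time.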


Related research

05/30/2022
Last-iterate convergence analysis of stochastic momentum methods for neural networks
The stochastic momentum method is a commonly used acceleration technique...

08/30/2018
A Unified Analysis of Stochastic Momentum Methods for Deep Learning
Stochastic momentum methods have been widely adopted in training deep ne...

02/09/2023
SparseProp: Efficient Sparse Backpropagation for Faster Training of Neural Networks
We provide a new efficient version of the backpropagation algorithm, spe...

02/02/2021
Truly Sparse Neural Networks at Scale
Recently, sparse training methods have started to be established as a de...

05/06/2023
Adam-family Methods for Nonsmooth Optimization with Convergence Guarantees
In this paper, we present a comprehensive study on the convergence prope...

02/27/2018
A Mathematical Framework for Deep Learning in Elastic Source Imaging
An inverse elastic source problem with sparse measurements is of concern...

03/02/2023
Dodging the Sparse Double Descent
This paper presents an approach to addressing the issue of over-parametr...
