S^2-LBI: Stochastic Split Linearized Bregman Iterations for Parsimonious Deep Learning

04/24/2019
by Yanwei Fu, et al.

This paper proposes a novel Stochastic Split Linearized Bregman Iteration (S^2-LBI) algorithm for efficiently training deep networks. S^2-LBI introduces an iterative regularization path with structural sparsity, combining the computational efficiency of LBI with model selection consistency in learning structural sparsity. The computed solution path intrinsically enables us to enlarge or simplify a network, a property that follows theoretically from the dynamics of the S^2-LBI algorithm. Experimental results validate S^2-LBI on the MNIST and CIFAR-10 datasets. For example, on MNIST we can either boost a network with only 1.5K parameters (1 convolutional layer of 5 filters and 1 FC layer) to 98.40% recognition accuracy, or prune 82.5% of the parameters of the LeNet-5 network while still achieving 98.47% recognition accuracy. Results on ImageNet will be added in the next version of this report.
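The abstract does not spell out the update rule, but Split LBI is typically formulated by coupling the dense parameters theta to a sparse auxiliary copy gamma through an L2 penalty and updating gamma via soft-thresholding of a mirror variable z, which traces out a regularization path as iterations proceed. Below is a minimal, illustrative sketch of that standard formulation on a plain least-squares model; the names (split_lbi, theta, gamma, z) and the step-size heuristic are ours for illustration, not the paper's stochastic deep-network implementation.

```python
import numpy as np

def soft_threshold(z, t=1.0):
    """Elementwise shrinkage; the source of sparsity along the gamma path."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def split_lbi(X, y, nu=1.0, kappa=10.0, alpha=None, n_iters=500):
    """Minimal Split LBI sketch for least squares (illustrative only;
    not the paper's stochastic deep-network variant).

    theta : dense parameters, updated by plain gradient steps
    gamma : sparse split copy of theta, coupled via (1/(2*nu))*||theta - gamma||^2
    z     : mirror variable whose soft-thresholding yields gamma
    """
    n, p = X.shape
    theta = np.zeros(p)
    gamma = np.zeros(p)
    z = np.zeros(p)
    if alpha is None:
        # Conservative step-size heuristic; the theory ties alpha to kappa,
        # nu, and a bound on the Hessian of the augmented loss.
        alpha = nu / (kappa * (np.linalg.norm(X, 2) ** 2 / n + 1.0))
    path = []
    for _ in range(n_iters):
        grad_theta = X.T @ (X @ theta - y) / n + (theta - gamma) / nu
        grad_gamma = (gamma - theta) / nu
        z -= alpha * grad_gamma              # linearized Bregman step on gamma
        gamma = kappa * soft_threshold(z)    # sparse estimate along the path
        theta -= kappa * alpha * grad_theta  # dense gradient step
        path.append(gamma.copy())
    return theta, gamma, np.array(path)

# Tiny synthetic usage example: recover a 5-sparse signal.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 50))
w_true = np.zeros(50)
w_true[:5] = 3.0
y = X @ w_true + 0.1 * rng.standard_normal(200)
theta, gamma, path = split_lbi(X, y)
print("recovered support:", np.nonzero(gamma)[0])
```

The early iterates of gamma stay sparse and gradually pick up coordinates, so stopping the loop at different iterations plays the role of choosing a point on the regularization path rather than tuning an explicit penalty weight.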

