Adam Induces Implicit Weight Sparsity in Rectifier Neural Networks

12/19/2018
by Atsushi Yaguchi, et al.

In recent years, deep neural networks (DNNs) have been applied to various machine learning tasks, including image recognition, speech recognition, and machine translation. However, achieving state-of-the-art performance requires large DNN models whose size exceeds the capabilities of edge devices, so model reduction is needed for practical use. In this paper, we point out that training automatically induces group sparsity of weights, in which all weights connected to an output channel (node) are zero, when DNNs are trained under the following three conditions: (1) rectified-linear-unit (ReLU) activations, (2) an L_2-regularized objective function, and (3) the Adam optimizer. Next, we analyze this behavior both theoretically and experimentally, and propose a simple model reduction method: eliminate the zero weights after training the DNN. In experiments on the MNIST and CIFAR-10 datasets, we demonstrate the sparsity with various training setups. Finally, we show that our method can efficiently reduce the model size and performs well relative to methods that use a sparsity-inducing regularizer.
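The sketch below illustrates the setup the abstract describes under stated assumptions: it is not the authors' code. It trains a small ReLU network with an L_2-regularized objective via Adam (PyTorch's weight_decay adds the L_2 term to the gradient inside Adam), then counts hidden units whose incoming weights and bias have collapsed to (near-)zero; such units output a constant zero after ReLU and can be eliminated, along with the corresponding columns of the next layer's weight matrix, without changing the network's function. The architecture, dummy data, threshold eps, and hyperparameters are illustrative assumptions, not values from the paper.

```python
# Minimal sketch: Adam + L2 regularization + ReLU, then prune zeroed channels.
import torch
import torch.nn as nn

torch.manual_seed(0)

model = nn.Sequential(
    nn.Linear(784, 512), nn.ReLU(),
    nn.Linear(512, 512), nn.ReLU(),
    nn.Linear(512, 10),
)
# weight_decay adds the L2 penalty to the gradient inside Adam (conditions 2 and 3).
opt = torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=1e-4)
loss_fn = nn.CrossEntropyLoss()

# Dummy data standing in for MNIST-like inputs (assumption, for illustration only).
x = torch.randn(1024, 784)
y = torch.randint(0, 10, (1024,))

for step in range(200):
    opt.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    opt.step()

# After training, find hidden units (output channels) whose incoming weights
# and bias are all (near-)zero; a ReLU unit with zero inputs and zero bias
# emits a constant zero, so it can be removed without changing the output.
eps = 1e-3  # hypothetical pruning threshold
for name, layer in [("fc1", model[0]), ("fc2", model[2])]:
    w = layer.weight.detach()                      # shape: (out_features, in_features)
    dead = (w.abs().max(dim=1).values < eps) & (layer.bias.detach().abs() < eps)
    print(f"{name}: {int(dead.sum())} / {w.shape[0]} prunable output channels")
```

In an actual reduction pass, the flagged rows would be deleted together with the matching input columns of the following layer, shrinking both matrices; the sketch only reports how many channels qualify.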

