Equivalent and Approximate Transformations of Deep Neural Networks

05/27/2019
by Abhinav Kumar, et al.

Two networks are equivalent if they produce the same output for any given input. In this paper, we study the possibility of transforming a deep neural network into another network with a different number of units or layers, which can be equivalent to, a local exact approximation of, or a global linear approximation of the original network. On the practical side, we show that certain rectified linear units (ReLUs) can be safely removed from a network if they are always active or always inactive for any valid input. If we only need an equivalent network on a smaller domain, then more units can be removed and some layers collapsed. On the theoretical side, we constructively show that for any feed-forward ReLU network, there exists a global linear approximation given by a shallow network with 2 hidden layers and a fixed number of units. This result strikes a balance between the growing number of units required for arbitrary approximation with a single hidden layer and the known upper bound of log(n_0+1) + 1 layers for exact representation, where n_0 is the input dimension. While the transformed network may require an exponential number of units to capture every activation pattern of the original network, we show that it can be made substantially smaller by accounting only for the patterns that define linear regions. In experiments with ReLU networks on the MNIST dataset, we find that l_1-regularization and adversarial training significantly reduce the number of linear regions, since the induced weight sparsity increases the number of stable units. Therefore, we can also intentionally train ReLU networks to allow for effective lossless compression and approximation.
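As an illustration of the practical side described above, here is a minimal sketch (not the authors' code) of how stable ReLU units can be removed: a unit that is inactive for every valid input is dropped, and a unit that is always active behaves linearly, so its effect can be folded into an affine bypass of the next layer. The function names and the sampling-based stability check are assumptions made for this example; the paper works with guarantees over the whole valid input domain.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def compress_layer(W1, b1, W2, b2, X_valid):
    """Return an equivalent, smaller form of x -> W2 @ relu(W1 @ x + b1) + b2
    for the inputs represented by X_valid (shape: n_samples x n_in).

    Stability is estimated here by probing pre-activations on sample inputs;
    this is a heuristic stand-in for an exact, domain-wide analysis."""
    pre = X_valid @ W1.T + b1                # pre-activations, (n_samples, n_hidden)
    always_on = np.all(pre > 0, axis=0)      # stably active units (ReLU acts as identity)
    always_off = np.all(pre <= 0, axis=0)    # stably inactive units (output is always 0)
    unstable = ~(always_on | always_off)

    # Unstable units keep their ReLU nonlinearity.
    W1_u, b1_u, W2_u = W1[unstable], b1[unstable], W2[:, unstable]

    # Stably active units contribute a purely affine term, folded into a bypass;
    # stably inactive units contribute nothing and are simply dropped.
    W_bypass = W2[:, always_on] @ W1[always_on]
    b_out = W2[:, always_on] @ b1[always_on] + b2

    def compressed(x):
        return W2_u @ relu(W1_u @ x + b1_u) + W_bypass @ x + b_out

    return compressed, int(unstable.sum())
```

On the inputs used for the stability check, the returned function reproduces W2 @ relu(W1 @ x + b1) + b2 exactly while keeping only the unstable hidden units, which is the sense in which stable units can be safely removed.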


