Most Activation Functions Can Win the Lottery Without Excessive Depth

05/04/2022
by Rebekka Burkholz, et al.

The strong lottery ticket hypothesis has highlighted the potential for training deep neural networks by pruning, which has inspired interesting practical and theoretical insights into how neural networks can represent functions. For networks with ReLU activation functions, it has been proven that a target network of depth L can be approximated by a subnetwork of a randomly initialized neural network that has double the target's depth, 2L, and is wider by a logarithmic factor. We show that a depth L+1 network is sufficient. This result indicates that we can expect to find lottery tickets at realistic, commonly used depths while requiring only logarithmic overparametrization. Our novel construction approach applies to a large class of activation functions and is not limited to ReLUs.
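Existence proofs in this line of work typically reduce to a subset-sum argument: each target weight is matched by keeping or pruning a small pool of random weight products, and roughly O(log(1/ε)) candidates per weight suffice with high probability, which is where the logarithmic width overhead comes from. The sketch below is illustrative only and is not the paper's construction; `prune_to_match` and its parameters are hypothetical names, and exhaustive search stands in for the probabilistic subset-sum argument.

```python
import itertools

import numpy as np


def prune_to_match(target, n_products=12, seed=0):
    """Brute-force subset-sum sketch: approximate one target weight
    by keeping (mask = 1) or pruning (mask = 0) products of random
    weights, as in strong-lottery-ticket existence arguments."""
    rng = np.random.default_rng(seed)
    # a_i * b_i: the effective weight a random two-layer path
    # contributes to the output if it survives pruning.
    products = rng.uniform(-1, 1, n_products) * rng.uniform(-1, 1, n_products)
    best_mask, best_err = None, float("inf")
    # Try every binary pruning mask and keep the best approximation.
    for mask in itertools.product((0, 1), repeat=n_products):
        err = abs(np.dot(mask, products) - target)
        if err < best_err:
            best_mask, best_err = mask, err
    return np.array(best_mask), best_err


if __name__ == "__main__":
    for w in (0.7, -0.35, 0.05):
        _, err = prune_to_match(w)
        print(f"target {w:+.2f}: approximation error {err:.2e}")
```

Even with only a dozen random products per target weight, the achievable error is typically small, which conveys why the width overhead in such constructions can stay logarithmic in the desired accuracy.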


Related research

07/13/2023
Deep Network Approximation: Beyond ReLU to Diverse Activation Functions
This paper explores the expressive power of deep neural networks for a d...

05/04/2022
Convolutional and Residual Networks Provably Contain Lottery Tickets
The Lottery Ticket Hypothesis continues to have a profound practical imp...

03/28/2020
Memorizing Gaussians with no over-parameterization via gradient descent on neural networks
We prove that a single step of gradient descent over a depth-two network, w...

10/05/2014
Understanding Locally Competitive Networks
Recently proposed neural network activation functions such as rectified ...

09/30/2018
Deep, Skinny Neural Networks are not Universal Approximators
In order to choose a neural network architecture that will be effective ...

01/25/2019
When Can Neural Networks Learn Connected Decision Regions?
Previous work has questioned the conditions under which the decision reg...

04/29/2022
Wide and Deep Neural Networks Achieve Optimality for Classification
While neural networks are used for classification tasks across domains, ...
