Learning Neural Network Architectures using Backpropagation

11/17/2015
by Suraj Srinivas et al.

Deep neural networks with millions of parameters are at the heart of many state-of-the-art machine learning models today. However, recent works have shown that models with a much smaller number of parameters can perform just as well. In this work, we introduce the problem of architecture-learning, i.e., learning the architecture of a neural network along with its weights. We introduce a new trainable parameter called the tri-state ReLU, which helps eliminate unnecessary neurons. We also propose a smooth regularizer which encourages the total number of neurons after elimination to be small. The resulting objective is differentiable and simple to optimize. We experimentally validate our method on both small and large networks, and show that it can learn models with a considerably smaller number of parameters without affecting prediction accuracy.
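The core idea described above can be sketched in code: attach a trainable per-neuron gate to a ReLU so that gates driven to zero effectively remove neurons, and add a smooth penalty pushing gates toward binary values and their total count downward. The function names, the exact gating form, and the penalty below are assumptions based only on the abstract, not the paper's precise formulation.

```python
import numpy as np

def tri_state_relu(x, w, d):
    # Hypothetical sketch of a gated ReLU with trainable parameters:
    # w acts as a per-neuron gate (w -> 0 silences the neuron) and
    # d as a slope for negative inputs. Both stay differentiable,
    # so they can be learned by backpropagation alongside weights.
    return w * (np.maximum(x, 0.0) + d * np.minimum(x, 0.0))

def neuron_count_regularizer(w, lam=1e-3):
    # Assumed smooth regularizer: w * (1 - w) pushes each gate toward
    # 0 or 1, while the sum over gates encourages few active neurons.
    w = np.asarray(w)
    return lam * (np.sum(w * (1.0 - w)) + np.sum(w))
```

With the gate fully open (w = 1, d = 0) the unit behaves like a standard ReLU; with the gate closed (w = 0) its output is identically zero, so the neuron can be pruned after training.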


