Conditional Information Gain Networks

07/25/2018
by Ufuk Can Biçici, et al.

Deep neural network models owe their representational power to their large number of learnable parameters. It is often infeasible to run these heavily parameterized deep models in resource-constrained environments, such as mobile phones. Network models that employ conditional computation can reduce computational requirements while retaining high representational power, thanks to their ability to model hierarchies. We propose Conditional Information Gain Networks, which allow feedforward deep neural networks to execute conditionally, skipping parts of the model based on the sample and on decision mechanisms inserted into the architecture. These decision mechanisms are trained with cost functions based on differentiable information gain, inspired by the training procedures of decision trees. Because the information gain based decision mechanisms are differentiable, the whole model can be trained end to end in a unified framework, with a general cost function covering both classification and decision losses. We test the effectiveness of the proposed method on the MNIST and the recently introduced Fashion-MNIST datasets, and show that our information gain based conditional execution approach achieves better or comparable classification results with significantly fewer parameters than standard convolutional neural network baselines.
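To make the core idea concrete, the sketch below shows a minimal, batch-estimated differentiable information gain objective of the kind the abstract describes, written in PyTorch as an assumption (the paper does not specify a framework); the names information_gain_loss and routing_logits are illustrative, not taken from the paper. It estimates the joint distribution of routing decisions and class labels over a mini-batch and returns the negative information gain, so minimizing it encourages the router to send samples of different classes down different branches.

    import torch
    import torch.nn.functional as F

    def information_gain_loss(routing_logits, labels, num_classes):
        """Negative differentiable information gain over a mini-batch.

        routing_logits: [batch, num_branches] raw scores from a decision node
        labels:         [batch] integer class labels
        """
        p_branch_given_x = F.softmax(routing_logits, dim=1)          # [B, K]
        y_onehot = F.one_hot(labels, num_classes).float()            # [B, C]
        # Joint distribution p(branch, class), estimated from the batch.
        p_joint = p_branch_given_x.t() @ y_onehot / labels.size(0)   # [K, C]
        p_branch = p_joint.sum(dim=1)                                # [K]
        p_class = p_joint.sum(dim=0)                                 # [C]
        eps = 1e-9
        # Marginal class entropy H(class).
        h_class = -(p_class * (p_class + eps).log()).sum()
        # Conditional entropy H(class | branch).
        p_class_given_branch = p_joint / (p_branch.unsqueeze(1) + eps)
        h_cond = -(p_joint * (p_class_given_branch + eps).log()).sum()
        return -(h_class - h_cond)  # minimize negative information gain

    # Example: a batch of 32 samples, a router with 2 branches, 10 classes.
    logits = torch.randn(32, 2, requires_grad=True)
    labels = torch.randint(0, 10, (32,))
    loss = information_gain_loss(logits, labels, num_classes=10)
    loss.backward()

In a conditional architecture, a term of this form would be added at each decision node alongside the classification loss, consistent with the unified cost function described in the abstract.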

Related research

01/02/2017 - Dynamic Deep Neural Networks: Optimizing Accuracy-Efficiency Trade-offs by Selective Execution
We introduce Dynamic Deep Neural Networks (D2NN), a new type of feed-for...

02/18/2020 - The Tree Ensemble Layer: Differentiability meets Conditional Computation
Neural networks and tree ensembles are state-of-the-art learners, each w...

08/01/2017 - Natural Language Processing with Small Feed-Forward Networks
We show that small and shallow feed-forward neural networks can achieve ...

02/12/2019 - Improving learnability of neural networks: adding supplementary axes to disentangle data representation
Over-parameterized deep neural networks have proven to be able to learn ...

01/02/2020 - Lightweight Residual Densely Connected Convolutional Neural Network
Extremely efficient convolutional neural network architectures are one o...

11/08/2020 - Exploring End-to-End Differentiable Natural Logic Modeling
We explore end-to-end trained differentiable models that integrate natur...

12/18/2019 - P-CapsNets: a General Form of Convolutional Neural Networks
We propose Pure CapsNets (P-CapsNets) which is a generation of normal CN...
