Using Topological Framework for the Design of Activation Function and Model Pruning in Deep Neural Networks

09/03/2021
by Yogesh Kochar, et al.

The success of deep neural networks in diverse tasks across computer vision, speech recognition, and natural language processing has made it necessary to understand both the dynamics of the training process and the workings of trained models. This paper makes two independent contributions: 1) a novel activation function for faster training convergence, and 2) a systematic method for pruning filters of trained models, irrespective of the activation function used. We analyze how the topology of the space of training samples is transformed by each successive layer during training, and how this transformation changes with the activation function. Betti numbers are used to quantify the topological complexity of the data. The impact of the activation function on convergence during training is reported for binary classification, and a novel activation function aimed at faster convergence for classification tasks is proposed. Experiments on popular synthetic binary classification datasets with large Betti numbers (>150) using MLPs show that the proposed activation function reduces Betti numbers more quickly across layers and therefore converges faster, requiring fewer epochs by a factor of 1.5 to 2. The methodology was verified on benchmark image datasets (Fashion-MNIST, CIFAR-10, and cat-vs-dog images) using CNNs. Based on these empirical results, we propose a novel method for pruning a trained model: filters that transform the data into a topological space with large Betti numbers are eliminated. Removing all filters with Betti numbers greater than 300 from each layer caused no significant loss in accuracy, while yielding faster prediction times and a smaller model memory footprint.
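
The abstract does not include an implementation, but the pruning criterion it describes can be illustrated with a short sketch. The snippet below estimates the Betti numbers of each filter's activation cloud via persistent homology and flags filters whose topological complexity exceeds the reported threshold of 300. The use of the ripser package, the persistence cutoff, and the helper names (betti_numbers, filters_to_prune) are assumptions for illustration, not the authors' actual code.

```python
# Illustrative sketch only: the ripser package, the persistence cutoff, and all
# helper names are assumptions, not the authors' implementation.
import numpy as np
from ripser import ripser  # persistent homology of point clouds

BETTI_THRESHOLD = 300      # pruning cutoff reported in the abstract
PERSISTENCE_CUTOFF = 0.1   # minimum bar length counted as a real feature (assumed)

def betti_numbers(points, maxdim=1):
    """Estimate (beta_0, ..., beta_maxdim) of a point cloud by counting
    sufficiently persistent bars in each persistence diagram."""
    dgms = ripser(points, maxdim=maxdim)['dgms']
    betti = []
    for dgm in dgms:
        if len(dgm) == 0:
            betti.append(0)
            continue
        lifetimes = dgm[:, 1] - dgm[:, 0]  # death - birth; infinite bars stay infinite
        betti.append(int(np.sum(lifetimes > PERSISTENCE_CUTOFF)))
    return betti

def filters_to_prune(activations):
    """activations: (n_samples, n_filters, h, w) outputs of one conv layer.
    Returns indices of filters whose activation cloud has a total Betti
    number above BETTI_THRESHOLD."""
    n_samples, n_filters = activations.shape[:2]
    prune = []
    for f in range(n_filters):
        cloud = activations[:, f].reshape(n_samples, -1)  # one point per sample
        if sum(betti_numbers(cloud)) > BETTI_THRESHOLD:
            prune.append(f)
    return prune

# Example usage on random activations (stand-in for a trained CNN layer):
acts = np.random.rand(200, 16, 8, 8)
print(filters_to_prune(acts))
```

In a real setting the activation clouds would come from the trained network's per-filter responses on the training data, and the flagged filters would be removed layer by layer before measuring accuracy, mirroring the pruning procedure described above.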


Related research

11/13/2022 - Evaluating CNN with Oscillatory Activation Function
The reason behind CNNs capability to learn high-dimensional complex feat...

04/13/2020 - Topology of deep neural networks
We study how the topology of a data set M = M_a ∪ M_b ⊆ R^d, representing...

10/15/2020 - An Algorithm for Learning Smaller Representations of Models With Scarce Data
We present a greedy algorithm for solving binary classification problems...

03/22/2020 - TanhExp: A Smooth Activation Function with High Convergence Speed for Lightweight Neural Networks
Lightweight or mobile neural networks used for real-time computer vision...

09/05/2020 - Binary Classification as a Phase Separation Process
We propose a new binary classification model called Phase Separation Bin...

07/14/2022 - DropNet: Reducing Neural Network Complexity via Iterative Pruning
Modern deep neural networks require a significant amount of computing ti...

10/11/2019 - The Expressivity and Training of Deep Neural Networks: toward the Edge of Chaos?
Expressivity is one of the most significant issues in assessing neural n...
