Characterizing Inter-Layer Functional Mappings of Deep Learning Models

by Donald Waagen, et al.

Deep learning architectures have demonstrated state-of-the-art performance for object classification and have become ubiquitous in commercial products. These methods are often applied without understanding (a) the difficulty of a classification task given the input data, and (b) how a specific deep learning architecture transforms that data. To answer (a) and (b), we illustrate the utility of a multivariate nonparametric estimator of class separation, the Henze-Penrose (HP) statistic, in the original as well as layer-induced representations. Given an N-class problem, our contribution treats the C(N,2) pairwise HP statistics as a sample from a distribution of class-pair separations. This allows us to characterize the distributional change in class separation induced at each layer of the model. Fisher permutation tests are used to detect statistically significant changes within a model. By comparing the HP statistic distributions between layers, one can statistically characterize layer adaptation during training, the contribution of each layer to the classification task, and the presence or absence of consistency between training and validation data. This is demonstrated for a simple deep neural network using the CIFAR10 with random labels, CIFAR10, and MNIST datasets.
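
As an illustration of the quantities named in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes the HP statistic is estimated with the Friedman-Rafsky minimal spanning tree construction (count the tree edges that join points of different classes), and it assumes a simple two-sample permutation test on the difference of means when comparing the class-pair HP samples of two layers. The function names `hp_statistic`, `pairwise_hp`, and `permutation_pvalue` are hypothetical, and the abstract does not specify the test statistic used in the paper.

```python
from itertools import combinations

import numpy as np
from scipy.sparse.csgraph import minimum_spanning_tree
from scipy.spatial.distance import pdist, squareform


def hp_statistic(x, y):
    """Estimate the Henze-Penrose separation of two samples (rows = points).

    Assumption: Friedman-Rafsky estimator, 1 - R * (m + n) / (2 * m * n),
    where R is the number of MST edges joining a point of x to a point of y.
    Duplicate points (zero distance) are treated as unconnected by SciPy's
    dense-graph convention, so this sketch assumes distinct points.
    """
    m, n = len(x), len(y)
    pooled = np.vstack([x, y])
    labels = np.concatenate([np.zeros(m, dtype=int), np.ones(n, dtype=int)])
    dist = squareform(pdist(pooled))            # dense Euclidean distances
    mst = minimum_spanning_tree(dist).tocoo()   # m + n - 1 edges
    cross = int(np.sum(labels[mst.row] != labels[mst.col]))
    return 1.0 - cross * (m + n) / (2.0 * m * n)


def pairwise_hp(features, targets):
    """Return the C(N,2) class-pair HP estimates for one representation."""
    classes = np.unique(targets)
    return {
        (a, b): hp_statistic(features[targets == a], features[targets == b])
        for a, b in combinations(classes, 2)
    }


def permutation_pvalue(hp_a, hp_b, n_perm=2000, seed=0):
    """Two-sided permutation p-value for a difference in mean HP value.

    Assumption: the two HP samples (e.g. from two layers) are compared by
    shuffling their pooled values; the paper may use a different statistic.
    """
    rng = np.random.default_rng(seed)
    hp_a, hp_b = np.asarray(hp_a, float), np.asarray(hp_b, float)
    observed = abs(hp_a.mean() - hp_b.mean())
    pooled = np.concatenate([hp_a, hp_b])
    hits = 0
    for _ in range(n_perm):
        perm = rng.permutation(pooled)
        diff = abs(perm[: len(hp_a)].mean() - perm[len(hp_a):].mean())
        hits += diff >= observed
    return (hits + 1) / (n_perm + 1)
```

In use, one would call `pairwise_hp` on the raw inputs and on each layer's activations, then pass the resulting values from two layers to `permutation_pvalue`. Values near 1 indicate well-separated class pairs; values near 0 indicate heavy overlap, and sampling noise can make the estimate slightly negative.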






