Full-Jacobian Representation of Neural Networks

05/02/2019
by   Suraj Srinivas, et al.
70

Non-linear functions such as neural networks can be locally approximated by affine planes. Recent works make use of input-Jacobians, which describe the normal to these planes. In this paper, we introduce full-Jacobians, which includes this normal along with an additional intercept term called the bias-Jacobians, that together completely describe local planes. For ReLU neural networks, bias-Jacobians correspond to sums of gradients of outputs w.r.t. intermediate layer activations. We first use these full-Jacobians for distillation by aligning gradients of their intermediate representations. Next, we regularize bias-Jacobians alone to improve generalization. Finally, we show that full-Jacobian maps can be viewed as saliency maps. Experimental results show improved distillation on small data-sets, improved generalization for neural network training, and sharper saliency maps.

READ FULL TEXT
research
11/10/2020

Removing Brightness Bias in Rectified Gradients

Interpretation and improvement of deep neural networks relies on better ...
research
11/29/2021

Improving Deep Learning Interpretability by Saliency Guided Training

Saliency methods have been widely used to highlight important input feat...
research
06/12/2022

InBiaseD: Inductive Bias Distillation to Improve Generalization and Robustness through Shape-awareness

Humans rely less on spurious correlations and trivial cues, such as text...
research
09/29/2020

Trustworthy Convolutional Neural Networks: A Gradient Penalized-based Approach

Convolutional neural networks (CNNs) are commonly used for image classif...
research
02/13/2023

Dataset Distillation with Convexified Implicit Gradients

We propose a new dataset distillation algorithm using reparameterization...
research
09/25/2019

Neural networks are a priori biased towards Boolean functions with low entropy

Understanding the inductive bias of neural networks is critical to expla...
research
06/01/2021

Natural Statistics of Network Activations and Implications for Knowledge Distillation

In a matter that is analog to the study of natural image statistics, we ...

Please sign up or login with your details

Forgot password? Click here to reset