Channel-Wise Early Stopping without a Validation Set via NNK Polytope Interpolation

07/27/2021
by David Bonet, et al.

State-of-the-art neural network architectures continue to scale in size and deliver impressive generalization results, although this comes at the expense of limited interpretability. In particular, a key challenge is to determine when to stop training the model, as this has a significant impact on generalization. Convolutional neural networks (ConvNets) produce high-dimensional feature spaces formed by the aggregation of multiple channels, and analyzing intermediate data representations and the model's evolution in these spaces is challenging owing to the curse of dimensionality. We present channel-wise DeepNNK (CW-DeepNNK), a novel channel-wise generalization estimate based on non-negative kernel regression (NNK) graphs, with which we perform local polytope interpolation on low-dimensional channels. This method provides instance-based interpretability of both the learned data representations and the relationships between channels. Motivated by our observations, we use CW-DeepNNK to propose a novel early stopping criterion that (i) does not require a validation set, (ii) is based on a task performance metric, and (iii) allows stopping to be reached at a different point for each channel. Our experiments demonstrate that the proposed method has advantages compared to the standard criterion based on validation set performance.
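As a concrete illustration (a minimal sketch, not the authors' released code), the Python snippet below shows how a channel-wise leave-one-out generalization estimate of this kind could be computed: for each sample, NNK weights over its k nearest neighbors are obtained by solving a small non-negative least-squares problem on the kernel matrix, and the sample's label is interpolated from those neighbors; the resulting leave-one-out error for a channel can then be monitored across epochs as a per-channel early-stopping signal. The Gaussian kernel, the neighborhood size k, and the names nnk_weights and channel_loo_error are illustrative assumptions, not part of the paper.

import numpy as np
from scipy.linalg import solve_triangular
from scipy.optimize import nnls


def nnk_weights(K_SS, K_Si, jitter=1e-8):
    # NNK weights: argmin_{theta >= 0} 1/2 theta^T K_SS theta - theta^T K_Si.
    # Rewritten as non-negative least squares via the Cholesky factor L of
    # K_SS (K_SS = L L^T): minimize ||L^T theta - b||^2 with L b = K_Si,
    # which has the same minimizer.
    k = K_SS.shape[0]
    L = np.linalg.cholesky(K_SS + jitter * np.eye(k))
    b = solve_triangular(L, K_Si, lower=True)
    theta, _ = nnls(L.T, b)
    return theta


def channel_loo_error(features, labels, k=15, sigma=1.0):
    # Leave-one-out error of NNK label interpolation for a single channel.
    # features: (n, d) activations of one channel (e.g. pooled feature maps);
    # labels: (n,) integer class labels.
    n = features.shape[0]
    # Gaussian kernel on pairwise squared distances.
    sq = np.sum(features ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * features @ features.T
    K = np.exp(-d2 / (2.0 * sigma ** 2))

    n_classes = int(labels.max()) + 1
    errors = 0
    for i in range(n):
        # k most similar points to i, excluding i itself.
        order = np.argsort(-K[i])
        nbrs = order[order != i][:k]
        theta = nnk_weights(K[np.ix_(nbrs, nbrs)], K[nbrs, i])
        if theta.sum() == 0:
            continue  # no active neighbors; skip this sample
        theta = theta / theta.sum()
        # Interpolate a soft label from the neighbors and compare to the truth.
        soft = np.zeros(n_classes)
        for w, j in zip(theta, nbrs):
            soft[labels[j]] += w
        errors += int(soft.argmax() != labels[i])
    return errors / n

In a training loop, one would evaluate channel_loo_error on each channel's activations after every epoch and stop updating a channel (or the whole model) once its estimate stops improving, mirroring the per-channel stopping behavior described above. This sketch assumes a classification task and a Gaussian kernel; the paper should be consulted for the exact kernel, neighborhood construction, and stopping rule.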

