A Geometric Look at Double Descent Risk: Volumes, Singularities, and Distinguishabilities

06/08/2020
by Prasad Cheema, et al.

The double-descent risk phenomenon has received growing interest in the machine learning and statistics communities, as it challenges well-understood notions behind U-shaped train-test curves. Motivated by Rissanen's minimum description length (MDL), Balasubramanian's Occam's razor, and Amari's information geometry, we investigate how the logarithm of the model volume, log V, extends the intuition behind the AIC and BIC model selection criteria. We find that for the particular model classes of isotropic linear regression, statistical lattices, and the stochastic perceptron unit, the log V term may be decomposed into a sum of distinct components. These components extend the idea of model complexity inherent in AIC and BIC, and are driven by new, albeit intuitive, notions of (i) model richness and (ii) model distinguishability. Our theoretical analysis helps explain how the double descent phenomenon may manifest, as well as why generalization error does not necessarily continue to grow with increasing model dimensionality.
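
For orientation, the criteria named in the abstract can be written side by side. This is a sketch from the standard textbook definitions, not from the paper's full text. The symbols are assumptions introduced here: $k$ is the number of parameters, $N$ the sample size, $\hat{L}$ the maximized likelihood, and $I(\theta)$ the Fisher information matrix. Identifying log V with the logarithm of the Fisher-volume integral follows Balasubramanian's formulation and is likewise an assumption about the notation used in the paper.

```latex
\begin{align}
\mathrm{AIC} &= 2k - 2\ln\hat{L}, \\
\mathrm{BIC} &= k\ln N - 2\ln\hat{L}, \\
-\ln P(D \mid M) &\approx -\ln\hat{L}
  + \frac{k}{2}\ln\frac{N}{2\pi}
  + \underbrace{\ln\int \sqrt{\det I(\theta)}\,\mathrm{d}\theta}_{\log V}
  + \cdots
\end{align}
```

The first two terms of the last line match $\tfrac{1}{2}\mathrm{BIC}$ up to a constant, so in this reading the volume term is what the geometric view adds beyond a pure parameter count.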


research
03/24/2023

Double Descent Demystified: Identifying, Interpreting & Ablating the Sources of a Deep Learning Puzzle

Double descent is a surprising phenomenon in machine learning, in which ...
research
08/05/2019

A study in Rashomon curves and volumes: A new perspective on generalization and model simplicity in machine learning

The Rashomon effect occurs when many different explanations exist for th...
research
03/04/2020

Optimal Regularization Can Mitigate Double Descent

Recent empirical and theoretical studies have shown that many learning a...
research
11/18/2022

Understanding the double descent curve in Machine Learning

The theory of bias-variance used to serve as a guide for model selection...
research
05/31/2022

VC Theoretical Explanation of Double Descent

There has been growing interest in generalization performance of large m...
research
11/28/2020

Risk-Monotonicity in Statistical Learning

Acquisition of data is a difficult task in many applications of machine ...
research
04/07/2020

A Brief Prehistory of Double Descent

In their thought-provoking paper [1], Belkin et al. illustrate and discu...
