Beyond Ridge Regression for Distribution-Free Data

06/17/2022
by   Koby Bibas, et al.

In supervised batch learning, the predictive normalized maximum likelihood (pNML) has been proposed as the min-max regret solution for the distribution-free setting, where no distributional assumptions are made on the data. However, the pNML is not defined for large-capacity hypothesis classes such as over-parameterized linear regression. For a large class, a common approach is to use regularization or a model prior. In the context of online prediction, where the min-max solution is the Normalized Maximum Likelihood (NML), it has been suggested to use NML with "luckiness": a prior-like function is applied to the hypothesis class, reducing its effective size. Motivated by the luckiness concept, for linear regression we incorporate a luckiness function that penalizes each hypothesis in proportion to its l2 norm; this leads to the ridge regression solution. The associated pNML with luckiness (LpNML) prediction deviates from the ridge regression empirical risk minimizer (Ridge ERM): when the test data reside in the subspace corresponding to the small eigenvalues of the empirical correlation matrix of the training data, the prediction is shifted toward 0. Our LpNML reduces the Ridge ERM error by up to 20% and is more robust in the presence of distribution shift compared to recent leading methods for the UCI sets.
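The behavior described above can be sketched numerically: the l2-norm luckiness yields the standard ridge estimator, and a test point lying in the small-eigenvalue subspace of the training correlation matrix has a large value of x^T (X^T X + lambda I)^{-1} x, which drives the prediction toward 0. The shrinkage form used below is purely illustrative (the exact LpNML predictor is derived in the paper); all variable names are this sketch's own.

```python
import numpy as np

rng = np.random.default_rng(0)
# Training data concentrated along the first axis, so the second
# axis corresponds to a small eigenvalue of the correlation matrix.
X = rng.normal(size=(50, 2)) * np.array([5.0, 0.1])
theta_true = np.array([1.0, 1.0])
y = X @ theta_true + 0.1 * rng.normal(size=50)

lam = 1.0                                  # ridge strength from the l2 luckiness
A = X.T @ X + lam * np.eye(2)
theta_ridge = np.linalg.solve(A, X.T @ y)  # Ridge ERM solution

def leverage(x):
    # x^T (X^T X + lam I)^{-1} x: large when x lies in the
    # poorly covered (small-eigenvalue) subspace of the training data.
    return x @ np.linalg.solve(A, x)

def lpnml_like(x):
    # Illustrative LpNML-style prediction: the Ridge ERM prediction
    # shrunk toward 0 by an amount growing with the leverage.
    # This is a toy stand-in for the paper's exact formula.
    return (x @ theta_ridge) / (1.0 + leverage(x))

x_covered = np.array([1.0, 0.0])    # direction well covered by training data
x_uncovered = np.array([0.0, 1.0])  # direction in the small-eigenvalue subspace
h_covered, h_uncovered = leverage(x_covered), leverage(x_uncovered)
```

Here `h_uncovered` is much larger than `h_covered`, so the prediction for `x_uncovered` is pulled substantially toward 0, while the prediction for `x_covered` stays close to the Ridge ERM value.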


Related research

02/14/2021 · The Predictive Normalized Maximum Likelihood for Over-parameterized Linear Regression with Norm Constraint: Regret and Double Descent
A fundamental tenet of learning theory is that a trade-off exists betwee...

08/05/2021 · Interpolation can hurt robust generalization even when there is no noise
Numerous recent works show that overparameterization implicitly reduces ...

05/12/2019 · A New Look at an Old Problem: A Universal Learning Approach to Linear Regression
Linear regression is a classical paradigm in statistics. A new look at i...

11/20/2020 · Efficient Data-Dependent Learnability
The predictive normalized maximum likelihood (pNML) approach has recentl...

10/18/2021 · Single Layer Predictive Normalized Maximum Likelihood for Out-of-Distribution Detection
Detecting out-of-distribution (OOD) samples is vital for developing mach...

05/23/2023 · On the Size and Approximation Error of Distilled Sets
Dataset Distillation is the task of synthesizing small datasets from lar...

03/13/2018 · Takeuchi's Information Criteria as a form of Regularization
Takeuchi's Information Criteria (TIC) is a linearization of maximum like...
