Generalization Bounds for Vicinal Risk Minimization Principle

11/11/2018
by Chao Zhang, et al.

The vicinal risk minimization (VRM) principle, first proposed by Vapnik (1999), is a variant of the empirical risk minimization (ERM) principle that replaces the Dirac masses placed at the training points with vicinal functions. Although there is strong numerical evidence that VRM outperforms ERM when appropriate vicinal functions are chosen, a comprehensive theoretical understanding of VRM is still lacking. In this paper, we study generalization bounds for VRM. Our results support Vapnik's original arguments and additionally provide deeper insights into VRM. First, we prove that the complexity of the function class obtained by convolving a function class with vicinal functions can be controlled by the complexity of the original class, under the assumption that the original class consists of Lipschitz-continuous functions. The resulting generalization bounds for VRM then suggest that the generalization performance of VRM is affected by both the choice of vicinal function and the quality of the function class. These findings can be used to examine whether a given vicinal function is appropriate for a VRM-based learning setting. Finally, we provide a theoretical explanation for existing VRM models, e.g., uniform-distribution-based models, Gaussian-distribution-based models, and mixup models.
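To make the principle concrete, below is a minimal, self-contained sketch (not taken from the paper) of how VRM replaces each Dirac mass with a vicinal distribution in a toy least-squares problem. The helper names (`erm_loss`, `vrm_loss_gaussian`, `vrm_loss_mixup`) and all hyperparameters (sigma, alpha, the learning rate) are illustrative assumptions, not the authors' construction; the Gaussian and mixup vicinal functions correspond to the model families mentioned in the abstract.

```python
import numpy as np

rng = np.random.default_rng(0)

def erm_loss(w, X, y):
    """Empirical risk: average squared loss evaluated exactly at the
    training points (each point carries a Dirac mass)."""
    return np.mean((X @ w - y) ** 2)

def vrm_loss_gaussian(w, X, y, sigma=0.1, n_samples=10):
    """Vicinal risk with a Gaussian vicinal function: each Dirac mass
    delta(x - x_i) is replaced by N(x_i, sigma^2 I), approximated here
    by Monte Carlo samples drawn around each training point."""
    losses = []
    for _ in range(n_samples):
        X_vic = X + sigma * rng.standard_normal(X.shape)
        losses.append(np.mean((X_vic @ w - y) ** 2))
    return np.mean(losses)

def vrm_loss_mixup(w, X, y, alpha=0.2, n_samples=10):
    """Mixup as a vicinal distribution: virtual examples are convex
    combinations of random pairs of training examples."""
    n = len(y)
    losses = []
    for _ in range(n_samples):
        lam = rng.beta(alpha, alpha)
        idx = rng.permutation(n)
        X_mix = lam * X + (1 - lam) * X[idx]
        y_mix = lam * y + (1 - lam) * y[idx]
        losses.append(np.mean((X_mix @ w - y_mix) ** 2))
    return np.mean(losses)

# Toy comparison on noisy linear data.
X = rng.standard_normal((50, 3))
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true + 0.1 * rng.standard_normal(50)

w = np.zeros(3)
lr = 0.05
for _ in range(500):
    # Stochastic gradient of the Gaussian vicinal risk: resample the
    # vicinal points each step. A plain ERM update would use X directly.
    X_vic = X + 0.1 * rng.standard_normal(X.shape)
    grad = 2 * X_vic.T @ (X_vic @ w - y) / len(y)
    w -= lr * grad

print("ERM risk:", erm_loss(w, X, y))
print("Gaussian VRM risk:", vrm_loss_gaussian(w, X, y))
print("Mixup VRM risk:", vrm_loss_mixup(w, X, y))
```

In this sketch the vicinal risk is estimated by Monte Carlo resampling; as sigma approaches zero the Gaussian vicinal functions collapse back to Dirac masses and VRM reduces to ERM, which is why the choice of vicinal function governs how far VRM departs from ERM.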


