Distributionally Robust Optimization and Generalization in Kernel Methods

by Matthew Staib, et al.

Distributionally robust optimization (DRO) has attracted attention in machine learning due to its connections to regularization, generalization, and robustness. Existing work has considered uncertainty sets based on phi-divergences and Wasserstein distances, each of which has drawbacks. In this paper, we study DRO with uncertainty sets measured via maximum mean discrepancy (MMD). We show that MMD DRO is roughly equivalent to regularization by the Hilbert norm and, as a byproduct, reveal deep connections to classic results in statistical learning. In particular, we obtain an alternative proof of a generalization bound for Gaussian kernel ridge regression via a DRO lens. The proof also suggests a new regularizer. Our results apply beyond kernel methods: we derive a generically applicable approximation of MMD DRO, and show that it generalizes recent work on variance-based regularization.
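To make the link between MMD uncertainty sets and the Hilbert norm concrete, the following is a minimal sketch, not the paper's method. It assumes scalar data and a Gaussian kernel, and it checks the standard Cauchy-Schwarz fact that for a function f in the RKHS, |E_Q[f] - E_P[f]| <= ||f||_H * MMD(P, Q). This inequality is why the worst-case expectation over an MMD ball of radius eps is at most E_P[f] + eps * ||f||_H. All names here (gaussian_kernel, mmd_squared, kernel_function, the sample sizes, and the bandwidth) are hypothetical choices made for illustration.

```python
import math
import random


def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian kernel k(x, y) = exp(-(x - y)^2 / (2 * bandwidth^2)) on scalars."""
    return math.exp(-((x - y) ** 2) / (2.0 * bandwidth ** 2))


def mmd_squared(xs, ys, bandwidth=1.0):
    """Biased (V-statistic) estimate of MMD^2 between two scalar samples.

    This equals the squared RKHS distance between the two empirical
    kernel mean embeddings.
    """
    def mean_kernel(a, b):
        total = sum(gaussian_kernel(u, v, bandwidth) for u in a for v in b)
        return total / (len(a) * len(b))

    return mean_kernel(xs, xs) - 2.0 * mean_kernel(xs, ys) + mean_kernel(ys, ys)


def kernel_function(centers, weights, bandwidth=1.0):
    """Build f = sum_i w_i k(c_i, .) and return (f, ||f||_H)."""
    def f(x):
        return sum(w * gaussian_kernel(c, x, bandwidth)
                   for c, w in zip(centers, weights))

    # ||f||_H^2 = w^T K w, where K is the kernel matrix on the centers.
    norm_sq = sum(wi * wj * gaussian_kernel(ci, cj, bandwidth)
                  for ci, wi in zip(centers, weights)
                  for cj, wj in zip(centers, weights))
    return f, math.sqrt(norm_sq)


# Assumed toy data: two Gaussian samples with shifted means.
rng = random.Random(0)
p_sample = [rng.gauss(0.0, 1.0) for _ in range(200)]
q_sample = [rng.gauss(0.5, 1.0) for _ in range(200)]

# Clamp at zero to guard against tiny negative values from rounding.
mmd = math.sqrt(max(mmd_squared(p_sample, q_sample), 0.0))

# An arbitrary (hypothetical) function in the RKHS and its Hilbert norm.
f, f_norm = kernel_function([-1.0, 0.0, 2.0], [0.5, -1.0, 0.8])

mean_p = sum(f(x) for x in p_sample) / len(p_sample)
mean_q = sum(f(x) for x in q_sample) / len(q_sample)
gap = abs(mean_q - mean_p)

# The shift in expectation never exceeds ||f||_H * MMD.
print(gap <= f_norm * mmd + 1e-9)
```

Because the biased estimate is exactly the squared distance between the empirical mean embeddings, the inequality holds for the two samples themselves, not only in the population limit. The sketch illustrates only this upper bound; it does not reproduce the paper's equivalence result or its proposed regularizer.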



Distributional Robustness with IPMs and links to Regularization and GANs

Robustness to adversarial attacks is an important concern due to the fra...

From Smooth Wasserstein Distance to Dual Sobolev Norm: Empirical Approximation and Statistical Applications

Statistical distances, i.e., discrepancy measures between probability di...

Distributional Robustness and Regularization in Reinforcement Learning

Distributionally Robust Optimization (DRO) has enabled to prove the equi...

Distributional Robustness Bounds Generalization Errors

Bayesian methods, distributionally robust optimization methods, and regu...

From Majorization to Interpolation: Distributionally Robust Learning using Kernel Smoothing

We study the function approximation aspect of distributionally robust op...

Distributionally Robust Mean-Variance Portfolio Selection with Wasserstein Distances

We revisit Markowitz's mean-variance portfolio selection model by consid...

Spectrally-truncated kernel ridge regression and its free lunch

Kernel ridge regression (KRR) is a well-known and popular nonparametric ...
