Robustified Multivariate Regression and Classification Using Distributionally Robust Optimization under the Wasserstein Metric

06/10/2020
by Ruidi Chen, et al.

We develop Distributionally Robust Optimization (DRO) formulations for Multivariate Linear Regression (MLR) and Multiclass Logistic Regression (MLG) when both the covariates and responses/labels may be contaminated by outliers. The DRO framework uses a probabilistic ambiguity set defined as a ball of distributions that are close to the empirical distribution of the training set in the sense of the Wasserstein metric. We relax the DRO formulation into a regularized learning problem whose regularizer is a norm of the coefficient matrix. We establish out-of-sample performance guarantees for the solutions to our model, offering insights on the role of the regularizer in controlling the prediction error. Experimental results show that our approach improves the predictive error by 7 MLG.
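As a rough sketch of the structure the abstract describes, the ambiguity set, the min-max problem, and its regularized relaxation can be written as follows. The notation is assumed for illustration and is not taken from the paper: $\hat{P}_N$ is the empirical distribution of the $N$ training samples, $W$ the Wasserstein distance, $\epsilon$ the ball radius, $h_B$ the loss under coefficient matrix $B$, and $\lambda$ a regularization weight that grows with $\epsilon$. The specific loss, the choice of matrix norm, and the exact constants are given in the full text.

```latex
% Ambiguity set: Wasserstein ball of radius \epsilon around the empirical distribution
\Omega = \left\{ Q : W\!\left(Q, \hat{P}_N\right) \le \epsilon \right\}

% DRO formulation: minimize the worst-case expected loss over the ball
\inf_{B} \; \sup_{Q \in \Omega} \; \mathbb{E}^{Q}\!\left[ h_B(x, y) \right]

% Regularized relaxation: empirical loss plus a norm of the coefficient matrix
\inf_{B} \; \frac{1}{N} \sum_{i=1}^{N} h_B(x_i, y_i) \; + \; \lambda \, \| B \|
```

Read this way, a larger ball (more distrust of the training sample) corresponds to a heavier penalty on the coefficient matrix, which is the sense in which the regularizer controls the prediction error.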

research
10/15/2022

Distributionally Robust Multiclass Classification and Applications in Deep Image Classifiers

We develop a Distributionally Robust Optimization (DRO) formulation for ...
research
06/07/2017

Outlier Detection Using Distributionally Robust Optimization under the Wasserstein Metric

We present a Distributionally Robust Optimization (DRO) approach to outl...
research
09/27/2021

Distributionally Robust Multi-Output Regression Ranking

Despite their empirical success, most existing listwise learning-to-rank ...
research
09/27/2021

Distributionally Robust Multiclass Classification and Applications in Deep CNN Image Classifiers

We develop a Distributionally Robust Optimization (DRO) formulation for ...
research
06/24/2020

Distributionally-Robust Machine Learning Using Locally Differentially-Private Data

We consider machine learning, particularly regression, using locally-dif...
research
01/29/2020

Regularization Helps with Mitigating Poisoning Attacks: Distributionally-Robust Machine Learning Using the Wasserstein Distance

We use distributionally-robust optimization for machine learning to miti...
research
07/12/2023

Distribution-on-Distribution Regression with Wasserstein Metric: Multivariate Gaussian Case

Distribution data refers to a data set where each sample is represented ...
