Holistic Robust Data-Driven Decisions

07/19/2022
by Amine Bennouna, et al.

The design of data-driven formulations for machine learning and decision-making with good out-of-sample performance is a key challenge. The observation that good in-sample performance does not guarantee good out-of-sample performance is generally known as overfitting. In practice, overfitting typically cannot be attributed to a single cause; several factors act at once. We consider three sources of overfitting: (i) statistical error, which results from working with finite sample data; (ii) data noise, which occurs when the data points are measured only with finite precision; and (iii) data misspecification, in which a small fraction of all data may be wholly corrupted. We argue that although existing data-driven formulations may be robust against one of these three sources in isolation, they do not provide holistic protection against all of them simultaneously. We design a novel data-driven formulation that does guarantee such holistic protection and is, furthermore, computationally viable. Our distributionally robust optimization formulation can be interpreted as a novel combination of a Kullback-Leibler and a Lévy-Prokhorov robust optimization formulation. Finally, we show that, in the context of classification and regression problems, several popular regularized and robust formulations reduce to particular cases of our proposed, more general formulation.
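
To make the Kullback-Leibler ingredient mentioned in the abstract concrete, the following is a minimal sketch of a worst-case expected loss over a KL ball around the empirical distribution. It is not the formulation proposed in the paper: the Lévy-Prokhorov component is omitted, the abstract does not spell out the ambiguity set, and the orientation KL(Q || P_n) used here is an assumption. The function name `kl_worst_case_mean` and its parameters are hypothetical. The sketch relies on the standard one-dimensional dual, sup over Q of E_Q[loss] equals the infimum over alpha > 0 of alpha * radius + alpha * log E_{P_n}[exp(loss / alpha)], and minimizes it by ternary search.

```python
import math


def kl_worst_case_mean(losses, radius, iters=200):
    """Approximate sup{E_Q[loss] : KL(Q || P_n) <= radius} for the empirical P_n.

    Hypothetical helper for illustration only; it covers the KL part alone
    and assumes the KL(Q || P_n) orientation.
    """
    n = len(losses)
    top = max(losses)
    if radius <= 0:
        # A zero-radius ball contains only the empirical distribution.
        return sum(losses) / n

    def dual(alpha):
        # alpha * radius + alpha * log(mean(exp(loss / alpha))),
        # with the maximum loss factored out so exp() never overflows.
        scaled = sum(math.exp((loss - top) / alpha) for loss in losses) / n
        return alpha * radius + top + alpha * math.log(scaled)

    # The dual is convex in alpha, hence unimodal in log(alpha),
    # so ternary search on log(alpha) locates its minimum.
    lo, hi = math.log(1e-8), math.log(1e8)
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if dual(math.exp(m1)) <= dual(math.exp(m2)):
            hi = m2
        else:
            lo = m1
    return dual(math.exp(0.5 * (lo + hi)))
```

With `radius=0` the function returns the plain sample average, and as the radius grows the value moves toward the largest observed loss, which is the sense in which such a formulation guards against statistical error from finite samples.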


research
09/14/2021

Learning and Decision-Making with Data: Optimal Formulations and Phase Transitions

We study the problem of designing optimal learning and decision-making f...
research
07/27/2022

Data-Driven Sample Average Approximation with Covariate Information

We study optimization for data-driven decision-making when we have obser...
research
11/27/2017

Bootstrap Robust Prescriptive Analytics

We address the problem of prescribing an optimal decision in a framework...
research
12/02/2020

Residuals-based distributionally robust optimization with covariate information

We consider data-driven approaches that integrate a machine learning pre...
research
08/20/2021

Distributionally Robust Learning

This monograph develops a comprehensive statistical learning framework t...
research
09/20/2023

Optimize-via-Predict: Realizing out-of-sample optimality in data-driven optimization

We examine a stochastic formulation for data-driven optimization wherein...
research
12/27/2021

Distributionally Robust Bootstrap Optimization

Control architectures and autonomy stacks for complex engineering system...
