Learning and Decision-Making with Data: Optimal Formulations and Phase Transitions

09/14/2021
by M. Amine Bennouna, et al.

We study the problem of designing optimal learning and decision-making formulations when only historical data is available. Prior work typically commits to a particular class of data-driven formulations and then tries to establish out-of-sample performance guarantees. Here we take the opposite approach. We first define a sensible yardstick with which to measure the quality of any data-driven formulation and then seek a formulation that is optimal with respect to it. Informally, any data-driven formulation can be seen as balancing the proximity of the estimated cost to the actual cost against a guaranteed level of out-of-sample performance. Given an acceptable level of out-of-sample performance, we explicitly construct a data-driven formulation that is uniformly closer to the true cost than any other formulation enjoying the same out-of-sample performance. We show the existence of three distinct out-of-sample performance regimes (a superexponential regime, an exponential regime and a subexponential regime) between which the nature of the optimal data-driven formulation undergoes a phase transition. The optimal data-driven formulation can be interpreted as a classically robust formulation in the superexponential regime, an entropic distributionally robust formulation in the exponential regime and a variance-penalized formulation in the subexponential regime. This final observation unveils a surprising connection, hidden until now, between three data-driven formulations that at first glance seem unrelated.
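To make the three formulations named in the abstract concrete, the following is a minimal Python sketch of how each one might estimate the cost of a single fixed decision from observed losses. It is not the construction from the paper. The function names, the `radius` parameter, the choice of the Kullback-Leibler ball {Q : KL(Q || empirical) <= radius} for the entropic estimator, and the `sqrt(2 * radius * variance)` penalty are all assumptions made for illustration; the abstract does not specify any of them.

```python
import math
from statistics import fmean, pvariance


def robust_cost(support_losses):
    """Classically robust estimate: worst-case loss over an assumed known support."""
    return max(support_losses)


def entropic_dro_cost(sample_losses, radius, grid_size=400):
    """Entropic (KL) distributionally robust estimate.

    Assumption: the ambiguity set is {Q : KL(Q || empirical) <= radius}.
    Its worst-case expected loss has the standard dual form
        inf_{lam > 0}  lam * radius + lam * log E_empirical[exp(loss / lam)],
    which is minimized here over a coarse logarithmic grid. Every grid point
    yields a valid upper bound, so the result is a slight overestimate.
    """
    best = math.inf
    for k in range(grid_size):
        lam = 10.0 ** (-3.0 + 6.0 * k / (grid_size - 1))
        scaled = [loss / lam for loss in sample_losses]
        shift = max(scaled)  # log-sum-exp shift for numerical stability
        log_mgf = shift + math.log(fmean([math.exp(s - shift) for s in scaled]))
        best = min(best, lam * radius + lam * log_mgf)
    return best


def variance_penalized_cost(sample_losses, radius):
    """Variance-penalized estimate: empirical mean plus a standard-deviation penalty.

    Assumption: the coefficient sqrt(2 * radius) is the first-order
    small-radius expansion of the KL estimate above.
    """
    return fmean(sample_losses) + math.sqrt(2.0 * radius * pvariance(sample_losses))


if __name__ == "__main__":
    losses = [1.0, 2.0, 3.0, 4.0]  # hypothetical observed losses
    for r in (0.5, 0.05, 0.005):
        print(r, robust_cost(losses), entropic_dro_cost(losses, r),
              variance_penalized_cost(losses, r))
```

Under these assumptions, shrinking `radius` moves the entropic estimate toward the variance-penalized one, while a large `radius` pushes it toward the worst-case loss. This loosely mirrors the connection between the three formulations that the abstract describes, without reproducing the paper's regimes or guarantees.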


research
07/19/2022

Holistic Robust Data-Driven Decisions

The design of data-driven formulations for machine learning and decision...
research
06/20/2022

Beyond IID: data-driven decision-making in heterogeneous environments

In this work, we study data-driven decision-making and depart from the c...
research
09/20/2023

Optimize-via-Predict: Realizing out-of-sample optimality in data-driven optimization

We examine a stochastic formulation for data-driven optimization wherein...
research
07/27/2022

Data-Driven Sample Average Approximation with Covariate Information

We study optimization for data-driven decision-making when we have obser...
research
06/28/2022

Generalized Policy Improvement Algorithms with Theoretically Supported Sample Reuse

Real-world sequential decision making requires data-driven algorithms th...
research
06/30/2022

A Validity Perspective on Evaluating the Justified Use of Data-driven Decision-making Algorithms

This work seeks to center validity considerations in deliberations aroun...
research
09/18/2021

Scenario adaptive disruption prediction study for next generation burning-plasma tokamaks

Next generation high performance (HP) tokamaks risk damage from unmitiga...
