Quality vs. Quantity of Data in Contextual Decision-Making: Exact Analysis under Newsvendor Loss

02/16/2023
by Omar Besbes, et al.

When building datasets, one needs to invest time, money and energy either to aggregate more data or to improve their quality. The most common practice favors quantity over quality without necessarily quantifying the trade-off that emerges. In this work, we study data-driven contextual decision-making and the performance implications of the quality and quantity of data. We focus on contextual decision-making with a Newsvendor loss. This is the loss of a central capacity planning problem in Operations Research, and also the loss associated with quantile regression. We consider a model in which outcomes observed in similar contexts have similar distributions, and we analyze the performance of a classical class of kernel policies that weigh data according to their similarity in a contextual space. We develop a series of results that lead to an exact characterization of the worst-case expected regret of these policies. This exact characterization applies to any sample size and any observed contexts. The model we develop is flexible and captures the case of partially observed contexts. This exact analysis enables us to unveil new structural insights on the learning behavior of uniform kernel methods: i) the specialized analysis leads to very large improvements in the quantification of performance compared to state-of-the-art general-purpose bounds; ii) we show an important non-monotonicity of the performance as a function of data size that is not captured by previous bounds; and iii) we show that, in some regimes, a small increase in the quality of the data can dramatically reduce the number of samples required to reach a performance target. All in all, our work demonstrates that it is possible to quantify precisely the interplay of data quality, data quantity, and performance in a central problem class. It also highlights the need for problem-specific bounds in order to understand the trade-offs at play.
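To make the setting concrete, the following is a minimal sketch of a uniform kernel policy under a Newsvendor loss. It is an illustration under stated assumptions, not the authors' implementation: the function names, the Euclidean distance, the `bandwidth` parameter, and the fallback to all samples when no context falls within the bandwidth are all hypothetical choices. The sketch relies on the standard fact that, for equally weighted samples, the empirical Newsvendor loss is minimized at the empirical quantile of level `underage_cost / (underage_cost + overage_cost)`.

```python
import numpy as np


def newsvendor_loss(decision, demand, underage_cost, overage_cost):
    """Newsvendor loss: underage_cost per unit short, overage_cost per unit left over."""
    shortfall = np.maximum(demand - decision, 0.0)
    surplus = np.maximum(decision - demand, 0.0)
    return underage_cost * shortfall + overage_cost * surplus


def uniform_kernel_decision(query_context, contexts, demands, bandwidth,
                            underage_cost, overage_cost):
    """Pick a decision from past demands observed in contexts near the query.

    Uniform kernel: every sample within `bandwidth` of the query gets equal
    weight; every other sample is ignored.
    """
    demands = np.asarray(demands, dtype=float)
    contexts = np.asarray(contexts, dtype=float).reshape(len(demands), -1)
    query = np.asarray(query_context, dtype=float).reshape(1, -1)

    # Euclidean distance is an assumption; any similarity measure could be used.
    distances = np.linalg.norm(contexts - query, axis=1)
    in_window = distances <= bandwidth
    if not in_window.any():
        # Hypothetical fallback: with no nearby sample, use all the data.
        in_window = np.ones_like(in_window)

    selected = np.sort(demands[in_window])
    critical_ratio = underage_cost / (underage_cost + overage_cost)
    # Smallest selected demand whose empirical CDF reaches the critical ratio.
    # The small offset guards against floating-point round-up before ceil.
    index = int(np.ceil(critical_ratio * len(selected) - 1e-12)) - 1
    return selected[max(index, 0)]
```

For instance, with contexts `[0.0, 0.1, 0.2, 5.0]`, demands `[10, 20, 30, 1000]`, a query at `0.1` and a bandwidth of `0.5`, the distant sample is excluded and the decision is an empirical quantile of the three nearby demands. Shrinking the bandwidth keeps only samples from more similar contexts (higher quality), at the cost of using fewer of them (lower quantity).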


