Optimization Approaches for Counterfactual Risk Minimization with Continuous Actions

04/22/2020
by Houssam Zenati, et al.

Counterfactual reasoning from logged data has become increasingly important for a wide range of applications, such as web advertising and healthcare. In this paper, we address the problem of counterfactual risk minimization for learning a stochastic policy with a continuous action space. Whereas previous works have mostly focused on deriving statistical estimators with importance sampling, we show that the optimization perspective is equally important for solving the resulting nonconvex optimization problems. Specifically, we demonstrate the benefits of proximal point algorithms and of soft-clipping estimators, which are more amenable to gradient-based optimization than classical hard clipping. We propose multiple synthetic, yet realistic, evaluation setups, and we release a new large-scale dataset based on web advertising data for this problem, which crucially lacks public benchmarks.
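
To illustrate the contrast between hard and soft clipping mentioned in the abstract, here is a minimal sketch of a clipped importance sampling risk estimate. It makes several assumptions that are not taken from the paper: the policy is a context-free Gaussian over a scalar action, the logging densities are known, and the soft-clipping function is one simple smooth choice (linear below the threshold, logarithmic growth above it), which may differ from the form the authors use. All function names are hypothetical.

```python
import math


def hard_clip(w, m):
    """Classical hard clipping: the weight is capped at m (zero gradient above m)."""
    return min(w, m)


def soft_clip(w, m):
    """Illustrative soft clipping (an assumption, not necessarily the paper's form).

    Identity below m, logarithmic growth above m. Value and slope both
    match at w == m, so the gradient does not vanish for large weights.
    """
    return w if w <= m else m * (1.0 + math.log(w / m))


def gaussian_pdf(a, mean, std):
    """Density of a scalar Gaussian policy at action a."""
    z = (a - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def clipped_ips_risk(costs, actions, logging_densities, mean, std, m, clip):
    """Average of cost * clip(importance weight) over the logged samples."""
    total = 0.0
    for y, a, p0 in zip(costs, actions, logging_densities):
        w = gaussian_pdf(a, mean, std) / p0  # importance weight
        total += y * clip(w, m)
    return total / len(costs)
```

With hard clipping, any sample whose weight exceeds the threshold contributes nothing to the gradient with respect to the policy parameters, whereas the soft variant above keeps a small, nonzero slope. That is the sense in which a smooth clipping function is friendlier to gradient-based optimization.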

research
02/23/2023

Sequential Counterfactual Risk Minimization

Counterfactual Risk Minimization (CRM) is a framework for dealing with t...
research
09/15/2022

Semi-Counterfactual Risk Minimization Via Neural Networks

Counterfactual risk minimization is a framework for offline policy optim...
research
01/22/2018

Offline A/B testing for Recommender Systems

Before A/B testing online a new version of a recommender system, it is u...
research
06/14/2019

Distributionally Robust Counterfactual Risk Minimization

This manuscript introduces the idea of using Distributionally Robust Opt...
research
04/03/2017

A comparative study of counterfactual estimators

We provide a comparative study of several widely used off-policy estimat...
research
07/23/2019

Off-policy Learning for Multiple Loggers

It is well known that the historical logs are used for evaluating and le...
research
07/12/2023

Budgeting Counterfactual for Offline RL

The main challenge of offline reinforcement learning, where data is limi...
