Efficient Policy Learning from Surrogate-Loss Classification Reductions

02/12/2020
by Andrew Bennett, et al.

Recent work on policy learning from observational data has highlighted the importance of efficient policy evaluation and has proposed reductions to weighted (cost-sensitive) classification. However, efficient policy evaluation need not yield efficient estimation of policy parameters. We consider the estimation problem given by a weighted surrogate-loss classification reduction of policy learning with any score function, whether direct, inverse-propensity weighted, or doubly robust. We show that, under a correct specification assumption, the weighted classification formulation need not be efficient for policy parameters. We contrast this with actual (possibly weighted) binary classification, where correct specification implies a parametric model, whereas for policy learning it implies only a semiparametric model. In light of this, we instead propose an estimation approach based on the generalized method of moments, which is efficient for the policy parameters. We propose a particular method based on recent developments in solving moment problems using neural networks, and we demonstrate its efficiency and regret benefits empirically.
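To make the weighted surrogate-loss classification reduction concrete, here is a minimal sketch. It rests on assumptions the abstract does not state: a binary action coded as -1/+1, a known propensity P(T=+1 | X), a logistic surrogate loss, and a linear policy class fitted by plain gradient descent. The function names (`ipw_score`, `dr_score`, `fit_linear_policy`) are hypothetical and not taken from the paper. The sketch illustrates the baseline reduction only, not the proposed generalized-method-of-moments estimator.

```python
import numpy as np


def ipw_score(t, y, propensity):
    """Inverse-propensity-weighted score (assumed form): T * Y / P(T = t | X).

    t: actions in {-1, +1}; propensity: P(T = +1 | X).
    """
    p_obs = np.where(t == 1, propensity, 1.0 - propensity)
    return t * y / p_obs


def dr_score(t, y, propensity, mu_pos, mu_neg):
    """Doubly robust score (assumed form): direct contrast plus an IPW residual correction.

    mu_pos, mu_neg: outcome-model predictions under actions +1 and -1.
    """
    p_obs = np.where(t == 1, propensity, 1.0 - propensity)
    mu_obs = np.where(t == 1, mu_pos, mu_neg)
    return mu_pos - mu_neg + t * (y - mu_obs) / p_obs


def fit_linear_policy(x, psi, lr=0.1, steps=2000):
    """Minimize mean(|psi| * log(1 + exp(-sign(psi) * x @ theta))) by gradient descent.

    The label is sign(psi) and the example weight is |psi|.
    """
    theta = np.zeros(x.shape[1])
    w, s = np.abs(psi), np.sign(psi)
    for _ in range(steps):
        margin = s * (x @ theta)
        # sigmoid(-margin), written with tanh for numerical stability
        sig_neg = 0.5 * (1.0 - np.tanh(margin / 2.0))
        grad = -(x * (w * s * sig_neg)[:, None]).mean(axis=0)
        theta -= lr * grad
    return theta
```

The fitted policy assigns action `np.sign(x @ theta)`. Replacing `ipw_score` with `dr_score`, or with the direct contrast `mu_pos - mu_neg`, changes only the score passed to `fit_linear_policy`.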
