Offline Recommender System Evaluation under Unobserved Confounding

09/08/2023
by   Olivier Jeunen, et al.

Off-Policy Estimation (OPE) methods allow us to learn and evaluate decision-making policies from logged data. This makes them an attractive choice for the offline evaluation of recommender systems, and several recent works have reported successful adoption of OPE methods to this end. An important assumption that makes this work is the absence of unobserved confounders: random variables that influence both actions and rewards at data collection time. Because the data collection policy is typically under the practitioner's control, the unconfoundedness assumption is often left implicit, and its violations are rarely dealt with in the existing literature. This work aims to highlight the problems that arise when performing off-policy estimation in the presence of unobserved confounders, specifically focusing on a recommendation use-case. We focus on policy-based estimators, where the logging propensities are learned from logged data. We characterise the statistical bias that arises due to confounding, and show how existing diagnostics are unable to uncover such cases. Because the bias depends directly on the true and unobserved logging propensities, it is non-identifiable. As the unconfoundedness assumption is famously untestable, this becomes especially problematic. This paper emphasises this common, yet often overlooked issue. Through synthetic data, we empirically show how naïve propensity estimation under confounding can lead to severely biased metric estimates that are allowed to fly under the radar. We aim to cultivate an awareness among researchers and practitioners of this important problem, and touch upon potential research directions towards mitigating its effects.
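To make the failure mode concrete, below is a minimal synthetic sketch, not the paper's exact experimental setup: a single binary confounder z influences both the logged action and the reward, but is absent from the log, so the "learned" propensity collapses to the marginal action rate. The variable names, distributions, and parameter values are illustrative assumptions; the point is only that IPS with learned propensities can be noticeably biased while IPS with the true (unobserved) propensities is not.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500_000

# Unobserved confounder z influences both the logged action and the reward.
z = rng.binomial(1, 0.5, size=n)

# True (confounded) logging propensities P(a=1 | z); z is never logged.
p_log_a1 = np.where(z == 1, 0.9, 0.1)
a = rng.binomial(1, p_log_a1)

# Bernoulli reward that depends on both the action and the confounder.
r = rng.binomial(1, 0.2 + a * (0.3 + 0.4 * z))

# Target policy to evaluate: uniform over the two actions.
pi_target = 0.5

# True propensity of the action that was actually taken.
p_true = np.where(a == 1, p_log_a1, 1.0 - p_log_a1)

# "Learned" propensity from the confounder-free log: without z, the best
# constant fit is the marginal action rate, applied to the taken action.
p_marginal_a1 = a.mean()
p_hat = np.where(a == 1, p_marginal_a1, 1.0 - p_marginal_a1)

# Inverse propensity scoring (IPS) with true vs. learned propensities.
ips_true = np.mean(r * pi_target / p_true)
ips_learned = np.mean(r * pi_target / p_hat)

# Ground-truth value of the uniform policy for this synthetic setup:
# 0.5 * E[r | a=0] + 0.5 * E_z[r | a=1, z] = 0.5*0.2 + 0.5*0.7 = 0.45.
v_true = 0.5 * 0.2 + 0.5 * (0.2 + 0.3 + 0.4 * 0.5)

print(f"true value            : {v_true:.3f}")       # ~0.45
print(f"IPS, true propensities: {ips_true:.3f}")     # ~0.45 (unbiased)
print(f"IPS, learned          : {ips_learned:.3f}")  # ~0.53 (biased)
```

Nothing in the logged data flags the problem: the learned propensities fit the observed action frequencies perfectly, so standard propensity diagnostics pass while the value estimate is off.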


research
03/12/2020

Off-policy Policy Evaluation For Sequential Decisions Under Unobserved Confounding

When observed decisions depend only on observed features, off-policy pol...
research
02/01/2023

Robust Fitted-Q-Evaluation and Iteration under Sequentially Exogenous Unobserved Confounders

Offline reinforcement learning is important in domains such as medicine,...
research
03/30/2021

Benchmarks for Deep Off-Policy Evaluation

Off-policy evaluation (OPE) holds the promise of being able to leverage ...
research
04/13/2023

CAR-DESPOT: Causally-Informed Online POMDP Planning for Robots in Confounded Environments

Robots operating in real-world environments must reason about possible o...
research
06/01/2023

Delphic Offline Reinforcement Learning under Nonidentifiable Hidden Confounding

A prominent challenge of offline reinforcement learning (RL) is the issu...
research
12/19/2022

Policy learning "without” overlap: Pessimism and generalized empirical Bernstein's inequality

This paper studies offline policy learning, which aims at utilizing obse...
research
04/02/2022

Model-Free and Model-Based Policy Evaluation when Causality is Uncertain

When decision-makers can directly intervene, policy evaluation algorithm...
